# Proxy support in Chrome

This document establishes basic proxy terminology and describes Chrome-specific
proxy behaviors.

[TOC]

## Proxy server identifiers

A proxy server is an intermediary used for network requests. A proxy server can
be described by its address, along with the proxy scheme that should be used to
communicate with it.

This can be written as a string using either the "PAC format" or the "URI
format".

The PAC format is how one names a proxy server in [Proxy
auto-config](https://en.wikipedia.org/wiki/Proxy_auto-config) scripts. For
example:
* `PROXY foo:2138`
* `SOCKS5 foo:1080`
* `DIRECT`

The "URI format" instead encodes the information as a URL. For example:
* `foo:2138`
* `http://foo:2138`
* `socks5://foo:1080`
* `direct://`

The port number is optional in both formats. When omitted, a per-scheme default
is used.

See the [Proxy server schemes](#Proxy-server-schemes) section for details on
what schemes Chrome supports, and how to write them in the PAC and URI formats.
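
The relationship between the two formats can be illustrated with a small
conversion helper. This is a minimal sketch, not Chrome's parser: the function
name `pacToUri` is hypothetical, the keyword table is taken from the example
identifiers in this document, and treating keywords case-insensitively is an
assumption.

```javascript
// Maps PAC-format keywords to URI-format schemes, following the example
// identifiers in this document. Note that a bare "SOCKS" keyword in PAC format
// means SOCKSv4.
const PAC_KEYWORD_TO_SCHEME = {
  PROXY: "http",
  HTTPS: "https",
  SOCKS: "socks4",
  SOCKS4: "socks4",
  SOCKS5: "socks5",
  QUIC: "quic",
};

// Hypothetical helper: converts one PAC-format identifier to URI format.
function pacToUri(pacEntry) {
  const parts = pacEntry.trim().split(/\s+/);
  // Assumption: keywords are matched case-insensitively.
  const keyword = parts[0].toUpperCase();
  if (keyword === "DIRECT") return "direct://";
  const scheme = PAC_KEYWORD_TO_SCHEME[keyword];
  if (!scheme || parts.length !== 2) {
    throw new Error("Unrecognized PAC entry: " + pacEntry);
  }
  return scheme + "://" + parts[1];
}
```

For example, `pacToUri("SOCKS5 foo:1080")` yields the URI-format identifier
`socks5://foo:1080`.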

Most UI surfaces in Chrome (including command lines and policy) expect URI
formatted proxy server identifiers. However outside of Chrome, proxy servers
are generally identified less precisely by just an address -- the proxy
scheme is assumed based on context.

In Windows' proxy settings there are host and port fields for the
"HTTP", "Secure", "FTP", and "SOCKS" proxy. With the exception of "SOCKS",
those are all identifiers for insecure HTTP proxy servers (proxy scheme is
assumed as HTTP).

## Proxy resolution

Proxying in Chrome is done at the URL level.

When the browser is asked to fetch a URL, it needs to decide which IP endpoint
to send the request to. This can be either a proxy server, or the target host.

This is called proxy resolution. The input to proxy resolution is a URL, and
the output is an ordered list of [proxy server
identifiers](#Proxy-server-identifiers).

What proxies to use can be described using either:

* [Manual proxy settings](#Manual-proxy-settings) - proxy resolution is defined
  using a declarative set of rules. These rules are expressed as a mapping from
  URL scheme to proxy server identifier(s), and a list of proxy bypass rules for
  when to go DIRECT instead of using the mapped proxy.

* PAC script - proxy resolution is defined using a JavaScript program, that is
  invoked whenever fetching a URL to get the list of proxy server identifiers
  to use.

* Auto-detect - the WPAD protocol is used to probe the network (using DHCP/DNS)
  and possibly discover the URL of a PAC script.

## Proxy server schemes

When using an explicit proxy in the browser, multiple layers of the network
request are impacted, depending on the scheme that is used. Some implications
of the proxy scheme are:

* Is communication to the proxy done over a secure channel?
* Is name resolution (ex: DNS) done client side, or proxy side?
* What authentication schemes to the proxy server are supported?
* What network traffic can be sent through the proxy?

Chrome supports these proxy server schemes:

* [DIRECT](#DIRECT-proxy-scheme)
* [HTTP](#HTTP-proxy-scheme)
* [HTTPS](#HTTPS-proxy-scheme)
* [SOCKSv4](#SOCKSv4-proxy-scheme)
* [SOCKSv5](#SOCKSv5-proxy-scheme)
* [QUIC](#QUIC-proxy-scheme)

### DIRECT proxy scheme

* Default port: N/A (neither host nor port is applicable)
* Example identifier (PAC): `DIRECT`
* Example identifier (URI): `direct://`

This is a pseudo proxy scheme indicating that the request is sent directly to
the target server instead of through a proxy.

It is imprecise to call this a "proxy server", but it is a convenient abstraction.

### HTTP proxy scheme

* Default port: 80
* Example identifiers (PAC): `PROXY proxy:8080`, `proxy` (non-standard; don't use)
* Example identifiers (URI): `http://proxy:8080`, `proxy:8080` (can omit scheme)

Generally when one refers to a "proxy server" or "web proxy", they are talking
about an HTTP proxy.

When using an HTTP proxy in Chrome, name resolution is always deferred to the
proxy. HTTP proxies can proxy `http://`, `https://`, `ws://` and `wss://` URLs.

Communication to HTTP proxy servers is insecure, meaning proxied `http://`
requests are sent in the clear. When proxying `https://` requests through an
HTTP proxy, the TLS exchange is forwarded through the proxy using the `CONNECT`
method, so end-to-end encryption is not broken. However when establishing the
tunnel, the hostname of the target URL is sent to the proxy server in the
clear.

HTTP proxies in Chrome support the same HTTP authentication schemes as for
target servers: Basic, Digest, Negotiate, NTLM.

### HTTPS proxy scheme

* Default port: 443
* Example identifier (PAC): `HTTPS proxy:8080`
* Example identifier (URI): `https://proxy:8080`

This works like an [HTTP proxy](#HTTP-proxy-scheme), except the
communication to the proxy server is protected by TLS, and may negotiate
HTTP/2 (but not QUIC).

Because the connection to the proxy server is secure, `http://` requests
sent through the proxy are not sent to the proxy in the clear, as they would
be with an HTTP proxy. Similarly, since `CONNECT` requests are sent over a
protected channel, the hostnames for proxied `https://` URLs are also not
revealed.

In addition to the usual HTTP authentication methods, HTTPS proxies also
support client certificates.

HTTPS proxies using HTTP/2 can offer better performance in Chrome than a
regular HTTP proxy due to higher connection limits (HTTP/1.1 proxies in Chrome
are limited to 32 simultaneous connections across all domains).

Chrome, Firefox, and Opera support HTTPS proxies; however, most older HTTP
stacks do not.

Specifying an HTTPS proxy is generally not possible through system proxy
settings. Instead, one must use either a PAC script or a Chrome proxy setting
(command line, extension, or policy).

See the dev.chromium.org document on [secure web
proxies](http://dev.chromium.org/developers/design-documents/secure-web-proxy)
for tips on how to run and test against an HTTPS proxy.

### SOCKSv4 proxy scheme

* Default port: 1080
* Example identifiers (PAC): `SOCKS4 proxy:8080`, `SOCKS proxy:8080`
* Example identifier (URI): `socks4://proxy:8080`

SOCKSv4 is a simple transport layer proxy that wraps a TCP socket. Its use
is transparent to the rest of the protocol stack; after an initial
handshake when connecting the TCP socket (to the proxy), the rest of the
loading stack is unchanged.

No proxy authentication methods are supported for SOCKSv4.

When using a SOCKSv4 proxy, name resolution for target hosts is always done
client side, and moreover must resolve to an IPv4 address (SOCKSv4 encodes
target address as 4 octets, so IPv6 targets are not possible).

There are extensions to SOCKSv4 that allow for proxy side name resolution, and
IPv6, namely SOCKSv4a. However Chrome does not allow configuring, or falling
back to v4a.

A better alternative is to just use the newer version of the protocol, SOCKSv5
(which is still 20+ years old).

### SOCKSv5 proxy scheme

* Default port: 1080
* Example identifier (PAC): `SOCKS5 proxy:8080`
* Example identifiers (URI): `socks://proxy:8080`, `socks5://proxy:8080`

[SOCKSv5](https://tools.ietf.org/html/rfc1928) is a transport layer proxy that
wraps a TCP socket, and allows for name resolution to be deferred to the proxy.

In Chrome when a proxy's scheme is set to SOCKSv5, name resolution is always
done proxy side (even though the protocol allows for client side as well). In
Firefox client side vs proxy side name resolution can be configured with
`network.proxy.socks_remote_dns`; Chrome has no equivalent option and will
always use proxy side resolution.

No authentication methods are supported for SOCKSv5 in Chrome (although some do
exist for the protocol).

A handy way to create a SOCKSv5 proxy is with `ssh -D`, which can be used to
tunnel web traffic to a remote host over SSH.

In Chrome SOCKSv5 is only used to proxy TCP-based URL requests. It cannot be
used to relay UDP traffic.

### QUIC proxy scheme

* Default (UDP) port: 443
* Example identifier (PAC): `QUIC proxy:8080`
* Example identifier (URI): `quic://proxy:8080`

A QUIC proxy uses QUIC (UDP) as the underlying transport, but otherwise
behaves as an HTTP proxy. It has similar properties to an [HTTPS
proxy](#HTTPS-proxy-scheme), in that the connection to the proxy server
is secure, and connection limits are less restrictive.

Support for QUIC proxies in Chrome is currently experimental and not
ready for production use. In particular, sending https:// and wss://
URLs through a QUIC proxy is [disabled by
default](https://bugs.chromium.org/p/chromium/issues/detail?id=969859).

Another caveat is that QUIC does not currently support
client certificates since it does not use a TLS
handshake. This may change in future versions.

## Manual proxy settings

The simplest way to configure proxy resolution is by providing a static list of
rules comprised of:

1. A mapping of URL schemes to [proxy server identifiers](#Proxy-server-identifiers).
2. A list of [proxy bypass rules](#Proxy-bypass-rules)

We refer to this mode of configuration as "manual proxy settings".

Manual proxy settings can succinctly describe setups like:

* Use proxy `http://foo:8080` for all requests
* Use proxy `http://foo:8080` for all requests except those to a `google.com`
  subdomain.
* Use proxy `http://foo:8080` for all `https://` requests, and proxy
  `socks5://mysocks:90` for everything else

Although manual proxy settings are a ubiquitous way to configure proxies
across platforms, there is no standard representation or feature set.

Chrome's manual proxy settings most closely resemble those of WinInet. But
they also support idioms from other platforms -- for instance KDE's notion of
reversing the bypass list, or Gnome's interpretation of bypass patterns as
suffix matches.

When defining manual proxy settings in Chrome, we specify three (possibly
empty) lists of [proxy server identifiers](#Proxy-server-identifiers).

  * proxies for HTTP - A list of proxy server identifiers to use for `http://`
    requests, if non-empty.
  * proxies for HTTPS - A list of proxy server identifiers to use for
    `https://` requests, if non-empty.
  * other proxies - A list of proxy server identifiers to use for everything
    else (whatever isn't matched by the other two lists)

There are a lot of ways to end up with manual proxy settings in Chrome
(discussed in other sections).

The following examples use the command line method: launching Chrome with
`--proxy-server=XXX` (and optionally `--proxy-bypass-list=YYY`).

Example: To use proxy `http://foo:8080` for all requests we can launch
Chrome with `--proxy-server="http://foo:8080"`. This translates to:

  * proxies for HTTP - *empty*
  * proxies for HTTPS - *empty*
  * other proxies - `http://foo:8080`

With the above configuration, if the proxy server was unreachable all requests
would fail with `ERR_PROXY_CONNECTION_FAILED`. To address this we could add a
fallback to `DIRECT` by launching using
`--proxy-server="http://foo:8080,direct://"` (note the comma separated list).
This command line means:

  * proxies for HTTP - *empty*
  * proxies for HTTPS - *empty*
  * other proxies - `http://foo:8080`, `direct://`

If instead we wanted to proxy only `http://` URLs through the
HTTPS proxy `https://foo:443`, and have everything else use the SOCKSv5 proxy
`socks5://mysocks:1080` we could launch Chrome with
`--proxy-server="http=https://foo:443;socks=socks5://mysocks:1080"`. This now
expands to:

  * proxies for HTTP - `https://foo:443`
  * proxies for HTTPS - *empty*
  * other proxies - `socks5://mysocks:1080`

The command line above uses WinInet's proxy map format, with some additional
features:

* Instead of naming proxy servers by just a hostname:port, you can use Chrome's
  URI format for proxy server identifiers. In other words, you can prefix the
  proxy scheme so it doesn't default to HTTP.
* The `socks=` mapping is understood more broadly as "other proxies". The
  subsequent proxy list can include proxies of any scheme, however if the
  scheme is omitted it will be understood as SOCKSv4 rather than HTTP.
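
The expansion of a `--proxy-server` value into the three lists can be sketched
as follows. This is an illustrative approximation under stated assumptions,
not Chrome's implementation: the function name `parseProxyServerFlag` and the
result shape are hypothetical, and the handling of malformed entries, of keys
other than `http`, `https` and `socks`, and of comma separated lists inside a
mapping are assumptions.

```javascript
// Hypothetical parser for the value of --proxy-server, following the rules
// described in this section. Not Chrome's actual implementation.
function parseProxyServerFlag(value) {
  const result = { http: [], https: [], other: [] };

  // Splits a comma separated proxy list, applying a default scheme to entries
  // that omit one.
  const parseList = (list, defaultScheme) =>
    list
      .split(",")
      .map((s) => s.trim())
      .filter((s) => s.length > 0)
      .map((s) => (s.includes("://") ? s : defaultScheme + "://" + s));

  if (!value.includes("=")) {
    // No per-scheme mapping: a single list that applies to all requests.
    result.other = parseList(value, "http");
    return result;
  }

  for (const mapping of value.split(";")) {
    const eq = mapping.indexOf("=");
    if (eq < 0) continue; // Assumption: skip malformed entries.
    const key = mapping.slice(0, eq).trim().toLowerCase();
    const list = mapping.slice(eq + 1);
    if (key === "http") result.http = parseList(list, "http");
    else if (key === "https") result.https = parseList(list, "http");
    // "socks=" is understood as "other proxies"; entries without a scheme
    // default to SOCKSv4 rather than HTTP.
    else if (key === "socks") result.other = parseList(list, "socks4");
    // Assumption: any other key is ignored in this sketch.
  }
  return result;
}
```

With this sketch, `parseProxyServerFlag("socks=mysocks:1080")` places
`socks4://mysocks:1080` in the "other proxies" list, since the scheme was
omitted.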

### Mapping WebSockets URLs to a proxy

[Manual proxy settings](#Manual-proxy-settings) don't have mappings for `ws://`
or `wss://` URLs.

Selecting a proxy for these URL schemes is a bit different from other URL
schemes. The algorithm that Chrome uses is:

* If "other proxies" is non-empty use it
* If "proxies for HTTPS" is non-empty use it
* Otherwise use "proxies for HTTP"

This is per the recommendation in section 4.1.3 of [RFC
6455](https://tools.ietf.org/html/rfc6455).
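
The selection order above is short enough to write out directly. In this
sketch the function name and the `settings` object shape (`http`, `https` and
`other` arrays of proxy server identifiers) are hypothetical names chosen for
illustration.

```javascript
// Picks the proxy list for a ws:// or wss:// URL from manual proxy settings.
// `settings` is a hypothetical object of the form:
//   { http: [...], https: [...], other: [...] }
function proxiesForWebSocket(settings) {
  if (settings.other.length > 0) return settings.other;
  if (settings.https.length > 0) return settings.https;
  return settings.http;
}
```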

It is possible to route `ws://` and `wss://` separately using a PAC script.

### Proxy credentials in manual proxy settings

Most platforms' [manual proxy settings](#Manual-proxy-settings) allow
specifying a cleartext username/password for proxy sign in. Chrome does not
implement this, and will not use any credentials embedded in the proxy
settings.

Proxy authentication will instead go through the ordinary flow to find
credentials.

## Proxy bypass rules

In addition to specifying three lists of [proxy server
identifiers](#Proxy-server-identifiers), Chrome's [manual proxy
settings](#Manual-proxy-settings) let you specify a list of "proxy bypass
rules".

This ruleset determines whether a given URL should skip use of a proxy
altogether, even when a proxy is otherwise defined for it.

This concept is also known by names like "exception list", "exclusion list" or
"no proxy list".

Proxy bypass rules can be written as an ordered list of strings. Ordering
generally doesn't matter, but may when using subtractive rules.

When manual proxy settings are specified from the command line, the
`--proxy-bypass-list="RULES"` switch can be used, where `RULES` is a semicolon
or comma separated list of bypass rules.

Following are the string constructions for the bypass rules that Chrome
supports. They can be used when defining Chrome's manual proxy settings from
command line flags, extensions, or policy.

When using system proxy settings, one should use the platform's rule format and
not Chrome's.

### Bypass rule: Hostname

```
[ URL_SCHEME "://" ] HOSTNAME_PATTERN [ ":" PORT ]
```

Matches a hostname using a wildcard pattern, and an optional scheme and port
restriction.

Examples:

* `foobar.com` - Matches URL of any scheme and port, whose normalized host is
  `foobar.com`
* `*foobar.com` - Matches URL of any scheme and port, whose normalized host
  ends with `foobar.com` (for instance `blahfoobar.com` and `foo.foobar.com`).
* `*.org:443` - Matches URLs of any scheme, using port 443 and whose top level
  domain is `.org`
* `https://x.*.y.com:99` - Matches https:// URLs on port 99 whose normalized
  hostname matches `x.*.y.com`

### Bypass rule: Subdomain

```
[ URL_SCHEME "://" ] "." HOSTNAME_SUFFIX_PATTERN [ ":" PORT ]
```

Hostname patterns that start with a dot are special cased to mean a subdomain
matches. `.foo.com` is effectively another way of writing `*.foo.com`.

Examples:

* `.google.com` - Matches `calendar.google.com` and `foo.bar.google.com`, but
  not `google.com`.
* `http://.google.com` - Matches only http:// URLs that are a subdomain of `google.com`.
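
To make the hostname and subdomain rule semantics concrete, here is a minimal
matcher sketch. It is an approximation for illustration only: the function
names are hypothetical, the caller is assumed to pass an already normalized
(lower-case) scheme and host plus a numeric port, and IP literals and address
ranges are deliberately not handled.

```javascript
// Converts a wildcard pattern ("*" matches any run of characters) to a RegExp.
function wildcardToRegExp(pattern) {
  const escaped = pattern
    .split("*")
    .map((s) => s.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp("^" + escaped + "$", "i");
}

// Hypothetical matcher for hostname and subdomain bypass rules.
function matchesHostnameRule(rule, scheme, host, port) {
  let pattern = rule;

  // Optional scheme restriction.
  const schemeEnd = pattern.indexOf("://");
  if (schemeEnd >= 0) {
    if (pattern.slice(0, schemeEnd).toLowerCase() !== scheme) return false;
    pattern = pattern.slice(schemeEnd + 3);
  }

  // Optional port restriction.
  const portMatch = pattern.match(/:(\d+)$/);
  if (portMatch) {
    if (Number(portMatch[1]) !== port) return false;
    pattern = pattern.slice(0, portMatch.index);
  }

  // A leading dot is shorthand for a subdomain match:
  // ".foo.com" is treated as "*.foo.com".
  if (pattern.startsWith(".")) pattern = "*" + pattern;

  return wildcardToRegExp(pattern).test(host);
}
```

For instance, `matchesHostnameRule(".google.com", "https", "google.com", 443)`
is `false` in this sketch, because the subdomain form requires at least one
label in front of `google.com`.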

### Bypass rule: IP literal

```
[ URL_SCHEME "://" ] IP_LITERAL [ ":" PORT ]
```

Matches URLs whose host is an IP address literal, with optional scheme and
port restrictions. This is a special case of hostname matching that takes into
account IP literal canonicalization. For example the rules `[0:0:0::1]` and
`[::1]` are equivalent (both represent the same IPv6 address).

Examples:

* `127.0.0.1`
* `http://127.0.0.1`
* `[::1]` - Matches any URL to the IPv6 loopback address.
* `[0:0::1]` - Same as above
* `http://[::1]:99` - Matches any http:// URL to the IPv6 loopback on port 99

### Bypass rule: IPv4 address range

```
IPV4_LITERAL "/" PREFIX_LENGTH_IN_BITS
```

Matches any URL whose hostname is an IPv4 literal that falls within the given
address range.

Note this [only applies to URLs that are IP
literals](#Meaning-of-IP-address-range-bypass-rules).

Examples:

* `192.168.1.1/16`
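
As a rough illustration of what an IPv4 range rule tests, the sketch below
compares the leading prefix bits of the rule's address and of the URL's host.
The function names are hypothetical and this is not Chrome's code. Note that a
host which is not an IPv4 literal (such as a plain hostname) never matches.

```javascript
// Parses a dotted-quad IPv4 literal into an unsigned 32-bit integer, or
// returns null if the string is not an IPv4 literal (e.g. a hostname).
function parseIPv4(text) {
  const parts = text.split(".");
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part) || Number(part) > 255) return null;
    value = value * 256 + Number(part);
  }
  return value;
}

// Hypothetical matcher for a rule of the form IPV4_LITERAL "/" PREFIX_LENGTH.
function matchesIPv4Range(rule, host) {
  const [base, prefixText] = rule.split("/");
  const baseAddr = parseIPv4(base);
  const hostAddr = parseIPv4(host);
  const prefix = Number(prefixText);
  if (baseAddr === null || hostAddr === null) return false;
  if (!Number.isInteger(prefix) || prefix < 0 || prefix > 32) return false;
  // Compare only the leading `prefix` bits by discarding the low-order bits.
  const divisor = 2 ** (32 - prefix);
  return Math.floor(baseAddr / divisor) === Math.floor(hostAddr / divisor);
}
```

With the rule `192.168.1.1/16`, only the first 16 bits (`192.168`) are
compared, so the trailing `1.1` in the rule has no effect on the result.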

### Bypass rule: IPv6 address range

```
IPV6_LITERAL "/" PREFIX_LENGTH_IN_BITS
```

Matches any URL whose hostname is an IPv6 literal that falls within the given
range. Note that IPv6 literals must *not* be bracketed.

Note this [only applies to URLs that are IP
literals](#Meaning-of-IP-address-range-bypass-rules).

Examples:

* `fefe:13::abc/33`
* `[fefe::]/40` -- WRONG! IPv6 literals must not be bracketed.

### Bypass rule: Simple hostnames

```
<local>
```

Matches hostnames without a period in them, and that are not IP literals. This
is a naive string search -- meaning that periods appearing *anywhere* count
(including trailing dots!).

This rule corresponds to the "Exclude simple hostnames" checkbox on macOS and
the "Don't use proxy server for local (intranet) addresses" on Windows.

The rule name comes from WinInet, and can easily be confused with the concept
of localhost. However the two concepts are completely orthogonal. In practice
one wouldn't add rules to bypass localhost, as it is [already done
implicitly](#Implicit-bypass-rules).

### Bypass rule: Subtract implicit rules

```
<-loopback>
```

*Subtracts* the [implicit proxy bypass rules](#Implicit-bypass-rules)
(localhost and link local addresses). This is generally only needed for test
setups. Beware of the security implications of proxying localhost.

Whereas regular bypass rules instruct the browser about URLs that should *not*
use the proxy, this rule has the opposite effect and tells the browser to
instead *use* the proxy.

Ordering may matter when using a subtractive rule, as rules will be evaluated
in a left-to-right order. `<-loopback>;127.0.0.1` has a subtly different effect
than `127.0.0.1;<-loopback>`.

### Meaning of IP address range bypass rules

The IP address range bypass rules in manual proxy settings apply only to URLs
whose host is an IP literal. This is not what one would intuitively expect.

Example:

Say we have configured a proxy for all requests, but added a bypass rule
for `192.168.0.0/16`. If we now navigate to `http://foo` (which resolves
to `192.168.1.5` in our setup), will the browser connect directly (bypass the
proxy) because we have indicated a bypass rule that includes this IP?

No. It will go through the proxy.

The bypass rule in this case is not applicable, since the browser never
actually does a name resolution for `foo`. Proxy resolution happens before
name resolution, and depending on what proxy scheme is subsequently chosen,
client side name resolution may never be performed.

The usefulness of IP range proxy bypass rules is rather limited, as they only
apply to requests whose URL was explicitly an IP literal.

If proxy decisions need to be made based on the resolved IP address(es) of a
URL's hostname, one must use a PAC script.
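
A PAC script along those lines might look like the sketch below. It relies on
`dnsResolve()` and `isInNet()`, which are helper functions supplied by the PAC
environment rather than defined by the script. The proxy `foo:8080` and the
`192.168.0.0/16` range are placeholders, and treating a falsy return from
`dnsResolve()` as a resolution failure is an assumption.

```javascript
function FindProxyForURL(url, host) {
  // Resolve the hostname client side so the decision can depend on its IP.
  var ip = dnsResolve(host);

  // Connect directly when the host resolves into 192.168.0.0/16.
  if (ip && isInNet(ip, "192.168.0.0", "255.255.0.0")) {
    return "DIRECT";
  }

  // Everything else goes through the (placeholder) proxy.
  return "PROXY foo:8080";
}
```

Unlike a bypass rule, this forces a client side name resolution for every
request that reaches the script.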

## Implicit bypass rules

Requests to certain hosts will not be sent through a proxy, and will instead be
sent directly.

We call these the _implicit bypass rules_. The implicit bypass rules match URLs
whose host portion is either a localhost name or a link-local IP literal.
Essentially it matches:

```
localhost
*.localhost
[::1]
127.0.0.1/8
169.254/16
[FE80::]/10
```

The complete rules are slightly more complicated. For instance on
Windows we will also recognize `loopback`.
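
The essential matches listed above can be approximated with a short
predicate. This is only a sketch: the function name is hypothetical, the host
is assumed to be unbracketed, lower-case and in canonical form, and the
platform-specific cases mentioned above (such as `loopback` on Windows) are
not covered.

```javascript
// Hypothetical approximation of the implicit bypass rules.
// `host` is the unbracketed, lower-cased host portion of the URL.
function matchesImplicitRules(host) {
  // Localhost names.
  if (host === "localhost" || host.endsWith(".localhost")) return true;

  // IPv6 loopback.
  if (host === "::1") return true;

  // IPv4 literals: 127.0.0.0/8 (loopback) and 169.254.0.0/16 (link-local).
  const v4 = host.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
  if (v4) {
    const a = Number(v4[1]);
    const b = Number(v4[2]);
    return a === 127 || (a === 169 && b === 254);
  }

  // IPv6 literals in fe80::/10: the first 16-bit group is fe80 through febf.
  if (host.includes(":")) {
    const firstGroup = parseInt(host.split(":")[0], 16);
    return firstGroup >= 0xfe80 && firstGroup <= 0xfebf;
  }

  return false;
}
```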

This concept of implicit proxy bypass rules is consistent with the
platform-level proxy support on Windows and macOS (albeit with some differences
due to their implementation quirks - see compatibility notes in
`net::ProxyBypassRules::MatchesImplicitRules`).

Why apply implicit proxy bypass rules in the first place? Certainly there are
considerations around ergonomics and user expectation, but the bigger problem
is security. Since the web platform treats `localhost` as a secure origin, the
ability to proxy it grants extra powers. This is [especially
problematic](https://bugs.chromium.org/p/chromium/issues/detail?id=899126) when
proxy settings are externally controllable, as when using PAC scripts.

Historical support in Chrome:

* Prior to M71 there were no implicit proxy bypass rules, except if using
  [`--winhttp-proxy-resolver`](#winhttp_proxy_resolver-command-line-switch).
* In M71 Chrome applied implicit proxy bypass rules to PAC scripts.
* In M72 Chrome generalized the implicit proxy bypass rules to manually
  configured proxies.

### Overriding the implicit bypass rules

If you want traffic to `localhost` to be sent through a proxy despite the
security concerns, it can be done by adding the special proxy bypass rule
`<-loopback>`. This has the effect of _subtracting_ the implicit rules.

For instance, launch Chrome with the command line flag:

```
--proxy-bypass-list="<-loopback>"
```

Note that there currently is no mechanism to disable the implicit proxy bypass
rules when using a PAC script. Proxy bypass lists only apply to manual
settings, so the technique above cannot be used to let PAC scripts decide the
proxy for localhost URLs.

## Evaluating proxy lists (proxy fallback)

Proxy resolution results in a _list_ of [proxy server
identifiers](#Proxy-server-identifiers) to use for a
given request, not just a single proxy server identifier.

For instance, consider this PAC script:

```
function FindProxyForURL(url, host) {
    if (host == "www.example.com") {
        return "PROXY proxy1; HTTPS proxy2; SOCKS5 proxy3";
    }
    return "DIRECT";
}
```

What proxy will Chrome use for connections to `www.example.com`, given that
we have a choice of three separate proxy server identifiers
{`http://proxy1:80`, `https://proxy2:443`, `socks5://proxy3:1080`}?

Initially, Chrome will try the proxies in order. This means first attempting
the request through `http://proxy1:80`. If that "fails", the request is
next attempted through `https://proxy2:443`. Lastly if that fails, the
request is attempted through `socks5://proxy3:1080`.

This process is referred to as _proxy fallback_. What constitutes a
"failure" is described later.

Proxy fallback is stateful. The actual order of proxy attempts made by Chrome
is influenced by the past responsiveness of proxy servers.

Let's say we request `http://www.example.com/`. Per the PAC script this
resolves to a list of three proxy server identifiers:

{`http://proxy1:80`, `https://proxy2:443`, `socks5://proxy3:1080`}

Chrome will first attempt to issue the request through these proxies in the
left-to-right order.

Let's say that the attempt through `http://proxy1:80` fails, but then the
attempt through `https://proxy2:443` succeeds. Chrome will mark
`http://proxy1:80` as _bad_ for the next 5 minutes. Being marked as _bad_
means that `http://proxy1:80` is de-prioritized with respect to
other proxy server identifiers (including `direct://`) that are not marked as
bad.

That means the next time `http://www.example.com/` is requested, the effective
order for proxies to attempt will be:

{`https://proxy2:443`, `socks5://proxy3:1080`, `http://proxy1:80`}

Conceptually, _bad_ proxies are moved to the end of the list, rather than being
removed from consideration altogether.
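
The reordering described above can be modeled with a small tracker. This is a
conceptual sketch only: the class and method names are hypothetical, time is
passed in explicitly to keep the example deterministic, and Chrome's internal
bookkeeping may differ.

```javascript
// A failed proxy is marked as bad for 5 minutes.
const BAD_PROXY_DURATION_MS = 5 * 60 * 1000;

// Hypothetical tracker for bad proxies.
class BadProxyTracker {
  constructor() {
    this.badUntil = new Map(); // proxy identifier -> expiry timestamp (ms)
  }

  markBad(proxy, nowMs) {
    this.badUntil.set(proxy, nowMs + BAD_PROXY_DURATION_MS);
  }

  isBad(proxy, nowMs) {
    const expiry = this.badUntil.get(proxy);
    return expiry !== undefined && nowMs < expiry;
  }

  // Returns the resolved list with bad proxies moved to the end, preserving
  // the relative order within the good and the bad group.
  reorder(proxies, nowMs) {
    const good = proxies.filter((p) => !this.isBad(p, nowMs));
    const bad = proxies.filter((p) => this.isBad(p, nowMs));
    return good.concat(bad);
  }
}
```

Once the 5 minutes have elapsed, `reorder()` returns the list in its original
order again, since the entry is no longer considered bad.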

What constitutes a "failure" when it comes to triggering proxy fallback depends
on the proxy type. Generally speaking, only connection level failures
are deemed eligible for proxy fallback. This includes:

* Failure resolving the proxy server's DNS
* Failure connecting a TCP socket to the proxy server

(There are some caveats for how HTTPS and QUIC proxies count failures for
fallback)

Prior to M67, Chrome would consider failures establishing a
CONNECT tunnel as an error eligible for proxy fallback. This policy [resulted
in problems](https://bugs.chromium.org/p/chromium/issues/detail?id=680837) for
deployments whose HTTP proxies intentionally failed certain https:// requests,
since that necessitates inducing a failure during the CONNECT tunnel
establishment. The problem would occur when a working proxy fallback option
like DIRECT was given, since the failing proxy would then be marked as bad.

Currently there are no options to configure proxy fallback (including disabling
the caching of bad proxies). Future versions of Chrome may [remove caching
of bad proxies](https://bugs.chromium.org/p/chromium/issues/detail?id=936130)
to make fallback predictable.

To investigate issues relating to proxy fallback, one can [collect a NetLog
dump using
chrome://net-export/](https://dev.chromium.org/for-testers/providing-network-details).
These logs can then be loaded with the [NetLog
viewer](https://netlog-viewer.appspot.com/).

There are a few things of interest in the logs:

* The "Proxy" tab will show which proxies (if any) were marked as bad at the
  time the capture ended.
* The "Events" tab notes what the resolved proxy list was, and what the
  re-ordered proxy list was after taking into account bad proxies.
* The "Events" tab notes when a proxy is marked as bad and why (provided the
  event occurred while capturing was enabled).

When debugging issues with bad proxies, it is also useful to reset Chrome's
cache of bad proxies. This can be done by clicking the "Clear bad proxies"
button on
[chrome://net-internals/#proxy](chrome://net-internals/#proxy). Note the UI
will not give feedback that the bad proxies were cleared, however capturing a
new NetLog dump can confirm it was cleared.

## Arguments passed to FindProxyForURL() in PAC scripts

PAC scripts in Chrome are expected to define a JavaScript function
`FindProxyForURL`.

The historical signature for this function is:

```
function FindProxyForURL(url, host) {
  ...
}
```

Scripts can expect to be called with string arguments `url` and `host` such
that:

* `url` is a *sanitized* version of the request's URL
* `host` is the unbracketed host portion of the origin.

Sanitization of the URL means that the path, query, fragment, and identity
portions of the URL are stripped. Effectively `url` will be
limited to a `scheme://host:port/` style URL.

Examples of how `FindProxyForURL()` will be called:

```
// Actual URL:   https://www.google.com/Foo
FindProxyForURL('https://www.google.com/', 'www.google.com')

// Actual URL:   https://[dead::beef]/foo?bar
FindProxyForURL('https://[dead::beef]/', 'dead::beef')

// Actual URL:   https://www.example.com:8080#search
FindProxyForURL('https://www.example.com:8080/', 'www.example.com')

// Actual URL:   https://username:password@www.example.com
FindProxyForURL('https://www.example.com/', 'www.example.com')
```

Stripping the path and query from the `url` is a departure from the original
Netscape implementation of PAC. It was introduced in Chrome 52 for [security
reasons](https://bugs.chromium.org/p/chromium/issues/detail?id=593759).

There is currently no option to turn off sanitization of URLs passed to PAC
scripts (the option was removed in Chrome 75).

The sanitization of http:// URLs currently has a different policy, and does not
strip query and path portions of the URL. That said, users are advised not to
depend on reading the query/path portion of any URL
type, since future versions of Chrome may [deprecate that
capability](https://bugs.chromium.org/p/chromium/issues/detail?id=882536) in
favor of a consistent policy.

## Resolving client's IP address within a PAC script using myIpAddress()

PAC scripts can invoke `myIpAddress()` to obtain the client's IP address. This
function returns a single IP literal, or `"127.0.0.1"` on failure.

This API is [inherently ambiguous when used on multi-homed
hosts](#myIpAddress_myIpAddressEx_and-multi_homed-hosts), as such hosts can
have multiple IP addresses and yet the browser can pick just one to return.

Chrome's algorithm for `myIpAddress()` favors returning the IP that would be
used if we were to connect to the public internet, by executing the following
ordered steps and short-circuiting once the first candidate IP is found:

1. Select the IP of an interface that can route to public Internet:
    * Probe for route to `8.8.8.8`.
    * Probe for route to `2001:4860:4860::8888`.
2. Select an IP by doing a DNS resolve of the machine's hostname:
    * Select the first IPv4 result if there is one.
    * Select the first IP result if there is one.
3. Select the IP of an interface that can route to private IP space:
    * Probe for route to `10.0.0.0`.
    * Probe for route to `172.16.0.0`.
    * Probe for route to `192.168.0.0`.
    * Probe for route to `FC00::`.

Note that when searching for candidate IP addresses, link-local and loopback
addresses are skipped over. A link-local or loopback address is returned only
as a last resort, when no other IP address was found by following these steps.

This sequence of steps explicitly favors IPv4 over IPv6 results, to match
Internet Explorer's IPv6 support.
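
The ordering and short-circuiting can be modeled as a pure function over the
candidates that each step would find. This is a simplified model and not
Chrome's implementation: the function names and the `candidates` shape are
hypothetical, the actual route probing and DNS lookup are left out, and IP
literals are assumed to be lower-case and in canonical form.

```javascript
// Simplified check for loopback and link-local IP literals.
function isLoopbackOrLinkLocal(ip) {
  return (
    ip === "::1" ||
    ip.startsWith("127.") ||
    ip.startsWith("169.254.") ||
    /^fe[89ab][0-9a-f]:/.test(ip) // fe80::/10
  );
}

// Hypothetical model of the selection order. `candidates` holds the IPs found
// by each step, already in probe order:
//   { publicRoute: [...], hostname: [...], privateRoute: [...] }
function selectMyIpAddress(candidates) {
  // Step 2 prefers IPv4 results over other results.
  const hostnameOrdered = [
    ...candidates.hostname.filter((ip) => !ip.includes(":")),
    ...candidates.hostname.filter((ip) => ip.includes(":")),
  ];
  const all = [candidates.publicRoute, hostnameOrdered, candidates.privateRoute].flat();

  // Short-circuit on the first candidate that is not loopback or link-local.
  const preferred = all.find((ip) => !isLoopbackOrLinkLocal(ip));
  if (preferred !== undefined) return preferred;

  // Last resort: a link-local or loopback candidate, else the failure value.
  return all.length > 0 ? all[0] : "127.0.0.1";
}
```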
749
750*Historical note*: Prior to M72, Chrome's implementation of `myIpAddress()` was
751effectively just `getaddrinfo(gethostname)`. This is now step 2 of the heuristic.

## Resolving client's IP address within a PAC script using myIpAddressEx()

Chrome supports the [Microsoft PAC
extension](https://docs.microsoft.com/en-us/windows/desktop/winhttp/myipaddressex)
`myIpAddressEx()`.

This is like `myIpAddress()`, but instead of returning a single IP address, it
can return multiple IP addresses. It returns a string containing a
semicolon-separated list of addresses. On failure it returns an empty string to
indicate no results (whereas `myIpAddress()` returns `127.0.0.1`).

There are some differences with Chrome's implementation:

* In Chrome the function is unconditionally defined, whereas in Internet
  Explorer one must have used the `FindProxyForURLEx` entrypoint.
* Chrome [does not necessarily enumerate all of the host's network
  interfaces](#myIpAddress_myIpAddressEx_and-multi_homed-hosts).
* Chrome does not return link-local or loopback addresses (except if no other
  addresses were found).

The algorithm that Chrome uses is nearly identical to that of `myIpAddress()`
described earlier, but in certain cases may return multiple IPs.

1. Select all the IPs of interfaces that can route to the public Internet:
    * Probe for route to `8.8.8.8`.
    * Probe for route to `2001:4860:4860::8888`.
    * If any IPs were found, return them, and finish.
2. Select an IP by doing a DNS resolve of the machine's hostname:
    * If any IPs were found, return them, and finish.
3. Select the IP of an interface that can route to private IP space:
    * Probe for route to `10.0.0.0`.
    * Probe for route to `172.16.0.0`.
    * Probe for route to `192.168.0.0`.
    * Probe for route to `FC00::`.
    * If any IPs were found, return them, and finish.

Note that short-circuiting happens whenever steps 1-3 find a candidate IP. So
for example if at least one IP address was discovered by checking routes to
the public Internet, only those IPs will be returned, and steps 2-3 will not
run.
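A PAC script might consume the semicolon-separated result along the following lines. This is a minimal sketch: the helper `clientAddresses()`, the proxy `proxy.example.com:8080`, and the `10.` prefix check are illustrative assumptions, not part of any real deployment.

```javascript
// Split the semicolon-separated result into an array.
// An empty string means no addresses were found.
function clientAddresses() {
  const raw = myIpAddressEx();  // provided by the PAC environment
  if (raw === "") return [];
  return raw.split(";").map((ip) => ip.trim());
}

function FindProxyForURL(url, host) {
  // Hypothetical policy: use the proxy only when the client has a 10.x.x.x
  // address, and connect directly otherwise.
  const onCorpNetwork = clientAddresses().some((ip) => ip.startsWith("10."));
  return onCorpNetwork ? "PROXY proxy.example.com:8080" : "DIRECT";
}
```

Because Chrome restricts which addresses are returned, a check like this can miss interfaces that the host actually has.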

## myIpAddress() / myIpAddressEx() and multi-homed hosts

`myIpAddress()` is a poor API for hosts that have multiple IP addresses, as it
can only return a single IP, which may or may not be the one you wanted. Both
`myIpAddress()` and `myIpAddressEx()` favor returning the IP for the interface
that would be used to route to the public internet.

As an API, `myIpAddressEx()` offers more flexibility since it can return
multiple IP addresses. However Chrome's implementation restricts which IPs a
PAC script can see [due to privacy
concerns](https://bugs.chromium.org/p/chromium/issues/detail?id=905366). So
using `myIpAddressEx()` is not as powerful as enumerating all the host's IPs,
and may not address all use-cases.

A more reliable strategy for PAC scripts to check which network(s) a user is on
is to probe test domains using `dnsResolve()` / `dnsResolveEx()`.

Moreover, note that Chrome does not support the Firefox-specific
`pacUseMultihomedDNS` option, so adding that global to a PAC script has no
special side-effect in Chrome. In Firefox, by contrast, it reconfigures
`myIpAddress()` to be dependent on the target URL that `FindProxyForURL()` was
called with.
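The probing strategy mentioned above could look roughly like this in a PAC script. The probe hostname `probe.corp.example.com` and the proxy `proxy.corp.example.com:8080` are hypothetical, and the sketch assumes the probe name resolves only from inside the network being detected. It treats any falsy return value from `dnsResolve()` as a failed lookup.

```javascript
function FindProxyForURL(url, host) {
  // `dnsResolve()` is provided by the PAC environment. The probe name is
  // assumed (for this sketch) to resolve only on the internal network.
  const probe = dnsResolve("probe.corp.example.com");
  if (probe) {
    return "PROXY proxy.corp.example.com:8080";
  }
  return "DIRECT";
}
```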

## Android quirks

Proxy resolving via PAC works differently on Android than on desktop Chrome
platforms:

* Android Chrome uses the same Chromium PAC resolver, but does not run it
  out-of-process as Desktop Chrome does. This architectural difference is
  due to the higher process cost on Android, and means Android Chrome is more
  susceptible to malicious PAC scripts. The other consequence is that Android
  Chrome can have distinct regressions from Desktop Chrome as the service setup
  is quite different (and most `browser_tests` are not run on Android either).

* [WebView does not use Chrome's PAC
  resolver](https://bugs.chromium.org/p/chromium/issues/detail?id=989667).
  Instead Android WebView uses the Android system's PAC resolver, which is less
  optimized and uses an old build of V8. When the system is configured to use
  PAC, Android WebView's net code will see the proxy settings as being a
  single HTTP proxy on `localhost`. The system localhost proxy will in turn
  evaluate the PAC script and forward the HTTP request on to the resolved
  proxy. This translation affects a number of things, including which proxy
  schemes are supported, the maximum connection limits, how proxy fallback
  works, and overall performance (the current Android PAC evaluator blocks on
  DNS).

* Android system log messages for `PacProcessor` are not related to Chrome or
  its PAC evaluator. Rather, these are log messages generated by the Android
  system's PAC implementation. This confusion can arise when users add
  `alert()` to debug PAC script logic, and then refer to output in `logcat` to
  try and diagnose a resolving issue in Android Chrome.

## Downloading PAC scripts

When a network context is configured to use a PAC script, proxy resolution will
stall while downloading the PAC script.

Fetches for PAC URLs are initiated by the network stack, and behave differently
from ordinary web visible requests:

* Must complete within 30 seconds.
* Must complete with an HTTP response code of exactly 200.
* Must have an uncompressed body smaller than 1 MB.
* Do not follow ordinary HTTP caching semantics.
* Are never fetched through a proxy.
* Are not visible to the WebRequest extension API, or to service workers.
* Do not support HTTP authentication (ambient authentication may work, but
  cannot prompt UI for credentials).
* Do not support client certificates (including `AutoSelectCertificateForUrls`).
* Do not support auxiliary certificate network fetches (will only use cached
  OCSP, AIA, and CRL responses during certificate verification).

### Caching of successful PAC fetches

PAC URLs are always fetched from the network, and never from the HTTP cache.
After a PAC URL is successfully fetched, its contents (which are used to create
a long-lived JavaScript context) will be assumed to be fresh until any of the
following occurs:

* The network changes (IP address changes, DNS configuration changes)
* The response becomes older than 12 hours
* A user explicitly invalidates PAC through `chrome://net-internals#proxy`

Once considered stale, the PAC URL will be re-fetched the next time proxy
resolution is requested.
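The freshness rule above can be summarized as a small predicate. This is an illustrative sketch only: the `state` record and its field names are hypothetical and do not correspond to Chrome's internal data structures.

```javascript
const MAX_PAC_AGE_MS = 12 * 60 * 60 * 1000;  // 12 hours

// `state` is a hypothetical record describing the last successful fetch:
//   fetchedAtMs              - timestamp of the fetch, in milliseconds
//   networkChangedSinceFetch - true if the IP address or DNS config changed
//   userInvalidated          - true if invalidated via chrome://net-internals#proxy
function isPacFresh(state, nowMs) {
  if (state.networkChangedSinceFetch) return false;
  if (state.userInvalidated) return false;
  // Stale once the response is older than 12 hours.
  return nowMs - state.fetchedAtMs <= MAX_PAC_AGE_MS;
}
```

A stale result does not trigger an immediate download; per the text above, the re-fetch happens on the next proxy resolution request.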

### Fallback for failed PAC fetches

When the proxy settings are configured to use a PAC URL, and that PAC URL
cannot be fetched, proxy resolution will fall back to the next option, which is
often `DIRECT`:

* If using system proxy settings, and the platform supports fallback to manual
  proxy settings (e.g. Windows), the specified manual proxy servers will be
  used after the PAC fetch fails.
* If using Chrome's proxy settings, and the PAC script was marked as
  [mandatory](https://developer.chrome.com/extensions/proxy), fallback to
  `DIRECT` is not permitted. Subsequent network requests will fail proxy
  resolution and complete with `ERR_MANDATORY_PROXY_CONFIGURATION_FAILED`.
* Otherwise proxy resolution will silently fall back to `DIRECT`.
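The three cases above form a simple decision tree, sketched here under stated assumptions. The `config` object, its field names, and the return shape are hypothetical. Platform support for manual fallback is modeled simply as a non-empty `manualProxies` list.

```javascript
// `config` is a hypothetical description of the active proxy settings:
//   source        - "system" or "chrome"
//   manualProxies - manual proxy servers available as fallback (may be empty)
//   pacMandatory  - true if the PAC script was marked as mandatory
function resolveAfterPacFetchFailure(config) {
  const manual = config.manualProxies || [];
  if (config.source === "system" && manual.length > 0) {
    // Platforms such as Windows can fall back to manual proxy settings.
    return { ok: true, proxies: manual };
  }
  if (config.source === "chrome" && config.pacMandatory) {
    // Fallback to DIRECT is not permitted for a mandatory PAC script.
    return { ok: false, error: "ERR_MANDATORY_PROXY_CONFIGURATION_FAILED" };
  }
  return { ok: true, proxies: ["DIRECT"] };
}
```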

### Recovering from failed PAC fetches

When fetching an explicitly configured PAC URL fails, the browser will try to
re-fetch it:

* In exactly 8 seconds
* 32 seconds after that
* 2 minutes after that
* Every 4 hours thereafter

This background polling of the PAC URL is only initiated in response to an
incoming proxy resolution request, so it will not trigger work when the browser
is otherwise idle.

Similarly to successful fetches, the PAC URL will also be re-fetched
whenever the network changes, the proxy settings change, or it was manually
invalidated via `chrome://net-internals#proxy`.
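The retry schedule above can be written as a lookup from the number of consecutive failures to the next delay. The function name `nextRetryDelayMs` and its parameter are hypothetical; only the delay values come from the list above.

```javascript
const RETRY_DELAYS_MS = [
  8 * 1000,       // first retry: 8 seconds
  32 * 1000,      // second retry: 32 seconds after that
  2 * 60 * 1000,  // third retry: 2 minutes after that
];
const STEADY_STATE_DELAY_MS = 4 * 60 * 60 * 1000;  // every 4 hours thereafter

// `failures` is the number of consecutive failed fetches so far (>= 1).
function nextRetryDelayMs(failures) {
  const index = failures - 1;
  return index < RETRY_DELAYS_MS.length
    ? RETRY_DELAYS_MS[index]
    : STEADY_STATE_DELAY_MS;
}
```

This sketch models only the timing. As noted above, the polling itself is triggered by incoming proxy resolution requests.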

### Text encoding

Note that UTF-8 is *not* the default interpretation of PAC response bodies.

The priority for encoding is determined in this order:

1. The `charset` property of the HTTP response's `Content-Type`
2. Any BOM at the start of the response body
3. Otherwise defaults to ISO-8859-1.

When setting the `Content-Type`, servers should prefer using a MIME type of
`application/x-ns-proxy-autoconfig` or `application/x-javascript-config`.
However in practice, Chrome does not enforce the MIME type.
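The priority order above can be illustrated with a small function. This is a sketch rather than Chrome's decoder: the function name `pacCharset` is hypothetical, and the set of byte order marks checked (UTF-8, UTF-16LE, UTF-16BE) is an assumption.

```javascript
// Returns the name of the encoding that would be chosen for a PAC response.
// `contentType` is the Content-Type header value (or null); `body` is a
// Uint8Array holding the raw response bytes.
function pacCharset(contentType, body) {
  // 1. An explicit charset in the Content-Type header takes priority.
  const match = /;\s*charset\s*=\s*"?([^";\s]+)"?/i.exec(contentType || "");
  if (match) return match[1].toLowerCase();

  // 2. Otherwise look for a byte order mark at the start of the body.
  if (body.length >= 3 && body[0] === 0xef && body[1] === 0xbb && body[2] === 0xbf) {
    return "utf-8";
  }
  if (body.length >= 2 && body[0] === 0xff && body[1] === 0xfe) return "utf-16le";
  if (body.length >= 2 && body[0] === 0xfe && body[1] === 0xff) return "utf-16be";

  // 3. No charset and no BOM: ISO-8859-1, not UTF-8.
  return "iso-8859-1";
}
```

A server that sends UTF-8 PAC scripts would therefore want to declare it explicitly, for example with `Content-Type: application/x-ns-proxy-autoconfig; charset=utf-8`.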

## Capturing a Net Log for debugging proxy resolution issues

Issues in proxy resolution are best investigated using a Net Log.

A good starting point is to follow the [general instructions for
net-export](https://www.chromium.org/for-testers/providing-network-details),
*and while the Net Log is being captured perform these steps*:

1. Reproduce the failure (ex: load a URL that fails).
2. If you can reproduce a success, do so (ex: load a different URL that succeeds).
3. In a new tab, navigate to `chrome://net-internals/#proxy` and click both
   buttons ("Re-apply settings" and "Clear bad proxies").
4. Repeat step (1).
5. Stop the Net Log and save the file.

The resulting Net Log should have enough information to diagnose common
problems. It can be attached to a bug report, or explored using the [Net Log
Viewer](https://netlog-viewer.appspot.com/). See the next section for some tips
on analyzing it.

## Analyzing Net Logs for proxy issues

Load saved Net Logs using [Net Log Viewer](https://netlog-viewer.appspot.com/).

### Proxy overview tab

Start by getting a big-picture view of the proxy settings by clicking the
"Proxy" tab on the left. This summarizes the proxy settings at the time the
_capture ended_.

* Do the _original_ proxy settings match expectations?
  The proxy settings might be coming from:
  * Managed Chrome policy (chrome://policy)
  * Command line flags (ex: `--proxy-server`)
  * (per-profile) Chrome extensions (ex: [chrome.proxy](https://developer.chrome.com/extensions/proxy))
  * (per-network) System proxy settings

* Was [proxy autodetect (WPAD)](#Web-Proxy-Auto_Discovery-WPAD) specified? In
  this case the final URL probed will be reflected by the difference between
  the "Effective" and "Original" settings.

* Internally, proxy settings are per-NetworkContext. The proxy
  overview tab shows settings for a *particular* NetworkContext, namely the
  one associated with the Profile used to navigate to `chrome://net-export`. For
  instance if the net-export was initiated from an Incognito window, it may
  show different proxy settings here than a net-export capture initiated by a
  non-Incognito window. When the net-export was triggered from command line
  (`--log-net-log`) no particular NetworkContext is associated with the
  capture and hence no proxy settings will be shown in this overview.

* Were any proxies marked as bad?

### Import tab

Skim through the Import tab and look for relevant command line flags and active
field trials. A find-in-page for `proxy` is a good starting point. Be on the lookout for
[`--winhttp-proxy-resolver`](#winhttp_proxy_resolver-command-line-switch) which
has [known problems](https://bugs.chromium.org/p/chromium/issues/detail?id=644030).

### Events tab

To deep dive into proxy resolution, switch to the Events tab.

You can start by filtering on `type:URL_REQUEST` to see all the top level
requests, and then keep clicking through the dependency links to
trace the proxy resolution steps and outcome.

The most relevant events have either `PROXY_`, `PAC_`, or
`WPAD_` in their names. You can also try filtering for each of those.

Documentation on specific events is available in
[net_log_event_type_list.h](https://chromium.googlesource.com/chromium/src/+/HEAD/net/log/net_log_event_type_list.h).

Network change events can also be key to understanding proxy issues. After
switching networks (ex: VPN), the effective proxy settings, as well as the
content of any PAC scripts/auto-detect, can change.

## Web Proxy Auto-Discovery (WPAD)

When configured to use WPAD (aka "automatically detect proxy settings"), Chrome
will prioritize:

1. DHCP-based WPAD (option 252)
2. DNS-based WPAD

These are tried in order, however DHCP-based WPAD is only supported for Chrome
on Windows and Chrome on Chrome OS.

WPAD is the system default for many home and Enterprise users.

### Chrome on macOS support for DHCP-based WPAD

Chrome on macOS does not support DHCP-based WPAD when configured to use
"autodetect".

However, macOS might perform DHCP-based WPAD and embed this discovered PAC URL
as part of the system proxy settings. So effectively when Chrome is configured
to "use system proxy settings" it may behave as if it supports DHCP-based WPAD.

### Dangers of DNS-based WPAD and DNS search suffix list

DNS-based WPAD involves probing for the non-FQDN `wpad`. This means
WPAD's performance and security are directly tied to the user's DNS search
suffix list.

When resolving `wpad`, the host's DNS resolver will complete the hostname using
each of the suffixes in the search list:

1. If the suffix list is long this process can be very slow, as it triggers a
   cascade of NXDOMAIN responses.
2. If the suffix list includes domains *outside of the administrative domain*,
   WPAD may select an attacker controlled PAC server, which can subsequently
   funnel the user's traffic through a proxy server of the attacker's choice.
   The evolution of TLDs further increases this risk, since what were
   previously private suffixes used by an enterprise can become publicly
   registerable.
   See also [WPAD Name Collision
   Vulnerability](https://www.us-cert.gov/ncas/alerts/TA16-144A).
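To make the suffix expansion concrete, the following sketch lists the fully qualified names that would be tried for a given search list. The function name `wpadCandidates` and the example suffixes are hypothetical, and real resolver behavior is OS-specific. The sketch only shows how each suffix yields one more `wpad` lookup.

```javascript
// Hypothetical model of how a resolver completes the unqualified name `wpad`.
// Returns one candidate FQDN per entry in the DNS search suffix list.
function wpadCandidates(searchSuffixes) {
  return searchSuffixes.map(
    // Strip stray leading/trailing dots before joining.
    (suffix) => "wpad." + suffix.replace(/^\.+|\.+$/g, "")
  );
}
```

For a search list of `eng.corp.example.com`, `corp.example.com`, and `example.com`, this yields `wpad.eng.corp.example.com`, `wpad.corp.example.com`, and `wpad.example.com`. Every entry is a name that whoever controls that domain could answer for.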

## --winhttp-proxy-resolver command line switch

Passing the `--winhttp-proxy-resolver` command line argument instructs Chrome
to use the system libraries for *one narrow part of proxy resolution*: evaluating
a given PAC script.

Use of this flag is NOT a supported mode, and has [known
problems](https://bugs.chromium.org/p/chromium/issues/detail?id=644030): It
can break Chrome extensions (`chrome.proxy` API) and the interpretation of
Proxy policies, can hurt performance, and doesn't ensure full fidelity
interpretation of system proxy settings.

Another oddity of this switch is that it actually gets interpreted with a
similar meaning on other platforms (macOS), despite its Windows-specific naming.

This flag was historically exposed for debugging, and to mitigate unresolved
policy differences in PAC execution. In the future this switch [will be
removed](https://bugs.chromium.org/p/chromium/issues/detail?id=644030).

Although Chrome would like full fidelity with Windows proxy settings, there are
limits to those integrations. Dependencies like NRPT for proxy
resolution necessitate using Windows proxy resolution libraries directly
instead of Chrome's. We hope these less common use cases will be fully
addressed by [this
feature](https://bugs.chromium.org/p/chromium/issues/detail?id=1032820).