# Chrome's URL library

## Layers

There are several conceptual layers in this directory. Going from the lowest
level up, they are:

### Parsing

The `url_parse.*` files are the parser. This code does no string
transformations. Its only job is to take an input string and split out the
components of the URL as best as it can deduce them, for a given type of URL.
Parsing can never fail; the parser always takes its best guess. This layer does
not have logic for determining the type of URL parsing to apply; that decision
needs to be made at a higher layer (the "util" layer below).

Because the parser code is derived (_very_ distantly) from some code in
Mozilla, some of the parser files are in `url/third_party/mozilla/`.

The main header to include for calling the parser is
`url/third_party/mozilla/url_parse.h`.
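
The parser's contract described above (split only, never transform, never fail)
can be illustrated with a small self-contained sketch. The names `Span`,
`SplitResult`, `SplitScheme`, and `Slice` are hypothetical and exist only for
this example; they are not the declarations in `url_parse.h`, and the real
parser handles far more than the scheme.

```cpp
#include <string_view>

// Hypothetical illustration type: an offset/length pair into the input.
// A length of -1 means "this component is not present".
struct Span {
  int begin = 0;
  int len = -1;
  bool is_valid() const { return len >= 0; }
};

// Hypothetical result of a scheme-only split.
struct SplitResult {
  Span scheme;
  Span rest;
};

// Splits the input at the first ':'. The input is never modified or copied,
// and the function never fails: without a colon, everything becomes `rest`
// and `scheme` stays invalid.
inline SplitResult SplitScheme(std::string_view spec) {
  SplitResult out;
  const auto colon = spec.find(':');
  if (colon == std::string_view::npos) {
    out.rest = Span{0, static_cast<int>(spec.size())};
    return out;
  }
  out.scheme = Span{0, static_cast<int>(colon)};
  out.rest = Span{static_cast<int>(colon) + 1,
                  static_cast<int>(spec.size() - colon - 1)};
  return out;
}

// Returns the text a span refers to, or an empty view for an invalid span.
inline std::string_view Slice(std::string_view spec, Span s) {
  return s.is_valid() ? spec.substr(s.begin, s.len) : std::string_view();
}
```

Note that `Slice(spec, SplitScheme(spec).scheme)` returns the scheme exactly as
it was typed, including its original case. Normalizing it is left to a later
layer, which is the separation this section describes.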

### Canonicalization

The `url_canon*` files are the canonicalizer. This code will transform specific
URL components or specific types of URLs into a standard form. For some
dangerous or invalid data, the canonicalizer will report that a URL is invalid,
although it will always try its best to produce output (so the calling code
can, for example, show the user an error that the URL is invalid). The
canonicalizer attempts to provide as consistent a representation as possible
without changing the meaning of a URL.

The canonicalizer layer is designed to be independent of the string type of
the embedder, so all string output is done through a `CanonOutput` wrapper
object. An implementation for `std::string` output is provided in
`url_canon_stdstring.h`.

The main header to include for calling the canonicalizer is
`url/url_canon.h`.
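
The two ideas above, writing through an output wrapper and reporting
invalidity while still producing output, can be sketched as follows. This is a
minimal illustration under stated assumptions: `OutputSink`, `StringSink`, and
`CanonicalizeSchemeSketch` are hypothetical names, not the real `CanonOutput`
interface, and copying invalid characters through unchanged is an arbitrary
choice made for this sketch.

```cpp
#include <string>
#include <string_view>

// Hypothetical stand-in for an output wrapper. The canonicalizing function
// below only sees this interface, so it does not depend on any string type.
class OutputSink {
 public:
  virtual ~OutputSink() = default;
  virtual void push_back(char c) = 0;
};

// One possible embedder-side implementation, backed by std::string.
class StringSink : public OutputSink {
 public:
  explicit StringSink(std::string* out) : out_(out) {}
  void push_back(char c) override { out_->push_back(c); }

 private:
  std::string* out_;
};

// Lowercases ASCII letters of a scheme and appends ':'. Returns false if the
// scheme is empty or contains a character outside [A-Za-z0-9+.-], but output
// is written either way so the caller still has something to display.
inline bool CanonicalizeSchemeSketch(std::string_view scheme, OutputSink& out) {
  bool ok = !scheme.empty();
  for (char c : scheme) {
    if (c >= 'A' && c <= 'Z') {
      out.push_back(static_cast<char>(c - 'A' + 'a'));
    } else if ((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9') ||
               c == '+' || c == '-' || c == '.') {
      out.push_back(c);
    } else {
      ok = false;
      out.push_back(c);  // Sketch-only choice: copy the bad character as-is.
    }
  }
  out.push_back(':');
  return ok;
}
```

An embedder with its own string class would supply a different `OutputSink`
subclass and reuse the canonicalizing function unchanged.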

### Utility

The `url_util*` files provide a higher-level wrapper around the parser and
canonicalizer. While this layer can be called directly, it is designed to be
the foundation for writing URL wrapper objects (the GURL layer and Blink's KURL
object use the Utility layer to implement the low-level logic).

The Utility code makes decisions about URL types and calls the correct parsing
and canonicalization functions for those types. It provides an interface to
register application-specific schemes that have specific requirements.
Sharing this logic between KURL and GURL is important so that URLs are
handled consistently across the application.

The main header to include is `url/url_util.h`.
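
The dispatch role described above can be sketched with a small scheme registry.
This is only an illustration: `SchemeKind`, `SchemeRegistrySketch`,
`RegisterStandard`, and `Classify` are hypothetical names, the built-in entries
are a small sample chosen for the example, and none of this is the interface
declared in `url_util.h`.

```cpp
#include <string>
#include <string_view>
#include <unordered_map>
#include <utility>

// Hypothetical categories that would each select a different parse and
// canonicalization path.
enum class SchemeKind { kStandard, kFile, kOther };

class SchemeRegistrySketch {
 public:
  SchemeRegistrySketch() {
    // A few built-in entries for the example only.
    kinds_["http"] = SchemeKind::kStandard;
    kinds_["https"] = SchemeKind::kStandard;
    kinds_["file"] = SchemeKind::kFile;
  }

  // Lets an application register its own scheme (expected in lowercase).
  void RegisterStandard(std::string scheme) {
    kinds_[std::move(scheme)] = SchemeKind::kStandard;
  }

  // Decides which kind of handling applies to `spec`, based on the text
  // before the first ':' compared case-insensitively (ASCII only).
  SchemeKind Classify(std::string_view spec) const {
    const auto colon = spec.find(':');
    if (colon == std::string_view::npos) return SchemeKind::kOther;
    std::string scheme(spec.substr(0, colon));
    for (char& c : scheme) {
      if (c >= 'A' && c <= 'Z') c = static_cast<char>(c - 'A' + 'a');
    }
    const auto it = kinds_.find(scheme);
    return it == kinds_.end() ? SchemeKind::kOther : it->second;
  }

 private:
  std::unordered_map<std::string, SchemeKind> kinds_;
};
```

Because every wrapper object would consult the same registry, a scheme
registered once is classified the same way everywhere, which is the consistency
argument made above for sharing this logic between KURL and GURL.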

### Google URL (GURL) and Origin

At the highest layer, a C++ object for representing URLs is provided. This
object uses STL. Most uses need only this layer. Include `url/gurl.h`.

Also at this layer is the Origin object, which exists to make security
decisions on the web. Include `url/origin.h`.

## Historical background

This code was originally a separate library that was designed to be embedded
into both Chrome (which uses STL) and WebKit (which didn't use any STL at the
time). As a result, the parsing, canonicalization, and utility code could
not use STL, or any other common code in Chromium like base.

When WebKit was forked into the Chromium repo and renamed Blink, this
restriction was relaxed somewhat. Blink still provides its own URL object
using its own string type, so the insulation that the Utility layer provides is
still useful. But some STL strings and calls to base functions have gradually
been added in places where doing so is possible.

## Caution about terminology

Due to historical usage, the term "Standard URL" is currently used within the
code to mean "[Special URLs][1]" as defined in the URL Standard, except for
`file:` scheme URLs. This terminology is outdated and can lead to confusion,
particularly now that we are supporting [non-special URLs][2] as well
([crbug/1416006][3]). For the sake of consistency and clarity, it is recommended
to switch to the more accurate term "Special URL" throughout the codebase.
That change needs to be carefully planned and executed, however, due to the
widespread use of the current terminology in both internal and third-party code.
In the meantime, "Standard URL" and "Special URL" are used interchangeably.

[1]: https://url.spec.whatwg.org/#is-special
[2]: https://url.spec.whatwg.org/#is-not-special
[3]: https://crbug.com/1416006