xref: /aosp_15_r20/external/libwebsockets/lib/misc/fts/README.md (revision 1c60b9aca93fdbc9b5f19b2d2194c91294b22281)
1*1c60b9acSAndroid Build Coastguard Worker# LWS Full Text Search
2*1c60b9acSAndroid Build Coastguard Worker
3*1c60b9acSAndroid Build Coastguard Worker## Introduction
4*1c60b9acSAndroid Build Coastguard Worker
5*1c60b9acSAndroid Build Coastguard Worker![lwsac flow](/doc-assets/lws-fts.svg)
6*1c60b9acSAndroid Build Coastguard Worker
7*1c60b9acSAndroid Build Coastguard WorkerThe general approach is to scan one or more UTF-8 input text "files" (they may
8*1c60b9acSAndroid Build Coastguard Workeronly exist in memory) and create an in-memory optimized trie for every token in
9*1c60b9acSAndroid Build Coastguard Workerthe file.
10*1c60b9acSAndroid Build Coastguard Worker
11*1c60b9acSAndroid Build Coastguard WorkerThis can then be serialized out to disk in the form of a single index file (no
12*1c60b9acSAndroid Build Coastguard Workermatter how many input files were involved or how large they were).
13*1c60b9acSAndroid Build Coastguard Worker
14*1c60b9acSAndroid Build Coastguard WorkerThe implementation is designed to be modest on memory and cpu for both index
15*1c60b9acSAndroid Build Coastguard Workercreation and querying, and suitable for weak machines with some kind of random
16*1c60b9acSAndroid Build Coastguard Workeraccess storage.  For searching only memory to hold results is required, the
17*1c60b9acSAndroid Build Coastguard Workeractual searches and autocomplete suggestions are done very rapidly by seeking
18*1c60b9acSAndroid Build Coastguard Workeraround structures in the on-disk index file.
19*1c60b9acSAndroid Build Coastguard Worker
20*1c60b9acSAndroid Build Coastguard WorkerFunction|Related Link
21*1c60b9acSAndroid Build Coastguard Worker---|---
22*1c60b9acSAndroid Build Coastguard WorkerPublic API|[include/libwebsockets/lws-fts.h](https://libwebsockets.org/git/libwebsockets/tree/include/libwebsockets/lws-fts.h)
23*1c60b9acSAndroid Build Coastguard WorkerCI test app|[minimal-examples/api-tests/api-test-fts](https://libwebsockets.org/git/libwebsockets/tree/minimal-examples/api-tests/api-test-fts)
24*1c60b9acSAndroid Build Coastguard WorkerDemo minimal example|[minimal-examples/http-server/minimal-http-server-fulltext-search](https://libwebsockets.org/git/libwebsockets/tree/minimal-examples/http-server/minimal-http-server-fulltext-search)
25*1c60b9acSAndroid Build Coastguard WorkerLive Demo|[https://libwebsockets.org/ftsdemo/](https://libwebsockets.org/ftsdemo/)
26*1c60b9acSAndroid Build Coastguard Worker
27*1c60b9acSAndroid Build Coastguard Worker## Query API overview
28*1c60b9acSAndroid Build Coastguard Worker
29*1c60b9acSAndroid Build Coastguard WorkerSearching returns a potentially very large lwsac allocated object, with contents
30*1c60b9acSAndroid Build Coastguard Workerand max size controlled by the members of a struct lws_fts_search_params passed
31*1c60b9acSAndroid Build Coastguard Workerto the search function.  Three kinds of result are possible:
32*1c60b9acSAndroid Build Coastguard Worker
33*1c60b9acSAndroid Build Coastguard Worker### Autocomplete suggestions
34*1c60b9acSAndroid Build Coastguard Worker
35*1c60b9acSAndroid Build Coastguard WorkerThese are useful to provide lists of extant results in
36*1c60b9acSAndroid Build Coastguard Workerrealtime as the user types characters that constrain the search.  So if the
37*1c60b9acSAndroid Build Coastguard Workeruser has typed 'len', any hits for 'len' itself are reported along with
38*1c60b9acSAndroid Build Coastguard Worker'length', and whatever else is in the index beginning 'len'..  The results are
39*1c60b9acSAndroid Build Coastguard Workerselected using and are accompanied by an aggregated count of results down that
40*1c60b9acSAndroid Build Coastguard Workerpath, and the results so the "most likely" results already measured by potential
41*1c60b9acSAndroid Build Coastguard Workerhits appear first.
42*1c60b9acSAndroid Build Coastguard Worker
43*1c60b9acSAndroid Build Coastguard WorkerThese results are in a linked-list headed by `result.autocomplete_head` and
44*1c60b9acSAndroid Build Coastguard Workereach is in a `struct lws_fts_result_autocomplete`.
45*1c60b9acSAndroid Build Coastguard Worker
46*1c60b9acSAndroid Build Coastguard WorkerThey're enabled in the search results by giving the flag
47*1c60b9acSAndroid Build Coastguard Worker `LWSFTS_F_QUERY_AUTOCOMPLETE` in the search parameter flags.
48*1c60b9acSAndroid Build Coastguard Worker
49*1c60b9acSAndroid Build Coastguard Worker### Filepath results
50*1c60b9acSAndroid Build Coastguard Worker
51*1c60b9acSAndroid Build Coastguard WorkerSimply a list of input files containing the search term with some statistics,
52*1c60b9acSAndroid Build Coastguard Workerone file is mentioned in a `struct lws_fts_result_filepath` result struct.
53*1c60b9acSAndroid Build Coastguard Worker
54*1c60b9acSAndroid Build Coastguard WorkerThis would be useful for creating a selection UI to "drill down" to individual
55*1c60b9acSAndroid Build Coastguard Workerfiles when there are many with matches.
56*1c60b9acSAndroid Build Coastguard Worker
57*1c60b9acSAndroid Build Coastguard WorkerThis is enabled by the `LWSFTS_F_QUERY_FILES` search flag.
58*1c60b9acSAndroid Build Coastguard Worker
59*1c60b9acSAndroid Build Coastguard Worker### Filepath and line results
60*1c60b9acSAndroid Build Coastguard Worker
61*1c60b9acSAndroid Build Coastguard WorkerSame as the file path list, but for each filepath, information on the line
62*1c60b9acSAndroid Build Coastguard Workernumbers and input file offset where the line starts are provided.
63*1c60b9acSAndroid Build Coastguard Worker
64*1c60b9acSAndroid Build Coastguard WorkerThis is enabled by `LWSFTS_F_QUERY_FILE_LINES`... if you additionally give
65*1c60b9acSAndroid Build Coastguard Worker`LWSFTS_F_QUERY_QUOTE_LINE` flag then the contents of each hit line from the
66*1c60b9acSAndroid Build Coastguard Workerinput file are also provided.
67*1c60b9acSAndroid Build Coastguard Worker
68*1c60b9acSAndroid Build Coastguard Worker## Result format inside the lwsac
69*1c60b9acSAndroid Build Coastguard Worker
70*1c60b9acSAndroid Build Coastguard WorkerA `struct lws_fts_result` at the start of the lwsac contains heads for linked-
71*1c60b9acSAndroid Build Coastguard Workerlists of autocomplete and filepath results inside the lwsac.
72*1c60b9acSAndroid Build Coastguard Worker
73*1c60b9acSAndroid Build Coastguard WorkerFor autocomplete suggestions, the string itself is immediately after the
74*1c60b9acSAndroid Build Coastguard Worker`struct lws_fts_result_autocomplete` in memory.  For filepath results, after
75*1c60b9acSAndroid Build Coastguard Workereach `struct lws_fts_result_filepath` is
76*1c60b9acSAndroid Build Coastguard Worker
77*1c60b9acSAndroid Build Coastguard Worker - match information depending on the flags given to the search
78*1c60b9acSAndroid Build Coastguard Worker - the filepath string
79*1c60b9acSAndroid Build Coastguard Worker
80*1c60b9acSAndroid Build Coastguard WorkerYou can always skip the line number table to get the filepath string by adding
81*1c60b9acSAndroid Build Coastguard Worker.matches_length to the address of the byte after the struct.
82*1c60b9acSAndroid Build Coastguard Worker
83*1c60b9acSAndroid Build Coastguard WorkerThe matches information is either
84*1c60b9acSAndroid Build Coastguard Worker
85*1c60b9acSAndroid Build Coastguard Worker - 0 bytes per match
86*1c60b9acSAndroid Build Coastguard Worker
87*1c60b9acSAndroid Build Coastguard Worker - 2x int32_t per match (8 bytes) if `LWSFTS_F_QUERY_FILE_LINES` given... the
88*1c60b9acSAndroid Build Coastguard Worker   first is the native-endian line number of the match, the second is the
89*1c60b9acSAndroid Build Coastguard Worker   byte offset in the original file where that line starts
90*1c60b9acSAndroid Build Coastguard Worker
91*1c60b9acSAndroid Build Coastguard Worker - 2 x int32_t as above plus a const char * if `LWSFTS_F_QUERY_QUOTE_LINE` is
92*1c60b9acSAndroid Build Coastguard Worker   also given... this points to a NUL terminated string also stored in the
93*1c60b9acSAndroid Build Coastguard Worker   results lwsac that contains up to 255 chars of the line from the original
94*1c60b9acSAndroid Build Coastguard Worker   file.  In some cases, the original file was either virtual (you are indexing
95*1c60b9acSAndroid Build Coastguard Worker   a git revision) or is not stored with the index, in that case you can't
96*1c60b9acSAndroid Build Coastguard Worker   usefully use `LWSFTS_F_QUERY_QUOTE_LINE`.
97*1c60b9acSAndroid Build Coastguard Worker
98*1c60b9acSAndroid Build Coastguard WorkerTo facilitate interpreting what is stored per match, the original search flags
99*1c60b9acSAndroid Build Coastguard Workerthat created the result are stored in the `struct lws_fts_result`.
100*1c60b9acSAndroid Build Coastguard Worker
101*1c60b9acSAndroid Build Coastguard Worker## Indexing In-memory and serialized to file
102*1c60b9acSAndroid Build Coastguard Worker
103*1c60b9acSAndroid Build Coastguard WorkerWhen creating the trie, in-memory structs are used with various optimization
104*1c60b9acSAndroid Build Coastguard Workerschemes trading off memory usage for speed.  While in-memory, it's possible to
105*1c60b9acSAndroid Build Coastguard Workeradd more indexed filepaths to the single index.  Once the trie is complete in
106*1c60b9acSAndroid Build Coastguard Workerterms of having indexed everything, it is serialized to disk.
107*1c60b9acSAndroid Build Coastguard Worker
108*1c60b9acSAndroid Build Coastguard WorkerThese contain many additional housekeeping pointers and trie entries which can
109*1c60b9acSAndroid Build Coastguard Workerbe optimized out.  Most in-memory values must be held literally in large types,
110*1c60b9acSAndroid Build Coastguard Workerwhereas most of the values in the serialized file use smaller VLI which use
111*1c60b9acSAndroid Build Coastguard Workermore or less bytes according to the value.  So the peak memory requirements for
112*1c60b9acSAndroid Build Coastguard Workerlarge tries are much bigger than the size of the serialized trie file that is
113*1c60b9acSAndroid Build Coastguard Workeroutput.
114*1c60b9acSAndroid Build Coastguard Worker
115*1c60b9acSAndroid Build Coastguard WorkerFor the linux kernel at 4.14 and default indexing list on a 2.8GHz AMD
116*1c60b9acSAndroid Build Coastguard Workerthreadripper (using one thread), the stats are:
117*1c60b9acSAndroid Build Coastguard Worker
118*1c60b9acSAndroid Build Coastguard WorkerName|Value
119*1c60b9acSAndroid Build Coastguard Worker---|---
120*1c60b9acSAndroid Build Coastguard WorkerFiles indexed|52932
121*1c60b9acSAndroid Build Coastguard WorkerInput corpus size|694MiB
122*1c60b9acSAndroid Build Coastguard WorkerIndexing cpu time|50.1s (>1000 files / sec; 13.8MBytes/sec)
123*1c60b9acSAndroid Build Coastguard WorkerPeak alloc|78MiB
124*1c60b9acSAndroid Build Coastguard WorkerSerialization time|202ms
125*1c60b9acSAndroid Build Coastguard WorkerTrie File size|347MiB
126*1c60b9acSAndroid Build Coastguard Worker
127*1c60b9acSAndroid Build Coastguard WorkerTo index libwebsockets main branch under the same conditions:
128*1c60b9acSAndroid Build Coastguard Worker
129*1c60b9acSAndroid Build Coastguard WorkerName|Value
130*1c60b9acSAndroid Build Coastguard Worker---|---
131*1c60b9acSAndroid Build Coastguard WorkerFiles indexed|489
132*1c60b9acSAndroid Build Coastguard WorkerInput corpus size|3MiB
133*1c60b9acSAndroid Build Coastguard WorkerIndexing time|123ms
134*1c60b9acSAndroid Build Coastguard WorkerPeak alloc|3MiB
135*1c60b9acSAndroid Build Coastguard WorkerSerialization time|1ms
136*1c60b9acSAndroid Build Coastguard WorkerTrie File size|1.4MiB
137*1c60b9acSAndroid Build Coastguard Worker
138*1c60b9acSAndroid Build Coastguard Worker
139*1c60b9acSAndroid Build Coastguard WorkerOnce it's generated, querying the trie file is very inexpensive, even when there
140*1c60b9acSAndroid Build Coastguard Workerare lots of results.
141*1c60b9acSAndroid Build Coastguard Worker
142*1c60b9acSAndroid Build Coastguard Worker - trie entry child lists are kept sorted by the character they map to.  This
143*1c60b9acSAndroid Build Coastguard Worker   allows discovering there is no match as soon as a character later in the
144*1c60b9acSAndroid Build Coastguard Worker   order than the one being matched is seen
145*1c60b9acSAndroid Build Coastguard Worker
146*1c60b9acSAndroid Build Coastguard Worker - for the root trie, in addition to the linked-list child + sibling entries,
147*1c60b9acSAndroid Build Coastguard Worker   a 256-entry pointer table is associated with the root trie, allowing one-
148*1c60b9acSAndroid Build Coastguard Worker   step lookup.  But as the table is 2KiB, it's too expensive to use on all
149*1c60b9acSAndroid Build Coastguard Worker   trie entries
150*1c60b9acSAndroid Build Coastguard Worker
151*1c60b9acSAndroid Build Coastguard Worker## Structure on disk
152*1c60b9acSAndroid Build Coastguard Worker
153*1c60b9acSAndroid Build Coastguard WorkerAll explicit multibyte numbers are stored in Network (MSB-first) byte order.
154*1c60b9acSAndroid Build Coastguard Worker
155*1c60b9acSAndroid Build Coastguard Worker - file header
156*1c60b9acSAndroid Build Coastguard Worker - filepath line number tables
157*1c60b9acSAndroid Build Coastguard Worker - filepath information
158*1c60b9acSAndroid Build Coastguard Worker - filepath map table
159*1c60b9acSAndroid Build Coastguard Worker - tries, trie instances (hits), trie child tables
160*1c60b9acSAndroid Build Coastguard Worker
161*1c60b9acSAndroid Build Coastguard Worker### VLI coding
162*1c60b9acSAndroid Build Coastguard Worker
163*1c60b9acSAndroid Build Coastguard WorkerVLI (Variable Length Integer) coding works like this
164*1c60b9acSAndroid Build Coastguard Worker
165*1c60b9acSAndroid Build Coastguard Worker[b7 EON] [b6 .. b0  DATA]
166*1c60b9acSAndroid Build Coastguard Worker
167*1c60b9acSAndroid Build Coastguard WorkerIf EON = 0, then DATA represents the Least-significant 7 bits of the number.
168*1c60b9acSAndroid Build Coastguard Workerif EON = 1, DATA represents More-significant 7-bits that should be shifted
169*1c60b9acSAndroid Build Coastguard Workerleft until the byte with EON = 0 is found to terminate the number.
170*1c60b9acSAndroid Build Coastguard Worker
171*1c60b9acSAndroid Build Coastguard WorkerThe VLI used is predicated around 32-bit unsigned integers
172*1c60b9acSAndroid Build Coastguard Worker
173*1c60b9acSAndroid Build Coastguard WorkerExamples:
174*1c60b9acSAndroid Build Coastguard Worker
175*1c60b9acSAndroid Build Coastguard Worker - 0x30            =    48
176*1c60b9acSAndroid Build Coastguard Worker - 0x81 30         =   176
177*1c60b9acSAndroid Build Coastguard Worker - 0x81 0x80 0x00  = 16384
178*1c60b9acSAndroid Build Coastguard Worker
179*1c60b9acSAndroid Build Coastguard WorkerBytes | Range
180*1c60b9acSAndroid Build Coastguard Worker---|---
181*1c60b9acSAndroid Build Coastguard Worker1|<= 127
182*1c60b9acSAndroid Build Coastguard Worker2|<= 16K - 1
183*1c60b9acSAndroid Build Coastguard Worker3|<= 2M -1
184*1c60b9acSAndroid Build Coastguard Worker4|<= 256M - 1
185*1c60b9acSAndroid Build Coastguard Worker5|<= 4G - 1
186*1c60b9acSAndroid Build Coastguard Worker
187*1c60b9acSAndroid Build Coastguard WorkerThe coding is very efficient if there's a high probabilty the number being
188*1c60b9acSAndroid Build Coastguard Workerstored is not large.  So it's great for line numbers for example, where most
189*1c60b9acSAndroid Build Coastguard Workerfiles have less that 16K lines and the VLI for the line number fits in 2 bytes,
190*1c60b9acSAndroid Build Coastguard Workerbut if you meet a huge file, the VLI coding can also handle it.
191*1c60b9acSAndroid Build Coastguard Worker
192*1c60b9acSAndroid Build Coastguard WorkerAll numbers except a few in the headers that are actually written after the
193*1c60b9acSAndroid Build Coastguard Workerfollowing data are stored using VLI for space- efficiency without limiting
194*1c60b9acSAndroid Build Coastguard Workercapability.  The numbers that are fixed up after the fact have to have a fixed
195*1c60b9acSAndroid Build Coastguard Workersize and can't use VLI.
196*1c60b9acSAndroid Build Coastguard Worker
197*1c60b9acSAndroid Build Coastguard Worker### File header
198*1c60b9acSAndroid Build Coastguard Worker
199*1c60b9acSAndroid Build Coastguard WorkerThe first byte of the file header where the magic is, is "fileoffset" 0.  All
200*1c60b9acSAndroid Build Coastguard Workerthe stored "fileoffset"s are relative to that.
201*1c60b9acSAndroid Build Coastguard Worker
202*1c60b9acSAndroid Build Coastguard WorkerThe header has a fixed size of 16 bytes.
203*1c60b9acSAndroid Build Coastguard Worker
204*1c60b9acSAndroid Build Coastguard Workersize|function
205*1c60b9acSAndroid Build Coastguard Worker---|---
206*1c60b9acSAndroid Build Coastguard Worker32-bits|Magic 0xCA7A5F75
207*1c60b9acSAndroid Build Coastguard Worker32-bits|Fileoffset to root trie entry
208*1c60b9acSAndroid Build Coastguard Worker32-bits|Size of the trie file when it was created (to detect truncation)
209*1c60b9acSAndroid Build Coastguard Worker32-bits|Fileoffset to the filepath map
210*1c60b9acSAndroid Build Coastguard Worker32-bits|Number of filepaths
211*1c60b9acSAndroid Build Coastguard Worker
212*1c60b9acSAndroid Build Coastguard Worker### Filepath line tables
213*1c60b9acSAndroid Build Coastguard Worker
214*1c60b9acSAndroid Build Coastguard WorkerImmediately after the file header are the line length tables.
215*1c60b9acSAndroid Build Coastguard Worker
216*1c60b9acSAndroid Build Coastguard WorkerAs the input files are parsed, line length tables are written for each file...
217*1c60b9acSAndroid Build Coastguard Workerat that time the rest of the parser data is held in memory so nothing else is
218*1c60b9acSAndroid Build Coastguard Workerin the file yet.  These allow you to map logical line numbers in the file to
219*1c60b9acSAndroid Build Coastguard Workerfile offsets space- and time- efficiently without having to walk through the
220*1c60b9acSAndroid Build Coastguard Workerfile contents.
221*1c60b9acSAndroid Build Coastguard Worker
222*1c60b9acSAndroid Build Coastguard WorkerThe line information is cut into blocks, allowing quick skipping over the VLI
223*1c60b9acSAndroid Build Coastguard Workerdata that doesn't contain the line you want just by following the 8-byte header
224*1c60b9acSAndroid Build Coastguard Workerpart.
225*1c60b9acSAndroid Build Coastguard Worker
226*1c60b9acSAndroid Build Coastguard WorkerOnce you find the block with your line, you have to iteratively add the VLIs
227*1c60b9acSAndroid Build Coastguard Workeruntil you hit the one you want.
228*1c60b9acSAndroid Build Coastguard Worker
229*1c60b9acSAndroid Build Coastguard WorkerFor normal text files with average line length below 128, the VLIs will
230*1c60b9acSAndroid Build Coastguard Workertypically be a single byte.  So a block of 200 line lengths is typically
231*1c60b9acSAndroid Build Coastguard Worker208 bytes long.
232*1c60b9acSAndroid Build Coastguard Worker
233*1c60b9acSAndroid Build Coastguard WorkerThere is a final linetable chunk consisting of all zeros to indicate the end
234*1c60b9acSAndroid Build Coastguard Workerof the filepath line chunk series for a filepath.
235*1c60b9acSAndroid Build Coastguard Worker
236*1c60b9acSAndroid Build Coastguard Workersize|function
237*1c60b9acSAndroid Build Coastguard Worker---|---
238*1c60b9acSAndroid Build Coastguard Worker16-bit|length of this chunk itself in bytes
239*1c60b9acSAndroid Build Coastguard Worker16-bit|count of lines covered in this chunk
240*1c60b9acSAndroid Build Coastguard Worker32-bit|count of bytes in the input file this chunk covers
241*1c60b9acSAndroid Build Coastguard WorkerVLI...|for each line in the chunk, the number of bytes in the line
242*1c60b9acSAndroid Build Coastguard Worker
243*1c60b9acSAndroid Build Coastguard Worker
244*1c60b9acSAndroid Build Coastguard Worker### Filepaths
245*1c60b9acSAndroid Build Coastguard Worker
246*1c60b9acSAndroid Build Coastguard WorkerThe single trie in the file may contain information from multiple files, for
247*1c60b9acSAndroid Build Coastguard Workerexample one trie may cover all files in a directory.  The "Filepaths" are
248*1c60b9acSAndroid Build Coastguard Workerlisted after the line tables, and referred to by index thereafter.
249*1c60b9acSAndroid Build Coastguard Worker
250*1c60b9acSAndroid Build Coastguard WorkerFor each filepath, one after the other:
251*1c60b9acSAndroid Build Coastguard Worker
252*1c60b9acSAndroid Build Coastguard Workersize|function
253*1c60b9acSAndroid Build Coastguard Worker---|---
254*1c60b9acSAndroid Build Coastguard WorkerVLI|fileoffset of the start of this filepath's line table
255*1c60b9acSAndroid Build Coastguard WorkerVLI|count of lines in the file
256*1c60b9acSAndroid Build Coastguard WorkerVLI|length of filepath in bytes
257*1c60b9acSAndroid Build Coastguard Worker...|the filepath (with no NUL)
258*1c60b9acSAndroid Build Coastguard Worker
259*1c60b9acSAndroid Build Coastguard Worker### Filepath map
260*1c60b9acSAndroid Build Coastguard Worker
261*1c60b9acSAndroid Build Coastguard WorkerTo facilitate rapid filepath lookup, there's a filepath map table with a 32-bit
262*1c60b9acSAndroid Build Coastguard Workerfileoffset per filepath.  This is the way to convert filepath indexes to
263*1c60b9acSAndroid Build Coastguard Workerinformation on the filepath like its name, etc
264*1c60b9acSAndroid Build Coastguard Worker
265*1c60b9acSAndroid Build Coastguard Workersize|function
266*1c60b9acSAndroid Build Coastguard Worker---|---
267*1c60b9acSAndroid Build Coastguard Worker32-bit...|fileoffset to filepath table for each filepath
268*1c60b9acSAndroid Build Coastguard Worker
269*1c60b9acSAndroid Build Coastguard Worker### Trie entries
270*1c60b9acSAndroid Build Coastguard Worker
271*1c60b9acSAndroid Build Coastguard WorkerImmediately after that, the trie entries are dumped, for each one a header:
272*1c60b9acSAndroid Build Coastguard Worker
273*1c60b9acSAndroid Build Coastguard Worker#### Trie entry header
274*1c60b9acSAndroid Build Coastguard Worker
275*1c60b9acSAndroid Build Coastguard Workersize|function
276*1c60b9acSAndroid Build Coastguard Worker---|---
277*1c60b9acSAndroid Build Coastguard WorkerVLI|Fileoffset of first file table in this trie entry instance list
278*1c60b9acSAndroid Build Coastguard WorkerVLI|number of child trie entries this trie entry has
279*1c60b9acSAndroid Build Coastguard WorkerVLI|number of instances this trie entry has
280*1c60b9acSAndroid Build Coastguard Worker
281*1c60b9acSAndroid Build Coastguard WorkerThe child list follows immediately after this header
282*1c60b9acSAndroid Build Coastguard Worker
283*1c60b9acSAndroid Build Coastguard Worker#### Trie entry instance file
284*1c60b9acSAndroid Build Coastguard Worker
285*1c60b9acSAndroid Build Coastguard WorkerFor each file that has instances of this symbol:
286*1c60b9acSAndroid Build Coastguard Worker
287*1c60b9acSAndroid Build Coastguard Workersize|function
288*1c60b9acSAndroid Build Coastguard Worker---|---
289*1c60b9acSAndroid Build Coastguard WorkerVLI|Fileoffset of next file table in this trie entry instance list
290*1c60b9acSAndroid Build Coastguard WorkerVLI|filepath index
291*1c60b9acSAndroid Build Coastguard WorkerVLI|count of line number instances following
292*1c60b9acSAndroid Build Coastguard Worker
293*1c60b9acSAndroid Build Coastguard Worker#### Trie entry file line number table
294*1c60b9acSAndroid Build Coastguard Worker
295*1c60b9acSAndroid Build Coastguard WorkerThen for the file mentioned above, a list of all line numbers in the file with
296*1c60b9acSAndroid Build Coastguard Workerthe symbol in them, in ascending order.  As a VLI, the median size per entry
297*1c60b9acSAndroid Build Coastguard Workerwill typically be ~15.9 bits due to the probability of line numbers below 16K.
298*1c60b9acSAndroid Build Coastguard Worker
299*1c60b9acSAndroid Build Coastguard Workersize|function
300*1c60b9acSAndroid Build Coastguard Worker---|---
301*1c60b9acSAndroid Build Coastguard WorkerVLI|line number
302*1c60b9acSAndroid Build Coastguard Worker...
303*1c60b9acSAndroid Build Coastguard Worker
304*1c60b9acSAndroid Build Coastguard Worker#### Trie entry child table
305*1c60b9acSAndroid Build Coastguard Worker
306*1c60b9acSAndroid Build Coastguard WorkerFor each child node
307*1c60b9acSAndroid Build Coastguard Worker
308*1c60b9acSAndroid Build Coastguard Workersize|function
309*1c60b9acSAndroid Build Coastguard Worker---|---
310*1c60b9acSAndroid Build Coastguard WorkerVLI|file offset of child
311*1c60b9acSAndroid Build Coastguard WorkerVLI|instance count belonging directly to this child
312*1c60b9acSAndroid Build Coastguard WorkerVLI|aggregated number of instances down all descendent paths of child
313*1c60b9acSAndroid Build Coastguard WorkerVLI|aggregated number of children down all descendent paths of child
314*1c60b9acSAndroid Build Coastguard WorkerVLI|match string length
315*1c60b9acSAndroid Build Coastguard Worker...|the match string
316