.. _module-pw_tokenizer:

============
pw_tokenizer
============
.. pigweed-module::
   :name: pw_tokenizer

Logging is critical, but developers are often forced to choose between
additional logging and saving crucial flash space. The ``pw_tokenizer`` module
enables **extensive logging with substantially less memory usage** by replacing
printf-style strings with binary tokens during compilation. It is designed to
integrate easily into existing logging systems.

Although the most common application of ``pw_tokenizer`` is binary logging,
**the tokenizer is general purpose and can be used to tokenize any strings**,
with or without printf-style arguments.

Why tokenize strings?

* **Dramatically reduce binary size** by removing string literals from binaries.
* **Reduce I/O traffic, RAM, and flash usage** by sending and storing compact tokens
  instead of strings. We've seen over 50% reduction in encoded log contents.
* **Reduce CPU usage** by replacing snprintf calls with simple tokenization code.
* **Remove potentially sensitive log, assert, and other strings** from binaries.

.. grid:: 1

   .. grid-item-card:: :octicon:`rocket` Get started
      :link: module-pw_tokenizer-get-started
      :link-type: ref
      :class-item: sales-pitch-cta-primary

      Integrate pw_tokenizer into your project.

.. grid:: 2

   .. grid-item-card:: :octicon:`code-square` Tokenization
      :link: module-pw_tokenizer-tokenization
      :link-type: ref
      :class-item: sales-pitch-cta-secondary

      Convert strings and arguments to tokens.

   .. grid-item-card:: :octicon:`code-square` Token databases
      :link: module-pw_tokenizer-token-databases
      :link-type: ref
      :class-item: sales-pitch-cta-secondary

      Store a mapping of tokens to the strings and arguments they represent.

.. grid:: 2

   .. grid-item-card:: :octicon:`code-square` Detokenization
      :link: module-pw_tokenizer-detokenization
      :link-type: ref
      :class-item: sales-pitch-cta-secondary

      Expand tokens back to the strings and arguments they represent.

   .. grid-item-card:: :octicon:`info` API reference
      :link: module-pw_tokenizer-api
      :link-type: ref
      :class-item: sales-pitch-cta-secondary

      Detailed reference information about the pw_tokenizer API.


.. _module-pw_tokenizer-tokenized-logging-example:

---------------------------
Tokenized logging in action
---------------------------
Here's an example of how ``pw_tokenizer`` enables you to store
and send the same logging information using significantly fewer
resources:

.. mermaid::

   flowchart TD

     subgraph after["After: Tokenized Logs (37 bytes saved!)"]
       after_log["LOG(#quot;Battery Voltage: %d mV#quot;, voltage)"] -- 4 bytes stored on-device as... -->
       after_encoding["d9 28 47 8e"] -- 6 bytes sent over the wire as... -->
       after_transmission["d9 28 47 8e aa 3e"] -- Displayed in logs as... -->
       after_display["#quot;Battery Voltage: 3989 mV#quot;"]
     end

     subgraph before["Before: No Tokenization"]
       before_log["LOG(#quot;Battery Voltage: %d mV#quot;, voltage)"] -- 41 bytes stored on-device as... -->
       before_encoding["#quot;Battery Voltage: %d mV#quot;"] -- 43 bytes sent over the wire as... -->
       before_transmission["#quot;Battery Voltage: 3989 mV#quot;"] -- Displayed in logs as... -->
       before_display["#quot;Battery Voltage: 3989 mV#quot;"]
     end

     style after stroke:#00c852,stroke-width:3px
     style before stroke:#ff5252,stroke-width:3px

A quick overview of how the tokenized version works:

* You tokenize ``"Battery Voltage: %d mV"`` with a macro like
  :c:macro:`PW_TOKENIZE_STRING`. You can use :ref:`module-pw_log_tokenized`
  to handle the tokenization automatically.
* After tokenization, ``"Battery Voltage: %d mV"`` becomes ``d9 28 47 8e``.
* The first 4 bytes sent over the wire are the tokenized version of
  ``"Battery Voltage: %d mV"``. The last 2 bytes are the value of ``voltage``
  converted to a varint using :ref:`module-pw_varint`.
* The logs are converted back to the original, human-readable message
  via the :ref:`Detokenization API <module-pw_tokenizer-detokenization>`
  and a :ref:`token database <module-pw_tokenizer-token-databases>`.

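
The byte layout described above can be reproduced in a few lines of Python.
This is a minimal sketch rather than the ``pw_tokenizer`` API: the token bytes
are copied from the diagram instead of being computed, the dictionary stands in
for a real token database, and every function name is hypothetical. It also
assumes that the integer argument is ZigZag-encoded before being written as a
varint, which is consistent with the ``aa 3e`` bytes shown for ``3989``.
Running it prints the six bytes sent over the wire, then the detokenized
message.

.. code-block:: python

   # Assumption: token bytes and format string are copied from the example
   # above; a real project would read them from a token database.
   TOKEN_BYTES = bytes.fromhex("d928478e")
   TOKEN_DATABASE = {TOKEN_BYTES: "Battery Voltage: %d mV"}  # hypothetical


   def zigzag_encode(value: int) -> int:
       """Maps signed integers to unsigned ones so small magnitudes stay small."""
       return value * 2 if value >= 0 else -value * 2 - 1


   def zigzag_decode(value: int) -> int:
       """Inverts zigzag_encode()."""
       return (value >> 1) ^ -(value & 1)


   def varint_encode(value: int) -> bytes:
       """Emits 7 bits per byte, least significant group first.

       The high bit of each byte is set when more bytes follow.
       """
       out = bytearray()
       while True:
           low_bits = value & 0x7F
           value >>= 7
           if value:
               out.append(low_bits | 0x80)
           else:
               out.append(low_bits)
               return bytes(out)


   def varint_decode(data: bytes) -> int:
       """Inverts varint_encode() for a single varint at the start of data."""
       result = 0
       for index, byte in enumerate(data):
           result |= (byte & 0x7F) << (7 * index)
           if not byte & 0x80:
               break
       return result


   def encode_message(token: bytes, arg: int) -> bytes:
       """Builds the wire message: 4 token bytes followed by one integer argument."""
       return token + varint_encode(zigzag_encode(arg))


   def detokenize(message: bytes) -> str:
       """Looks up the format string by token and fills in the decoded argument."""
       token, payload = message[:4], message[4:]
       return TOKEN_DATABASE[token] % zigzag_decode(varint_decode(payload))


   encoded = encode_message(TOKEN_BYTES, 3989)
   print(encoded.hex(" "))     # d9 28 47 8e aa 3e
   print(detokenize(encoded))  # Battery Voltage: 3989 mV
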
.. toctree::
   :hidden:
   :maxdepth: 1

   Get started <get_started>
   tokenization
   token_databases
   detokenization
   API reference <api>