:tocdepth: 3

.. _module-pw_tokenizer-detokenization:

==============
Detokenization
==============
.. pigweed-module-subpage::
   :name: pw_tokenizer

Detokenization is the process of expanding a token to the string it represents
and decoding its arguments. ``pw_tokenizer`` provides Python, C++ and
TypeScript detokenization libraries.

--------------------------------
Example: decoding tokenized logs
--------------------------------
A project might tokenize its log messages with the
:ref:`module-pw_tokenizer-base64-format`. Consider the following log file, which
has four tokenized logs and one plain text log:

.. code-block:: text

   20200229 14:38:58 INF $HL2VHA==
   20200229 14:39:00 DBG $5IhTKg==
   20200229 14:39:20 DBG Crunching numbers to calculate probability of success
   20200229 14:39:21 INF $EgFj8lVVAUI=
   20200229 14:39:23 ERR $DFRDNwlOT1RfUkVBRFk=

The project's log strings are stored in a database like the following:

.. code-block::

   1c95bd1c,          ,"Initiating retrieval process for recovery object"
   2a5388e4,          ,"Determining optimal approach and coordinating vectors"
   3743540c,          ,"Recovery object retrieval failed with status %s"
   f2630112,          ,"Calculated acceptable probability of success (%.2f%%)"

Using the detokenizing tools with the database, the logs can be decoded:

.. code-block:: text

   20200229 14:38:58 INF Initiating retrieval process for recovery object
   20200229 14:39:00 DBG Determining optimal approach and coordinating vectors
   20200229 14:39:20 DBG Crunching numbers to calculate probability of success
   20200229 14:39:21 INF Calculated acceptable probability of success (32.33%)
   20200229 14:39:23 ERR Recovery object retrieval failed with status NOT_READY

.. note::

   This example uses the :ref:`module-pw_tokenizer-base64-format`, which
   occupies about 4/3 (133%) as much space as the default binary format when
   encoded. For projects that wish to interleave tokenized messages with plain
   text, using Base64 is a worthwhile tradeoff.
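
The relationship between the log file and the database can be checked by hand.
The following is a minimal sketch that uses only the Python standard library,
not the ``pw_tokenizer`` API; ``split_prefixed_base64`` is a hypothetical helper
name. It assumes the token is the first four bytes of the decoded message in
little-endian order, which is consistent with the ``parse_message`` output
shown later on this page.

.. code-block:: python

   import base64
   import struct


   def split_prefixed_base64(message: str) -> tuple[int, bytes]:
       """Returns (token, encoded_args) for a ``$``-prefixed Base64 message."""
       data = base64.b64decode(message.removeprefix("$"))
       (token,) = struct.unpack_from("<I", data)  # Assumed: little-endian token
       return token, data[4:]


   for message in (
       "$HL2VHA==",
       "$5IhTKg==",
       "$EgFj8lVVAUI=",
       "$DFRDNwlOT1RfUkVBRFk=",
   ):
       token, args = split_prefixed_base64(message)
       # Compare the size of the binary form with the prefixed Base64 text.
       print(f"{token:08x} binary={4 + len(args)}B base64={len(message)}B")

For ``$HL2VHA==`` this yields the token ``1c95bd1c`` with no arguments: 4 bytes
in binary form versus 9 characters as prefixed Base64.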

------------------------
Detokenization in Python
------------------------
To detokenize in Python, import ``Detokenizer`` from the ``pw_tokenizer``
package, and instantiate it with paths to token databases or ELF files.

.. code-block:: python

   import pw_tokenizer

   detokenizer = pw_tokenizer.Detokenizer('path/to/database.csv', 'other/path.elf')

   def process_log_message(log_message):
       # Detokenize the payload and print the decoded result as text.
       result = detokenizer.detokenize(log_message.payload)
       print(str(result))

The ``pw_tokenizer`` package also provides the ``AutoUpdatingDetokenizer``
class, which can be used in place of the standard ``Detokenizer``. This class
monitors database files for changes and automatically reloads them when they
change. This is helpful for long-running tools that use detokenization. The
class also supports filtering token domains for the given database files in the
``<path>#<domain>`` format.
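
The reloading behavior can be pictured with a small standard-library sketch.
This is not how ``AutoUpdatingDetokenizer`` is implemented;
``ReloadingCsvDatabase`` is a hypothetical class that assumes a three-column
CSV database like the one in the example above and rereads the file whenever
its modification time changes.

.. code-block:: python

   import csv
   import os
   from typing import Dict, Optional


   class ReloadingCsvDatabase:
       """Hypothetical helper: rereads a CSV token database when it changes."""

       def __init__(self, path: str) -> None:
           self.path = path
           self._mtime_ns: Optional[int] = None
           self.entries: Dict[int, str] = {}

       def lookup(self, token: int) -> Optional[str]:
           """Returns the string for a token, reloading the file if needed."""
           self._reload_if_changed()
           return self.entries.get(token)

       def _reload_if_changed(self) -> None:
           mtime_ns = os.stat(self.path).st_mtime_ns
           if mtime_ns == self._mtime_ns:
               return  # File is unchanged since the last load.
           with open(self.path, newline="") as file:
               # Assumed columns: hex token, removal date, string.
               self.entries = {
                   int(row[0], 16): row[2] for row in csv.reader(file) if row
               }
           self._mtime_ns = mtime_ns

A long-running tool would call ``lookup`` for each message and pick up new
database entries without restarting.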

For messages that are optionally tokenized and may be encoded as binary,
Base64, or plaintext UTF-8, use
:func:`pw_tokenizer.proto.decode_optionally_tokenized`. This will attempt to
determine the correct method to detokenize and always provide a printable
string.

.. _module-pw_tokenizer-base64-decoding:

Decoding Base64
===============
The Python ``Detokenizer`` class supports decoding and detokenizing prefixed
Base64 messages with ``detokenize_base64`` and related methods.

.. tip::
   The Python detokenization tools support recursive detokenization for prefixed
   Base64 text. Tokenized strings found in detokenized text are detokenized, so
   prefixed Base64 messages can be passed as ``%s`` arguments.

   For example, the tokenized string for "Wow!" is ``$RhYjmQ==``. This could be
   passed as an argument to the printf-style string ``Nested message: %s``, which
   encodes to ``$pEVTYQkkUmhZam1RPT0=``. The detokenizer would decode the message
   as follows:

   ::

     "$pEVTYQkkUmhZam1RPT0=" → "Nested message: $RhYjmQ==" → "Nested message: Wow!"

Base64 decoding is supported in C++ or C with the
``pw::tokenizer::PrefixedBase64Decode`` or ``pw_tokenizer_PrefixedBase64Decode``
functions.
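
The recursive detokenization described in the tip above can be sketched with
the Python standard library. This is an illustration, not the ``Detokenizer``
implementation: ``DATABASE`` and ``detokenize_text`` are hypothetical names, the
two tokens are read from the first four bytes of the Base64 messages in the
tip, and only a single ``%s`` argument (assumed to be a length byte followed by
the text) is handled.

.. code-block:: python

   import base64
   import re
   import struct

   # Toy database keyed by the little-endian token of each message in the tip.
   DATABASE = {0x99231646: "Wow!", 0x615345A4: "Nested message: %s"}

   _PREFIXED_BASE64 = re.compile(
       r"\$(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?"
   )


   def _decode_one(match: re.Match) -> str:
       data = base64.b64decode(match.group()[1:])
       if len(data) < 4:
           return match.group()  # Too short to hold a token.
       (token,) = struct.unpack_from("<I", data)
       template = DATABASE.get(token)
       if template is None:
           return match.group()  # Unknown token: leave the text untouched.
       args = data[4:]
       if "%s" in template and args:
           length = args[0] & 0x7F  # Assumed: length byte, then the text.
           template = template.replace("%s", args[1 : 1 + length].decode(), 1)
       return template


   def detokenize_text(text: str, max_depth: int = 4) -> str:
       """Repeatedly expands prefixed Base64 messages until nothing changes."""
       for _ in range(max_depth):
           decoded = _PREFIXED_BASE64.sub(_decode_one, text)
           if decoded == text:
               break
           text = decoded
       return text


   print(detokenize_text("$pEVTYQkkUmhZam1RPT0="))  # Nested message: Wow!

The ``max_depth`` limit is a design choice of this sketch; it keeps a message
that expands to itself from looping forever.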

Investigating undecoded Base64 messages
---------------------------------------
Tokenized messages cannot be decoded if the token is not recognized. The Python
package includes the ``parse_message`` tool, which parses tokenized Base64
messages without looking up the token in a database. This tool attempts to guess
the types of the arguments and displays potential ways to decode them.

This tool can be used to extract argument information from an otherwise unusable
message. It could help identify which statement in the code produced the
message. This tool is not particularly helpful for tokenized messages without
arguments, since all it can do is show the value of the unknown token.

The tool is executed by passing Base64 tokenized messages, with or without the
``$`` prefix, to ``pw_tokenizer.parse_message``. Pass ``-h`` or ``--help`` to
see full usage information.

Example
^^^^^^^
.. code-block::

   $ python -m pw_tokenizer.parse_message '$329JMwA=' koSl524TRkFJTEVEX1BSRUNPTkRJVElPTgJPSw== --specs %s %d

   INF Decoding arguments for '$329JMwA='
   INF Binary: b'\xdfoI3\x00' [df 6f 49 33 00] (5 bytes)
   INF Token:  0x33496fdf
   INF Args:   b'\x00' [00] (1 bytes)
   INF Decoding with up to 8 %s or %d arguments
   INF   Attempt 1: [%s]
   INF   Attempt 2: [%d] 0

   INF Decoding arguments for '$koSl524TRkFJTEVEX1BSRUNPTkRJVElPTgJPSw=='
   INF Binary: b'\x92\x84\xa5\xe7n\x13FAILED_PRECONDITION\x02OK' [92 84 a5 e7 6e 13 46 41 49 4c 45 44 5f 50 52 45 43 4f 4e 44 49 54 49 4f 4e 02 4f 4b] (28 bytes)
   INF Token:  0xe7a58492
   INF Args:   b'n\x13FAILED_PRECONDITION\x02OK' [6e 13 46 41 49 4c 45 44 5f 50 52 45 43 4f 4e 44 49 54 49 4f 4e 02 4f 4b] (24 bytes)
   INF Decoding with up to 8 %s or %d arguments
   INF   Attempt 1: [%d %s %d %d %d] 55 FAILED_PRECONDITION 1 -40 -38
   INF   Attempt 2: [%d %s %s] 55 FAILED_PRECONDITION OK
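
The attempts in the example output can be reproduced with a short sketch. It is
not the ``parse_message`` implementation; the helper names are hypothetical,
and the argument encoding (a ZigZag varint for ``%d``, a length byte followed
by the text for ``%s``) is an assumption inferred from the bytes shown above.

.. code-block:: python

   import base64
   import struct


   def decode_varint(data: bytes, pos: int) -> tuple[int, int]:
       """Decodes one ZigZag varint; returns (value, next position)."""
       raw = shift = 0
       while True:
           byte = data[pos]
           pos += 1
           raw |= (byte & 0x7F) << shift
           shift += 7
           if not byte & 0x80:  # A clear top bit marks the last byte.
               break
       return (raw >> 1) ^ -(raw & 1), pos


   def decode_string(data: bytes, pos: int) -> tuple[str, int]:
       """Decodes one length-prefixed string; returns (text, next position)."""
       length = data[pos] & 0x7F  # Assumed: low 7 bits hold the length.
       start = pos + 1
       return data[start : start + length].decode(), start + length


   def decode_args(message: str, specs: str) -> tuple[int, list]:
       """Decodes a Base64 message with a guessed list of %d and %s specs."""
       data = base64.b64decode(message.removeprefix("$"))
       (token,) = struct.unpack_from("<I", data)
       pos, values = 4, []
       for spec in specs.split():
           decoder = decode_string if spec == "%s" else decode_varint
           value, pos = decoder(data, pos)
           values.append(value)
       return token, values


   token, values = decode_args(
       "koSl524TRkFJTEVEX1BSRUNPTkRJVElPTgJPSw==", "%d %s %s"
   )
   print(hex(token), values)  # 0xe7a58492 [55, 'FAILED_PRECONDITION', 'OK']

Passing ``"%d %s %d %d %d"`` instead yields the values from the first attempt,
which shows why the tool lists several candidate interpretations.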

.. _module-pw_tokenizer-protobuf-tokenization-python:

Detokenizing protobufs
======================
The :py:mod:`pw_tokenizer.proto` Python module defines functions that may be
used to detokenize protobuf objects in Python. The function
:py:func:`pw_tokenizer.proto.detokenize_fields` detokenizes all fields
annotated as tokenized, replacing them with their detokenized version. For
example:

.. code-block:: python

   import pw_tokenizer
   import pw_tokenizer.proto

   my_detokenizer = pw_tokenizer.Detokenizer(some_database)

   my_message = SomeMessage(tokenized_field=b'$YS1EMQ==')
   pw_tokenizer.proto.detokenize_fields(my_detokenizer, my_message)

   assert my_message.tokenized_field == b'The detokenized string! Cool!'

Decoding optionally tokenized strings
-------------------------------------
The encoding used for an optionally tokenized field is not recorded in the
protobuf. Despite this, the text can reliably be decoded. This is accomplished
by attempting to decode the field as binary or Base64 tokenized data before
treating it like plain text.

The following diagram describes the decoding process for optionally tokenized
fields in detail.

.. mermaid::

  flowchart TD
     start([Received bytes]) --> binary

     binary[Decode as<br>binary tokenized] --> binary_ok
     binary_ok{Detokenizes<br>successfully?} -->|no| utf8
     binary_ok -->|yes| done_binary([Display decoded binary])

     utf8[Decode as UTF-8] --> utf8_ok
     utf8_ok{Valid UTF-8?} -->|no| base64_encode
     utf8_ok -->|yes| base64

     base64_encode[Encode as<br>tokenized Base64] --> display
     display([Display encoded Base64])

     base64[Decode as<br>Base64 tokenized] --> base64_ok

     base64_ok{Fully<br>or partially<br>detokenized?} -->|no| is_plain_text
     base64_ok -->|yes| base64_results

     is_plain_text{Text is<br>printable?} -->|no| base64_encode
     is_plain_text-->|yes| plain_text

     base64_results([Display decoded Base64])
     plain_text([Display text])
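
The decision flow in the diagram can be sketched in a few lines of
standard-library Python. This is an illustration under stated assumptions, not
the code behind :py:func:`pw_tokenizer.proto.decode_optionally_tokenized`:
``DATABASE`` and ``decode_field_sketch`` are hypothetical names, arguments are
ignored, a Base64 message must fill the whole field, and ``str.isprintable`` is
used as a simple stand-in for the "text is printable" check.

.. code-block:: python

   import base64
   import struct
   from typing import Optional

   # Toy database: the token is the little-endian value of the bytes b"a-D1".
   DATABASE = {0x31442D61: "The detokenized string! Cool!"}


   def _lookup(data: bytes) -> Optional[str]:
       """Returns the string for a binary tokenized message, if known."""
       if len(data) < 4:
           return None
       (token,) = struct.unpack_from("<I", data)
       return DATABASE.get(token)


   def _as_base64(data: bytes) -> str:
       return "$" + base64.b64encode(data).decode("ascii")


   def decode_field_sketch(data: bytes) -> str:
       decoded = _lookup(data)  # 1. Try binary tokenized data.
       if decoded is not None:
           return decoded
       try:
           text = data.decode("utf-8")  # 2. Try UTF-8.
       except UnicodeDecodeError:
           return _as_base64(data)
       if text.startswith("$"):  # 3. Try prefixed Base64.
           try:
               decoded = _lookup(base64.b64decode(text[1:], validate=True))
           except ValueError:
               decoded = None
           if decoded is not None:
               return decoded
       # 4. Fall back to plain text, or Base64 if the text is unprintable.
       return text if text.isprintable() else _as_base64(data)

With this toy database, both ``b"a-D1"`` and ``b"$YS1EMQ=="`` decode to the
same string, while ``b"hello world"`` is returned unchanged.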

Potential decoding problems
---------------------------
The decoding process for optionally tokenized fields will yield correct results
in almost every situation. In rare circumstances it can fail, but these failures
can be avoided with a low-overhead mitigation if desired.

There are two ways in which the decoding process may fail.

Accidentally interpreting plain text as tokenized binary
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If a plain-text string happens to decode as a binary tokenized message, the
incorrect message could be displayed. This is very unlikely to occur. While many
tokens will incidentally end up being valid UTF-8 strings, it is highly unlikely
that a device will happen to log one of these strings as plain text. The
overwhelming majority of these strings will be nonsense.

An implementation that wishes to guard against this extremely improbable
situation can prevent it by appending 0xFF (or another byte never valid in
UTF-8) to binary tokenized data that happens to be valid UTF-8 (or to all binary
tokenized messages, if desired). When decoding, if there is an extra 0xFF byte,
it is discarded.
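
A minimal sketch of this mitigation follows, assuming the variant in which
0xFF is appended to all binary tokenized messages. The function names are
hypothetical and are not part of ``pw_tokenizer``.

.. code-block:: python

   MARKER = b"\xff"  # 0xFF never appears in valid UTF-8.


   def mark_binary_message(encoded: bytes) -> bytes:
       """Sender side: appends the marker to a binary tokenized message."""
       return encoded + MARKER


   def classify_field(received: bytes) -> tuple[str, bytes]:
       """Receiver side: returns the kind of field and its payload."""
       if received.endswith(MARKER):
           # Only binary tokenized data carries the marker; discard it.
           return "binary", received[: -len(MARKER)]
       return "text", received


   print(classify_field(mark_binary_message(b"a-D1")))  # ('binary', b'a-D1')
   print(classify_field(b"a-D1"))  # ('text', b'a-D1')

Because valid UTF-8 text can never end in 0xFF, the marker separates the two
cases at the cost of one byte per binary message.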

Displaying undecoded binary as plain text instead of Base64
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If a message fails to decode as binary tokenized and it is not valid UTF-8, it
is displayed as tokenized Base64. This makes it easily recognizable as a
tokenized message and makes it simple to decode later from the text output (for
example, with an updated token database).

A binary message for which the token is not known may coincidentally be valid
UTF-8 or ASCII. 6.25% of 4-byte sequences are composed only of ASCII characters.
When decoding with an out-of-date token database, it is possible that some
binary tokenized messages will be displayed as plain text rather than tokenized
Base64.

This situation is likely to occur, but should be infrequent. Even if it does
happen, it is not a serious issue. A very small number of strings will be
displayed incorrectly, but these strings cannot be decoded anyway. One nonsense
string (e.g. ``a-D1``) would be displayed instead of another (``$YS1EMQ==``).
Updating the token database would resolve the issue, though the non-Base64 logs
would be difficult to decode later from a log file.
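
Both figures in this section can be checked with a few lines of
standard-library Python. The variable names are illustrative only.

.. code-block:: python

   import base64

   # Each byte is ASCII with probability 128/256, so for four bytes:
   ascii_fraction = (128 / 256) ** 4
   print(f"{ascii_fraction:.2%}")  # 6.25%

   # The same four bytes, shown as plain text and as prefixed Base64.
   raw = b"a-D1"
   as_text = raw.decode("ascii")
   as_base64 = "$" + base64.b64encode(raw).decode("ascii")
   print(as_text, as_base64)  # a-D1 $YS1EMQ==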

This situation can be avoided with the same approach described in
`Accidentally interpreting plain text as tokenized binary`_. Appending
an invalid UTF-8 character prevents the undecoded binary message from being
interpreted as plain text.

---------------------
Detokenization in C++
---------------------
The C++ detokenization libraries can be used in C++ or any language that can
call into C++ with a C-linkage wrapper, such as Java or Rust. A reference
Java Native Interface (JNI) implementation is provided.

The C++ detokenization library uses binary-format token databases (created with
``database.py create --type binary``). Read a binary format database from a
file or include it in the source code. Pass the database array to
``TokenDatabase::Create``, and construct a detokenizer.

.. code-block:: cpp

   Detokenizer detokenizer(TokenDatabase::Create(token_database_array));

   std::string ProcessLog(span<uint8_t> log_data) {
     return detokenizer.Detokenize(log_data).BestString();
   }

The ``TokenDatabase`` class verifies that its data is valid before using it. If
the data is invalid, ``TokenDatabase::Create`` returns an empty database for
which ``ok()`` returns false. If the token database is included in the source
code, this check can be done at compile time.

.. code-block:: cpp

   // This line fails to compile with a static_assert if the database is invalid.
   constexpr TokenDatabase kDefaultDatabase = TokenDatabase::Create<kData>();

   Detokenizer OpenDatabase(std::string_view path) {
     std::vector<uint8_t> data = ReadWholeFile(path);

     TokenDatabase database = TokenDatabase::Create(data);

     // This checks if the file contained a valid database. It is safe to use a
     // TokenDatabase that failed to load (it will be empty), but it may be
     // desirable to provide a default database or otherwise handle the error.
     if (database.ok()) {
       return Detokenizer(database);
     }
     return Detokenizer(kDefaultDatabase);
   }

----------------------------
Detokenization in TypeScript
----------------------------
To detokenize in TypeScript, import ``Detokenizer`` from the ``pigweedjs``
package, and instantiate it with a CSV token database.

.. code-block:: typescript

   import { pw_tokenizer, pw_hdlc } from 'pigweedjs';
   const { Detokenizer } = pw_tokenizer;
   const { Frame } = pw_hdlc;

   const detokenizer = new Detokenizer(String(tokenCsv));

   function processLog(frame: Frame) {
     const result = detokenizer.detokenize(frame);
     console.log(result);
   }

For messages that are encoded in Base64, use ``Detokenizer.detokenizeBase64``.
``detokenizeBase64`` also attempts to detokenize nested Base64 tokens. There
is also ``detokenizeUint8Array``, which works just like ``detokenize`` but
expects a ``Uint8Array`` argument instead of a ``Frame``.

.. _module-pw_tokenizer-cli-detokenizing:

---------------------
Detokenizing CLI tool
---------------------
``pw_tokenizer`` provides two standalone command line utilities for detokenizing
Base64-encoded tokenized strings.

* ``detokenize.py`` -- Detokenizes Base64-encoded strings in files or from
  stdin.
* ``serial_detokenizer.py`` -- Detokenizes Base64-encoded strings from a
  connected serial device.

If the ``pw_tokenizer`` Python package is installed, these tools may be executed
as runnable modules. For example:

.. code-block::

   # Detokenize Base64-encoded strings in a file
   python -m pw_tokenizer.detokenize -i input_file.txt

   # Detokenize Base64-encoded strings in output from a serial device
   python -m pw_tokenizer.serial_detokenizer --device /dev/ttyACM0

See the ``--help`` options for these tools for full usage information.
347*61c4878aSAndroid Build Coastguard Worker
348*61c4878aSAndroid Build Coastguard Worker--------
349*61c4878aSAndroid Build Coastguard WorkerAppendix
350*61c4878aSAndroid Build Coastguard Worker--------
351*61c4878aSAndroid Build Coastguard Worker
352*61c4878aSAndroid Build Coastguard Worker.. _module-pw_tokenizer-python-detokenization-c99-printf-notes:
353*61c4878aSAndroid Build Coastguard Worker
354*61c4878aSAndroid Build Coastguard WorkerPython detokenization: C99 ``printf`` compatibility notes
355*61c4878aSAndroid Build Coastguard Worker=========================================================
356*61c4878aSAndroid Build Coastguard WorkerThis implementation is designed to align with the
357*61c4878aSAndroid Build Coastguard Worker`C99 specification, section 7.19.6
358*61c4878aSAndroid Build Coastguard Worker<https://www.dii.uchile.cl/~daespino/files/Iso_C_1999_definition.pdf>`_.
359*61c4878aSAndroid Build Coastguard WorkerNotably, this specification is slightly different than what is implemented
360*61c4878aSAndroid Build Coastguard Workerin most compilers due to each compiler choosing to interpret undefined
361*61c4878aSAndroid Build Coastguard Workerbehavior in slightly different ways. Treat the following description as the
362*61c4878aSAndroid Build Coastguard Workersource of truth.
363*61c4878aSAndroid Build Coastguard Worker
364*61c4878aSAndroid Build Coastguard WorkerThis implementation supports:
365*61c4878aSAndroid Build Coastguard Worker
366*61c4878aSAndroid Build Coastguard Worker- Overall Format: ``%[flags][width][.precision][length][specifier]``
- Flags (Zero or More)
   - ``-``: Left-justify within the given field width; right justification is
     the default (see Width modifier).
   - ``+``: Forces the result to be preceded by a plus or minus sign (``+`` or
     ``-``), even for positive numbers. By default, only negative numbers are
     preceded by a ``-`` sign.
   - (space): If no sign is going to be written, a blank space is inserted
     before the value.
   - ``#``: Specifies that an alternative print syntax should be used.
      - Used with the ``o``, ``x``, or ``X`` specifiers, the value is preceded
        by ``0``, ``0x``, or ``0X``, respectively, for nonzero values.
      - Used with ``a``, ``A``, ``e``, ``E``, ``f``, ``F``, ``g``, or ``G``, it
        forces the written output to contain a decimal point even if no more
        digits follow. By default, if no digits follow, no decimal point is
        written.
   - ``0``: Left-pads the number with zeroes (``0``) instead of spaces when
     padding is specified (see Width modifier).
- Width (Optional)
   - ``(number)``: Minimum number of characters to be printed. If the value to
     be printed is shorter than this number, the result is padded with blank
     spaces, or with ``0``\s if the ``0`` flag is present. The value is not
     truncated even if the result is larger. If the value is negative and the
     ``0`` flag is present, the padding ``0``\s are inserted after the ``-``
     sign.
   - ``*``: The width is not specified in the format string, but as an
     additional integer argument preceding the argument to be formatted.
- Precision (Optional)
   - ``.(number)``
      - For ``d``, ``i``, ``o``, ``u``, ``x``, ``X``, specifies the minimum
        number of digits to be written. If the value to be written is shorter
        than this number, the result is padded with leading zeros. The value is
        not truncated even if the result is longer.

        - A precision of ``0`` means that no character is written for the value
          ``0``.

      - For ``a``, ``A``, ``e``, ``E``, ``f``, and ``F``, specifies the number
        of digits to be printed after the decimal point. By default, this is
        ``6``.

      - For ``g`` and ``G``, specifies the maximum number of significant digits
        to be printed.

      - For ``s``, specifies the maximum number of characters to be printed. By
        default all characters are printed until the ending null character is
        encountered.

      - If the period is specified without an explicit value for precision,
        ``0`` is assumed.
   - ``.*``: The precision is not specified in the format string, but as an
     additional integer value argument preceding the argument that has to be
     formatted.
- Length (Optional)
   - ``hh``: Usable with ``d``, ``i``, ``o``, ``u``, ``x``, or ``X`` specifiers
     to convey the argument will be a ``signed char`` or ``unsigned char``.
     However, this is largely ignored in the implementation due to it not being
     necessary for Python or argument decoding (since the argument is always
     encoded at least as a 32-bit integer).
   - ``h``: Usable with ``d``, ``i``, ``o``, ``u``, ``x``, or ``X`` specifiers
     to convey the argument will be a ``signed short int`` or
     ``unsigned short int``. However, this is largely ignored in the
     implementation due to it not being necessary for Python or argument
     decoding (since the argument is always encoded at least as a 32-bit
     integer).
   - ``l``: Usable with ``d``, ``i``, ``o``, ``u``, ``x``, or ``X`` specifiers
     to convey the argument will be a ``signed long int`` or
     ``unsigned long int``. It is also usable with ``c`` and ``s`` to specify
     that the arguments will be encoded with ``wchar_t`` values (which are
     handled no differently from normal ``char`` values). However, this is
     largely ignored in the implementation due to it not being necessary for
     Python or argument decoding (since the argument is always encoded at least
     as a 32-bit integer).
   - ``ll``: Usable with ``d``, ``i``, ``o``, ``u``, ``x``, or ``X`` specifiers
     to convey the argument will be a ``signed long long int`` or
     ``unsigned long long int``. This is required to properly decode the
     argument as a 64-bit integer.
   - ``L``: Usable with ``a``, ``A``, ``e``, ``E``, ``f``, ``F``, ``g``, or
     ``G`` specifiers to convey the argument will be a ``long double``. However,
     this is ignored in the implementation because the floating-point value
     encoding is unaffected by bit width.
   - ``j``: Usable with ``d``, ``i``, ``o``, ``u``, ``x``, or ``X`` specifiers
     to convey the argument will be a ``intmax_t`` or ``uintmax_t``.
   - ``z``: Usable with ``d``, ``i``, ``o``, ``u``, ``x``, or ``X`` specifiers
     to convey the argument will be a ``size_t``. This will force the argument
     to be decoded as an unsigned integer, unless the ``d`` or ``i`` specifier
     is used.
   - ``t``: Usable with ``d``, ``i``, ``o``, ``u``, ``x``, or ``X`` specifiers
     to convey the argument will be a ``ptrdiff_t``.
   - If a length modifier is provided for an incorrect specifier, it is ignored.
- Specifier (Required)
   - ``d`` / ``i``: Used for signed decimal integers.

   - ``u``: Used for unsigned decimal integers.

   - ``o``: Used for unsigned integers and specifies formatting should be as an
     octal number.

   - ``x``: Used for unsigned integers and specifies formatting should be as a
     hexadecimal number using all lowercase letters.

   - ``X``: Used for unsigned integers and specifies formatting should be as a
     hexadecimal number using all uppercase letters.

   - ``f``: Used for floating-point values and specifies to use lowercase,
     decimal floating point formatting.

     - Default precision is ``6`` decimal places unless explicitly specified.

   - ``F``: Used for floating-point values and specifies to use uppercase,
     decimal floating point formatting.

     - Default precision is ``6`` decimal places unless explicitly specified.

   - ``e``: Used for floating-point values and specifies to use lowercase,
     exponential (scientific) formatting.

     - Default precision is ``6`` decimal places unless explicitly specified.

   - ``E``: Used for floating-point values and specifies to use uppercase,
     exponential (scientific) formatting.

     - Default precision is ``6`` decimal places unless explicitly specified.

   - ``g``: Used for floating-point values and specifies the use of ``f`` or
     ``e`` formatting, depending on which would be the shorter representation.

     - Precision specifies the number of significant digits, not just digits
       after the decimal place.

     - If the precision is specified as ``0``, it is interpreted to mean ``1``.

     - ``e`` formatting is used if the exponent would be less than ``-4`` or
       is greater than or equal to the precision.

     - Trailing zeros are removed unless the ``#`` flag is set.

     - A decimal point only appears if it is followed by a digit.

     - ``NaN`` or infinities always follow ``f`` formatting.

   - ``G``: Used for floating-point values and specifies the use of ``F`` or
     ``E`` formatting, depending on which would be the shorter representation.

     - Precision specifies the number of significant digits, not just digits
       after the decimal place.

     - If the precision is specified as ``0``, it is interpreted to mean ``1``.

     - ``E`` formatting is used if the exponent would be less than ``-4`` or
       is greater than or equal to the precision.

     - Trailing zeros are removed unless the ``#`` flag is set.

     - A decimal point only appears if it is followed by a digit.

     - ``NaN`` or infinities always follow ``F`` formatting.

   - ``c``: Used for formatting a ``char`` value.

   - ``s``: Used for formatting a string of ``char`` values.

     - If width is specified, the null terminator character is included as a
       character for width count.

     - If precision is specified, no more ``char``\s than that value will be
       written from the string (padding is used to fill additional width).

   - ``p``: Used for formatting a pointer address.

   - ``%``: Prints a single ``%``. Only valid as ``%%`` (supports no flags,
     width, precision, or length modifiers).

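Python's built-in printf-style ``%`` operator follows many of the same rules, so
it can serve as a rough illustration of the flag, width, precision, and ``g``
behavior listed above. This is only a sketch: the ``%`` operator is not the
``pw_tokenizer`` decoder, and it diverges in places (for example, Python
renders ``%#o`` with a ``0o`` prefix), so the examples stick to cases where
Python's output matches the description above. The ``formatted`` dictionary is
a name made up for this example.

.. code-block:: python

   # Illustration only: Python's own ``%`` operator, not the pw_tokenizer
   # decoder. Each comment shows the resulting string.
   formatted = {
       # Flags
       "left_justify": "%-6d|" % 42,          # '42    |'
       "force_sign": "%+d" % 42,              # '+42'
       "space_sign": "% d" % 42,              # ' 42'
       "alt_hex": "%#x" % 255,                # '0xff'
       "alt_float": "%#.0f" % 3.0,            # '3.'
       "zero_pad": "%06d" % 42,               # '000042'
       "zero_pad_negative": "%06d" % -42,     # '-00042'
       # Width and precision
       "star_width": "%*d" % (6, 42),         # '    42'
       "int_precision": "%.4d" % 42,          # '0042'
       "float_precision": "%.2f" % 3.14159,   # '3.14'
       "default_precision": "%f" % 1.5,       # '1.500000'
       "str_precision": "%.3s" % "abcdef",    # 'abc'
       "star_precision": "%.*f" % (2, 3.14159),  # '3.14'
       # ``g``: exponent -4 still uses ``f`` style, -5 switches to ``e`` style.
       "g_small_f": "%g" % 0.0001,            # '0.0001'
       "g_small_e": "%g" % 0.00001,           # '1e-05'
       # ``g``: exponent 5 < default precision 6, exponent 6 is not.
       "g_large_f": "%g" % 123456.0,          # '123456'
       "g_large_e": "%g" % 1234567.0,         # '1.23457e+06'
       "G_large_e": "%G" % 1234567.0,         # '1.23457E+06'
       # ``#`` keeps the trailing zeros that ``g`` would otherwise remove.
       "g_alt": "%#g" % 100.0,                # '100.000'
       # A precision of 0 is treated as 1 for ``g``.
       "g_zero_precision": "%.0g" % 123.0,    # '1e+02'
   }

   for name, text in formatted.items():
       print(f"{name}: {text!r}")
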
Underspecified details:

- If both ``+`` and (space) flags appear, the (space) is ignored.
- The ``+`` and (space) flags will error if used with ``c`` or ``s``.
- The ``#`` flag will error if used with ``d``, ``i``, ``u``, ``c``, ``s``, or
  ``p``.
- The ``0`` flag will error if used with ``c``, ``s``, or ``p``.
- Both ``+`` and (space) can work with the unsigned integer specifiers ``u``,
  ``o``, ``x``, and ``X``.
- If a length modifier is provided for an incorrect specifier, it is ignored.
- The ``z`` length modifier will decode arguments as signed as long as ``d`` or
  ``i`` is used.
- ``p`` is implementation defined.

  - For this implementation, it prints a ``0x`` prefix followed by the pointer
    value formatted with ``%08X``.

  - ``p`` supports the ``+``, ``-``, and (space) flags, but not the ``#`` or
    ``0`` flags.

  - None of the length modifiers are usable with ``p``.

  - This implementation will try to adhere to user-specified width (assuming the
    width provided is larger than the guaranteed minimum of ``10``).

  - Specifying precision for ``p`` is considered an error.
- Only ``%%`` is allowed with no other modifiers. Things like ``%+%`` will fail
  to decode. Some C stdlib implementations accept modifiers between the two
  ``%`` characters, but ignore them in the output.
- If a width is specified with the ``0`` flag for a negative value, the padded
  ``0``\s will appear after the ``-`` symbol.
- A precision of ``0`` for ``d``, ``i``, ``u``, ``o``, ``x``, or ``X`` means
  that no character is written for the value ``0``.
- Precision cannot be specified for ``c``.
- Using ``*`` or fixed precision with the ``s`` specifier still requires the
  string argument to be null-terminated. This is due to argument encoding
  happening on the C/C++-side while the precision value is not read or
  otherwise used until decoding happens in this Python code.

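The pointer rules above can be sketched in a few lines of Python. The
``format_pointer`` helper below is hypothetical and is not part of
``pw_tokenizer``; it only models the ``0x`` prefix, the ``%08X`` digits, the
``-`` flag, and the minimum width of ``10``, and it leaves out the ``+`` and
(space) flags. The ``sign_examples`` dictionary uses Python's own ``%`` operator,
which happens to agree with the sign and zero-padding rules in this list.

.. code-block:: python

   def format_pointer(address: int, width: int = 0,
                      left_justify: bool = False) -> str:
       """Hypothetical helper: ``0x`` followed by the address as ``%08X``."""
       text = "0x%08X" % address  # Always at least 10 characters long.
       # A requested width smaller than the text is ignored, never truncated.
       return text.ljust(width) if left_justify else text.rjust(width)


   # Python's ``%`` operator, used as a stand-in for two of the rules above.
   sign_examples = {
       "plus_beats_space": "%+ d" % 5,       # '+5'
       "zeros_after_minus": "%06d" % -42,    # '-00042'
   }

   print(format_pointer(0x1234))        # 0x00001234
   print(format_pointer(0x1234, 12))    # two leading spaces, then 0x00001234
   print(sign_examples)
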
Non-conformant details:

- ``n`` specifier: We do not support the ``n`` specifier, since it is
  impossible for us to retroactively tell the original program how many
  characters have been printed. Decoding happens long after the device sent the
  message, usually on a separate processing device entirely.
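
As a rough illustration of the overall grammar
``%[flags][width][.precision][length][specifier]``, the sketch below splits a
single conversion specification into its parts and rejects ``%n`` and modified
``%%`` forms such as ``%+%``. The ``_SPEC`` pattern and the ``parse_spec``
function are hypothetical names for this example and are not the regular
expression or API that ``pw_tokenizer`` uses. The specifier set is taken from
the Specifier list above, so ``a`` and ``A`` are left out, and the per-specifier
flag restrictions are not checked.

.. code-block:: python

   import re

   # Hypothetical pattern for one conversion specification.
   _SPEC = re.compile(
       r"%(?P<flags>[-+ #0]*)"
       r"(?P<width>\*|\d+)?"
       r"(?:\.(?P<precision>\*|\d*))?"
       r"(?P<length>hh|h|ll|l|L|j|z|t)?"
       r"(?P<specifier>[diuoxXfFeEgGcsp%])"
   )


   def parse_spec(spec: str) -> dict:
       """Splits a conversion specification into its named fields."""
       match = _SPEC.fullmatch(spec)
       if match is None:
           # Covers unsupported specifiers such as ``n``.
           raise ValueError(f"unsupported conversion specification: {spec!r}")
       fields = match.groupdict()
       if fields["specifier"] == "%" and spec != "%%":
           raise ValueError("only a bare %% is allowed")
       return fields


   print(parse_spec("%-08.3llx"))
   # An empty precision string means the period had no value (treated as 0).
   print(parse_spec("%.f")["precision"] == "")
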