.. _seed-0103:

============================================
0103: pw_protobuf: Past, present, and future
============================================
.. seed::
   :number: 0103
   :name: pw_protobuf: Past, present, and future
   :status: Accepted
   :proposal_date: 2023-08-16
   :cl: 133971
   :authors: Alexei Frolov
   :facilitator: Armando Montanez

-------
Summary
-------
``pw_protobuf`` is one of Pigweed's oldest modules and has become a foundational
component of Pigweed and Pigweed-based projects. At its core, ``pw_protobuf``
provides a compact and efficient `protobuf <https://protobuf.dev>`_ wire format
encoder and decoder, but as third-party usage has grown, additional higher-level
APIs have sprung up, many of which were contributed by third-party developers to
address use cases within their own projects.

The growth of ``pw_protobuf`` was not entirely controlled, which has resulted in
a lack of cohesion among its components, incomplete implementations, and
implicit, undocumented limitations. This has made the module difficult to
approach for new users and put a lasting maintenance burden on the core Pigweed
team.

This document explores the state of ``pw_protobuf`` and proposes a plan to
resolve the issues present in the module, both in the immediate short term and
as part of a longer-term vision.

---------------------------
Summary of Proposed Changes
---------------------------
The table below summarizes the states of the different ``pw_protobuf``
components following acceptance of this SEED. The reasoning behind these changes
is explained in further detail throughout the rest of the SEED.

.. list-table::
   :header-rows: 1

   * - Component
     - Status
     - Details
   * - Wire format encoder/decoder
     - Supported
     - * ``pw_protobuf``'s primary API.
       * Codegen helpers for convenient use.
       * Works with streams and direct buffers.
       * Recommended for compact and efficient protobuf operations.
   * - Find API
     - Supported
     - * Useful for extracting fields from messages without having to set up a
         decoder.
       * Recommended as an alternative to in-memory objects for small, simple
         messages.
       * Will be expanded with better support for repeated fields.
   * - Nanopb integration (build system / RPC)
     - Supported
     - * Recommended for newer projects that want a complete object model for
         their protobuf messages.
       * Recommended by default for RPC services.
       * Can easily be used alongside lower-level ``pw_protobuf`` APIs in cases
         where more control is required.
   * - Message API (``message.h``)
     - Deprecated
     - * Superseded by other APIs.
       * Only used by one project.
       * Code will be removed.
   * - Message structures
     - **Short-term:** Discouraged

       **Long-term:** Deprecated
     - * Will remain supported for existing users indefinitely, though no new
         features will be added.
       * Docs will be updated to clearly detail its limitations.
       * Not recommended to new users; Nanopb or the low-level APIs should be
         preferred.
       * Will be replaced with a newer ``pw_protobuf`` object model at an
         unspecified future point.
       * Code will remain until the new model is fully implemented and existing
         users have had time to migrate (with Pigweed assistance for internal
         customers).
   * - ``pwpb_rpc``
     - **Short-term:** Discouraged

       **Long-term:** Deprecated
     - * Will remain supported for existing users indefinitely, though no new
         features will be added.
       * Not recommended to new users; ``nanopb_rpc`` and/or raw methods should
         be preferred.
       * When the new ``pw_protobuf`` object model is added, it will come with
         updated RPC integration.
       * Code will remain until the new model is fully implemented and existing
         users have had time to migrate (with Pigweed assistance for internal
         customers).
   * - New ``pw_protobuf`` object model
     - **Long-term:** Planned
     - * Intended to replace existing message structures as the premier
         in-memory object model, with a more complete implementation of the
         protobuf spec.
       * Investigation and design will be examined in a future SEED.

----------------------------
Background and Current State
----------------------------

Protobuf Components
===================
``pw_protobuf`` today consists of several different layered APIs, which are
explored below.

Core encoder and decoder
------------------------
``pw_protobuf``'s core low-level APIs interact directly with the
`Protobuf wire format <https://protobuf.dev/programming-guides/encoding/>`_,
processing each field appearing in a message individually without any notion of
higher-level message semantics such as repeated or optional fields. These APIs
are compact and highly capable; they are able to construct any valid protobuf
message, albeit by pushing much of the burden onto users to ensure that they do
not encode fields in violation of their messages' schemas.
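
As a rough illustration of what writing a single field at the wire level
involves, the sketch below encodes one varint field by hand: a key combining
the field number and wire type, followed by the value in base-128 varint form.
The names ``AppendVarint`` and ``AppendVarintField`` are hypothetical and are
not part of the ``pw_protobuf`` API; the sketch also writes to a
``std::vector`` purely for brevity.

.. code-block:: c++

   #include <cstdint>
   #include <vector>

   // Appends `value` as a base-128 varint: seven bits per byte, least
   // significant group first, with the high bit set on all but the last byte.
   inline void AppendVarint(std::vector<uint8_t>& out, uint64_t value) {
     while (value >= 0x80) {
       out.push_back(static_cast<uint8_t>((value & 0x7F) | 0x80));
       value >>= 7;
     }
     out.push_back(static_cast<uint8_t>(value));
   }

   // Appends a varint-typed field. The key is (field_number << 3) | wire_type,
   // where wire type 0 denotes a varint. Nothing here checks the field against
   // a schema; that responsibility stays with the caller.
   inline void AppendVarintField(std::vector<uint8_t>& out,
                                 uint32_t field_number,
                                 uint64_t value) {
     constexpr uint64_t kWireTypeVarint = 0;
     AppendVarint(out,
                  (static_cast<uint64_t>(field_number) << 3) | kWireTypeVarint);
     AppendVarint(out, value);
   }

Encoding field number 1 with the value 150 this way yields the three bytes
``08 96 01``, the same example used in the protobuf encoding guide.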

Origin
^^^^^^
The idea for direct wire encoding originated prior to the inception of Pigweed,
when the team was setting up crash reporting for a project. Crash diagnostic
data was transmitted from each device as a protobuf message, which was encoded
using `nanopb <https://jpa.kapsi.fi/nanopb/>`_, a popular lightweight,
embedded-friendly protobuf library for serializing and deserializing protobuf
data to and from C structs.

To send crash reports, a single, statically-allocated crash message struct was
populated by the device's various subsystems, before being serialized to a
buffer and queued for transmission over the appropriate interface. The fields of
this struct ranged from single integers to complex nested messages. The nature
of nanopb in a static memory environment required each variable-length field in
the generated message to be reserved for its maximum allowable size, which
quickly blew up in the cases of large strings and repeated submessages. All in
all, the generated crash struct clocked in at around 12 KB, several times
larger than its encoded size: a high price to pay on such a
memory-constrained device.

This large overhead raised the question of whether it was necessary to store the
crash data in an intermediate format, or if this could be eliminated. By the
nature of the protobuf wire format, it is possible to build up a message in
parts, writing one field at a time. Due to this, it would be possible for each
subsystem to be passed a serializer which would allow it to write its
fields directly to the final output buffer, avoiding any additional in-memory
storage. This would be especially beneficial for variable-length fields, where
systems could write only as much data as they had at the moment, avoiding the
overhead of worst-case reservations. ``pw_protobuf`` was conceptualized as this
type of wire serializer, providing a convenient wrapper around direct
field-by-field serialization.

While the project ended up shipping with its original ``nanopb`` setup, a
prototype of this serializer was written as a proof of concept, and ended up
being refined to support all basic protobuf operations as one of the first
modules offered by the newly-started Pigweed project.

Implementation
^^^^^^^^^^^^^^
The core encoders have undergone several iterations over time. The
`original implementation <https://cs.opensource.google/pigweed/pigweed/+/bbf164c985576a348f3bcd4c48b3e9fd8a464a66:pw_protobuf/public/pw_protobuf/encoder.h;l=25>`_
offered a simple API to directly serialize single protobuf fields to an
in-memory buffer through a series of typed ``Encode`` functions. Message
nesting was handled manually by the user, calling a ``Push`` function to begin
writing fields to a submessage, followed by ``Pop`` on completion.

The decoder was a
`later addition <https://cs.opensource.google/pigweed/pigweed/+/6d9b9b447b84afb60e714ebd97523ee55b93c9a6:pw_protobuf/public/pw_protobuf/decoder.h;l=23>`_,
initially invoking a callback on each field in the serialized message with its
field number, giving the users the ability to extract the field by calling the
appropriate typed ``Decode`` function. This was implemented via a
``DecodeHandler`` virtual interface, and it persists to this day as
``CallbackDecoder``. However, this proved to be too cumbersome to use, so the
main decoder was `rewritten <https://cs.opensource.google/pigweed/pigweed/+/fe9723cd67796e9236022cde6ef42cda99682d77>`_
in the style of an iterator where users manually advanced it through the
serialized fields, decoding those which they cared about.

Streaming enhancement
^^^^^^^^^^^^^^^^^^^^^
The original encoder and decoder were designed to operate on messages which fit
into buffers directly in memory. However, as the ``pw_stream`` interface was
stabilized and adopted, there was interest in processing protobuf messages whose
data was not fully available (for example, reading out of flash
sector-by-sector). This prompted another rewrite of the core classes to make
``pw::Stream`` the interface to the serialized data. This was done differently
for the encoder and decoder: the encoder only operates on streams, with
``MemoryEncoder`` becoming a shallow wrapper instantiating a ``MemoryWriter`` on
top of a buffer, whereas the decoder ended up having two separate, parallel
``StreamDecoder`` and ``MemoryDecoder`` implementations.

The reason for this asymmetry has to do with the manner in which the two were
implemented. The encoder was
`rewritten first <https://cs.opensource.google/pigweed/pigweed/+/0ed221cbb8b943205dea4ac315fe1d4b1e6b7371>`_,
and carefully designed to function on top of the limited semantic guarantees
offered by ``pw_stream``. Following this redesign, it seemed obvious and natural
to use the existing MemoryStream to provide the previous encoding functionality
nearly transparently. However, when reviewing this implementation with the
larger team, several potential issues were noted. What was previously a simple
memory access to write a protobuf field became an expensive virtual call which
could not be elided. The common use case of serializing a message to a buffer
had become significantly less performant, prompting concerns about the impact of
the change. Additionally, it was noted that this performance impact would be far
worse on the decoding side, where serialized varints had to be read one byte at
a time.
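
To make the decoding concern concrete, the following sketch decodes a varint
from an abstract byte source. ``ByteReader`` and ``SpanReader`` are
hypothetical stand-ins for a stream interface and are not the ``pw_stream``
API. The point is only that the decoder cannot know a varint's length in
advance, so each byte costs one virtual call.

.. code-block:: c++

   #include <cstddef>
   #include <cstdint>
   #include <optional>

   // Hypothetical stream-like interface (an assumption for this sketch).
   class ByteReader {
    public:
     virtual ~ByteReader() = default;
     // Returns the next byte, or std::nullopt at the end of the input.
     virtual std::optional<uint8_t> ReadByte() = 0;
   };

   // In-memory implementation that counts how often it is called.
   class SpanReader final : public ByteReader {
    public:
     SpanReader(const uint8_t* data, size_t size) : data_(data), size_(size) {}

     std::optional<uint8_t> ReadByte() override {
       ++calls_;
       if (pos_ >= size_) {
         return std::nullopt;
       }
       return data_[pos_++];
     }

     size_t calls() const { return calls_; }

    private:
     const uint8_t* data_;
     size_t size_;
     size_t pos_ = 0;
     size_t calls_ = 0;
   };

   // Reads one varint. A 64-bit varint occupies at most ten bytes.
   inline std::optional<uint64_t> ReadVarint(ByteReader& reader) {
     uint64_t value = 0;
     for (int shift = 0; shift < 64; shift += 7) {
       const std::optional<uint8_t> byte = reader.ReadByte();  // Virtual call.
       if (!byte.has_value()) {
         return std::nullopt;  // Input ended in the middle of a varint.
       }
       value |= static_cast<uint64_t>(*byte & 0x7F) << shift;
       if ((*byte & 0x80) == 0) {
         return value;  // High bit clear: this was the final byte.
       }
     }
     return std::nullopt;  // Continuation bit still set after ten bytes.
   }

A memory-backed decoder can instead inspect the buffer directly, which is the
difference the planned comparison was meant to quantify.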

As a result, it was decided that a larger analysis was required. To aid this,
the stream-based decoder would be implemented separately from the existing
memory decoder so that direct comparisons could be made between the two
implementations. Unfortunately, the performance of the two implementations was
never properly analyzed as the team became entangled in higher priority
commitments.

.. code-block:: c++

   class StreamEncoder {
    public:
     constexpr StreamEncoder(stream::Writer& writer, ByteSpan scratch_buffer);

     Status WriteUint32(uint32_t field_number, uint32_t value);
     Status WriteString(uint32_t field_number, std::string_view value);
   };

*A subset of the StreamEncoder API, demonstrating its low-level field writing
operations.*

Wire format code generation
---------------------------
``pw_protobuf`` provides lightweight generated code wrappers on top of its core
wire format encoder and decoder which eliminate the need to provide the correct
field number and type when writing/reading serialized fields. Each generated
function calls directly into the underlying encoder/decoder API, in theory
making them zero-overhead wrappers.

The encoder codegen was part of the original implementation of ``pw_protobuf``.
It constituted a ``protoc`` plugin written in Python, and several GN build
templates to define protobuf libraries and invoke ``protoc`` on them to create
a C++ target which could be depended on by others. The build integration was
added separately to the main protobuf module, as ``pw_protobuf_compiler``, and
has since expanded to support many different protobuf code generators in various
languages.

The decoder codegen was added at a much later date, alongside the struct object
model. Like the encoder codegen, it defines wrappers around the underlying
decoder functions which populate values for each of a message's fields, though
users are still required to manually iterate through the message and extract
each field.

.. code-block:: c++

   class FooEncoder : public ::pw::protobuf::StreamEncoder {
    public:
     Status WriteBar(uint32_t value) {
       return ::pw::protobuf::StreamEncoder::WriteUint32(
           static_cast<uint32_t>(Fields::kBar), value);
     }
   };

*An example of how a generated encoder wrapper calls into the underlying
operation.*

Message API
-----------
The ``Message`` API was the first attempt at providing higher-level semantic
wrappers on top of ``pw_protobuf``'s direct wire serialization. It was developed
in conjunction with the implementation of Pigweed's software update flow for a
project and addressed several use cases that came up with the way the project
stored its update bundle metadata.

This API works on the decoding side only, giving users easier access to fields
of a serialized message. It provides functions which scan a message for a field
using its field number (similar to the ``Find`` APIs discussed later). However,
instead of deserializing the field and returning its data directly, these APIs
give the user a typed handle to the field which can be used to read it.

These field handles apply protobuf semantics beyond the field-by-field iteration
of the low level decoder. For example, a field can be accessed as a repeated
field, whose handle provides a C++ iterator over each instance of the field in
the serialized message. Additionally, ``Message`` is the only API currently in
``pw_protobuf`` which allows users to work directly with protobuf ``map``
fields, reading key-value pairs from a message.

.. code-block:: c++

   // Parse repeated field `repeated string rep_str = 5;`
   RepeatedStrings rep_str = message.AsRepeatedString(5);
   // Iterate through the entries.
   for (String element : rep_str) {
     // Process element
   }

   // Parse map field `map<string, bytes> str_to_bytes = 7;`
   StringToBytesMap str_to_bytes = message.AsStringToBytesMap(7);
   // Access the entry by a given key value
   Bytes bytes_for_key = str_to_bytes["key"];
   // Or iterate through map entries
   for (StringToBytesMapEntry entry : str_to_bytes) {
     String key = entry.Key();
     Bytes value = entry.Value();
     // Process entry
   }

*Examples of reading repeated and map fields from a serialized protobuf using
the Message API.*

Message structures
------------------
``pw_protobuf``'s message structure API is its premier high-level, in-memory
object model. It was contributed by an external team with some guidance from
Pigweed developers and was driven largely by a desire to work conveniently with
protobufs in RPC methods without the burden of a third-party dependency in
``nanopb`` (the only officially supported protobuf library in RPC at the time).

Message structures function similarly to more conventional protobuf libraries,
where every definition in a ``.proto`` file generates a corresponding C++
object. In the case of ``pw_protobuf``, these objects are defined as structs
containing the fields of their protobuf message as members. Functions are
provided to encode from or decode to one of these structs, removing the manual
per-field processing from the lower-level APIs.

Each field in a protobuf message becomes an inline member of its generated
struct. Protobuf types are mapped to C++ types where possible, with special
handling of protobuf specifiers and variable-length fields. Fields labeled as
optional are wrapped in a ``std::optional`` from the STL. Fields labeled as
``oneof`` are not supported (in fact, the code generator completely ignores the
keyword). Variable-length fields can either be inlined or handled through
callbacks invoked by the encoder or decoder when processing the message. If
inlined, a container sized to a user-specified maximum length is generated. For
strings, this is a ``pw::InlineString`` while most other fields use a
``pw::Vector``.

Similar to nanopb, users can pass options to the ``pw_protobuf`` generator
through the protobuf compiler to configure their generated message structures.
These allow specifying the maximum size of variable-length fields, setting a
fixed size, or forcing the use of callbacks for encoding and decoding. Options
may be specified inline in the proto file or listed in a separate file
(conventionally named ``.options``) to avoid leaking ``pw_protobuf``-specific
metadata into protobuf files that may be shared across multiple languages and
protobuf compiler contexts.

Unlike the lower-level generated classes which require custom per-field encoding
and decoding functions, message serialization is handled generically through the
use of a field descriptor table. The descriptor table for a message contains an
entry for each of its fields, storing its type, field number, and other metadata
alongside its offset within the generated message structure. This table is
generated once per message defined in a protobuf file, trading a small
additional memory overhead for reduced code size when serializing and
deserializing data.

.. code-block:: proto

   message Customer {
     int32 age = 1;
     string name = 2;
     optional fixed32 loyalty_id = 3;
   }

.. code-block:: c++

   struct Customer::Message {
     int32_t age;
     pw::InlineString<32> name;
     std::optional<uint32_t> loyalty_id;
   };

*Example of how a protobuf message definition is converted to a C++ struct.*
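
The descriptor-table approach described above can be sketched roughly as
follows. This is a simplified illustration, not ``pw_protobuf``'s actual
descriptor format: ``FieldDescriptor``, ``kCustomerFields``, and
``EncodeWithTable`` are hypothetical names, the struct is reduced to two
scalar members, and every field is written unconditionally.

.. code-block:: c++

   #include <cstddef>
   #include <cstdint>
   #include <cstring>
   #include <vector>

   // Simplified stand-in for a generated message struct (assumption).
   struct Customer {
     int32_t age;
     uint32_t loyalty_id;
   };

   enum class FieldType : uint8_t { kInt32, kFixed32 };

   // One table entry per field: what it is and where it lives in the struct.
   struct FieldDescriptor {
     uint32_t field_number;
     FieldType type;
     size_t offset;
   };

   constexpr FieldDescriptor kCustomerFields[] = {
       {1, FieldType::kInt32, offsetof(Customer, age)},
       {3, FieldType::kFixed32, offsetof(Customer, loyalty_id)},
   };

   inline void AppendVarint(std::vector<uint8_t>& out, uint64_t value) {
     while (value >= 0x80) {
       out.push_back(static_cast<uint8_t>((value & 0x7F) | 0x80));
       value >>= 7;
     }
     out.push_back(static_cast<uint8_t>(value));
   }

   // A single generic routine serves every message type; only the table
   // differs per message, so no per-field encoding code is generated.
   inline void EncodeWithTable(const void* message,
                               const FieldDescriptor* fields,
                               size_t field_count,
                               std::vector<uint8_t>& out) {
     const auto* base = static_cast<const uint8_t*>(message);
     for (size_t i = 0; i < field_count; ++i) {
       const FieldDescriptor& field = fields[i];
       const uint64_t key = static_cast<uint64_t>(field.field_number) << 3;
       switch (field.type) {
         case FieldType::kInt32: {
           int32_t value;
           std::memcpy(&value, base + field.offset, sizeof(value));
           AppendVarint(out, key | 0);  // Wire type 0: varint.
           // int32 values are sign-extended to 64 bits on the wire.
           AppendVarint(out, static_cast<uint64_t>(static_cast<int64_t>(value)));
           break;
         }
         case FieldType::kFixed32: {
           uint32_t value;
           std::memcpy(&value, base + field.offset, sizeof(value));
           AppendVarint(out, key | 5);  // Wire type 5: 32-bit little-endian.
           for (int shift = 0; shift < 32; shift += 8) {
             out.push_back(static_cast<uint8_t>(value >> shift));
           }
           break;
         }
       }
     }
   }

Supporting an additional message in this scheme means emitting another table
rather than another set of functions, which is where the code size saving
comes from.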

Find API
--------
``pw_protobuf``'s set of ``Find`` APIs constitute functions for extracting
single fields from serialized messages. The functions scan the message for a
field number and decode it as a specified protobuf type. Like the core
serialization APIs, there are two levels to ``Find``: direct low-level typed
functions, and generated code functions that invoke these for named protobuf
fields.

Extracting a single field is a common protobuf use case, and was envisioned
early in ``pw_protobuf``'s development. An initial version of ``Find`` was
started shortly after the original callback-based decoder was implemented,
providing a ``DecodeHandler`` to scan for a specific field number in a message.
This version was never fully completed and did not see any production use. More
recently, the ``Find`` APIs were revisited and reimplemented on top of the
iterative decoder.

.. code-block:: c++

   pw::Result<int32_t> age = Customer::FindAge(serialized_customer);
   if (age.ok()) {
     PW_LOG_INFO("Age is %d", static_cast<int>(age.value()));
   }

*An example of using a generated Find function to extract a field from a
serialized protobuf message.*

RPC integration
---------------
Pigweed RPC exchanges data in the form of protobufs and was designed to allow
users to implement their services using different protobuf libraries, with some
supported officially. Supporting the use of ``pw_protobuf`` had been a goal from
the beginning, but it was never implemented on top of the direct wire encoders
and decoders. Despite this, several RPC service implementations in Pigweed and
customer projects ended up using ``pw_protobuf`` on top of the raw RPC method
API, manually decoding and encoding messages.

When message structures were contributed, they came with an expansion of RPC to
allow their usage in method implementations, becoming the second officially
supported protobuf library. ``pw_protobuf`` methods are structured and behave
similarly to RPC's nanopb-based methods, automatically deserializing requests
from and serializing responses to their generated message structures.

What Works Well
===============
Overall, ``pw_protobuf`` has been a largely successful module despite its
growing pains. It has become an integral part of Pigweed, used widely upstream
across major components of the system, including logging and crash reporting.
Several Pigweed customers have also shown a preference for ``pw_protobuf``,
choosing it over other embedded protobuf libraries like nanopb.

The list below summarizes some of ``pw_protobuf``'s successes.

**Overall**

* Widespread adoption across Pigweed and Pigweed-based projects.

* Easy to integrate into a project which uses Pigweed's build system.

* Often comes at a minimal additional cost to projects, as the core of
  ``pw_protobuf`` is already used by popular upstream modules.

**Core wire format encoders/decoders**

* Simple, intuitive APIs which give users a lot of control over the structure
  of their messages.

* Lightweight in terms of code size and memory use.

**Codegen general**

* Build system integration is extensive and generally simple to use.

* Low-level codegen wrappers are convenient to use without sacrificing the
  power of the underlying APIs.

**Message API**

* Though only used by a single project, it works well for their needs and
  gives them extensive semantic processing of serialized messages without the
  overhead of decoding to a full in-memory object.

* More capable processing than the Find APIs: for example, allowing iteration
  over elements of a repeated field.

* As the entire API is stream-based, it permits useful operations such as
  giving the user a bounded stream over a bytes field of the message,
  eliminating the need for an additional copy of data.

* Support for protobuf maps, something which is absent from any other
  ``pw_protobuf`` API.

**Message Structures**

* Message structures work incredibly well for the majority of simple use cases,
  making protobufs easy to use without having to understand the details of the
  wire format.

* Adoption of ``pw_protobuf`` increased following the addition of this API and
  corresponding RPC support, indicating that it is more valuable to a typical
  user who is not concerned with the minor efficiencies offered by the
  lower-level APIs.

* Encoding and decoding messages is efficient due to the struct model's generic
  table-based implementation. Users do not have to write custom code to process
  each message as they would with the lower-level APIs, resulting in reduced
  overall code size in some cases.

* Nested messages are far easier to handle than in the other APIs, which
  require additional setup to create sub-encoders/decoders.

* The use of containers such as ``pw::Vector`` for repeated fields simplifies
  their use and avoids the issues of similar libraries such as nanopb, where
  users have to remember to manually set their length.

**Find API**

* Eliminates a lot of boilerplate in the common use case where only a single
  field from a message needs to be read.

**RPC integration**

* Has seen a high rate of adoption as it provides a convenient API to read and
  write requests and responses without requiring the management of a third-party
  library dependency.

* ``pw_protobuf``-based RPC services can still fall back on the raw RPC API in
  instances where more flexible handling is required.

The Issues
==========

Overview
--------
This section shows a summary of the known issues present at each layer of the
current ``pw_protobuf`` module. Several of these issues will be explored in
further detail later.

**Overall**

* Lack of an overall vision and cohesive story: What is ``pw_protobuf`` trying
  to be and what kinds of users does it target? Where does it fit into the
  larger protobuf ecosystem?

* Documentation structure doesn't clearly guide users. Should be addressed in
  conjunction with the larger :ref:`SEED-0102 <seed-0102>` effort.

* Too many overlapping implementations. We should focus on one model with a
  clear delineation between its layers.

* Despite describing itself as a lightweight and efficient protobuf library,
  few size reports or performance statistics are provided to substantiate
  these claims.

**Core wire format encoders/decoders**

* Parallel memory and stream decoder implementations which don't share any code.
  They also have different APIs, e.g. using ``Result`` (stream decoder) vs. a
  ``Status`` and output pointer (memory decoder).

* Effectively-deprecated APIs still exist (e.g. ``CallbackDecoder``).

* Inefficiencies when working with varints and streams. When reading a varint
  from a message, the ``StreamDecoder`` consumes its stream one byte at a time,
  each going through a potentially costly virtual call to the underlying
  implementation.

**Codegen general**

* The headers generated by ``pw_protobuf`` are poorly structured. Some users
  have observed large compiler memory usage parsing them, which may be related.

* Each message in a ``.proto`` file generates a namespace in C++, in which its
  generated classes appear. This is unintuitive and difficult to use, with most
  users resorting to a mess of using statements at the top of each file that
  works with protobufs.

* Due to the way ``pw_protobuf`` appends its own namespace to users' proto
  packages, it is not always possible to deduce where this namespace will exist
  in external compilation units. To work around this, a somewhat hacky approach
  is used where every generated ``pw_protobuf`` namespace is aliased within a
  root-level namespace scope.

* While basic codegen works in all build systems, only the GN build supports
  the full capabilities of ``pw_protobuf``. Several essential features, such as
  options files, are missing from other builds.

* There appear to be issues with how the codegen steps are exposed to the CMake
  build graph, preventing protobuf files from being regenerated as a result of
  some codegen script modifications.

* Protobuf editions, the modern replacement for the proto2 and proto3 syntax
  options, are not supported by the code generator. Files using them fail to
  compile.

**Message API**

* The message API as a whole has been superseded by the structure API, and there
  is no reason for it to be used.

**Message structures**

* Certain types of valid proto messages are impossible to represent due to
  language limitations. For example, as message structs directly embed
  submessages, a circular dependency between nested messages cannot exist.

* Optional proto fields are represented in C++ by ``std::optional``. This has
  several issues:

  * Memory overhead as a result of storing each field's presence flag
    individually.

  * Inconsistent with how other protobuf libraries function. Typically, field
    presence is exposed through a separate API, with accessors always
    returning a value (the default if absent).

* Not all types of fields are supported. Optional strings and optional
  submessages do not work (the generator effectively ignores the ``optional``
  specifier). ``oneof`` fields do not work.

* Not all options work for all fields. Fixed/max size specifiers to inline
  repeated fields generally only work for simple field types; callbacks must
  be used otherwise.

* In cases where the generator does not support something, it often does not
  indicate this to the user, silently outputting incorrect code instead.

* Options files share both a filename and some option names with other protobuf
  libraries, namely Nanopb. This can cause issues when trying to use the same
  protobuf definition in different contexts, as the options do not always work
  the same way in both.

**Find API**

* Lack of support for repeated fields. Only the first element will be found.

* Similarly, does not support recurring non-repeated fields. The protobuf
  specification requires that scalar fields be overridden if they reappear,
  while string, bytes, or submessage fields are merged.

* Only one layer of searching is supported; it is not possible to look up a
  nested field.

* The stream version of the Find API does not allow scanning for submessages due
  to limitations with the ownership and lifetime of its decoder.
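
To illustrate the overriding rule for recurring scalar fields mentioned above,
here is a minimal sketch of a scan that keeps the last occurrence of a varint
field rather than stopping at the first. ``FindLastVarint`` is a hypothetical
helper written for this example, not an existing or proposed ``pw_protobuf``
function, and it operates on a plain in-memory buffer.

.. code-block:: c++

   #include <cstddef>
   #include <cstdint>
   #include <optional>

   // Reads a varint starting at data[pos] and advances pos past it.
   // Returns std::nullopt if the varint is truncated or too long.
   inline std::optional<uint64_t> ReadVarint(const uint8_t* data,
                                             size_t size,
                                             size_t& pos) {
     uint64_t value = 0;
     for (int shift = 0; shift < 64 && pos < size; shift += 7) {
       const uint8_t byte = data[pos++];
       value |= static_cast<uint64_t>(byte & 0x7F) << shift;
       if ((byte & 0x80) == 0) {
         return value;
       }
     }
     return std::nullopt;
   }

   // Scans the whole message and returns the value of the last varint field
   // with the given number. Returns std::nullopt if the field is absent or
   // the message is malformed (this sketch does not distinguish the two).
   inline std::optional<uint64_t> FindLastVarint(const uint8_t* data,
                                                 size_t size,
                                                 uint32_t field_number) {
     std::optional<uint64_t> result;
     size_t pos = 0;
     while (pos < size) {
       const std::optional<uint64_t> key = ReadVarint(data, size, pos);
       if (!key.has_value()) {
         return std::nullopt;
       }
       switch (*key & 0x7) {
         case 0: {  // Varint.
           const std::optional<uint64_t> value = ReadVarint(data, size, pos);
           if (!value.has_value()) {
             return std::nullopt;
           }
           if ((*key >> 3) == field_number) {
             result = value;  // A later occurrence overrides an earlier one.
           }
           break;
         }
         case 1:  // Fixed 64-bit value: skip it.
           if (size - pos < 8) {
             return std::nullopt;
           }
           pos += 8;
           break;
         case 2: {  // Length-delimited value: skip its payload.
           const std::optional<uint64_t> length = ReadVarint(data, size, pos);
           if (!length.has_value() || *length > size - pos) {
             return std::nullopt;
           }
           pos += static_cast<size_t>(*length);
           break;
         }
         case 5:  // Fixed 32-bit value: skip it.
           if (size - pos < 4) {
             return std::nullopt;
           }
           pos += 4;
           break;
         default:  // Groups and unknown wire types are not handled here.
           return std::nullopt;
       }
     }
     return result;
   }

The cost of this behavior is that the scan can no longer return early on the
first match, which is one reason it is a design decision rather than a simple
fix.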

**RPC integration**

* RPC creates and runs message encoders and decoders for the user. Therefore, it
  is not possible to use any messages with callback-based fields in RPC method
  implementations.

Deep dive on selected issues
----------------------------

Generated namespaces
^^^^^^^^^^^^^^^^^^^^
From its first implementation, ``pw_protobuf``'s generator has output a
namespace for each message in a file, and all subsequent generated code was
added on top of this structure.

The reason for this unusual design choice was to work around C++'s
declaration-before-definition rule to allow circularly-referential protobuf
messages. Each message's associated generated classes are first forward-declared
at the start of the generated header, and later defined as necessary.

For example, given a message ``Foo``, the following code is generated:

.. code-block:: c++

   namespace Foo {

   // Message field numbers.
   enum class Fields : uint32_t;

   // Generated struct.
   struct Message;

   class StreamEncoder;
   class StreamDecoder;

   // Some other metadata omitted.

   }  // namespace Foo

The more intuitive approach of generating a struct/class directly for each
message is difficult, if not impossible, to cleanly implement under the current
``pw_protobuf`` object model. There are several reasons for this, the primary
one being that cross-message dependencies cannot easily be generated due to
the aforementioned declaration issues. C++ does not allow forward-declaring a
nested class, so certain types of nested message relationships are not directly
representable. Some potential workarounds have been suggested for this, such as
defining struct members as aliases to internally-generated types, but we were
unable to get this working correctly in a timeboxed prototyping session.
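
The declaration issue can be seen in a small standalone sketch. The class
bodies below are hypothetical and far simpler than real generated code; they
only show that two namespace-scoped classes can refer to each other once both
are forward-declared, whereas the equivalent nested-class declaration is
rejected by the compiler.

.. code-block:: c++

   // Forward declarations, as they might appear at the top of a generated
   // header. Each message gets its own namespace.
   namespace Foo {
   class StreamEncoder;
   }  // namespace Foo

   namespace Bar {
   class StreamEncoder;
   }  // namespace Bar

   namespace Foo {
   class StreamEncoder {
    public:
     explicit StreamEncoder(int depth = 0) : depth_(depth) {}
     // Bar::StreamEncoder is still an incomplete type here, which is
     // permitted in a function declaration.
     Bar::StreamEncoder GetBarEncoder();
     int depth() const { return depth_; }

    private:
     int depth_;
   };
   }  // namespace Foo

   namespace Bar {
   class StreamEncoder {
    public:
     explicit StreamEncoder(int depth = 0) : depth_(depth) {}
     Foo::StreamEncoder GetFooEncoder();
     int depth() const { return depth_; }

    private:
     int depth_;
   };
   }  // namespace Bar

   // Definitions follow once both classes are complete.
   inline Bar::StreamEncoder Foo::StreamEncoder::GetBarEncoder() {
     return Bar::StreamEncoder(depth_ + 1);
   }

   inline Foo::StreamEncoder Bar::StreamEncoder::GetFooEncoder() {
     return Foo::StreamEncoder(depth_ + 1);
   }

   // By contrast, a nested class cannot be declared before its enclosing
   // class has been defined, so this does not compile:
   //
   //   class Outer::Inner;
   //   class Outer { public: class Inner {}; };

Generating one class per message with nested helper types would run into the
commented-out case whenever two messages refer to each other's nested types.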

Message structures
^^^^^^^^^^^^^^^^^^
Many of the issues with message structs stem from the same language limitations
as those described above with namespacing. As the generated structures' members
are embedded directly within them and publicly exposed, it is not possible to
represent certain types of valid protobuf messages. Additionally, the way
certain types of fields are generated is problematic, as described below.

**Optional fields**

A field labeled as ``optional`` in a proto file generates a struct member
wrapped in a ``std::optional`` from the C++ standard library. This choice is
semantically inconsistent with how the official protobuf libraries in other
languages are designed. In those libraries, accessing a field always returns a
valid value. If the field is absent, its default value is returned (the zero
value unless otherwise specified). Presence checking is implemented as a
parallel API for users who require it.

This choice also results in additional memory overhead, as each field's presence
flag is stored within its optional wrapper, padding what could otherwise be a
single bit to a much larger aligned size. In the conventional disconnected model
of field presence, the generated object could instead store a bitfield with an
entry for each of its members, compacting its overall size.
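
As a rough illustration of this overhead, the sketch below compares two
hypothetical layouts for a message with three optional ``uint32`` fields.
Neither struct is produced by ``pw_protobuf``; the names are invented for this
example, and exact sizes depend on the platform ABI and standard library.

.. code-block:: c++

   #include <cstdint>
   #include <optional>

   // Presence is stored inside each member, as with std::optional.
   struct WithOptionalMembers {
     std::optional<uint32_t> a;
     std::optional<uint32_t> b;
     std::optional<uint32_t> c;
   };

   // Presence is tracked separately, using one bit per field.
   struct WithPresenceBits {
     uint32_t a = 0;
     uint32_t b = 0;
     uint32_t c = 0;
     uint8_t presence = 0;  // Bit 0: a, bit 1: b, bit 2: c.
   };

   // Each optional holds a flag next to its value and is then padded to the
   // value's alignment, so the first layout is the larger of the two.
   static_assert(sizeof(WithOptionalMembers) > sizeof(WithPresenceBits));

On a typical ABI where ``uint32_t`` is 4-byte aligned, the first struct
occupies 24 bytes and the second 16.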

Optional fields are not supported for all types. The code generator ignores the
``optional`` specifier when it is set on string fields, as well as on nested
messages, generating the member as a regular field and serializing it per
standard ``proto3`` rules, omitting a zero-valued entry.

Implementing ``optional`` the typical way would require hiding the members of
each generated message and instead providing accessor functions to modify them,
which would check for presence and insert default values where appropriate.
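
A minimal sketch of what such an accessor-based message might look like is
shown below. The ``Settings`` class, its ``timeout_ms`` field, and the
``AccessorsBehaveAsExpected`` helper are hypothetical and do not correspond to
any existing ``pw_protobuf`` API. They only illustrate the conventional model,
in which reads always return a valid value and presence is queried separately.

.. code-block:: c++

   #include <cstdint>

   // Hypothetical accessor-based message with one optional uint32 field.
   class Settings {
    public:
     // Reading always yields a valid value: the stored one, or the default
     // when the field is absent.
     constexpr uint32_t timeout_ms() const {
       return has_timeout_ms() ? timeout_ms_ : kDefaultTimeoutMs;
     }

     // Presence is exposed through a parallel API.
     constexpr bool has_timeout_ms() const {
       return (presence_ & kTimeoutMsBit) != 0;
     }

     constexpr void set_timeout_ms(uint32_t value) {
       timeout_ms_ = value;
       presence_ = static_cast<uint8_t>(presence_ | kTimeoutMsBit);
     }

     constexpr void clear_timeout_ms() {
       presence_ = static_cast<uint8_t>(presence_ & ~kTimeoutMsBit);
     }

    private:
     static constexpr uint32_t kDefaultTimeoutMs = 0;  // proto3 zero default.
     static constexpr uint8_t kTimeoutMsBit = 1u << 0;

     uint32_t timeout_ms_ = 0;
     uint8_t presence_ = 0;  // One bit per optional field.
   };

   // Exercises the accessors; can be evaluated at compile time.
   constexpr bool AccessorsBehaveAsExpected() {
     Settings settings{};
     if (settings.has_timeout_ms() || settings.timeout_ms() != 0) {
       return false;
     }

     settings.set_timeout_ms(0);  // Explicitly set to the default value.
     if (!settings.has_timeout_ms()) {
       return false;
     }

     settings.clear_timeout_ms();
     return !settings.has_timeout_ms() && settings.timeout_ms() == 0;
   }

Note that explicitly setting the field to its default value still marks it as
present, which is the distinction a bare zero-initialized struct member cannot
express.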

**Oneof fields**

The ``pw_protobuf`` code generator completely ignores the ``oneof`` specifier
when processing a message. When multiple fields are listed within a ``oneof``
block in a ``.proto`` file, the generated struct will contain all of them as
direct members without any notion of exclusivity. This permits ``pw_protobuf``
to encode semantically invalid protobuf messages: if multiple members of a
``oneof`` are set, the encoder will serialize all of them, creating a message
that is unprocessable by other protobuf libraries.

For example, given the following protobuf definition:

.. code-block:: proto

   message Foo {
     oneof variant {
       uint32 a = 1;
       uint32 b = 2;
     }
   }

The generator will output the following struct, allowing invalid messages to be
written.

.. code-block:: c++

   struct Foo::Message {
     uint32_t a;
     uint32_t b;
   };

   // This will work and create a semantically invalid message.
   Foo::StreamEncoder encoder;
   encoder.Write({.a = 32, .b = 100});

As with ``optional``, the best approach to supporting ``oneof`` would be to
hide the members of each message and provide accessors. This would avoid the
risk of incorrectly reading memory (such as the wrong ``union`` member) and
would not require the manual bookkeeping that Nanopb requires.
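
The following sketch shows one way accessors could enforce ``oneof``
exclusivity for the ``Foo`` message above. It is an illustration under assumed
names (``VariantCase``, ``variant_case``, ``set_a``, and so on) rather than a
proposed API. Since both members are ``uint32``, a single value and a tag are
enough here; a real implementation would need per-type storage.

.. code-block:: c++

   #include <cstdint>

   // Hypothetical accessor-based class for:
   //   message Foo { oneof variant { uint32 a = 1; uint32 b = 2; } }
   class Foo {
    public:
     enum class VariantCase : uint8_t { kNotSet, kA, kB };

     constexpr VariantCase variant_case() const { return case_; }

     // Setting one member replaces whichever member was set before.
     constexpr void set_a(uint32_t value) { Set(VariantCase::kA, value); }
     constexpr void set_b(uint32_t value) { Set(VariantCase::kB, value); }

     // Reading an inactive member returns its default value rather than
     // reinterpreting another member's storage.
     constexpr uint32_t a() const { return Get(VariantCase::kA); }
     constexpr uint32_t b() const { return Get(VariantCase::kB); }

    private:
     constexpr void Set(VariantCase which, uint32_t value) {
       case_ = which;
       value_ = value;
     }

     constexpr uint32_t Get(VariantCase which) const {
       return case_ == which ? value_ : 0;
     }

     VariantCase case_ = VariantCase::kNotSet;
     uint32_t value_ = 0;  // Shared storage; both members are uint32.
   };

   // Sets both members in turn; only the last one remains active.
   constexpr bool OnlyLastSetMemberIsActive() {
     Foo foo{};
     foo.set_a(32);
     foo.set_b(100);
     return foo.variant_case() == Foo::VariantCase::kB && foo.a() == 0 &&
            foo.b() == 100;
   }

With this shape, an encoder would consult ``variant_case()`` and serialize at
most one member, so the invalid message from the earlier example could not be
produced.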

--------
Proposal
--------

Short-term Plan
===============
A full rework of ``pw_protobuf`` does not seem feasible at this point in time
due to limited resourcing. As a result, the most reasonable course of action is
to tie up the loose ends of the existing code, and leave the module in a state
where it functions properly in every supported use case, with unsupported use
cases made explicit.

The key steps toward making this happen are listed below.

* Restructure the module documentation to help users select which protobuf API
  is best suited for them, and add a section explicitly detailing the
  limitations of each.

* Deprecate and hide the ``Message`` API, as it has been superseded by the
  ``Find`` APIs.

* Discourage usage of message structures in new code, while providing a
  comprehensive upfront explanation of their limitations and unsupported use
  cases, including:

  * ``oneof`` cannot be used.

  * Inlining some types of repeated fields such as submessages is not possible.
    Callbacks must be used to encode and decode them.

  * The use of ``optional`` only generates optional struct members for simple
    scalar fields. More complex optional fields must be processed through
    callbacks.

* Update the code generator to loudly fail when it encounters an unsupported
  message or field structure.

* Discourage the use of the automatic ``pw_protobuf`` RPC generator due to the
  limitations with message structures. ``nanopb`` or manually processed ``raw``
  methods should be used instead.

  Similarly, clearly document the limitations around callback-based messages in
  RPC methods, and provide examples of how to fall back to raw RPC encoding and
  decoding.

* Move all upstream usage of ``pw_protobuf`` away from message structures and
  ``pwpb_rpc`` to the lower-level direct wire APIs or, in rare cases, Nanopb.

* Rename the options files used by ``pw_protobuf``'s message structs to
  distinguish them from Nanopb options.

* Make the ``pw_protobuf`` code generator aware of the protobuf edition option
  so that message definitions using it can be compiled.

* Extend full protobuf code generation support to the Bazel and CMake builds, as
  well as the Android build integration.

* Minimize the amount of duplication in the code generator to clean up generated
  header files and attempt to reduce compiler memory usage.

* Extend the ``Find`` APIs with support for repeated fields to bring them closer
  to the ``Message`` API's utility.
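
To make the last item more concrete, the following is a minimal sketch of what
locating every occurrence of a repeated varint field in a serialized message
involves. It is written directly against the protobuf wire format and is not
the ``Find`` API: ``ReadVarint``, ``FindAllVarints``, and
``ExampleFindsAllValues`` are hypothetical names, and the deprecated group wire
types are simply treated as errors for brevity.

.. code-block:: c++

   #include <cstddef>
   #include <cstdint>

   // Reads one base-128 varint from data[pos, end). Advances pos and returns
   // true on success; returns false if the input is truncated or too long.
   constexpr bool ReadVarint(const uint8_t* data, size_t end, size_t& pos,
                             uint64_t& out) {
     out = 0;
     for (int shift = 0; shift < 64 && pos < end; shift += 7) {
       const uint8_t byte = data[pos++];
       out |= static_cast<uint64_t>(byte & 0x7f) << shift;
       if ((byte & 0x80) == 0) {
         return true;
       }
     }
     return false;
   }

   // Collects every value of the varint-typed field `field_number`, whether
   // it is encoded unpacked (wire type 0) or packed (wire type 2). The caller
   // is assumed to know the field's type from the .proto definition. Stores
   // up to `capacity` values and returns how many were stored, or -1 if the
   // message is malformed.
   constexpr int FindAllVarints(const uint8_t* data, size_t size,
                                uint32_t field_number, uint64_t* values,
                                size_t capacity) {
     size_t pos = 0;
     size_t count = 0;
     while (pos < size) {
       uint64_t key = 0;
       if (!ReadVarint(data, size, pos, key)) return -1;
       const bool match = (key >> 3) == field_number;
       switch (key & 0x7) {
         case 0: {  // Varint.
           uint64_t value = 0;
           if (!ReadVarint(data, size, pos, value)) return -1;
           if (match && count < capacity) values[count++] = value;
           break;
         }
         case 1:  // Fixed 64-bit.
           if (size - pos < 8) return -1;
           pos += 8;
           break;
         case 2: {  // Length-delimited.
           uint64_t length = 0;
           if (!ReadVarint(data, size, pos, length)) return -1;
           if (length > size - pos) return -1;
           const size_t end = pos + static_cast<size_t>(length);
           if (!match) {
             pos = end;  // Skip unrelated strings, bytes, and submessages.
             break;
           }
           while (pos < end) {  // Packed payload: back-to-back varints.
             uint64_t value = 0;
             if (!ReadVarint(data, end, pos, value)) return -1;
             if (count < capacity) values[count++] = value;
           }
           break;
         }
         case 5:  // Fixed 32-bit.
           if (size - pos < 4) return -1;
           pos += 4;
           break;
         default:  // Groups (3, 4) and unknown wire types are not handled.
           return -1;
       }
     }
     return static_cast<int>(count);
   }

   // Field 1 appears unpacked twice (1, 300), field 2 is the string "hi", and
   // field 1 then appears once more in packed form (3, 4).
   constexpr bool ExampleFindsAllValues() {
     constexpr uint8_t kMessage[] = {0x08, 0x01, 0x08, 0xac, 0x02, 0x12, 0x02,
                                     0x68, 0x69, 0x0a, 0x02, 0x03, 0x04};
     uint64_t values[8] = {};
     const int found = FindAllVarints(kMessage, sizeof(kMessage), 1, values, 8);
     return found == 4 && values[0] == 1 && values[1] == 300 &&
            values[2] == 3 && values[3] == 4;
   }

Handling both encodings matters because parsers are expected to accept a
repeated scalar field in either packed or unpacked form.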

Long-term Plan
==============
This section lays out a long-term design vision for ``pw_protobuf``. There is no
estimated timeframe on when this work will happen, but the ideas are collected
here for future reference.

Replace message structures
--------------------------
As discussed above, most issues with message structures stem from having members
exposed directly. Obscuring the internal details of messages and providing
public accessor APIs gives the flexibility to fix the existing problems without
running up against language limitations or exposing additional complexity to
users.

By doing so, the internal representation of a message is no longer directly tied
to C++'s type system. Instead of defining typed members for each field in a
message, the entire message structure could consist of an intermediate binary
representation, with fields located at known offsets alongside necessary
metadata. This avoids the declaration and aliasing issues, as types are now only
required to be defined at access rather than storage.
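
One possible shape for such a representation is sketched below. This is a
speculative illustration, not a design: the ``Event`` class, its two fields,
the fixed offsets, and the ``FieldsRoundTrip`` helper are all assumptions made
for this example. The storage is an untyped byte buffer, and field types only
appear in the accessor signatures.

.. code-block:: c++

   #include <cstddef>
   #include <cstdint>

   // Hypothetical message whose fields live in an untyped byte buffer. The
   // offsets stand in for metadata that a code generator could emit.
   class Event {
    public:
     constexpr uint32_t id() const {
       return static_cast<uint32_t>(Load(kIdOffset, sizeof(uint32_t)));
     }
     constexpr void set_id(uint32_t value) {
       Store(kIdOffset, sizeof(uint32_t), value);
     }

     constexpr uint64_t timestamp() const {
       return Load(kTimestampOffset, sizeof(uint64_t));
     }
     constexpr void set_timestamp(uint64_t value) {
       Store(kTimestampOffset, sizeof(uint64_t), value);
     }

    private:
     static constexpr size_t kIdOffset = 0;
     static constexpr size_t kTimestampOffset = 4;
     static constexpr size_t kStorageSize = 12;

     // Assembles `width` bytes at `offset`, least significant byte first.
     constexpr uint64_t Load(size_t offset, size_t width) const {
       uint64_t bits = 0;
       for (size_t i = 0; i < width; ++i) {
         bits |= static_cast<uint64_t>(storage_[offset + i]) << (8 * i);
       }
       return bits;
     }

     // Writes the low `width` bytes of `bits` at `offset`.
     constexpr void Store(size_t offset, size_t width, uint64_t bits) {
       for (size_t i = 0; i < width; ++i) {
         storage_[offset + i] = static_cast<uint8_t>(bits >> (8 * i));
       }
     }

     uint8_t storage_[kStorageSize] = {};
   };

   // Writes both fields and reads them back through the typed accessors.
   constexpr bool FieldsRoundTrip() {
     Event event{};
     event.set_id(0x12345678u);
     event.set_timestamp(0x0102030405060708u);
     return event.id() == 0x12345678u &&
            event.timestamp() == 0x0102030405060708u;
   }

Because the class holds no typed members, a field referring to another message
would not require that message's type to be complete where the storage is
declared.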

This would require a complete rewrite that would be incompatible with the
current APIs. The least invasive way to handle it would be to create an entirely
new code generator, port over the core lower-level generator functionality, and
build the new messages on top of it. The old API would then be fully deprecated,
and users could migrate over one message at a time, with assistance from the
Pigweed team for internal customers.

Investigate standardization of wire format operations
-----------------------------------------------------
``pw_protobuf`` is one of many libraries, both within Google and externally,
that re-implements protobuf wire format processing. At the time it was written,
this made sense, as there was no convenient option that fit the niche that
``pw_protobuf`` targeted. However, since then, the core protobuf team has
heavily invested in the development of `upb <https://github.com/protocolbuffers/upb>`_,
a compact, low-level protobuf backend intended to be wrapped by higher-level
libraries in various languages. Many of upb's core design goals align with the
initial vision for ``pw_protobuf``, making it worthwhile to coordinate with its
developers to see whether it may be suitable for use in Pigweed.

Preliminary investigations into upb have shown that, while small in size, it is
still larger than the core of ``pw_protobuf`` as it is a complete protobuf
library supporting the entire protobuf specification. Not all of that is
required for Pigweed or its customers, so any potential reuse would likely be
contingent on the ability to selectively remove unnecessary parts.

At the time of writing, upb does not have a stable API or ABI. While this is
okay for first-party consumers, shipping it as part of Pigweed may present
additional maintenance issues.

Nonetheless, synchronizing with upb to share learnings and potentially reduce
duplicated effort should be an essential step in any future ``pw_protobuf``
work.
