# Design: Alternate Enum Field Cases

This document is provided for historical interest.  This feature is now
implemented in the form of the `[enum_case]` attribute on `enum` values, which
can also be `$default`ed on module, struct, bits, and enum definitions.

## Motivation

Currently, the Emboss compiler requires that enum field names be `SHOUTY_CASE`,
but this is discouraged in some code styles, such as
[Google's C++ style guide][google-cpp-style], which prefers
`kPrefixedCamelCase`. This design considers options for allowing other cases in
enum names and how each option could be designed.

### Open Issues

This design document is related to the following open GitHub issue:

* [#59][issue-59]

## Design

This design focuses on the implementation for the C++ backend, as that is the
only currently-supported backend in Emboss. However, this approach should be
valid for other backends if and when they are supported, and new backends that
support this or similar functionality are encouraged to use the same or a
similar design.

### The `enum_case` Attribute

An attribute will be added to the C++ backend: `enum_case`. It applies to enum
fields and specifies which case to use for enum members. More than one case can
be specified, in which case the backend will emit an enum member name for each
case, all with the same value. Initially this will support two cases:

  * `SHOUTY_CASE` - (default) All-capital case with words separated by an underscore
  * `kCamelCase` - Capitalized camel case prefixed with "k"

The options will be provided as a string to the attribute as comma-separated
values. At least one value must be present. More options can be supported in the
future, and the implementation in the C++ backend will be written so that new
case options shouldn't require much more than adding an identifier and a translation
function.
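
The option handling described above could look roughly like the following
Python sketch. The names `SUPPORTED_CASES` and `parse_enum_case` are
hypothetical, chosen for illustration, and are not taken from the Emboss source.

```python
from typing import List

# The two initial cases named in this design; the tuple itself is hypothetical.
SUPPORTED_CASES = ("SHOUTY_CASE", "kCamelCase")


def parse_enum_case(value: str) -> List[str]:
    """Splits an enum_case attribute string into an ordered list of cases."""
    cases = [part.strip() for part in value.split(",")]
    # At least one value must be present.
    if not any(cases):
        raise ValueError("enum_case requires at least one case")
    for case in cases:
        if case not in SUPPORTED_CASES:
            raise ValueError(f"unsupported enum_case: {case!r}")
    return cases
```

Under this sketch, supporting a new case would mean adding its identifier to
`SUPPORTED_CASES` and supplying a translation function for it.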

Translations will always be *from* `SHOUTY_CASE` since that is the requirement
in an Emboss definition file. For `kCamelCase`, the words will be split on the
underscore, the first letter of each word will remain capitalized, and all
following letters of each word will be lowercased; the result is then prefixed
with "k".
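
The translation rule above is small enough to sketch directly. The function
name `shouty_to_k_camel` is an assumption for illustration, and the sketch
assumes its input has already been validated as `SHOUTY_CASE`.

```python
def shouty_to_k_camel(name: str) -> str:
    """Translates a SHOUTY_CASE name to kCamelCase."""
    words = name.split("_")
    # Keep the first letter of each word as-is and lowercase the rest.
    camel = "".join(word[:1] + word[1:].lower() for word in words)
    return "k" + camel
```

For example, `shouty_to_k_camel("MULTI_WORD_ENUM")` gives `kMultiWordEnum`.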

### Transitioning From `SHOUTY_CASE` To `kCamelCase`

The intended purpose of allowing multiple `enum_case` options to be specified is
to enable transitioning between two cases in the event that the Emboss
definition and the client code that uses the definitions cannot be updated
atomically.

When more than one option is present the backend will emit a definition that
includes all specified name-value pairs. The names will be emitted in the order
specified, so a reverse name lookup from an enum value will return the first
case provided. Thus adding an additional case (by appending to the end of the
comma-separated list) should be fully backwards-compatible.
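
To make the ordering rule concrete, the following sketch models the emitted
name-value pairs as an ordered list and performs the reverse lookup by returning
the first match. All names here (`TRANSLATORS`, `emitted_pairs`,
`name_from_value`) are hypothetical, and the generated C++ lookup may be
structured differently.

```python
from typing import Dict, List, Optional, Tuple


def k_camel(name: str) -> str:
    """Translates SHOUTY_CASE to kCamelCase, following the rule in this design."""
    return "k" + "".join(w[:1] + w[1:].lower() for w in name.split("_"))


# Hypothetical registry: case identifier -> translation from SHOUTY_CASE.
TRANSLATORS = {"SHOUTY_CASE": lambda name: name, "kCamelCase": k_camel}


def emitted_pairs(values: Dict[str, int], cases: List[str]) -> List[Tuple[str, int]]:
    """Lists (name, value) pairs, emitting each value's names in `cases` order."""
    return [
        (TRANSLATORS[case](name), number)
        for name, number in values.items()
        for case in cases
    ]


def name_from_value(pairs: List[Tuple[str, int]], number: int) -> Optional[str]:
    """Reverse lookup: the first name emitted for a value wins."""
    for name, candidate in pairs:
        if candidate == number:
            return name
    return None
```

With the cases `["SHOUTY_CASE", "kCamelCase"]`, the lookup for value 1 yields
`BAR`. Reversing the order yields `kBar`, which is why new cases should be
appended to the end of the list.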

Removing a case will always be backwards-incompatible, so care should be taken
to migrate client code to the new case before removing an old case.

### Examples

The examples below modify an existing Emboss definition:

```
enum Foo:
  BAR             = 1
  BAZ             = 2
  MULTI_WORD_ENUM = 4
```

#### Use `kCamelCase` Instead

To allow C++ code to use `kBar`, `kBaz`, or `kMultiWordEnum` to refer to the
enum members instead of `BAR`, `BAZ`, or `MULTI_WORD_ENUM`, the `enum_case`
attribute can be added to each field:

```
enum Foo:
  BAR             = 1  [(cpp) enum_case: "kCamelCase"]
  BAZ             = 2  [(cpp) enum_case: "kCamelCase"]
  MULTI_WORD_ENUM = 4  [(cpp) enum_case: "kCamelCase"]
```

This would emit code similar to:

```c++
enum class Foo: uint64_t {
  kBar = 1,
  kBaz = 2,
  kMultiWordEnum = 4,
};
```

Note that as written, this would *not* allow C++ code to refer to `Foo::BAR`,
`Foo::BAZ`, or `Foo::MULTI_WORD_ENUM`.

#### Default `enum_case`

Additionally, the same code would be emitted with either of the following:

```
enum Foo:
  [$default (cpp) enum_case: "kCamelCase"]
  BAR             = 1
  BAZ             = 2
  MULTI_WORD_ENUM = 4
```

or

```
[$default (cpp) enum_case: "kCamelCase"]

...

enum Foo:
  BAR             = 1
  BAZ             = 2
  MULTI_WORD_ENUM = 4
```

The difference is that the former applies the `enum_case` attribute to any new
fields of `Foo` by default, while the latter applies it to all enum fields in
the Emboss definition file by default.
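
One way to picture how `$default` interacts with a per-value attribute is as a
search from the most specific scope outward. The sketch below assumes that the
innermost setting wins and that `SHOUTY_CASE` applies when nothing is set;
`resolve_enum_case` is a hypothetical helper, not part of Emboss.

```python
from typing import Optional, Sequence


def resolve_enum_case(scopes: Sequence[Optional[str]]) -> str:
    """Picks the enum_case setting for one enum value.

    `scopes` holds the setting found at each level, ordered from innermost
    (the value itself) to outermost (the module); None means "not set here".
    """
    for setting in scopes:
        if setting is not None:
            return setting
    # Assumption: with no attribute anywhere, the default case is used.
    return "SHOUTY_CASE"
```

For instance, `resolve_enum_case([None, "kCamelCase", None])` models a value
with no attribute of its own inside an enum that sets a `$default`.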

#### Transitioning To `kCamelCase`

In the case that `Foo` should use `kCamelCase` but it is used in code that must
be updated separately from the `.emb` file and backwards-compatibility must be
maintained, the `enum_case` attribute will need multiple options specified. For
instance:

```
enum Foo:
  [$default (cpp) enum_case: "SHOUTY_CASE, kCamelCase"]
  BAR             = 1
  BAZ             = 2
  MULTI_WORD_ENUM = 4
```

would emit code similar to:

```cpp
enum class Foo: uint64_t {
  BAR = 1,
  kBar = 1,
  BAZ = 2,
  kBaz = 2,
  MULTI_WORD_ENUM = 4,
  kMultiWordEnum = 4,
};
```

Note that using `enum_case: "kCamelCase, SHOUTY_CASE"` would technically be
backwards-incompatible as that would change the result of code like
`TryToGetNameFromEnum(Foo::BAR)` from `"BAR"` to `"kBar"`, but if there are no
usages of that functionality, it would be backwards-compatible as well.

Once all usages of `Foo` have been migrated to `kCamelCase`, and there is no
client code that uses `SHOUTY_CASE` or relies on the reverse lookup
functionality mentioned above, then the `SHOUTY_CASE` option could be removed.
The usual caveats of backwards-incompatible changes apply.

## Alternatives Considered

In the development of this design, some other alternative designs were
considered. A short explanation of each is provided below.

### Loosen Enum Name Requirements

The "obvious" approach to allow names like `kCamelCase` is to simply loosen the
requirement that an enum field name must be `SHOUTY_CASE`.

#### Pros

  * Flexible and straightforward for users

#### Cons

  * Adds complexity to the grammar and front-end.
    * Not as simple an implementation as it first appears.
  * Allows Emboss definition files to diverge from each other, which goes
    against the design goals of Emboss where all .emb files should look similar
    to each other.
    * Additionally it adds cognitive overhead in reading an unfamiliar Emboss
      definition in a different "style".
  * Backend/language considerations.
    * A style used in C++ (`kCamelCase`) would also be used in languages where
      that is not the style.
    * Setting the name for all languages could cause issues in languages where
      the case of a variable has semantic meaning, like the visibility of a
      variable in `Go`.

### Specifying An Exact Name In Attributes

Instead of specifying a case transformation in an attribute, provide the
specific name to be emitted. For example:

```
enum Foo:
  BAR             = 1  [(cpp) name: "kBar"]
  BAZ             = 2  [(cpp) name: "kBaz"]
  MULTI_WORD_ENUM = 4  [(cpp) name: "kMultiWordEnum"]
```

Note that the proposed `enum_case` design does not preclude an attribute of this
nature for resolving other use-cases. Under the principle of "specific overrides
general" a `name`-like attribute could override any `enum_case` attribute. See
the [future work](#future-work) section below for planned work on this.

#### Pros

  * Simple to implement
  * Applies to more than just `enum` fields.
  * Applies to other use cases (working around restrictions/reserved keywords in
    backends that are not also restricted/reserved in Emboss).

#### Cons

  * Not possible to provide a `$default` attribute that applies generically to
    all enum fields.
    * This would require an attribute added to every enum member if the intent
      is to always use a particular style.
    * Requires a user to specify the translation for every field, making it
      easier to mix cases or styles unintentionally.
      * If mixing cases is intended, this is still possible with the `enum_case`
        attribute by overriding the default.

### Transitional Cases or Attributes

This alternative design would still use `enum_case` or something similar, but
not allow multiple case options to be asserted. Instead, either a new
transition-specific case or a transitional attribute would be used to mark a
transition in progress. For example:

```
enum Foo:
  BAR             = 1  [(cpp) enum_case: "kCamelCase-transitional"]
  BAZ             = 2  [(cpp) enum_case: "kCamelCase-transitional"]
  MULTI_WORD_ENUM = 4  [(cpp) enum_case: "kCamelCase-transitional"]
```

or

```
enum Foo:
  BAR             = 1
    [(cpp) enum_case: "kCamelCase"]
    [(cpp) enum_case_transitional: true]
  BAZ             = 2
    [(cpp) enum_case: "kCamelCase"]
    [(cpp) enum_case_transitional: true]
  MULTI_WORD_ENUM = 4
    [(cpp) enum_case: "kCamelCase"]
    [(cpp) enum_case_transitional: true]
```

These would emit both `SHOUTY_CASE` and `kCamelCase` forms for each value.

#### Pros

  * Explicitly marks a transition in progress, and the reason for having
    multiple aliasing names to the same enumerated value.
  * Allows codegen to include `[[deprecated]]` attributes in the generated code
    so that build time warnings/errors are produced when building client code.
    * However, this could be supported by tagging the cases as transitional, see
      the [future work](#future-work) section for planned work on this.

#### Cons

  * Requires migrating twice to transition between two non-`SHOUTY_CASE` cases
    (old -> shouty -> new)
  * Requires two separate attributes or a suffix to the case name, which can
    cause readability issues
  * Doesn't allow supporting more than 2 cases if needed, and requires that one
    case be `SHOUTY_CASE`.

## Implementation

### Front End

Now that the attribute checking is separate for the front end and backend
([#80][pr-80]), only a small change (to both the grammar and the IR) is required
to support attributes on enum values. Specifically:

#### Grammar

Change the existing grammar

```
enum-value                             -> constant-name "=" expression doc?
                                          Comment? eol enum-value-body?
enum-value-body                        -> Indent doc-line* Dedent
```

to

```
enum-value                             -> constant-name "=" expression doc?
                                          attribute* Comment? eol
                                          enum-value-body?
enum-value-body                        -> Indent doc-line* attribute-line*
                                          Dedent
```

#### Intermediate Representation

The only IR change required to support this design is the addition of a
`Repeated(Attribute)` member field to `EnumValue`.

### Back End

The C++ backend can likely retain the same templates for codegen. This design
should only require a change in codegen to read the attribute on each enum
name-value pair and translate the name (potentially multiple times for multiple
specified cases).

## Future Work

### The `name` attribute

Cases may cause name collisions which are not present in `SHOUTY_CASE`, so there
should be some means to override the generated name. For instance, consider:

```
enum Port:
  # Names taken from manufacturer's programming manual.
  USB    = 128   -- USB port, virtual port 0    # kUsb
  USB_1  = 129   -- USB port, virtual port 1    # kUsb1
  USB1   = 1440  -- USB port 1, virtual port 0  # kUsb1 -- collision
  USB1_1 = 1441  -- USB port 1, virtual port 1  # kUsb11
```
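
A backend could detect such collisions by grouping the original names by their
translated form. The sketch below assumes the `kCamelCase` translation rule
described earlier in this document; `find_collisions` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def k_camel(name: str) -> str:
    """Translates SHOUTY_CASE to kCamelCase, following the rule in this design."""
    return "k" + "".join(w[:1] + w[1:].lower() for w in name.split("_"))


def find_collisions(names: Iterable[str]) -> Dict[str, List[str]]:
    """Maps each translated name to the original names that collide on it."""
    sources = defaultdict(list)
    for name in names:
        sources[k_camel(name)].append(name)
    # Keep only translated names produced by more than one original name.
    return {new: olds for new, olds in sources.items() if len(olds) > 1}
```

Applied to the `Port` names above, this reports that `USB_1` and `USB1` both
translate to `kUsb1`.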

Additionally, there are other use-cases for setting an alternate name to the one
used in the Emboss definition. Thus, an attribute should be provided that can
override all naming, including the default name setting in Emboss and any
`enum_case` attributes. For instance:

```
enum Port:
  # Names taken from manufacturer's programming manual.
  USB    = 128   -- USB port, virtual port 0
    [(cpp) name: "kUsb"]
  USB_1  = 129   -- USB port, virtual port 1
    [(cpp) name: "kUsb_1"]
  USB1   = 1440  -- USB port 1, virtual port 0
    [(cpp) name: "kUsb1"]
  USB1_1 = 1441  -- USB port 1, virtual port 1
    [(cpp) name: "kUsb1_1"]
```

This would not emit names like `kUsb11` even if a `$default` case was set to
`kCamelCase` because the `name` attribute would always override other naming
settings. Similar to `enum_case`, multiple names could be provided in a
comma-separated list.
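
The precedence described here, where `name` overrides any `enum_case` setting,
might be sketched as follows. The `name` attribute is not yet designed, so the
function `emitted_names` and its comma-separated handling are assumptions based
only on the description above.

```python
from typing import List, Optional


def k_camel(name: str) -> str:
    """Translates SHOUTY_CASE to kCamelCase, following the rule in this design."""
    return "k" + "".join(w[:1] + w[1:].lower() for w in name.split("_"))


# Hypothetical registry: case identifier -> translation from SHOUTY_CASE.
TRANSLATORS = {"SHOUTY_CASE": lambda name: name, "kCamelCase": k_camel}


def emitted_names(
    name: str, cases: List[str], name_override: Optional[str] = None
) -> List[str]:
    """Returns the names to emit; an explicit `name` attribute wins."""
    if name_override is not None:
        # "Specific overrides general": ignore enum_case entirely.
        return [part.strip() for part in name_override.split(",")]
    return [TRANSLATORS[case](name) for case in cases]
```

With a `kCamelCase` default, `USB_1` would emit `kUsb1` on its own, but
`kUsb_1` once the override is supplied.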

This will be completed in future work, the specifics of which may be updated
here or in a separate design. However, the implementation of `enum_case` should
be made to allow `name` or a similar attribute to be added without major
refactoring.

### Deprecated Cases/Names

When transitioning between cases or alternate names, it would be useful to mark
the old field as `[[deprecated]]` in the C++ source, so that client code that
uses the generated Emboss code will produce build-time warnings or errors and
alert maintainers that there will be an upcoming breaking change that could
break the client code's build.

One way to do this would be to allow tagging a name or case as deprecated in
the attribute string itself. For instance:

```
enum Foo:
  BAR = 1
    [(cpp) enum_case: "SHOUTY_CASE -deprecated, kCamelCase"]
  BAZ = 2
    [(cpp) enum_case: "SHOUTY_CASE -deprecated, kCamelCase"]
  MULTI_WORD_ENUM = 4
    [(cpp) enum_case: "SHOUTY_CASE -deprecated, kCamelCase"]
```

This would follow the normal `$default` rules as it would be the same as any
other attribute value, so for instance, to set `SHOUTY_CASE` to be deprecated in
favor of `kCamelCase` for all members of the enum:

```
enum Foo:
  [$default (cpp) enum_case: "SHOUTY_CASE -deprecated, kCamelCase"]
  BAR = 1
  BAZ = 2
  MULTI_WORD_ENUM = 4
```

and to set it for all enums in the module:

```
[$default (cpp) enum_case: "SHOUTY_CASE -deprecated, kCamelCase"]

...

enum Foo:
  BAR = 1
  BAZ = 2
  MULTI_WORD_ENUM = 4
```

This will be completed in future work, the specifics of which may be updated
here or in a separate design. However, the implementation of `enum_case` should
be made to allow adding `-deprecated` or a similar approach without major
refactoring.
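
As a rough illustration of how a `-deprecated` tag might flow through codegen,
the sketch below parses each comma-separated entry and emits the C++
`[[deprecated]]` attribute on the tagged enumerators. The tag syntax is only the
one proposed above, the helper names are hypothetical, and attributes on
enumerators assume the generated code is compiled as C++17 or later.

```python
from typing import List, Tuple


def k_camel(name: str) -> str:
    """Translates SHOUTY_CASE to kCamelCase, following the rule in this design."""
    return "k" + "".join(w[:1] + w[1:].lower() for w in name.split("_"))


# Hypothetical registry: case identifier -> translation from SHOUTY_CASE.
TRANSLATORS = {"SHOUTY_CASE": lambda name: name, "kCamelCase": k_camel}


def parse_case_entry(entry: str) -> Tuple[str, bool]:
    """Splits an entry such as "SHOUTY_CASE -deprecated" into (case, deprecated)."""
    parts = entry.split()
    if not parts or parts[0] not in TRANSLATORS:
        raise ValueError(f"bad enum_case entry: {entry!r}")
    flags = parts[1:]
    if any(flag != "-deprecated" for flag in flags):
        raise ValueError(f"unknown flag in enum_case entry: {entry!r}")
    return parts[0], bool(flags)


def emit_enumerators(name: str, number: int, enum_case: str) -> List[str]:
    """Emits one C++ enumerator line per case, marking deprecated ones."""
    lines = []
    for entry in enum_case.split(","):
        case, deprecated = parse_case_entry(entry)
        marker = " [[deprecated]]" if deprecated else ""
        lines.append(f"  {TRANSLATORS[case](name)}{marker} = {number},")
    return lines
```

For `BAR = 1` with `"SHOUTY_CASE -deprecated, kCamelCase"`, this produces
`BAR [[deprecated]] = 1,` followed by `kBar = 1,`.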

[google-cpp-style]: https://google.github.io/styleguide/cppguide.html#Enumerator_Names
[issue-59]: https://github.com/google/emboss/issues/59
[pr-80]: https://github.com/google/emboss/pull/80