# Design: Alternate Enum Field Cases

This document is provided for historical interest. This feature is now
implemented in the form of the `[enum_case]` attribute on `enum` values, which
can also be `$default`ed on module, struct, bits, and enum definitions.

## Motivation

Currently, the Emboss compiler requires that enum fields are `SHOUTY_CASE`, but
this is discouraged in some code styles, such as
[Google's C++ style guide][google-cpp-style], which prefers
`kPrefixedCamelCase`. This design considers options for allowing other cases in
enums and how each option could be designed.

### Open Issues

This design document is related to the following open GitHub issue:

* [#59][issue-59]

## Design

This design will focus on the implementation for the C++ backend, as that is the
only currently-supported backend in Emboss. However, this approach should be
valid for other backends if and when they are supported, and new backends that
support this or similar functionality are encouraged to use the same or a
similar design.

### The `enum_case` Attribute

An attribute will be added to the C++ backend: `enum_case`. It applies to all
enum fields and specifies which case to use for enum members. More than one
case can be specified, in which case the backend will emit one enum member name
per case, all with the same value. Initially this will support two cases:

  * `SHOUTY_CASE` - (default) All-capital case with words separated by an
    underscore
  * `kCamelCase` - Capitalized camel case prefixed with "k"

The options will be provided as a string to the attribute as comma-separated
values. At least one value must be present. More options can be supported in the
future, and the implementation in the C++ backend will be written so that new
case options shouldn't require much more than adding an identifier and a
translation function.
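
The handling of the attribute string described above can be sketched as
follows. This is only an illustration under stated assumptions: `Trim` and
`ParseEnumCases` are hypothetical names that do not come from the Emboss
sources, and rejecting unknown case names with an exception is an assumed
policy, not something this design specifies.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: trims ASCII spaces and tabs from both ends of `s`.
std::string Trim(const std::string& s) {
  const auto first = s.find_first_not_of(" \t");
  if (first == std::string::npos) return "";
  const auto last = s.find_last_not_of(" \t");
  return s.substr(first, last - first + 1);
}

// Hypothetical helper: splits an `enum_case` value such as
// "SHOUTY_CASE, kCamelCase" into case names, preserving the listed order.
// Throws std::invalid_argument on an empty or unsupported entry (assumed
// policy), which also rejects an empty attribute string.
std::vector<std::string> ParseEnumCases(const std::string& value) {
  std::vector<std::string> cases;
  std::string::size_type start = 0;
  while (true) {
    const auto comma = value.find(',', start);
    const auto length =
        comma == std::string::npos ? std::string::npos : comma - start;
    const std::string name = Trim(value.substr(start, length));
    if (name != "SHOUTY_CASE" && name != "kCamelCase") {
      throw std::invalid_argument("unsupported enum_case: '" + name + "'");
    }
    cases.push_back(name);
    if (comma == std::string::npos) break;
    start = comma + 1;
  }
  // At least one value must be present; the loop guarantees this.
  assert(!cases.empty());
  return cases;
}
```

Supporting an additional case in this sketch would mean accepting one more
identifier here and adding a matching translation function.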

Translations will always be *from* `SHOUTY_CASE` since that is the requirement
in an Emboss definition file. For `kCamelCase`, the words will be split on the
underscore, the first letter of each word will remain capitalized, and all
following letters of each word will be lowercased. The result is then prefixed
with "k".

### Transitioning From `SHOUTY_CASE` To `kCamelCase`

The intended purpose of allowing multiple `enum_case` options to be specified is
to enable transitioning between two cases in the event that the Emboss
definition and the client code that uses the definitions cannot be updated
atomically.

When more than one option is present, the backend will emit a definition that
includes all specified name-value pairs. The names will be emitted in the order
specified, so a reverse name lookup from an enum value will return the first
case provided. Thus adding an additional case (by appending to the end of the
comma-separated list) should be fully backwards-compatible.

Removing a case will always be backwards-incompatible, so care should be taken
to migrate client code to the new case before removing an old case.

### Examples

The examples below modify an existing Emboss definition:

```
enum Foo:
  BAR = 1
  BAZ = 2
  MULTI_WORD_ENUM = 4
```

#### Use `kCamelCase` Instead

To allow C++ code to use `kBar`, `kBaz`, or `kMultiWordEnum` to refer to the
enum members instead of `BAR`, `BAZ`, or `MULTI_WORD_ENUM`, the `enum_case`
attribute can be added to each field:

```
enum Foo:
  BAR = 1              [(cpp) enum_case: "kCamelCase"]
  BAZ = 2              [(cpp) enum_case: "kCamelCase"]
  MULTI_WORD_ENUM = 4  [(cpp) enum_case: "kCamelCase"]
```

This would emit code similar to:

```c++
enum class Foo: uint64_t {
  kBar = 1,
  kBaz = 2,
  kMultiWordEnum = 4,
};
```

Note that as written, this would *not* allow C++ code to refer to `Foo::BAR`,
`Foo::BAZ`, or `Foo::MULTI_WORD_ENUM`.
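
The `SHOUTY_CASE` to `kCamelCase` translation and the ordered emission of names
described in the Design section can be sketched as below. The function names
`ShoutyToKCamel` and `EmittedNames` are hypothetical and chosen only for this
illustration; the actual backend may structure this differently.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: converts a SHOUTY_CASE name to kCamelCase by splitting
// on underscores, keeping the first character of each word unchanged, and
// lowercasing the remaining characters of each word.
std::string ShoutyToKCamel(const std::string& shouty) {
  assert(!shouty.empty());
  std::string result = "k";
  bool at_word_start = true;
  for (const char c : shouty) {
    if (c == '_') {
      at_word_start = true;  // The underscore itself is dropped.
    } else if (at_word_start) {
      result += c;
      at_word_start = false;
    } else {
      result += static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
  }
  return result;
}

// Hypothetical helper: returns the names to emit for one enum value, in the
// order the cases were listed. The first entry is the name that a reverse
// lookup from value to name would report.
std::vector<std::string> EmittedNames(const std::string& shouty,
                                      const std::vector<std::string>& cases) {
  std::vector<std::string> names;
  for (const std::string& enum_case : cases) {
    names.push_back(enum_case == "kCamelCase" ? ShoutyToKCamel(shouty)
                                              : shouty);
  }
  return names;
}
```

Under these assumptions, `EmittedNames("BAR", {"SHOUTY_CASE", "kCamelCase"})`
yields `BAR` followed by `kBar`, so appending a case to the end of the list
leaves the first (reverse-lookup) name unchanged.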

#### Default `enum_case`

Additionally, the same code would be emitted with either of the following:

```
enum Foo:
  [$default (cpp) enum_case: "kCamelCase"]
  BAR = 1
  BAZ = 2
  MULTI_WORD_ENUM = 4
```

or

```
[$default (cpp) enum_case: "kCamelCase"]

...

enum Foo:
  BAR = 1
  BAZ = 2
  MULTI_WORD_ENUM = 4
```

The difference is that the former applies the `enum_case` attribute to any new
fields of `Foo` by default, while the latter applies it to all enum fields in
the Emboss definition file by default.

#### Transitioning To `kCamelCase`

In the case that `Foo` should use `kCamelCase` but it is used in code that must
be updated separately from the `.emb` file and backwards-compatibility must be
maintained, the `enum_case` attribute will need multiple options specified. For
instance:

```
enum Foo:
  [$default (cpp) enum_case: "SHOUTY_CASE, kCamelCase"]
  BAR = 1
  BAZ = 2
  MULTI_WORD_ENUM = 4
```

would emit code similar to:

```cpp
enum class Foo: uint64_t {
  BAR = 1,
  kBar = 1,
  BAZ = 2,
  kBaz = 2,
  MULTI_WORD_ENUM = 4,
  kMultiWordEnum = 4,
};
```

Note that using `enum_case: "kCamelCase, SHOUTY_CASE"` would technically be
backwards-incompatible, as that would change the result of code like
`TryToGetNameFromEnum(Foo::BAR)` from `"BAR"` to `"kBar"`. If there are no
usages of that functionality, it would be backwards-compatible as well.

Once all usages of `Foo` have been migrated to `kCamelCase`, and there is no
client code that uses `SHOUTY_CASE` or relies on the reverse lookup
functionality mentioned above, then the `SHOUTY_CASE` could be removed. The
usual caveats of backwards-incompatible changes apply.

## Alternatives Considered

In the development of this design, some other alternative designs were
considered.
A short explanation of each is provided below.

### Loosen Enum Name Requirements

The "obvious" approach to allow names like `kCamelCase` is to simply loosen the
requirement that an enum field name must be `SHOUTY_CASE`.

#### Pros

  * Flexible and straightforward for users

#### Cons

  * Adds complexity to the grammar and front-end.
    * Not as simple of an implementation as it first appears.
  * Allows Emboss definition files to diverge from each other, which goes
    against the design goals of Emboss where all .emb files should look similar
    to each other.
    * Additionally it adds cognitive overhead in reading an unfamiliar Emboss
      definition in a different "style".
  * Backend/language considerations.
    * A style used in C++ (`kCamelCase`) would also be used in languages where
      that is not the style.
    * Setting the name for all languages could cause issues in languages where
      the case of a variable has semantic meaning, like the visibility of a
      variable in `Go`.

### Specifying An Exact Name In Attributes

Instead of specifying a case transformation in an attribute, provide the
specific name to be emitted. For example:

```
enum Foo:
  BAR = 1              [(cpp) name: "kBar"]
  BAZ = 2              [(cpp) name: "kBaz"]
  MULTI_WORD_ENUM = 4  [(cpp) name: "kMultiWordEnum"]
```

Note that the proposed `enum_case` design does not preclude an attribute of this
nature for resolving other use-cases. Under the principle of "specific overrides
general", a `name`-like attribute could override any `enum_case` attribute. See
the [future work](#future-work) section below for planned work on this.

#### Pros

  * Simple to implement
  * Applies to more than just `enum` fields.
  * Applies to other use cases (working around restrictions/reserved keywords in
    backends that are not also restricted/reserved in Emboss).

#### Cons

  * Not possible to provide a `$default` attribute that applies generically to
    all enum fields.
    * This would require an attribute added to every enum member if the intent
      is to always use a particular style.
  * Requires a user to specify the translation for every field, making it
    easier to mix cases or styles unintentionally.
    * If mixing cases is intended, this is still possible with the `enum_case`
      attribute by overriding the default.

### Transitional Cases or Attributes

This alternative design would still use `enum_case` or something similar, but
not allow multiple case options to be asserted. Instead, either a new
transition-specific case or a transitional attribute would be used to mark a
transition in progress. For example:

```
enum Foo:
  BAR = 1              [(cpp) enum_case: "kCamelCase-transitional"]
  BAZ = 2              [(cpp) enum_case: "kCamelCase-transitional"]
  MULTI_WORD_ENUM = 4  [(cpp) enum_case: "kCamelCase-transitional"]
```

or

```
enum Foo:
  BAR = 1
    [(cpp) enum_case: "kCamelCase"]
    [(cpp) enum_case_transitional: true]
  BAZ = 2
    [(cpp) enum_case: "kCamelCase"]
    [(cpp) enum_case_transitional: true]
  MULTI_WORD_ENUM = 4
    [(cpp) enum_case: "kCamelCase"]
    [(cpp) enum_case_transitional: true]
```

These would emit both `SHOUTY_CASE` and `kCamelCase` forms for each value.

#### Pros

  * Explicitly marks a transition in progress, and the reason for having
    multiple aliasing names to the same enumerated value.
  * Allows codegen to include `[[deprecated]]` attributes in the generated code
    so that build time warnings/errors are produced when building client code.
    * However, this could be supported by tagging the cases as transitional; see
      the [future work](#future-work) section for planned work on this.
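
To make the `[[deprecated]]` point above concrete, the following is a sketch of
what generated code *could* look like if the old names were marked deprecated.
It is an assumption for illustration only, not output of the current backend,
and it relies on C++17, which allows attributes on individual enumerators. The
deprecation messages are invented for this example.

```cpp
#include <cstdint>

// Hypothetical generated code: each SHOUTY_CASE name is kept as an alias of
// the kCamelCase name but marked deprecated, so client code that still uses
// the old name gets a build-time warning.
enum class Foo : std::uint64_t {
  BAR [[deprecated("use Foo::kBar")]] = 1,
  kBar = 1,
  BAZ [[deprecated("use Foo::kBaz")]] = 2,
  kBaz = 2,
  MULTI_WORD_ENUM [[deprecated("use Foo::kMultiWordEnum")]] = 4,
  kMultiWordEnum = 4,
};
```

Client code referring to `Foo::kBar` compiles without diagnostics, while a
reference to `Foo::BAR` would produce a deprecation warning, or an error if
warnings are treated as errors.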

#### Cons

  * Requires migrating twice to transition between two non-`SHOUTY_CASE` cases
    (old -> shouty -> new)
  * Requires two separate attributes or a suffix to the case name, which can
    cause readability issues
  * Doesn't allow supporting more than 2 cases if needed, and requires that one
    case be `SHOUTY_CASE`.

## Implementation

### Front End

Now that the attribute checking is separate for the front end and backend
([#80][pr-80]), only a small change (to both the grammar and the IR) is required
to support attributes on enum values. Specifically:

#### Grammar

Change the existing grammar

```
enum-value      -> constant-name "=" expression doc?
                   Comment? eol enum-value-body?
enum-value-body -> Indent doc-line* Dedent
```

to

```
enum-value      -> constant-name "=" expression doc?
                   attribute* Comment? eol
                   enum-value-body?
enum-value-body -> Indent doc-line* attribute-line*
                   Dedent
```

#### Intermediate Representation

The only IR change needed to support this design is the addition of a
`Repeated(Attribute)` member field to `EnumValue`.

### Back End

The C++ backend can likely retain the same templates for codegen. This design
should only require a change in codegen to read the attribute on an enum
name-value pair and translate the name (potentially multiple times for multiple
specified cases).

## Future Work

### The `name` attribute

Cases may cause name collisions which are not present in `SHOUTY_CASE`, so there
should be some means to override the generated name. For instance, consider:

```
enum Port:
  # Names taken from manufacturer's programming manual.
  USB    = 128   -- USB port, virtual port 0    # kUsb
  USB_1  = 129   -- USB port, virtual port 1    # kUsb1
  USB1   = 1440  -- USB port 1, virtual port 0  # kUsb1 -- collision
  USB1_1 = 1441  -- USB port 1, virtual port 1  # kUsb11
```

Additionally, there are other use-cases for setting an alternate name to the one
used in the Emboss definition. Thus, an attribute should be provided that can
override all naming, including the default name setting in Emboss and any
`enum_case` attributes. For instance:

```
enum Port:
  # Names taken from manufacturer's programming manual.
  USB    = 128   -- USB port, virtual port 0
    [(cpp) name: "kUsb"]
  USB_1  = 129   -- USB port, virtual port 1
    [(cpp) name: "kUsb_1"]
  USB1   = 1440  -- USB port 1, virtual port 0
    [(cpp) name: "kUsb1"]
  USB1_1 = 1441  -- USB port 1, virtual port 1
    [(cpp) name: "kUsb1_1"]
```

This would not emit names like `kUsb11` even if a `$default` case was set to
`kCamelCase` because the `name` attribute would always override other naming
settings. Similar to `enum_case`, multiple names could be provided in a
comma-separated list.

This will be completed in future work, the specifics of which may be updated
here or in a separate design. However, the implementation of `enum_case` should
be made to allow `name` or a similar attribute to be added without major
refactoring.

### Deprecated Cases/Names

When transitioning between cases or alternate names, it would be useful to mark
the old field as `[[deprecated]]` in the C++ source, so that client code that
uses the generated Emboss code will produce build-time warnings or errors and
alert maintainers that there will be an upcoming breaking change that could
break the client code's build.

One way to do this would be to allow tagging a name or case as deprecated in
the attribute string itself.
For instance:

```
enum Foo:
  BAR = 1
    [(cpp) enum_case: "SHOUTY_CASE -deprecated, kCamelCase"]
  BAZ = 2
    [(cpp) enum_case: "SHOUTY_CASE -deprecated, kCamelCase"]
  MULTI_WORD_ENUM = 4
    [(cpp) enum_case: "SHOUTY_CASE -deprecated, kCamelCase"]
```

This would follow the normal `$default` rules as it would be the same as any
other attribute value, so for instance, to set `SHOUTY_CASE` to be deprecated in
favor of `kCamelCase` for all members of the enum:

```
enum Foo:
  [$default (cpp) enum_case: "SHOUTY_CASE -deprecated, kCamelCase"]
  BAR = 1
  BAZ = 2
  MULTI_WORD_ENUM = 4
```

and to set it for all enums in the module:

```
[$default (cpp) enum_case: "SHOUTY_CASE -deprecated, kCamelCase"]

...

enum Foo:
  BAR = 1
  BAZ = 2
  MULTI_WORD_ENUM = 4
```

This will be completed in future work, the specifics of which may be updated
here or in a separate design. However, the implementation of `enum_case` should
be made to allow adding `-deprecated` or a similar approach without major
refactoring.

[google-cpp-style]: https://google.github.io/styleguide/cppguide.html#Enumerator_Names
[issue-59]: https://github.com/google/emboss/issues/59
[pr-80]: https://github.com/google/emboss/pull/80