## Unicode Technical Standard #35 Tech Preview

# Unicode Locale Data Markup Language (LDML)<br/>Part 7: Keyboards

|Version|45           |
|-------|-------------|
|Editors|Steven Loomis (<a href="mailto:[email protected]">[email protected]</a>) and <a href="tr35.md#Acknowledgments">other CLDR committee members</a>|

For the full header, summary, and status, see [Part 1: Core](tr35.md).

### _Summary_

This document describes parts of an XML format (_vocabulary_) for the exchange of structured locale data. This format is used in the [Unicode Common Locale Data Repository](https://www.unicode.org/cldr/).

This is a partial document, describing keyboards. For the other parts of the LDML see the [main LDML document](tr35.md) and the links above.

_Note:_
Some links may lead to in-development or older
versions of the data files.
See <https://cldr.unicode.org> for up-to-date CLDR release data.

### _Status_

<!-- _This is a draft document which may be updated, replaced, or superseded by other documents at any time.
Publication does not imply endorsement by the Unicode Consortium.
This is not a stable document; it is inappropriate to cite this document as other than a work in progress._ -->

_This document has been reviewed by Unicode members and other interested parties, and has been approved for publication by the Unicode Consortium.
This is a stable document and may be used as reference material or cited as a normative reference by other specifications._

> _**A Unicode Technical Standard (UTS)** is an independent specification. Conformance to the Unicode Standard does not imply conformance to any UTS._

_Please submit corrigenda and other comments with the CLDR bug reporting form [[Bugs](tr35.md#Bugs)]. Related information that is useful in understanding this document is found in the [References](tr35.md#References). For the latest version of the Unicode Standard see [[Unicode](tr35.md#Unicode)]. For a list of current Unicode Technical Reports see [[Reports](tr35.md#Reports)]. For more information about versions of the Unicode Standard, see [[Versions](tr35.md#Versions)]._


See also [Compatibility Notice](#compatibility-notice).

## <a name="Parts" href="#Parts">Parts</a>

The LDML specification is divided into the following parts:

*   Part 1: [Core](tr35.md#Contents) (languages, locales, basic structure)
*   Part 2: [General](tr35-general.md#Contents) (display names & transforms, etc.)
*   Part 3: [Numbers](tr35-numbers.md#Contents) (number & currency formatting)
*   Part 4: [Dates](tr35-dates.md#Contents) (date, time, time zone formatting)
*   Part 5: [Collation](tr35-collation.md#Contents) (sorting, searching, grouping)
*   Part 6: [Supplemental](tr35-info.md#Contents) (supplemental data)
*   Part 7: [Keyboards](tr35-keyboards.md#Contents) (keyboard mappings)
*   Part 8: [Person Names](tr35-personNames.md#Contents) (person names)
*   Part 9: [MessageFormat](tr35-messageFormat.md#Contents) (message format)

## <a name="Contents" href="#Contents">Contents of Part 7, Keyboards</a>

* [Keyboards](#keyboards)
* [Goals and Non-goals](#goals-and-non-goals)
  * [Compatibility Notice](#compatibility-notice)
  * [Accessibility](#accessibility)
* [Definitions](#definitions)
* [Notation](#notation)
  * [Escaping](#escaping)
  * [UnicodeSet Escaping](#unicodeset-escaping)
  * [UTS18 Escaping](#uts18-escaping)
* [File and Directory Structure](#file-and-directory-structure)
  * [Extensibility](#extensibility)
* [Normalization](#normalization)
  * [Where Normalization Occurs](#where-normalization-occurs)
  * [Normalization and Transform Matching](#normalization-and-transform-matching)
  * [Normalization and Markers](#normalization-and-markers)
    * [Rationale for 'gluing' markers](#rationale-for-gluing-markers)
    * [Data Model: `Marker`](#data-model-marker)
    * [Data Model: string](#data-model-string)
    * [Data Model: `MarkerEntry`](#data-model-markerentry)
    * [Marker Algorithm Overview](#marker-algorithm-overview)
    * [Phase 1: Parsing/Removing Markers](#phase-1-parsingremoving-markers)
    * [Phase 2: Plain Text Processing](#phase-2-plain-text-processing)
    * [Phase 3: Adding Markers](#phase-3-adding-markers)
    * [Example Normalization with Markers](#example-normalization-with-markers)
  * [Normalization and Character Classes](#normalization-and-character-classes)
  * [Normalization and Reorder elements](#normalization-and-reorder-elements)
  * [Normalization-safe Segments](#normalization-safe-segments)
  * [Normalization and Output](#normalization-and-output)
  * [Disabling Normalization](#disabling-normalization)
* [Element Hierarchy](#element-hierarchy)
  * [Element: keyboard3](#element-keyboard3)
  * [Element: import](#element-import)
  * [Element: locales](#element-locales)
  * [Element: locale](#element-locale)
  * [Element: version](#element-version)
  * [Element: info](#element-info)
  * [Element: settings](#element-settings)
  * [Element: displays](#element-displays)
  * [Element: display](#element-display)
    * [Non-spacing marks on keytops](#non-spacing-marks-on-keytops)
  * [Element: displayOptions](#element-displayoptions)
  * [Element: keys](#element-keys)
  * [Element: key](#element-key)
    * [Implied Keys](#implied-keys)
  * [Element: flicks](#element-flicks)
    * [Element: flick](#element-flick)
    * [Element: flickSegment](#element-flicksegment)
  * [Element: forms](#element-forms)
  * [Element: form](#element-form)
    * [Implied Form Values](#implied-form-values)
  * [Element: scanCodes](#element-scancodes)
  * [Element: layers](#element-layers)
  * [Element: layer](#element-layer)
    * [Layer Modifier Sets](#layer-modifier-sets)
    * [Layer Modifier Components](#layer-modifier-components)
    * [Modifier Left- and Right- keys](#modifier-left--and-right--keys)
    * [Layer Modifier Matching](#layer-modifier-matching)
  * [Element: row](#element-row)
  * [Element: variables](#element-variables)
  * [Element: string](#element-string)
  * [Element: set](#element-set)
  * [Element: uset](#element-uset)
  * [Element: transforms](#element-transforms)
    * [Markers](#markers)
  * [Element: transformGroup](#element-transformgroup)
    * [Example: `transformGroup` with `transform` elements](#example-transformgroup-with-transform-elements)
    * [Example: `transformGroup` with `reorder` elements](#example-transformgroup-with-reorder-elements)
  * [Element: transform](#element-transform)
    * [Regex-like Syntax](#regex-like-syntax)
    * [Additional Features](#additional-features)
    * [Disallowed Regex Features](#disallowed-regex-features)
    * [Replacement syntax](#replacement-syntax)
  * [Element: reorder](#element-reorder)
    * [Using `<import>` with `<reorder>` elements](#using-import-with-reorder-elements)
    * [Example Post-reorder transforms](#example-post-reorder-transforms)
    * [Reorder and Markers](#reorder-and-markers)
  * [Backspace Transforms](#backspace-transforms)
* [Invariants](#invariants)
* [Keyboard IDs](#keyboard-ids)
  * [Principles for Keyboard IDs](#principles-for-keyboard-ids)
* [Platform Behaviors in Edge Cases](#platform-behaviors-in-edge-cases)

## Keyboards

The Unicode Standard and related technologies such as CLDR have dramatically improved the path to language support. However, keyboard support remains platform- and vendor-specific, causing inconsistencies in both implementation and timeline.

More and more language communities are determining that digitization is vital to their approach to language preservation and that engagement with Unicode is essential to becoming fully digitized. For many of these communities, however, getting new characters or a new script added to the Unicode Standard is not the end of their journey. The next, often more challenging stage is to get device makers, operating systems, apps and services to implement the script requirements that Unicode has just added to support their language.

However, commensurate improvements to streamline new language support on the input side have been lacking. CLDR’s Keyboard specification has been updated in an attempt to address this gap.

This document specifies an interchange format for the communication of keyboard mapping data independent of vendors and platforms. Keyboard authors can then create a single mapping file for their language, which implementations can use to provide that language’s keyboard mapping on their own platform.

Additionally, the standardized identifier for keyboards can be used to communicate, internally or externally, a request for a particular keyboard mapping that is to be used to transform either text or keystrokes. The corresponding data can then be used to perform the requested actions. For example, a remote screen-access application (such as one used for customer service or server management) would be able to communicate and choose the same keyboard layout on the remote device as is used in front of the user, even if the two systems used different platforms.

The data can also be used in analysis of the capabilities of different keyboards. It also allows better interoperability by making it easier for keyboard designers to see which characters are generally supported on keyboards for given languages.

<!-- To illustrate this specification, here is an abridged layout representing the English US 101 keyboard on the macOS operating system (with an inserted long-press example). -->

For complete examples, see the XML files in the CLDR source repository.

Attribute values should be evaluated considering the DTD and [DTD Annotations](tr35.md#dtd-annotations).

* * *

## Goals and Non-goals

Some goals of this format are:

1. Physical and virtual keyboard layouts defined in a single file.
2. Provide definitive platform-independent definitions for new keyboard layouts.
    * For example, a new French standard keyboard layout would have a single definition which would be usable across all implementations.
3. Allow platforms to use CLDR keyboard data for the character-emitting (non-frame) key aspects of keyboard layouts.
4. Deprecate & archive existing LDML platform-specific layouts so they are not part of future releases.

<!--
1. Make the XML as readable as possible.
2. Represent faithfully keyboard data from major platforms: it should be possible to create a functionally-equivalent data file (such that given any input, it can produce the same output).
3. Make as much commonality in the data across platforms as possible to make comparison easy. -->

Some non-goals (outside the scope of the format) currently are:

1. Adaptation for screen scaling resolution. Instead, keyboards should define layouts based on physical size. Platforms may interpret physical size definitions and adapt for different physical screen sizes with different resolutions.
2. Unification of platform-specific virtual key and scan code mapping tables.
3. Unification of pre-existing platform layouts themselves (e.g. existing fr-azerty on platform a, b, c).
4. Support for prior (pre-3.0) CLDR keyboard files. See [Compatibility Notice](#compatibility-notice).
5. Run-time efficiency. [LDML is explicitly an interchange format](tr35.md#Introduction), and so it is expected that data will be transformed to a more compact format for use by a keystroke processing engine.
6. Platform-specific frame keys such as Fn, Numpad, IME swap keys, and cursor keys are out of scope.
   (This also means that in this specification, modifier (frame) keys cannot generate output, such as Caps Lock producing backslash.)

<!-- 1. Display names or symbols for keycaps (eg, the German name for "Return"). If that were added to LDML, it would be in a different structure, outside the scope of this section.
2. Advanced IME features, handwriting recognition, etc.
3. Roundtrip mappings—the ability to recover precisely the same format as an original platform's representation. In particular, the internal structure may have no relation to the internal structure of external keyboard source data, the only goal is functional equivalence. -->

<!-- Note: During development of this section, it was considered whether the modifier RAlt (= AltGr) should be merged with Option. In the end, they were kept separate, but for comparison across platforms implementers may choose to unify them. -->

Note that in parts of this document, the format `@x` is used to indicate the _attribute_ **x**.

### Compatibility Notice

> A major rewrite of this specification, called "Keyboard 3.0", was introduced in CLDR v45.
> The changes required were too extensive to maintain compatibility. For this reason, the `ldmlKeyboard3.dtd` DTD is _not_ compatible with DTDs from prior versions of CLDR, that is, v43 and earlier.
>
> To process earlier XML files, use the data and specification from v43.1, found at <https://www.unicode.org/reports/tr35/tr35-69/tr35.html>
>
> `ldmlKeyboard.dtd` continues to be made available in CLDR; however, it will not be updated.

### Accessibility

Keyboard use can be challenging for individuals with various types of disabilities. For this revision, features or architectural designs specifically for the purpose of improving accessibility are not yet included. However:

1. Having an industry-wide standard format for keyboards will enable accessibility software to make use of keyboard data with a reduced dependence on platform-specific knowledge.
2. Features which require certain levels of mobility or speed of entry should be considered for their impact on accessibility. This impact could be mitigated by means of additional, accessible methods of generating the same output.
3. Public feedback is welcome on any aspects of this document which might hinder accessibility.

## Definitions

**Arrangement:** The relative position of the rectangles that represent keys, either physically or virtually. A hardware keyboard has a static arrangement while a touch keyboard may have a dynamic arrangement that changes per language and/or layer. While the arrangement of keys on a keyboard may be fixed, the mapping of those keys may vary.

**Base character:** The character emitted by a particular key when no modifiers are active. In ISO 9995-1:2009 terms, this is Group 1, Level 1.

**Core keys:** Also known as the “alphanumeric” section. The primary set of key values on a keyboard that are used for typing the target language of the keyboard. For example, the three rows of letters on a standard US QWERTY keyboard (QWERTYUIOP, ASDFGHJKL, ZXCVBNM) together with the most significant punctuation keys. Usually this equates to the minimal set of keys for a language as seen on mobile phone keyboards.
Distinguished from the **frame keys**.

**Dead keys:** These are keys which do not emit normal characters by themselves. They are so named because to the user, they may appear to be “dead,” i.e., non-functional. However, they do produce a change to the input context. For example, in many Latin keyboards hitting the `^` dead-key followed by the `e` key produces `ê`. The `^` by itself may be invisible or presented in a special way by the platform.
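
The dead-key behavior described above can be illustrated with a minimal Python sketch. It is not how this specification encodes dead keys; the table `DEAD_KEY_COMBOS`, the function `type_keys`, and the fallback policy for unmatched combinations are all assumptions made for illustration only.

```python
# Hypothetical dead-key table: (pending dead key, next key) -> output text.
DEAD_KEY_COMBOS = {
    ("^", "e"): "ê",
    ("^", "a"): "â",
}
DEAD_KEYS = {"^"}


def type_keys(keys):
    """Return the text produced by a sequence of key presses."""
    output = []
    pending = None  # dead key waiting for the next key press
    for key in keys:
        if pending is not None:
            combined = DEAD_KEY_COMBOS.get((pending, key))
            # Assumed fallback: with no matching combination, emit both as typed.
            output.append(combined if combined is not None else pending + key)
            pending = None
        elif key in DEAD_KEYS:
            pending = key  # changes the input context, emits nothing yet
        else:
            output.append(key)
    if pending is not None:
        output.append(pending)  # assumed: a trailing dead key is emitted as-is
    return "".join(output)
```

Calling `type_keys(["^", "e"])` yields `ê`, while `type_keys(["e"])` yields a plain `e`. The pending state is the "change to the input context" that makes the key appear dead to the user.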

**Frame keys:** These are keys which are outside of the area of the **core keys** and typically do not emit characters. These keys include **modifier** keys, such as Shift or Ctrl, but also include platform-specific keys: Fn, IME and layout-switching keys, cursor keys, emoji-insertion keys, etc.

**Hardware keyboard:** An input device which has individual keys that are pressed. Each key has a unique identifier and the arrangement doesn't change, even if the mapping of those keys does. Also known as a physical keyboard.

**Implementation:** see **Keyboard implementation**

**Input Method Editor (IME):** A component or program that supports input of large character sets. Typically, IMEs employ contextual logic and candidate UI to identify the Unicode characters intended by the user.

**Keyboard implementation:** Software which implements the present specification, such that keyboard XML files can be used to interpret keystrokes from a **Hardware keyboard** or an on-screen **Touch keyboard**.

Keyboard implementations will typically consist of two parts:

1. A _compile/build tool_ part used by **Keyboard authors** to parse the XML file and produce a compact runtime format, and
2. A _runtime_ part which interprets the runtime format when the keyboard is selected by the end user, and delivers the output plain text to the platform or application.
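
The two-part split above might look roughly like the following Python sketch. The sample XML is hypothetical and heavily abridged, JSON stands in for a real compact runtime format, and the names `compile_keyboard` and `Runtime` are illustrative assumptions rather than anything defined by this specification.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical, heavily abridged keyboard XML; real files carry far more structure.
SAMPLE_XML = """
<keyboard3>
  <keys>
    <key id="a" output="a"/>
    <key id="e-acute" output="é"/>
  </keys>
</keyboard3>
"""


def compile_keyboard(xml_text):
    """Build step: parse the XML and emit a compact runtime blob (JSON here)."""
    root = ET.fromstring(xml_text)
    table = {key.get("id"): key.get("output", "") for key in root.iter("key")}
    return json.dumps(table, ensure_ascii=False, separators=(",", ":"))


class Runtime:
    """Runtime step: load the compact blob and turn key ids into output text."""

    def __init__(self, blob):
        self._table = json.loads(blob)

    def press(self, key_id):
        # Assumed policy: unknown key ids produce no output.
        return self._table.get(key_id, "")
```

A build tool would run `compile_keyboard(SAMPLE_XML)` once, and the runtime would later call `Runtime(blob).press("e-acute")` for each keystroke. Keeping the XML parser out of the runtime is consistent with the non-goal that the interchange format itself need not be efficient at run time.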

**Key:** A physical key on a hardware keyboard, or a virtual key on a touch keyboard.

**Key code:** The integer code sent to the application on pressing a key.

**Key map:** The basic mapping between hardware or on-screen positions and the output characters for each set of modifier combinations associated with a particular layout. There may be multiple key maps for each layout.

**Keyboard:** A particular arrangement of keys for the inputting of text, such as a hardware keyboard or a touch keyboard.

**Keyboard author:** The person or group of people designing and producing a particular keyboard layout designed to support one or more languages. In the context of this specification, that author may be editing the LDML XML file directly or by means of software tools.

**Keyboard layout:** A layout is the overall keyboard configuration for a particular locale. Within a keyboard layout, there is a single base map, one or more key maps and zero or more transforms.

**Layer:** An arrangement of keys on a touch keyboard. A touch keyboard is made up of a set of layers. Each layer may have a different key layout, unlike with a hardware keyboard, and may not correspond directly to a hardware keyboard's modifier keys. A layer is accessed via a layer-switching key. See also **Touch keyboard** and **Modifier**.

**Long-press key:** Also known as a “child key”. A secondary key that is invoked from a top level key on a touch keyboard. Secondary keys typically provide access to variants of the top level key, such as accented variants (a => á, à, ä, ã).

**Modifier:** A key that is held to change the behavior of a hardware keyboard. For example, the "Shift" key allows access to upper-case characters on a US keyboard. Other modifier keys include but are not limited to: Ctrl, Alt, Option, Command and Caps Lock. On a touch keyboard, keys that appear to be modifier keys should be considered to be layer-switching keys.

**Physical keyboard:** see **Hardware keyboard**

**Touch keyboard:** A keyboard that is rendered on what is typically a touch surface. It has a dynamic arrangement and contrasts with a hardware keyboard. This term has many synonyms: software keyboard, SIP (Software Input Panel), virtual keyboard. This contrasts with other uses of the term virtual keyboard as an on-screen keyboard for reference or accessibility data entry.

**Transform:** A transform is an element that specifies a set of conversions from sequences of code points into one (or more) other code points. Transforms may reorder or replace text. They may be used to implement “dead key” behaviors, simple orthographic corrections, visual (typewriter) input, etc.

**Virtual keyboard:** see **Touch keyboard**

## Notation

- Ellipses (`…`) in syntax examples are used to denote substituted parts.

  For example, `id="…keyId"` denotes that `…keyId` (the part between double quotes) is to be replaced with something, in this case a key identifier. As another example, `\u{…usv}` denotes that the `…usv` is to be replaced with something, in this case a Unicode scalar value in hex.

### Escaping

When explicitly specified, attribute values can contain escaped characters. This specification uses two methods of escaping, the _UnicodeSet_ notation and the `\u{…usv}` notation.

### UnicodeSet Escaping

The _UnicodeSet_ notation is described in [UTS #35 section 5.3.3](tr35.md#Unicode_Sets) and allows for comprehensive character matching, including by character range, properties, names, or codepoints.

Note that the `\u1234` and `\x{C1}` format escaping is not supported, only the `\u{…}` format (using `bracketedHex`).
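
As an illustration of the restriction above, a lint-style check could flag the two unsupported escape forms before a file is processed further. This Python sketch is an assumption about how such a check might be written; `find_unsupported_escapes` is a hypothetical helper, and it does not parse UnicodeSet syntax in general.

```python
import re

# First alternative consumes an escaped backslash so that "\\u1234" is not flagged;
# the capturing group matches the unsupported \u1234 and \x{C1} forms.
_ESCAPES = re.compile(r"\\\\|(\\u[0-9A-Fa-f]{4}|\\x\{[0-9A-Fa-f]+\})")


def find_unsupported_escapes(text):
    """Return the unsupported escape sequences found in an attribute value."""
    return [m.group(1) for m in _ESCAPES.finditer(text) if m.group(1)]
```

An empty result for a value such as `[\u{1234}a-z]` means only the bracketed form was used.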
275*912701f9SAndroid Build Coastguard Worker
276*912701f9SAndroid Build Coastguard WorkerCurrently, the following attribute values allow _UnicodeSet_ notation:
277*912701f9SAndroid Build Coastguard Worker
278*912701f9SAndroid Build Coastguard Worker* `from` or `before` on the `<transform>` element
279*912701f9SAndroid Build Coastguard Worker* `from` or `before` on the `<reorder>` element
280*912701f9SAndroid Build Coastguard Worker* `chars` on the [`<repertoire>`](#test-element-repertoire) test element.
281*912701f9SAndroid Build Coastguard Worker
282*912701f9SAndroid Build Coastguard Worker### UTS18 Escaping
283*912701f9SAndroid Build Coastguard Worker
284*912701f9SAndroid Build Coastguard WorkerThe `\u{…usv}` notation, a subset of hex notation, is described in [UTS #18 section 1.1](https://www.unicode.org/reports/tr18/#Hex_notation). It can refer to one or multiple individual codepoints. Currently, the following attribute values allow the `\u{…}` notation:
285*912701f9SAndroid Build Coastguard Worker
286*912701f9SAndroid Build Coastguard Worker* `output` on the `<key>` element
287*912701f9SAndroid Build Coastguard Worker* `from` or `to` on the `<transform>` element
288*912701f9SAndroid Build Coastguard Worker* `value` on the `<variable>` element
289*912701f9SAndroid Build Coastguard Worker* `output` and `display` on the `<display>` element
290*912701f9SAndroid Build Coastguard Worker* `baseCharacter` on the `<displayOptions>` element
291*912701f9SAndroid Build Coastguard Worker* Some attributes on [Keyboard Test Data](#keyboard-test-data) subelements
292*912701f9SAndroid Build Coastguard Worker
Characters of general category Mark (M), Control characters (Cc), Format characters (Cf), and whitespace other than space should be encoded using one of the notations above as appropriate.
294*912701f9SAndroid Build Coastguard Worker
Attribute values escaped in this manner are annotated with the `<!--@ALLOWS_UESC-->` DTD annotation; see [DTD Annotations](tr35.md#dtd-annotations).
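
As an illustration of the `\u{…usv}` notation, the following minimal sketch expands escapes in an attribute value. The function name `unescape_usv` is hypothetical, and the handling of several space-separated hex values inside one pair of braces follows the UTS #18 hex notation; it is an assumption about the exact grammar rather than a normative parser.

```python
import re

# Matches \u{...} where the braces hold one or more space-separated hex values.
_UESC = re.compile(r"\\u\{([0-9A-Fa-f]{1,6}(?: [0-9A-Fa-f]{1,6})*)\}")

def unescape_usv(value: str) -> str:
    """Replace each \\u{…} escape in an attribute value with the codepoints it names."""
    def expand(match: re.Match) -> str:
        out = []
        for hex_value in match.group(1).split(" "):
            cp = int(hex_value, 16)
            # Unicode scalar values exclude surrogates and stop at U+10FFFF.
            if cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
                raise ValueError(f"not a Unicode scalar value: {hex_value}")
            out.append(chr(cp))
        return "".join(out)
    return _UESC.sub(expand, value)

# The unsupported \u1234 form is deliberately left untouched by this sketch.
decoded = unescape_usv(r"e\u{0320}\u{0300}")
```

Because only the bracketed form is recognized, a value written with `\u1234` passes through unchanged, which makes such unsupported escapes easy to detect in a later validation step.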
296*912701f9SAndroid Build Coastguard Worker
297*912701f9SAndroid Build Coastguard Worker* * *
298*912701f9SAndroid Build Coastguard Worker
299*912701f9SAndroid Build Coastguard Worker## File and Directory Structure
300*912701f9SAndroid Build Coastguard Worker
* In the future, new layouts will be included in the CLDR repository, as a way for new layouts to be distributed in a cross-platform manner. The process for this repository of layouts has not yet been defined; see the [CLDR Keyboard Workgroup Page][keyboard-workgroup] for up-to-date information.
302*912701f9SAndroid Build Coastguard Worker
* Layouts have version metadata to indicate their specification compliance version number, such as `45`. See [`cldrVersion`](tr35-info.md#version-information).
304*912701f9SAndroid Build Coastguard Worker
305*912701f9SAndroid Build Coastguard Worker```xml
306*912701f9SAndroid Build Coastguard Worker<keyboard3 xmlns="https://schemas.unicode.org/cldr/45/keyboard3" conformsTo="45"/>
307*912701f9SAndroid Build Coastguard Worker```
308*912701f9SAndroid Build Coastguard Worker
309*912701f9SAndroid Build Coastguard Worker> _Note_: Unlike other LDML files, layouts are designed to be used outside of the CLDR source tree.  As such, they do not contain DOCTYPE entries.
310*912701f9SAndroid Build Coastguard Worker>
311*912701f9SAndroid Build Coastguard Worker> DTD and Schema (.xsd) files are available for use in validating keyboard files.
312*912701f9SAndroid Build Coastguard Worker
313*912701f9SAndroid Build Coastguard Worker* The filename of a keyboard .xml file does not have to match the BCP47 primary locale ID, but it is recommended to do so. The CLDR repository may enforce filename consistency.
314*912701f9SAndroid Build Coastguard Worker
315*912701f9SAndroid Build Coastguard Worker### Extensibility
316*912701f9SAndroid Build Coastguard Worker
317*912701f9SAndroid Build Coastguard WorkerFor extensibility, the `<special>` element will be allowed at nearly every level.
318*912701f9SAndroid Build Coastguard Worker
319*912701f9SAndroid Build Coastguard WorkerSee [Element special](tr35.md#special) in Part 1.
320*912701f9SAndroid Build Coastguard Worker
321*912701f9SAndroid Build Coastguard Worker## Normalization
322*912701f9SAndroid Build Coastguard Worker
323*912701f9SAndroid Build Coastguard WorkerUnicode Normalization, as described in [The Unicode Standard](https://www.unicode.org/reports/tr41/#Unicode/), is a process by which Unicode text is processed to eliminate unwanted distinctions.
324*912701f9SAndroid Build Coastguard Worker
This section discusses how conformant keyboards are affected by normalization, and the impact of normalization on keyboard authors and keyboard implementations.
326*912701f9SAndroid Build Coastguard Worker
Keyboard implementations will usually apply normalization as appropriate when matching transform rules and `<display>` values.
328*912701f9SAndroid Build Coastguard WorkerOutput from the keyboard, following application of all transform rules, will be normalized to the appropriate form by the keyboard implementation.
329*912701f9SAndroid Build Coastguard Worker
330*912701f9SAndroid Build Coastguard Worker> Note: There are many existing software libraries which perform Unicode Normalization, including [ICU](https://icu.unicode.org), [ICU4X](https://icu4x.unicode.org), and JavaScript's [String.prototype.normalize()](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String/normalize).
331*912701f9SAndroid Build Coastguard Worker
332*912701f9SAndroid Build Coastguard WorkerKeyboard authors will not typically need to perform normalization as part of the keyboard layout.  However, authors should be aware of areas where normalization affects keyboard operation so that they may achieve their desired results.
333*912701f9SAndroid Build Coastguard Worker
334*912701f9SAndroid Build Coastguard Worker### Where Normalization Occurs
335*912701f9SAndroid Build Coastguard Worker
336*912701f9SAndroid Build Coastguard WorkerThere are four stages where normalization must be performed by keyboard implementations.
337*912701f9SAndroid Build Coastguard Worker
338*912701f9SAndroid Build Coastguard Worker1. **From the keyboard source `.xml`**
339*912701f9SAndroid Build Coastguard Worker
340*912701f9SAndroid Build Coastguard Worker    Keyboard source .xml files may be in any normalization form.
341*912701f9SAndroid Build Coastguard Worker    However, in processing they are converted to NFD.
342*912701f9SAndroid Build Coastguard Worker
343*912701f9SAndroid Build Coastguard Worker    - From any form to NFD: full normalization (decompose+reorder)
344*912701f9SAndroid Build Coastguard Worker    - Markers must be processed as described [below](#marker-algorithm-overview).
345*912701f9SAndroid Build Coastguard Worker    - Regex patterns must be processed so that matching is performed in NFD.
346*912701f9SAndroid Build Coastguard Worker
347*912701f9SAndroid Build Coastguard Worker    Example: `<key output=`, and `<transform from= to=` attribute contents will be normalized to NFD.
348*912701f9SAndroid Build Coastguard Worker
349*912701f9SAndroid Build Coastguard Worker2. **From the input context**
350*912701f9SAndroid Build Coastguard Worker
351*912701f9SAndroid Build Coastguard Worker    The input context must be normalized for purposes of matching.
352*912701f9SAndroid Build Coastguard Worker
353*912701f9SAndroid Build Coastguard Worker    - From any form to NFD: full normalization (decompose+reorder)
354*912701f9SAndroid Build Coastguard Worker    - Markers in the cached context must be preserved.
355*912701f9SAndroid Build Coastguard Worker
356*912701f9SAndroid Build Coastguard Worker    Example: The input context contains U+00E8 (`è`).  The user clicks the cursor after the character, then presses a key which produces U+0320 (`<key output="\u{0320}"/>`).
357*912701f9SAndroid Build Coastguard Worker    The implementation must normalize the context buffer to `e\u{0320}\u{0300}` (`è̠`) before matching.
358*912701f9SAndroid Build Coastguard Worker
359*912701f9SAndroid Build Coastguard Worker3. **Before each `transformGroup`**
360*912701f9SAndroid Build Coastguard Worker
361*912701f9SAndroid Build Coastguard Worker    Text must be normalized before processing by the next `transformGroup`.
362*912701f9SAndroid Build Coastguard Worker
363*912701f9SAndroid Build Coastguard Worker    - To NFD: no decomposition should be needed, because all of the input text (including transform rules) was already in NFD form.
364*912701f9SAndroid Build Coastguard Worker    However, marker reordering may be needed if transforms insert segments out of order.
365*912701f9SAndroid Build Coastguard Worker    - Markers must be preserved.
366*912701f9SAndroid Build Coastguard Worker
367*912701f9SAndroid Build Coastguard Worker    Example: The input context contains U+00E8 (`è`).  The user clicks the cursor after this character, then presses a key producing `x`. A transform rule `<transform from='x' to='\u{0320}'/>` matches. The implementation must normalize the intermediate buffer to `e\u{0320}\u{0300}` (`è̠`) before proceeding to the next `transformGroup`.
368*912701f9SAndroid Build Coastguard Worker
369*912701f9SAndroid Build Coastguard Worker4. **Before output to the platform/application**
370*912701f9SAndroid Build Coastguard Worker
371*912701f9SAndroid Build Coastguard Worker    Text must be normalized into the output form requested by the platform or application. This will typically be NFC, but may not be.
372*912701f9SAndroid Build Coastguard Worker
373*912701f9SAndroid Build Coastguard Worker    - If normalizing to NFC, full normalization (reorder+composition) will be required.
    - No markers are present in this text; they are removed prior to output but retained in the implementation's input context for subsequent keystrokes. See [markers](#markers).
375*912701f9SAndroid Build Coastguard Worker
376*912701f9SAndroid Build Coastguard Worker    Example: The result of keystrokes and transform processing produces the string `e\u{0300}`. The keyboard implementation normalizes this to a single NFC codepoint U+00E8 (`è`), which is returned to the application.
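
The context and output stages above can be sketched with Python's standard `unicodedata` module. The helper names `to_matching_form` and `to_output_form` are hypothetical, and the sketch handles plain text only; marker handling is described later in this section.

```python
import unicodedata

def to_matching_form(context: str) -> str:
    """Stages 2 and 3: bring text to NFD before transform matching."""
    return unicodedata.normalize("NFD", context)

def to_output_form(text: str, form: str = "NFC") -> str:
    """Stage 4: normalize to the form the platform requests (often NFC)."""
    return unicodedata.normalize(form, text)

# Stage 2 example: the context holds U+00E8 and the key appends U+0320.
buffer = to_matching_form("\u00e8" + "\u0320")

# Stage 4 example: e + U+0300 is composed before being handed to the application.
output = to_output_form("e\u0300")
```

Keeping the output form as a parameter reflects the statement that the requested form is typically, but not always, NFC.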
377*912701f9SAndroid Build Coastguard Worker
378*912701f9SAndroid Build Coastguard Worker### Normalization and Transform Matching
379*912701f9SAndroid Build Coastguard Worker
Regardless of the normalization form in the keyboard source file or in the edit buffer context, transform matching will be performed using **NFD**. For example, all of the following transforms will match the input string è̠, whether the input is U+00E8 U+0320, U+0065 U+0320 U+0300, or U+0065 U+0300 U+0320.
381*912701f9SAndroid Build Coastguard Worker
382*912701f9SAndroid Build Coastguard Worker```xml
383*912701f9SAndroid Build Coastguard Worker<transform from="e\u{0320}\u{0300}" /> <!-- NFD -->
384*912701f9SAndroid Build Coastguard Worker<transform from="\u{00E8}\u{0320}"  /> <!-- NFC: è + U+0320 -->
385*912701f9SAndroid Build Coastguard Worker<transform from="e\u{0300}\u{0320}" /> <!-- Unnormalized -->
386*912701f9SAndroid Build Coastguard Worker```
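
A rough way to see why all three rules match is to normalize both the rule text and the context before comparing. The sketch below is a simplification: `literal_from_matches` is a hypothetical helper that treats `from` as a literal string matched against the end of the context, which is an assumption made for illustration; real `from` values are regex-like patterns.

```python
import unicodedata

def nfd(text: str) -> str:
    return unicodedata.normalize("NFD", text)

def literal_from_matches(from_text: str, context: str) -> bool:
    """True if a literal `from` string matches the end of the context once both are in NFD."""
    return nfd(context).endswith(nfd(from_text))

# The three rule spellings and the three input spellings from the text above.
rules = ["e\u0320\u0300", "\u00e8\u0320", "e\u0300\u0320"]
contexts = ["\u00e8\u0320", "e\u0320\u0300", "e\u0300\u0320"]
results = [literal_from_matches(rule, ctx) for rule in rules for ctx in contexts]
```

Every rule and context pairing compares equal because each spelling reduces to the same NFD sequence, `e` U+0320 U+0300.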
387*912701f9SAndroid Build Coastguard Worker
388*912701f9SAndroid Build Coastguard Worker### Normalization and Markers
389*912701f9SAndroid Build Coastguard Worker
390*912701f9SAndroid Build Coastguard WorkerA special issue occurs when markers are involved.
391*912701f9SAndroid Build Coastguard Worker[Markers](#markers) are not text, and so not themselves modified or reordered by the Unicode Normalization Algorithm.
392*912701f9SAndroid Build Coastguard WorkerExisting Normalization APIs typically operate on plain text, and so those APIs can not be used with content containing markers.
393*912701f9SAndroid Build Coastguard Worker
394*912701f9SAndroid Build Coastguard WorkerHowever, the markers must be retained and processed by keyboard implementations in a manner which will be both consistent across implementations and predictable to keyboard authors.
Inconsistencies would result in different user experiences (specifically, different or incorrect text output) on some implementations and not others.
396*912701f9SAndroid Build Coastguard WorkerUnpredictability would make it challenging for the keyboard author to create a keyboard with expected behavior.
397*912701f9SAndroid Build Coastguard Worker
398*912701f9SAndroid Build Coastguard WorkerThis section gives an algorithm for implementing normalization on a text stream including markers.
399*912701f9SAndroid Build Coastguard Worker
_Note:_ When the algorithm is performed on a plain text stream that doesn't include markers, implementations may skip the removing and re-adding phases (1 and 3) because no markers are involved.
401*912701f9SAndroid Build Coastguard Worker
402*912701f9SAndroid Build Coastguard Worker#### Rationale for 'gluing' markers
403*912701f9SAndroid Build Coastguard Worker
The processing described here extends Unicode normalization to account for the desired behavior of markers.
405*912701f9SAndroid Build Coastguard Worker
The algorithm considers markers to be 'glued' to (remaining with) the following character. If a context ends with a marker, that marker is guaranteed to remain at the end after processing, consistently located with respect to the next keystroke to be input.
407*912701f9SAndroid Build Coastguard Worker
408*912701f9SAndroid Build Coastguard Worker1. Keyboard authors can keep a marker together with a character of interest by emitting the marker just previous to that character.
409*912701f9SAndroid Build Coastguard Worker
For example, given a key `output="\m{marker}X"`, the marker will precede `X` regardless of any normalization. (If `output="X\m{marker}"` were used, and `X` were to reorder with other characters, the marker would no longer be adjacent to the X.)
411*912701f9SAndroid Build Coastguard Worker
412*912701f9SAndroid Build Coastguard Worker2. Markers which are at the end of the input remain at the end of input during normalization.
413*912701f9SAndroid Build Coastguard Worker
414*912701f9SAndroid Build Coastguard WorkerFor example, given input context which ends with a marker, such as `...ABCDX\m{marker}`, the marker will remain at the end of the input context regardless of any normalization.
415*912701f9SAndroid Build Coastguard Worker
416*912701f9SAndroid Build Coastguard WorkerThe 'gluing' is only applicable during one particular processing step. It does not persist or affect further processing steps or future keystrokes.
417*912701f9SAndroid Build Coastguard Worker
418*912701f9SAndroid Build Coastguard Worker#### Data Model: `Marker`
419*912701f9SAndroid Build Coastguard Worker
420*912701f9SAndroid Build Coastguard WorkerFor purposes of this algorithm, a `Marker` is an opaque data type which has one property, its ID. See [Markers](#markers) for a discussion of the marker ID.
421*912701f9SAndroid Build Coastguard Worker
422*912701f9SAndroid Build Coastguard Worker#### Data Model: string
423*912701f9SAndroid Build Coastguard Worker
For purposes of this algorithm, a string is an array of elements, where each element is either a codepoint or a `Marker`. For example, a [`key`](#element-key) in the XML such as `<key id="sha" output="𐓯\m{mymarker}x" />` would produce a string with three elements:
425*912701f9SAndroid Build Coastguard Worker
426*912701f9SAndroid Build Coastguard Worker1. The codepoint U+104EF
427*912701f9SAndroid Build Coastguard Worker2. The `Marker` named `mymarker`
428*912701f9SAndroid Build Coastguard Worker3. The codepoint U+0078
429*912701f9SAndroid Build Coastguard Worker
If this string were output to an application, it would be converted to _plain text_ by removing all markers, which would yield the plain text string with only two codepoints: `𐓯x`.
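
One possible in-memory representation of this data model is sketched below. The names `Marker`, `Element`, and `to_plain_text` are illustrative choices rather than part of the specification, and codepoints are represented as one-character Python strings for convenience.

```python
from dataclasses import dataclass
from typing import List, Union

@dataclass(frozen=True)
class Marker:
    """Opaque marker; its only property is its ID."""
    id: str

# A string is a sequence whose elements are codepoints (one-character str) or Markers.
Element = Union[str, Marker]

def to_plain_text(elements: List[Element]) -> str:
    """Drop every Marker, keeping only the codepoints."""
    return "".join(e for e in elements if not isinstance(e, Marker))

# The `sha` key output from the example: U+104EF, the marker `mymarker`, then U+0078.
sha_output: List[Element] = ["\U000104EF", Marker("mymarker"), "x"]
plain = to_plain_text(sha_output)
```

Keeping markers as distinct objects rather than reserved codepoints means ordinary text APIs can never confuse them with characters.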
431*912701f9SAndroid Build Coastguard Worker
432*912701f9SAndroid Build Coastguard Worker#### Data Model: `MarkerEntry`
433*912701f9SAndroid Build Coastguard Worker
434*912701f9SAndroid Build Coastguard WorkerThis algorithm uses a temporary data structure which is an ordered array of `MarkerEntry` elements.
435*912701f9SAndroid Build Coastguard Worker
436*912701f9SAndroid Build Coastguard WorkerEach `MarkerEntry` element has the following properties:
- `glue` (a codepoint, or the special value `END_OF_SEGMENT`, abbreviated as `END` in the steps below)
438*912701f9SAndroid Build Coastguard Worker- `divider?` (true/false)
439*912701f9SAndroid Build Coastguard Worker- `processed?` (true/false, defaults to false)
440*912701f9SAndroid Build Coastguard Worker- `marker` (the `Marker` object)
441*912701f9SAndroid Build Coastguard Worker
442*912701f9SAndroid Build Coastguard Worker#### Marker Algorithm Overview
443*912701f9SAndroid Build Coastguard Worker
444*912701f9SAndroid Build Coastguard WorkerThis algorithm has three main phases to it.
445*912701f9SAndroid Build Coastguard Worker
446*912701f9SAndroid Build Coastguard Worker1. **Parsing/Removing Markers**
447*912701f9SAndroid Build Coastguard Worker
    In this phase, the input string is analyzed to locate all markers. Metadata about each marker is stored in a temporary array of `MarkerEntry` elements.
449*912701f9SAndroid Build Coastguard Worker    Markers are removed from the input string, leaving only plain text.
450*912701f9SAndroid Build Coastguard Worker
451*912701f9SAndroid Build Coastguard Worker2. **Plain Text Processing**
452*912701f9SAndroid Build Coastguard Worker
    In this phase, processing such as NFD normalization is performed on the plain text string.
454*912701f9SAndroid Build Coastguard Worker
455*912701f9SAndroid Build Coastguard Worker3. **Re-Adding Markers**
456*912701f9SAndroid Build Coastguard Worker
457*912701f9SAndroid Build Coastguard Worker    Finally, markers are re-added to the plain text string using the `MarkerEntry` metadata from step 1.
458*912701f9SAndroid Build Coastguard Worker    This phase results in a string which contains both codepoints and markers.
459*912701f9SAndroid Build Coastguard Worker
460*912701f9SAndroid Build Coastguard Worker#### Phase 1: Parsing/Removing Markers
461*912701f9SAndroid Build Coastguard Worker
462*912701f9SAndroid Build Coastguard WorkerGiven an input string _s_
463*912701f9SAndroid Build Coastguard Worker
464*912701f9SAndroid Build Coastguard Worker1. Initialize an empty `MarkerEntry` array _e_
465*912701f9SAndroid Build Coastguard Worker2. Initialize an empty `Marker` array _pending_
3. Loop through each element _i_ of the input _s_
467*912701f9SAndroid Build Coastguard Worker    1. If _i_ is a `Marker`:
468*912701f9SAndroid Build Coastguard Worker        1. add the marker _i_ to the end of _pending_
469*912701f9SAndroid Build Coastguard Worker        2. remove the marker from the input string _s_
470*912701f9SAndroid Build Coastguard Worker    2. else if _i_ is a codepoint:
471*912701f9SAndroid Build Coastguard Worker        1. Decompose _i_ into NFD form into a plain text string array of codepoints _d_
472*912701f9SAndroid Build Coastguard Worker        2. Add an element with `glue=d[0]` (the first codepoint of _d_) and `divider? = true` to the end of _e_
473*912701f9SAndroid Build Coastguard Worker        3. For every marker _m_ in _pending_:
474*912701f9SAndroid Build Coastguard Worker            1. Add an element with `glue=d[0]` and `marker=m` and `divider? = false` to the end of _e_
475*912701f9SAndroid Build Coastguard Worker        4. Clear the _pending_ array.
476*912701f9SAndroid Build Coastguard Worker        5. Finally, for every codepoint _c_ in _d_ **following** the initial codepoint: (d[1]..):
477*912701f9SAndroid Build Coastguard Worker            1. Add an element with `glue=c` and `divider? = true` to the end of _e_
4. At the end of text,
479*912701f9SAndroid Build Coastguard Worker    1. Add an element with `glue=END` and `divider?=true` to the end of _e_
480*912701f9SAndroid Build Coastguard Worker    2. For every marker _m_ in _pending_:
481*912701f9SAndroid Build Coastguard Worker        1. Add an element with `glue=END` and `marker=m` and `divider? = false` to the end of _e_
482*912701f9SAndroid Build Coastguard Worker
483*912701f9SAndroid Build Coastguard WorkerThe string _s_ is now plain text and can be processed by the next phase.
484*912701f9SAndroid Build Coastguard Worker
485*912701f9SAndroid Build Coastguard WorkerThe array _e_ will be used in Phase 3.
486*912701f9SAndroid Build Coastguard Worker
487*912701f9SAndroid Build Coastguard Worker#### Phase 2: Plain Text Processing
488*912701f9SAndroid Build Coastguard Worker
489*912701f9SAndroid Build Coastguard WorkerSee [UAX #15](https://www.unicode.org/reports/tr15/#Description_Norm) for an overview of the process.  An existing Unicode-compliant API can be used here.
490*912701f9SAndroid Build Coastguard Worker
491*912701f9SAndroid Build Coastguard Worker#### Phase 3: Adding Markers
492*912701f9SAndroid Build Coastguard Worker
493*912701f9SAndroid Build Coastguard Worker1. Initialize an empty output string _o_
494*912701f9SAndroid Build Coastguard Worker2. Loop through the elements _p_ of the array _e_ from end to beginning (backwards)
495*912701f9SAndroid Build Coastguard Worker    1. If _p_.glue isn't `END`:
496*912701f9SAndroid Build Coastguard Worker        1. break out of the loop
497*912701f9SAndroid Build Coastguard Worker    2. If _p_.divider? == false:
498*912701f9SAndroid Build Coastguard Worker        1. Prepend marker _p_.marker to the output string _o_
499*912701f9SAndroid Build Coastguard Worker    3. Set _p_.processed?=true (so we don't process this again)
3. Loop through each codepoint _i_ (in the plain text input string) from end to beginning (backwards)
501*912701f9SAndroid Build Coastguard Worker    1. Prepend _i_ to output _o_
502*912701f9SAndroid Build Coastguard Worker    2. Loop through the elements _p_ of the array _e_ from end to beginning (backwards)
503*912701f9SAndroid Build Coastguard Worker        1. If _p_.processed? == true:
504*912701f9SAndroid Build Coastguard Worker            1. Continue the inner loop  (was already processed)
505*912701f9SAndroid Build Coastguard Worker        2. If _p_.glue isn't _i_
506*912701f9SAndroid Build Coastguard Worker            1. Continue the inner loop  (wrong glue, not applicable)
        3. If _p_.divider? == true:
            1. Set _p_.processed?=true (so this divider is not matched again by a later occurrence of the same codepoint)
            2. Break out of the inner loop  (reached end of this 'glue' char)
509*912701f9SAndroid Build Coastguard Worker        4. Prepend marker _p_.marker to the output string _o_
510*912701f9SAndroid Build Coastguard Worker        5. Set _p_.processed?=true (so we don't process this again)
4. _o_ is now the output string including markers.
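
The three phases can be sketched end to end as follows. This is an illustrative reading of the steps rather than a reference implementation: the class and function names are hypothetical, `None` stands in for the `END` glue value, and codepoints are one-character strings. The sketch marks a divider entry as processed once it has terminated the inner loop, so that when the same glue codepoint (such as U+0320) occurs in two segments, each occurrence pairs with the entries from its own segment.

```python
import unicodedata
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union

@dataclass(frozen=True)
class Marker:
    id: str

@dataclass
class MarkerEntry:
    glue: Optional[str]              # a codepoint, or None standing in for END
    divider: bool
    marker: Optional[Marker] = None
    processed: bool = False

Element = Union[str, Marker]
END = None

def remove_markers(s: List[Element]) -> Tuple[str, List[MarkerEntry]]:
    """Phase 1: return the plain text and the MarkerEntry array."""
    entries: List[MarkerEntry] = []
    pending: List[Marker] = []
    plain: List[str] = []
    for item in s:
        if isinstance(item, Marker):
            pending.append(item)
            continue
        plain.append(item)
        d = unicodedata.normalize("NFD", item)
        entries.append(MarkerEntry(d[0], True))
        entries.extend(MarkerEntry(d[0], False, m) for m in pending)
        pending.clear()
        entries.extend(MarkerEntry(c, True) for c in d[1:])
    entries.append(MarkerEntry(END, True))
    entries.extend(MarkerEntry(END, False, m) for m in pending)
    return "".join(plain), entries

def add_markers(plain: str, entries: List[MarkerEntry]) -> List[Element]:
    """Phase 3: walk the processed text backwards, re-attaching markers."""
    out: List[Element] = []
    for p in reversed(entries):          # markers glued to the end of text
        if p.glue is not END:
            break
        if not p.divider:
            out.insert(0, p.marker)
        p.processed = True
    for cp in reversed(plain):
        out.insert(0, cp)
        for p in reversed(entries):
            if p.processed or p.glue != cp:
                continue
            p.processed = True           # assumption: dividers are consumed too
            if p.divider:
                break
            out.insert(0, p.marker)
    return out

def normalize_with_markers(s: List[Element]) -> List[Element]:
    plain, entries = remove_markers(s)
    return add_markers(unicodedata.normalize("NFD", plain), entries)  # Phase 2 inline
```

For instance, `normalize_with_markers(["e", "\u0300", Marker("marker"), "\u0320"])` keeps the marker immediately before U+0320 after the two combining marks are reordered.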
512*912701f9SAndroid Build Coastguard Worker
513*912701f9SAndroid Build Coastguard Worker#### Example Normalization with Markers
514*912701f9SAndroid Build Coastguard Worker
515*912701f9SAndroid Build Coastguard Worker**Example 1**
516*912701f9SAndroid Build Coastguard Worker
517*912701f9SAndroid Build Coastguard WorkerConsider this example, without markers:
518*912701f9SAndroid Build Coastguard Worker
519*912701f9SAndroid Build Coastguard Worker- `e\u{0300}\u{0320}` (input)
520*912701f9SAndroid Build Coastguard Worker- `e\u{0320}\u{0300}` (NFD)
521*912701f9SAndroid Build Coastguard Worker
522*912701f9SAndroid Build Coastguard WorkerThe combining marks are reordered.
523*912701f9SAndroid Build Coastguard Worker
If we add a marker to the same input:
527*912701f9SAndroid Build Coastguard Worker
528*912701f9SAndroid Build Coastguard Worker- `e\u{0300}\m{marker}\u{0320}` (input)
529*912701f9SAndroid Build Coastguard Worker- `e\m{marker}\u{0320}\u{0300}` (NFD)
530*912701f9SAndroid Build Coastguard Worker
531*912701f9SAndroid Build Coastguard WorkerNote that the marker is 'glued' to the _following_ character. In the above example, `\m{marker}` was 'glued' to the `\u{0320}`.
532*912701f9SAndroid Build Coastguard Worker
533*912701f9SAndroid Build Coastguard Worker**Example 2**
534*912701f9SAndroid Build Coastguard Worker
535*912701f9SAndroid Build Coastguard WorkerA second example:
536*912701f9SAndroid Build Coastguard Worker
537*912701f9SAndroid Build Coastguard Worker- `e\m{marker0}\u{0300}\m{marker1}\u{0320}\m{marker2}` (input)
538*912701f9SAndroid Build Coastguard Worker- `e\m{marker1}\u{0320}\m{marker0}\u{0300}\m{marker2}` (NFD)
539*912701f9SAndroid Build Coastguard Worker
540*912701f9SAndroid Build Coastguard WorkerHere `\m{marker2}` is 'glued' to the end of the string. However, if additional text is added such as by a subsequent keystroke (which may add an additional combining character, for example), this marker may be 'glued' to that following text.
541*912701f9SAndroid Build Coastguard Worker
542*912701f9SAndroid Build Coastguard WorkerMarkers remain in the same normalization-safe segment during normalization. Consider:
543*912701f9SAndroid Build Coastguard Worker
544*912701f9SAndroid Build Coastguard Worker**Example 3**
545*912701f9SAndroid Build Coastguard Worker
546*912701f9SAndroid Build Coastguard Worker- `e\u{0300}\m{marker1}\u{0320}a\u{0300}\m{marker2}\u{0320}` (original)
547*912701f9SAndroid Build Coastguard Worker- `e\m{marker1}\u{0320}\u{0300}a\m{marker2}\u{0320}\u{0300}` (NFD)
548*912701f9SAndroid Build Coastguard Worker
549*912701f9SAndroid Build Coastguard WorkerThere are two normalization-safe segments here:
550*912701f9SAndroid Build Coastguard Worker
551*912701f9SAndroid Build Coastguard Worker1. `e\u{0300}\m{marker1}\u{0320}`
552*912701f9SAndroid Build Coastguard Worker2. `a\u{0300}\m{marker2}\u{0320}`
553*912701f9SAndroid Build Coastguard Worker
554*912701f9SAndroid Build Coastguard WorkerNormalization (and marker rearranging) effectively occurs within each segment.  While `\m{marker1}` is 'glued' to the `\u{0320}`, it is glued within the first segment and has no effect on the second segment.
555*912701f9SAndroid Build Coastguard Worker
556*912701f9SAndroid Build Coastguard Worker### Normalization and Character Classes
557*912701f9SAndroid Build Coastguard Worker
558*912701f9SAndroid Build Coastguard WorkerIf pre-composed (non-NFD) characters are used in [character classes](#regex-like-syntax), such as `[á-é]`, these may not match as keyboard authors expect, as the U+00E1 character (á) will not occur in NFD form. Thus this may be masking serious errors in the data.
559*912701f9SAndroid Build Coastguard Worker
560*912701f9SAndroid Build Coastguard WorkerTools that process keyboard data must reject the data when character classes include non-NFD characters.
561*912701f9SAndroid Build Coastguard Worker
562*912701f9SAndroid Build Coastguard WorkerThe above should be written instead as a regex `(á|â|ã|ä|å|æ|ç|è|é)`. Alternatively, it could be written as a set variable `<set id="Example" value="á â ã ä å æ ç è é"/>` and matched as `$[Example]`.
563*912701f9SAndroid Build Coastguard Worker
564*912701f9SAndroid Build Coastguard WorkerThere is another case where there is no explicit mention of a non-NFD character, but the character class could include non-NFD characters, such as the range `[\u{0020}-\u{01FF}]`. For these, the tools should raise a warning by default.
565*912701f9SAndroid Build Coastguard Worker
566*912701f9SAndroid Build Coastguard Worker### Normalization and Reorder elements
567*912701f9SAndroid Build Coastguard Worker
568*912701f9SAndroid Build Coastguard Worker[`reorder`](#element-reorder) elements operate on NFD codepoints.
569*912701f9SAndroid Build Coastguard Worker
570*912701f9SAndroid Build Coastguard Worker### Normalization-safe Segments
571*912701f9SAndroid Build Coastguard Worker
For purposes of this algorithm, "normalization-safe segments" are defined as strings of codepoints which

1. are already in [NFD](https://www.unicode.org/reports/tr15/#Norm_Forms), and
2. begin with a character with [Canonical Combining Class](https://www.unicode.org/reports/tr44/#Canonical_Combining_Class_Values) of `0`.
576*912701f9SAndroid Build Coastguard Worker
577*912701f9SAndroid Build Coastguard WorkerSee [UAX #15 Section 9.1: Stable Code Points](https://www.unicode.org/reports/tr15/#Stable_Code_Points) for related discussion.
578*912701f9SAndroid Build Coastguard WorkerText under consideration can be segmented by locating such characters.
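
The segmentation can be sketched with `unicodedata.combining`, which returns a character's canonical combining class. The function name is hypothetical, and the sketch splits at every character of class `0`, which is the finest split consistent with the definition; an implementation may choose larger segments.

```python
import unicodedata
from typing import List

def normalization_safe_segments(text: str) -> List[str]:
    """Split text (after NFD) into segments that each begin with a ccc=0 character."""
    nfd = unicodedata.normalize("NFD", text)
    segments: List[str] = []
    for ch in nfd:
        if unicodedata.combining(ch) == 0 or not segments:
            # A starter opens a new segment. Leading combining marks, which have
            # no starter, are collected into an initial fragment.
            segments.append(ch)
        else:
            segments[-1] += ch   # combining marks stay with the preceding starter
    return segments

segments = normalization_safe_segments("e\u0300\u0320a\u0300\u0320")
```

Each returned segment can be normalized independently, which is why marker rearrangement never crosses from one segment into the next.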
579*912701f9SAndroid Build Coastguard Worker
580*912701f9SAndroid Build Coastguard Worker### Normalization and Output
581*912701f9SAndroid Build Coastguard Worker
582*912701f9SAndroid Build Coastguard WorkerOn output, text will be normalized into a specified normalization form. That form will typically be NFC, but an implementation may allow a calling application to override the choice of normalization form.
583*912701f9SAndroid Build Coastguard WorkerFor example, many platforms may request NFC as the output format. In such a case, all text emitted via the keyboard will be transformed into NFC.
584*912701f9SAndroid Build Coastguard Worker
585*912701f9SAndroid Build Coastguard WorkerExisting text in a document will only have normalization applied within a single normalization-safe segment from the caret.  The output will not contain any markers, thus any normalization is unaffected by any markers embedded within the segment.
586*912701f9SAndroid Build Coastguard Worker
587*912701f9SAndroid Build Coastguard WorkerFor example, the sequence `e\m{marker}\u{300}` would be output in NFC as `è`. The marker is removed and has no effect on the output.
588*912701f9SAndroid Build Coastguard Worker
589*912701f9SAndroid Build Coastguard Worker### Disabling Normalization
590*912701f9SAndroid Build Coastguard Worker
591*912701f9SAndroid Build Coastguard WorkerThe attribute value `normalization="disabled"` can be used to indicate that no automatic normalization is to be applied in input, matching, or output. Using this setting should be done with caution:
592*912701f9SAndroid Build Coastguard Worker
593*912701f9SAndroid Build Coastguard Worker- When this attribute value is used, all matching and output uses only the exact codepoints provided by the keyboard author.
594*912701f9SAndroid Build Coastguard Worker- The input context from the application may not be normalized, which means that the keyboard author should consider all possible combinations, including NFC, NFD, and mixed normalization in `<transform from=` attributes.
595*912701f9SAndroid Build Coastguard Worker- See [`<settings>`](#element-settings) for further details.
596*912701f9SAndroid Build Coastguard Worker
597*912701f9SAndroid Build Coastguard WorkerThe majority of the above section only applies when `normalization="disabled"` is not used.
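
To illustrate the caution above, the following sketch compares codepoints exactly, as happens when normalization is disabled. `exact_match` and `common_forms` are hypothetical helpers, matching against the end of the context is an assumption made for illustration, and only the NFC and NFD spellings are enumerated; mixed forms would need to be listed by the author separately.

```python
import unicodedata

def exact_match(from_text: str, context: str) -> bool:
    """Codepoint-for-codepoint comparison, with no normalization applied."""
    return context.endswith(from_text)

def common_forms(text: str) -> set:
    """NFC and NFD spellings of `text`; mixed forms are not enumerated here."""
    return {unicodedata.normalize(form, text) for form in ("NFC", "NFD")}

# A rule written with precomposed U+00E8 misses a decomposed context...
missed = not exact_match("\u00e8", "e\u0300")

# ...unless the author also covers the other spelling.
covered = any(exact_match(form, "e\u0300") for form in common_forms("\u00e8"))
```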
598*912701f9SAndroid Build Coastguard Worker
* * *

## Element Hierarchy

This section describes the XML elements in a keyboard layout file, beginning with the top-level element `<keyboard3>`.

### Element: keyboard3

This is the top-level element. All other elements defined below are under this element.

**Syntax**

```xml
<keyboard3 locale="…localeId">
    <!-- …definition of the layout as described by the elements defined below -->
</keyboard3>
```

> <small>
>
> Parents: _none_
>
> Children: [displays](#element-displays), [flicks](#element-flicks), [forms](#element-forms), [import](#element-import), [info](#element-info), [keys](#element-keys), [layers](#element-layers), [locales](#element-locales), [settings](#element-settings), [_special_](tr35.md#special), [transforms](#element-transforms), [variables](#element-variables), [version](#element-version)
>
> Occurrence: required, single
>
> </small>

_Attribute:_ `conformsTo` (required)

This attribute value distinguishes the keyboard from prior versions,
and it also specifies the minimum CLDR major version required.

This attribute value must be a whole number of `45` or greater. See [`cldrVersion`](tr35-info.md#version-information).

```xml
<keyboard3 … conformsTo="45"/>
```

_Attribute:_ `locale` (required)

This attribute value contains the primary locale of the keyboard using BCP 47 [Unicode locale identifiers](tr35.md#Canonical_Unicode_Locale_Identifiers), for example `"el"` for Greek. Sometimes, the locale may not specify the base language. For example, a Devanagari keyboard for many languages could be specified by the BCP 47 code `"und-Deva"`. However, it is better to list out the languages explicitly using the [`locales`](#element-locales) element.

For further details about the choice of locale ID, see [Keyboard IDs](#keyboard-ids).

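One possible reading of the advice above, sketched for a hypothetical multi-language Devanagari layout: the languages are named explicitly in `<locales>`. The choice of Hindi (`hi`) and Marathi (`mr`) is an assumption made only for this illustration.

```xml
<keyboard3 locale="und-Deva" conformsTo="45">
    …
    <locales>
        <locale id="hi"/>
        <locale id="mr"/>
    </locales>
    …
</keyboard3>
```
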
**Example** (for illustrative purposes only, not indicative of the real data)

```xml
<keyboard3 locale="ka">
    …
</keyboard3>
```

```xml
<keyboard3 locale="fr-CH-t-k0-azerty">
    …
</keyboard3>
```

* * *

### Element: import

The `import` element is used to reference another XML file so that elements are imported from
another file. The use case is to be able to import a standard set of `transform`s and similar
from the CLDR repository, especially to be able to share common information relevant to a particular script.
The intent is for each single XML file to contain all that is needed for a keyboard layout, other than required standard import data from the CLDR repository.

`<import>` can be used as a child of a number of elements (see the _Parents_ section immediately below). Multiple `<import>` elements may be used; however, `<import>` elements must come before any other sibling elements.
If two identical elements are defined, the later element takes precedence, that is, it overrides the earlier one.
Imported elements may contain other `<import>` statements. Implementations must prevent recursion, that is, each imported file may only be included once.

**Note:** Imported files do not have any indication of their normalization mode. For this reason, the keyboard author must verify that the imported file is of a compatible normalization mode. See the [`settings` element](#element-settings) for further details.

**Syntax**

```xml
<import base="cldr" path="45/keys-Zyyy-punctuation.xml"/>
```

> <small>
>
> Parents: [displays](#element-displays), [flicks](#element-flicks), [forms](#element-forms), [keyboard3](#element-keyboard3), [keys](#element-keys), [layers](#element-layers), [transformGroup](#element-transformgroup), [transforms](#element-transforms), [variables](#element-variables)
>
> Children: _none_
>
> Occurrence: optional, multiple
>
> </small>

_Attribute:_ `base`

> The base may be omitted (indicating a local import) or have the value `"cldr"`.

**Note:** `base="cldr"` is required for all `<import>` statements within keyboard files in the CLDR repository.

_Attribute:_ `path` (required)

> If `base` is `cldr`, then the `path` must start with a CLDR major version (such as `45`) representing the CLDR version to pull imports from. The imports are located in the `keyboards/import` subdirectory of the CLDR source repository.
> Implementations are not required to have all CLDR versions available to them.
>
> If `base` is omitted, then `path` is an absolute or relative file path.

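The two forms of `<import>` can be sketched side by side as follows. The local file name `shared/my-extra-keys.xml` and the key `a` are hypothetical; the CLDR path is the one used in the syntax example above. A local import like this would apply to a keyboard maintained outside the CLDR repository, since files inside it must use `base="cldr"`.

```xml
<keys>
    <!-- base omitted: path is a local (here relative) file path -->
    <import path="shared/my-extra-keys.xml"/>
    <!-- base="cldr": path starts with the CLDR major version -->
    <import base="cldr" path="45/keys-Zyyy-punctuation.xml"/>
    <!-- imports come before any other sibling elements -->
    <key id="a" output="a"/>
</keys>
```
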
**Further Examples**

```xml
<!-- in a keyboard xml file-->
…
<transforms type="simple">
    <import base="cldr" path="45/transforms-example.xml"/>
    <transform from="` " to="`" />
    <transform from="^ " to="^" />
</transforms>
…

<!-- contents of transforms-example.xml -->
<?xml version="1.0" encoding="UTF-8"?>
<transforms>
    <!-- begin imported part-->
    <transform from="`a" to="à" />
    <transform from="`e" to="è" />
    <transform from="`i" to="ì" />
    <transform from="`o" to="ò" />
    <transform from="`u" to="ù" />
    <!-- end imported part -->
</transforms>
```

**Note:** The root element, here `transforms`, is the same as
the _parent_ of the `<import/>` element. It is an error to import an XML file
whose root element is different from the parent element of the `<import/>` element.

After loading, the above example will be the equivalent of the following.

```xml
<transforms type="simple">
    <!-- begin imported part-->
    <transform from="`a" to="à" />
    <transform from="`e" to="è" />
    <transform from="`i" to="ì" />
    <transform from="`o" to="ò" />
    <transform from="`u" to="ù" />
    <!-- end imported part -->

    <!-- these lines are after the import -->
    <transform from="` " to="`" />
    <transform from="^ " to="^" />
</transforms>
```

* * *

### Element: locales

The optional `<locales>` element allows specifying additional or alternate locales.

**Syntax**

```xml
<locales>
    <locale id="…"/>
    <locale id="…"/>
</locales>
```

> <small>
>
> Parents: [keyboard3](#element-keyboard3)
>
> Children: [locale](#element-locale)
>
> Occurrence: optional, single
>
> </small>

### Element: locale

The `<locale>` element specifies an additional or alternate locale. It denotes intentional support for an extra language, not just that a keyboard incidentally supports a language’s orthography.

**Syntax**

```xml
<locale id="…id"/>
```

> <small>
>
> Parents: [locales](#element-locales)
>
> Children: _none_
>
> Occurrence: optional, multiple
>
> </small>

_Attribute:_ `id` (required)

> The [BCP 47](tr35.md#Canonical_Unicode_Locale_Identifiers) locale ID of an additional language supported by this keyboard.
> Must _not_ include the `-k0-` subtag for this additional language.

**Example**

See [Principles for Keyboard IDs](#principles-for-keyboard-ids) for discussion and further examples.

```xml
<!-- Pan Nigerian Keyboard-->
<keyboard3 locale="mul-Latn-NG-t-k0-panng">
    <locales>
        <locale id="ha"/>
        <locale id="ig"/>
        <!-- others … -->
    </locales>
</keyboard3>
```

* * *

### Element: version

This element is used to keep track of the source data version.

**Syntax**

```xml
<version number="…number"/>
```

> <small>
>
> Parents: [keyboard3](#element-keyboard3)
>
> Children: _none_
>
> Occurrence: optional, single
>
> </small>

_Attribute:_ `number` (required)

> Must be a [[SEMVER](https://semver.org)] compatible version number, such as `1.0.0` or `38.0.0-beta.11`.

_Attribute:_ `cldrVersion` (fixed by DTD)

> The CLDR specification version that is associated with this data file. This value is fixed and is inherited from the [DTD file](https://github.com/unicode-org/cldr/tree/main/keyboards/dtd) and therefore does not show up directly in the XML file.

**Example**

```xml
<keyboard3 locale="tok">
    …
    <version number="1.0.0"/>
    …
</keyboard3>
```

* * *

### Element: info

This element contains informative properties about the layout, for display in user interfaces and similar purposes.

**Syntax**

```xml
<info
      name="…name"
      author="…author"
      layout="…hint of the layout"
      indicator="…short identifier" />
```

> <small>
>
> Parents: [keyboard3](#element-keyboard3)
>
> Children: _none_
>
> Occurrence: required, single
>
> </small>

_Attribute:_ `name` (required)

> This attribute is an informative name for the keyboard. It is the only required attribute of the `<info>` element.

```xml
<keyboard3 locale="bg-t-k0-phonetic-trad">
    …
    <info name="Bulgarian (Phonetic Traditional)" />
    …
</keyboard3>
```

_Attribute:_ `author`

> The `author` attribute value contains the name of the author of the layout file.

_Attribute:_ `layout`

> The `layout` attribute describes the layout pattern, such as QWERTY, DVORAK, INSCRIPT, etc. It is typically used to distinguish various layouts for the same language.
>
> This attribute is not localized, but is an informative identifier for implementation use.

_Attribute:_ `indicator`

> The `indicator` attribute describes a short string to be used in the currently selected layout indicator, such as `US` or `SI9`.
> Typically, this is shown on a UI element that allows switching keyboard layouts and/or input languages.
>
> This attribute is not localized.

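For illustration, an `<info>` element using all four attributes might look like the sketch below. Every value here (the locale, name, author, layout, and indicator) is hypothetical and not taken from real CLDR data.

```xml
<keyboard3 locale="sl" conformsTo="45">
    …
    <info name="Slovenian (example)"
          author="Example Author"
          layout="QWERTZ"
          indicator="SI" />
    …
</keyboard3>
```
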
* * *

### Element: settings

This element is used to keep track of layout-specific settings for implementations. It is optional and may not be present in a given layout. These settings reflect the normal practice by the implementation. However, an implementation using the data may customize the behavior.

**Syntax**

```xml
<settings normalization="disabled" />
```

> <small>
>
> Parents: [keyboard3](#element-keyboard3)
>
> Children: _none_
>
> Occurrence: optional, single
>
> </small>

_Attribute:_ `normalization="disabled"`

> The presence of this attribute indicates that normalization will not be applied to the input text, matching, or the output.
> See [Normalization](#normalization) for additional details.
>
> **Note**: While this attribute is allowed by the specification, it should be used with caution.

**Example**

```xml
<keyboard3 locale="bg">
    …
    <settings normalization="disabled" />
    …
</keyboard3>
```

* * *

### Element: displays

The `displays` element consists of a list of [`display`](#element-display) subelements.

**Syntax**

```xml
<displays>
    <display … />
    <display … />
    …
</displays>
```

> <small>
>
> Parents: [keyboard3](#element-keyboard3)
>
> Children: [display](#element-display), [displayOptions](#element-displayoptions), [_special_](tr35.md#special)
>
> Occurrence: optional, single
>
> </small>

* * *

980*912701f9SAndroid Build Coastguard Worker### Element: display
981*912701f9SAndroid Build Coastguard Worker
982*912701f9SAndroid Build Coastguard WorkerThe `display` elements can be used to describe what is to be displayed on the keytops for various keys. For the most part, such explicit information is unnecessary since the `@to` element from the `keys/key` element will be used for keytop display.
983*912701f9SAndroid Build Coastguard Worker
984*912701f9SAndroid Build Coastguard Worker- Some characters, such as diacritics, do not display well on their own.
985*912701f9SAndroid Build Coastguard Worker- Another useful scenario is where there are doubled diacritics, or multiple characters with spacing issues.
986*912701f9SAndroid Build Coastguard Worker- Finally, the `display` element provides a way to specify the keytop for keys which do not otherwise produce output. Keys which switch layers using the `@layerId` attribute typically do not produce output.
987*912701f9SAndroid Build Coastguard Worker
988*912701f9SAndroid Build Coastguard Worker> Note: `displays` elements are designed to be shared across many different keyboard layout descriptions, and imported with `<import>` where needed.
989*912701f9SAndroid Build Coastguard Worker
990*912701f9SAndroid Build Coastguard Worker#### Non-spacing marks on keytops
991*912701f9SAndroid Build Coastguard Worker
992*912701f9SAndroid Build Coastguard WorkerFor non-spacing marks, U+25CC `◌` is used as a base. It is an error to use a nonspacing character without a base in the `display` attribute. For example, `display="\u{0303}"` would produce an error.
993*912701f9SAndroid Build Coastguard Worker
994*912701f9SAndroid Build Coastguard WorkerA key which outputs a combining tilde (U+0303) could be represented as either of the following:
995*912701f9SAndroid Build Coastguard Worker
996*912701f9SAndroid Build Coastguard Worker```xml
997*912701f9SAndroid Build Coastguard Worker    <display output="\u{0303}" display="◌̃" />  <!-- \u{25CC} \u{0303}-->
998*912701f9SAndroid Build Coastguard Worker    <display output="\u{0303}" display="\u{25cc}\u{0303}" />  <!-- also acceptable -->
999*912701f9SAndroid Build Coastguard Worker```
1000*912701f9SAndroid Build Coastguard Worker
1001*912701f9SAndroid Build Coastguard WorkerThis way, a key which outputs a combining tilde (U+0303) will be represented as `◌̃` (a tilde on a dotted circle).
1002*912701f9SAndroid Build Coastguard Worker
1003*912701f9SAndroid Build Coastguard WorkerUsers of some scripts/languages may prefer a different base than U+25CC. See  [`<displayOptions baseCharacter=…/>`](#element-displayoptions).
1004*912701f9SAndroid Build Coastguard Worker
1005*912701f9SAndroid Build Coastguard Worker
1006*912701f9SAndroid Build Coastguard Worker**Syntax**
1007*912701f9SAndroid Build Coastguard Worker
1008*912701f9SAndroid Build Coastguard Worker```xml
1009*912701f9SAndroid Build Coastguard Worker<display output="…string" display="…string" />
1010*912701f9SAndroid Build Coastguard Worker```
1011*912701f9SAndroid Build Coastguard Worker
1012*912701f9SAndroid Build Coastguard Worker> <small>
1013*912701f9SAndroid Build Coastguard Worker>
1014*912701f9SAndroid Build Coastguard Worker> Parents: [displays](#element-displays)
1015*912701f9SAndroid Build Coastguard Worker>
1016*912701f9SAndroid Build Coastguard Worker> Children: _none_
1017*912701f9SAndroid Build Coastguard Worker>
1018*912701f9SAndroid Build Coastguard Worker> Occurrence: required, multiple
1019*912701f9SAndroid Build Coastguard Worker>
1020*912701f9SAndroid Build Coastguard Worker> </small>
1021*912701f9SAndroid Build Coastguard Worker
1022*912701f9SAndroid Build Coastguard WorkerOne of the `output` or `id` attributes is required.
1023*912701f9SAndroid Build Coastguard Worker
1024*912701f9SAndroid Build Coastguard Worker**Note**: There is currently no way to indicate a custom display for a key without output (i.e. without a `to=` attribute), nor is there a way to indicate that such a key has a standardized identity (e.g. that a key should be identified as a “Shift”). These may be addressed in future versions of this standard.

_Attribute:_ `output` (optional)

> Specifies the character or character sequence from the `keys/key` element that is to have a special display.
> This attribute may be escaped with `\u` notation, see [Escaping](#escaping).
> The `output` attribute may also contain the `\m{…}` syntax to reference a marker. See [Markers](#markers). Implementations may highlight a displayed marker, such as with a lighter text color, or a yellow highlight.
> String variables may be substituted. See [String variables](#element-string).

_Attribute:_ `keyId` (optional)

> Specifies the `key` id. This is useful for keys which do not produce any output (no `output=` value), such as a shift key.
>
> Must match `[A-Za-z0-9][A-Za-z0-9_-]*`

_Attribute:_ `display` (required)

> Specifies the character sequence that should be displayed on the keytop for any key that generates the `@output` sequence or has the `@keyId`. (It is an error if the value of the `display` attribute is the same as the value of the `output` attribute, since this would be an extraneous entry.)
>
> String variables may be substituted. See [String variables](#element-string).
>
> This attribute may be escaped with `\u` notation, see [Escaping](#escaping).

**Example**

```xml
<keyboard3>
    <keys>
        <key id="grave" output="\u{0300}" /> <!-- combining grave -->
        <key id="marker" output="\m{acute}" /> <!-- generates a marker -->
        <key id="numeric" layerId="numeric" /> <!-- changes layers -->
    </keys>
    <displays>
        <display output="\u{0300}" display="ˋ" /> <!-- \u{02CB} -->
        <display keyId="numeric"  display="#" /> <!-- display the layer shift key as # -->
        <display output="\m{acute}" display="´" /> <!-- Display \m{acute} as ´ -->
    </displays>
</keyboard3>
```

To allow `displays` elements to be shared across keyboards, there is no requirement that the `@output` or `@keyId` in a `display` element match any `@output` or `@id` in any `keys/key` element in the keyboard description.
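String variable substitution in `output` and `display` can be sketched as follows. This is an illustrative fragment only: it assumes the `${…}` reference syntax and the `variables`/`string` elements described under [String variables](#element-string), and the ids `zwnj` and `zwnj-key` are made up for this example.

```xml
<!-- fragments of a keyboard3 document; the wrapper and other required elements are omitted -->
<variables>
    <!-- hypothetical variable holding U+200C ZERO WIDTH NON-JOINER -->
    <string id="zwnj" value="\u{200C}" />
</variables>

<keys>
    <!-- hypothetical key that outputs the variable's value -->
    <key id="zwnj-key" output="${zwnj}" />
</keys>

<displays>
    <!-- the invisible U+200C is shown on the keytop with the visible label "ZWNJ" -->
    <display output="${zwnj}" display="ZWNJ" />
</displays>
```

Because the `display` value differs from the `output` value, this entry is not extraneous.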

* * *

### Element: displayOptions

The `displayOptions` element is an optional singleton element providing additional settings for this `displays` element. It is structured so as to provide for future flexibility in such options.

**Syntax**

```xml
<displays>
    <display …/>
    <displayOptions baseCharacter="x"/>
</displays>
```

> <small>
>
> Parents: [displays](#element-displays)
>
> Children: _none_
>
> Occurrence: optional, single
>
> </small>

_Attribute:_ `baseCharacter` (optional)

**Note:** At present, this is the only option settable in the `displayOptions`.

> Some scripts/languages may prefer a different base than U+25CC.
> For Lao, for example, `x` is often used as a base instead of `◌`.
> Setting `baseCharacter="x"` (for example) is a _hint_ to the implementation which
> requests U+25CC to be substituted with `x` on display.
> As a hint, the implementation may ignore this option.
>
> **Note** that not all characters will be suitable as bases for combining marks.

This attribute may be escaped with `\u` notation, see [Escaping](#escaping).
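A minimal sketch of the hint in use, assuming a keytop label that is written with an explicit U+25CC base. The Lao vowel sign U+0EB4 is chosen only for illustration.

```xml
<displays>
    <!-- keytop label: dotted circle U+25CC followed by LAO VOWEL SIGN I U+0EB4 -->
    <display output="\u{0EB4}" display="\u{25CC}\u{0EB4}" />
    <!-- hint: an implementation may render the label above with "x" in place of U+25CC -->
    <displayOptions baseCharacter="x" />
</displays>
```

An implementation that ignores the hint would simply show the label with the dotted circle.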

* * *

### Element: keys

This element defines the properties of all possible keys via [`<key>` elements](#element-key) used in all layouts.
It is a “bag of keys” without specifying any ordering or relation between the keys.
There is only a single `<keys>` element in each layout.

**Syntax**

```xml
<keys>
    <key … />
    <key … />
    <key … />
</keys>
```

> <small>
>
> Parents: [keyboard3](#element-keyboard3)
>
> Children: [key](#element-key)
>
> Occurrence: optional, single
>
> </small>

* * *

### Element: key

This element defines a mapping between an abstract key and its output. This element must have the `keys` element as its parent. The `key` element is referenced by the `keys=` attribute of the [`row` element](#element-row).

**Syntax**

```xml
<key
 id="…keyId"
 flickId="…flickId"
 gap="true"
 output="…string"
 longPressKeyIds="…list of keyIds"
 longPressDefaultKeyId="…keyId"
 multiTapKeyIds="…list of keyIds"
 stretch="true"
 layerId="…layerId"
 width="…number"
 />
```

> <small>
>
> Parents: [keys](#element-keys)
>
> Children: _none_
>
> Occurrence: optional, multiple
>
> </small>

**Note**: The `id` attribute is required.

**Note**: _at least one of_ `layerId`, `gap`, or `output` is required.
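The two notes above can be illustrated with a few keys that each satisfy the requirement in a different way. This is only a sketch; the ids and the layer name `shift` are hypothetical.

```xml
<keys>
    <!-- satisfies the rule through output -->
    <key id="e-acute" output="é" />
    <!-- no output: satisfies the rule through layerId -->
    <key id="shift" layerId="shift" />
    <!-- no output: satisfies the rule through gap -->
    <key id="halfgap" gap="true" width="0.5" />
</keys>
```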

_Attribute:_ `id` (required)

> The `id` attribute uniquely identifies the key. It is an NMTOKEN. It can be (but needn't be) the key name (a, b, c, A, B, C, …), or any other valid token (e-acute, alef, alif, alpha, …).
>
> In the future, this attribute’s definition is expected to be updated to align with [UAX#31](https://www.unicode.org/reports/tr31/).

_Attribute:_ `flickId="…flickId"` (optional)

> The `flickId` attribute indicates that this key makes use of a [`flick`](#element-flick) set with the specified id.

_Attribute:_ `gap="true"` (optional)

> The `gap` attribute indicates that this key does not have any appearance, but instead leaves a "gap" in the row. It can be used with `width` to set the size of the gap as a number of key widths.
> Such elements may not be referred to by `display` elements, nor may they have any of the following attributes: `flickId`, `longPressKeyIds`, `longPressDefaultKeyId`, `multiTapKeyIds`, `layerId`, or `output`.

```xml
<key id="mediumgap" gap="true" width="1.5"/>
```

_Attribute:_ `output`

> The `output` attribute value contains the sequence of characters that is emitted when pressing this particular key. Control characters, whitespace (other than the regular space character) and combining marks in this attribute are escaped using the `\u{…}` notation. More than one key may produce the same output.
>
> The `output` attribute may also contain the `\m{…markerId}` syntax to insert a marker. See the definition of [markers](#markers).

_Attribute:_ `longPressKeyIds="…list of keyIds"` (optional)

> A space-separated ordered list of `key` element ids, specifying the keys which can be emitted by "long-pressing" this key. This feature is prominent in mobile devices.
>
> In a list of keys specified by `longPressKeyIds`, the key matching the `longPressDefaultKeyId` attribute (if present) specifies the default long-press target, which could be different than the first element. It is an error if the `longPressDefaultKeyId` key is not in the `longPressKeyIds` list.
>
> Implementations shall ignore any gestures (such as flick, multiTap, longPress) defined on keys in the `longPressKeyIds` list.
>
> For example, if the default key is a key whose [display](#element-displays) value is `{`, an implementation might render the key as follows:
>
> ![keycap hint](images/keycapHint.png)
>
> _Example:_
> - pressing the `o` key will produce `o`
> - holding down the key will produce a list `ó`, `{` (where `{` is the default and produces a marker)
>
> ```xml
> <displays>
>    <display output="\m{marker}" display="{" />
> </displays>
>
> <keys>
>    <key id="o" output="o" longPressKeyIds="o-acute marker" longPressDefaultKeyId="marker" />
>    <key id="o-acute" output="ó"/>
>    <key id="marker" output="\m{marker}" />
> </keys>
> ```

_Attribute:_ `longPressDefaultKeyId="…keyId"` (optional)

> Specifies the default key, by id, in a list of long-press keys. See the discussion of `longPressKeyIds`, above.

_Attribute:_ `multiTapKeyIds` (optional)

> A space-separated ordered list of `key` element ids, where each successive key in the list is produced by one further quick tap, counting from the second tap (the first tap produces the key itself, as in the example below).
> It is an error for a key to reference itself in the `multiTapKeyIds` list.
>
> Implementations shall ignore any gestures (such as flick, multiTap, longPress) defined on keys in the `multiTapKeyIds` list.
>
> _Example:_
> - first tap on the key will produce “a”
> - two taps on the key will produce “bb”
> - three taps on the key will produce “c”
> - four taps on the key will produce “d”
>
> ```xml
> <keys>
>    <key id="a" output="a" multiTapKeyIds="bb c d" />
>    <key id="bb" output="bb" />
>    <key id="c" output="c" />
>    <key id="d" output="d" />
> </keys>
> ```

**Note**: Behavior past the end of the multiTap list is implementation specific.

_Attribute:_ `stretch="true"` (optional)

> The `stretch` attribute indicates that a touch layout may stretch this key to fill available horizontal space on the row.
> This is used, for example, on the spacebar. Note that `stretch=` is ignored for hardware layouts.

_Attribute:_ `layerId="shift"` (optional)

> The `layerId` attribute indicates that this key switches to another `layer` with the specified id (such as `<layer id="shift"/>` in this example).
> Note that a key may have both a `layerId=` and an `output=` attribute, indicating that the key outputs _prior_ to switching layers.
> Also note that `layerId=` is ignored for hardware layouts: their shifting is controlled via
> the modifier keys.
>
> This attribute is an NMTOKEN.
>
> In the future, this attribute’s definition is expected to be updated to align with [UAX#31](https://www.unicode.org/reports/tr31/).
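As a hedged illustration of layer switching on a touch layout, the fragment below defines one key that only switches layers and one that outputs a character before switching back. The `layers`, `layer`, and `row` elements and the `formId` attribute are defined elsewhere in this specification; the key ids and layer ids here are hypothetical.

```xml
<keys>
    <!-- switches to the "shift" layer without producing output -->
    <key id="shift" layerId="shift" />
    <!-- outputs "." and then switches back to the "base" layer -->
    <key id="period-then-base" output="." layerId="base" />
</keys>

<layers formId="touch">
    <layer id="base">
        <row keys="a b c shift" />
    </layer>
    <layer id="shift">
        <row keys="A B C period-then-base" />
    </layer>
</layers>
```

The keys `a`, `b`, `c`, `A`, `B`, and `C` are not declared here because they are among the implied keys.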

_Attribute:_ `width="1.2"` (optional, default "1.0")

> The `width` attribute indicates that this key has a different width than other keys, by the specified number of key widths.

```xml
<key id="wide-a" output="a" width="1.2"/>
<key id="wide-gap" gap="true" width="2.5"/>
```

##### Implied Keys

Not all keys need to be listed explicitly.  The following two can be assumed to already exist:

```xml
<key id="gap" gap="true" width="1"/>
<key id="space" output=" " stretch="true" width="1"/>
```

In addition, these 62 keys, comprising 10 digit keys, 26 Latin lower-case keys, and 26 Latin upper-case keys, where the `id` is the same as the `output`, are assumed to exist:

```xml
<key id="0" output="0"/>
<key id="1" output="1"/>
<key id="2" output="2"/>
…
<key id="A" output="A"/>
<key id="B" output="B"/>
<key id="C" output="C"/>
…
<key id="a" output="a"/>
<key id="b" output="b"/>
<key id="c" output="c"/>
…
```

These implied keys are available in a data file named `keyboards/import/keys-Latn-implied.xml` in the CLDR distribution for the convenience of implementations.

Thus, the implied keys behave as if the following import were present.

```xml
<keyboard3>
    <keys>
        <import base="cldr" path="45/keys-Latn-implied.xml" />
    </keys>
</keyboard3>
```

**Note:** All implied keys may be overridden, as with all other imported data items. See the [`import`](#element-import) element for more details.
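For instance, a keyboard might redefine the implied `a` key so that it also offers a long-press alternative. This is a sketch that assumes an explicit definition with the same `id` takes precedence over the implied one, as described for `import`; `a-acute` is a hypothetical key id.

```xml
<keys>
    <!-- overrides the implied <key id="a" output="a"/> -->
    <key id="a" output="a" longPressKeyIds="a-acute" />
    <key id="a-acute" output="á" />
</keys>
```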

* * *

### Element: flicks

The `flicks` element is a collection of `flick` elements.

> <small>
>
> Parents: [keyboard3](#element-keyboard3)
>
> Children: [flick](#element-flick), [import](#element-import), [_special_](tr35.md#special)
>
> Occurrence: optional, single
>
> </small>

* * *

#### Element: flick

The `flick` element is used to generate results from a "flick" of the finger on a mobile device.

**Syntax**

```xml
<keyboard3>
    <keys>
        <key id="a" flickId="a-flicks" output="a" />
    </keys>
    <flicks>
        <flick id="a-flicks">
            <flickSegment … />
            <flickSegment … />
            <flickSegment … />
        </flick>
    </flicks>
</keyboard3>
```

> <small>
>
> Parents: [flicks](#element-flicks)
>
> Children: [flickSegment](#element-flicksegment), [_special_](tr35.md#special)
>
> Occurrence: optional, multiple
>
> </small>

_Attribute:_ `id` (required)

> The `id` attribute identifies the flick. It can be any NMTOKEN.
>
> The `id` attribute on `flick` elements is distinct from the `id` attribute on `key` elements.
> For example, it is permissible to have both `<key id="a" />` and
> `<flick id="a" />`, which are two unrelated elements.
>
> In the future, this attribute’s definition is expected to be updated to align with [UAX#31](https://www.unicode.org/reports/tr31/).

* * *

#### Element: flickSegment

> <small>
>
> Parents: [flick](#element-flick)
>
> Children: _none_
>
> Occurrence: required, multiple
>
> </small>

_Attribute:_ `directions` (required)

> The `directions` attribute value is a space-delimited list of keywords that describe a path, currently restricted to the cardinal and intercardinal directions `{n e s w ne nw se sw}`.

_Attribute:_ `keyId` (required)

> The `keyId` attribute value identifies the `key` produced as the result of (one or more) flicks.
>
> Implementations shall ignore any gestures (such as flick, multiTap, longPress) defined on the key specified by `keyId`.

**Example**

In this example, a flick to the northeast and then south produces `Å`.

```xml
<keys>
    <key id="something" flickId="a" output="Something" />
    <key id="A-ring" output="Å" />
</keys>

<flicks>
    <flick id="a">
        <flickSegment directions="ne s" keyId="A-ring" />
    </flick>
</flicks>
```
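A single `flick` may hold several `flickSegment` elements, one per gesture path. The following sketch is illustrative only: the ids are hypothetical, and the key `A` is assumed to come from the implied keys.

```xml
<keys>
    <key id="a" flickId="a-flicks" output="a" />
    <key id="a-acute" output="á" />
    <key id="A-ring" output="Å" />
</keys>

<flicks>
    <flick id="a-flicks">
        <!-- flick north: upper-case A -->
        <flickSegment directions="n" keyId="A" />
        <!-- flick east: á -->
        <flickSegment directions="e" keyId="a-acute" />
        <!-- flick northeast, then south: Å -->
        <flickSegment directions="ne s" keyId="A-ring" />
    </flick>
</flicks>
```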

* * *

### Element: forms

This element contains a set of `form` elements, each of which defines the layout of a particular hardware form.

> <small>
>
> Parents: [keyboard3](#element-keyboard3)
>
> Children: [import](#element-import), [form](#element-form), [_special_](tr35.md#special)
>
> Occurrence: optional, single
>
> </small>

**Syntax**

```xml
<forms>
    <form id="iso">
        <!-- … -->
    </form>
    <form id="us">
        <!-- … -->
    </form>
</forms>
```
1446*912701f9SAndroid Build Coastguard Worker
1447*912701f9SAndroid Build Coastguard Worker* * *
1448*912701f9SAndroid Build Coastguard Worker
1449*912701f9SAndroid Build Coastguard Worker### Element: form
1450*912701f9SAndroid Build Coastguard Worker
This element defines the layout of a particular hardware form.

> *Note:* Most keyboards will not need to use this element directly, and the CLDR repository will not accept keyboards which define a custom `form` element. This element is provided for two reasons:

1. To formally specify the standard hardware arrangements used with CLDR for implementations. Implementations can verify the arrangement, and validate keyboards against the number of rows and the number of keys per row.

2. To allow a way to customize the scancode layout for keyboards not intended to be included in the common CLDR repository.

See [Implied Form Values](#implied-form-values), below.

> <small>
>
> Parents: [forms](#element-forms)
>
> Children: [scanCodes](#element-scancodes), [_special_](tr35.md#special)
>
> Occurrence: optional, multiple
>
> </small>

_Attribute:_ `id` (required)

> This attribute specifies the form id. The value may not be `touch`.
>
> Must match `[A-Za-z0-9][A-Za-z0-9_-]*`

***Syntax***

```xml
<form id="us">
    <scanCodes codes="00 01 02"/>
    <scanCodes codes="03 04 05"/>
</form>
```

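The two constraints on the `id` attribute above (the pattern, and the exclusion of `touch`) can be checked mechanically. The following is a minimal sketch only; the function name `is_valid_form_id` is hypothetical and not part of this specification.

```python
import re

# Pattern copied from the `id` attribute description above.
FORM_ID = re.compile(r"[A-Za-z0-9][A-Za-z0-9_-]*")


def is_valid_form_id(form_id: str) -> bool:
    """Return True if form_id matches the pattern and is not the reserved value `touch`."""
    return form_id != "touch" and FORM_ID.fullmatch(form_id) is not None
```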
##### Implied Form Values

There is an implied set of `<form>` elements corresponding to the default forms. Implementations must therefore behave as if the following import statement were present:

```xml
<keyboard3>
    <forms>
        <import base="cldr" path="45/scanCodes-implied.xml" /> <!-- the version will match the current conformsTo of the file -->
    </forms>
</keyboard3>
```

Here is a summary of the implied form elements. Keyboards included in the CLDR Repository must only use these `formId=` values and may not override the scanCodes.

> - `touch` - Touch (non-hardware) layout.
> - `abnt2` - Brazilian 103 key ABNT2 layout (iso + extra key near right shift)
> - `iso` - European 102 key layout (extra key near left shift)
> - `jis` - Japanese 109 key layout
> - `us` - ANSI 101 key layout
> - `ks` - Korean KS layout

* * *

### Element: scanCodes

This element contains a keyboard row, and defines the scan codes for the non-frame keys in that row.

> <small>
>
> Parents: [form](#element-form)
>
> Children: none
>
> Occurrence: required, multiple
>
> </small>

_Attribute:_ `codes` (required)

> The `codes` attribute is a space-separated list of 2-digit hex bytes, each representing a scan code.

**Syntax**

```xml
<scanCodes codes="29 02 03 04 05 06 07 08 09 0A 0B 0C 0D" />
```

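As an illustration of the `codes` format, the sketch below splits the attribute value on whitespace and converts each 2-digit hex byte to an integer. The helper name `parse_scan_codes` is hypothetical, and rejecting malformed tokens with an exception is an assumption of this sketch rather than a requirement stated here.

```python
import re

# A single scan code: exactly two hexadecimal digits.
HEX_BYTE = re.compile(r"[0-9A-Fa-f]{2}")


def parse_scan_codes(codes: str) -> list[int]:
    """Convert a `codes` attribute value into a list of integer scan codes."""
    values = []
    for token in codes.split():
        if HEX_BYTE.fullmatch(token) is None:
            raise ValueError(f"not a 2-digit hex byte: {token!r}")
        values.append(int(token, 16))
    return values


# The row from the syntax example above.
row = parse_scan_codes("29 02 03 04 05 06 07 08 09 0A 0B 0C 0D")
```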
* * *

### Element: layers

This element contains a set of `layer` elements with a specific physical form factor, whether
hardware or touch layout.

> <small>
>
> Parents: [keyboard3](#element-keyboard3)
>
> Children: [import](#element-import), [layer](#element-layer), [_special_](tr35.md#special)
>
> Occurrence: required, multiple
>
> </small>

- At least one `layers` element is required.

_Attribute:_ `formId` (required)

> This attribute specifies the physical layout of a hardware keyboard,
> or that the form is a `touch` layout.
>
> When using an on-screen touch keyboard, if the keyboard does not specify a `<layers formId="touch">`
> element, a `<layers formId="…formId">` element can be used as a fallback alternative.
> If there is no hardware form, the implementation may need
> to choose a different keyboard file, or use some other fallback behavior when using a
> hardware keyboard.
>
> Because a hardware keyboard facilitates non-trivial amounts of text input,
> and many touch devices can also be connected to a hardware keyboard, it
> is recommended to always have a hardware (non-touch) form.
>
> Multiple `<layers formId="touch">` elements are allowed with distinct `minDeviceWidth` values.
> At most one hardware (non-`formId="touch"`) `<layers>` element is allowed. If a different key arrangement is desired between, for example, `us` and `iso` formats, these should be separated into two different keyboards.
>
> The typical keyboard author will be designing a keyboard based on their circumstances and the hardware that they are using. So, for example, if they are in South East Asia, they will almost certainly be using a 101-key hardware keyboard with US key caps. So we want them to be able to reference that (`<layers formId="us">`) in their design, rather than having to work with an unfamiliar form.
>
> A mismatch between the hardware layout in the keyboard file and the actual hardware used by the user could result in some keys being inaccessible to the user if their hardware cannot generate the scancodes corresponding to the layout specified by the `formId=` attribute. Such keys could be accessed only via an on-screen keyboard utility. Conversely, if the user's hardware has keys that are not present in the specified `formId=`, those keys will have no function when pressed.
>
> The value of the `formId=` attribute may be `touch`, or correspond to a `form` element. See [`form`](#element-form).

_Attribute:_ `minDeviceWidth`

> This attribute specifies the minimum required width, in millimeters (mm), of the touch surface. The `layers` entry with the greatest matching width will be selected. This attribute is intended for `formId="touch"`, but is supported for hardware forms.
>
> This must be a whole number between 1 and 999, inclusive.

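The `minDeviceWidth` selection rule can be sketched as follows. This is an illustration under stated assumptions: "matching" is read as "the entry's minimum width does not exceed the device width", entries without the attribute are not modeled, and the function name `select_min_device_width` is hypothetical.

```python
from typing import Optional


def select_min_device_width(min_widths: list[int], device_width_mm: int) -> Optional[int]:
    """Return the greatest minDeviceWidth that the device satisfies, or None if none match."""
    # Assumption: an entry matches when its minimum width is not larger than the device.
    matching = [width for width in min_widths if width <= device_width_mm]
    return max(matching, default=None)
```

For example, with entries declaring 1, 100 and 300 mm, a 150 mm touch surface would select the 100 mm entry under this reading.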
### Element: layer

A `layer` element describes the configuration of keys on a particular layer of a keyboard. It contains one or more `row` elements to describe which keys exist in each row.

**Syntax**

```xml
<layer id="…layerId" modifiers="…modifier modifier, …modifier modifier, …">
    <row …/>
    <row …/>
</layer>
```

> <small>
>
> Parents: [layers](#element-layers)
>
> Children: [row](#element-row), [_special_](tr35.md#special)
>
> Occurrence: optional, multiple
>
> </small>

_Attribute:_ `id` (required for `touch`)

> The `id` attribute identifies the layer for touch layouts. This identifier is the target for layer switching, as referenced by the `layerId=` attribute on the [`<key>`](#element-key) element.
> Touch layouts must have one `layer` with `id="base"` to serve as the base layer.
>
> Must match `[A-Za-z0-9][A-Za-z0-9_-]*`

_Attribute:_ `modifiers` (required for `hardware`)

> This has two roles. It acts as an identifier for the `layer` element for hardware keyboards (in the absence of the `id=` attribute) and also provides the linkage from the hardware modifiers into the correct `layer`.
>
> For hardware layouts, the use of `@modifiers` as an identifier for a layer is sufficient since it is always unique among the set of `layer` elements in each `form`.
>
> This attribute value is a list of lists. It is a comma-separated (`,`) list of modifier sets, and each modifier set is a space-separated list of modifier components.
>
> Each modifier component must match `[A-Za-z0-9]+`. Extra whitespace is ignored.
>
> To indicate that no modifiers apply, the reserved name of `none` is used.

**Syntax**

```xml
<layer id="base"        modifiers="none">
    <row keys="a" />
</layer>

<layer id="upper"       modifiers="shift">
    <row keys="A" />
</layer>

<layer id="altgr"       modifiers="altR">
    <row keys="a-umlaut" />
</layer>

<layer id="upper-altgr" modifiers="altR shift">
    <row keys="A-umlaut" />
</layer>
```

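The "list of lists" structure of the `modifiers` attribute can be sketched as a small parser. This is not a normative algorithm: the name `parse_modifiers` is hypothetical, and treating an empty modifier set (for example, two adjacent commas) as an error is an assumption of this sketch.

```python
import re

# Pattern for one modifier component, copied from the attribute description.
COMPONENT = re.compile(r"[A-Za-z0-9]+")


def parse_modifiers(value: str) -> list[frozenset[str]]:
    """Split a `modifiers` value into modifier sets, each a set of components."""
    modifier_sets = []
    for part in value.split(","):
        components = part.split()  # str.split() discards extra whitespace
        if not components:
            raise ValueError("empty modifier set")  # assumption of this sketch
        for component in components:
            if COMPONENT.fullmatch(component) is None:
                raise ValueError(f"invalid modifier component: {component!r}")
        modifier_sets.append(frozenset(components))
    return modifier_sets
```

Using sets for the components reflects that the order of components within a modifier set carries no meaning.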
#### Layer Modifier Sets

The `@modifiers` attribute value contains one or more Layer Modifier Sets, separated by commas.
For example, in the element `<layer … modifiers="ctrlL altL, altR" …>` the attribute value consists of two sets:

- `ctrlL altL` (two components)
- `altR` (one component)

The order of the sets and the order of the components within each set are not significant. However, for clarity in reading, the canonical order within a set is the order listed in [Layer Modifier Components](#layer-modifier-components); the canonical order for the sets should be first by the cardinality of the sets (least first), then alphabetical.

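The canonical ordering described above can be sketched as follows. The component order is taken from the Layer Modifier Components list in this document; comparing sets "alphabetically" by their space-joined text is an assumption of this sketch, and the name `canonical_modifiers` is hypothetical.

```python
# Order of components as listed under "Layer Modifier Components".
COMPONENT_ORDER = [
    "none", "alt", "altL", "altR", "caps",
    "ctrl", "ctrlL", "ctrlR", "shift", "other",
]


def canonical_modifiers(value: str) -> str:
    """Rewrite a `modifiers` value in canonical order."""
    modifier_sets = []
    for part in value.split(","):
        # Within a set: follow the documented component order.
        # list.index raises ValueError for an unknown component.
        components = sorted(part.split(), key=COMPONENT_ORDER.index)
        modifier_sets.append(components)
    # Across sets: fewest components first, then by text (assumed tie-breaker).
    modifier_sets.sort(key=lambda components: (len(components), " ".join(components)))
    return ", ".join(" ".join(components) for components in modifier_sets)
```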
#### Layer Modifier Components

Within a Layer Modifier Set, the following modifier components can be used, separated by spaces.

 - `none` (no modifier)
 - `alt`
 - `altL`
 - `altR`
 - `caps`
 - `ctrl`
 - `ctrlL`
 - `ctrlR`
 - `shift`
 - `other` (matches if no other layers match)

1. `alt` in this specification is referred to on some platforms as "opt" or "option".

2. `none` and `other` may not be combined with any other components.

#### Modifier Left- and Right- keys

1. `L` or `R` indicates a left- or right- side modifier only (such as `altL`)
 whereas `alt` indicates _either_ left or right alt key (that is, `altL` or `altR`). `ctrl` indicates either left or right ctrl key (that is, `ctrlL` or `ctrlR`).

2. Keyboard implementations must warn if a keyboard mixes `alt` with `altL`/`altR`, or `ctrl` with `ctrlL`/`ctrlR`.

3. Left- and right- side modifiers may not be mixed together in a single `modifiers` attribute value, so neither `altL ctrlR` nor `altL altR` is allowed.

4. `shift` indicates either shift key. The left and right shift keys are not distinguishable in this specification.

#### Layer Modifier Matching

Layers are matched exactly based on the modifier keys which are down. For example:

- `none` as a modifier will only match if *all* of the keys `caps`, `alt`, `ctrl` and `shift` are up.

- `alt` as a modifier will only match if either `alt` key is down, *and* `caps`, `ctrl`, and `shift` are up.

- `altL ctrl` as a modifier will only match if the left `alt` is down, either `ctrl` is down, *and* `shift` and `caps` are up.

- `other` as a modifier will match if no other layers match.

Multiple modifier sets are separated by commas. For example, `none, shift caps` will match either no modifiers *or* shift and caps. `ctrlL altL, altR` will match either left-control and left-alt, *or* right-alt.

Keystrokes must be ignored when there is neither a layer that explicitly matches nor a layer with `other`. For example, if there is a `ctrl` layer and a `shift` layer, but neither a `ctrl shift` layer nor an `other` layer, no output will result from `ctrl shift X`.

Layers are not allowed to overlap in their matching. For example, the keyboard author will receive an error if one layer specifies `alt shift` and another layer specifies `altR shift`.

There is one special case: the `other` layer matches if and only if no other layer matches. Thus logically the `other` layer is matched after all other layers have been checked.

Because there is no overlap allowed between layers, the order of `<layer>` elements is not significant.

> Note: The modifier syntax may be enhanced in the future, but will remain backwards compatible with the syntax described here.

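The matching rules above can be sketched in code. This is an illustration under assumptions, not a normative algorithm: the pressed state is modeled as a set drawn from `altL`, `altR`, `ctrlL`, `ctrlR`, `shift` and `caps`; a side-specific component such as `altL` is assumed to require that the opposite-side key is up; and the names `set_matches` and `find_layer` are hypothetical.

```python
from typing import Optional


def set_matches(modifier_set: frozenset, pressed: frozenset) -> bool:
    """Return True if one modifier set exactly matches the pressed modifier keys."""
    if "none" in modifier_set:
        return not pressed  # every modifier key must be up
    if "other" in modifier_set:
        return False  # `other` is only a fallback, handled by find_layer
    for family in ("alt", "ctrl"):
        left, right = family + "L", family + "R"
        down = {key for key in (left, right) if key in pressed}
        if family in modifier_set:
            ok = bool(down)  # either side is acceptable
        elif left in modifier_set:
            ok = down == {left}  # assumption: the right-side key must be up
        elif right in modifier_set:
            ok = down == {right}  # assumption: the left-side key must be up
        else:
            ok = not down  # family not mentioned: both keys must be up
        if not ok:
            return False
    # shift and caps must be down exactly when they are listed.
    return all((key in modifier_set) == (key in pressed) for key in ("shift", "caps"))


def find_layer(layers: dict, pressed: set) -> Optional[str]:
    """Map layer name -> `modifiers` value; return the matching layer name, if any."""
    fallback = None
    for name, value in layers.items():
        for part in value.split(","):
            modifier_set = frozenset(part.split())
            if "other" in modifier_set:
                fallback = name
            elif set_matches(modifier_set, frozenset(pressed)):
                return name
    return fallback  # None means the keystroke is ignored
```

Because layers may not overlap, returning the first match is equivalent to checking all layers; the `other` layer is consulted only when nothing else matched.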
* * *

### Element: row

A `row` element describes the keys that are present in the row of a keyboard.

**Syntax**

```xml
<row keys="…keyId …keyId …" />
```

> <small>
>
> Parents: [layer](#element-layer)
>
> Children: _none_
>
> Occurrence: required, multiple
>
> </small>

_Attribute:_ `keys` (required)

> This is a string that lists the ids of [`key` elements](#element-key) for each of the keys in a row, whether those are explicitly listed in the file or are implied. See the `key` documentation for more detail.
>
> For non-`touch` forms, the number of keys in each row may not exceed the number of scan codes defined for that row, and the number of rows may not exceed the defined number of rows for that form. See [`scanCodes`](#element-scancodes).

**Example**

Here is an example of a `row` element:

```xml
<row keys="a z e r t y u i o p caret dollar" />
```

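The size limits for non-`touch` forms can be checked with a short routine like the one below. It is a sketch only: it assumes that layer rows pair with form rows by position, and the name `check_rows` and the wording of the messages are hypothetical.

```python
def check_rows(form_rows: list[str], layer_rows: list[str]) -> list[str]:
    """Compare `keys` values of a layer against `codes` values of a form.

    form_rows holds one `codes` attribute value per <scanCodes> element;
    layer_rows holds one `keys` attribute value per <row> element.
    Returns a list of problem descriptions (empty if the layer fits).
    """
    problems = []
    if len(layer_rows) > len(form_rows):
        problems.append(
            f"layer has {len(layer_rows)} rows, form defines {len(form_rows)}"
        )
    # Assumption: the n-th row of the layer corresponds to the n-th row of the form.
    for index, (codes, keys) in enumerate(zip(form_rows, layer_rows), start=1):
        n_codes, n_keys = len(codes.split()), len(keys.split())
        if n_keys > n_codes:
            problems.append(f"row {index}: {n_keys} keys exceed {n_codes} scan codes")
    return problems
```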
* * *

### Element: variables

> <small>
>
> Parents: [keyboard3](#element-keyboard3)
>
> Children: [import](#element-import), [_special_](tr35.md#special), [string](#element-string), [set](#element-set), [uset](#element-uset)
>
> Occurrence: optional, single
> </small>

This is a container for variables to be used with [transform](#element-transform), [display](#element-display) and [key](#element-key) elements.

Note that the `id=` attribute value must be unique across all children of the `variables` element.

**Example**

```xml
<variables>
    <string id="y" value="yes" /> <!-- a simple string-->
    <set id="upper" value="A B C D E FF" /> <!-- a set with 6 items -->
    <uset id="consonants" value="[कसतनमह]" /> <!-- a UnicodeSet -->
</variables>
```

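The uniqueness requirement on `id=` spans `string`, `set` and `uset` children alike. A minimal sketch of such a check, using Python's standard `xml.etree.ElementTree` module, is shown below; the function name `duplicate_variable_ids` is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_variable_ids(variables_xml: str) -> list[str]:
    """Return the ids that occur more than once among children of <variables>."""
    root = ET.fromstring(variables_xml)
    # Children without an id (for example <import>) are skipped.
    counts = Counter(
        child.get("id") for child in root if child.get("id") is not None
    )
    return sorted(var_id for var_id, count in counts.items() if count > 1)
```

An empty result means that every id is unique, regardless of the element type that declares it.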
* * *

### Element: string

> <small>
>
> Parents: [variables](#element-variables)
>
> Children: _none_
>
> Occurrence: optional, multiple
> </small>

> This element contains a single string which is used by the [transform](#element-transform) elements for string matching and substitution, as well as by the [key](#element-key) and [display](#element-display) elements.

_Attribute:_ `id` (required)

> Specifies the identifier (name) of this string.
> All ids must be unique across all types of variables.
>
> `id` must match `[0-9A-Za-z_]{1,32}`

_Attribute:_ `value` (required)

> Strings may contain whitespace. However, for clarity, it is recommended to escape spacing marks, even in strings.
> This attribute value may be escaped with `\u` notation, see [Escaping](#escaping).
> Strings may refer to other string variables that have been previously defined, using `${string}` syntax.
> [Markers](#markers) may be included with the `\m{…}` notation.

**Example**

```xml
<variables>
    <string id="cluster_hi" value="हि" /> <!-- a string -->
    <string id="zwnj" value="\u{200C}"/> <!-- single codepoint -->
    <string id="acute" value="\m{acute}"/> <!-- refer to a marker -->
    <string id="backquote" value="`"/>
    <string id="zwnj_acute" value="${zwnj}${acute}"  /> <!-- Combine two variables -->
    <string id="zwnj_sp_acute" value="${zwnj}\u{0020}${acute}"  /> <!-- Combine two variables, separated by a space -->
</variables>
```

These may then be used in multiple contexts:

```xml
<!-- as part of a regex -->
<transform from="${cluster_hi}X" to="X" />
<transform from="Y" to="${cluster_hi}" />

<!-- as part of a key bag  -->
<key id="hi_key" output="${cluster_hi}" />
<key id="acute_key" output="${acute}" />

<!-- Display ` instead of the non-displayable marker -->
<display output="${acute}" display="${backquote}" />
```

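The rule that a string may only refer to previously defined strings can be sketched as a single pass over the definitions in document order. This is an illustration only: the name `expand_strings` is hypothetical, raising an exception for an undefined reference is an assumption, and `\u{…}` escapes and `\m{…}` markers are deliberately left untouched.

```python
import re

# A ${name} reference; the name pattern is the `id` pattern given above.
REFERENCE = re.compile(r"\$\{([0-9A-Za-z_]{1,32})\}")


def expand_strings(definitions: list[tuple[str, str]]) -> dict[str, str]:
    """Expand ${name} references in (id, value) pairs, in document order."""
    resolved: dict[str, str] = {}

    def lookup(match: re.Match) -> str:
        name = match.group(1)
        if name not in resolved:
            raise KeyError(f"string variable {name!r} is not defined yet")
        return resolved[name]

    for var_id, value in definitions:
        # A function replacement is inserted literally, so backslashes survive.
        resolved[var_id] = REFERENCE.sub(lookup, value)
    return resolved
```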
1831*912701f9SAndroid Build Coastguard Worker* * *
1832*912701f9SAndroid Build Coastguard Worker
### Element: set

> <small>
>
> Parents: [variables](#element-variables)
>
> Children: _none_
>
> Occurrence: optional, multiple
> </small>

> This element contains a set of strings used by the [transform](#element-transform) elements for string matching and substitution.

_Attribute:_ `id` (required)

> Specifies the identifier (name) of this set.
> All ids must be unique across all types of variables.
>
> `id` must match `[0-9A-Za-z_]{1,32}`

_Attribute:_ `value` (required)

> The `value` attribute is always a set of strings separated by whitespace, even if there is only a single item in the set, such as `"A"`.
> Leading and trailing whitespace is ignored.
> This attribute value may be escaped with `\u` notation, see [Escaping](#escaping).
> Sets may refer to string variables if they have been previously defined, using `${string}` syntax, or to other previously-defined sets using `$[set]` syntax.
> Set references must be separated by whitespace: `$[set1]$[set2]` is an error; instead use `$[set1] $[set2]`.
> [Markers](#markers) may be included with the `\m{…}` notation.

**Examples**

```xml
<variables>
    <set id="upper" value="A B CC D E FF " /> <!-- 6 items -->
    <set id="lower" value="a b c  d e  f " /> <!-- 6 items -->
    <set id="upper_or_lower" value="$[upper] $[lower]"  /> <!-- Concatenate two sets -->
    <set id="lower_or_upper" value="$[lower] $[upper]"  /> <!-- Concatenate two sets -->
    <set id="a" value="A"/> <!-- Just one element, an 'A'-->
    <set id="cluster_or_zwnj" value="${hi_cluster} ${zwnj}"/> <!-- 2 items: "हि \u{200C}"-->
</variables>
```
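
The expansion rules for `value` can be sketched in Python. This is only an illustration under stated assumptions, not a reference implementation: `expand_set`, `strings`, and `sets` are hypothetical names, `\u{…}` escapes and markers are left unprocessed, and a `${string}` reference is assumed to expand inside a single item without being split again.

```python
import re

ID = r"[0-9A-Za-z_]{1,32}"  # same shape as the `id` pattern above

def expand_set(value, strings, sets):
    """Expand a <set> value into a list of items.

    strings: previously defined string variables, id -> text (hypothetical store)
    sets:    previously defined sets, id -> list of items (hypothetical store)
    """
    # Adjacent set references such as $[set1]$[set2] are an error.
    if re.search(r"\$\[" + ID + r"\]\$\[", value):
        raise ValueError("set references must be separated by whitespace")
    items = []
    # split() with no argument ignores leading/trailing whitespace and
    # treats any run of whitespace as a single separator.
    for token in value.split():
        ref = re.fullmatch(r"\$\[(" + ID + r")\]", token)
        if ref:
            items.extend(sets[ref.group(1)])  # KeyError if not defined earlier
        else:
            # Substitute ${string} references inside this item.
            items.append(re.sub(r"\$\{(" + ID + r")\}",
                                lambda m: strings[m.group(1)], token))
    return items

upper = expand_set("A B CC D E FF ", {}, {})
print(len(upper))  # prints 6
```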

Match "X" followed by any item of the `upper` set:

```xml
<transform from="X$[upper]" to="…" />
```

Map from upper to lower:

```xml
<transform from="($[upper])" to="$[1:lower]" />
```
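
As a rough illustration of the mapping above, the following Python sketch assumes the mapping is positional: the item captured from `upper` at index *i* is replaced by item *i* of `lower`, and both sets have the same number of items. These assumptions and the helper name `map_set` are not taken from this section; the [transform](#element-transform) element holds the normative rules.

```python
def map_set(captured, source, target):
    """Return the item of `target` at the position `captured` has in `source`."""
    if len(source) != len(target):
        raise ValueError("mapped sets must have the same number of items")
    # list.index raises ValueError if `captured` is not an item of `source`.
    return target[source.index(captured)]

upper = "A B CC D E FF".split()
lower = "a b c d e f".split()
print(map_set("CC", upper, lower))  # prints c
```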

See [transform](#element-transform) for further details and syntax.

* * *

### Element: uset

> <small>
>
> Parents: [variables](#element-variables)
>
> Children: _none_
>
> Occurrence: optional, multiple
> </small>

> This element contains a set, using a subset of the [UnicodeSet](tr35.md#Unicode_Sets) format, used by the [`transform`](#element-transform) elements for string matching and substitution.
> Note important restrictions on the syntax below.

_Attribute:_ `id` (required)

> Specifies the identifier (name) of this uset.
> All ids must be unique across all types of variables.
>
> `id` must match `[0-9A-Za-z_]{1,32}`

_Attribute:_ `value` (required)

> String value in a subset of [UnicodeSet](tr35.md#Unicode_Sets) format.
> Leading and trailing whitespace is ignored.
> Variables may refer to other string variables if they have been previously defined, using `${string}` syntax, or to other previously-defined `uset` elements (not `set` elements) using `$[...usetId]` syntax.

- Warning: `uset` elements look superficially similar to regex character classes as used in [`transform`](#element-transform) elements, but they are different. `uset`s must be defined with a `uset` element, and referenced with the `$[...usetId]` notation in transforms. `uset`s cannot be specified inline in a transform, and can only be used indirectly by reference to the corresponding `uset` element.
- Multi-character strings (`{}`) are not supported, such as `[żġħ{ie}{għ}]`.
- UnicodeSet property notation (`\p{…}` or `[:…:]`) may **NOT** be used.

> **Rationale**: allowing property notation would make keyboard implementations dependent on a particular version of Unicode. However, implementations and tools may wish to pre-calculate the value of a particular uset, and "freeze" it as explicit code points. The example below of `$[KhmrMn]` matches nonspacing marks in the `Khmr` script.

- `uset` elements may represent a very large number of codepoints. Keyboard implementations may set a limit on how many unique range entries may be matched.
- The `uset` element may not be used as the source or target for mapping operations (`$[1:variable]` syntax).
- The `uset` element may not be referenced by [`key`](#element-key) or [`display`](#element-display) elements.

**Examples**

```xml
<variables>
  <uset id="consonants" value="[कसतनमह]" /> <!-- unicode set range -->
  <uset id="range" value="[a-z D E F G \u{200A}]" /> <!-- a through z, plus a few others -->
  <uset id="newrange" value="[$[range]-[G]]" /> <!-- The above range, but not including G -->
  <uset id="KhmrMn" value="[\u{17B4}\u{17B5}\u{17B7}-\u{17BD}\u{17C6}\u{17C9}-\u{17D3}\u{17DD}]" /> <!--  [[:Khmr:]&[:Mn:]] as of Unicode 15.0-->
</variables>
```
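
Two of the syntax restrictions listed above (no property notation, no multi-character strings) lend themselves to a quick textual check. The Python sketch below is a heuristic only, with a hypothetical function name `check_uset_value`; it does not parse UnicodeSet syntax, and it would wrongly reject a set whose first member is a literal colon, such as `[:a]`.

```python
import re

def check_uset_value(value):
    """Raise ValueError if `value` uses UnicodeSet syntax that uset forbids."""
    # Property notation: \p{…}, \P{…} or [:…:]
    if re.search(r"\\[pP]\{|\[:", value):
        raise ValueError("property notation is not allowed in a uset")
    # A '{' that is not part of \u{…} or ${…} starts a multi-character string.
    if re.search(r"(?<!\\u)(?<!\$)\{", value):
        raise ValueError("multi-character strings are not allowed in a uset")

check_uset_value(r"[a-z D E F G \u{200A}]")  # accepted, returns None
```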

* * *

### Element: transforms

This element defines a group of one or more `transform` elements associated with this keyboard layout. This is used to support features such as dead-keys, character reordering, backspace behavior, etc. using a straightforward structure that works for all the keyboards tested, and that results in readable source data.

There can be multiple `<transforms>` elements, but only one for each `type`.

**Syntax**

```xml
<transforms type="…type">
    <transformGroup …/>
    <transformGroup …/>

</transforms>
```

> <small>
>
> Parents: [keyboard3](#element-keyboard3)
>
> Children: [import](#element-import), [_special_](tr35.md#special), [transformGroup](#element-transformgroup)
>
> Occurrence: optional, multiple
>
> </small>

_Attribute:_ `type` (required)

> Values: `simple`, `backspace`

Other keying behaviors are needed, particularly in handling complex orthographies from various parts of the world. The behaviors intended to be covered by the transforms are:

* Reordering combining marks. The order required for underlying storage may differ considerably from the desired typing order. In addition, a keyboard may want to allow for different typing orders.
* Error indication. Sometimes a keyboard layout will want to specify to the application that a particular keying sequence in a context is in error and that the application should indicate that the keypress is erroneous.
* Backspace handling. There are various approaches to handling the backspace key. An application may treat it as an undo of the last key input, or it may simply delete the last character in the currently output text, or it may use transform rules to tell it how much to delete.

#### Markers

Markers are placeholders which record some state, but without producing normal visible text output. They were designed particularly to support dead-keys.

The marker ID is any valid `NMTOKEN`.

Consider the following abbreviated example:

```xml
    <display output="\m{circ_marker}" display="^" />

    <key id="circ_key" output="\m{circ_marker}" />
    <key id="e" output="e" />

    <transform from="\m{circ_marker}e" to="ê" />
```

1. The user presses the `circ_key` key. The key can be shown with the keycap `^` due to the `<display>` element.

2. The special marker, `circ_marker`, is added to the end of the input context.

    The input context does not match any transforms.

    The input context has:

    - …
    - marker `circ_marker`

3. Also due to the `<display>` element, implementations can opt to display a visible `^` (perhaps visually distinct from a plain `^` caret). Implementations may opt to display nothing and only store the marker in the input context.

4. The user now presses the `e` key, which is also added to the end of the input context. The input context now has:

    - …
    - marker `circ_marker`
    - character `e`

5. Now, the input context matches the transform. The marker and the `e` are replaced with `ê`.

    The input context now has:

    - …
    - character `ê`
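
The five steps above can be traced with a small Python sketch. It is a toy model under stated assumptions, not an implementation of the specification: the input context is a list of hypothetical `(kind, value)` tokens so that markers stay separate from text, and the single transform is matched literally at the end of the context.

```python
# Tokens are kept out-of-band from plain text:
# ("char", "e") is text, ("marker", "circ_marker") is a marker.
CIRC = ("marker", "circ_marker")
E = ("char", "e")

# <transform from="\m{circ_marker}e" to="ê" /> as (pattern, replacement) token lists.
TRANSFORMS = [([CIRC, E], [("char", "ê")])]

def press(context, key_output):
    """Append a key's output, then apply the first transform matching the end."""
    context = context + key_output
    for pattern, replacement in TRANSFORMS:
        if context[-len(pattern):] == pattern:
            return context[:-len(pattern)] + replacement
    return context

context = press([], [CIRC])    # steps 1-2: marker stored, no transform matches
context = press(context, [E])  # steps 4-5: marker followed by e is replaced
print(context)                 # prints [('char', 'ê')]
```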

**Using markers to inhibit other transforms**

Sometimes it is desirable to prevent transforms from having an effect.
Perhaps two different keys output the same characters, with different key or modifier combinations, but only one of them is intended to participate in a transform.

Consider the following case, where pressing the keys `X`, `e` results in `^e`, which is transformed into `ê`.

```xml
<keys>
    <key id="X" output="^"/>
    <key id="e" output="e" />
</keys>
<transforms>
    <transform from="^e" to="ê"/>
</transforms>
```

However, what if the user wanted to produce `^e` without the transform taking effect?
One strategy would be to use a marker, which won’t be visible in the output, but will inhibit the transform.

```xml
<keys>
    <key id="caret" output="^\m{no_transform}"/>
    <key id="X" output="^" />
    <key id="e" output="e" />
</keys>

<transforms>
    <!-- this wouldn't match the key caret output because of the marker -->
    <transform from="^e" to="ê"/>
</transforms>
```

Pressing `caret` `e` will result in `^e` (with an invisible _no_transform_ marker; note that any name could be used). The `^e` won’t have the transform applied, at least while the marker’s context remains valid.

Another strategy might be to use a marker to indicate where transforms are desired, instead of where they aren't desired.

```xml
<keys>
    <key id="caret" output="^"/>
    <key id="X" output="^\m{transform}"/>
    <key id="e" output="e" />
</keys>

<transforms …>
    <!-- Won't match ^e without marker. -->
    <transform from="^\m{transform}e" to="ê"/>
</transforms>
```

In this way, only the `X`, `e` keys will produce `^e` with a _transform_ marker (again, any name could be used) which will cause the transform to be applied. One benefit is that navigating to an existing `^` in a document and adding an `e` will result in `^e`, and this output will not be affected by the transform, because there will be no marker present there (remember that markers are not stored with the document but only recorded in memory temporarily during text input).

Please note important considerations for [Normalization and Markers](#normalization-and-markers).

**Effect of markers on final text**

All markers must be removed before text is returned to the application from the input context.
If the input context changes, such as if the cursor or mouse moves the insertion point somewhere else, all markers in the input context are removed.

**Implementation Notes**

Ideally, markers are implemented entirely out-of-band from the normal text stream. However, implementations _may_ choose to map each marker to a [Unicode private-use character](https://www.unicode.org/glossary/#private_use_character) for use only within the implementation’s processing and temporary storage in the input context.

For example, the first marker encountered could be represented as U+E000, the second by U+E001 and so on. If a regex processing engine were used, then those PUA characters could be processed through the existing regex processing engine. `[^\u{E000}-\u{E009}]` could be used as an expression to match a character that is not a marker, and `[Ee]\u{E000}` could match `E` or `e` followed by the first marker.

Such implementations must take care to remove all such markers (see prior section) from the resultant text. As well, implementations must take care to avoid conflicts if applications themselves are using PUA characters, such as is often done with not-yet-encoded scripts or characters.
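
The private-use mapping described above might look like the following Python sketch. The class name `MarkerTable`, its methods, and the starting code point are assumptions made for illustration. The sketch deliberately ignores the conflict case noted above: its `strip` method would also delete genuine PUA characters from application text if they fall in the allocated range, so a real implementation needs additional bookkeeping.

```python
import re

PUA_START = 0xE000  # assumption: markers are numbered from U+E000 upward

class MarkerTable:
    """Maps marker ids to private-use characters for internal processing only."""

    def __init__(self):
        self._chars = {}

    def encode(self, marker_id):
        """Return the PUA character for marker_id, allocating one on first use."""
        if marker_id not in self._chars:
            self._chars[marker_id] = chr(PUA_START + len(self._chars))
        return self._chars[marker_id]

    def strip(self, text):
        """Remove all allocated marker characters before text leaves the input context."""
        return text.translate({ord(c): None for c in self._chars.values()})

table = MarkerTable()
circ = table.encode("circ_marker")  # U+E000, the first marker encountered
context = "E" + circ
# Equivalent of the [Ee]\u{E000} expression above, anchored at the end of the context.
print(bool(re.search("[Ee]" + re.escape(circ) + r"\Z", context)))  # prints True
print(table.strip(context))  # prints E
```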

* * *

### Element: transformGroup

> <small>
>
> Parents: [transforms](#element-transforms)
>
> Children: [import](#element-import), [reorder](#element-reorder), [_special_](tr35.md#special), [transform](#element-transform)
>
> Occurrence: optional, multiple
> </small>

A `transformGroup` contains a set of transform elements or reorder elements.

Each `transformGroup` is processed entirely before proceeding to the next one.

Each `transformGroup` element, after imports are processed, must have either [reorder](#element-reorder) elements or [transform](#element-transform) elements, but not both. The `<transformGroup>` element may not be empty.

**Examples**

#### Example: `transformGroup` with `transform` elements

This is a `transformGroup` that consists of one or more [`transform`](#element-transform) elements, optionally prefaced by one or more `import` elements that import `transform` elements. See the discussion of those elements for details.

`import` elements in this group may not import `reorder` elements.

```xml
<transformGroup>
    <import path="…"/> <!-- optional import elements-->
    <transform />
    <!-- other <transform/> elements -->
</transformGroup>
```

#### Example: `transformGroup` with `reorder` elements

This is a `transformGroup` that consists of one or more [`reorder`](#element-reorder) elements, optionally prefaced by one or more `import` elements that import `reorder` elements. See the discussion of those elements for details.

`import` elements in this group may not import `transform` elements.

```xml
<transformGroup>
    <import path="…"/> <!-- optional import elements-->
    <reorder … />
    <!-- other <reorder> elements -->
</transformGroup>
```
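
The either/or constraint on a `transformGroup` can be checked mechanically. The Python sketch below uses the standard-library `xml.etree.ElementTree` parser and a hypothetical helper name, `transform_group_kind`. It assumes that imports have already been resolved into the group, that elements carry no XML namespace, and that a group holding only other children counts as empty.

```python
import xml.etree.ElementTree as ET

def transform_group_kind(xml_text):
    """Return "transform" or "reorder" for a valid transformGroup, else raise ValueError."""
    group = ET.fromstring(xml_text)
    kinds = {child.tag for child in group if child.tag in ("transform", "reorder")}
    if not kinds:
        raise ValueError("transformGroup may not be empty")
    if len(kinds) > 1:
        raise ValueError("transformGroup may not mix transform and reorder elements")
    return kinds.pop()

print(transform_group_kind(
    '<transformGroup><transform from="a" to="b"/></transformGroup>'
))  # prints transform
```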

* * *

### Element: transform

This element contains a single transform that may be performed using the keyboard layout. A transform is an element that specifies a set of conversions from sequences of code points into (one or more) other code points. For example, in most French keyboards hitting the `^` dead-key followed by the `e` key produces `ê`.

Matches are processed against the "input context", a temporary buffer containing all relevant text up to the insertion point. If the user moves the insertion point, the input context is discarded and recreated from the application’s text buffer. Implementations may discard the input context at any time.

Besides regular text, the input context may contain any [markers](#markers) produced by keys or transforms since the insertion point was last moved.

Using regular expression terminology, matches are done as if there was an implicit `$` (match end of buffer) at the end of each pattern. In other words, `<transform from="ke" …>` will not match an input context ending with `…keyboard`, but it will match the last two codepoints of an input context ending with `…awake`.
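
The implicit end-of-buffer anchor can be illustrated with Python's `re` module. This is a sketch with a hypothetical helper name, `matches_at_end`; Python's regex dialect is not the EcmaScript baseline named in this specification, so it only applies to patterns that mean the same in both. `\Z` is used rather than `$` because Python's `$` also matches just before a trailing newline.

```python
import re

def matches_at_end(from_pattern, context):
    """Return a match of from_pattern that ends at the end of context, or None."""
    # The non-capturing group keeps alternations such as abc|def inside the anchor.
    return re.search("(?:" + from_pattern + r")\Z", context)

print(matches_at_end("ke", "keyboard"))      # prints None
print(matches_at_end("ke", "awake").span())  # prints (3, 5)
```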

All of the `transform` elements in a `transformGroup` are tested for a match, in order, until a match is found. Then, the matching element is processed, and then processing proceeds to the **next** `transformGroup`. If none of the `transform` elements match, processing proceeds without modification to the buffer to the **next** `transformGroup`.
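
The group-by-group processing order can be sketched as follows. This Python example is a simplification under stated assumptions: `apply_groups` is a hypothetical name, each group is a plain list of `(from, to)` pairs, the `to` side is literal text (no `$1` references or markers), and Python's `re` stands in for the regex-like syntax of this specification.

```python
import re

def apply_groups(groups, context):
    """Apply each group in order; within a group only the first matching pair is used."""
    for group in groups:
        for from_pattern, to_text in group:
            m = re.search("(?:" + from_pattern + r")\Z", context)
            if m:
                # First match wins; the rest of this group is skipped.
                context = context[:m.start()] + to_text
                break
        # Whether or not anything matched, continue with the next group.
    return context

groups = [
    [(r"\^e", "ê"), (r"\^a", "â")],
    [("ê", "Ê")],  # hypothetical second group; it sees the first group's result
]
print(apply_groups(groups, "caf^e"))  # prints cafÊ
```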

**Syntax**

```xml
<transform from="…matching pattern" to="…output pattern"/>
```

> <small>
>
> Parents: [transformGroup](#element-transformgroup)
>
> Children: _none_
>
> Occurrence: required, multiple
>
> </small>

_Attribute:_ `from` (required)

> The `from` attribute value consists of an input rule for matching the input context.
>
> The `transform` rule and output pattern use a modified, mostly subsetted, regular expression syntax, with EcmaScript syntax (with the `u` Unicode flag) as its baseline reference (see [MDN-REGEX](https://developer.mozilla.org/docs/Web/JavaScript/Guide/Regular_Expressions)). Differences from regex implementations will be noted.

#### Regex-like Syntax

- **Simple matches**

    `abc` `𐒵`

- **Unicode codepoint escapes**

    `\u{1234} \u{012A}`
    `\u{22} \u{012a} \u{1234A}`

    The hex escaping is case insensitive. The value may not match a surrogate or illegal character, nor a marker character.
    The form `\u{…}` is preferred as it is the same regardless of codepoint length.

- **Fixed character classes and escapes**

    `\s \S \t \r \n \f \v \\ \$ \d \w \D \W \0`

    The values of these classes do not change with Unicode versions.

    `\s` for example is exactly `[\f\n\r\t\v\u{00a0}\u{1680}\u{2000}-\u{200a}\u{2028}\u{2029}\u{202f}\u{205f}\u{3000}\u{feff}]`

    `\\` and `\$` evaluate to `\` and `$`, respectively.

2198*912701f9SAndroid Build Coastguard Worker- **Character classes**
2199*912701f9SAndroid Build Coastguard Worker
2200*912701f9SAndroid Build Coastguard Worker    `[abc]` `[^def]` `[a-z]` `[ॲऄ-आइ-ऋ]` `[\u{093F}-\u{0944}\u{0962}\u{0963}]`
2201*912701f9SAndroid Build Coastguard Worker
2202*912701f9SAndroid Build Coastguard Worker    - supported
2203*912701f9SAndroid Build Coastguard Worker    - no Unicode properties such as `\p{…}`
2204*912701f9SAndroid Build Coastguard Worker    - Warning: Character classes look superficially similar to [`uset`](#element-uset) elements, but they are distinct and referenced with the `$[...usetId]` notation in transforms. The `uset` notation cannot be embedded directly in a transform.
2205*912701f9SAndroid Build Coastguard Worker
2206*912701f9SAndroid Build Coastguard Worker- **Bounded quantifier**
2207*912701f9SAndroid Build Coastguard Worker
2208*912701f9SAndroid Build Coastguard Worker    `{x,y}`
2209*912701f9SAndroid Build Coastguard Worker
2210*912701f9SAndroid Build Coastguard Worker    `x` and `y` are required single digits representing the minimum and maximum number of occurrences.
2211*912701f9SAndroid Build Coastguard Worker    `x` must be ≥ 0, `y` must be ≥ x and ≥ 1
2212*912701f9SAndroid Build Coastguard Worker
2213*912701f9SAndroid Build Coastguard Worker- **Optional Specifier**
2214*912701f9SAndroid Build Coastguard Worker
2215*912701f9SAndroid Build Coastguard Worker    `?` - equivalent of `{0,1}`
2216*912701f9SAndroid Build Coastguard Worker
2217*912701f9SAndroid Build Coastguard Worker- **Numbered Capture Groups**
2218*912701f9SAndroid Build Coastguard Worker
2219*912701f9SAndroid Build Coastguard Worker    `([abc])([def])` (up to 9 groups)
2220*912701f9SAndroid Build Coastguard Worker
    These refer to groups captured as a set, and can be referenced with the `$1` through `$9` operators in the `to=` pattern. Capture groups may not be nested.
2222*912701f9SAndroid Build Coastguard Worker
2223*912701f9SAndroid Build Coastguard Worker- **Non-capturing groups**
2224*912701f9SAndroid Build Coastguard Worker
2225*912701f9SAndroid Build Coastguard Worker    `(?:thismatches)`
2226*912701f9SAndroid Build Coastguard Worker
2227*912701f9SAndroid Build Coastguard Worker- **Nested capturing groups**
2228*912701f9SAndroid Build Coastguard Worker
2229*912701f9SAndroid Build Coastguard Worker    `(?:[abc]([def]))|(?:[ghi])`
2230*912701f9SAndroid Build Coastguard Worker
2231*912701f9SAndroid Build Coastguard Worker    Capture groups may be nested, however only the innermost group is allowed to be a capture group. The outer group must be a non-capturing group.
2232*912701f9SAndroid Build Coastguard Worker
2233*912701f9SAndroid Build Coastguard Worker- **Disjunctions**
2234*912701f9SAndroid Build Coastguard Worker
2235*912701f9SAndroid Build Coastguard Worker    `abc|def`
2236*912701f9SAndroid Build Coastguard Worker
2237*912701f9SAndroid Build Coastguard Worker    Match either `abc` or `def`.
2238*912701f9SAndroid Build Coastguard Worker
2239*912701f9SAndroid Build Coastguard Worker- **Match a single Unicode codepoint**
2240*912701f9SAndroid Build Coastguard Worker
2241*912701f9SAndroid Build Coastguard Worker    `.`
2242*912701f9SAndroid Build Coastguard Worker
    Matches a codepoint, not individual code units. (See the `u` flag in ECMAScript (ECMA-262) regular expressions.)
    For example, a single Osage letter, which takes two UTF-16 code units, is one match (`.`), not two.
2245*912701f9SAndroid Build Coastguard Worker    Does not match [markers](#markers). (See `\m{.}` and `\m{marker}`, below.)
2246*912701f9SAndroid Build Coastguard Worker
2247*912701f9SAndroid Build Coastguard Worker- **Match the start of the text context**
2248*912701f9SAndroid Build Coastguard Worker
2249*912701f9SAndroid Build Coastguard Worker    `^`
2250*912701f9SAndroid Build Coastguard Worker
2251*912701f9SAndroid Build Coastguard Worker    The start of the context could be the start of a line, a grid cell, or some other formatting boundary.
2252*912701f9SAndroid Build Coastguard Worker    See description at the top of [`transforms`](#element-transform).
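The codepoint behavior of `.` described above can be illustrated with a short Python sketch. Python's `re` module is used here only as a stand-in for a keyboard implementation's matcher, and U+104B0 (OSAGE CAPITAL LETTER A) is an arbitrary Osage letter chosen for illustration; neither comes from this specification.

```python
import re

# U+104B0 OSAGE CAPITAL LETTER A, chosen only as an illustrative Osage letter.
osage_a = "\U000104B0"

# Python strings are sequences of codepoints, so this is a single element.
codepoints = len(osage_a)

# In UTF-16 the same letter needs a surrogate pair, that is, two code units.
utf16_code_units = len(osage_a.encode("utf-16-le")) // 2

# "." consumes the whole codepoint, so one "." matches the entire letter.
single_dot_matches = re.fullmatch(".", osage_a) is not None
```

An implementation whose native strings are UTF-16 would need to take care that `.` never matches half of such a surrogate pair.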
2253*912701f9SAndroid Build Coastguard Worker
2254*912701f9SAndroid Build Coastguard Worker#### Additional Features
2255*912701f9SAndroid Build Coastguard Worker
2256*912701f9SAndroid Build Coastguard WorkerThe following are additions to standard Regex syntax.
2257*912701f9SAndroid Build Coastguard Worker
2258*912701f9SAndroid Build Coastguard Worker- **Match a Marker**
2259*912701f9SAndroid Build Coastguard Worker
2260*912701f9SAndroid Build Coastguard Worker    `\m{Some_Marker}`
2261*912701f9SAndroid Build Coastguard Worker
2262*912701f9SAndroid Build Coastguard Worker    Matches the named marker.
2263*912701f9SAndroid Build Coastguard Worker    Also see [Markers](#markers).
2264*912701f9SAndroid Build Coastguard Worker
2265*912701f9SAndroid Build Coastguard Worker- **Match a single marker**
2266*912701f9SAndroid Build Coastguard Worker
2267*912701f9SAndroid Build Coastguard Worker    `\m{.}`
2268*912701f9SAndroid Build Coastguard Worker
2269*912701f9SAndroid Build Coastguard Worker    Matches any single marker.
2270*912701f9SAndroid Build Coastguard Worker    Also see [Markers](#markers).
2271*912701f9SAndroid Build Coastguard Worker
2272*912701f9SAndroid Build Coastguard Worker- **String Variables**
2273*912701f9SAndroid Build Coastguard Worker
2274*912701f9SAndroid Build Coastguard Worker    `${zwnj}`
2275*912701f9SAndroid Build Coastguard Worker
2276*912701f9SAndroid Build Coastguard Worker    In this usage, the variable with `id="zwnj"` will be substituted in at this point in the expression. The variable can contain a range, a character, or any other portion of a pattern. If `zwnj` is a simple string, the pattern will match that string at this point.
2277*912701f9SAndroid Build Coastguard Worker
2278*912701f9SAndroid Build Coastguard Worker- **`set` or `uset` variables**
2279*912701f9SAndroid Build Coastguard Worker
2280*912701f9SAndroid Build Coastguard Worker    `$[upper]`
2281*912701f9SAndroid Build Coastguard Worker
2282*912701f9SAndroid Build Coastguard Worker    Given a space-separated `set` or `uset` variable, this syntax will match _any_ of the substrings. This expression may be thought of  (and implemented) as if it were a _non-capturing group_. It may, however, be enclosed within a capturing group. For example, the following definition of `$[upper]` will match as if it were written `(?:A|B|CC|D|E|FF)`.
2283*912701f9SAndroid Build Coastguard Worker
2284*912701f9SAndroid Build Coastguard Worker    ```xml
2285*912701f9SAndroid Build Coastguard Worker    <variables>
2286*912701f9SAndroid Build Coastguard Worker        <set id="upper" value=" A B CC  D E  FF " />
2287*912701f9SAndroid Build Coastguard Worker    </variables>
2288*912701f9SAndroid Build Coastguard Worker    ```
2289*912701f9SAndroid Build Coastguard Worker
2290*912701f9SAndroid Build Coastguard Worker    This expression in a `from=` may be used to **insert a mapped variable**, see below under [Replacement syntax](#replacement-syntax).
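As a rough illustration of the two variable forms above, the following Python sketch expands `${id}` and `$[id]` references into an ordinary regular expression. The function name `expand_variables`, the `vowel` variable, and the use of Python's `re` module are assumptions made for this example. It handles only `set` variables (not `uset`), and it does not handle escaped `\$` sequences.

```python
import re

def expand_variables(pattern, strings, sets):
    """Expand ${id} and $[id] references in a from= pattern (illustrative only)."""

    def expand_string(match):
        # String variables are substituted verbatim; they may hold any pattern fragment.
        return strings[match.group(1)]

    def expand_set(match):
        # Split on whitespace; leading, trailing, and repeated spaces are ignored.
        items = sets[match.group(1)].split()
        # A set behaves like a non-capturing group of literal alternatives.
        return "(?:" + "|".join(re.escape(item) for item in items) + ")"

    pattern = re.sub(r"\$\{(\w+)\}", expand_string, pattern)
    return re.sub(r"\$\[(\w+)\]", expand_set, pattern)

sets = {"upper": " A B CC  D E  FF "}
strings = {"vowel": "[aeiou]"}  # hypothetical string variable, not from the spec

expanded = expand_variables("${vowel}$[upper]", strings, sets)
# expanded is "[aeiou](?:A|B|CC|D|E|FF)"
```

Expanding the set to a non-capturing group keeps the numbering of any enclosing capture groups unchanged, which matters when the set is later used for a mapped replacement.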
2291*912701f9SAndroid Build Coastguard Worker
2292*912701f9SAndroid Build Coastguard Worker#### Disallowed Regex Features
2293*912701f9SAndroid Build Coastguard Worker
2294*912701f9SAndroid Build Coastguard Worker- **Matching an empty string**
2295*912701f9SAndroid Build Coastguard Worker
2296*912701f9SAndroid Build Coastguard Worker    Transforms may not match an empty string. For example, `<transform from=""/>` or `<transform from="X{0,1}"/>` are not allowed and must be flagged as an error to keyboard authors.
2297*912701f9SAndroid Build Coastguard Worker
2298*912701f9SAndroid Build Coastguard Worker- **Unicode properties**
2299*912701f9SAndroid Build Coastguard Worker
2300*912701f9SAndroid Build Coastguard Worker    `\p{property}` `\P{property}`
2301*912701f9SAndroid Build Coastguard Worker
2302*912701f9SAndroid Build Coastguard Worker    **Rationale:** The behavior of this feature varies by Unicode version, and so would not have predictable results.
2303*912701f9SAndroid Build Coastguard Worker
    Tooling may choose to suggest an expansion of properties, such as expanding `\p{Mn}` to all nonspacing marks for a certain Unicode version. In addition, a set of variables could be constructed in an `import`-able file matching particularly useful Unicode properties.
2305*912701f9SAndroid Build Coastguard Worker
2306*912701f9SAndroid Build Coastguard Worker    ```xml
2307*912701f9SAndroid Build Coastguard Worker    <uset id="Mn" value="[\u{034F}\u{0591}-\u{05AF}\u{05BD}\u{05C4}\u{05C5}\…]" /> <!-- 1,985 code points -->
2308*912701f9SAndroid Build Coastguard Worker    ```
2309*912701f9SAndroid Build Coastguard Worker
2310*912701f9SAndroid Build Coastguard Worker- **Backreferences**
2311*912701f9SAndroid Build Coastguard Worker
2312*912701f9SAndroid Build Coastguard Worker    `([abc])-\1` `\k<something>`
2313*912701f9SAndroid Build Coastguard Worker
2314*912701f9SAndroid Build Coastguard Worker    **Rationale:** Implementation and cognitive complexity.
2315*912701f9SAndroid Build Coastguard Worker
2316*912701f9SAndroid Build Coastguard Worker- **Unbounded Quantifiers**
2317*912701f9SAndroid Build Coastguard Worker
2318*912701f9SAndroid Build Coastguard Worker    `* + *? +? {1,} {0,}`
2319*912701f9SAndroid Build Coastguard Worker
    **Rationale:** Implementation and computational complexity.
2321*912701f9SAndroid Build Coastguard Worker
2322*912701f9SAndroid Build Coastguard Worker- **Nested capture groups**
2323*912701f9SAndroid Build Coastguard Worker
2324*912701f9SAndroid Build Coastguard Worker    `((a|b|c)|(d|e|f))`
2325*912701f9SAndroid Build Coastguard Worker
2326*912701f9SAndroid Build Coastguard Worker    **Rationale:** Computational and cognitive complexity.
2327*912701f9SAndroid Build Coastguard Worker
2328*912701f9SAndroid Build Coastguard Worker- **Named capture groups**
2329*912701f9SAndroid Build Coastguard Worker
2330*912701f9SAndroid Build Coastguard Worker    `(?<something>)`
2331*912701f9SAndroid Build Coastguard Worker
2332*912701f9SAndroid Build Coastguard Worker    **Rationale:** Implementation complexity.
2333*912701f9SAndroid Build Coastguard Worker
2334*912701f9SAndroid Build Coastguard Worker- **Assertions** other than `^`
2335*912701f9SAndroid Build Coastguard Worker
2336*912701f9SAndroid Build Coastguard Worker    `\b` `\B` `(?<!…)` …
2337*912701f9SAndroid Build Coastguard Worker
2338*912701f9SAndroid Build Coastguard Worker    **Rationale:** Implementation complexity.
2339*912701f9SAndroid Build Coastguard Worker
2340*912701f9SAndroid Build Coastguard Worker- **End marker**
2341*912701f9SAndroid Build Coastguard Worker
2342*912701f9SAndroid Build Coastguard Worker    `$`
2343*912701f9SAndroid Build Coastguard Worker
2344*912701f9SAndroid Build Coastguard Worker    The end marker can be thought of as being implicitly at the end of every `from=` pattern, matching the insertion point. Transforms do not match past the insertion point.
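The rule against matching an empty string, listed above, lends itself to a simple authoring-time check. The sketch below is one possible approach, assuming the `from=` pattern has already been converted to a form that Python's `re` module accepts (markers and variables expanded). The function name `can_match_empty` is hypothetical.

```python
import re

def can_match_empty(from_pattern):
    """Return True if the pattern is able to match the empty string."""
    # fullmatch against "" succeeds only if every part of the pattern is optional.
    return re.fullmatch(from_pattern, "") is not None

empty_pattern_flagged = can_match_empty("")
optional_only_flagged = can_match_empty("X{0,1}")
required_char_allowed = not can_match_empty("X{1,2}")
```

A tool could run such a check on every transform and report an error to the keyboard author when it returns true.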
2345*912701f9SAndroid Build Coastguard Worker
2346*912701f9SAndroid Build Coastguard Worker_Attribute:_ `to`
2347*912701f9SAndroid Build Coastguard Worker
2348*912701f9SAndroid Build Coastguard Worker> This attribute value represents the characters that are output from the transform.
2349*912701f9SAndroid Build Coastguard Worker>
> If this attribute is absent, it indicates that no characters are output, such as with a backspace transform.
2351*912701f9SAndroid Build Coastguard Worker>
2352*912701f9SAndroid Build Coastguard Worker> A final rule such as `<transform from=".*"/>` will remove all context which doesn’t match one of the prior rules.
2353*912701f9SAndroid Build Coastguard Worker
2354*912701f9SAndroid Build Coastguard Worker#### Replacement syntax
2355*912701f9SAndroid Build Coastguard Worker
Used in the `to=` attribute.
2357*912701f9SAndroid Build Coastguard Worker
2358*912701f9SAndroid Build Coastguard Worker- **Literals**
2359*912701f9SAndroid Build Coastguard Worker
2360*912701f9SAndroid Build Coastguard Worker    `$$ \$ \\` = `$ $ \`
2361*912701f9SAndroid Build Coastguard Worker
2362*912701f9SAndroid Build Coastguard Worker- **Entire matched substring**
2363*912701f9SAndroid Build Coastguard Worker
2364*912701f9SAndroid Build Coastguard Worker    `$0`
2365*912701f9SAndroid Build Coastguard Worker
2366*912701f9SAndroid Build Coastguard Worker- **Insert the specified capture group**
2367*912701f9SAndroid Build Coastguard Worker
2368*912701f9SAndroid Build Coastguard Worker    `$1 $2 $3 … $9`
2369*912701f9SAndroid Build Coastguard Worker
2370*912701f9SAndroid Build Coastguard Worker- **Insert an entire variable**
2371*912701f9SAndroid Build Coastguard Worker
2372*912701f9SAndroid Build Coastguard Worker    `${variable}`
2373*912701f9SAndroid Build Coastguard Worker
2374*912701f9SAndroid Build Coastguard Worker    The entire contents of the named variable will be inserted at this point.
2375*912701f9SAndroid Build Coastguard Worker
2376*912701f9SAndroid Build Coastguard Worker- **Insert a mapped set**
2377*912701f9SAndroid Build Coastguard Worker
2378*912701f9SAndroid Build Coastguard Worker    `$[1:variable]` (Where "1" is any numbered capture group from 1 to 9)
2379*912701f9SAndroid Build Coastguard Worker
2380*912701f9SAndroid Build Coastguard Worker    Maps capture group 1 to variable `variable`. The `from=` side must also contain a grouped variable. This expression may appear anywhere or multiple times in the `to=` pattern.
2381*912701f9SAndroid Build Coastguard Worker
2382*912701f9SAndroid Build Coastguard Worker    **Example**
2383*912701f9SAndroid Build Coastguard Worker
2384*912701f9SAndroid Build Coastguard Worker    ```xml
2385*912701f9SAndroid Build Coastguard Worker    <set id="upper" value="A B CC D E  FF       G" />
2386*912701f9SAndroid Build Coastguard Worker    <set id="lower" value="a b c  d e  \u{0192} g" />
2387*912701f9SAndroid Build Coastguard Worker    <!-- note that values may be spaced for ease of reading -->

    <transform from="($[upper])" to="$[1:lower]" />
2390*912701f9SAndroid Build Coastguard Worker    ```
2391*912701f9SAndroid Build Coastguard Worker
2392*912701f9SAndroid Build Coastguard Worker    - The capture group on the `from=` side **must** contain exactly one set variable.  `from="Q($[upper])X"` can be used (other context before or after the capture group), but `from="(Q$[upper])"` may not be used with a mapped variable and is flagged as an error.
2393*912701f9SAndroid Build Coastguard Worker
2394*912701f9SAndroid Build Coastguard Worker    - The `from=` and `to=` sides of the pattern must both be using `set` variables. There is no way to insert a set literal on either side and avoid using a variable.
2395*912701f9SAndroid Build Coastguard Worker
2396*912701f9SAndroid Build Coastguard Worker    - The two variables (here `upper` and `lower`) must have exactly the same number of whitespace-separated items. Leading and trailing space (such as at the end of `lower`) is ignored. A variable without any spaces is considered to be a set variable of exactly one item.
2397*912701f9SAndroid Build Coastguard Worker
2398*912701f9SAndroid Build Coastguard Worker    - As described in [Additional Features](#additional-features), the `upper` set variable as used here matches as if it is `((?:A|B|CC|D|E|FF|G))`, showing the enclosing capturing group. When text from the input context matches this expression, and all above conditions are met, the mapping proceeds as follows:
2399*912701f9SAndroid Build Coastguard Worker
2400*912701f9SAndroid Build Coastguard Worker    1. The portion of the input context, such as `CC`, is matched against the above calculated pattern.
2401*912701f9SAndroid Build Coastguard Worker
2402*912701f9SAndroid Build Coastguard Worker    2. The position within the `from=` variable (`upper`) is calculated. The regex match may not have this information, but the matched substring `CC` can be compared against the tokenized input variable: `A`, `B`, `CC`, `D`, … to find that the 3rd item matches exactly.
2403*912701f9SAndroid Build Coastguard Worker
2404*912701f9SAndroid Build Coastguard Worker    3. The same position within the `to=` variable (`lower`) is calculated. The 3rd item is `c`.
2405*912701f9SAndroid Build Coastguard Worker
2406*912701f9SAndroid Build Coastguard Worker    4. `CC` in the input context is replaced with `c`, and processing proceeds to the next `transformGroup`.
2407*912701f9SAndroid Build Coastguard Worker
2408*912701f9SAndroid Build Coastguard Worker- **Emit a marker**
2409*912701f9SAndroid Build Coastguard Worker
2410*912701f9SAndroid Build Coastguard Worker    `\m{Some_marker}`
2411*912701f9SAndroid Build Coastguard Worker
    Emits the named marker. Also see [Markers](#markers).
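The four mapped-set steps described under "Insert a mapped set" can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation. The function name `apply_mapped_set` is hypothetical, the pattern is anchored at the end of the context to mimic the implicit end marker, and the `\u{0192}` escape from the XML example is written as the Python escape `\u0192`.

```python
import re

def apply_mapped_set(context, from_set, to_set):
    """Replace a trailing match of from_set with the same-position item of to_set."""
    from_items = from_set.split()
    to_items = to_set.split()
    if len(from_items) != len(to_items):
        raise ValueError("mapped sets must have the same number of items")

    # Step 1: match the end of the context against ((?:A|B|CC|...)).
    alternatives = "|".join(re.escape(item) for item in from_items)
    match = re.search("((?:" + alternatives + r"))\Z", context)
    if match is None:
        return context  # no match, context is left unchanged

    # Step 2: find the position of the matched substring in the from= set.
    position = from_items.index(match.group(1))

    # Steps 3 and 4: take the item at the same position in the to= set and substitute it.
    return context[:match.start(1)] + to_items[position]

upper = "A B CC D E  FF       G"
lower = "a b c  d e  \u0192 g"

mapped = apply_mapped_set("xCC", upper, lower)
# mapped is "xc": "CC" is the 3rd item of upper, and the 3rd item of lower is "c"
```

Looking up the matched substring in the tokenized set, rather than relying on the regex engine to report which alternative matched, follows the approach suggested in step 2.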
2413*912701f9SAndroid Build Coastguard Worker
2414*912701f9SAndroid Build Coastguard Worker* * *
2415*912701f9SAndroid Build Coastguard Worker
2416*912701f9SAndroid Build Coastguard Worker### Element: reorder
2417*912701f9SAndroid Build Coastguard Worker
2418*912701f9SAndroid Build Coastguard WorkerThe reorder transform consists of a [`<transformGroup>`](#element-transformgroup) element containing `<reorder>` elements.  Multiple such `<transformGroup>` elements may be contained in an enclosing `<transforms>` element.
2419*912701f9SAndroid Build Coastguard Worker
2420*912701f9SAndroid Build Coastguard WorkerOne or more [`<import>`](#element-import) elements are allowed to precede the `<reorder>` elements.
2421*912701f9SAndroid Build Coastguard Worker
This transform has the job of reordering sequences of characters that have been typed, from their typed order to the desired output order. The primary concern in this transform is to sort combining marks into their correct relative order after a base, as described in this section. Because reorder transforms can be quite complex, keyboard layouts will almost always import them.
2423*912701f9SAndroid Build Coastguard Worker
2424*912701f9SAndroid Build Coastguard WorkerThe reordering algorithm consists of four parts:
2425*912701f9SAndroid Build Coastguard Worker
2426*912701f9SAndroid Build Coastguard Worker1. Create a sort key for each character in the input string. A sort key has 4 parts (primary, index, tertiary, quaternary):
2427*912701f9SAndroid Build Coastguard Worker   * The **primary weight** is the primary order value.
2428*912701f9SAndroid Build Coastguard Worker   * The **secondary weight** is the index, a position in the input string, usually of the character itself, but it may be of a character earlier in the string.
2429*912701f9SAndroid Build Coastguard Worker   * The **tertiary weight** is a tertiary order value (defaulting to 0).
2430*912701f9SAndroid Build Coastguard Worker   * The **quaternary weight** is the index of the character in the string. This is solely to ensure a stable sort for sequences of characters with the same tertiary weight.
2. Mark each character as to whether it is a prebase character, that is, one that is typed before the base but logically stored after it. A prebase character therefore has a primary order > 0.
2432*912701f9SAndroid Build Coastguard Worker3. Use the sort key and the prebase mark to identify runs. A run starts with a prefix that contains any prebase characters and a single base character whose primary and tertiary key is 0. The run extends until, but not including, the start of the prefix of the next run or end of the string.
2433*912701f9SAndroid Build Coastguard Worker   * `run := preBase* (primary=0 && tertiary=0) ((primary≠0 || tertiary≠0) && !preBase)*`
4. Sort the characters in each run by their sort keys.
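Step 3 above, identifying runs, can be sketched in Python directly from the `run := …` grammar. The `Char` class and `split_runs` function are hypothetical names for this illustration. The sketch assumes the primary, tertiary, and prebase values have already been assigned by the `<reorder>` rules, and it simply attaches any leading characters that have no base to the first run.

```python
from dataclasses import dataclass

@dataclass
class Char:
    text: str
    primary: int = 0
    tertiary: int = 0
    pre_base: bool = False

def split_runs(chars):
    """Split characters into runs: preBase* base (non-base, non-preBase)*."""
    runs, current, seen_base = [], [], False
    for ch in chars:
        is_base = ch.primary == 0 and ch.tertiary == 0
        # Once a run has its base, a prebase character or another base starts the next run.
        if seen_base and (ch.pre_base or is_base):
            runs.append(current)
            current, seen_base = [], False
        current.append(ch)
        seen_base = seen_base or is_base
    if current:
        runs.append(current)
    return runs

chars = [
    Char("e", primary=5, pre_base=True),  # prebase prefix of the first run
    Char("k"),                            # base of the first run
    Char("m", primary=10),                # mark following the base
    Char("e", primary=5, pre_base=True),  # prebase prefix starts the second run
    Char("k"),                            # base of the second run
    Char("k"),                            # a further base starts a third run
]
run_texts = ["".join(c.text for c in run) for run in split_runs(chars)]
# run_texts is ["ekm", "ek", "k"]
```

Each run can then be sorted independently by sort key, as in step 4.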
2435*912701f9SAndroid Build Coastguard Worker
2436*912701f9SAndroid Build Coastguard WorkerThe primary order of a character with the Unicode property `Canonical_Combining_Class` (ccc) of 0 may well not be 0. In addition, a character may receive a different primary order dependent on context. For example, in the Devanagari sequence ka halant ka, the first ka would have a primary order 0 while the halant ka sequence would give both halant and the second ka a primary order > 0, for example 2. Note that “base” character in this discussion is not a Unicode base character. It is instead a character with primary=0.
2437*912701f9SAndroid Build Coastguard Worker
2438*912701f9SAndroid Build Coastguard WorkerIn order to get the characters into the correct relative order, it is necessary not only to order combining marks relative to the base character, but also to order some combining marks in a subsequence following another combining mark. For example in Devanagari, a nukta may follow a consonant character, but it may also follow a conjunct consisting of consonant, halant, consonant. Notice that the second consonant is not, in this model, the start of a new run because some characters may need to be reordered to before the first base, for example repha. The repha would get primary < 0, and be sorted before the character with order = 0, which is, in the case of Devanagari, the initial consonant of the orthographic syllable.
2439*912701f9SAndroid Build Coastguard Worker
2440*912701f9SAndroid Build Coastguard WorkerThe reorder transform consists of `<reorder>` elements encapsulated in a `<transformGroup>` element. Each element is a rule that matches against a string of characters with the action of setting the various ordering attributes (`primary`, `tertiary`, `tertiaryBase`, `preBase`) for the matched characters in the string.
2441*912701f9SAndroid Build Coastguard Worker
2442*912701f9SAndroid Build Coastguard WorkerThe relative ordering of `<reorder>` elements is not significant.
2443*912701f9SAndroid Build Coastguard Worker
2444*912701f9SAndroid Build Coastguard Worker**Syntax**
2445*912701f9SAndroid Build Coastguard Worker
2446*912701f9SAndroid Build Coastguard Worker```xml
2447*912701f9SAndroid Build Coastguard Worker<transformGroup>
2448*912701f9SAndroid Build Coastguard Worker    <!-- one or more <import/> elements are allowed at this point -->
2449*912701f9SAndroid Build Coastguard Worker    <reorder from="…combination of characters"
2450*912701f9SAndroid Build Coastguard Worker    before="…look-behind required match"
2451*912701f9SAndroid Build Coastguard Worker    order="…list of weights"
2452*912701f9SAndroid Build Coastguard Worker    tertiary="…list of weights"
2453*912701f9SAndroid Build Coastguard Worker    tertiaryBase="…list of true/false"
2454*912701f9SAndroid Build Coastguard Worker    preBase="…list of true/false" />
2455*912701f9SAndroid Build Coastguard Worker    <!-- other <reorder/> elements… -->
2456*912701f9SAndroid Build Coastguard Worker</transformGroup>
2457*912701f9SAndroid Build Coastguard Worker```
2458*912701f9SAndroid Build Coastguard Worker
2459*912701f9SAndroid Build Coastguard Worker> <small>
2460*912701f9SAndroid Build Coastguard Worker>
2461*912701f9SAndroid Build Coastguard Worker> Parents: [transformGroup](#element-transformgroup)
2462*912701f9SAndroid Build Coastguard Worker> Children: _none_
2463*912701f9SAndroid Build Coastguard Worker> Occurrence: optional, multiple
2464*912701f9SAndroid Build Coastguard Worker>
2465*912701f9SAndroid Build Coastguard Worker> </small>
2466*912701f9SAndroid Build Coastguard Worker
2467*912701f9SAndroid Build Coastguard Worker_Attribute:_ `from` (required)
2468*912701f9SAndroid Build Coastguard Worker
2469*912701f9SAndroid Build Coastguard Worker> This attribute value contains a string of elements. Each element matches one character and may consist of a codepoint or a UnicodeSet (both as defined in [UTS #35 Part One](tr35.md#Unicode_Sets)).
2470*912701f9SAndroid Build Coastguard Worker
2471*912701f9SAndroid Build Coastguard Worker_Attribute:_ `before`
2472*912701f9SAndroid Build Coastguard Worker
> This attribute value contains the element string that must match the string immediately preceding the start of the string that the `@from` attribute matches.
2474*912701f9SAndroid Build Coastguard Worker
2475*912701f9SAndroid Build Coastguard Worker_Attribute:_ `order`
2476*912701f9SAndroid Build Coastguard Worker
2477*912701f9SAndroid Build Coastguard Worker> This attribute value gives the primary order for the elements in the matched string in the `@from` attribute. The value is a simple integer between -128 and +127 inclusive, or a space separated list of such integers. For a single integer, it is applied to all the elements in the matched string. Details of such list type attributes are given after all the attributes are described. If missing, the order value of all the matched characters is 0. We consider the order value for a matched character in the string.
2478*912701f9SAndroid Build Coastguard Worker>
2479*912701f9SAndroid Build Coastguard Worker> * If the value is 0 and its tertiary value is 0, then the character is the base of a new run.
2480*912701f9SAndroid Build Coastguard Worker> * If the value is 0 and its tertiary value is non-zero, then it is a normal character in a run, with ordering semantics as described in the `@tertiary` attribute.
2481*912701f9SAndroid Build Coastguard Worker> * If the value is negative, then the character is a primary character and will reorder to be before the base of the run.
2482*912701f9SAndroid Build Coastguard Worker> * If the value is positive, then the character is a primary character and is sorted based on the order value as the primary key following a previous base character.
2483*912701f9SAndroid Build Coastguard Worker>
2484*912701f9SAndroid Build Coastguard Worker> A character with a zero tertiary value is a primary character and receives a sort key consisting of:
2485*912701f9SAndroid Build Coastguard Worker>
2486*912701f9SAndroid Build Coastguard Worker> * Primary weight is the order value
2487*912701f9SAndroid Build Coastguard Worker> * Secondary weight is the index of the character. This may be any value (character index, codepoint index) such that its value is greater than the character before it and less than the character after it.
2488*912701f9SAndroid Build Coastguard Worker> * Tertiary weight is 0.
2489*912701f9SAndroid Build Coastguard Worker> * Quaternary weight is the same as the secondary weight.
2490*912701f9SAndroid Build Coastguard Worker
2491*912701f9SAndroid Build Coastguard Worker_Attribute:_ `tertiary`
2492*912701f9SAndroid Build Coastguard Worker
2493*912701f9SAndroid Build Coastguard Worker> This attribute value gives the tertiary order value to the characters matched. The value is a simple integer between -128 and +127 inclusive, or a space separated list of such integers. If missing, the value for all the characters matched is 0. We consider the tertiary value for a matched character in the string.
2494*912701f9SAndroid Build Coastguard Worker>
2495*912701f9SAndroid Build Coastguard Worker> * If the value is 0 then the character is considered to have a primary order as specified in its order value and is a primary character.
> * If the value is nonzero, then the order value must be zero; otherwise it is an error. The character is considered as a tertiary character for the purposes of ordering.
2497*912701f9SAndroid Build Coastguard Worker>
2498*912701f9SAndroid Build Coastguard Worker> A tertiary character receives its primary order and index from a previous character, which it is intended to sort closely after. The sort key for a tertiary character consists of:
2499*912701f9SAndroid Build Coastguard Worker>
> * Primary weight is the primary weight of the primary character.
> * Secondary weight is the index of the primary character, not the tertiary character.
2502*912701f9SAndroid Build Coastguard Worker> * Tertiary weight is the tertiary value for the character.
2503*912701f9SAndroid Build Coastguard Worker> * Quaternary weight is the index of the tertiary character.
2504*912701f9SAndroid Build Coastguard Worker
2505*912701f9SAndroid Build Coastguard Worker_Attribute:_ `tertiaryBase`
2506*912701f9SAndroid Build Coastguard Worker
2507*912701f9SAndroid Build Coastguard Worker> This attribute value is a space separated list of `"true"` or `"false"` values corresponding to each character matched. It is illegal for a tertiary character to have a true `tertiaryBase` value. For a primary character it marks that this character may have tertiary characters moved after it. When calculating the secondary weight for a tertiary character, the most recently encountered primary character with a true `tertiaryBase` attribute value is used. Primary characters with an `@order` value of 0 automatically are treated as having `tertiaryBase` true regardless of what is specified for them.
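The sort keys for primary and tertiary characters described in the `order`, `tertiary`, and `tertiaryBase` attributes can be sketched in Python as follows. The `Entry` class and `sort_keys` function are hypothetical names. As an assumption of this sketch only, a tertiary character that appears before any tertiary base borrows order 0 and index 0.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    order: int = 0
    tertiary: int = 0
    tertiary_base: bool = False

def sort_keys(entries):
    """Build (primary, secondary, tertiary, quaternary) keys for one run."""
    keys = []
    base_order, base_index = 0, 0  # assumption: fallback when no base has been seen yet
    for index, entry in enumerate(entries):
        if entry.tertiary == 0:
            # Primary character: its own order and index; quaternary equals secondary.
            keys.append((entry.order, index, 0, index))
            # Order-0 characters are always treated as tertiary bases.
            if entry.order == 0 or entry.tertiary_base:
                base_order, base_index = entry.order, index
        else:
            if entry.order != 0:
                raise ValueError("a tertiary character must have order 0")
            # Tertiary character: borrows order and index from the latest tertiary base.
            keys.append((base_order, base_index, entry.tertiary, index))
    return keys

entries = [
    Entry(),                              # base of the run
    Entry(order=10, tertiary_base=True),  # primary character that accepts tertiaries
    Entry(order=20),                      # another primary character
    Entry(tertiary=1),                    # tertiary character, typed last
]
keys = sort_keys(entries)
sorted_indices = sorted(range(len(entries)), key=lambda i: keys[i])
# sorted_indices is [0, 1, 3, 2]: the tertiary character moves to just after its base
```

Because the tertiary character shares the primary and secondary weights of its base, it sorts immediately after that base and ahead of later primary characters with a higher order.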
2508*912701f9SAndroid Build Coastguard Worker
2509*912701f9SAndroid Build Coastguard Worker_Attribute:_ `preBase`
2510*912701f9SAndroid Build Coastguard Worker
> This attribute value gives the prebase attribute for each character matched. The value may be `"true"` or `"false"` or a space separated list of such values. If missing, the value for all the characters matched is false. It is illegal for a tertiary character to have a true prebase value.
2512*912701f9SAndroid Build Coastguard Worker>
> If a primary character has a true prebase value then the character is marked as being typed before the base character of a run, even though it is intended to be stored after it. The primary order gives the position, after the base character, where the prebase character will end up. Thus `@order` shall not be 0. These characters are part of the run prefix. If such characters are typed then, in order to give the run a base character after which characters can be sorted, an appropriate base character, such as a dotted circle, is inserted into the output run, until a real base character has been typed. A value of `"false"` indicates that the character is not a prebase.
2514*912701f9SAndroid Build Coastguard Worker
2515*912701f9SAndroid Build Coastguard WorkerFor `@from` attribute values with a match string length greater than 1, the sort key information (`@order`, `@tertiary`, `@tertiaryBase`, `@preBase`) may consist of a space-separated list of values, one for each element matched. The last value is repeated to fill out any missing values. Such a list may not contain more values than there are elements in the `@from` attribute:
2516*912701f9SAndroid Build Coastguard Worker
2517*912701f9SAndroid Build Coastguard Worker```java
2518*912701f9SAndroid Build Coastguard Workerif len(@from) < len(@list) then error
2519*912701f9SAndroid Build Coastguard Workerelse
2520*912701f9SAndroid Build Coastguard Worker    while len(@from) > len(@list)
2521*912701f9SAndroid Build Coastguard Worker        append lastitem(@list) to @list
2522*912701f9SAndroid Build Coastguard Worker    endwhile
2523*912701f9SAndroid Build Coastguard Workerendif
2524*912701f9SAndroid Build Coastguard Worker```
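The pseudocode above could be written in Python roughly as follows. The function name `expand_list` and its `default` parameter are assumptions of this sketch. The default stands in for the per-attribute value used when the attribute is missing (for example `"0"` for `@order` or `"false"` for `@preBase`).

```python
def expand_list(value, element_count, default):
    """Expand a space-separated attribute value to one item per matched element."""
    items = value.split() if value is not None else []
    if not items:
        # Attribute missing (or empty): every element gets the default value.
        return [default] * element_count
    if len(items) > element_count:
        raise ValueError("more values than elements in @from")
    # Repeat the last value to fill out any missing values.
    return items + [items[-1]] * (element_count - len(items))

full_list = expand_list("10 55 10", 3, "0")
padded_list = expand_list("10", 2, "0")
missing_list = expand_list(None, 2, "0")
```

Values are kept as strings here so that the same helper serves both the integer lists and the true/false lists.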
2525*912701f9SAndroid Build Coastguard Worker
**Example**

For example, consider the Northern Thai (`nod-Lana`, Tai Tham script) word: ᨡ᩠ᩅᩫ᩶ 'roasted'. This is ideally encoded as the following:

| name | _kha_ | _sakot_ | _wa_ | _o_  | _t2_ |
|------|-------|---------|------|------|------|
| code | 1A21  | 1A60    | 1A45 | 1A6B | 1A76 |
| ccc  | 0     | 9       | 0    | 0    | 230  |

(That sequence is already in NFC format.)

Some users may type the upper component of the vowel first, and the tone before or after the lower component. Thus someone might type it as:

| name | _kha_ | _o_  | _t2_ | _sakot_ | _wa_ |
|------|-------|------|------|---------|------|
| code | 1A21  | 1A6B | 1A76 | 1A60    | 1A45 |
| ccc  | 0     | 0    | 230  | 9       | 0    |

The Unicode NFC format of that typed value reorders to:

| name | _kha_ | _o_  | _sakot_ | _t2_ | _wa_ |
|------|-------|------|---------|------|------|
| code | 1A21  | 1A6B | 1A60    | 1A76 | 1A45 |
| ccc  | 0     | 0    | 9       | 230  | 0    |

Finally, the user might also type in the sequence with the tone _after_ the lower component.

| name | _kha_ | _o_  | _sakot_ | _wa_ | _t2_ |
|------|-------|------|---------|------|------|
| code | 1A21  | 1A6B | 1A60    | 1A45 | 1A76 |
| ccc  | 0     | 0    | 9       | 0    | 230  |

(That sequence is already in NFC format.)
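
The NFC behaviour shown in the tables above can be checked with a short script. This is only an illustrative sketch using Python's standard `unicodedata` module; the constant and helper names (`IDEAL`, `TYPED`, `TONE_LAST`, `code_points`, `ccc_values`) are made up for this example and are not part of LDML.

```python
import unicodedata

# The sequences from the tables above (names are illustrative only).
IDEAL = "\u1A21\u1A60\u1A45\u1A6B\u1A76"      # kha sakot wa o t2
TYPED = "\u1A21\u1A6B\u1A76\u1A60\u1A45"      # kha o t2 sakot wa
TONE_LAST = "\u1A21\u1A6B\u1A60\u1A45\u1A76"  # kha o sakot wa t2


def code_points(text):
    """Return the code points of text as upper-case hex strings."""
    return [f"{ord(ch):04X}" for ch in text]


def ccc_values(text):
    """Return the canonical combining class of each character."""
    return [unicodedata.combining(ch) for ch in text]


# NFC moves the sakot (ccc 9) in front of the tone mark (ccc 230),
# because neither is a starter and 9 sorts before 230.
typed_nfc = unicodedata.normalize("NFC", TYPED)
print(code_points(typed_nfc))
print(ccc_values(typed_nfc))
```

Normalization alone never produces the ideal sequence from the other inputs, because _o_ and _wa_ both have ccc 0 and therefore block canonical reordering. That is why keyboard-level `<reorder>` rules are needed.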

We want all of these sequences to end up ordered as the first. To do this, we use the following rules:

```xml
<reorder from="\u{1A60}" order="127" />      <!-- max possible order -->
<reorder from="\u{1A6B}" order="42" />
<reorder from="[\u{1A75}-\u{1A79}]" order="55" />
<reorder before="\u{1A6B}" from="\u{1A60}\u{1A45}" order="10" />
<reorder before="\u{1A6B}[\u{1A75}-\u{1A79}]" from="\u{1A60}\u{1A45}" order="10" />
<reorder before="\u{1A6B}" from="\u{1A60}[\u{1A75}-\u{1A79}]\u{1A45}" order="10 55 10" />
```

The first reorder is the default ordering for the _sakot_, which allows it to be placed anywhere in a sequence but moves any non-consonants that immediately follow it back before it in the sequence. The next two rules give the orders for the top vowel component and the tone marks respectively. The last three rules give the _sakot_ and _wa_ characters a primary order that places them before the _o_. Notice particularly the final reorder rule, where the _sakot_+_wa_ is split by the tone mark. This rule is necessary in case someone types into the middle of previously normalized text.

`<reorder>` elements are priority ordered based first on the length of the string their `@from` attribute value matches, and then on the sum of the lengths of the strings their `@before` attribute value matches.
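
The effect of the six rules above can be approximated with a small script. This is a deliberately simplified sketch, not the normative algorithm. It assumes that longer matches take priority, that a short `order` list repeats its last value, that unmatched characters have order 0 and start a new run, and that each run is stably sorted by order. Tertiary orders, `preBase`, and markers are ignored. All names (`RULES`, `best_match`, `assign_orders`, `reorder`) are hypothetical.

```python
import re

TONES = "[\u1A75-\u1A79]"

# (before, from, orders), mirroring the six <reorder> rules above.
RULES = [
    ("", "\u1A60", [127]),
    ("", "\u1A6B", [42]),
    ("", TONES, [55]),
    ("\u1A6B", "\u1A60\u1A45", [10]),
    ("\u1A6B" + TONES, "\u1A60\u1A45", [10]),
    ("\u1A6B", "\u1A60" + TONES + "\u1A45", [10, 55, 10]),
]


def best_match(text, pos):
    """Return (from_length, orders) for the highest-priority rule at pos."""
    best = None
    for before, frm, orders in RULES:
        from_match = re.compile(frm).match(text, pos)
        if not from_match:
            continue
        # The text preceding pos must end with the `before` pattern.
        before_match = re.search("(?:" + before + r")\Z", text[:pos])
        if before_match is None:
            continue
        # Assumed priority: longer `from` first, then longer `before`.
        key = (len(from_match.group()), len(before_match.group()))
        if best is None or key > best[0]:
            best = (key, orders)
    return None if best is None else (best[0][0], best[1])


def assign_orders(text):
    """Assign one order value to every character of text."""
    orders = []
    pos = 0
    while pos < len(text):
        hit = best_match(text, pos)
        if hit is None:
            orders.append(0)  # unmatched characters are treated as bases
            pos += 1
            continue
        length, values = hit
        for i in range(length):
            # Assumption: a short order list repeats its last value.
            orders.append(values[min(i, len(values) - 1)])
        pos += length
    return orders


def reorder(text):
    """Split text into runs at order-0 characters and sort each run."""
    runs, current = [], []
    for ch, order in zip(text, assign_orders(text)):
        if order == 0 and current:
            runs.append(current)
            current = []
        current.append((order, ch))
    if current:
        runs.append(current)
    # sorted() is stable, so characters with equal order keep typed order.
    return "".join(
        ch for run in runs for _, ch in sorted(run, key=lambda pair: pair[0])
    )
```

Under these assumptions, each of the typed variants in the tables above comes out as the ideal sequence _kha_ _sakot_ _wa_ _o_ _t2_, and the ideal sequence itself is left unchanged.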

#### Using `<import>` with `<reorder>` elements

This section describes the impact of using [`import`](#element-import) elements with `<reorder>` elements.

The `@from` string in a `<reorder>` element describes a set of strings that it matches. This also holds for the `@before` attribute. The **intersection** of any two `<reorder>` elements consists of the intersections of their `@from` and `@before` string sets. Tooling should warn users if the intersection between any two `<reorder>` elements in the same `<transformGroup>` element is non-empty prior to processing imports.

If two `<reorder>` elements have a non-empty intersection, then they are split and merged. They are split such that where there were two `<reorder>` elements, there are, in effect (but not in actuality), three elements consisting of:

* `@from`, `@before` that match the intersection of the two rules. The other attribute values are merged, as described below.
* `@from`, `@before` that match the set of strings in the first rule not in the intersection, with the other attribute values from the first rule.
* `@from`, `@before` that match the set of strings in the second rule not in the intersection, with the other attribute values from the second rule.

When merging the other attributes, the second rule is taken to have priority (being an override of the earlier element). Where the second rule does not define the value for a character but the first does, the value is taken from the first rule, otherwise it is taken from the second rule.

Notice that it is possible for two rules to match the same string, but for them not to merge because the distribution of the string across `@before` and `@from` is different. For example, the following would not merge:

```xml
<reorder before="ab" from="cd" />
<reorder before="a" from="bcd" />
```

After `<reorder>` elements merge, the resulting `reorder` elements are sorted into priority order for matching.

Consider this fragment from a shared reordering for the Myanmar script:

```xml
<!-- File: "myanmar-reordering.xml" -->
<transformGroup>
    <!-- medial-r -->
    <reorder from="\u{103C}" order="20" />

    <!-- [medial-wa or shan-medial-wa] -->
    <reorder from="[\u{103D}\u{1082}]" order="25" />

    <!-- [medial-ha or shan-medial-wa]+asat = Mon asat -->
    <reorder from="[\u{103E}\u{1082}]\u{103A}" order="27" />

    <!-- [medial-ha or mon-medial-wa] -->
    <reorder from="[\u{103E}\u{1060}]" order="27" />

    <!-- [e-vowel (U+1031) or shan-e-vowel (U+1084)] -->
    <reorder from="[\u{1031}\u{1084}]" order="30" />

    <reorder from="[\u{102D}\u{102E}\u{1033}-\u{1035}\u{1071}-\u{1074}\u{1085}\u{109D}\u{A9E5}]" order="35" />
</transformGroup>
```

A particular Myanmar keyboard layout can have these `reorder` elements:

```xml
<transformGroup>
    <import path="myanmar-reordering.xml"/> <!-- import the above transformGroup -->
    <!-- Kinzi -->
    <reorder from="\u{1004}\u{103A}\u{1039}" order="-1" />

    <!-- e-vowel -->
    <reorder from="\u{1031}" preBase="1" />

    <!-- medial-r -->
    <reorder from="\u{103C}" preBase="1" />
</transformGroup>
```

The effect of this is that the _e-vowel_ will be identified as a prebase and will have an order of 30. Likewise, a _medial-r_ will be identified as a prebase and will have an order of 20. Notice that a _shan-e-vowel_ (`\u{1084}`) will not be identified as a prebase (even if it should be!). The _kinzi_ is described in the layout since it moves something across a run boundary. By separating such movements (prebase or moving to in front of a base) from the shared ordering rules, the shared ordering rules become a self-contained combining order description that can be used in other keyboards or even in contexts other than keyboarding.
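
The split-and-merge behaviour described in this section can be sketched as set arithmetic. This is a rough illustration only. It assumes each rule can be modelled as a finite set of `(before, from)` string pairs plus a dictionary of its other attributes, and it merges at attribute level rather than per character. The function and variable names are hypothetical.

```python
def split_and_merge(first, second):
    """Split two rules on their intersection; `second` overrides `first`."""
    common = first["matches"] & second["matches"]
    if not common:
        # No shared (before, from) pair: the rules do not merge.
        return [first, second]
    # Intersection: attributes of both rules, with the later rule winning.
    result = [
        {"matches": common, "attrs": {**first["attrs"], **second["attrs"]}}
    ]
    # Remainders keep the attributes of the rule they came from.
    for rule in (first, second):
        rest = rule["matches"] - common
        if rest:
            result.append({"matches": rest, "attrs": dict(rule["attrs"])})
    return result


# Shared rule: [e-vowel or shan-e-vowel] with order 30.
shared_e_vowel = {
    "matches": {("", "\u1031"), ("", "\u1084")},
    "attrs": {"order": "30"},
}
# Layout rule: e-vowel is a prebase.
layout_e_vowel = {"matches": {("", "\u1031")}, "attrs": {"preBase": "1"}}

merged = split_and_merge(shared_e_vowel, layout_e_vowel)
```

Here `merged` holds one effective rule for U+1031 carrying both `order="30"` and `preBase="1"`, and one for U+1084 carrying only `order="30"`, which matches the outcome described above. Because the pairs `("ab", "cd")` and `("a", "bcd")` differ, the two non-merging rules shown earlier are returned unchanged.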

#### Example Post-reorder transforms

It may be desired to perform additional processing following reorder operations. This may be accomplished by adding an additional `<transformGroup>` element after the group containing `<reorder>` elements.

First, a partial example from Khmer where split vowels are combined after reordering.

```xml
<transformGroup>
    <reorder … />
    <reorder … />
    <reorder … />
</transformGroup>
<transformGroup>
    <transform from="\u{17C1}\u{17B8}" to="\u{17BE}" />
    <transform from="\u{17C1}\u{17B6}" to="\u{17C4}" />
</transformGroup>
```

Another partial example allows a keyboard implementation to prevent people typing two lower vowels in a Burmese cluster:

```xml
<transformGroup>
    <reorder … />
    <reorder … />
    <reorder … />
</transformGroup>
<transformGroup>
    <transform from="[\u{102F}\u{1030}\u{1048}\u{1059}][\u{102F}\u{1030}\u{1048}\u{1059}]" />
</transformGroup>
```

#### Reorder and Markers

Markers are not matched by `reorder` elements. However, if a character preceded by one or more markers is reordered due to a `reorder` element, those markers will be reordered with the character, maintaining the same relative order. This is a similar process to the algorithm used to normalize strings processed by `transform` elements.

Keyboard implementations must process `reorder` elements using the following algorithm.

Note that steps 1 and 3 are identical to the steps used for normalization using markers in the [Marker Algorithm Overview](#marker-algorithm-overview).

Given an input string from context or from a previous `transformGroup`:

1. Parsing/Removing Markers

2. Perform reordering (as in this section)

3. Re-Adding Markers
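
The three steps above can be sketched as follows. This is a minimal illustration rather than the normative marker algorithm. It assumes the input is already tokenized into a list in which markers are strings beginning with `\m{`, and that the reordering step is supplied as a callback returning a permutation of character indices. The names `is_marker`, `reorder_with_markers`, and `reorder_indices` are hypothetical.

```python
def is_marker(token):
    """Assumed token convention: markers are strings such as '\\m{x}'."""
    return token.startswith("\\m{")


def reorder_with_markers(tokens, reorder_indices):
    # Step 1: parse and remove markers, remembering which character each
    # run of markers immediately precedes.
    units, pending = [], []
    for token in tokens:
        if is_marker(token):
            pending.append(token)
        else:
            units.append((pending, token))
            pending = []
    trailing = pending  # markers with no following character stay last

    # Step 2: reorder the marker-free characters. The callback receives the
    # characters and returns the new order as a list of indices.
    permutation = reorder_indices([ch for _, ch in units])

    # Step 3: re-add each run of markers in front of its character.
    result = []
    for index in permutation:
        markers, ch = units[index]
        result.extend(markers)
        result.append(ch)
    return result + trailing
```

For example, with tokens `["a", "\\m{x}", "\\m{y}", "b", "c"]` and a callback that swaps the last two characters, both markers travel with `b` and keep their relative order.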

* * *

### Backspace Transforms

The `<transforms type="backspace">` element describes an optional transform that is not applied on input of normal characters, but is only used to perform extra backspace modifications to previously committed text.

When the backspace key is pressed, the `<transforms type="backspace">` element (if present) is processed, and then the `<transforms type="simple">` element (if present) as with any other key.

Keyboarding applications typically work in one of two modes, although they are not required to:

**_text entry_**

> Text entry happens while a user is typing new text. A user typically wants the backspace key to undo whatever they last typed, whether or not they typed things in the 'right' order.

**_text editing_**

> Text editing happens when a user moves the cursor into some previously entered text which may have been entered by someone else. As such, there is no way to know in which order things were typed, but a user will still want appropriate behaviour when they press backspace. This may involve deleting more than one character or replacing a sequence of characters with a different sequence.

In text editing mode, different keyboard layouts may behave differently in the same textual context. The backspace transform allows the keyboard layout to specify the effect of pressing backspace in a particular textual context. This is done by specifying a set of backspace rules that match a string before the cursor and replace it with another string. The rules are expressed within a `<transforms type="backspace">` element.

```xml
<transforms type="backspace">
    <transformGroup>
        <transform from="…match pattern" to="…output pattern" />
    </transformGroup>
</transforms>
```

**Example**

For example, consider deleting a Devanagari ksha क्श:

While this character is made up of three codepoints, the following rule causes all three to be deleted by a single press of the backspace.

```xml
<transforms type="backspace">
    <transformGroup>
        <transform from="\u{0915}\u{094D}\u{0936}"/>
    </transformGroup>
</transforms>
```

Note that the optional attribute `@to` is omitted, since the whole string is being deleted. This is not uncommon in backspace transforms.

A more complex example comes from a Burmese visually ordered keyboard:
```xml
<transforms type="backspace">
    <transformGroup>
        <!-- Kinzi -->
        <transform from="[\u{1004}\u{101B}\u{105A}]\u{103A}\u{1039}" />

        <!-- subjoined consonant -->
        <transform from="\u{1039}[\u{1000}-\u{101C}\u{101E}\u{1020}\u{1021}\u{1050}\u{1051}\u{105A}-\u{105D}]" />

        <!-- tone mark -->
        <transform from="\u{102B}\u{103A}" />

        <!-- Handle prebases -->
        <!-- diacritics stored before e-vowel -->
        <transform from="[\u{103A}-\u{103F}\u{105E}-\u{1060}\u{1082}]\u{1031}" to="\u{1031}" />

        <!-- diacritics stored before medial r -->
        <transform from="[\u{103A}-\u{103B}\u{105E}-\u{105F}]\u{103C}" to="\u{103C}" />

        <!-- subjoined consonant before e-vowel -->
        <transform from="\u{1039}[\u{1000}-\u{101C}\u{101E}\u{1020}\u{1021}]\u{1031}" to="\u{1031}" />

        <!-- base consonant before e-vowel -->
        <transform from="[\u{1000}-\u{102A}\u{103F}-\u{1049}\u{104E}]\u{1031}" to="\m{prebase}\u{1031}" />

        <!-- subjoined consonant before medial r -->
        <transform from="\u{1039}[\u{1000}-\u{101C}\u{101E}\u{1020}\u{1021}]\u{103C}" to="\u{103C}" />

        <!-- base consonant before medial r -->
        <transform from="[\u{1000}-\u{102A}\u{103F}-\u{1049}\u{104E}]\u{103C}" to="\m{prebase}\u{103C}" />

        <!-- delete lone medial r or e-vowel -->
        <transform from="\m{prebase}[\u{1031}\u{103C}]" />
    </transformGroup>
</transforms>
```

The above example is simplified and doesn't fully handle the interaction between medial-r and e-vowel.

> The character `\m{prebase}` does not represent a literal character, but is instead a special marker, used as a "filler string". When a keyboard implementation handles a user pressing a key that inserts a prebase character, it also has to insert a special filler string before the prebase to ensure that the prebase character does not combine with the previous cluster. See the reorder transform for details. See [markers](#markers) for the `\m` syntax.

The first three transforms above delete various ligatures with a single keypress. The other transforms handle prebase characters, of which there are two in this Burmese keyboard. The transforms delete the characters preceding the prebase character up to the base, which is replaced with the prebase filler string, representing a null base. Finally, the prebase filler string + prebase is deleted as a unit.

If no specified transform among all `transformGroup`s under the `<transforms type="backspace">` element matches, a default will be used instead: an implied final transform that simply deletes the codepoint at the end of the input context. This implied transform is effectively similar to the code sample below, even though the `*` operator is not actually allowed in `from=`. See the documentation for *Match a single Unicode codepoint* under [transform syntax](#regex-like-syntax) and [markers](#markers), above.

It is important that implementations do not by default delete more than one non-marker codepoint at a time, except in the case of emoji clusters. Note that implementations will vary in the emoji handling due to the iterative nature of successive Unicode releases. See [UTS#51 §2.4.2: Emoji Modifiers in Text](https://www.unicode.org/reports/tr51/#Emoji_Modifiers_in_Text).

```xml
<transforms type="backspace">
    <!-- Other explicit transforms -->

    <!-- Final implicit backspace transform: Delete the final codepoint. -->
    <transformGroup>
        <!-- (?:\m{.})*  - matches any number of contiguous markers -->
        <transform from="(?:\m{.})*.(?:\m{.})*" /> <!-- deletes any number of markers directly on either side of the final pre-caret codepoint -->
    </transformGroup>
</transforms>
```
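
The backspace behaviour described in this section (explicit rules first, then the implied default) can be sketched as below. This is a simplified model, not a conforming implementation. Rules are tried in order against the end of the context, the `\m{prebase}` marker is stood in for by a private-use character, and the default does not strip adjacent markers or treat emoji clusters specially. The names `PREBASE`, `RULES`, and `backspace` are hypothetical.

```python
import re

# Hypothetical stand-in for the \m{prebase} marker (an assumption made for
# this sketch; real implementations represent markers differently).
PREBASE = "\uE000"

# (from, to) pairs adapted from the examples in this section.
RULES = [
    # Devanagari conjunct: delete all three code points at once.
    ("\u0915\u094D\u0936", ""),
    # Burmese base consonant before e-vowel: replace base with the filler.
    ("[\u1000-\u102A\u103F-\u1049\u104E]\u1031", PREBASE + "\u1031"),
    # Delete a lone prebase filler + medial r or e-vowel as a unit.
    (PREBASE + "[\u1031\u103C]", ""),
]


def backspace(context, rules=RULES):
    """Return the context after one press of the backspace key."""
    for pattern, replacement in rules:
        # Anchor the rule at the end of the text before the cursor.
        match = re.search("(?:" + pattern + r")\Z", context)
        if match:
            return context[:match.start()] + replacement
    # Implied default: delete exactly one code point. Python strings are
    # indexed by code point, so slicing removes a single code point.
    return context[:-1]
```

With these rules, one press removes the whole Devanagari conjunct. Two presses on a Burmese base consonant followed by e-vowel first replace the base with the filler and then remove the filler and the vowel together.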

* * *

## Invariants

Beyond what the DTD imposes, certain other restrictions are imposed on the data.
Please note the constraints given under each element section above.
DTD validation alone is not sufficient to verify a keyboard file.

* * *

## Keyboard IDs

There is a set of subtags that help identify the keyboards. Each of these is used after the `"t-k0"` subtags. The first tag appended is a mandatory platform tag, followed by zero or more tags that help differentiate the keyboard from others with the same locale code.

### Principles for Keyboard IDs

The following are the design principles for the IDs.

1. BCP47 compliant.
   1. E.g., `en`, `sr-Cyrl`, or `en-t-k0-extended`.
2. Use the minimal language id based on `likelySubtags` (see [Part 1: Likely Subtags](tr35.md#Likely_Subtags))
   1. E.g., instead of `fa-Arab`, use `fa`.
   2. The data is in <https://github.com/unicode-org/cldr/blob/main/common/supplemental/likelySubtags.xml>
3. Keyboard files should be platform-independent; however, if included, a platform id is the first subtag after `-t-k0-`. If a keyboard on the platform changes over time, both are dated, e.g. `bg-t-k0-chromeos-2011`. When selecting, if there is no date, it means the latest one.
4. Only keyboards that differ from the "standard for each language" are tagged. That is, for each language on a platform, there will be a keyboard with no subtags. Subtags with common semantics across languages and platforms are used, such as `-extended`, `-phonetic`, `-qwerty`, `-qwertz`, `-azerty`, …
5. In order to fit within 8 letters, abbreviations are reused that are already in [bcp47](https://github.com/unicode-org/cldr/blob/main/common/bcp47/) -u/-t extensions and in [language-subtag-registry](https://www.iana.org/assignments/language-subtag-registry) variants, e.g. for Traditional use `-trad` or `-traditio` (both exist in [bcp47](https://github.com/unicode-org/cldr/blob/main/common/bcp47/)).
6. Multiple languages cannot be indicated in the locale id, so the predominant target is used.
   1. For Finnish + Sami, use `fi-t-k0-smi` or `extended-smi`
   2. The [`<locales>`](#element-locales) element may be used to identify additional languages.
7. In some cases, there are multiple subtags, like `en-US-t-k0-chromeos-intl-altgr.xml`
8. Otherwise, platform names are used as a guide.

**Examples**

```xml
<!-- Serbian Latin -->
<keyboard3 locale="sr-Latn"/>
```

```xml
<!-- Serbian Cyrillic -->
<keyboard3 locale="sr-Cyrl"/>
```

```xml
<!-- Pan Nigerian Keyboard-->
<keyboard3 locale="mul-Latn-NG-t-k0-panng">
    <locales>
        <locale id="ha"/>
        <locale id="ig"/>
        <!-- others … -->
    </locales>
</keyboard3>
```

```xml
<!-- Finnish Keyboard including Skolt Sami -->
<keyboard3 locale="fi-t-k0-smi">
    <locales>
        <locale id="sms"/>
    </locales>
</keyboard3>
```
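
To illustrate the ID structure described in the principles above, the following sketch splits an ID into its locale part and the subtags that follow `-t-k0-`. It is not a full BCP 47 parser. It assumes `k0` directly follows the `t` singleton and simply strips an optional `.xml` suffix. The function name `split_keyboard_id` is hypothetical.

```python
def split_keyboard_id(keyboard_id):
    """Return (locale, subtags), where subtags are those after `-t-k0-`."""
    # Accept file names such as the one shown in principle 7.
    name = keyboard_id[:-4] if keyboard_id.endswith(".xml") else keyboard_id
    parts = name.split("-")
    for i in range(len(parts) - 1):
        # Assumption: the k0 key immediately follows the t singleton.
        if parts[i].lower() == "t" and parts[i + 1].lower() == "k0":
            return "-".join(parts[:i]), parts[i + 2:]
    # No -t-k0- extension: the whole ID is the locale.
    return name, []
```

For `bg-t-k0-chromeos-2011` this yields the locale `bg` with the subtags `chromeos` and `2011`, while `sr-Cyrl` yields no subtags at all.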

* * *

## Platform Behaviors in Edge Cases

| Platform | No modifier combination match is available | No map match is available for key position | Transform fails (i.e. if \^d is pressed when that transform does not exist) |
|----------|--------------------------------------------|--------------------------------------------|---------------------------------------------------------------------------|
| Chrome OS | Fall back to base | Fall back to character in a keyMap with same "level" of modifier combination. If this character does not exist, fall back to (n-1) level. (This is handled data-generation-side.) <br/> In the specification: No output | No output at all |
| Mac OS X  | Fall back to base (unless combination is some sort of keyboard shortcut, e.g. cmd-c) | No output | Both keys are output separately |
| Windows  | No output | No output | Both keys are output separately |

* * *

Copyright © 2001–2024 Unicode, Inc. All Rights Reserved. The Unicode Consortium makes no expressed or implied warranty of any kind, and assumes no liability for errors or omissions. No liability is assumed for incidental and consequential damages in connection with or arising out of the use of the information or programs contained or accompanying this technical report. The Unicode [Terms of Use](https://www.unicode.org/copyright.html) apply.

Unicode and the Unicode logo are trademarks of Unicode, Inc., and are registered in some jurisdictions.

[keyboard-workgroup]: https://cldr.unicode.org/index/keyboard-workgroup