<!--
© 2019 and later: Unicode, Inc. and others.
License & terms of use: http://www.unicode.org/copyright.html
-->

# Basic instructions for running the LdmlConverter via Maven

> Note: While this document provides useful background information about the
  LdmlConverter, the actual complete process for integrating CLDR data to ICU
  is described in the document `../../../docs/processes/cldr-icu.md` which is
  best viewed as
  [CLDR-ICU integration](https://unicode-org.github.io/icu/processes/cldr-icu.html)

## Requirements

* A CLDR release for supplying CLDR data and the CLDR API.
* The Maven build tool
* The Ant build tool (using JDK 11+)

## Important directories

| Directory       | Description |
|-----------------|-------------|
| `TOOLS_ROOT`    | Path to the root of the ICU tools directory, below which are (e.g.) the `cldr/` and `unicodetools/` directories. |
| `CLDR_DIR`      | Path to the root of the standard CLDR sources, below which are the `common/` and `tools/` directories. |
| `CLDR_DATA_DIR` | The top-level directory for the CLDR production data (typically the "production" directory in the staging repository). Usually generated locally or obtained from: https://github.com/unicode-org/cldr-staging/tree/main/production |

On POSIX systems, it's best to set these as exported shell variables, and any
following instructions assume they have been set accordingly:

```
$ export TOOLS_ROOT=/path/to/icu/tools
$ export CLDR_DIR=/path/to/cldr
$ export CLDR_DATA_DIR=/path/to/cldr-staging/production
```

Note that you should not attempt to use data from the CLDR project directory
(where the CLDR API code exists) for conversion into ICU data. The process now
relies on a pre-processing step, and the CLDR data must come from the separate
"staging" repository (i.e. https://github.com/unicode-org/cldr-staging) or be
pre-processed locally into a different directory.


## Initial Setup

This project relies on the Maven build tool for managing dependencies and uses
Ant for configuration purposes, so both will need to be installed.
On a Debian-based system, this should be as simple as:

```
$ sudo apt-get install maven ant
```

You must also install an additional CLDR JAR file into the local Maven repository at
`$TOOLS_ROOT/cldr/lib` (see the `README.txt` in that directory for more
information).

```
$ cd "$TOOLS_ROOT/cldr/lib"
$ ./install-cldr-jars.sh "$CLDR_DIR"
```

## Generating all ICU data and source code

```
$ cd "$TOOLS_ROOT/cldr/cldr-to-icu"
$ ant -f build-icu-data.xml
```

## Other Examples

* Outputting a subset of the supplemental data into a specified directory:
  ```
  $ ant -f build-icu-data.xml -DoutDir=/tmp/cldr -DoutputTypes=plurals,dayPeriods -DdontGenCode=true
  ```
  Note: Output types can be listed with mixedCase, lower_underscore or UPPER_UNDERSCORE.
  Pass `-DoutputTypes=help` to see the full list.


* Outputting only a subset of locale IDs (and all the supplemental data):
  ```
  $ ant -f build-icu-data.xml -DoutDir=/tmp/cldr -DlocaleIdFilter='(zh|yue).*' -DdontGenCode=true
  ```

* Overriding the default CLDR version string (which normally matches the CLDR library code):
  ```
  $ ant -f build-icu-data.xml -DcldrVersion="36.1"
  ```

* Using alternate CLDR values (ex: use `alt="ascii"` values from the CLDR XML):

  First, edit the `build-icu-data.xml` file where it mentions `ALTERNATE VALUES`
  with the correctly annotated source path, target path, and locales list:
  ```diff
  @@ -384,6 +399,20 @@
           <!-- ALTERNATE VALUES -->

           <!-- The following elements configure alternate values for some special case paths.
                The target path will only be replaced if both it, and the source path, exist in
                the CLDR data (paths will not be modified if only the source path exists).

                Since the paths must represent the same semantic type of data, they must be in the
                same "namespace" (same element names) and must not contain value attributes. Thus
                they can only differ by distinguishing attributes (either added or modified).

                This feature is typically used to select alternate translations (e.g. short forms)
                for certain paths. -->
           <!-- <altPath target="//path/to/value[@attr='foo']"
                         source="//path/to/value[@attr='bar']"
                         locales="xx,yy_ZZ"/> -->
  +        <altPath target="//ldml/dates/calendars/calendar[@type='gregorian']/dateTimeFormats/availableFormats/dateFormatItem[@id='Ehm']"
  +                 source="//ldml/dates/calendars/calendar[@type='gregorian']/dateTimeFormats/availableFormats/dateFormatItem[@id='Ehm'][@alt='ascii']"
  +                 locales="en"/>
  +        <altPath target="//ldml/dates/calendars/calendar[@type='gregorian']/dateTimeFormats/availableFormats/dateFormatItem[@id='Ehms']"
  +                 source="//ldml/dates/calendars/calendar[@type='gregorian']/dateTimeFormats/availableFormats/dateFormatItem[@id='Ehms'][@alt='ascii']"
  +                 locales="en"/>
  +        <altPath target="//ldml/dates/calendars/calendar[@type='gregorian']/dateTimeFormats/availableFormats/dateFormatItem[@id='h']"
  +                 source="//ldml/dates/calendars/calendar[@type='gregorian']/dateTimeFormats/availableFormats/dateFormatItem[@id='h'][@alt='ascii']"
  +                 locales="en"/>
  +        <altPath target="//ldml/dates/calendars/calendar[@type='gregorian']/dateTimeFormats/availableFormats/dateFormatItem[@id='hm']"
  +                 source="//ldml/dates/calendars/calendar[@type='gregorian']/dateTimeFormats/availableFormats/dateFormatItem[@id='hm'][@alt='ascii']"
  +                 locales="en"/>
  +        <altPath target="//ldml/dates/calendars/calendar[@type='gregorian']/dateTimeFormats/availableFormats/dateFormatItem[@id='hms']"
  +                 source="//ldml/dates/calendars/calendar[@type='gregorian']/dateTimeFormats/availableFormats/dateFormatItem[@id='hms'][@alt='ascii']"
  +                 locales="en"/>
  +        <altPath target="//ldml/dates/calendars/calendar[@type='gregorian']/dateTimeFormats/availableFormats/dateFormatItem[@id='hmsv']"
  +                 source="//ldml/dates/calendars/calendar[@type='gregorian']/dateTimeFormats/availableFormats/dateFormatItem[@id='hmsv'][@alt='ascii']"
  +                 locales="en"/>
  +        <altPath target="//ldml/dates/calendars/calendar[@type='gregorian']/dateTimeFormats/availableFormats/dateFormatItem[@id='hmv']"
  +                 source="//ldml/dates/calendars/calendar[@type='gregorian']/dateTimeFormats/availableFormats/dateFormatItem[@id='hmv'][@alt='ascii']"
  +                 locales="en"/>
  +        <altPath target="//ldml/dates/calendars/calendar[@type='gregorian']/timeFormats/timeFormatLength[@type='full']/timeFormat[@type='standard']/pattern[@type='standard']"
  +                 source="//ldml/dates/calendars/calendar[@type='gregorian']/timeFormats/timeFormatLength[@type='full']/timeFormat[@type='standard']/pattern[@alt='ascii'][@type='standard']"
  +                 locales="en"/>
  +        <altPath target="//ldml/dates/calendars/calendar[@type='gregorian']/timeFormats/timeFormatLength[@type='long']/timeFormat[@type='standard']/pattern[@type='standard']"
  +                 source="//ldml/dates/calendars/calendar[@type='gregorian']/timeFormats/timeFormatLength[@type='long']/timeFormat[@type='standard']/pattern[@alt='ascii'][@type='standard']"
  +                 locales="en"/>
  +        <altPath target="//ldml/dates/calendars/calendar[@type='gregorian']/timeFormats/timeFormatLength[@type='medium']/timeFormat[@type='standard']/pattern[@type='standard']"
  +                 source="//ldml/dates/calendars/calendar[@type='gregorian']/timeFormats/timeFormatLength[@type='medium']/timeFormat[@type='standard']/pattern[@alt='ascii'][@type='standard']"
  +                 locales="en"/>
  +        <altPath target="//ldml/dates/calendars/calendar[@type='gregorian']/timeFormats/timeFormatLength[@type='short']/timeFormat[@type='standard']/pattern[@type='standard']"
  +                 source="//ldml/dates/calendars/calendar[@type='gregorian']/timeFormats/timeFormatLength[@type='short']/timeFormat[@type='standard']/pattern[@alt='ascii'][@type='standard']"
  +                 locales="en"/>
  +        <altPath target="//ldml/dates/calendars/calendar[@type='generic']/dateTimeFormats/availableFormats/dateFormatItem[@id='Ehm']"
  +                 source="//ldml/dates/calendars/calendar[@type='generic']/dateTimeFormats/availableFormats/dateFormatItem[@id='Ehm'][@alt='ascii']"
  +                 locales="en"/>
  +        <altPath target="//ldml/dates/calendars/calendar[@type='generic']/dateTimeFormats/availableFormats/dateFormatItem[@id='Ehms']"
  +                 source="//ldml/dates/calendars/calendar[@type='generic']/dateTimeFormats/availableFormats/dateFormatItem[@id='Ehms'][@alt='ascii']"
  +                 locales="en"/>
  +        <altPath target="//ldml/dates/calendars/calendar[@type='generic']/dateTimeFormats/availableFormats/dateFormatItem[@id='h']"
  +                 source="//ldml/dates/calendars/calendar[@type='generic']/dateTimeFormats/availableFormats/dateFormatItem[@id='h'][@alt='ascii']"
  +                 locales="en"/>
  +        <altPath target="//ldml/dates/calendars/calendar[@type='generic']/dateTimeFormats/availableFormats/dateFormatItem[@id='hm']"
  +                 source="//ldml/dates/calendars/calendar[@type='generic']/dateTimeFormats/availableFormats/dateFormatItem[@id='hm'][@alt='ascii']"
  +                 locales="en"/>
  +        <altPath target="//ldml/dates/calendars/calendar[@type='generic']/dateTimeFormats/availableFormats/dateFormatItem[@id='hms']"
  +                 source="//ldml/dates/calendars/calendar[@type='generic']/dateTimeFormats/availableFormats/dateFormatItem[@id='hms'][@alt='ascii']"
  +                 locales="en"/>
  ```
  Then run the generator:
  ```
  $ ant -f build-icu-data.xml <options>
  ```

See `build-icu-data.xml` for documentation of all options and additional customization.


## Running unit tests

```
$ mvn test -DCLDR_DIR="$CLDR_DATA_DIR"
```


## Importing and running from an IDE

This project should be easy to import into an IDE which supports Maven development, such
as IntelliJ or Eclipse. It uses a local Maven repository directory for the unpublished
CLDR libraries (which are included in the project), but otherwise gets all dependencies
via Maven's public repositories.
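
## Checking the environment (optional)

Because every command in this document assumes that `TOOLS_ROOT`, `CLDR_DIR` and
`CLDR_DATA_DIR` are exported, it can help to sanity-check them before invoking Ant
or Maven. The following is only a minimal sketch and is not part of the ICU tooling:
the function name `check_cldr_env` is hypothetical, and the subdirectories it looks
for are just the ones named in the "Important directories" table.

```shell
# Hypothetical helper (not shipped with ICU): verifies that the three
# variables from "Important directories" are set and point at directories
# with the layout described in that table.
check_cldr_env() {
  status=0
  for var in TOOLS_ROOT CLDR_DIR CLDR_DATA_DIR; do
    # Indirect lookup of the variable named by $var (POSIX sh has no ${!var}).
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      echo "error: $var is not set" >&2
      status=1
    elif [ ! -d "$value" ]; then
      echo "error: $var is not a directory: $value" >&2
      status=1
    fi
  done

  # TOOLS_ROOT should contain cldr/; CLDR_DIR should contain common/ and tools/.
  for dir in "${TOOLS_ROOT:-}/cldr" "${CLDR_DIR:-}/common" "${CLDR_DIR:-}/tools"; do
    if [ ! -d "$dir" ]; then
      echo "error: expected directory is missing: $dir" >&2
      status=1
    fi
  done

  return "$status"
}
```

After defining the function in the current shell, running `check_cldr_env && echo OK`
prints `OK` only when all checks pass; otherwise each problem is reported on standard
error and the function returns a non-zero status.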