<!--
© 2019 and later: Unicode, Inc. and others.
License & terms of use: http://www.unicode.org/copyright.html
-->

# Basic instructions for running the LdmlConverter via Maven

> Note: While this document provides useful background information about the
> LdmlConverter, the actual complete process for integrating CLDR data into ICU
> is described in the document `../../../docs/processes/cldr-icu.md`, which is
> best viewed as
> [CLDR-ICU integration](https://unicode-org.github.io/icu/processes/cldr-icu.html).

## Requirements

* A CLDR release for supplying CLDR data and the CLDR API.
* The Maven build tool.
* The Ant build tool (using JDK 11+).

## Important directories

| Directory       | Description |
|-----------------|-------------|
| `TOOLS_ROOT`    | Path to the root of the ICU tools directory, below which are (e.g.) the `cldr/` and `unicodetools/` directories. |
| `CLDR_DIR`      | Path to the root of the standard CLDR sources, below which are the `common/` and `tools/` directories. |
| `CLDR_DATA_DIR` | The top-level directory for the CLDR production data (typically the "production" directory in the staging repository). Usually generated locally or obtained from: https://github.com/unicode-org/cldr-staging/tree/main/production |

On POSIX systems, it's best to set these as exported shell variables, and any
following instructions assume they have been set accordingly:

```
$ export TOOLS_ROOT=/path/to/icu/tools
$ export CLDR_DIR=/path/to/cldr
$ export CLDR_DATA_DIR=/path/to/cldr-staging/production
```

Note that you should not attempt to use data from the CLDR project directory
(where the CLDR API code exists) for conversion into ICU data. The process now
relies on a pre-processing step, and the CLDR data must come from the separate
"staging" repository (i.e. https://github.com/unicode-org/cldr-staging) or be
pre-processed locally into a different directory.


## Initial Setup

This project relies on the Maven build tool for managing dependencies and uses
Ant for configuration purposes, so both will need to be installed. On a
Debian-based system, this should be as simple as:

```
$ sudo apt-get install maven ant
```

You must also install an additional CLDR JAR file into the local Maven repository at
`$TOOLS_ROOT/cldr/lib` (see the `README.txt` in that directory for more
information).

```
$ cd "$TOOLS_ROOT/cldr/lib"
$ ./install-cldr-jars.sh "$CLDR_DIR"
```

## Generating all ICU data and source code

```
$ cd "$TOOLS_ROOT/cldr/cldr-to-icu"
$ ant -f build-icu-data.xml
```

## Other Examples

* Outputting a subset of the supplemental data into a specified directory:
  ```
  $ ant -f build-icu-data.xml -DoutDir=/tmp/cldr -DoutputTypes=plurals,dayPeriods -DdontGenCode=true
  ```
  Note: Output types can be listed with mixedCase, lower_underscore or UPPER_UNDERSCORE.
  Pass `-DoutputTypes=help` to see the full list.

* Outputting only a subset of locale IDs (and all the supplemental data):
  ```
  $ ant -f build-icu-data.xml -DoutDir=/tmp/cldr -DlocaleIdFilter='(zh|yue).*' -DdontGenCode=true
  ```

* Overriding the default CLDR version string (which normally matches the CLDR library code):
  ```
  $ ant -f build-icu-data.xml -DcldrVersion="36.1"
  ```

* Using alternate CLDR values (e.g. using `alt="ascii"` values from the CLDR XML):

  First, edit the `build-icu-data.xml` file where it mentions `ALTERNATE VALUES`
  with the correctly annotated source path, target path, and locales list:
  ```diff
  @@ -384,6 +399,20 @@
       <!-- ALTERNATE VALUES -->

       <!-- The following elements configure alternate values for some special case paths.
            The target path will only be replaced if both it, and the source path, exist in
            the CLDR data (paths will not be modified if only the source path exists).

            Since the paths must represent the same semantic type of data, they must be in the
            same "namespace" (same element names) and must not contain value attributes. Thus
            they can only differ by distinguishing attributes (either added or modified).

            This feature is typically used to select alternate translations (e.g. short forms)
            for certain paths. -->
       <!-- <altPath target="//path/to/value[@attr='foo']"
                     source="//path/to/value[@attr='bar']"
                     locales="xx,yy_ZZ"/> -->
  +    <altPath target="//ldml/dates/calendars/calendar[@type='gregorian']/dateTimeFormats/availableFormats/dateFormatItem[@id='Ehm']"
  +             source="//ldml/dates/calendars/calendar[@type='gregorian']/dateTimeFormats/availableFormats/dateFormatItem[@id='Ehm'][@alt='ascii']"
  +             locales="en"/>
  +    <altPath target="//ldml/dates/calendars/calendar[@type='gregorian']/dateTimeFormats/availableFormats/dateFormatItem[@id='Ehms']"
  +             source="//ldml/dates/calendars/calendar[@type='gregorian']/dateTimeFormats/availableFormats/dateFormatItem[@id='Ehms'][@alt='ascii']"
  +             locales="en"/>
  +    <altPath target="//ldml/dates/calendars/calendar[@type='gregorian']/dateTimeFormats/availableFormats/dateFormatItem[@id='h']"
  +             source="//ldml/dates/calendars/calendar[@type='gregorian']/dateTimeFormats/availableFormats/dateFormatItem[@id='h'][@alt='ascii']"
  +             locales="en"/>
  +    <altPath target="//ldml/dates/calendars/calendar[@type='gregorian']/dateTimeFormats/availableFormats/dateFormatItem[@id='hm']"
  +             source="//ldml/dates/calendars/calendar[@type='gregorian']/dateTimeFormats/availableFormats/dateFormatItem[@id='hm'][@alt='ascii']"
  +             locales="en"/>
  +    <altPath target="//ldml/dates/calendars/calendar[@type='gregorian']/dateTimeFormats/availableFormats/dateFormatItem[@id='hms']"
  +             source="//ldml/dates/calendars/calendar[@type='gregorian']/dateTimeFormats/availableFormats/dateFormatItem[@id='hms'][@alt='ascii']"
  +             locales="en"/>
  +    <altPath target="//ldml/dates/calendars/calendar[@type='gregorian']/dateTimeFormats/availableFormats/dateFormatItem[@id='hmsv']"
  +             source="//ldml/dates/calendars/calendar[@type='gregorian']/dateTimeFormats/availableFormats/dateFormatItem[@id='hmsv'][@alt='ascii']"
  +             locales="en"/>
  +    <altPath target="//ldml/dates/calendars/calendar[@type='gregorian']/dateTimeFormats/availableFormats/dateFormatItem[@id='hmv']"
  +             source="//ldml/dates/calendars/calendar[@type='gregorian']/dateTimeFormats/availableFormats/dateFormatItem[@id='hmv'][@alt='ascii']"
  +             locales="en"/>
  +    <altPath target="//ldml/dates/calendars/calendar[@type='gregorian']/timeFormats/timeFormatLength[@type='full']/timeFormat[@type='standard']/pattern[@type='standard']"
  +             source="//ldml/dates/calendars/calendar[@type='gregorian']/timeFormats/timeFormatLength[@type='full']/timeFormat[@type='standard']/pattern[@alt='ascii'][@type='standard']"
  +             locales="en"/>
  +    <altPath target="//ldml/dates/calendars/calendar[@type='gregorian']/timeFormats/timeFormatLength[@type='long']/timeFormat[@type='standard']/pattern[@type='standard']"
  +             source="//ldml/dates/calendars/calendar[@type='gregorian']/timeFormats/timeFormatLength[@type='long']/timeFormat[@type='standard']/pattern[@alt='ascii'][@type='standard']"
  +             locales="en"/>
  +    <altPath target="//ldml/dates/calendars/calendar[@type='gregorian']/timeFormats/timeFormatLength[@type='medium']/timeFormat[@type='standard']/pattern[@type='standard']"
  +             source="//ldml/dates/calendars/calendar[@type='gregorian']/timeFormats/timeFormatLength[@type='medium']/timeFormat[@type='standard']/pattern[@alt='ascii'][@type='standard']"
  +             locales="en"/>
  +    <altPath target="//ldml/dates/calendars/calendar[@type='gregorian']/timeFormats/timeFormatLength[@type='short']/timeFormat[@type='standard']/pattern[@type='standard']"
  +             source="//ldml/dates/calendars/calendar[@type='gregorian']/timeFormats/timeFormatLength[@type='short']/timeFormat[@type='standard']/pattern[@alt='ascii'][@type='standard']"
  +             locales="en"/>
  +    <altPath target="//ldml/dates/calendars/calendar[@type='generic']/dateTimeFormats/availableFormats/dateFormatItem[@id='Ehm']"
  +             source="//ldml/dates/calendars/calendar[@type='generic']/dateTimeFormats/availableFormats/dateFormatItem[@id='Ehm'][@alt='ascii']"
  +             locales="en"/>
  +    <altPath target="//ldml/dates/calendars/calendar[@type='generic']/dateTimeFormats/availableFormats/dateFormatItem[@id='Ehms']"
  +             source="//ldml/dates/calendars/calendar[@type='generic']/dateTimeFormats/availableFormats/dateFormatItem[@id='Ehms'][@alt='ascii']"
  +             locales="en"/>
  +    <altPath target="//ldml/dates/calendars/calendar[@type='generic']/dateTimeFormats/availableFormats/dateFormatItem[@id='h']"
  +             source="//ldml/dates/calendars/calendar[@type='generic']/dateTimeFormats/availableFormats/dateFormatItem[@id='h'][@alt='ascii']"
  +             locales="en"/>
  +    <altPath target="//ldml/dates/calendars/calendar[@type='generic']/dateTimeFormats/availableFormats/dateFormatItem[@id='hm']"
  +             source="//ldml/dates/calendars/calendar[@type='generic']/dateTimeFormats/availableFormats/dateFormatItem[@id='hm'][@alt='ascii']"
  +             locales="en"/>
  +    <altPath target="//ldml/dates/calendars/calendar[@type='generic']/dateTimeFormats/availableFormats/dateFormatItem[@id='hms']"
  +             source="//ldml/dates/calendars/calendar[@type='generic']/dateTimeFormats/availableFormats/dateFormatItem[@id='hms'][@alt='ascii']"
  +             locales="en"/>
  ```
  Then run the generator:
  ```
  $ ant -f build-icu-data.xml <options>
  ```

See `build-icu-data.xml` for documentation of all options and additional customization.


## Running unit tests

```
$ mvn test -DCLDR_DIR="$CLDR_DATA_DIR"
```


## Importing and running from an IDE

This project should be easy to import into an IDE which supports Maven development, such
as IntelliJ or Eclipse. It uses a local Maven repository directory for the unpublished
CLDR libraries (which are included in the project), but otherwise gets all dependencies
via Maven's public repositories.
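
## Checking the directory variables (sketch)

Most of the commands in this document assume that `TOOLS_ROOT`, `CLDR_DIR` and `CLDR_DATA_DIR` are exported and point to the directories listed under "Important directories". The following is a minimal sketch of a sanity check, not part of ICU or CLDR: the function name `check_cldr_env` is hypothetical, and the subdirectories it looks for (`cldr/` below `TOOLS_ROOT`, `common/` and `tools/` below `CLDR_DIR`) are taken only from the directory table in this document.

```shell
# Hypothetical helper (not shipped with ICU or CLDR): verifies that the three
# directory variables are set and contain the subdirectories this document
# mentions. Note that it sets a few plain shell variables (status, entry,
# name, subdir, value) in the calling shell.
check_cldr_env() {
  status=0
  # Each entry is "VARIABLE:expected-subdirectory"; "." means "the directory itself".
  for entry in TOOLS_ROOT:cldr CLDR_DIR:common CLDR_DIR:tools CLDR_DATA_DIR:.; do
    name=${entry%%:*}
    subdir=${entry#*:}
    # Indirect lookup of the variable named in $name (empty if unset).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "error: $name is not set" >&2
      status=1
    elif [ ! -d "$value/$subdir" ]; then
      echo "error: $name: expected directory $value/$subdir does not exist" >&2
      status=1
    fi
  done
  return "$status"
}
```

The function only reports problems and returns a non-zero status; it does not change any of the variables it checks. One possible use is to guard a build, for example `check_cldr_env && ant -f build-icu-data.xml`, so that Ant is only started when all checks pass.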