Name: icu
URL: https://github.com/unicode-org/icu
Version: 74-2
CPEPrefix: cpe:/a:icu-project:international_components_for_unicode:74.2
License: MIT
License File: LICENSE
Security Critical: yes
Shipped: yes

Description:
This directory contains the source code of ICU 74.2 for C/C++.

A. How to update ICU

1. Run "scripts/update.sh <version>" (e.g. 74-2).
   This downloads ICU from the upstream git repository.
   It preserves Chrome-specific build files and
   converter files (see section C).

   source.gni and icu.gyp* files are automatically updated, too.

2. Review and apply patches/changes in "D. Local Modifications" if
   necessary/applicable. Update patch files in patches/.

3. Follow the instructions in section B on building ICU data files.

B. How to build ICU data files


Pre-built data files are generated and checked in with the following steps.

1. ICU data files for Chrome OS, Linux, Mac and Windows

  a. Make an ICU data build directory outside the Chromium source tree
     and cd to that directory (say, $ICUBUILDIR).

  b. Run
       ${CHROME_ICU_TREE_TOP}/scripts/make_data_all.sh

     This script takes the following steps:

     i) Run
        ${CHROME_ICU_TREE_TOP}/source/runConfigureICU Linux --disable-layout --disable-tests

     ii) Run make

     iii) (cd data && make clean)

     iv) scripts/config_data.sh common
        This configures the build with the filter for common.

     v) Run make

     vi) scripts/copy_data.sh common
        This copies the ICU data files for non-Android platforms
        (both Little and Big Endian) to the following locations:

        common/icudtl.dat
        common/icudtb.dat

     vii) Repeat steps iii) - vi) for chromeos to produce chromeos/icudtl.dat

     viii) cast/patch_locale.sh
        Modify the file for cast, android, ios and flutter.

     ix) Repeat steps iii) - vi) for cast, android and ios to produce
        cast/icudtl.dat
        android/icudtl.dat
        ios/icudtl.dat

     x) flutter/patch_brkitr.sh
        On top of cast/patch_locale.sh (step viii)), further patch
        the code for flutter.

     xi) Repeat steps iii) - vi) for flutter to produce
        flutter/icudtl.dat

     xii) scripts/clean_up_data_source.sh

        This reverts the results of cast/patch_locale.sh and
        flutter/patch_brkitr.sh and makes the tree ready for committing
        updated ICU data files for non-Android and Android platforms.

  c. Whenever data is updated (e.g. a timezone update), repeat step b.
     Step a can be skipped as long as the ICU build directory it
     created is kept.

2. Note on the locale data customization

  - filter/chromeos.json
    a. Filter the locale data for ChromeOS's UI languages:
       locales, lang, region, currency, zone
    b. Filter the locale data for non-UI languages to the bare minimum:
       ExemplarCharacters, LocaleScript, layout, and the name of the
       language for a locale in its native language.
    c. Filter the legacy Chinese character set-based collation
       (big5han/gb2312han) that makes little sense and that nobody uses.

  - filter/common.json
    Same as above in filter/chromeos.json, AND
    e. Filter exemplar cities in timezone data (data/zone).

  - filter/android.json and filter/ios.json
    a. Filter the locale data for Android / iOS UI languages:
       locales, lang, region, currency, zone
    b. Filter the locale data for non-UI languages to the bare minimum:
       ExemplarCharacters, LocaleScript, layout, and the name of the
       language for a locale in its native language.
    c. Filter the legacy Chinese character set-based collation.
    d. Filter source/data/{region,lang} to exclude these data
       except the language and script names of zh_Hans and zh_Hant.
    e. Keep only the minimal calendar data in data/locales.
    f. Include currency display names for a smaller subset of currencies.
    g. Minimize the locale data for 9 locales to which Chrome on Android
       is not localized.


C. Chromium-specific data build files and converters

They're preserved in step A.1 above.
In general, there's no need to touch
them when updating ICU.

1. source/data/mappings
  - convrtrs.txt : Lists encodings and aliases required by the WHATWG
    Encoding spec plus a few extras (see the file for the reasons).

  - ucmlocal.txt : Lists only the converters we need.

  - *html.ucm : Mapping files per WHATWG encoding standards for EUC-JP,
    Shift_JIS, Big5 (Big5+Big5HKSCS), EUC-KR and all the single byte encodings.
    They're generated with scripts/{eucjp,sjis,big5,euckr,single_byte}_gen.sh.

  - gb18030.ucm and windows-936.ucm
    gb_table.patch was applied for the following changes. No need
    to apply it again. The patch is kept for the record.
    a. Map \xA3\xA0 to U+3000 instead of U+E5E5 in gb18030 and windows-936 per
       the encoding spec (one-way mapping in toUnicode direction).
    b. Map \xA8\xBF to U+01F9 instead of U+E7C8. Add one-way map
       from U+1E3F to \xA8\xBC (windows-936/GBK).
       See https://www.w3.org/Bugs/Public/show_bug.cgi?id=28740#c3

2. source/data/brkitr
  - dictionaries/khmerdict.txt: Abridged Khmer dictionary. See
    https://unicode-org.atlassian.net/browse/ICU-9451
  - dictionaries/laodict.txt: Abridged Lao dictionary. We keep using the smaller
    old version from ICU 69-1.
  - rules/word_ja.txt (used only on Android)
    Added for Japanese-specific word-breaking without the C+J dictionary.
  - rules/{root,zh,zh_Hant}.txt
    a. Use line_normal by default.
    b. Drop local patches we used to have for the following issues. They'll
       be dealt with in the upstream (Unicode/CLDR).
       http://unicode.org/cldr/trac/ticket/6557
       http://unicode.org/cldr/trac/ticket/4200 (http://crbug.com/39779)

3. Add {an,ku,tg,wa}.txt to source/data/{locale,lang}
   with the minimal locale data necessary for spellchecker and
   language menus.

D. Local Modifications

1. Applied locale data patches from Google obtained by diff'ing
   the upstream copy and Google's internal copy for source/data.

  - patches/locale_google.patch:
    * Google's internal ICU locale changes
    * Simpler region names for Hong Kong and Macau in all locales
    * Currency signs in ru and uk locales (do not include 'tr' locale changes)
    * AM/PM, midnight, noon formatting for a few Indian locales
    * Timezone name changes in Korean and Chinese locales
    * Default digits for the Arabic locale are European digits.

  - patches/locale1.patch: Minor fixes for Korean

  - patches/name_5_langs.patch: Add the native names of 5 languages not currently
    supported by CLDR/ICU. When updating ICU to a new version,
    source/data/lang/{ay,dv,ilo,lus,ts}.txt have to be checked and if they are
    present with their display names populated, this patch has to be adjusted
    or discarded as necessary.

2. Breakiterator patches
  - patches/wordbrk.patch for word.txt, word_POSIX.txt, and word_fi_sv.txt
    a. Move full stops (U+002E, U+FF0E) from MidNumLet to MidNum so that
       FQDN labels can be split at '.'
    b. Move fullwidth digits (U+FF10 - U+FF19) from Ideographic to Numeric.
       See http://unicode.org/cldr/trac/ticket/6555
    c. Restore pre-ICU 72 behavior of breaking at '@'. The new upstream behavior
       of not breaking at '@' interacted badly with the local change to break at
       '.' (D.2.a above): although not breaking at '@' is intended to not break
       within e-mail addresses, this is not possible with Chromium's
       break-at-'.' behavior.

  - patches/khmer-dictbe.patch
    Adjust parameters to use a smaller Khmer dictionary (khmerdict.txt).
    https://unicode-org.atlassian.net/browse/ICU-9451

  - Add several common Chinese words that were dropped previously to
    source/data/cjdict/brkitr/cjdict.txt
    patch: patches/cjdict.patch
    upstream bug: https://unicode-org.atlassian.net/browse/ICU-10888

3. Timezone data update
   Run scripts/update_tz.sh to grab the latest version of the
   following timezone data files and put them in source/data/misc:

     metaZones.txt
     timezoneTypes.txt
     windowsZones.txt
     zoneinfo64.txt

   As of Mar 5, 2024, the latest version is 2024a
   and the above files are available at the ICU GitHub repos.

4. Build-related changes

  - patches/configure.patch:
    * Remove a section of configure that will cause breakage while
      running runConfigureICU.

  - patches/wpo.patch (only needed when icudata dll is used).
    upstream bugs : https://unicode-org.atlassian.net/browse/ICU-8043
                    https://unicode-org.atlassian.net/browse/ICU-5701

  - patches/data_symb.patch :
    Put ICU_DATA_ENTRY_POINT(icudtXX_dat) in common when we use
    the icu data file or icudt.dll

5. ISO-2022-JP encoding (fromUnicode) change per WHATWG encoding spec.
  - patches/iso2022jp.patch
  - upstream bug:
    https://unicode-org.atlassian.net/browse/ICU-20251

6. Enable tracing of files but not resources, for Chromium only,
   to reduce performance impact/risk.
  - patches/restrace.patch

7. Patch Arabic date time pattern back to the 67 value to avoid test
   breakage in
   third_party/blink/web_tests/fast/forms/datetimelocal/datetimelocal-appearance-l10n.html
  - patches/ardatepattern.patch
  - https://bugs.chromium.org/p/chromium/issues/detail?id=1139186

8. Remove explicit std::atomic<NumberRangeFormatterImpl*> template
   instantiation
   patches/atomic_template_instantiation.patch
  - The explicit instantiation was added to silence MSVC C4251 warnings:
    https://unicode-org.atlassian.net/browse/ICU-20157
    Small test cases show that it is generally an error to instantiate
    std::atomic<T*> with an incomplete type T with MSVC, clang, and GCC, so this
    instantiation never should have worked:
    https://gcc.godbolt.org/z/34xx8h
    At this time, it's not clear if this particular instantiation with
    NumberRangeFormatterImpl* was ever necessary for MSVC. Further testing with
    MSVC is required to upstream this patch.
  - https://unicode-org.atlassian.net/browse/ICU-21482

9. Patch source/common/uposixdefs.h so it compiles on Fuchsia on Macs.
   patches/fuchsia.patch
  - context bug: https://bugs.chromium.org/p/chromium/issues/detail?id=1184527

10. Fix Etc/Unknown being returned for
    Intl.DateTimeFormat().resolvedOptions().timeZone on macOS 14.
    patches/revert_realpath.patch
  - https://bugs.chromium.org/p/chromium/issues/detail?id=1473422
  - https://unicode-org.atlassian.net/browse/ICU-22541
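
Note on re-checking the patches in section D (illustrative sketch only,
not part of this repository):
When an update reaches step A.2, it can help to see which patch files
still apply before reviewing them one by one. The sketch below does that
with "git apply --check", which tests a patch without changing any file.
The helper name check_patches, the -p1 strip level, and the directory the
patches are taken to be relative to are all assumptions; adjust them to
how the patches were actually generated.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not a script shipped in this tree):
# report which *.patch files in a directory still apply cleanly to a tree.
# Nothing is modified; "git apply --check" only tests applicability.
# Assumes unified diffs with one leading path component to strip (-p1).
check_patches() {
  patch_dir=$(cd "$1" && pwd) || return 2   # absolute path, since we cd below
  tree_top=$2
  status=0
  for p in "$patch_dir"/*.patch; do
    [ -e "$p" ] || continue                 # glob matched nothing
    if (cd "$tree_top" && git apply --check -p1 "$p") 2>/dev/null; then
      echo "applies: $(basename "$p")"
    else
      echo "needs review: $(basename "$p")"
      status=1
    fi
  done
  return "$status"
}

# Example invocation (paths are assumptions), run from the ICU tree top:
#   check_patches patches .
```

A patch reported as "needs review" is not necessarily obsolete; it may
only need to be regenerated against the new upstream sources, as step A.2
describes.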