# Maintaining ICU in Node.js

## Background

International Components for Unicode ([ICU4C][ICU]) is used both by V8
and also by Node.js directly to provide internationalization
functionality. To quote from icu-project.org:

> ICU is a mature, widely used set of C/C++ and Java libraries providing
> Unicode and Globalization support for software applications. ICU is
> widely portable and gives applications the same results on all platforms
> and between C/C++ and Java software.

If Node.js is configured to use its built-in ICU,
it uses a strict subset of ICU which is in
[deps/icu-small](https://github.com/nodejs/node/tree/HEAD/deps/icu-small).
A good description of the different ways Node.js can be built with ICU
support is in [api/intl.html](https://nodejs.org/api/intl.html).

## Data dependencies

ICU consumes and includes:

* Extracted locale data from [CLDR][]
* Extracted [Unicode][] data.
* Time zone ([tz][]) data

The current versions of these items can be viewed for Node.js with
`node -p process.versions`:

```console
$ node -p process.versions

{
  …
  cldr: '35.1',
  icu: '64.2',
  tz: '2019a',
  unicode: '12.1'
}
```

### Time zone data

Time zone data files are updated independently of ICU CLDR data. ICU and its
main data files do not need to be upgraded in order to apply time zone data file
fixes.

The [IANA tzdata][tz] project releases new versions and announces them on the
[`tz-announce`](https://mm.icann.org/pipermail/tz-announce/) mailing list.

The Unicode project takes new releases and publishes
[updated time zone data files](https://github.com/unicode-org/icu-data/tree/HEAD/tzdata/icunew)
in the unicode-org/icu-data repository.

All modern versions of Node.js use the version 44 ABI of the time zone data
files.

#### Example: updating the ICU `.dat` file

* Decompress `deps/icu-small/source/data/in/icudt##l.dat.bz2`, where `##` is
  the ICU major version number.
* Clone the unicode-org/icu-data repository and copy the latest `tzdata` release `le`
  files into the `source/data/in` directory.
* Follow the upstream [ICU instructions](https://unicode-org.github.io/icu/userguide/datetime/timezone/)
  to patch the ICU `.dat` file:
  > `for i in zoneinfo64.res windowsZones.res timezoneTypes.res metaZones.res;
  > do icupkg -a $i icudt*l.dat; done`
* Optionally, verify that each of the above files is listed only once when
  running `icupkg -l`.
* Optionally, extract each file using `icupkg -x` and verify the `shasum`
  matches the desired value.
* Compress the `.dat` file with the same filename as in the first step.
* Build and test, verifying that `process.versions.tz` matches the desired
  version.
* Create a new minor version release.

## Release schedule

ICU typically has more than one release a year, particularly coinciding with a
major release of [Unicode][]. The current release schedule is available on the
[ICU][] website on the left sidebar.

### V8 depends on ICU

V8 will aggressively upgrade to a new ICU version, due to requirements for
features/bugfixes needed for [Ecma402][] support. The minimum required version
of ICU is specified within the V8 source tree. If the ICU version is too old,
V8 will not compile.

```c
// deps/v8/src/objects/intl-objects.h
#define V8_MINIMUM_ICU_VERSION 65
```

V8 in Node.js depends on the ICU version supplied by Node.js.

The file `tools/icu/icu_versions.json` contains the current minimum
version of ICU that Node.js is known to work with. This should be
_at least_ the same version as V8, so that users will find out
earlier that their ICU is too old. A test case validates this when
Node.js is built.

## How to upgrade ICU

> The script `tools/dep_updaters/update-icu.sh` automates
> this process.

* Make sure your Node.js workspace is clean (`git status`
  should be sufficient).
* Configure Node.js with the specific [ICU version](http://site.icu-project.org/download)
  you want to upgrade to, for example:

```bash
./configure \
    --with-intl=full-icu \
    --with-icu-source=https://github.com/unicode-org/icu/releases/download/release-67-1/icu4c-67_1-src.tgz
make
```

> _Note_: In theory, the equivalent `vcbuild.bat` commands should also work,
> but the commands below are makefile-centric.

* If there are ICU version-specific changes needed, you may need to make changes
  in `tools/icu/icu-generic.gyp` or add patch files to `tools/icu/patches`.
  * Specifically, look for the lists in `sources!` in `tools/icu/icu-generic.gyp`
    for files to exclude.

* Verify the Node.js build works:

```bash
make test-ci
```

Also run the following, which should return `enero`, not `January`:

<!-- eslint-disable strict -->

```js
new Intl.DateTimeFormat('es', { month: 'long' }).format(new Date(9E8));
```

* Now, run the shrink tool to update `deps/icu-small` from `deps/icu`:

> :warning: Do not modify any source code in `deps/icu-small`!
> See the section below about floating patches to ICU.

```bash
python tools/icu/shrink-icu-src.py
```

* Now, do a clean rebuild of Node.js to test:

```bash
make -k distclean
./configure
make
```

* Test this newly built, default-configured Node.js:

<!-- eslint-disable strict -->

```js
process.versions.icu;
new Intl.DateTimeFormat('es', { month: 'long' }).format(new Date(9E8));
```

(This should print your updated ICU version number, and also `enero` again.)

You are ready to check in (`git add`) the updated `deps/icu-small`.

> :warning: Do not modify any source code in `deps/icu-small`!
> See the section below about floating patches to ICU.

* Now, rebuild the Node.js license.

```bash
# clean up - remove deps/icu
make clean
tools/license-builder.sh
```

* Update the URL and hash for the full ICU file in `tools/icu/current_ver.dep`.
  It should match the ICU URL used in the first step. When this is done, the
  following should build with small ICU.

```bash
# clean up
rm -rf out deps/icu deps/icu4c*
./configure --with-intl=small-icu --download=all
make
make test-ci
```

* Commit the change to the `deps/icu-small`, `tools/icu/current_ver.dep`
  and `LICENSE` files.

## Floating patches to ICU

Floating patches are applied at `configure` time. The "patch" files
are used instead of the original source files. The patch files are
complete `.cpp` files replacing the original contents.

Patches are tied to a specific ICU version. They won't apply to a
future ICU version. We assume that you filed a bug against [ICU][] and
upstreamed the fix, so the patch won't be needed in a later ICU
version.
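
The whole-file replacement rule described above can be modeled with a short
sketch. This is only an illustration, not the logic the build actually runs:
the helper name `resolveIcuSources`, the shape of `patchesByVersion`, and the
choice of `deps/icu-small` as the unpatched tree are all assumptions made for
the example. It shows that a patch is picked up only for the exact ICU major
version it was written for, and that a patched file is taken whole from the
patch directory.

```js
'use strict';

const path = require('node:path');

// Hypothetical helper for illustration only; it is NOT part of the Node.js
// build tooling. It models the rule "a floating patch replaces the whole file".
//
// icuMajor:         ICU major version being built, e.g. 63
// sources:          ICU source paths relative to the ICU root
// patchesByVersion: assumed shape { [icuMajor]: [relative paths that exist
//                   under tools/icu/patches/<icuMajor>/] }
function resolveIcuSources(icuMajor, sources, patchesByVersion) {
  // Patches are looked up for the exact ICU major version only.
  const patched = new Set(patchesByVersion[icuMajor] ?? []);
  const patchRoot = path.posix.join('tools/icu/patches', String(icuMajor));

  return sources.map((source) => {
    const isPatched = patched.has(source);
    return {
      source,
      patched: isPatched,
      // A patched file is taken verbatim from the patch directory; anything
      // else comes from the unmodified ICU tree (deps/icu-small assumed here).
      from: path.posix.join(isPatched ? patchRoot : 'deps/icu-small', source),
    };
  });
}

const patches = { 63: ['source/tools/toolutil/pkg_genc.cpp'] };
const sources = [
  'source/tools/toolutil/pkg_genc.cpp',
  'source/common/uchar.cpp',
];

// ICU 63: pkg_genc.cpp resolves to the patch directory.
console.log(resolveIcuSources(63, sources, patches));
// ICU 64: no patches exist for this version, so nothing is replaced.
console.log(resolveIcuSources(64, sources, patches));
```

Keying the lookup on the major version is what makes stale patches harmless:
once ICU moves on, an old patch directory is simply never consulted.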

### Example

For example, to patch `source/tools/toolutil/pkg_genc.cpp` for
ICU version 63:

```bash
# go to your Node.js source directory (replace with your checkout path)
cd /path/to/node

# create the floating patch directory
mkdir -p tools/icu/patches/63

# create the subdirectory for the file(s) to patch:
mkdir -p tools/icu/patches/63/source/tools/toolutil/

# copy the file to patch
cp deps/icu-small/source/tools/toolutil/pkg_genc.cpp \
    tools/icu/patches/63/source/tools/toolutil/pkg_genc.cpp

# make any changes to this file in your editor of choice:
"${EDITOR:-vi}" tools/icu/patches/63/source/tools/toolutil/pkg_genc.cpp

# test
make clean && ./configure && make
```

You should see a message such as:

```console
INFO: Using floating patch "tools/icu/patches/63/source/tools/toolutil/pkg_genc.cpp" from "tools/icu"
```

### Clean up

Any patches older than the minimum version given in `tools/icu/icu_versions.json`
ought to be deleted, because they will never be used.

### Why not just modify the ICU source directly?

Especially given the V8 dependencies above, there may be times when a floating
patch to ICU is required. Though it seems expedient to simply change a file in
`deps/icu-small`, this is not the right approach for the following reasons:

1. **Repeatability.** Given the complexity of merging in a new ICU version,
   following the upgrade steps earlier in this document ought to be
   repeatable without concern for overwriting a patch.

2. **Verifiability.** Given the number of files modified in an ICU PR,
   a floating patch could easily be missed or dropped altogether the next time
   something is landed.

3. **Compatibility.** There are a number of ways that ICU can be loaded into
   Node.js (see the top level README.md). Only modifying `icu-small` would cause
   the patch not to be applied if the user specifies the ICU source code
   another way.

[CLDR]: http://cldr.unicode.org/
[Ecma402]: https://github.com/tc39/ecma402
[ICU]: http://site.icu-project.org/
[Unicode]: https://home.unicode.org/
[tz]: https://www.iana.org/time-zones