# Maintaining ICU in Node.js

## Background

International Components for Unicode ([ICU4C][ICU]) is used both by V8
and by Node.js directly to provide internationalization
functionality. To quote from icu-project.org:

> ICU is a mature, widely used set of C/C++ and Java libraries providing
> Unicode and Globalization support for software applications. ICU is
> widely portable and gives applications the same results on all platforms
> and between C/C++ and Java software.

If Node.js is configured to use its built-in ICU,
it uses a strict subset of ICU which is in
[deps/icu-small](https://github.com/nodejs/node/tree/HEAD/deps/icu-small).
A good description of the different ways Node.js can be built with ICU
support is in [api/intl.html](https://nodejs.org/api/intl.html).

## Data dependencies

ICU consumes and includes:

* Extracted locale data from [CLDR][]
* Extracted [Unicode][] data
* Time zone ([tz][]) data

The current versions of these items can be viewed with
`node -p process.versions`:

```console
$ node -p process.versions

{
  cldr: '35.1',
  icu: '64.2',
  tz: '2019a',
  unicode: '12.1'
}
```

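The same information can be read from a script. The following is a minimal
sketch; `pickIcuVersions` and `ICU_KEYS` are hypothetical names introduced
here for illustration and are not part of Node.js.

```js
'use strict';

// Keys that describe ICU and the data it bundles, as listed above.
const ICU_KEYS = ['icu', 'cldr', 'tz', 'unicode'];

// Return only the ICU-related entries of a `process.versions`-like object.
// Keys that are absent (for example in a build without Intl) are skipped.
function pickIcuVersions(versions = process.versions) {
  const result = {};
  for (const key of ICU_KEYS) {
    if (key in versions) result[key] = versions[key];
  }
  return result;
}

console.log(pickIcuVersions());
```
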
### Time zone data

Time zone data files are updated independently of ICU CLDR data. ICU and its
main data files do not need to be upgraded in order to apply time zone data file
fixes.

The [IANA tzdata][tz] project releases new versions and announces them on the
[`tz-announce`](https://mm.icann.org/pipermail/tz-announce/) mailing list.

The Unicode project takes new releases and publishes
[updated time zone data files](https://github.com/unicode-org/icu-data/tree/HEAD/tzdata/icunew)
in the unicode-org/icu-data repository.

All modern versions of Node.js use the version 44 ABI of the time zone data
files.

#### Example: updating the ICU `.dat` file

* Decompress `deps/icu-small/source/data/in/icudt##l.dat.bz2`, where `##` is
  the ICU major version number.
* Clone the unicode-org/icu-data repository and copy the latest `tzdata`
  release `le` files into the `source/data/in` directory.
* Follow the upstream [ICU instructions](https://unicode-org.github.io/icu/userguide/datetime/timezone/)
  to patch the ICU `.dat` file:
  > `for i in zoneinfo64.res windowsZones.res timezoneTypes.res metaZones.res;
  > do icupkg -a $i icudt*l.dat; done`
* Optionally, verify that there is only one of the above files listed when using
  `icupkg -l`.
* Optionally, extract each file using `icupkg -x` and verify the `shasum`
  matches the desired value.
* Compress the `.dat` file with the same filename as in the first step.
* Build and test, verifying that `process.versions.tz` matches the desired
  version.
* Create a new minor version release.

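The optional checksum step above can also be scripted. This is a sketch under
stated assumptions: `fileDigest` is a hypothetical helper, and the file name
and expected digest in the usage comment are placeholders. SHA-1 is the
default because that is what `shasum` computes when no algorithm is given.

```js
'use strict';

const crypto = require('crypto');
const fs = require('fs');

// Compute the hex digest of a file. Pass 'sha256' as the second argument
// to mirror `shasum -a 256`.
function fileDigest(path, algorithm = 'sha1') {
  return crypto.createHash(algorithm)
    .update(fs.readFileSync(path))
    .digest('hex');
}

// Hypothetical usage, comparing an extracted file against a known value:
// const ok = fileDigest('zoneinfo64.res') === expectedDigest;
```

Reading the whole file into memory keeps the sketch short; the `.res` files
involved are small enough for that to be reasonable.
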
## Release schedule

ICU typically has more than one release a year, particularly coinciding with a
major release of [Unicode][]. The current release schedule is available in the
left sidebar of the [ICU][] website.

### V8 depends on ICU

V8 will aggressively upgrade to a new ICU version, due to requirements for
features/bugfixes needed for [Ecma402][] support. The minimum required version
of ICU is specified within the V8 source tree. If the ICU version is too old,
V8 will not compile.

```c
// deps/v8/src/objects/intl-objects.h
#define V8_MINIMUM_ICU_VERSION 65
```

V8 in Node.js depends on the ICU version supplied by Node.js.

The file `tools/icu/icu_versions.json` contains the current minimum
version of ICU that Node.js is known to work with. This should be
_at least_ the same version as V8, so that users will find out
earlier that their ICU is too old. A test case validates this when
Node.js is built.

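The comparison described above amounts to checking a major version number
against a minimum. The sketch below is illustrative only and is not the test
case that ships with Node.js; `icuMajor` and `meetsMinimum` are hypothetical
helpers, and the value `65` simply mirrors the header excerpt shown earlier.

```js
'use strict';

// Extract the major version from an ICU version string such as '64.2'.
function icuMajor(version) {
  return Number.parseInt(String(version).split('.')[0], 10);
}

// True when `version` satisfies the required minimum major version.
function meetsMinimum(version, minimumMajor) {
  return icuMajor(version) >= minimumMajor;
}

// In practice the minimum would come from the V8 header or from
// tools/icu/icu_versions.json rather than being hard-coded.
const MINIMUM = 65;
console.log(`ICU ${process.versions.icu} meets minimum ${MINIMUM}:`,
            meetsMinimum(process.versions.icu, MINIMUM));
```
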
## How to upgrade ICU

> The script `tools/dep_updaters/update-icu.sh` automates
> this process.

* Make sure your Node.js workspace is clean (`git status`
  should be sufficient).
* Configure Node.js with the specific [ICU version](http://site.icu-project.org/download)
  you want to upgrade to, for example:

```bash
./configure \
    --with-intl=full-icu \
    --with-icu-source=https://github.com/unicode-org/icu/releases/download/release-67-1/icu4c-67_1-src.tgz
make
```

> _Note_: In theory, the equivalent `vcbuild.bat` commands should also work,
> but the commands below are makefile-centric.

* If there are ICU version-specific changes needed, you may need to make changes
  in `tools/icu/icu-generic.gyp` or add patch files to `tools/icu/patches`.
  * Specifically, look at the `sources!` lists in `tools/icu/icu-generic.gyp`
    for files to exclude.

* Verify the Node.js build works:

```bash
make test-ci
```

Also, running

<!-- eslint-disable strict -->

```js
new Intl.DateTimeFormat('es', { month: 'long' }).format(new Date(9E8));
```

…should return `enero`, not `January`.

* Now, run the shrink tool to update `deps/icu-small` from `deps/icu`:

> :warning: Do not modify any source code in `deps/icu-small` !
> See section below about floating patches to ICU.

```bash
python tools/icu/shrink-icu-src.py
```

* Now, do a clean rebuild of Node.js to test:

```bash
make -k distclean
./configure
make
```

* Test this newly default-generated Node.js:

<!-- eslint-disable strict -->

```js
process.versions.icu;
new Intl.DateTimeFormat('es', { month: 'long' }).format(new Date(9E8));
```

(This should print your updated ICU version number, and also `enero` again.)

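The manual check above can be wrapped in a small script so it is easy to
rerun after each rebuild. This is only a sketch; `monthName` is a
hypothetical helper name. The date `9E8` milliseconds after the epoch falls
in January 1970 in every time zone, so the result does not depend on the
local time zone.

```js
'use strict';

// Format the month of a fixed date in the given locale.
function monthName(locale) {
  const formatter = new Intl.DateTimeFormat(locale, { month: 'long' });
  return formatter.format(new Date(9E8));
}

// With Spanish locale data available, 'es' yields 'enero'. A build that
// falls back to English data would yield 'January' instead.
const spanish = monthName('es');
console.log(`ICU ${process.versions.icu}:`,
            spanish === 'enero' ? 'Spanish locale data present' :
                                  `unexpected month name: ${spanish}`);
```
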
You are ready to check in (`git add`) the updated `deps/icu-small`.

> :warning: Do not modify any source code in `deps/icu-small` !
> See section below about floating patches to ICU.

* Now, rebuild the Node.js license.

```bash
# clean up - remove deps/icu
make clean
tools/license-builder.sh
```

* Update the URL and hash for the full ICU file in `tools/icu/current_ver.dep`.
  It should match the ICU URL used in the first step. When this is done, the
  following should build with small ICU.

```bash
# clean up
rm -rf out deps/icu deps/icu4c*
./configure --with-intl=small-icu --download=all
make
make test-ci
```

* Commit the change to the `deps/icu-small`, `tools/icu/current_ver.dep`
  and `LICENSE` files.

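The source URL passed to `--with-icu-source` and recorded in
`tools/icu/current_ver.dep` follows a visible naming pattern. The sketch
below builds it from a version string. The pattern is inferred only from the
`release-67-1` example above, so treat it as an assumption and confirm the
actual file name on the ICU release page; `icuSourceUrl` is a hypothetical
helper.

```js
'use strict';

// Build the ICU4C source tarball URL for a version such as '67.1'.
// Assumes the release tag and file name follow the 67.1 example.
function icuSourceUrl(version) {
  const [major, minor] = version.split('.');
  return 'https://github.com/unicode-org/icu/releases/download/' +
         `release-${major}-${minor}/icu4c-${major}_${minor}-src.tgz`;
}

console.log(icuSourceUrl('67.1'));
```
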
## Floating patches to ICU

Floating patches are applied at `configure` time. The "patch" files
are used instead of the original source files. The patch files are
complete `.cpp` files replacing the original contents.

Patches are tied to a specific ICU version. They won't apply to a
future ICU version. We assume that you filed a bug against [ICU][] and
upstreamed the fix, so the patch won't be needed in a later ICU
version.

### Example

For example, to patch `source/tools/toolutil/pkg_genc.cpp` for
ICU version 63:

```bash
# go to your Node.js source directory (replace <node> with its path)
cd <node>

# create the floating patch directory
mkdir -p tools/icu/patches/63

# create the subdirectory for the file(s) to patch:
mkdir -p tools/icu/patches/63/source/tools/toolutil/

# copy the file to patch
cp deps/icu-small/source/tools/toolutil/pkg_genc.cpp \
   tools/icu/patches/63/source/tools/toolutil/pkg_genc.cpp

# make any changes to this file in your editor of choice:
"${EDITOR:-vi}" tools/icu/patches/63/source/tools/toolutil/pkg_genc.cpp

# test
make clean && ./configure && make
```

You should see a message such as:

```console
INFO: Using floating patch "tools/icu/patches/63/source/tools/toolutil/pkg_genc.cpp" from "tools/icu"
```

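To see which replacement files exist for a given ICU version before running
`configure`, the patch directory can be listed with a short script. This is a
sketch of the directory layout described above, not the logic `configure`
itself uses; `listFiles` and `floatingPatchesFor` are hypothetical helpers.

```js
'use strict';

const fs = require('fs');
const path = require('path');

// Recursively list files under `dir`, as paths relative to `base`.
function listFiles(dir, base = dir) {
  const found = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) found.push(...listFiles(full, base));
    else found.push(path.relative(base, full));
  }
  return found.sort();
}

// Return the replacement files that exist for one ICU major version.
// `patchRoot` would be tools/icu/patches in a Node.js checkout.
function floatingPatchesFor(patchRoot, icuMajor) {
  const dir = path.join(patchRoot, String(icuMajor));
  return fs.existsSync(dir) ? listFiles(dir) : [];
}
```

For the example above, `floatingPatchesFor('tools/icu/patches', 63)` would be
expected to include `source/tools/toolutil/pkg_genc.cpp`.
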
### Clean up

Any patches older than the minimum version given in `tools/icu/icu_versions.json`
ought to be deleted, because they will never be used.

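Finding the stale patch directories can be sketched as follows. The minimum
version is passed in as a number, since the layout of
`tools/icu/icu_versions.json` is not described here; `stalePatchDirs` is a
hypothetical helper, and it only reports directories without deleting
anything.

```js
'use strict';

const fs = require('fs');

// Return the names of version directories in `patchRoot` that are older
// than `minimumMajor`. Only purely numeric directory names are considered.
function stalePatchDirs(patchRoot, minimumMajor) {
  return fs.readdirSync(patchRoot, { withFileTypes: true })
    .filter((entry) => entry.isDirectory() && /^\d+$/.test(entry.name))
    .map((entry) => entry.name)
    .filter((name) => Number(name) < minimumMajor)
    .sort((a, b) => Number(a) - Number(b));
}
```
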
### Why not just modify the ICU source directly?

Especially given the V8 dependencies above, there may be times when a floating
patch to ICU is required. Though it seems expedient to simply change a file in
`deps/icu-small`, this is not the right approach for the following reasons:

1. **Repeatability.** Given the complexity of merging in a new ICU version,
   following the upgrade steps in the prior section of this document ought to
   be repeatable without concern for overwriting a patch.

2. **Verifiability.** Given the number of files modified in an ICU PR,
   a floating patch could easily be missed or dropped altogether the next time
   something is landed.

3. **Compatibility.** There are a number of ways that ICU can be loaded into
   Node.js (see the top level README.md). Only modifying `icu-small` would cause
   the patch not to be applied when the user specifies the ICU source code
   another way.

[CLDR]: http://cldr.unicode.org/
[Ecma402]: https://github.com/tc39/ecma402
[ICU]: http://site.icu-project.org/
[Unicode]: https://home.unicode.org/
[tz]: https://www.iana.org/time-zones