1--- 2layout: default 3title: Healthy Code 4parent: Release & Milestone Tasks 5grand_parent: Contributors 6nav_order: 50 7--- 8 9<!-- 10© 2021 and later: Unicode, Inc. and others. 11License & terms of use: http://www.unicode.org/copyright.html 12--> 13 14# Healthy Code 15{: .no_toc } 16 17## Contents 18{: .no_toc .text-delta } 19 201. TOC 21{:toc} 22 23--- 24 25## Check for ClassID in new class hierarchies 26 27Check that there are no "poor man's RTTI" methods in new class hierarchies. 28 29After ICU 50: New class hierarchies should not declare getDynamicClassID() at 30all. UObject has a dummy implementation for it. 31 32ICU 4.6..50: New class hierarchies used UOBJECT_DEFINE_NO_RTTI_IMPLEMENTATION. 33See Normalizer2 for an example declaration and implementation that satisfies the 34virtual-function override without adding new ClassID support. 35 36We do need to keep and add "poor man's RTTI" in old classes, and in new classes 37extending existing class hierarchies (where parent classes have @stable RTTI 38functions). (For example, a new Format subclass.) 39 40One easy way to check for this is to search through the API change report and 41look for "UClassID" or similar. 42 43* Latest trunk report: 44 [icu/icu4c/APIChangeReport.html](https://github.com/unicode-org/icu/blob/main/icu4c/APIChangeReport.html) 45* ~~The automated build system creates a new report: 46 <https://cldr-build.unicode.org/jenkins/job/icu/job/icu-apidocs/>~~ 47 * Click "ICU4C API Change Report vs Latest" 48* Old: was http://bugs.icu-project.org/trac/build/icu4cdocs 49 * Go into the latest successful build, the report is an attachment there. 50 * Download the style sheet next to it: 51 https://github.com/unicode-org/icu/blob/main/icu4c/icu4c.css 52 53--- 54 55## ~~Check for non-ascii characters in ICU4C source files \[obsolete\]~~ 56 57~~Note: ICU4C and ICU4J source files are UTF-8. The ASCII check is no longer 58appropriate for them.~~ 59 60```sh 61cd icu4c/source 62find . \( -name "*.[ch]" -or -name "*.cpp" \) -exec grep -PHn [^[:ascii:]] {} \; 63``` 64 65--- 66 67## Check source files for valid UTF-8 and correct text file line endings 68 69Verify that all source and text files in the repository have plain LF line 70endings. 71 72To do this on Linux, In an up-to-date git workspace, 73 74```sh 75cd icu/icu4c/source 76tools/icu-file-utf8-check.py # reports problems 77``` 78 79The same python script from the icu4c tools will also check icu4j 80 81```sh 82cd icu/icu4j 83../icu4c/source/tools/icu-file-utf8-check.py 84``` 85 86To double-check the line endings, the following grep will find all text files 87containing \\r characters. Do not run from Windows, where \\r\\n line endings 88are expected. 89 90```sh 91cd icu 92grep -rPIl "\r" * 93``` 94 95Even when run from Mac or Linux, some WIndows specific files (.bat, etc) will be 96found by this check. This is OK. 97 98## ~~Check UTF-8 file properties \[obsolete\]~~ 99 100*Note: As of ICU 63, the project moved from svn to GitHub. SVN file properties 101are no longer relevant.* 102 103<span style="color:red">**Note: As of ICU 59, ICU4C source files are UTF-8 104encoded, and have the svn mime-type property "text/plain;charset=utf-8". They 105must not have a BOM.**</span> 106 107This is checked by the above task, *Check svn properties, valid UTF-8 and text 108file line endings.* 109 110The bomfix.py script, formerly used for this task, must *not* be run over the 111ICU4C sources 112 113<span style="color:red">**~~Note: This task is only applicable to ICU4C. ICU4J 114.java source files are encoded by UTF-8, but must be without UTF-8 115BOM.~~**</span> 116 117~~Check that the following match: Files marked as UTF-8 vs. Files beginning with 118the UTF-8 signature byte sequence ("BOM").~~ 119 120~~Run:~~ 121 122<pre><code><s>cd {icu}/icu4c</s> 123<s>python ../tools/release/c/bomfix.py</s></code></pre> 124 125--- 126 127## Clean up import statements 128 129The Eclipse IDE provides a feature which allow you to organize import statements 130for multiple files. Right click on projects/source folders/files, you can select 131\[Source\] - \[Organize Imports\] which resolve all wildcard imports and sort 132the import statements in a consistent order. (Note: You may experience OOM 133problem when your run this for projects/folders which contain many files. In 134this case, you may need to narrow a selection per iteration.) 135 136--- 137 138## Check library dependencies 139 140***ICU 64+ (2019+): Done automatically by a build bot, at least in one of the 141two modes (debug/release), ok to skip as BRS task.*** 142 143We want to keep dependencies between .c/.cpp/.o files reasonable, both between 144and inside ICU's libraries. 145 146On Linux, run 147[source/test/depstest/depstest.py](https://github.com/unicode-org/icu/blob/main/icu4c/source/test/depstest/dependencies.py), 148for example: 149```sh 150~/icu/mine/src/icu4c/source/test/depstest$ ./depstest.py ~/icu/mine/dbg/icu4c 151``` 152 153Do this twice: Once for a release build (optimized) and once for a debug build 154(unoptimized). They pull in slightly different sets of standard library symbols 155(see comments in dependencies.txt). 156 157If everything is fine, the test will print "OK: Specified and actual 158dependencies match." If not... 159 160*Get changes reviewed by others, including Markus; including changes in 161dependencies.txt, .py scripts, or ICU4C code.* 162 163At first, the test will likely complain about .o files not listed in its 164dependencies.txt or, if files were removed, the other way around. Try to add 165them to groups or create new groups as appropriate. 166 167As a rule, smaller groups with fewer dependencies are preferable. If public API 168(e.g., constructing a UnicodeString via conversion) is not needed inside ICU 169(e.g., unistr_cnv), make its library depend on its new group. 170 171If the test prints "Info: group icuplug does not need to depend on platform" 172then the plug-in code is disabled, as is the default since ICU 56. Consider 173enabling it for dependency checking, but make sure to revert that before you 174commit changes! 175 176There are usually other "Info" messages where the tool thinks that dependencies 177on system symbols are not needed. These are harmless, that is, don't try to 178remove them: They are needed or not needed based on the compiler flags used. If 179you remove them, then you will likely cause an error for someone with different 180flags. Also, in an unoptimized build you only get half as many info messages. 181You get more in an optimized build because more system stuff gets inlined. 182 183The test might complain "is this a new system symbol?" We should be careful 184about adding those. For example, we must not call printf() from library code, 185nor the global operator new. 186 187The test might complain that some .o file "imports 188`icu_48::UnicodeString::UnicodeString(const char *)` but does not depend on 189unistr_cnv.o". This probably means that someone passes a simple `"string literal"` 190or a `char *` into a function that takes a UnicodeString, which invokes the 191default-conversion constructor. We do not want that! In most cases, such code 192should be fixed. Only implementations 193of API that require conversion should depend on it; for example, group 194formattable_cnv depends on group unistr_cnv, but then nothing inside ICU depends 195on that. 196 197--- 198 199## Verify proper memory allocation functions 200 201Verify the following for library code (common, i18n, layout, ustdio). The 202requirement is for ICU's memory management to be customizable by changing 203cmemory.h and the common base class. 204 205**Note:** The requirement remains. The techniques to fix issues are valid. 206***For testing**, see the section "Check library dependencies" above.* 207 208* No raw malloc/free/realloc but their uprv_ versions. 209* All C++ classes must inherit the common base class UObject or UMemory 210 * But not for mixin/interface classes (pure virtual, no data members, no 211 constructors) because that would break the single-inheritance model. 212 * Also not for pure-static classes (used for name scoping) declare but 213 don't implement a private default constructor to prevent instantiation. 214 * Simple struct-like C++ classes (and structs) that do not have 215 constructors, destructors, and virtual methods, need not inherit the 216 base class but must then be allocated with uprv_malloc. 217* All simple types (UChar, int32_t, pointers(!), ...) must be allocated with 218 uprv_malloc and released with uprv_free. 219* For Testing that this is the case, on Linux 220 * run the Perl script ICUMemCheck.pl. Follow the instructions in the 221 script. The script is in in the ICU4C sources at 222 source/tools/memcheck/ICUMemCheck.pl. 223 * also, depstest.py will show an error message if the global operator new 224 is used (see section about checking dependencies) 225* For testing that this is the case, on Windows: 226 * Don't bother, as of Nov, 2004. Failures appear in many dozens of files 227 from the mixin class destructors. Do the check on Linux. But, for 228 reference, here are the old instructions. 229 * Make sure that in `uobject.h` `UObject::new` and `delete` are 230 defined by default. Currently, this means to grep to see that 231 `U_OVERRIDE_CXX_ALLOCATION` is defined to 1 (in `pwin32.h` for 232 Windows). 233 * Check the \*.obj's for linkage to the global (non-UObject::) 234 operators new/delete; see uobject.cpp for details. 235 * Global `new` must never be imported. Global `delete` will be 236 imported and used by empty-virtual destructors in interface/mixin 237 classes. However, they are not called because implementation classes 238 always derive from UMemory. No other functions must use global 239 delete. 240 * There are now (2002-dec-17) definitions in `utypes.h` for global 241 new/delete, with inline implementations that will always crash. 242 These global new/delete operators are only defined for code inside 243 the ICU4C libraries (but must be there for all of those). See ticket 244 #2581. 245 * If a global new/delete is used somewhere, then change the class 246 inheritance and/or use uprv_malloc/free until no global new/delete 247 are used in the libraries (and the tests still pass...). See the 248 User Guide Coding Guidelines for details. 249 250--- 251 252## Run static code analysis tools 253 254(Purify, Boundary Checker, valgrind...) 255 256Make sure we fixed all the memory leak problems that were discovered when 257running these tools. 258 259Build ICU with debug information. On Linux, 260 261```sh 262runConfigureICU --enable-debug --disable-release Linux 263``` 264 265Run all of the standard tests under valgrind. For intltest, for example, 266 267```sh 268cd <where ever>/source/test/intltest 269LD_LIBRARY_PATH=../../lib:../../stubdata:../../tools/ctestfw:$LD_LIBRARY_PATH valgrind ./intltest 270``` 271 272You can grab the command line for running the tests from the output from "make 273check", and then just insert "valgrind" before the executable. 274 275--- 276 277## Check the code coverage numbers 278 279Our goal is that all releases go out to the public with 100% API test and at 280least 85% code coverage. 281 282--- 283 284## Test ICU4C headers 285 286### Test ICU4C public headers 287 288Testing external dependencies in header files: 289 290(on Unixes) Prerequisite: Configure with --prefix 291(../icu4c/source/runConfigureICU Linux --prefix=/some/temp/folder) and do 'make 292install'. Then set the PATH so that the installed icu-config script can be 293found. (export PATH=/some/temp/folder/**bin**:$PATH) 294 295Then go to the 'icu4c/test/hdrtst' directory (note: not 'source/test/hdrtst') 296and do 'make check'. This will attempt to compile against each header file 297individually to make sure there aren't any problems. Output looks like this, if 298no error springs up all is in order. 299 300~~If a C++ file fails to compile as a C file, add it to the 'cxxfiles.txt' 301located in the hdrtst directory.~~ 302 303<span style="color:red">**As of ICU 65, the hdrtst is now run as part of the 304regular CI builds, and the C++ headers are now guarded with the macro 305"U_SHOW_CPLUSPLUS_API".**</span> 306 307There is no longer any "cxxfiles.txt" file. Instead the public C++ headers are 308all guarded with the macro "U_SHOW_CPLUSPLUS_API" which is set to 1 by default 309if __cplusplus is defined. Users of ICU can explicitly set the macro before 310including any ICU headers if they wish to only use the C APIs. Any new public 311C++ header needs to be similarly guarded with the macro, though this should be 312caught in the CI builds for a pull-request before it is merged. 313 314Run this test with all the uconfig.h variations (see below). 315 316```sh 317ctest unicode/docmain.h 318ctest unicode/icudataver.h 319ctest unicode/icuplug.h 320``` 321 322### Test ICU4C internal headers 323 324Run the following script straight from the source tree (from inside the "source" 325folder, not on the top level), no need to build nor install. 326 327For a new release, also look for new tools and tests and add their folders to 328the script. You can ignore messages stating that no '\*.h' files were found in a 329particular directory. 330 331The command line is simply 332 333```sh 334~/git.icu/icu4c/source$ test/hdrtst/testinternalheaders.sh 335``` 336 337See https://unicode-org.atlassian.net/browse/ICU-12141 "every header file should 338include all other headers if it depends on definitions from them" 339 340<span style="color:red">**As of ICU 68, the internal header test is now 341automated as part of Travis CI.**</span> 342 343--- 344 345## Test uconfig.h variations 346 347Test ICU completely, and run the header test (above) with: 348 3491. **none** of the 'UCONFIG_NO_XXX' switches turned on (i.e., the normal case) 3502. **all** of the 'UCONFIG_NO_XXX' switches turned on (set to 1) 3513. For each switch, test once with **just** that one switch on. But note the 352 exception regarding [UCONFIG_NO_CONVERSION test](healthy-code.md) below. 353 354(See 355[common/unicode/uconfig.h](https://github.com/unicode-org/icu/blob/main/icu4c/source/common/unicode/uconfig.h) 356for more documentation.) 357 358There is a script available which will automatically test ICU in this way on 359UNIXes, it lives in: 360[tools/release/c/uconfigtest.sh](https://github.com/unicode-org/icu/blob/main/tools/release/c/uconfigtest.sh). 361See docs at top of script for information. 362 363<span style="background-color:yellow">When guard conditionals (e.g. #ifndef 364U_HIDE_INTERNAL_API) are removed because they cause header test failures, please 365note in the header file the reason that guard conditionals cannot be used in 366that location, or they will lkeiely be re-added in the future. 367 368--- 369 370## Test C++ Namespace Use 371 372Verify that ICU builds without enabling the default use of the ICU namespace. To 373test on Linux, 374 375```sh 376./runConfigureICU Linux CXXFLAGS="-DU_USING_ICU_NAMESPACE=0" 377make check 378``` 379 380Any problems will show up as compilation errors. 381 382When definitions outside the ICU C++ namespace refer to ICU C++ classes, those 383need to be qualified with "`icu::`", as in "`icu::UnicodeString`". In rare 384cases, a C++ type is also visible in C code (e.g., ucol_imp.h has definitions 385that are visible to cintltst) and then we use `U_NAMESPACE_QUALIFIER` which is 386defined to be empty when compiling for C. 387 388The automated build system should have a machine that sets both 389`-DU_USING_ICU_NAMESPACE=0` and `-DU_CHARSET_IS_UTF8=1`. 390 391--- 392 393## Test UCONFIG_NO_CONVERSION 394 395Make sure that the ICU4C common and i18n libraries build with 396UCONFIG_NO_CONVERSION set to 1. We cannot do this as part of "Test uconfig.h 397variations" because the test suites cannot be built like this, but the library 398code must support it. 399 400The simplest is to take an ICU4C workspace, modify uconfig.h *==temporarily==* 401by changing the value of UCONFIG_NO_CONVERSION to 1, and do "make -j 6" (not 402"make check" or "make tests"). Verify that the stubdata, common & i18n libraries 403build fine; layout should build too but toolutil will fail, that's expected. 404 405Fix any stubdata/common/i18n issues, revert the UCONFIG_NO_CONVERSION value, and 406verify that it still works with the normal setting. 407 408If this breaks, someone probably inadvertently uses the `UnicodeString(const char *)` constructor. 409See the "Check library dependencies" section. 410 411--- 412 413## Test U_CHARSET_IS_UTF8 414 415Verify that ICU builds with default charset hardcoded to UTF-8. To test on 416Linux, 417 418```sh 419./runConfigureICU Linux CPPFLAGS="-DU_CHARSET_IS_UTF8=1" 420make -j6 check 421``` 422 423Any problems will show up as compilation or test errors. 424 425Rather than setting the CPPFLAGS, you can also temporarily add `#define 426U_CHARSET_IS_UTF8 1` in unicode/platform.h before it gets its default 427definition, or modify the default definition there. (In ICU 4.8 and earlier, 428this flag was in unicode/utypes.h.) 429 430This works best on a machine that is set to use UTF-8 as its system charset, 431which is not possible on Windows. 432 433The automated build system should have a machine that sets both 434`-DU_USING_ICU_NAMESPACE=0` and `-DU_CHARSET_IS_UTF8=1`. 435 436--- 437 438## Test U_OVERRIDE_CXX_ALLOCATION=0 439 440Verify that ICU builds with U_OVERRIDE_CXX_ALLOCATION=0 on Linux. Problems will 441show as build failures. 442 443```sh 444CPPFLAGS="-DU_OVERRIDE_CXX_ALLOCATION=0" ./runConfigureICU Linux 445make clean 446make -j12 check 447``` 448 449## ~~Test ICU_USE_THREADS=0 \[Obsolete\]~~ 450 451***Only necessary up to ICU4C 49.*** 452 453* ICU 50m1 removes ICU_USE_THREADS from the runtime code (ticket 454 [ICU-9010](https://unicode-org.atlassian.net/browse/ICU-9010)). 455* It is still possible to build and test intltest with ICU_USE_THREADS=0 but 456 not nearly as important. 457* In ICU 50m1, the `--disable-threads` configure option is gone. If you want 458 to test with ICU_USE_THREADS=0 then temporarily change this flag in 459 intltest.h or in the intltest Makefile. 460 461Verify that ICU builds and tests with threading disabled. To test on Linux, 462 463```sh 464./runConfigureICU Linux --disable-threads 465make check 466``` 467 468--- 469 470## Test ICU4C Samples and Demos 471 472### Windows build and test 473To build the ICU4C samples on Windows with Visual Studio, use the following 474steps: 475 476* Open the "allinone" solution file located under 477 "source\\allinone\\allinone.sln". 478* Build the Debug/Release + x86/x64 configurations (all 4 configurations). 479 * Make sure the generated data file (ex: "icu4c\\bin\\icudt64.dll") is not 480 stub data, it should be ~26MB. 481* Open the "all" Solution file located under "source\\samples\\all\\all.sln". 482* Build both x86 and x64 using the "Batch Build" option. This is located under 483 the menu Build > Batch Build. 484 * Click the "Select All" button. 485 * Click the "Build" button. 486 * If Visual Studio returns errors using the Batch Build option, build each 487 configuration individually instead. 488* The samples should all build cleanly with no errors. 489 490To test the sample programs, run the "source\\samples\\all\\samplecheck.bat" 491script for each configuration, and ensure that they are successful. 492 493### Linux /Unix build and test 494To build and test ICU4C samples: 495 496* In icu4c/source, run the configuration "./runConfigure Linux" (or appropriate system) 497* Build and install ICU4C. 498* Set PATH to include the bin directory of the installed ICU4c. 499* Set LD_LIBRARY_PATH to include the libs directory on the installed ICU4c. 500 501 502``` 503cd icu4c/source 504cd samples 505# To clean all the test binaries 506make clean-samples-recursive 507# To rebuild them all 508make all-samples-recursive 509# To run all tests serially 510make check-samples-recursive 511 512``` 513 514 515## **Test ICU4C Demos via Docker** 516 517See <https://github.com/unicode-org/icu-demos/blob/main/icu-kube/README.md> 518 519--- 520 521## **Test ICU4J Web Demos via Docker** 522 523See: <https://github.com/unicode-org/icu-demos/blob/main/icu4jweb/README.md> 524 525--- 526 527## Test ICU4J Demos 528 529These are the demo applets, see above for the icu4jweb demos. 530 531To test ICU4J demo applications, cd to ICU4J directory and build and run the 532demo. 533 534```sh 535$ cd icu4j 536$ ant jarDemos 537$ java -jar icu4jdemos.jar 538``` 539 540Above command invokes GUI demo applications. As such it has to connect to a 541X-Server. The easiest way is to run via e.g. remote desktop on the machine on 542which it is executed instead of in a ssh shell. 543 544The demos include calendar, charset detection, holidays, RBNF and 545transliterator. Check if each application is working OK. 546 547To check ICU4J samples, open Eclipse workspace and import icu4j-samples project 548from directory <icu4j_root>/samples. Make sure these sample code has no build 549issues. Also run sample code with main and see if each sample code runs. 550 551--- 552 553## Test, exhaustive mode, C & J 554 555For ICU4J, 556 557```sh 558$ ant exhaustiveCheck 559``` 560 561For ICU4C, testing with an optimized build will help reduce the elapsed time 562required for the tests to complete. 563 564```sh 565$ make -j6 check-exhaustive 566``` 567 568--- 569 570## Test ICU4C with the Thread Sanitizer 571 572The build bots run the thread sanitizer on the most interesting multithreaded 573tests. These instructions run the sanitizer on the entire test suite. The clang 574compiler is required. 575 576```sh 577$ CPPFLAGS=-fsanitize=thread LDFLAGS=-fsanitize=thread ./runConfigureICU --enable-debug --disable-release Linux --disable-renaming 578$ make clean 579$ make -j6 check 580``` 581