<html>
<head>
<meta http-equiv="Content-Language" content="en-us">
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
<title>ICU's Unicode Tools Read Me</title>
    <meta name="COPYRIGHT" content=
    "Copyright (c) 2004-2006 IBM Corporation and others. All Rights Reserved." />
<style>
<!--
li           { margin-top: 0.5em; margin-bottom: 0.5em }
-->
</style>
</head>

<body>

<h1>UnicodeTools</h1>
<p>This file provides instructions for building and running the UnicodeTools, which<br>
can be used to:</p>
<ul>
  <li>build the Derived Unicode files in the UCD (Unicode Character Database),</li>
  <li>build the transformed UCA (Unicode Collation Algorithm) files needed by ICU,</li>
  <li>run consistency checks on beta releases of the UCD and the UCA,</li>
  <li>build four chart folders on the Unicode site.</li>
</ul>
<p><font color="#FF0000"><b>WARNING!!</b></font></p>
<ul>
  <li>This is NOT production-level code, and should never be used in programs.</li>
  <li>The API is subject to change without notice, and will not be maintained.</li>
  <li>The source is uncommented, and has many warts; since it is not production code, it has not 
  been worth the time to clean it up.</li>
  <li>It will probably not work on Unix or Mac without changing the file separator.</li>
  <li>Currently it uses hard-coded directory names.</li>
  <li>The contents of multiple versions of the UCD must be copied to a local directory, as described 
  below.</li>
</ul>
<h2>Instructions:</h2>
<h3>0. You will need to get ICU4J on your system, using CVS.</h3>
<p>The rest of this will assume that you have set up CVS so that you load the ICU4J project into 
C:\ICU4J<br>
<br>
You need both the main icu4j and a subproject called unicodetools. See:
<a href="http://www.ibm.com/software/globalization/icu/repository.jsp">
http://www.ibm.com/software/globalization/icu/repository.jsp</a>. Inside unicodetools, look at com/ibm/text. The 
main directories of interest are UCD, UCA and utility.</p>
<h4>0a. If you are using Eclipse for your IDE, look at the instructions on
<a href="http://icu.sourceforge.net/docs/eclipse_howto/eclipse_howto.html">
http://icu.sourceforge.net/docs/eclipse_howto/eclipse_howto.html</a> </h4>
<p>Set up Eclipse to build two projects: ICU4J and UnicodeTools:<br>
<br>
<b>Project Name: </b>ICU4J<br>
<b>Directory: </b>C:\ICU4J\icu4j<br>
<b>Default Output Folder: </b>ICU4J/classes<br>
<br>
<b>Project Name: </b>unicodetools<br>
<b>Create project from existing source: </b>C:\ICU4J\unicodetools<br>
<b>Default Output Folder: </b>unicodetools/classes<br>
<br>
After Eclipse is set up with these, exclude certain files from unicodetools:<br>
<br>
Right-Click UnicodeTools &gt; Properties &gt; Java Build Path &gt; Exclusions<br>
com/ibm/rbm/<br>
com/ibm/text/utility/UnicodeMapInt.java<br>
com/ibm/text/utility/TestUtility.java<br>
com/ibm/text/UCD/GenerateThaiBreaks-old.java/<br>
com/ibm/text/UCD/ProcessUnihan.java/<br>
com/ibm/text/UCA/WriteHTMLCollation.java/<br>
<br>
UnicodeTools must also include the ICU4J project, with<br>
<br>
Right-Click UnicodeTools &gt; Properties &gt; Java Build Path &gt; Projects</p>
<h3>1. In UCD, you must edit UCD_Types.java at the top, to set the directories for the build:</h3>
<p>public static final String DATA_DIR = &quot;C:\\DATA\\&quot;;<br>
public static final String UCD_DIR = BASE_DIR + &quot;UCD\\&quot;;<br>
public static final String BIN_DIR = DATA_DIR + &quot;BIN\\&quot;;<br>
public static final String GEN_DIR = DATA_DIR + &quot;GEN\\&quot;;<br>
<br>
Make sure that each of these directories exists. Also make sure that the following<br>
exist:<br>
<br>
&lt;GEN_DIR&gt;/DerivedData<br>
&lt;GEN_DIR&gt;/DerivedData/ExtractedProperties<br>
&lt;UCD_DIR&gt;/EXTRAS-Update</p>
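<p>The directory checks above can be scripted. The following is only a sketch, not part of the tools: 
the class name <code>UcdDirSetup</code> and the method <code>createLayout</code> are hypothetical, and the 
layout is assumed to be BIN, GEN and UCD directly under one data root, as in the default settings shown above.</p>

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of UnicodeTools): creates the directories
// that step 1 says must exist, under an assumed common data root.
final class UcdDirSetup {

    /** Creates BIN, GEN/DerivedData/ExtractedProperties and UCD/EXTRAS-Update. */
    static List<Path> createLayout(Path dataRoot) throws IOException {
        List<Path> required = new ArrayList<>();
        required.add(dataRoot.resolve("BIN"));
        required.add(dataRoot.resolve("GEN").resolve("DerivedData").resolve("ExtractedProperties"));
        required.add(dataRoot.resolve("UCD").resolve("EXTRAS-Update"));
        for (Path dir : required) {
            // Creates missing parent directories too; does nothing if the directory already exists.
            Files.createDirectories(dir);
        }
        return required;
    }

    public static void main(String[] args) throws IOException {
        // The data root is taken from the first argument; "DATA" is an assumed fallback.
        Path root = Paths.get(args.length > 0 ? args[0] : "DATA");
        for (Path dir : createLayout(root)) {
            System.out.println("present: " + dir.toAbsolutePath());
        }
    }
}
```

<p>Running it with <code>C:\DATA</code> as the argument would prepare the default layout; the paths set in 
UCD_Types.java still have to point at the same root.</p>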
<h3>2. Download all of the UnicodeData files for each version into UCD_DIR.</h3>
<p>The folder names must be of the form &quot;3.2.0-Update&quot;, so rename the folders from the<br>
Unicode site to this format. <span style="background-color: #FFFF00">If a 
folder contains a ucd subdirectory, make the contents of that subdirectory the contents of 
the x.x.x-Update directory. That is, each x.x.x-Update directory will directly contain files 
like PropList....txt</span></p>
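<p>A quick way to check the folder names before running the tools is a pattern match. This is a sketch 
only: the class <code>UpdateFolderNames</code> is hypothetical, and the pattern simply encodes the 
&quot;3.2.0-Update&quot; form described above. How strictly the tools themselves check the names is not 
stated here.</p>

```java
import java.util.regex.Pattern;

// Hypothetical check (not part of UnicodeTools) for the "x.x.x-Update" folder-name form.
final class UpdateFolderNames {

    // Three dot-separated numbers followed by "-Update", for example "3.2.0-Update".
    private static final Pattern EXPECTED_FORM = Pattern.compile("\\d+\\.\\d+\\.\\d+-Update");

    static boolean isExpectedForm(String folderName) {
        return EXPECTED_FORM.matcher(folderName).matches();
    }

    public static void main(String[] args) {
        for (String name : new String[] {"3.2.0-Update", "4.0-Update", "EXTRAS-Update"}) {
            System.out.println(name + " -> " + isExpectedForm(name));
        }
    }
}
```

<p>Note that EXTRAS-Update deliberately does not match: it is a separate directory required in step 1, 
not a versioned release folder.</p>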
<h4>2a. Ensure Complete Release</h4>
<p>If you are downloading any &quot;incomplete&quot; release (one that does not contain a complete set of data 
files for that release), you also need to download the previous complete release. Most of the N.M-Update 
directories are complete, *except*:</p>
<p>4.0-Update, which does not contain a copy of Unihan.txt and some other files<br>
3.1-Update, which does not contain a copy of BidiMirroring.txt</p>
<p>Also, make the following changes to UnicodeData for 1.1.5:</p>
<p><b>Delete:</b></p>
<pre>3400;HANGUL SYLLABLE KIYEOK A;Lo;0;L;1100 1161;;;;N;;;;;
...
4DFF;HANGUL SYLLABLE MIEUM WEO RIEUL-THIEUTH;Lo;0;L;1106 116F 11B4;;;;N;;;;;
4E00;&lt;CJK IDEOGRAPH REPRESENTATIVE&gt;;Lo;0;L;;;;;N;;;;;</pre>
<p><b>Add:</b></p>
<pre>4E00;&lt;CJK Ideograph, First&gt;;Lo;0;L;;;;;N;;;;;
9FA5;&lt;CJK Ideograph, Last&gt;;Lo;0;L;;;;;N;;;;;
E000;&lt;Private Use, First&gt;;Co;0;L;;;;;N;;;;;
F8FF;&lt;Private Use, Last&gt;;Co;0;L;;;;;N;;;;;</pre>
<p><b>And from a later version of Unicode, add:</b></p>
<pre>F900;CJK COMPATIBILITY IDEOGRAPH-F900;Lo;0;L;8C48;;;;N;;;;;
...
FA2D;CJK COMPATIBILITY IDEOGRAPH-FA2D;Lo;0;L;9DB4;;;;N;;;;;</pre>
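<p>When editing UnicodeData by hand, it is easy to drop or add a semicolon. The sketch below shows one way 
to sanity-check an edited line; the class <code>UnicodeDataLine</code> is hypothetical and is not how the 
tools themselves read the file. It assumes the 15 semicolon-separated fields that the lines above have, 
and treats names ending in &quot;, First&gt;&quot; or &quot;, Last&gt;&quot; as range markers.</p>

```java
// Hypothetical parser (not part of UnicodeTools) for one line of UnicodeData.
final class UnicodeDataLine {
    final int codePoint;
    final String name;
    final String generalCategory;

    private UnicodeDataLine(int codePoint, String name, String generalCategory) {
        this.codePoint = codePoint;
        this.name = name;
        this.generalCategory = generalCategory;
    }

    static UnicodeDataLine parse(String line) {
        // A limit of -1 keeps the trailing empty fields, so the count is reliable.
        String[] fields = line.split(";", -1);
        if (fields.length != 15) {
            throw new IllegalArgumentException("expected 15 fields, got " + fields.length);
        }
        return new UnicodeDataLine(Integer.parseInt(fields[0], 16), fields[1], fields[2]);
    }

    /** True for range-start markers such as <CJK Ideograph, First>. */
    boolean isRangeFirst() {
        return name.startsWith("<") && name.endsWith(", First>");
    }

    /** True for range-end markers such as <CJK Ideograph, Last>. */
    boolean isRangeLast() {
        return name.startsWith("<") && name.endsWith(", Last>");
    }

    public static void main(String[] args) {
        UnicodeDataLine first = parse("4E00;<CJK Ideograph, First>;Lo;0;L;;;;;N;;;;;");
        UnicodeDataLine last = parse("9FA5;<CJK Ideograph, Last>;Lo;0;L;;;;;N;;;;;");
        System.out.printf("%04X..%04X %s%n", first.codePoint, last.codePoint, first.generalCategory);
    }
}
```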
<h4>2b. UCA data</h4>
<p>If you are building any of the UCA tools, you need to get a copy of the UCA data file<br>
from <a href="http://www.unicode.org/reports/tr10/#AllKeys">http://www.unicode.org/reports/tr10/#AllKeys</a>. The default location for this is:<br>
<br>
BASE_DIR + &quot;Collation\allkeys&quot; + VERSION + &quot;.txt&quot;.<br>
<br>
If you have it in a different location, change that value for KEYS in UCA.java, and <br>
the value for BASE_DIR.</p>
<h4>2c. Here is an example of the default directory structure with files. All of 
the yellow ones should exist.</h4>
<pre>C:/DATA/

        BIN/
    
<span style="background-color: #FFFF00">        Collation/
            allkeys-3.1.1.txt
</span>        
        GEN/
            DerivedData/
<span style="background-color: #FFFF00">        </span><span style="background-color: #FFFF00">UCD/
            3.0.0-Update/
                Unihan-3.2.0.txt
                ...
            3.0.1-Update/
                ...
            3.1.0-Update/
                ...
            3.1.1-Update/
                ...
            3.2.0-Update/
                ...
            4.0.0-Update/
                ArabicShaping-4.0.0d14b.txt
                BidiMirroring-4.0.0d1b.txt
                ...
            EXTRAS-Update/</span></pre>
<h3>3. Versions</h3>
<p>All of the following accept &quot;version X&quot; in the options you give to Java (either on the 
command line, or in the Eclipse 'run' options). If you want a specific version like 3.1.0, then you 
would write &quot;version 3.1.0&quot;. If you want the latest version (4.1.0), you can omit the &quot;version X&quot;.</p>
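<p>To make the convention concrete, here is a sketch of how a &quot;version X&quot; pair could be picked 
out of the program arguments, falling back to a default when it is omitted. The class 
<code>VersionArgs</code> is hypothetical; the real argument handling in the tools' Main classes is not 
shown in this file and may differ. The default of 4.1.0 is taken from the paragraph above.</p>

```java
// Hypothetical illustration (not the tools' own parser) of the "version X" option.
final class VersionArgs {

    // "Latest version" as stated in the text above; an assumption for this sketch.
    static final String DEFAULT_VERSION = "4.1.0";

    /** Returns the value following "version", or the default if the option is absent. */
    static String versionFrom(String[] args) {
        for (int i = 0; i + 1 < args.length; i++) {
            if (args[i].equalsIgnoreCase("version")) {
                return args[i + 1];
            }
        }
        return DEFAULT_VERSION;
    }

    public static void main(String[] args) {
        System.out.println("Using UCD version " + versionFrom(args));
    }
}
```

<p>With the arguments <code>version 3.1.0 build</code> this selects 3.1.0; with <code>build</code> alone it 
falls back to the default.</p>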
<h3>4. Building Files</h3>
<ol>
	<li><b>Setup</b><ol>
		<li>In Eclipse, open the Package Explorer (use Window&gt;Show View if you 
		don't see it).</li>
		<li>Open UnicodeTools<ul>
			<li>com.ibm.text.UCD<ul>
				<li>MakeUnicodeFiles.<span style="background-color: #FFFF00">txt</span><p>This file drives the production of 
				the derived Unicode files. The first three lines contain 
				parameters that you may want to modify from time to time:</p>
				<pre>Generate: <b>.*script.*</b> <i>// this is a regular expression. Use .* for all files</i>
DeltaVersion: <b>10</b> <i>    // This gets appended to the file name. Pick 1+ the highest value in Public</i>
CopyrightYear: <b>2006</b> <i> // Pick the current year</i></pre>
				</li>
			</ul>
			</li>
		</ul>
		</li>
		<li>Open in Package Explorer 
		<ul>
			<li>com.ibm.text.UCD<ul>
				<li>Main</li>
			</ul>
			</li>
		</ul>
		</li>
		<li>Run&gt;Run As...<ol>
			<li>Choose Java Application<ul>
				<li>It will fail; don't worry, you still need to set some parameters.</li>
			</ul>
			</li>
		</ol>
		</li>
		<li>Run&gt;Run...<ul>
			<li>Select the Arguments tab, and fill in the following<ul>
				<li>Program arguments:<pre>build 5.0<span style="background-color: #FFFF00">.0</span> MakeUnicodeFiles</pre>
				</li>
				<li>VM arguments: 
				<pre>-Xms512m -Xmx512m</pre>
				</li>
			</ul>
			</li>
			<li>Close and Save</li>
		</ul>
		</li>
	</ol>
	</li>
	<li><b>Run</b><ol>
		<li>You'll see it build the 5.0 files, with something like the following 
		results:<pre>Writing UCD_Data5.0.0
Data Size: 109,802
Wrote Data 109802</pre>
		</li>
		<li>For each version, the tools build a set of binary data in BIN that 
		contains the information for that release. This is done automatically, or 
		you can do it manually with the Program Arguments<pre>version X build</pre>
		<p>This builds a compressed format of all the UCD data (except blocks 
		and Unihan) into the BIN directory. Don't worry about the voluminous 
		console messages, unless one says &quot;FAIL&quot;.</p>
		<p><font color="#FF0000"><i>You have to do this manually if you change 
		any of the data files in that version!</i></font></p>
		<p>Note: if for any reason you modify the binary format of the BIN files, you also have to bump the 
value of BINARY_FORMAT:</p>
		<pre>static final byte BINARY_FORMAT = 8; // bumped if binary format of UCD changes</pre>
		</li>
	</ol>
	</li>
	<li>Results in <a href="file:///C:/DATA/GEN/DerivedData">
	C:\DATA\GEN\DerivedData</a><ol>
		<li>The files will be in this directory.</li>
		<li>There are also DIFF folders that contain BAT files that you can run 
		on Windows with CompareIt. (You can modify the code to build BATs with 
		another Diff program if you want.)<ol>
			<li>For any file with a significant difference, it will build two 
			BAT files, such as the first two below.<pre>Diff_PropList-5.0.0d10.txt.bat
OLDER-Diff_PropList-5.0.0d10.txt.bat

UNCHANGED-Diff_PropertyValueAliases-5.0.0d10.txt.bat</pre>
			</li>
		</ol>
		</li>
		<li>Any files without significant changes will have &quot;UNCHANGED&quot; as a 
		prefix: ignore them.&nbsp; The OLDER prefix is the comparison to the 
		last version of Unicode.</li>
		<li>On Windows you can run these BATs to compare files.</li>
	</ol>
	</li>
	<li><span style="background-color: #FFFF00">NFSkippable</span><ol>
	<li><span style="background-color: #FFFF00">ICU needs one more file, which is 
	generated with the same tool. Just use the input parameter &quot;NFSkippable&quot; to 
	generate the file NFSafeSets.txt, also in </span>
	<a href="file:///C:/DATA/GEN"><span style="background-color: #FFFF00">
	file:///C:/DATA/GEN</span></a></li>
</ol>
	</li>
</ol>
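<p>When a DIFF folder holds many BAT files, it helps to sort them by prefix before opening any. The sketch 
below classifies file names by the UNCHANGED and OLDER prefixes described above. The class 
<code>DiffBatNames</code> and its enum are hypothetical names used only for illustration; the tools do 
not provide such a helper.</p>

```java
// Hypothetical helper (not part of UnicodeTools) that sorts DIFF BAT files by prefix.
final class DiffBatNames {

    enum Kind {
        DIFF,       // no prefix: a file with a significant difference
        OLDER,      // "OLDER-" prefix: comparison to the last version of Unicode
        UNCHANGED   // "UNCHANGED-" prefix: no significant changes, can be ignored
    }

    static Kind classify(String fileName) {
        if (fileName.startsWith("UNCHANGED-")) {
            return Kind.UNCHANGED;
        }
        if (fileName.startsWith("OLDER-")) {
            return Kind.OLDER;
        }
        return Kind.DIFF;
    }

    public static void main(String[] args) {
        String[] names = {
            "Diff_PropList-5.0.0d10.txt.bat",
            "OLDER-Diff_PropList-5.0.0d10.txt.bat",
            "UNCHANGED-Diff_PropertyValueAliases-5.0.0d10.txt.bat"
        };
        for (String name : names) {
            System.out.println(classify(name) + "  " + name);
        }
    }
}
```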
<h3>5. Invariant Checking</h3>
<ol>
	<li>Setup<ol>
		<li>Open in Package Explorer<ul>
			<li>com.ibm.text.UCD<ul>
				<li>TestUnicodeInvariants.java</li>
			</ul>
			</li>
		</ul>
		</li>
		<li>Run&gt;Run As... Java Application<br>
		This will create the following file of results:<pre><a href="file:///C:/DATA/GEN/UnicodeInvariantResults.txt">C:\DATA\GEN\UnicodeInvariantResults.txt</a></pre>
		<p>The console will list whether any problems were found. Thus in 
		the following case there was one failure:</p>
		<pre>ParseErrorCount=0
TestFailureCount=1</pre>
		</li>
		<li>The header of the result file explains the syntax of the tests.</li>
		<li>Open that file and search for &quot;**** START Error Info ****&quot;. Each such 
		point provides a dump of comparison information.<ol>
		<li>Failures print a list of differences between the two sets being 
		compared. So if A and B are being compared, it prints all the items in 
		A-B, then in B-A, then in A&amp;B.</li>
		<li>For example, here is a listing of a problem that must be corrected. 
		Note that usually there is a comment that explains what the following 
		line or lines are supposed to test. Then comes false (indicating 
		that the test failed), followed by the detailed error report.<pre><span style="font-size: 9pt"># Canonical decompositions (minus exclusions) must be identical across releases
[$Decomposition_Type:Canonical - $Full_Composition_Exclusion] = [$&times;Decomposition_Type:Canonical - $&times;Full_Composition_Exclusion]

false
**** START Error Info ****

In [$&times;Decomposition_Type:Canonical - $&times;Full_Composition_Exclusion], but not in [$Decomposition_Type:Canonical - $Full_Composition_Exclusion] :

# Total code points: 0

Not in [$&times;Decomposition_Type:Canonical - $&times;Full_Composition_Exclusion], but in [$Decomposition_Type:Canonical - $Full_Composition_Exclusion] :
1B06           # Lo       BALINESE LETTER AKARA TEDUNG
1B08           # Lo       BALINESE LETTER IKARA TEDUNG
1B0A           # Lo       BALINESE LETTER UKARA TEDUNG
1B0C           # Lo       BALINESE LETTER RA REPA TEDUNG
1B0E           # Lo       BALINESE LETTER LA LENGA TEDUNG
1B12           # Lo       BALINESE LETTER OKARA TEDUNG
1B3B           # Mc       BALINESE VOWEL SIGN RA REPA TEDUNG
1B3D           # Mc       BALINESE VOWEL SIGN LA LENGA TEDUNG
1B40..1B41     # Mc   [2] BALINESE VOWEL SIGN TALING TEDUNG..BALINESE VOWEL SIGN TALING REPA TEDUNG
1B43           # Mc       BALINESE VOWEL SIGN PEPET TEDUNG

# Total code points: 11

In both [$&times;Decomposition_Type:Canonical - $&times;Full_Composition_Exclusion], and in [$Decomposition_Type:Canonical - $Full_Composition_Exclusion] :
00C0..00C5     # L&amp;   [6] LATIN CAPITAL LETTER A WITH GRAVE..LATIN CAPITAL LETTER A WITH RING ABOVE
00C7..00CF     # L&amp;   [9] LATIN CAPITAL LETTER C WITH CEDILLA..LATIN CAPITAL LETTER I WITH DIAERESIS
00D1..00D6     # L&amp;   [6] LATIN CAPITAL LETTER N WITH TILDE..LATIN CAPITAL LETTER O WITH DIAERESIS
...
30F7..30FA     # Lo   [4] KATAKANA LETTER VA..KATAKANA LETTER VO
30FE           # Lm       KATAKANA VOICED ITERATION MARK
AC00..D7A3     # Lo [11172] HANGUL SYLLABLE GA..HANGUL SYLLABLE HIH

# Total code points: 12089
**** END Error Info ****</span></pre>
		</li>
	</ol>
		</li>
		<li>Options:<ol>
		<li>-r&nbsp;&nbsp;&nbsp; Print the failures as a range list.</li>
		<li>-fxxx&nbsp;&nbsp;&nbsp; Use a different input file, such as -fInvariantTest.txt</li>
	</ol>
		</li>
	</ol>
	</li>
</ol>
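<p>The three lists in an error dump are ordinary set operations on code points: A-B, B-A and A&amp;B. 
The sketch below reproduces that breakdown with plain Java collections so the report is easier to read. 
The class <code>SetComparison</code> is hypothetical; the tools themselves work on Unicode sets, which are 
not used here.</p>

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration (not the tools' implementation) of the A-B, B-A, A&B breakdown.
final class SetComparison {

    /** Code points in a but not in b, in ascending order. */
    static TreeSet<Integer> minus(Set<Integer> a, Set<Integer> b) {
        TreeSet<Integer> result = new TreeSet<>(a);
        result.removeAll(b);
        return result;
    }

    /** Code points in both a and b, in ascending order. */
    static TreeSet<Integer> both(Set<Integer> a, Set<Integer> b) {
        TreeSet<Integer> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }

    public static void main(String[] args) {
        // Tiny stand-ins for the two sets being compared.
        Set<Integer> current = new TreeSet<>(Arrays.asList(0x00C0, 0x1B06, 0x1B08));
        Set<Integer> previous = new TreeSet<>(Arrays.asList(0x00C0));
        System.out.println("In previous, not in current: " + minus(previous, current).size());
        System.out.println("In current, not in previous: " + minus(current, previous).size());
        System.out.println("In both: " + both(current, previous).size());
    }
}
```

<p>A test passes only when both difference sets are empty; the intersection is printed for context.</p>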
<h3>6. Options</h3>
<ol>
	<li>If you want to see files that are opened while processing, do the 
	following:<ol>
		<li>Run&gt;Run...</li>
		<li>Select the Arguments tab, and add the following<ol>
			<li>VM arguments:
			<pre>-DSHOW_FILES</pre>
			</li>
		</ol>
		</li>
	</ol>
	</li>
</ol>
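<p>A VM argument of the form <code>-DNAME</code> with no value defines the system property NAME as an empty 
string, so a program can detect it by testing for presence rather than for a particular value. The sketch 
below shows that check; the class <code>ShowFilesFlag</code> is hypothetical, and whether the tools read 
SHOW_FILES exactly this way is an assumption.</p>

```java
// Hypothetical illustration (not the tools' own code) of detecting -DSHOW_FILES.
final class ShowFilesFlag {

    /** True if the JVM was started with -DSHOW_FILES (with or without a value). */
    static boolean isSet() {
        // A property defined without a value is the empty string, not null.
        return System.getProperty("SHOW_FILES") != null;
    }

    public static void main(String[] args) {
        System.out.println("SHOW_FILES set: " + isSet());
    }
}
```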
<h3>7. UCA</h3>
<ol>
	<li>You will use com.ibm.text.UCA.Main as your main class, creating a run 
	configuration along the same lines as above.</li>
	<li>To test whether the UCA files are valid, use the
	<span style="font-weight: 400">option below (<i>note: you must also build the ICU 
	files described later, since they test other aspects</i>).</span><pre>writeCollationValidityLog</pre>
	<p>It will create a file:</p>
	<pre><a href="file:///C:/DATA/GEN/collation/5.0.0/CheckCollationValidity.html">C:\DATA\GEN\collation\5.0.0\CheckCollationValidity.html</a></pre>
	<ol>
		<li>Review this file. It will list errors. Some of those are actually 
	warnings, and indicate possible problems (this is indicated in the text, 
	such as by: &quot;These are not necessarily errors, but should be examined for 
		<i>possible</i> errors&quot;). In those cases, the items should be reviewed to make 
	sure that there are no inadvertent problems.</li>
		<li>If an item is not so marked, it is a true error, and must be fixed.</li>
		<li>At the end, there is a section <b>11. Coverage</b>. It has two parts:<ol>
			<li>In UCDxxx, but not in allkeys. Check this over to make sure that these 
	are all the characters that should get <b><i>implicit</i></b> weights.</li>
			<li>In allkeys, but not in UCD. These should be <b><i>only</i></b> 
	contractions. Check them over to make sure they look right also.</li>
		</ol></li>
	</ol></li>
	<li>
	<h4><span style="font-weight: 400">To build all the charts (including for 
	the UCA), use the options: </span></h4>
	<pre>normalizationChart caseChart scriptChart indexChart</pre>
	<p>They will be built into</p>
	<pre><a href="file:///C:/DATA/GEN/charts">C:\DATA\GEN\charts</a></pre>
	<p><b>Once UCA is released, copy those files up to the right spots in 
	the Unicode site:</b></p><ul>
		<li>
		<pre><a href="http://www.unicode.org/charts/normalization/">http://www.unicode.org/charts/normalization/</a></pre>
		</li>
		<li>
		<pre><a href="http://www.unicode.org/charts/collation/">http://www.unicode.org/charts/collation/</a> </pre>
		</li>
		<li>
		<pre><a href="http://www.unicode.org/charts/case/">http://www.unicode.org/charts/case/</a> </pre>
		</li>
		<li>
		<pre><a href="http://www.unicode.org/charts/collation/">http://www.unicode.org/charts/collation/</a> </pre>
		</li>
	</ul>
	</li>
	<li>
	<h4><span style="font-weight: 400">To build all the UCA files used by ICU, use the 
	option:</span></h4>
	<pre>ICU</pre>
	<p>They will be built into:</p>
	<pre><a href="file:///C:/DATA/GEN/collation/5.0.0">C:\DATA\GEN\collation\5.0.0</a></pre>
	</li>
	<li>You should then build a set of the ICU files for the previous version, 
	if you don't have them. Use the options:<pre>version 4.1.0 ICU</pre>
	<p>Or whatever the last version was.</p></li>
	<li>Now, you will want to compare versions. The key file is 
	UCA_Rules_NoCE.txt. It contains the rules expressed in ICU format, which 
	allows for comparison across versions of UCA without spurious variations of 
	the numbers getting in the way.<ol>
		<li>Do a Diff between the last and current versions of these files, and 
		verify that all the differences are either new characters, or were 
		authorized to be changed by the UTC.</li>
	</ol></li>
</ol>

</body>

</html>