# Character Processing

## Use Cases

Character rules vary greatly across languages, which makes it difficult to extract the expected information from text. Character processing lets you handle text with the same logic regardless of the rules of the underlying language.

## How to Develop


### Character Type Identification Using Character Attributes

Character attributes are used to determine the type of a character, for example, digit, letter, or space, and to check whether a character belongs to a right-to-left (RTL) language or is an ideographic character (for example, Chinese, Japanese, or Korean).

These functions are implemented by APIs of the **Unicode** class. For example, you can use [isDigit](../reference/apis-localization-kit/js-apis-i18n.md#isdigit9) to check whether a character is a digit. The development procedure is as follows:

1. Import the **i18n** module.

   ```ts
   import { i18n } from '@kit.LocalizationKit';
   ```

2. Obtain the character attribute.

   ```ts
   let char: string = '1'; // Character to check.
   let isDigit: boolean = i18n.Unicode.isDigit(char);
   ```

3. Obtain the character type. The following code snippet obtains the general type of a character. For details, see the **getType** API reference.

   ```ts
   let type: string = i18n.Unicode.getType(char); // char is the character to check.
   ```

**Development Example**
```ts
// Import the i18n module.
import { i18n } from '@kit.LocalizationKit';

// Check whether the input character is a digit.
let isDigit = i18n.Unicode.isDigit('1'); // isDigit: true

// Check whether a character belongs to an RTL language.
let isRTL = i18n.Unicode.isRTL('a'); // isRTL: false

// Check whether a character is an ideographic character.
let isIdeograph = i18n.Unicode.isIdeograph('华'); // isIdeograph: true

// Obtain the character type.
let type = i18n.Unicode.getType('a'); // type: U_LOWERCASE_LETTER
```


### Transliteration

Transliteration replaces the original content with content of similar pronunciation in the local language. This function is implemented through the [transform](../reference/apis-localization-kit/js-apis-i18n.md#transform9) API of the **Transliterator** class. The development procedure is as follows:

> **NOTE**
> This module supports transliteration from Chinese characters to pinyin. However, it does not guarantee that polyphonic characters are processed correctly based on the context.

1. Import the **i18n** module.
   ```ts
   import { i18n } from '@kit.LocalizationKit';
   ```

2. Create a **Transliterator** object and obtain the list of available transliteration IDs.
   ```ts
   let transliterator: i18n.Transliterator = i18n.Transliterator.getInstance('Any-Latn'); // Pass in a valid ID to create a Transliterator object.
   let ids: string[] = i18n.Transliterator.getAvailableIDs(); // Obtain the list of IDs supported by the Transliterator class.
   ```

3. Transliterate text.
   ```ts
   let res: string = transliterator.transform('中国'); // Transliterate the text content.
   ```


**Development Example**
```ts
// Import the i18n module.
import { i18n } from '@kit.LocalizationKit';

// Transliterate the text into the Latn format.
let transliterator = i18n.Transliterator.getInstance('Any-Latn');
let res = transliterator.transform("中国"); // res = "zhōng guó"

// Obtain the list of IDs supported by the Transliterator class.
let ids = i18n.Transliterator.getAvailableIDs(); // ids: ['ASCII-Latin', 'Accents-Any', ...]
```


### Character Normalization

Character normalization standardizes characters according to the specified normalization form. This function is implemented through the [normalize](../reference/apis-localization-kit/js-apis-i18n.md#normalize10) API of the **Normalizer** class. The development procedure is as follows:

1. Import the **i18n** module.
   ```ts
   import { i18n } from '@kit.LocalizationKit';
   ```

2. Create a **Normalizer** object. Pass in the text normalization form, which can be NFC, NFD, NFKC, or NFKD. For details, see [Unicode Normalization Forms](https://www.unicode.org/reports/tr15/#Norm_Forms).
   ```ts
   let normalizer: i18n.Normalizer = i18n.Normalizer.getInstance(i18n.NormalizerMode.NFC); // NFC is used as an example.
   ```

3. Normalize the text.
   ```ts
   let normalizedText: string = normalizer.normalize('\u1E9B\u0323'); // Normalize the text.
   ```

**Development Example**
```ts
// Import the i18n module.
import { i18n } from '@kit.LocalizationKit';

// Normalize characters in the NFC form.
let normalizer = i18n.Normalizer.getInstance(i18n.NormalizerMode.NFC);
let normalizedText = normalizer.normalize('\u1E9B\u0323'); // normalizedText: \u1E9B\u0323
```


### Line Wrapping

Line wrapping obtains the positions at which text can be broken, based on the specified text boundary rules, so that lines can be wrapped at those positions. It is implemented by using the APIs of the [BreakIterator](../reference/apis-localization-kit/js-apis-i18n.md#breakiterator8) class. The development procedure is as follows:

1. Import the **i18n** module.
   ```ts
   import { i18n } from '@kit.LocalizationKit';
   ```

2. Create a **BreakIterator** object.
   Pass a valid locale to create a **BreakIterator** object. This object wraps lines based on the rules specified by the locale.

   ```ts
   let iterator: i18n.BreakIterator = i18n.getLineInstance('en-GB'); // en-GB is used as an example locale.
   ```

3. Set the text to be processed.
   ```ts
   iterator.setLineBreakText('Apple is my favorite fruit.'); // Set the text to be processed.
   let breakText: string = iterator.getLineBreakText(); // View the text being processed by the BreakIterator object.
   ```

4. Obtain the break positions of the text.
   ```ts
   let currentPos: number = iterator.current(); // Obtain the current position of BreakIterator in the text.
   let firstPos: number = iterator.first(); // Move BreakIterator to the first break point and return its position. The first break point is always at the beginning of the text, that is, firstPos = 0.
   // Move BreakIterator by the specified number of break points (default: 1). A positive number moves it toward the end of the text, and a negative number moves it toward the beginning. If BreakIterator moves out of the text range, -1 is returned.
   let nextPos: number = iterator.next(2); // nextPos is the position after the move.
   let isBoundary: boolean = iterator.isBoundary(9); // Check whether the specified position is a break point.
   ```


**Development Example**
```ts
// Import the i18n module.
import { i18n } from '@kit.LocalizationKit';

// Create a BreakIterator object.
let iterator = i18n.getLineInstance('en-GB');

// Set the text to be processed.
iterator.setLineBreakText('Apple is my favorite fruit.');

// Move BreakIterator to the beginning of the text.
let firstPos = iterator.first(); // firstPos: 0

// Move BreakIterator by several break points.
let nextPos = iterator.next(2); // nextPos: 9

// Check whether a position is a break point.
let isBoundary = iterator.isBoundary(9); // isBoundary: true

// Obtain the text processed by BreakIterator.
let breakText = iterator.getLineBreakText(); // breakText: Apple is my favorite fruit.
```
<!--RP1--><!--RP1End-->

<!--no_check-->
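
The break positions returned by **BreakIterator** can be used to split text into lines of limited width. The following is a minimal sketch of greedy wrapping, not part of the i18n module. The `LineBreaker` interface, the `SpaceBreaker` class, and the `wrapText` function are hypothetical names introduced for illustration. `SpaceBreaker` is a toy stand-in that breaks only after spaces, so that the sketch runs without the i18n module. In an application, an adapter around `i18n.BreakIterator` (calling `first()` and `next()`) is assumed to take its place, and break positions are assumed to be string indices, as in the example above.

```ts
// Hypothetical minimal shape of a break iterator: first() resets to position 0,
// next() returns the next break position, or -1 when the text is exhausted.
interface LineBreaker {
  first(): number;
  next(): number;
}

// Toy stand-in for illustration only: reports a break point after each run of
// spaces and at the end of the text. It does not apply locale-specific rules.
class SpaceBreaker implements LineBreaker {
  private text: string;
  private pos: number = 0;

  constructor(text: string) {
    this.text = text;
  }

  first(): number {
    this.pos = 0;
    return 0;
  }

  next(): number {
    if (this.pos >= this.text.length) {
      return -1;
    }
    let i = this.pos;
    while (i < this.text.length && this.text[i] !== ' ') { i++; }
    while (i < this.text.length && this.text[i] === ' ') { i++; }
    this.pos = i;
    return i;
  }
}

// Remove trailing whitespace so that it does not count toward the line width.
function stripTrailingSpaces(s: string): string {
  return s.replace(/\s+$/, '');
}

// Greedy wrapping: each line extends to the furthest break point that keeps it
// within maxWidth characters. A single segment longer than maxWidth is kept
// on its own line rather than split.
function wrapText(text: string, breaker: LineBreaker, maxWidth: number): string[] {
  const lines: string[] = [];
  let lineStart = breaker.first();
  let lastBreak = lineStart;
  let pos = breaker.next();
  while (pos !== -1) {
    const candidate = stripTrailingSpaces(text.slice(lineStart, pos));
    if (candidate.length > maxWidth && lastBreak > lineStart) {
      // The candidate is too wide: close the line at the previous break point.
      lines.push(stripTrailingSpaces(text.slice(lineStart, lastBreak)));
      lineStart = lastBreak;
    }
    lastBreak = pos;
    pos = breaker.next();
  }
  if (lineStart < text.length) {
    lines.push(stripTrailingSpaces(text.slice(lineStart)));
  }
  return lines;
}

const sample = 'Apple is my favorite fruit.';
const wrapped = wrapText(sample, new SpaceBreaker(sample), 12);
console.log(wrapped); // ['Apple is my', 'favorite', 'fruit.']
```

The wrapping logic depends only on break positions, so the locale-aware rules stay inside the iterator and the same function can serve any locale for which a break iterator is available.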