.. include:: ../disclaimer-zh_CN.rst

:Original: Documentation/admin-guide/unicode.rst

:Translator:

 吴想成 Wu XiangCheng <bobwxc@email.cn>

Unicode support
===============

 (English version) Last update: 2005-01-17, version 1.4

This document is maintained by H. Peter Anvin <unicode@lanana.org> as part
of the Linux Assigned Names And Numbers Authority (LANANA) project.
The current version can be found at:

 http://www.lanana.org/docs/unicode/admin-guide/unicode.rst

Introduction
------------

The Linux kernel code has been rewritten to use Unicode to map characters
to fonts.  Once a single Unicode-to-font table has been downloaded, both
the eight-bit character sets and UTF-8 mode use the font as that table
indicates.

This subtly changes the semantics of the eight-bit character tables.
The four character tables are now:

=============== =============================== ================
Map symbol      Map name                        Escape code (G0)
=============== =============================== ================
LAT1_MAP        Latin-1 (ISO 8859-1)            ESC ( B
GRAF_MAP        DEC VT100 pseudographics        ESC ( 0
IBMPC_MAP       IBM code page 437               ESC ( U
USER_MAP        User defined                    ESC ( K
=============== =============================== ================

In particular, ESC ( U is no longer "straight to font", since the font
might be completely different from the IBM character set.  This permits,
for example, the use of block graphics even with a Latin-1 font loaded.

Note that although these codes are similar to ISO 2022, neither the codes
nor their uses match ISO 2022; Linux has two 8-bit codes (G0 and G1),
whereas ISO 2022 has four 7-bit codes (G0-G3).

In accordance with the Unicode standard/ISO 10646, the range U+F000 to
U+F8FF has been reserved for OS-wide allocation (the Unicode Standard
refers to this as a "Corporate Zone"; since this is inaccurate for Linux,
we call it the "Linux Zone").  U+F000 was picked as the starting point
because it lets the direct-mapping area start on a large power of two
(in case 1024- or 2048-character fonts ever become necessary).  This
leaves U+E000 to U+EFFF as the End User Zone.

[v1.2]: The Unicode range from U+F000 to U+F7FF has been hard-coded to
map directly to the loaded font, bypassing the translation table.  The
user-defined map now defaults to U+F000 to U+F0FF, emulating the previous
behaviour.  In practice, this range might be shorter; for example, vgacon
can only handle 256-character (U+F000..U+F0FF) or 512-character
(U+F000..U+F1FF) fonts.

Actual characters assigned in the Linux Zone
--------------------------------------------

In addition, the following characters not present in Unicode 1.1.4 have
been defined; these are used by the DEC VT graphics map.  [v1.2] This use
is obsolete and should no longer be used; please see below.

====== ======================================
U+F800 DEC VT GRAPHICS HORIZONTAL LINE SCAN 1
U+F801 DEC VT GRAPHICS HORIZONTAL LINE SCAN 3
U+F803 DEC VT GRAPHICS HORIZONTAL LINE SCAN 7
U+F804 DEC VT GRAPHICS HORIZONTAL LINE SCAN 9
====== ======================================

The DEC VT220 uses a 6x10 character matrix, and these characters form a
smooth progression in the DEC VT graphics character set.  I have omitted
the scan 5 line, since it is also used as a block-graphics character and
hence has been coded as U+2500 FORMS LIGHT HORIZONTAL.

[v1.3]: These characters have been officially added to Unicode 3.2.0;
they are added at U+23BA, U+23BB, U+23BC, U+23BD.  Linux now uses the
new values.

[v1.2]: The following characters have been added to represent common
keyboard symbols that are unlikely ever to be added to Unicode proper,
since they are horribly vendor-specific.  This, of course, is an
excellent example of horrible design.


====== ======================================
U+F810 KEYBOARD SYMBOL FLYING FLAG
U+F811 KEYBOARD SYMBOL PULLDOWN MENU
U+F812 KEYBOARD SYMBOL OPEN APPLE
U+F813 KEYBOARD SYMBOL SOLID APPLE
====== ======================================

Klingon language support
------------------------

In 1996, Linux was the first operating system in the world to add support
for the artificial language Klingon, created by Marc Okrand for the
"Star Trek" television series.  This encoding was later adopted by the
ConScript Unicode Registry (CSUR) and proposed (but ultimately rejected)
for inclusion in Unicode Plane 1.  It therefore remains a Linux/CSUR
private assignment in the Linux Zone.

This encoding has been endorsed by the Klingon Language Institute.
For more information, contact them at:

 http://www.kli.org/

Since the characters at the beginning of the Linux CZ are mostly of the
dingbats/symbols/forms type and this is a language, I have located it at
the end, on a 16-cell boundary, in keeping with standard Unicode practice.

.. note::

  This range is now officially managed by the ConScript Unicode
  Registry.  The normative reference is at:

  https://www.evertype.com/standards/csur/klingon.html

Klingon has an alphabet of 26 characters and a positional numeric writing
system with 10 digits, and is written left-to-right, top-to-bottom.

Several glyph forms for the Klingon alphabet have been proposed.  However,
since the set of symbols appears to be consistent throughout, with only
the actual shapes being different, these differences are considered font
variants, in keeping with standard Unicode practice.

====== =======================================================
U+F8D0 KLINGON LETTER A
U+F8D1 KLINGON LETTER B
U+F8D2 KLINGON LETTER CH
U+F8D3 KLINGON LETTER D
U+F8D4 KLINGON LETTER E
U+F8D5 KLINGON LETTER GH
U+F8D6 KLINGON LETTER H
U+F8D7 KLINGON LETTER I
U+F8D8 KLINGON LETTER J
U+F8D9 KLINGON LETTER L
U+F8DA KLINGON LETTER M
U+F8DB KLINGON LETTER N
U+F8DC KLINGON LETTER NG
U+F8DD KLINGON LETTER O
U+F8DE KLINGON LETTER P
U+F8DF KLINGON LETTER Q
       - Written <q> in standard Okrand Latin transliteration
U+F8E0 KLINGON LETTER QH
       - Written <Q> in standard Okrand Latin transliteration
U+F8E1 KLINGON LETTER R
U+F8E2 KLINGON LETTER S
U+F8E3 KLINGON LETTER T
U+F8E4 KLINGON LETTER TLH
U+F8E5 KLINGON LETTER U
U+F8E6 KLINGON LETTER V
U+F8E7 KLINGON LETTER W
U+F8E8 KLINGON LETTER Y
U+F8E9 KLINGON LETTER GLOTTAL STOP

U+F8F0 KLINGON DIGIT ZERO
U+F8F1 KLINGON DIGIT ONE
U+F8F2 KLINGON DIGIT TWO
U+F8F3 KLINGON DIGIT THREE
U+F8F4 KLINGON DIGIT FOUR
U+F8F5 KLINGON DIGIT FIVE
U+F8F6 KLINGON DIGIT SIX
U+F8F7 KLINGON DIGIT SEVEN
U+F8F8 KLINGON DIGIT EIGHT
U+F8F9 KLINGON DIGIT NINE

U+F8FD KLINGON COMMA
U+F8FE KLINGON FULL STOP
U+F8FF KLINGON SYMBOL FOR EMPIRE
====== =======================================================

Other fictional and artificial scripts
--------------------------------------

Since the assignment of the Klingon Linux Unicode block, a registry of
fictional and artificial scripts has been established by John Cowan
<jcowan@reutershealth.com> and Michael Everson <everson@evertype.com>.
The ConScript Unicode Registry is accessible at:

 https://www.evertype.com/standards/csur/

The ranges used fall at the low end of the End User Zone and hence cannot
be normatively assigned, but it is recommended that people who wish to
encode fictional scripts use these codes, in the interest of
interoperability.  For Klingon, CSUR has adopted the Linux encoding.
The CSUR people are driving the addition of Tengwar and Cirth to Unicode
Plane 1; the addition of Klingon to Unicode Plane 1 has been rejected, so
the above encoding remains official.
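
As a recap, the code point ranges named in this document (End User Zone,
Linux Zone, direct-to-font area, and the Klingon block) can be summarised
in a short Python sketch.  It is illustrative only: ``classify`` and
``direct_glyph_index`` are hypothetical names, not kernel identifiers, and
the glyph slot is assumed to be the offset from U+F000, which is consistent
with the 256- and 512-character ranges quoted for vgacon but is not taken
from the kernel source.

.. code-block:: python

   """Illustrative sketch only; none of these names are kernel identifiers."""

   # Ranges as stated in this document (Python ranges exclude the end value).
   END_USER_ZONE = range(0xE000, 0xF000)   # U+E000..U+EFFF
   LINUX_ZONE = range(0xF000, 0xF900)      # U+F000..U+F8FF
   DIRECT_MAP = range(0xF000, 0xF800)      # U+F000..U+F7FF, bypasses the table
   KLINGON_BLOCK = range(0xF8D0, 0xF900)   # U+F8D0..U+F8FF, span of the table


   def classify(code_point: int) -> str:
       """Name the most specific range in this document holding code_point."""
       if code_point in DIRECT_MAP:
           return "direct-to-font"
       if code_point in KLINGON_BLOCK:
           return "klingon"
       if code_point in LINUX_ZONE:
           return "linux-zone"
       if code_point in END_USER_ZONE:
           return "end-user-zone"
       return "other"


   def direct_glyph_index(code_point: int, font_size: int = 256):
       """Return the glyph slot for a direct-mapped code point, else None.

       Assumption: slot = code_point - 0xF000, and slots beyond the loaded
       font (256 or 512 glyphs in the vgacon example) are unusable.
       """
       if code_point not in DIRECT_MAP:
           return None
       slot = code_point - 0xF000
       return slot if slot < font_size else None


   if __name__ == "__main__":
       for cp in (0xE000, 0xF041, 0xF100, 0xF810, 0xF8D0):
           print(f"U+{cp:04X}: {classify(cp)}, slot={direct_glyph_index(cp)}")

Passing ``font_size=512`` models the larger vgacon font, in which
U+F100..U+F1FF also become addressable.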