:mod:`stringprep` --- Internet String Preparation
=================================================

.. module:: stringprep
   :synopsis: String preparation, as per RFC 3454

.. moduleauthor:: Martin v. Löwis <martin@v.loewis.de>
.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de>

**Source code:** :source:`Lib/stringprep.py`

--------------

When identifying things (such as host names) on the internet, it is often
necessary to compare such identifiers for "equality". Exactly how this
comparison is performed may depend on the application domain, e.g. whether it
should be case-insensitive or not. It may also be necessary to restrict the
possible identifiers, so that only identifiers consisting of "printable"
characters are allowed.

:rfc:`3454` defines a procedure for "preparing" Unicode strings in internet
protocols. Before strings are passed onto the wire, they are processed with the
preparation procedure, after which they have a certain normalized form. The RFC
defines a set of tables, which can be combined into profiles. Each profile must
define which tables it uses, and what other optional parts of the ``stringprep``
procedure are part of the profile. One example of a ``stringprep`` profile is
``nameprep``, which is used for internationalized domain names.

The module :mod:`stringprep` only exposes the tables from :rfc:`3454`. As these
tables would be very large if represented as dictionaries or lists, the
module uses the Unicode character database internally. The module source code
itself was generated using the ``mkstringprep.py`` utility.

As a result, these tables are exposed as functions, not as data structures.
There are two kinds of tables in the RFC: sets and mappings. For a set,
:mod:`stringprep` provides the "characteristic function", i.e. a function that
returns ``True`` if the parameter is part of the set. For mappings, it provides the
mapping function: given the key, it returns the associated value. Below is a
list of all functions available in the module.

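As a rough illustration of the two kinds of functions, the sketch below calls
one characteristic function and one mapping function. The sample characters are
chosen for illustration only, and each call passes a one-character string.

```python
import stringprep

# Set table: the characteristic function answers a membership question.
is_ascii_space = stringprep.in_table_c11(" ")    # table C.1.1
is_letter_space = stringprep.in_table_c11("a")

# Mapping table: the function returns the value associated with the key.
folded = stringprep.map_table_b3("A")            # table B.3

print(is_ascii_space, is_letter_space, folded)   # True False a
```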

.. function:: in_table_a1(code)

   Determine whether *code* is in table A.1 (Unassigned code points in Unicode 3.2).

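A possible use of :func:`in_table_a1` is to list the code points in a string
that were unassigned in Unicode 3.2. The helper name ``find_unassigned`` is
hypothetical and not part of the module.

```python
import stringprep

def find_unassigned(text):
    """Return the characters of *text* that are listed in table A.1."""
    return [ch for ch in text if stringprep.in_table_a1(ch)]

# U+0221 was unassigned in Unicode 3.2; U+0222 was already assigned.
unassigned = find_unassigned("a\u0221\u0222")
print([hex(ord(ch)) for ch in unassigned])   # ['0x221']
```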

.. function:: in_table_b1(code)

   Determine whether *code* is in table B.1 (Commonly mapped to nothing).

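Characters in table B.1 are typically removed during the mapping step. A
minimal sketch, using a hypothetical helper ``strip_mapped_to_nothing``:

```python
import stringprep

def strip_mapped_to_nothing(text):
    """Drop every character of *text* that table B.1 maps to nothing."""
    return "".join(ch for ch in text if not stringprep.in_table_b1(ch))

# U+00AD SOFT HYPHEN is listed in table B.1.
cleaned = strip_mapped_to_nothing("soft\u00adhyphen")
print(cleaned)   # softhyphen
```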

.. function:: map_table_b2(code)

   Return the mapped value for *code* according to table B.2 (Mapping for
   case-folding used with NFKC).


.. function:: map_table_b3(code)

   Return the mapped value for *code* according to table B.3 (Mapping for
   case-folding used with no normalization).


.. function:: in_table_c11(code)

   Determine whether *code* is in table C.1.1 (ASCII space characters).


.. function:: in_table_c12(code)

   Determine whether *code* is in table C.1.2 (Non-ASCII space characters).


.. function:: in_table_c11_c12(code)

   Determine whether *code* is in table C.1 (Space characters, union of C.1.1 and
   C.1.2).

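One way to use the space tables, as a profile might choose to do, is to replace
non-ASCII spaces with U+0020. The helper ``ascii_spaces`` below is a
hypothetical name used only for this sketch.

```python
import stringprep

def ascii_spaces(text):
    """Replace every table C.1.2 character in *text* with U+0020."""
    return "".join(" " if stringprep.in_table_c12(ch) else ch for ch in text)

# U+00A0 NO-BREAK SPACE and U+3000 IDEOGRAPHIC SPACE are non-ASCII spaces.
normalized = ascii_spaces("a\u00a0b\u3000c")
print(normalized)                              # a b c

# The union table C.1 covers both the ASCII and the non-ASCII case.
print(stringprep.in_table_c11_c12(" "),
      stringprep.in_table_c11_c12("\u00a0"))   # True True
```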

.. function:: in_table_c21(code)

   Determine whether *code* is in table C.2.1 (ASCII control characters).


.. function:: in_table_c22(code)

   Determine whether *code* is in table C.2.2 (Non-ASCII control characters).


.. function:: in_table_c21_c22(code)

   Determine whether *code* is in table C.2 (Control characters, union of C.2.1 and
   C.2.2).

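The control-character tables can be used to reject input before further
processing. A small sketch, with ``has_control_characters`` as a hypothetical
helper name:

```python
import stringprep

def has_control_characters(text):
    """Return True if any character of *text* is in table C.2."""
    return any(stringprep.in_table_c21_c22(ch) for ch in text)

ascii_control = stringprep.in_table_c21("\u001f")       # ASCII control
non_ascii_control = stringprep.in_table_c22("\u009f")   # non-ASCII control

print(ascii_control, non_ascii_control)        # True True
print(has_control_characters("line\nbreak"))   # True
print(has_control_characters("plain"))         # False
```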

.. function:: in_table_c3(code)

   Determine whether *code* is in table C.3 (Private use).


.. function:: in_table_c4(code)

   Determine whether *code* is in table C.4 (Non-character code points).


.. function:: in_table_c5(code)

   Determine whether *code* is in table C.5 (Surrogate codes).

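Tables C.3, C.4 and C.5 cover code points that are usually unsuitable for
interchange. The sketch below labels a few sample code points; the function
``classify`` and its labels are assumptions made for illustration.

```python
import stringprep

def classify(ch):
    """Return a label for *ch* if it is in table C.3, C.4 or C.5, else None."""
    if stringprep.in_table_c3(ch):
        return "private use"
    if stringprep.in_table_c4(ch):
        return "non-character"
    if stringprep.in_table_c5(ch):
        return "surrogate"
    return None

# The samples are written as escapes because some cannot be printed directly.
samples = ["\ue000", "\uffff", "\ud800", "a"]
labels = [classify(ch) for ch in samples]
print(labels)   # ['private use', 'non-character', 'surrogate', None]
```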

.. function:: in_table_c6(code)

   Determine whether *code* is in table C.6 (Inappropriate for plain text).


.. function:: in_table_c7(code)

   Determine whether *code* is in table C.7 (Inappropriate for canonical
   representation).


.. function:: in_table_c8(code)

   Determine whether *code* is in table C.8 (Change display properties or are
   deprecated).


.. function:: in_table_c9(code)

   Determine whether *code* is in table C.9 (Tagging characters).

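Because every set table has the same one-argument interface, several tables can
be checked in a loop. The sketch below reports which of tables C.6 to C.9
contain a character; ``CHECKS`` and ``matching_tables`` are hypothetical names.

```python
import stringprep

# Pair each table name with its characteristic function.
CHECKS = (
    ("C.6", stringprep.in_table_c6),
    ("C.7", stringprep.in_table_c7),
    ("C.8", stringprep.in_table_c8),
    ("C.9", stringprep.in_table_c9),
)

def matching_tables(ch):
    """Return the names of the tables in CHECKS that contain *ch*."""
    return [name for name, check in CHECKS if check(ch)]

print(matching_tables("\ufff9"))       # ['C.6']
print(matching_tables("\u2ff0"))       # ['C.7']
print(matching_tables("\u0340"))       # ['C.8']
print(matching_tables("\U000E0001"))   # ['C.9']
print(matching_tables("a"))            # []
```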

.. function:: in_table_d1(code)

   Determine whether *code* is in table D.1 (Characters with bidirectional property
   "R" or "AL").


.. function:: in_table_d2(code)

   Determine whether *code* is in table D.2 (Characters with bidirectional property
   "L").

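To show how the tables can be combined into a profile, here is a minimal sketch
loosely modeled on ``nameprep``. It is not a conforming implementation: the
function name ``prepare``, the choice of prohibited tables in ``PROHIBITED`` and
the use of :exc:`ValueError` are assumptions made for this example, and the
check for unassigned code points (table A.1) is left out.

```python
import stringprep
from unicodedata import ucd_3_2_0 as ucd   # Unicode 3.2 data, as used by RFC 3454

# Prohibited-output tables chosen for this sketch; a real profile lists its own.
PROHIBITED = (
    stringprep.in_table_c12, stringprep.in_table_c22, stringprep.in_table_c3,
    stringprep.in_table_c4, stringprep.in_table_c5, stringprep.in_table_c6,
    stringprep.in_table_c7, stringprep.in_table_c8, stringprep.in_table_c9,
)

def prepare(text):
    # 1. Map: drop table B.1 characters and case-fold the rest with table B.2.
    mapped = "".join(
        stringprep.map_table_b2(ch)
        for ch in text
        if not stringprep.in_table_b1(ch)
    )
    # 2. Normalize with NFKC.
    normalized = ucd.normalize("NFKC", mapped)
    # 3. Prohibit: reject any character found in one of the chosen tables.
    for ch in normalized:
        if any(check(ch) for check in PROHIBITED):
            raise ValueError(f"prohibited character {ch!r}")
    # 4. Bidi: text containing R or AL characters must not contain L
    #    characters, and must start and end with an R or AL character.
    rand_al = [stringprep.in_table_d1(ch) for ch in normalized]
    if any(rand_al):
        if any(stringprep.in_table_d2(ch) for ch in normalized):
            raise ValueError("mixed right-to-left and left-to-right characters")
        if not (rand_al[0] and rand_al[-1]):
            raise ValueError("right-to-left text must start and end with R or AL")
    return normalized

print(prepare("Ex\u00adAMPLE"))   # example
```

With these assumptions, ``prepare("\ue000")`` raises :exc:`ValueError` because
of table C.3, and ``prepare("\u05d0a")`` raises it because the string mixes a
table D.1 character with a table D.2 character.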