Proposal-Lgr-Cyrillic-20180204

This document is mechanically formatted from the XML file for the LGR. It provides additional summary data and explanatory text. The XML file remains the sole normative specification of the LGR.

LGR Version 3
Date 2018-02-04
Language(s) und-Cyrl
Scope(s) domain: .
Unicode Version 6.3.0

Table of Contents

  1. Description
  2. Repertoire
  3. Variant Sets
  4. Classes, Rules and Actions
    1. Character Classes
    2. Whole label evaluation and context rules
    3. Actions
  5. Table of References

Description

Repertoire

Summary

Number of elements in repertoire 116
Number of ranges in repertoire 0
Number of code point sequences 0

Repertoire by Code Point

The following table lists the repertoire by code point (or code point sequence). The data in the Script and Name columns are extracted from the Unicode Character Database. Where the comment in the original LGR is equal to the character name, it has been suppressed.

For any code point or sequence for which a variant is defined, the Variants column links to the associated variant set or, if the code point maps only to itself, gives the variant type of that mapping.

# Code Point Glyph Script Name Tags Required Context Variants Comment References
1 U+0061 a Latin LATIN SMALL LETTER A set 1 Out-of-repertoire, required for symmetry
2 U+0063 c Latin LATIN SMALL LETTER C set 2 Out-of-repertoire, required for symmetry
3 U+0065 e Latin LATIN SMALL LETTER E set 3 Out-of-repertoire, required for symmetry
4 U+0068 h Latin LATIN SMALL LETTER H set 4 Out-of-repertoire, required for symmetry
5 U+0069 i Latin LATIN SMALL LETTER I set 5 Out-of-repertoire, required for symmetry
6 U+006A j Latin LATIN SMALL LETTER J set 6 Out-of-repertoire, required for symmetry
7 U+006C l Latin LATIN SMALL LETTER L set 7 Out-of-repertoire, required for symmetry
8 U+006F o Latin LATIN SMALL LETTER O set 8 Out-of-repertoire, required for symmetry
9 U+0070 p Latin LATIN SMALL LETTER P set 9 Out-of-repertoire, required for symmetry
10 U+0073 s Latin LATIN SMALL LETTER S set 10 Out-of-repertoire, required for symmetry
11 U+0078 x Latin LATIN SMALL LETTER X set 11 Out-of-repertoire, required for symmetry
12 U+0079 y Latin LATIN SMALL LETTER Y set 12 Out-of-repertoire, required for symmetry
13 U+00E4 ä Latin LATIN SMALL LETTER A WITH DIAERESIS set 13 Out-of-repertoire, required for symmetry
14 U+00E6 æ Latin LATIN SMALL LETTER AE set 14 Out-of-repertoire, required for symmetry
15 U+00E7 ç Latin LATIN SMALL LETTER C WITH CEDILLA set 15 Out-of-repertoire, required for symmetry
16 U+00EB ë Latin LATIN SMALL LETTER E WITH DIAERESIS set 16 Out-of-repertoire, required for symmetry
17 U+00EF ï Latin LATIN SMALL LETTER I WITH DIAERESIS set 17 Out-of-repertoire, required for symmetry
18 U+00FF ÿ Latin LATIN SMALL LETTER Y WITH DIAERESIS set 18 Out-of-repertoire, required for symmetry
19 U+0103 ă Latin LATIN SMALL LETTER A WITH BREVE set 19 Out-of-repertoire, required for symmetry
20 U+0115 ĕ Latin LATIN SMALL LETTER E WITH BREVE set 20 Out-of-repertoire, required for symmetry
21 U+01DD ǝ Latin LATIN SMALL LETTER TURNED E set 21 Out-of-repertoire, required for symmetry
22 U+0259 ə Latin LATIN SMALL LETTER SCHWA set 21 Out-of-repertoire, required for symmetry
23 U+0275 ɵ Latin LATIN SMALL LETTER BARRED O set 22 Out-of-repertoire, required for symmetry
24 U+0292 ʒ Latin LATIN SMALL LETTER EZH set 23 Out-of-repertoire, required for symmetry
25 U+03BA κ Greek GREEK SMALL LETTER KAPPA set 24 Out-of-repertoire, required for symmetry
26 U+03BF ο Greek GREEK SMALL LETTER OMICRON set 8 Out-of-repertoire, required for symmetry
27 U+03C6 φ Greek GREEK SMALL LETTER PHI set 25 Out-of-repertoire, required for symmetry
28 U+0430 а Cyrillic CYRILLIC SMALL LETTER A set 1 Base Cyrillic [0], [100]
29 U+0431 б Cyrillic CYRILLIC SMALL LETTER BE Base Cyrillic [0], [100]
30 U+0432 в Cyrillic CYRILLIC SMALL LETTER VE Base Cyrillic [0], [100]
31 U+0433 г Cyrillic CYRILLIC SMALL LETTER GHE Base Cyrillic [0], [100]
32 U+0434 д Cyrillic CYRILLIC SMALL LETTER DE Base Cyrillic [0], [100]
33 U+0435 е Cyrillic CYRILLIC SMALL LETTER IE set 3 Base Cyrillic [0], [100]
34 U+0436 ж Cyrillic CYRILLIC SMALL LETTER ZHE Base Cyrillic [0], [100]
35 U+0437 з Cyrillic CYRILLIC SMALL LETTER ZE Base Cyrillic [0], [100]
36 U+0438 и Cyrillic CYRILLIC SMALL LETTER I Russian [0], [106]
37 U+0439 й Cyrillic CYRILLIC SMALL LETTER SHORT I Russian [0], [106]
38 U+043A к Cyrillic CYRILLIC SMALL LETTER KA set 24 Base Cyrillic [0], [100]
39 U+043B л Cyrillic CYRILLIC SMALL LETTER EL Base Cyrillic [0], [100]
40 U+043C м Cyrillic CYRILLIC SMALL LETTER EM Base Cyrillic [0], [100]
41 U+043D н Cyrillic CYRILLIC SMALL LETTER EN Base Cyrillic [0], [100]
42 U+043E о Cyrillic CYRILLIC SMALL LETTER O set 8 Base Cyrillic [0], [100]
43 U+043F п Cyrillic CYRILLIC SMALL LETTER PE Base Cyrillic [0], [100]
44 U+0440 р Cyrillic CYRILLIC SMALL LETTER ER set 9 Base Cyrillic [0], [100]
45 U+0441 с Cyrillic CYRILLIC SMALL LETTER ES set 2 Base Cyrillic [0], [100]
46 U+0442 т Cyrillic CYRILLIC SMALL LETTER TE Base Cyrillic [0], [100]
47 U+0443 у Cyrillic CYRILLIC SMALL LETTER U set 12 Base Cyrillic [0], [100]
48 U+0444 ф Cyrillic CYRILLIC SMALL LETTER EF set 25 Base Cyrillic [0], [100]
49 U+0445 х Cyrillic CYRILLIC SMALL LETTER HA set 11 Base Cyrillic [0], [100]
50 U+0446 ц Cyrillic CYRILLIC SMALL LETTER TSE Base Cyrillic [0], [100]
51 U+0447 ч Cyrillic CYRILLIC SMALL LETTER CHE Base Cyrillic [0], [100]
52 U+0448 ш Cyrillic CYRILLIC SMALL LETTER SHA set 26 Base Cyrillic [0], [100]
53 U+0449 щ Cyrillic CYRILLIC SMALL LETTER SHCHA Russian [0], [106]
54 U+044A ъ Cyrillic CYRILLIC SMALL LETTER HARD SIGN Russian [0], [106]
55 U+044B ы Cyrillic CYRILLIC SMALL LETTER YERU Russian [0], [106]
56 U+044C ь Cyrillic CYRILLIC SMALL LETTER SOFT SIGN Russian [0], [106]
57 U+044D э Cyrillic CYRILLIC SMALL LETTER E Russian [0], [106]
58 U+044E ю Cyrillic CYRILLIC SMALL LETTER YU Russian [0], [106]
59 U+044F я Cyrillic CYRILLIC SMALL LETTER YA Russian [0], [106]
60 U+0451 ё Cyrillic CYRILLIC SMALL LETTER IO set 16 Russian [0], [106]
61 U+0452 ђ Cyrillic CYRILLIC SMALL LETTER DJE Serbian [0], [107]
62 U+0453 ѓ Cyrillic CYRILLIC SMALL LETTER GJE Macedonian [0], [104]
63 U+0454 є Cyrillic CYRILLIC SMALL LETTER UKRAINIAN IE Ukrainian [0], [109]
64 U+0455 ѕ Cyrillic CYRILLIC SMALL LETTER DZE set 10 Macedonian [0], [104]
65 U+0456 і Cyrillic CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I set 5 Byelorussian [0], [101]
66 U+0457 ї Cyrillic CYRILLIC SMALL LETTER YI set 17 Ukrainian [0], [109]
67 U+0458 ј Cyrillic CYRILLIC SMALL LETTER JE set 6 Serbian [0], [107]
68 U+0459 љ Cyrillic CYRILLIC SMALL LETTER LJE Serbian [0], [107]
69 U+045A њ Cyrillic CYRILLIC SMALL LETTER NJE Serbian [0], [107]
70 U+045B ћ Cyrillic CYRILLIC SMALL LETTER TSHE Serbian [0], [107]
71 U+045C ќ Cyrillic CYRILLIC SMALL LETTER KJE Macedonian [0], [104]
72 U+045E ў Cyrillic CYRILLIC SMALL LETTER SHORT U Byelorussian [0], [101]
73 U+045F џ Cyrillic CYRILLIC SMALL LETTER DZHE Serbian [0], [107]
74 U+0491 ґ Cyrillic CYRILLIC SMALL LETTER GHE WITH UPTURN Ukrainian [0], [109]
75 U+0493 ғ Cyrillic CYRILLIC SMALL LETTER GHE WITH STROKE Bashkir [0], [112]
76 U+0495 ҕ Cyrillic CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK Abkhaz [0], [110]
77 U+0497 җ Cyrillic CYRILLIC SMALL LETTER ZHE WITH DESCENDER Tatar [0], [111]
78 U+0499 ҙ Cyrillic CYRILLIC SMALL LETTER ZE WITH DESCENDER Bashkir [0], [112]
79 U+049B қ Cyrillic CYRILLIC SMALL LETTER KA WITH DESCENDER Abkhaz [0], [110]
80 U+049F ҟ Cyrillic CYRILLIC SMALL LETTER KA WITH STROKE Abkhaz [0], [110]
81 U+04A1 ҡ Cyrillic CYRILLIC SMALL LETTER BASHKIR KA Bashkir [0], [112]
82 U+04A3 ң Cyrillic CYRILLIC SMALL LETTER EN WITH DESCENDER Tatar [0], [111]
83 U+04A5 ҥ Cyrillic CYRILLIC SMALL LIGATURE EN GHE Mari [0], [114]
84 U+04A9 ҩ Cyrillic CYRILLIC SMALL LETTER ABKHASIAN HA Abkhaz [0], [110]
85 U+04AB ҫ Cyrillic CYRILLIC SMALL LETTER ES WITH DESCENDER set 15 Bashkir [0], [112]
86 U+04AD ҭ Cyrillic CYRILLIC SMALL LETTER TE WITH DESCENDER Abkhaz [0], [110]
87 U+04AF ү Cyrillic CYRILLIC SMALL LETTER STRAIGHT U Mongolian [0], [105]
88 U+04B1 ұ Cyrillic CYRILLIC SMALL LETTER STRAIGHT U WITH STROKE Kazakh [0], [102]
89 U+04B3 ҳ Cyrillic CYRILLIC SMALL LETTER HA WITH DESCENDER Abkhaz [0], [110]
90 U+04B5 ҵ Cyrillic CYRILLIC SMALL LIGATURE TE TSE Abkhaz [0], [110]
91 U+04B7 ҷ Cyrillic CYRILLIC SMALL LETTER CHE WITH DESCENDER Abkhaz [0], [110]
92 U+04BB һ Cyrillic CYRILLIC SMALL LETTER SHHA set 4 Tatar [0], [111]
93 U+04BD ҽ Cyrillic CYRILLIC SMALL LETTER ABKHASIAN CHE Abkhaz [0], [110]
94 U+04BF ҿ Cyrillic CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER Abkhaz [0], [110]
95 U+04CF ӏ Cyrillic CYRILLIC SMALL LETTER PALOCHKA set 7 Chechen [0], [122]
96 U+04D1 ӑ Cyrillic CYRILLIC SMALL LETTER A WITH BREVE set 19 Chuvash [0], [113]
97 U+04D3 ӓ Cyrillic CYRILLIC SMALL LETTER A WITH DIAERESIS set 13 Mari [0], [114]
98 U+04D5 ӕ Cyrillic CYRILLIC SMALL LIGATURE A IE set 14 Ossetian [0], [115]
99 U+04D7 ӗ Cyrillic CYRILLIC SMALL LETTER IE WITH BREVE set 20 Chuvash [0], [113]
100 U+04D9 ә Cyrillic CYRILLIC SMALL LETTER SCHWA set 21 Bashkir [0], [112]
101 U+04DD ӝ Cyrillic CYRILLIC SMALL LETTER ZHE WITH DIAERESIS Udmurt [0], [116]
102 U+04DF ӟ Cyrillic CYRILLIC SMALL LETTER ZE WITH DIAERESIS Udmurt [0], [116]
103 U+04E1 ӡ Cyrillic CYRILLIC SMALL LETTER ABKHASIAN DZE set 23 Abkhaz [0], [110]
104 U+04E3 ӣ Cyrillic CYRILLIC SMALL LETTER I WITH MACRON Tajik [0], [108]
105 U+04E5 ӥ Cyrillic CYRILLIC SMALL LETTER I WITH DIAERESIS Udmurt [0], [116]
106 U+04E7 ӧ Cyrillic CYRILLIC SMALL LETTER O WITH DIAERESIS Mari [0], [114]
107 U+04E9 ө Cyrillic CYRILLIC SMALL LETTER BARRED O set 22 Kyrgiz, Khanty [0], [103], [117]
108 U+04EF ӯ Cyrillic CYRILLIC SMALL LETTER U WITH MACRON Tajik [0], [108]
109 U+04F1 ӱ Cyrillic CYRILLIC SMALL LETTER U WITH DIAERESIS set 18 Mari [0], [114]
110 U+04F3 ӳ Cyrillic CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE Chuvash [0], [113]
111 U+04F5 ӵ Cyrillic CYRILLIC SMALL LETTER CHE WITH DIAERESIS Udmurt [0], [116]
112 U+04F9 ӹ Cyrillic CYRILLIC SMALL LETTER YERU WITH DIAERESIS Mari [0], [114]
113 U+0525 ԥ Cyrillic CYRILLIC SMALL LETTER PE WITH DESCENDER Abkhaz [0], [110]
114 U+0561 ա Armenian ARMENIAN SMALL LETTER AYB set 26 Out-of-repertoire, required for symmetry
115 U+0570 հ Armenian ARMENIAN SMALL LETTER HO set 4 Out-of-repertoire, required for symmetry
116 U+0585 օ Armenian ARMENIAN SMALL LETTER OH set 8 Out-of-repertoire, required for symmetry
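The Name column above can be cross-checked against the Unicode Character Database. The following is a minimal sketch using Python's standard `unicodedata` module; the helper name `describe` is hypothetical, and the script value is only a guess taken from the first word of the character name, because `unicodedata` does not expose the Script property. The Python build may also ship a newer Unicode version than the 6.3.0 used by this LGR.

```python
import unicodedata

def describe(code_point):
    """Turn a 'U+XXXX' string into (glyph, Unicode name, script guess)."""
    ch = chr(int(code_point[2:], 16))
    name = unicodedata.name(ch)
    # Heuristic (assumption): for the rows in this table the first word of
    # the name coincides with the script. It is not the real Script property.
    script_guess = name.split()[0].capitalize()
    return ch, name, script_guess

# A few rows from the repertoire table: Latin, Greek, Cyrillic, Armenian.
rows = [describe(cp) for cp in ("U+0061", "U+03BA", "U+0430", "U+0585")]
```

Each entry in `rows` can then be compared with the Script and Name columns of the corresponding table row.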

Legend

Code Point
A code point or code point sequence.
Name
Shows the character or sequence name from the Unicode Character Database.
Glyph
The shape displayed depends on the fonts available to your browser.
Script
Shows the script property value from the Unicode Character Database. Combining marks may have the value Inherited and code points used with more than one script may have the value Common.
References
Links to the references associated with the code point or sequence, if any.
Tags
LGR-defined tag values. Any tags matching the Unicode script property are suppressed in this view.
Required Context
Link to the rule defining the required context a code point or sequence must satisfy. If prefixed by "not:", identifies a context that must not occur.
Variants
A link to the variant set the code point or sequence is a member of, except where a code point or sequence maps only to itself, in which case the type of that mapping is listed.
Comment
If the comment in this row consists only of the code point or sequence name it is suppressed in this view.

Variant Sets

Summary

Number of variant sets 26
Largest variant set 5
Ordinary Variants by Type out-of-repertoire-var (30)
blocked (70)

The following tables list each pair of variant mappings on one row.

In a properly specified LGR, all members of each variant set are variants of each other, a property called transitivity. Because of that, all variant sets are necessarily disjoint. In each set, shading is used to group mappings from the same source code point or sequence.
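The symmetry and transitivity requirement can be checked mechanically. The sketch below is illustrative only: the function names are hypothetical, the mappings are the eight rows listed for Variant Set 4 in this document, and identity mappings that are not listed (such as U+04BB to itself) are assumed to be implied.

```python
from itertools import product

# (source, target) pairs copied from the rows of Variant Set 4.
MAPPINGS = {
    (0x0068, 0x0068), (0x0068, 0x04BB), (0x0068, 0x0570),
    (0x04BB, 0x0068), (0x04BB, 0x0570),
    (0x0570, 0x0068), (0x0570, 0x04BB), (0x0570, 0x0570),
}

def is_symmetric(mappings):
    """Every mapping a -> b must have a reverse mapping b -> a."""
    return all((t, s) in mappings for s, t in mappings)

def is_transitive(mappings):
    """If a -> b and b -> d are present, a -> d must be present too.

    Assumption: identity mappings are implied, so a == d is accepted.
    """
    return all(
        a == d or (a, d) in mappings
        for (a, b), (c, d) in product(mappings, repeat=2)
        if b == c
    )

well_formed = is_symmetric(MAPPINGS) and is_transitive(MAPPINGS)
```

Removing a single row, for example the mapping from U+0570 to U+04BB, makes `is_symmetric` return False, which is the kind of defect such a check is meant to catch.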

Variant Set 1 - 3 Members

# Source Glyph Target Glyph Type(s) References Comment
1 U+0061 a U+0061 a out-of-repertoire-var Out of repertoire
2 U+0061 a U+0430 а blocked
3 U+0430 а U+0061 a blocked cross-script homoglyph

Variant Set 2 - 3 Members

# Source Glyph Target Glyph Type(s) References Comment
1 U+0063 c U+0063 c out-of-repertoire-var Out of repertoire
2 U+0063 c U+0441 с blocked
3 U+0441 с U+0063 c blocked cross-script homoglyph

Variant Set 3 - 3 Members

# Source Glyph Target Glyph Type(s) References Comment
1 U+0065 e U+0065 e out-of-repertoire-var Out of repertoire
2 U+0065 e U+0435 е blocked
3 U+0435 е U+0065 e blocked cross-script homoglyph

Variant Set 4 - 8 Members

# Source Glyph Target Glyph Type(s) References Comment
1 U+0068 h U+0068 h out-of-repertoire-var Out of repertoire
2 U+0068 h U+04BB һ blocked
3 U+0068 h U+0570 հ blocked cross-script homoglyph
4 U+04BB һ U+0068 h blocked cross-script homoglyph
5 U+04BB һ U+0570 հ blocked cross-script homoglyph
6 U+0570 հ U+0068 h blocked cross-script homoglyph
7 U+0570 հ U+04BB һ blocked
8 U+0570 հ U+0570 հ out-of-repertoire-var Out of repertoire

Variant Set 5 - 3 Members

# Source Glyph Target Glyph Type(s) References Comment
1 U+0069 i U+0069 i out-of-repertoire-var Out of repertoire
2 U+0069 i U+0456 і blocked
3 U+0456 і U+0069 i blocked cross-script homoglyph

Variant Set 6 - 3 Members

# Source Glyph Target Glyph Type(s) References Comment
1 U+006A j U+006A j out-of-repertoire-var Out of repertoire
2 U+006A j U+0458 ј blocked
3 U+0458 ј U+006A j blocked cross-script homoglyph

Variant Set 7 - 3 Members

# Source Glyph Target Glyph Type(s) References Comment
1 U+006C l U+006C l out-of-repertoire-var Out of repertoire
2 U+006C l U+04CF ӏ blocked
3 U+04CF ӏ U+006C l blocked cross-script homoglyph

Variant Set 8 - 15 Members

# Source Glyph Target Glyph Type(s) References Comment
1 U+006F o U+006F o out-of-repertoire-var Out of repertoire
2 U+006F o U+03BF ο blocked cross-script homoglyph
3 U+006F o U+043E о blocked
4 U+006F o U+0585 օ blocked cross-script homoglyph
5 U+03BF ο U+006F o blocked cross-script homoglyph
6 U+03BF ο U+03BF ο out-of-repertoire-var Out of repertoire
7 U+03BF ο U+043E о blocked
8 U+03BF ο U+0585 օ blocked cross-script homoglyph
9 U+043E о U+006F o blocked cross-script homoglyph
10 U+043E о U+03BF ο blocked cross-script homoglyph
11 U+043E о U+0585 օ blocked cross-script homoglyph
12 U+0585 օ U+006F o blocked cross-script homoglyph
13 U+0585 օ U+03BF ο blocked cross-script homoglyph
14 U+0585 օ U+043E о blocked
15 U+0585 օ U+0585 օ out-of-repertoire-var Out of repertoire

Variant Set 9 - 3 Members

# Source Glyph Target Glyph Type(s) References Comment
1 U+0070 p U+0070 p out-of-repertoire-var Out of repertoire
2 U+0070 p U+0440 р blocked
3 U+0440 р U+0070 p blocked cross-script homoglyph

Variant Set 10 - 3 Members

# Source Glyph Target Glyph Type(s) References Comment
1 U+0073 s U+0073 s out-of-repertoire-var Out of repertoire
2 U+0073 s U+0455 ѕ blocked
3 U+0455 ѕ U+0073 s blocked cross-script homoglyph

Variant Set 11 - 3 Members

# Source Glyph Target Glyph Type(s) References Comment
1 U+0078 x U+0078 x out-of-repertoire-var Out of repertoire
2 U+0078 x U+0445 х blocked
3 U+0445 х U+0078 x blocked cross-script homoglyph

Variant Set 12 - 3 Members

# Source Glyph Target Glyph Type(s) References Comment
1 U+0079 y U+0079 y out-of-repertoire-var Out of repertoire
2 U+0079 y U+0443 у blocked
3 U+0443 у U+0079 y blocked cross-script homoglyph

Variant Set 13 - 3 Members

# Source Glyph Target Glyph Type(s) References Comment
1 U+00E4 ä U+00E4 ä out-of-repertoire-var Out of repertoire
2 U+00E4 ä U+04D3 ӓ blocked
3 U+04D3 ӓ U+00E4 ä blocked cross-script homoglyph

Variant Set 14 - 3 Members

# Source Glyph Target Glyph Type(s) References Comment
1 U+00E6 æ U+00E6 æ out-of-repertoire-var Out of repertoire
2 U+00E6 æ U+04D5 ӕ blocked
3 U+04D5 ӕ U+00E6 æ blocked cross-script homoglyph

Variant Set 15 - 3 Members

# Source Glyph Target Glyph Type(s) References Comment
1 U+00E7 ç U+00E7 ç out-of-repertoire-var Out of repertoire
2 U+00E7 ç U+04AB ҫ blocked
3 U+04AB ҫ U+00E7 ç blocked cross-script homoglyph

Variant Set 16 - 3 Members

# Source Glyph Target Glyph Type(s) References Comment
1 U+00EB ë U+00EB ë out-of-repertoire-var Out of repertoire
2 U+00EB ë U+0451 ё blocked
3 U+0451 ё U+00EB ë blocked cross-script homoglyph

Variant Set 17 - 3 Members

# Source Glyph Target Glyph Type(s) References Comment
1 U+00EF ï U+00EF ï out-of-repertoire-var Out of repertoire
2 U+00EF ï U+0457 ї blocked
3 U+0457 ї U+00EF ï blocked cross-script homoglyph

Variant Set 18 - 3 Members

# Source Glyph Target Glyph Type(s) References Comment
1 U+00FF ÿ U+00FF ÿ out-of-repertoire-var Out of repertoire
2 U+00FF ÿ U+04F1 ӱ blocked
3 U+04F1 ӱ U+00FF ÿ blocked cross-script homoglyph

Variant Set 19 - 3 Members

# Source Glyph Target Glyph Type(s) References Comment
1 U+0103 ă U+0103 ă out-of-repertoire-var Out of repertoire
2 U+0103 ă U+04D1 ӑ blocked
3 U+04D1 ӑ U+0103 ă blocked cross-script homoglyph

Variant Set 20 - 3 Members

# Source Glyph Target Glyph Type(s) References Comment
1 U+0115 ĕ U+0115 ĕ out-of-repertoire-var Out of repertoire
2 U+0115 ĕ U+04D7 ӗ blocked
3 U+04D7 ӗ U+0115 ĕ blocked cross-script homoglyph

Variant Set 21 - 8 Members

# Source Glyph Target Glyph Type(s) References Comment
1 U+01DD ǝ U+01DD ǝ out-of-repertoire-var Out of repertoire
2 U+01DD ǝ U+0259 ə blocked
3 U+01DD ǝ U+04D9 ә blocked
4 U+0259 ə U+01DD ǝ blocked
5 U+0259 ə U+0259 ə out-of-repertoire-var Out of repertoire
6 U+0259 ə U+04D9 ә blocked
7 U+04D9 ә U+01DD ǝ blocked cross-script homoglyph
8 U+04D9 ә U+0259 ə blocked cross-script homoglyph

Variant Set 22 - 3 Members

# Source Glyph Target Glyph Type(s) References Comment
1 U+0275 ɵ U+0275 ɵ out-of-repertoire-var Out of repertoire
2 U+0275 ɵ U+04E9 ө blocked
3 U+04E9 ө U+0275 ɵ blocked cross-script homoglyph

Variant Set 23 - 3 Members

# Source Glyph Target Glyph Type(s) References Comment
1 U+0292 ʒ U+0292 ʒ out-of-repertoire-var Out of repertoire
2 U+0292 ʒ U+04E1 ӡ blocked
3 U+04E1 ӡ U+0292 ʒ blocked cross-script homoglyph

Variant Set 24 - 3 Members

# Source Glyph Target Glyph Type(s) References Comment
1 U+03BA κ U+03BA κ out-of-repertoire-var Out of repertoire
2 U+03BA κ U+043A к blocked
3 U+043A к U+03BA κ blocked cross-script homoglyph

Variant Set 25 - 3 Members

# Source Glyph Target Glyph Type(s) References Comment
1 U+03C6 φ U+03C6 φ out-of-repertoire-var Out of repertoire
2 U+03C6 φ U+0444 ф blocked
3 U+0444 ф U+03C6 φ blocked cross-script homoglyph

Variant Set 26 - 3 Members

# Source Glyph Target Glyph Type(s) References Comment
1 U+0448 ш U+0561 ա blocked cross-script homoglyph
2 U+0561 ա U+0448 ш blocked
3 U+0561 ա U+0561 ա out-of-repertoire-var Out of repertoire

Classes, Rules and Actions

Character Classes

The following table lists all top-level classes with their definition and the regular expression defining their members.

Name Definition Count Members References Comment

Legend

Members or Ranges
Lists the members of the class as code points (xxx) or as ranges of code points (xxx-yyy). Any class too numerous to list in full is elided with "...".
Tag=ttt
An anonymous class implicitly defined based on tag value.
[: :] - named character set
Reference to a named character set [:name:].
(∩, ∪, \, ∆) - set operators
Sets may be combined by set operators (∩ = intersection, ∪ = union, \ = difference, ∆ = symmetric difference).
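As a rough illustration, the four set operators named above (intersection, union, difference, symmetric difference) correspond directly to Python's built-in set operators. The two classes below are hypothetical and exist only for this example; this LGR does not define any named classes.

```python
# Two small, hypothetical character classes, held as sets of code points.
class_a = {0x0430, 0x0435, 0x043E}   # Cyrillic a, ie, o
class_b = {0x0435, 0x043E, 0x0445}   # Cyrillic ie, o, ha

intersection = class_a & class_b          # members of both classes
union = class_a | class_b                 # members of either class
difference = class_a - class_b            # in class_a but not in class_b
symmetric_difference = class_a ^ class_b  # in exactly one of the classes
```

An empty result, such as the intersection of two disjoint classes, corresponds to the empty set that the rule notation marks explicitly.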

Whole label evaluation and context rules

The following table lists all the top-level, or named, rules defined in the LGR and indicates whether they are used as a trigger in an action or as context (when or not-when) for a code point. Any use of context rules for variants is not indicated.

Name Regular Expression Used as Trigger Used as Context Anchor References Comment
leading-combining-mark (start) ([:class property:gc=Mn:]∪[:class property:gc=Mc:]) True False False RFC5891 restrictions on placement of combining marks
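The leading-combining-mark rule can be approximated in a few lines. This is a sketch, not the LGR's own implementation: it assumes the rule matches when the first code point of a label has General_Category Mn or Mc, and it relies on the Unicode data bundled with the Python interpreter rather than Unicode 6.3.0. The function name is hypothetical.

```python
import unicodedata

def has_leading_combining_mark(label):
    """Return True if the label starts with a code point of category Mn or Mc."""
    # An empty label has no first code point, so the rule cannot match.
    return bool(label) and unicodedata.category(label[0]) in ("Mn", "Mc")

# U+0301 COMBINING ACUTE ACCENT (Mn) before and after U+0430.
starts_with_mark = has_leading_combining_mark("\u0301\u0430")
mark_after_base = has_leading_combining_mark("\u0430\u0301")
```

A label for which this function returns True would be assigned the disposition invalid by the first action in the Actions table.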

Legend

Used as Trigger
This rule triggers one of the actions listed below.
Used as Context
This rule defines a required context for a code point.
Anchor
This rule contains a placeholder (anchor) for the code point for which it is evaluated.
Regular Expression
A regular expression equivalent to the rule, shown in the standard notation with some extensions as noted:
⚓ - context anchor
In a regex the ⚓ signifies a placeholder for the actual code point when a context is evaluated. The code point must occur at the position corresponding to the anchor. Rules containing an anchor cannot be used as triggers.
(...)← - look-behind
If present, encloses required context preceding the anchor.
→(...) - look-ahead
If present, encloses required context following the anchor.
(: :) - rule reference
Non-recursive reference to a named rule.
[: :] - character set either named, implicit or property
Reference to a named character set [:name:], an implicit character set [:class tag=val:] or a given Unicode property [:class property:prop=val:]. A leading "^" before name or tag indicates the set complement.
(|) - choice operator
When there are various choices in a rule, choices are separated by the choice operator (|) and each choice is represented by a set enclosed in parentheses.
(∩, ∪, \, ∆) - set operators
Sets may be combined by set operators (∩ = intersection, ∪ = union, \ = difference, ∆ = symmetric difference).
∅ - empty set
Indicates that the following set is empty, either as the result of set operations or because none of its elements are part of the repertoire defined here.
An empty set that is not optional means that a rule can never match.
{m}, {m, n}, {m,} - count
Indicates that the preceding element is evaluated from m to n times. Only {m} means the preceding element is evaluated exactly m times (equivalent to {m,m}), {m,} means the preceding element is evaluated at least m times.
No count indicates that the element is evaluated once (equivalent to "{1}").

Actions

The following table lists the actions that are used to assign dispositions to labels and variant labels, based on the specified conditions. The order of actions defines their precedence: the first action triggered by a label is the one defining its disposition.

# Condition Rule / Variant Set   Disposition References Comment
1 if label match leading-combining-mark → invalid leading combining marks are disallowed ?
2 if at least one variant is in {out-of-repertoire-var} → invalid any variant label with a code point out of repertoire is invalid. ?
3 if at least one variant is in {blocked} → blocked any variant label containing a blocked variant is blocked. ?
4 if all variants are in {allocatable} → allocatable a variant label with only allocatable variants is allocatable. ?
5 if any label (catch-all) → valid catch all ?

Legend

{...} - variant type set
In the "Rule/Variant Set" column the notation {...} means a set of variant types.
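The precedence of the five actions can be sketched as an ordered list that is scanned until the first action triggers. This is only an illustration under stated assumptions: the names `ACTIONS` and `disposition` are hypothetical, the inputs (whether the label matches leading-combining-mark, and the variant types used to produce the label) are assumed to be computed elsewhere, and action 4 is assumed not to trigger for a label that uses no variant mappings at all.

```python
# Each entry pairs a trigger test with the disposition it assigns.
# The order mirrors rows 1 to 5 of the Actions table.
ACTIONS = [
    (lambda lead, types: lead, "invalid"),
    (lambda lead, types: "out-of-repertoire-var" in types, "invalid"),
    (lambda lead, types: "blocked" in types, "blocked"),
    # Assumption: "all variants" requires at least one variant mapping.
    (lambda lead, types: bool(types) and types <= {"allocatable"}, "allocatable"),
    (lambda lead, types: True, "valid"),  # catch-all
]

def disposition(leading_mark, variant_types):
    """Return the disposition set by the first action that triggers."""
    types = set(variant_types)
    for triggers, result in ACTIONS:
        if triggers(leading_mark, types):
            return result

# A variant label mixing blocked and out-of-repertoire mappings is invalid,
# because action 2 takes precedence over action 3.
mixed = disposition(False, ["blocked", "out-of-repertoire-var"])
```

Because the catch-all comes last, an original label with no leading combining mark and no variant mappings falls through to valid.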

Table of References

[0] The Unicode Consortium. The Unicode Standard, Version 1.1
Code points cited were originally encoded in Unicode Version 1.1
[100] Basic Cyrillic, RFC5992
[101] Byelorussian, http://www.omniglot.com/writing/belarusian.htm
EGIDS 1
[102] Kazakh, http://omniglot.com/writing/kazakh.htm
EGIDS 1
[103] Kyrgiz, http://omniglot.com/writing/kirghiz.htm
EGIDS 1
[104] Macedonian, http://www.omniglot.com/writing/macedonian.htm
EGIDS 1
[105] Mongolian, http://www.omniglot.com/writing/mongolian.htm
EGIDS 1
[106] Russian, http://www.omniglot.com/writing/russian.htm
EGIDS 1
[107] Serbian, http://www.omniglot.com/writing/serbian.htm
EGIDS 1
[108] Tajik, http://www.omniglot.com/writing/tajik.htm
EGIDS 1
[109] Ukrainian, http://www.omniglot.com/writing/ukrainian.htm
EGIDS 1
[110] Abkhaz, http://www.omniglot.com/writing/abkhaz.htm
EGIDS 2
[111] Tatar, http://www.omniglot.com/writing/tatar.htm
EGIDS 2
[112] Bashkir, http://www.omniglot.com/writing/bashkir.htm
EGIDS 4
[113] Chuvash, http://www.omniglot.com/writing/chuvash.htm
EGIDS 4
[114] Mari, http://www.omniglot.com/writing/mari.htm
EGIDS 4
[115] Ossetian, http://www.omniglot.com/writing/ossetian.htm, https://en.wikipedia.org/wiki/Ossetian_language
EGIDS 5
[116] Udmurt, http://www.omniglot.com/writing/udmurt.htm, http://ftp.eki.ee/index.php?id=16440#.WFb6gBsrLIU
EGIDS 5
[117] Khanty, http://www.omniglot.com/writing/khanty.htm
EGIDS 6b
[118] Sami, http://www.omniglot.com/writing/saami.htm
EGIDS 8b
[119] Gagauz, http://www.omniglot.com/writing/gagauz.htm, https://www.ethnologue.com/language/gag
EGIDS 5
[120] Khakas, http://www.omniglot.com/writing/khakas.htm, https://www.ethnologue.com/language/kjh
EGIDS 5
[121] Gagauz, https://en.wikipedia.org/wiki/Gagauz_language
EGIDS 5
[122] Chechen, http://www.omniglot.com/writing/chechen.htm
EGIDS 2