Dear GP members,

 

Please find enclosed the proposed agenda for our call. Any comments, corrections and suggestions are welcome.

 

Please look at the material prior to the call.

 

Regards, Mirjana

 

___________________________________________________________________________________________________________________

AGENDA for the GP call on 26 March 2020, 16:00 UTC

 

  1. Roll call
  2. MM letter on test labels for á U+00E1 and ά U+03AC
  3. Underlining Test Case - missing analysis from BJ, MM, HH
  4. Review the sheet for Generic glyphs analysis produced by Dennis
  5. Discuss Appendix D cases for which the first reviewer proposed variants (7 cases)

   (1) D.1.1 Latin Small Letter F vs. Latin Small Letter F with Hook

   (2) D.1.9 Latin Small Letter D with Caron vs. Latin Small Letter D with Hook

   (3) D.1.5 Latin Small Letter I vs. Latin Small Letter Dotless I vs. Latin Small Letter Iota

   (4) D.2.1 Latin Small Ligature Æ vs. Sequence AE

   (5) D.3.7 Double Acute vs. Diaeresis 

   (6) D.4.2 Circumflex and Hook Above

   (7) D.4.16 Circumflex Above + Grave Above

  6. AOB

 

Additional material for Item 5. Discuss Appendix D cases for which the first reviewer proposed variants (7 cases)

 

For each subitem in item 5, the following information is provided:

  1. The extract of that section in the latest version of the Latin LGR Proposal [onedrive.live.com] 
  2. Some notes from the F2F meeting Note to the IP Comment [onedrive.live.com] and Action Items [onedrive.live.com]

These files are at  OneDrive [onedrive.live.com].

 

 

Agenda 5(1)  -- D.1.1 Latin Small Letter F vs. Latin Small Letter F with Hook

Code Points Considered:

 

Code Points   Glyph   Name
0066          f       Latin Small Letter F
0192          ƒ       Latin Small Letter F with Hook

 

Example from a Swedish Newspaper:

[Screenshot of the newspaper example]

 

Findings:

The example, taken from a large daily newspaper, uses a shape for “Latin Small Letter F” (0066) that is identical to “Latin Small Letter F with Hook” (0192) in italic style. All instances of “ƒ” in it are simply italic forms of “f”.

 

Conclusions:

These two Code Points should be treated as variants

 

[TBD: this “conclusion” does not agree with the formal specification. The variant relation is not carried out in the XML (and perhaps not in all tables in this document? Not in Table 14 in Section 6.5). Make sure that either this statement is withdrawn or the variant definition is actually added.]

[MT: to be addressed in the next version of the report]
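
The concern in the TBD note can be checked mechanically by testing whether a variant is defined in both directions. The following is only a minimal sketch, assuming the LGR is expressed in the RFC 7940 XML format. The sample fragment, the "blocked" type and the helper names variant_map and missing_variants are hypothetical and are not taken from the actual Latin LGR.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment in the RFC 7940 XML format; NOT the real Latin LGR.
# It deliberately defines the variant in one direction only.
SAMPLE_LGR = """<lgr xmlns="urn:ietf:params:xml:ns:lgr-1.0">
  <data>
    <char cp="0066">
      <var cp="0192" type="blocked"/>
    </char>
    <char cp="0192"/>
  </data>
</lgr>
"""

NS = {"lgr": "urn:ietf:params:xml:ns:lgr-1.0"}


def variant_map(xml_text):
    """Map each code point (sequence) in the table to its variant targets."""
    root = ET.fromstring(xml_text)
    table = {}
    for char in root.findall("./lgr:data/lgr:char", NS):
        targets = {var.get("cp") for var in char.findall("lgr:var", NS)}
        table[char.get("cp")] = targets
    return table


def missing_variants(xml_text, pairs):
    """Return every (source, target) direction from `pairs` the table lacks."""
    table = variant_map(xml_text)
    missing = []
    for a, b in pairs:
        for src, dst in ((a, b), (b, a)):
            if dst not in table.get(src, set()):
                missing.append((src, dst))
    return missing


# Reports the direction 0192 -> 0066 as missing in the sample fragment.
print(missing_variants(SAMPLE_LGR, [("0066", "0192")]))
```

Run against the real XML file, an empty result for the pair (0066, 0192) would indicate that the conclusion and the formal specification agree.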

 

Recap from the F2F meeting:

 

 

Agenda 5(2)  -- D.1.9 Latin Small Letter D with Caron vs. Latin Small Letter D with Hook

 

Hypothesis:

Latin Small Letter D with Caron and Latin Small Letter D with Hook may be considered equivalent by readers and writers, since the extended hook may be another way of writing the caron in cursive handwriting. Additionally, the caron may become indistinguishable from an apostrophe.

 

Code Points Considered:

 

Code Points   Glyph   Name
010F          ď       Latin Small Letter D with Caron
0257          ɗ       Latin Small Letter D with Hook
02BC          ʼ       Modifier Letter Apostrophe
0064          d       Latin Small Letter D
006C          l       Latin Small Letter L
013E          ľ       Latin Small Letter L with Caron

 

Sequence (ɗďdʼ) (0257 010F 0064 02BC) compared using Google Fonts in https://wordmark.it/ [wordmark.it]:

 

Sequence (lʼľ) (006C 02BC 013E) compared using Google Fonts in https://wordmark.it/ [wordmark.it]:

[Screenshot of the font comparison]

Findings:

While the differences between 0257 and 010F seem rather stable, this is not the case for 010F vs 0064 + 02BC or for 006C + 02BC vs 013E. A number of fonts (highlighted in yellow) do retain a difference in shape between the caron and the apostrophe; however, these shapes are commonly considered interchangeable in handwriting, which may affect the way readers perceive them. In very few fonts the caron retains the shape of a modifier above the letter (highlighted in blue). In a significant number of fonts (red) the shape of the two modifiers is identical, sometimes with slight differences in spacing, sometimes without. Accordingly, 010F vs 0064 + 02BC as well as 006C + 02BC vs 013E are indistinguishable in a significant number of fonts and homoglyphs in a minority of fonts.
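
To make sure that everyone tests exactly the same strings in the font comparison, the sequences can be written with explicit code points and their names listed. The following is a small sketch using Python's standard unicodedata module; the helper name describe is only illustrative. It also shows that NFC normalization does not merge the letter + apostrophe sequence with the precomposed caron letter, so the two remain distinct labels unless a variant relation is defined.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name) for each character in `text`."""
    return [(f"{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


# The two comparison strings from this item, written with escapes so that
# U+02BC cannot be confused with an ASCII apostrophe.
D_SEQUENCE = "\u0257\u010f\u0064\u02bc"  # d with hook, d with caron, d, apostrophe
L_SEQUENCE = "\u006c\u02bc\u013e"        # l, apostrophe, l with caron

for cp, name in describe(D_SEQUENCE + L_SEQUENCE):
    print(f"U+{cp}  {name}")

# Normalization leaves "d + U+02BC" and "l + U+02BC" as two code points each,
# so they never become equal to the precomposed letters 010F and 013E.
for sequence, precomposed in (("\u0064\u02bc", "\u010f"), ("\u006c\u02bc", "\u013e")):
    normalized = unicodedata.normalize("NFC", sequence)
    print([f"{ord(ch):04X}" for ch in normalized], normalized == precomposed)
```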

 

Conclusions:

010F vs 0064 + 02BC as well as 006C + 02BC vs 013E should be in a variant relationship since they are indistinguishable in a number of fonts. Since punctuation marks and look-alikes must be excluded from the zone however, 010F as well as should be excluded.

 

 Recap from the F2F meeting:

 

 

Agenda 5(3)  -- D.1.5 Latin Small Letter I vs. Latin Small Letter Dotless I vs. Latin Small Letter Iota

 

Hypothesis:

Latin Small Letters I, Dotless I and Iota may be considered equivalent by readers and writers, since the dot of the I is frequently omitted in handwriting, and since the shape of Iota is a typical way of writing the I.

 

Code Points Considered:

 

Code Points   Glyph   Name
0069          i       Latin Small Letter I
0131          ı       Latin Small Letter Dotless I
0269          ɩ       Latin Small Letter Iota

 

Sequence iıɩ (0069 0131 0269) compared using Google Fonts in https://wordmark.it/ [wordmark.it]:

 

 

[Screenshot of the font comparison]

 

Findings:

Glyphs are distinguishable when written in lower case.

 

Sequence ıɩ (0131 0269) compared using Google Fonts in https://wordmark.it/ [wordmark.it]

 

 

Findings:

In the italic versions of any of the serif fonts (e.g. Times New Roman or Consolas) these are identical.

 

 Recap from the F2F meeting:

For (1), agree to have Palochka and Latin letter L as variants. Add Palochka and small dotless i to the string similarity table.

For (2), GP: Remove from the variant set.

 

 

Agenda 5(4)  -- D.2.1 Latin Small Ligature Æ vs. Sequence AE

 

Code Points Considered:

Code Points   Glyph   Name
00E6          æ       Latin Small Letter Æ
0061          a       Latin Small Letter A
0065          e       Latin Small Letter E
0153          œ       Latin Small Ligature Œ
0251          ɑ       Latin Small Letter Alpha

 

Sequence æae (00E6 + 0061 + 0065) compared using Google Fonts in https://wordmark.it/:

https://lh4.googleusercontent.com/ThKSCpnHvTETlE_0XUlOJxXDS3VKf6QjZg9nX9G03HS09jUQwI9DNPL8Ib8Nmyo-lssbKv7xQKVoggBEeMoCIfJhDPnZWgaTzt74WAXahBjeSeYdQIkwKVjKdOnopEQpE4vvup4C

 

Findings:

In some fonts in which the a-glyph takes a shape similar to that of 0251 ɑ Latin Small Letter Alpha, the ligature and the sequence bear some similarity but are distinguishable.

In a large number of fonts, the ligature and the sequence are consistently different.

 

Additional Findings:

In fonts in which the a-glyph takes a shape similar to that of 0251 ɑ Latin Small Letter Alpha, the ligature 00E6 becomes nearly visually identical to the o-e ligature (0153 œ Latin Small Ligature Œ), as demonstrated below.

 

Sequence æaeœoe (00E6+0061+0065+0153+006F+0065) compared using Google Fonts in https://wordmark.it/:

[Screenshot of the font comparison]

Conclusion:

Suggestion to consider 00E6 Latin Small Letter Æ and 0153 Latin Small Ligature Œ as a variant pair, or to add them to the string similarity list, on the grounds that they are visually nearly identical and are similar on non-visual grounds because of the conceptual identity of 0251 Latin Small Letter Alpha (“ɑ”) and 0061 Latin Small Letter A (“a”) in a significant number of fonts.

 

 Recap from the F2F meeting:

 

Agenda 5(5)  -- D.3.7 Double Acute vs. Diaeresis

 

Code Points Considered:

 

Code Points          Glyph   Name
006E + 0308          n̈       Latin Small Letter N + Combining Diaeresis
00E4                 ä       Latin Small Letter A with Diaeresis
00EB                 ë       Latin Small Letter E with Diaeresis
00EF                 ï       Latin Small Letter I with Diaeresis
00F6                 ö       Latin Small Letter O with Diaeresis
00FC                 ü       Latin Small Letter U with Diaeresis
00FF                 ÿ       Latin Small Letter Y with Diaeresis
0151                 ő       Latin Small Letter O with Double Acute
0171                 ű       Latin Small Letter U with Double Acute
0254 + 0308          ɔ̈       Latin Small Letter Open O + Combining Diaeresis
025B + 0308          ɛ̈       Latin Small Letter Open E + Combining Diaeresis
025B + 0331 + 0308   ɛ̱̈       Latin Small Letter Open E + Combining Macron Below + Combining Diaeresis
1E8D                 ẍ       Latin Small Letter X with Diaeresis

 

Sequence őö and üű (0151 00F6 and 00FC 0171) compared using Google Fonts in https://wordmark.it/:

 

Findings:

The representations of the Double Acute vs Diaeresis in these pairs are distinguishable in a number of fonts. In some fonts, the two diacritics look similar.

 

Conclusion:

The pairs ő/ö and ü/ű should be investigated for visual similarity.

 

 

 Recap from the F2F meeting:

 

Agenda 5(6)  -- D.4.2 Circumflex and Hook Above

Code Points Considered:

Code Points   Glyph   Name
1EA9          ẩ       Latin Small Letter A with Circumflex and Hook Above
00E2          â       Latin Small Letter A with Circumflex
1EA3          ả       Latin Small Letter A with Hook Above
1EC3          ể       Latin Small Letter E with Circumflex and Hook Above
00EA          ê       Latin Small Letter E with Circumflex
1EBB          ẻ       Latin Small Letter E with Hook Above
1ED5          ổ       Latin Small Letter O with Circumflex and Hook Above
00F4          ô       Latin Small Letter O with Circumflex
1ECF          ỏ       Latin Small Letter O with Hook Above

 

 

Sequence ẩaâả (1EA9 + 0061 + 00E2 + 1EA3) compared using Google Fonts in https://wordmark.it/:

https://lh6.googleusercontent.com/KPE2TBAG45zsb-2rzrlhDSDJdcqspSHSj8HJFgnrMeiofgObrNqa4nrwB6-ekN_bj_GFbdE0s_Or0-ii5IfPE_Jj6SSWhXtEdT6ufFyHvAkaWEvunpQBnWHU4MvsWiWjjSMBmpJ8

 

Sequence ểeêẻ (1EC3 + 0065 + 00EA + 1EBB) compared using Google Fonts in https://wordmark.it/:

[Screenshot of the font comparison]

 

Sequence ổoôỏ (1ED5 + 006F + 00F4 + 1ECF) compared using Google Fonts in https://wordmark.it/:

https://lh5.googleusercontent.com/vBVWSP3_OD0mjetoTCUFrhhZE0iQRQcG8w6Y_KS20t7vSw8ss6dz75NIc6ty4ze0iGDFO8BVSwfS3sOnM3bdur3MzVptKqqwQtHVKpYE6UR7_hyHPN7uhgDuNgygif4Dgzki0Opj

 

Findings:

In a large number of fonts, the two letters are consistently different. However, in a significant number of fonts, renderings are very diverse. In some cases the hook as secondary modifier is placed vertically above the circumflex; in others it is set horizontally next to the circumflex as primary modifier. In some fonts it is spaced so far to the right that it becomes unclear whether it belongs to the first or the second code point, and in yet other cases it even overlaps with the glyph of the following code point.

 

Conclusion:

Suggestion to add the following to the shortlist for the string similarity list, or to create three variant pairs, on the grounds that they are visually similar to the point of being nearly identical or confusable:

ẩ 1EA9 and âả 00E2 + 1EA3

ể 1EC3 and êẻ 00EA + 1EBB

ổ 1ED5 and ôỏ 00F4 + 1ECF
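
The structural difference between the two sides of each pair can be made explicit by decomposing them. The following is a small sketch using canonical decomposition (NFD) from Python's standard unicodedata module; the helper name nfd_code_points is only illustrative. The single letter decomposes into one base letter with two combining marks, whereas the two-letter sequence contains a second base letter between the marks, which is the letter that some fonts visually hide.

```python
import unicodedata


def nfd_code_points(text):
    """Return the code points of `text` after canonical decomposition (NFD)."""
    return [f"{ord(ch):04X}" for ch in unicodedata.normalize("NFD", text)]


# (single letter, two-letter sequence) for a, e and o, as listed above.
CANDIDATES = [
    ("\u1ea9", "\u00e2\u1ea3"),
    ("\u1ec3", "\u00ea\u1ebb"),
    ("\u1ed5", "\u00f4\u1ecf"),
]

for single, sequence in CANDIDATES:
    print(nfd_code_points(single), "vs", nfd_code_points(sequence))
```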

 

 Recap from the F2F meeting:

Reasoning: These are complex combinations (two or more diacritics above), so non-Vietnamese users can see that they are unusual, while Vietnamese users can distinguish them.

 

Agenda 5(7)  -- D.4.16 Circumflex Above + Grave Above

 

Code Points Considered:

 

Code Points   Glyph   Name
00E2          â       Latin Small Letter A with Circumflex
00EA          ê       Latin Small Letter E with Circumflex
00E0          à       Latin Small Letter A with Grave
00E8          è       Latin Small Letter E with Grave
00F4          ô       Latin Small Letter O with Circumflex
00F2          ò       Latin Small Letter O with Grave
1EC1          ề       Latin Small Letter E with Circumflex and Grave
1ED3          ồ       Latin Small Letter O with Circumflex and Grave
006F          o       Latin Small Letter O
1EA7          ầ       Latin Small Letter A with Circumflex and Grave
0061          a       Latin Small Letter A
0065          e       Latin Small Letter E


Sequence aầaàâ (0061 1EA7 0061 00E0 00E2) compared using Google Fonts in https://wordmark.it/:

Sequence eềeèê (0065 1EC1 0065 00E8 00EA) compared using Google Fonts in https://wordmark.it/:

 

Sequence oồoòô (006F 1ED3 006F 00F2 00F4) compared using Google Fonts in https://wordmark.it/:

 

Findings:

There is no stability in the way the grave is positioned. In a few fonts it occurs above the circumflex; in a minority of fonts it is displaced to the right (here highlighted in yellow). In a significant minority of fonts the grave is instead misplaced to the left of the basic letter shape, and in all such cases presented here (but particularly in the case of a with circumflex and grave) the unmodified basic letter that precedes the letter with circumflex and grave may appear to carry the grave. Accordingly, there is a specific risk that Latin script users will confuse a, e, or o followed by the same letter with circumflex and grave with the sequence of the same letter first with a grave and then with a circumflex.

 

Conclusions:

To ensure safety and stability of the zone, and given the misleading placement of the grave in the cases discussed, it seems warranted to create three variant pairs:

àâ should be in a variant relationship with aầ

èê should be in a variant relationship with eề

òô should be in a variant relationship with oồ
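
The three pairs follow one pattern, so they can be generated from the base letters rather than typed by hand. The following is a minimal sketch using NFC composition from Python's standard unicodedata module; the helper names nfc and proposed_pair are only illustrative and say nothing about how the pairs would be encoded in the LGR itself.

```python
import unicodedata

GRAVE = "\u0300"       # Combining Grave Accent
CIRCUMFLEX = "\u0302"  # Combining Circumflex Accent


def nfc(text):
    """Compose `text` into its precomposed (NFC) form."""
    return unicodedata.normalize("NFC", text)


def proposed_pair(base):
    """Return the two sequences proposed as variants for one base letter."""
    # Letter with grave, followed by the letter with circumflex.
    left = nfc(base + GRAVE) + nfc(base + CIRCUMFLEX)
    # Plain letter, followed by the letter with circumflex and grave.
    right = base + nfc(base + CIRCUMFLEX + GRAVE)
    return left, right


PAIRS = [proposed_pair(base) for base in "aeo"]

for left, right in PAIRS:
    left_cps = " ".join(f"{ord(ch):04X}" for ch in left)
    right_cps = " ".join(f"{ord(ch):04X}" for ch in right)
    print(left_cps, "<->", right_cps)
```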


 

 Recap from the F2F meeting: