Re: [Latingp] Generic Glyphs Cross-Script Analysis Google Sheets
Let me describe again what I see as an issue on lines 10-15 in "Generic Glyphs Cross-Script Analysis" https://docs.google.com/spreadsheets/d/17sPMStYinmsFOqZNWw8goqf8bq2SGiWr3xwx...

On lines 10-12 we compare the sequence "006F 006C 006F 006C 006F" with the sequence "0B20 0B3E 0B20 0B3E 0B20", but we only mention U+006C and U+0B3E, respectively. The issue is not what we compare in columns H and I. The problems are:

1. We do not state in columns C-F what we compare, only part of it.
2. If we come to the conclusion that there should be variants, what is the variant pair?

What we should compare is probably "olol" with "0B20 0B3E 0B20 0B3E", and the possible variant candidate should then be the sequences "ol" vs "0B20 0B3E". Those sequences should also be listed in columns C-F.

What I have written above also applies to lines 13-15.

Mats
---
Mats Dufberg
mats.dufberg@internetstiftelsen.se
DNS Specialist
Internetstiftelsen (The Swedish Internet Foundation)
Mobile: +46 73 065 3899
https://internetstiftelsen.se/
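To make the sequences in the message above concrete, here is a minimal Python sketch that turns the space-separated hex code points from the spreadsheet into actual strings. The helper names `from_hex` and `describe` are hypothetical and are not part of the spreadsheet or of any Latin GP tooling; only the standard `unicodedata` module is used.

```python
import unicodedata


def from_hex(seq):
    """Turn a space-separated list of hex code points into a string."""
    return "".join(chr(int(cp, 16)) for cp in seq.split())


def describe(seq):
    """List each code point in the sequence with its Unicode character name."""
    return [
        "U+%04X %s" % (ord(ch), unicodedata.name(ch, "<unnamed>"))
        for ch in from_hex(seq)
    ]


# The full strings currently compared on lines 10-12 of the sheet.
latin_full = from_hex("006F 006C 006F 006C 006F")
oriya_full = from_hex("0B20 0B3E 0B20 0B3E 0B20")

# The shorter sequences proposed as the variant candidate pair.
latin_pair = from_hex("006F 006C")
oriya_pair = from_hex("0B20 0B3E")

if __name__ == "__main__":
    for label, seq in [("Latin", "006F 006C"), ("Oriya", "0B20 0B3E")]:
        print(label, from_hex(seq), describe(seq))
```

Listing the complete sequences this way, rather than only U+006C and U+0B3E, is what the message asks for in columns C-F.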
We also have the same problem in lines 49 to 60. When you do the comparison you will do it on the strings.

---
Mats Dufberg

From: Latingp <latingp-bounces@icann.org> on behalf of Mats Dufberg <mats.dufberg@internetstiftelsen.se>
Date: Thursday, 2 April 2020 at 19:22
To: ICANN Latin GP <latingp@icann.org>
Subject: Re: [Latingp] Generic Glyphs Cross-Script Analysis Google Sheets

> Let me describe again what I see as an issue on lines 10-15 in "Generic Glyphs Cross-Script Analysis" [...]
There is, admittedly, an implicit assumption that the characters, other than the specific one being evaluated, are variants, and therefore don't play into the analysis.

Bill Jouris
Inside Products
bill.jouris@insidethestack.com
831-659-8360
925-855-9512 (direct)

On Thursday, April 2, 2020, 10:33:43 AM PDT, Mats Dufberg <mats.dufberg@internetstiftelsen.se> wrote:

> We also have the same problem in lines 49 to 60. When you do the comparison you will do it on the strings.

_______________________________________________
Latingp mailing list
Latingp@icann.org
https://mm.icann.org/mailman/listinfo/latingp
_______________________________________________
By submitting your personal data, you consent to the processing of your personal data for purposes of subscribing to this mailing list accordance with the ICANN Privacy Policy (https://www.icann.org/privacy/policy) and the website Terms of Service (https://www.icann.org/privacy/tos). You can visit the Mailman link above to change your membership status or configuration, including unsubscribing, setting digest-style delivery or disabling delivery altogether (e.g., for a vacation), and so on.
I have two objections.

1. Implicit assumptions are bad. The assumption must be explicit, or else you cannot assume that the reviewers have even tried to review it in the same way.
2. If a code point cannot create its own glyph, i.e. it is a mark that always attaches to another code point and together with it creates the glyph, it cannot be reviewed out of its context.

When we, within the Latin script, compared e.g. tilde with caron, we did not do that out of context but with the marks attached to a base letter. We selected the same base letter to keep the context as similar as possible, but the comparison was on the entire glyph. This must also be applied here.

What we compare will also affect what the candidate variant pair will be. We cannot compare U+006C with U+0B3E, since the latter does not exist as its own glyph; we should rather compare "006F 006C" with "0B20 0B3E", or use another context. And what we compare is the variant candidate.

The same goes for U+0131 vs U+0B3E. It should rather be "006F 0131" vs "0B20 0B3E".

If U+0EC0, U+1004 and U+0EA7 cannot stand by themselves, then the same applies to them. Otherwise strings with just the repeated code point could be used.

For U+102C the situation is the same as with U+0B3E.

Yours,
Mats
---
Mats Dufberg

From: Bill Jouris <bill.jouris@insidethestack.com>
Reply to: Bill Jouris <bill.jouris@insidethestack.com>
Date: Thursday, 2 April 2020 at 20:20
To: ICANN Latin GP <latingp@icann.org>, Mats Dufberg <mats.dufberg@internetstiftelsen.se>
Subject: Re: [Latingp] Generic Glyphs Cross-Script Analysis Google Sheets

> There is, admittedly, an implicit assumption that the characters, other than the specific one being evaluated, are variants. And therefore don't play into the analysis.
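The second objection can be sketched in Python as follows. This is only an illustration under stated assumptions. It uses the Unicode General_Category (any "M" category) as a rough proxy for "cannot create its own glyph", which is an assumption and will not catch every dependent character. The function names `is_mark` and `candidate_pair` are hypothetical, and the base letters U+006F and U+0B20 are simply the contexts used in the thread.

```python
import unicodedata


def is_mark(hex_cp):
    """Rough proxy (assumption): General_Category Mn, Mc or Me means the
    code point attaches to a base and has no glyph of its own."""
    return unicodedata.category(chr(int(hex_cp, 16))).startswith("M")


def candidate_pair(cp_a, cp_b, base_a, base_b):
    """Return the two sequences to compare and list as the variant candidate.

    If either code point is a mark, both sides are placed after a base
    letter so that the contexts stay as similar as possible. Otherwise
    the code points are compared on their own.
    """
    if is_mark(cp_a) or is_mark(cp_b):
        return ("%s %s" % (base_a, cp_a), "%s %s" % (base_b, cp_b))
    return (cp_a, cp_b)


# U+006C vs U+0B3E, with U+006F and U+0B20 as the base letters.
pair_l = candidate_pair("006C", "0B3E", "006F", "0B20")

# U+0131 vs U+0B3E, same base letters.
pair_dotless_i = candidate_pair("0131", "0B3E", "006F", "0B20")

if __name__ == "__main__":
    print(pair_l)
    print(pair_dotless_i)
```

Whether a given code point such as U+0EC0, U+1004 or U+0EA7 can stand by itself is a judgment for the script experts; the category test here only shows how an explicit rule, once agreed, would determine the candidate pair.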
participants (2)
- Bill Jouris
- Mats Dufberg