Dear All,

 

This is indeed a complex matter to address, and it therefore requires this continued discussion.  It may also be useful here to refer back to the RZ-LGR Procedure.

 

The RZ-LGR Procedure, while defining “IDN variants”, says that:

 

However, the Procedure also acknowledges immediately following the definition that:

 

While noting the benefits of defining IDN variants, the Procedure also acknowledges their limitations.

So, not all matters can be settled in the LGR.  A line has to be drawn between “same” and “similar”.

 

The LGR Procedure does note what it is desirable to have in scope for the LGR:

 

But notes that this should not go too far into the string similarity discussion:

 

One could infer from these statements in the RZ-LGR Procedure that:

  1. If two code points are considered “same” by the user community, these should be included as IDN variants (this is not limited to visual similarity, but could also include semantic equivalence, as in Chinese; orthographic conventions or spelling simplification, as in Arabic; homophonic relations, as in Ethiopic; etc., as determined by the respective script community)
  2. The “straightforward, non-subjective cases” of visual similarity could be included as IDN variants and blocked
  3. Beyond these, the analysis goes into the realm of string similarity review, which is beyond the intention of the LGR
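
As a rough illustration only, the three-way split inferred above can be sketched as a small decision function. This is a minimal sketch, not anything taken from the RZ-LGR Procedure: the names `PairJudgement`, `Disposition` and `classify` are hypothetical, the two boolean fields stand in for judgements that a Generation Panel would make by its own process, and the sample pairs are illustrative inputs rather than decisions of any panel.

```python
from dataclasses import dataclass
from enum import Enum


class Disposition(Enum):
    """Hypothetical labels for the three outcomes in the list above."""
    VARIANT = "IDN variant"                      # item 1
    BLOCKED_VARIANT = "IDN variant, blocked"     # item 2
    OUT_OF_SCOPE = "string similarity review"    # item 3


@dataclass(frozen=True)
class PairJudgement:
    """Hypothetical record of a panel's judgement on one pair of code points."""
    source: str
    target: str
    # True if the script community treats the two code points as the "same"
    # (visually, semantically, orthographically, homophonically, etc.).
    same_for_community: bool
    # True if the pair is a "straightforward, non-subjective" visual case.
    clear_visual_match: bool


def classify(judgement: PairJudgement) -> Disposition:
    """Apply the three inferred rules in the order they are listed."""
    if judgement.same_for_community:
        return Disposition.VARIANT
    if judgement.clear_visual_match:
        return Disposition.BLOCKED_VARIANT
    return Disposition.OUT_OF_SCOPE


# Illustrative inputs only; the boolean values are assumptions made for this
# example, not findings of any Generation Panel.
samples = [
    PairJudgement("\u56fd", "\u570b", same_for_community=True, clear_visual_match=False),
    PairJudgement("\u0061", "\u0430", same_for_community=False, clear_visual_match=True),
    PairJudgement("\u006c", "\u0031", same_for_community=False, clear_visual_match=False),
]

for sample in samples:
    print(f"U+{ord(sample.source):04X} / U+{ord(sample.target):04X}: "
          f"{classify(sample).value}")
```

The function itself is trivial; the hard part, and the subject of this thread, is who decides the two boolean inputs and on what evidence.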

 

Generation Panels have been asked to draw the line based on these guidelines provided in the RZ-LGR Procedure.  For example, the Cyrillic GP agreed to consider homoglyph relations with other related scripts for this purpose.  The Neo-Brahmi GP has used a slightly different technique: it treats as cross-script variants those code points that members of both script communities in question find “indistinguishable”, even if they are not homoglyphs (see the blog for some more details).

 

Of course, the Latin GP also needs to draw these lines in its analysis to identify within-script and cross-script IDN variant cases.

 

Regards,
Sarmad

 

 

From: Latingp [mailto:latingp-bounces@icann.org] On Behalf Of Bill Jouris
Sent: Saturday, May 19, 2018 5:28 AM
To: Tan Tanaka, Dennis <dtantanaka@verisign.com>; Meikal Mumin <meikal@mumin.de>
Cc: Tan Tanaka, Dennis via Latingp <latingp@icann.org>
Subject: Re: [Latingp] Variant cross-script analysis worksheets

 

It's been clear for some time, even before Brussels, that you think we should only look at homoglyphs.  (Also that you don't think that there are any in-script homoglyphs.  See the discussion about the schwa and the turned e.) 



But there is a world of difference between agreeing, and merely deciding not to waste time arguing with a closed mind.  Which, for me, is what happened in the discussion in Brussels. 

 

Bill Jouris
Inside Products
bill.jouris@insidethestack.com
831-659-8360
925-855-9512 (direct)

 


From: "Tan Tanaka, Dennis" <dtantanaka@verisign.com>
To: Bill Jouris <bill.jouris@insidethestack.com>; Meikal Mumin <meikal@mumin.de>
Cc: Michael Bauland <Michael.Bauland@knipp.de>; "Tan Tanaka, Dennis via Latingp" <latingp@icann.org>
Sent: Friday, May 18, 2018 1:43 PM
Subject: Re: [Latingp] Variant cross-script analysis worksheets

 

I believe we delimited the scope of variants for the Latin script in the face-to-face meeting in Brussels, did we not?

 

From: Bill Jouris <bill.jouris@insidethestack.com>
Reply-To: Bill Jouris <bill.jouris@insidethestack.com>
Date: Friday, May 18, 2018 at 2:18 PM
To: Dennis Tan Tanaka <dtantanaka@verisign.com>, Meikal Mumin <meikal@mumin.de>
Cc: Michael Bauland <Michael.Bauland@knipp.de>, "Tan Tanaka, Dennis via Latingp" <latingp@icann.org>
Subject: [EXTERNAL] Re: [Latingp] Variant cross-script analysis worksheets

 

It is pretty clear, if one reads the MSR-3 document, that we are supposed to deal with Variants.  Which include, but are NOT limited to, homoglyphs. 

 

Bill Jouris

 


From: "Tan Tanaka, Dennis" <dtantanaka@verisign.com>
To: Meikal Mumin <meikal@mumin.de>
Cc: "bill.jouris@insidethestack.com" <bill.jouris@insidethestack.com>; Michael Bauland <Michael.Bauland@knipp.de>; "Tan Tanaka, Dennis via Latingp" <latingp@icann.org>
Sent: Friday, May 18, 2018 10:20 AM
Subject: Re: [Latingp] Variant cross-script analysis worksheets

 

 

we must deal with such confusable characters or sequences of characters in the context of variants

 

No, we don’t. Confusability is not in scope. We established that the Latin panel will deal with homoglyphs or near-homoglyphs (i.e. font variation) in the cross-script context.