My gut-level response is that mixing scripts within a label is much worse than mixing scripts within an FQDN, though I am hard-pressed to quantify that feeling. One could use confusable code points from Latin, Greek and/or Cyrillic, with only a single script in each label but a mixture overall within the FQDN. Is that really less dangerous? Hmm…
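
To make the label-versus-FQDN distinction concrete, here is a rough Python sketch. It is only an illustration under stated assumptions: the Python standard library does not expose the Unicode Script property, so the script is guessed from the first word of each character's Unicode name, and every non-letter (digits, hyphen, combining marks) is treated as "Common". The helper names are hypothetical; a real check would more likely rely on the Script property data and the mixed-script rules in Unicode UTS #39.

```python
import unicodedata


def char_script(ch):
    """Guess a script from the character name (an approximation, not the real Script property)."""
    if not ch.isalpha():
        return "Common"  # digits, hyphens, combining marks, etc.
    name = unicodedata.name(ch, "")
    return name.split(" ")[0].title() if name else "Unknown"


def label_scripts(label):
    """Set of scripts used by the letters of a single label, ignoring 'Common'."""
    return {char_script(ch) for ch in label} - {"Common"}


def mixed_labels(fqdn):
    """Labels that mix scripts internally."""
    return [lab for lab in fqdn.split(".") if len(label_scripts(lab)) > 1]


def fqdn_scripts(fqdn):
    """Union of the scripts used across all labels of the name."""
    scripts = set()
    for lab in fqdn.split("."):
        scripts |= label_scripts(lab)
    return scripts


if __name__ == "__main__":
    # All-Cyrillic label under a Latin TLD (the xn--e1awd7f.com example):
    epic = "\u0435\u0440\u0456\u0441.com"
    print(mixed_labels(epic))          # [] : no single label is mixed
    print(sorted(fqdn_scripts(epic)))  # ['Cyrillic', 'Latin'] : the FQDN as a whole is mixed

    # Latin 't' followed by three Cyrillic letters in one label:
    print(mixed_labels("t\u0447\u0445\u0443.example"))

    # Cyrillic letters plus ASCII digits count as a single script:
    print(label_scripts("\u043c\u043e\u0441\u043a\u0432\u0430" + "2017"))  # {'Cyrillic'}
```

Note that this naive approach would also flag ordinary Japanese labels that combine kana and Han characters, so any practical rule needs explicit exceptions for such established combinations.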

 

From: ua-discuss-bounces@icann.org [mailto:ua-discuss-bounces@icann.org] On Behalf Of Asmus Freytag
Sent: Monday, August 7, 2017 3:36 PM
To: ua-discuss@icann.org
Subject: Re: [UA-discuss] Checking for Mixed Scripts in a domain name or email address - is this a UASG issue? - Group Discussion

 

The ASCII digits have script "common", not "Latin".

The issue isn't numbers, it's mixing letters.

Also, mixing scripts within a label seems somehow different from mixing scripts within a FQDN.

A./

On 8/7/2017 12:46 PM, Maxim Alzoba wrote:

Hello Don,

 

Cyrillic IDN tables usually include digits, so one of the examples is not correct (Cyrillic + Latin numerals is not a mixed script).

 

https://www.iana.org/domains/idn-tables/tables/xn--80adxhks_ru_1.0.txt

 

 

Issues usually arise when a string contains characters that do not fall within the same IDN table. For example, here is one of the IDN mistakes:

tчху

xn--t-cubfh

 

from

https://www.icann.org/sites/default/files/packages/reserved-names/ReservedNames.xml
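
Any script check has to look at the Unicode form of a label, so an "xn--" form like the one above must be decoded first. The following is a minimal Python sketch using the standard "punycode" codec. The function names are hypothetical, and the sketch deliberately skips the case mapping, normalization and validation that a full IDNA implementation performs.

```python
ACE_PREFIX = "xn--"


def to_u_label(label):
    """Decode an 'xn--' label to its Unicode form; other labels pass through unchanged."""
    if label.lower().startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label


def to_a_label(label):
    """Encode a non-ASCII label to its 'xn--' form; ASCII labels pass through unchanged."""
    if label.isascii():
        return label
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


if __name__ == "__main__":
    mixed = "t\u0447\u0445\u0443"  # Latin 't' + Cyrillic che, ha, u
    print(to_a_label(mixed))       # xn--t-cubfh
    print(to_u_label("xn--t-cubfh") == mixed)  # True
```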

 

Or, importantly, when the IDN string comes from a table that is not allowed for the particular TLD,

 

for example moscow.москва or москва.moscow (both TLDs are strictly single-script: only Russian Cyrillic is allowed in the first, and only ASCII characters in the second).

 

P.S. Formally, each TLD has an IDN policy or a "no-IDN" policy that describes the allowed combinations, and, importantly, that policy may change over time

(for example, some older TLDs have decided to allow certain IDN tables).

 

Sincerely Yours,

Maxim Alzoba
Special projects manager,
International Relations Department,
FAITID

m. +7 916 6761580(+whatsapp)

skype oldfrogger

 

Current UTC offset: +3.00 (.Moscow)

 

On Aug 7, 2017, at 21:52, Don Hollander <don.hollander@icann.org> wrote:

 

We’ve had the following suggested for the Programming Language Criteria:

 

    Could we include one more test case in the "Programming Language Evaluation Criteria"?

   1. Checking for multiple scripts in an email address, domain name, or URL.
       Ex: еріс.com [xn--e1awd7f.com] contains two scripts (Cyrillic, Latin).
              deepak.भारत contains two scripts (Latin, Devanagari).

 

 

I understand the thinking behind this, but I’m not sure that a) it’s in our remit or b) it’s a good idea, and c) I’m concerned that it would reject perfectly valid domain names (CJK, Cyrillic + Latin numerals, etc.).

 

I also don’t know what standard or policy this would reference. Something from the Unicode group or M3AAWG?

 

 

Your thoughts, please.

 

Don