Dear All,

 

Please find below the test results from Mert Saka, our colleague in the ICANN Istanbul office.

He also asked me to extend his sincere thanks to everyone on the Latin GP for enabling the Turkish IDNs.

 

Regards,

Pitinan

 

From: Mert Saka <mert.saka@icann.org>
Date: Wednesday, May 20, 2020 at 15:48
To: Pitinan Kooarmornpatana <pitinan.koo@icann.org>
Cc: Sarmad Hussain <sarmad.hussain@icann.org>
Subject: Re: [Ext] Re: Re: [Latingp] Handling of down casing of I-dotted in Turkish locale

 

Hi Pitinan,

 

Yesterday was a public holiday in Turkey, so I had plenty of time on my hands to test. I ended up testing on four different devices, but with only two browsers (Chrome and Firefox). More details about my testing environment (the device details) are below; I hope they help the team evaluate the results better:

 

Here are some highlights from the test results:

 

A note about the testing methodology:

 

Now, the test result details:

 

1. ICANN’s MacBook Pro on MacOS Mojave (10.14.6)

 

IDNA conversion: https://www.punycoder.com/

User Input: copy and paste into the “Text” box, then “Convert to Punycode”
Output: record the output string from the “Punycode” box

User Input    Output
zil           zil
zıl           xn--zl-hpa
ZİL           xn--zil-9dc
ZIL           ZIL
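For reference, the two non-ASCII outputs above can be reproduced offline. The following is a minimal Python sketch, not the tool's actual implementation: it assumes the tool leaves pure-ASCII labels untouched, lowercases everything else with the default (locale-independent) Unicode rules, and then applies Punycode. The helper name to_ace is hypothetical.

```python
def to_ace(label: str) -> str:
    """Approximate the conversion output for one label (assumed behavior)."""
    if label.isascii():
        # Pure-ASCII labels came back unchanged in the results, case included.
        return label
    # Default lowercasing, no Turkish tailoring: U+0130 becomes "i" + U+0307.
    lowered = label.lower()
    # Punycode-encode the lowered label and add the ACE prefix.
    return "xn--" + lowered.encode("punycode").decode("ascii")


for label in ["zil", "z\u0131l", "Z\u0130L", "ZIL"]:
    print(f"{label} -> {to_ace(label)}")
```

Under these assumptions, ZİL yields xn--zil-9dc because default lowercasing turns U+0130 into i followed by U+0307 (COMBINING DOT ABOVE), and Punycode then encodes that combining mark.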

 

Browser Behavior (Chrome)

User Input: copy and paste into the URL address bar
Output URL Bar: record the label returned in the address bar

User Input    Output URL Bar
zil.test      http://zil.test/
zıl.test*     http://xn--zl-hpa.test/
ZİL.test*     http://xn--zil-9dc.test/
ZIL.test      http://zil.test/

 

* Below are the images that show the different display behavior of Chrome for the two Turkish characters:

“zıl.test” (LOWERCASE I WITHOUT DOT): [screenshot]

“ZİL.test” (CAPITAL I WITH DOT ON TOP): [screenshot]

 

 

2. Personal MacBook Air on MacOS Mojave (10.14.6)

IDNA conversion: https://www.punycoder.com/

User Input: copy and paste into the “Text” box, then “Convert to Punycode”
Output: record the output string from the “Punycode” box

User Input    Output
zil           zil
zıl           xn--zl-hpa
ZİL           xn--zil-9dc
ZIL           ZIL

 

Browser Behavior (Chrome)

User Input: copy and paste into the URL address bar
Output URL Bar: record the label returned in the address bar

User Input    Output URL Bar
zil.test      http://zil.test/
zıl.test*     http://xn--zl-hpa.test/
ZİL.test*     http://xn--zil-9dc.test/
ZIL.test      http://zil.test/

 

* Same results as ICANN’s MBPro above. The same difference in how Chrome displays the two Turkish characters was observed, as in the screenshots above.

 

Browser Behavior (Firefox)

User Input: copy and paste into the URL address bar
Output URL Bar: record the label returned in the address bar

User Input    Output URL Bar
zil.test      http://www.zil.test/
zıl.test      http://www.zıl.test/
ZİL.test*     http://www.zi̇l.test/
ZIL.test      http://www.zil.test/

* Please note that the result above may not be displayed properly (due to Outlook email rendering). Please refer to the image below and the attached TXT file.

“ZİL.test” (CAPITAL I WITH DOT ON TOP)

 

 

3. Personal MSI GE73VR 8RF-210XTR Laptop on Windows 10

IDNA conversion: https://www.punycoder.com/

User Input: copy and paste into the “Text” box, then “Convert to Punycode”
Output: record the output string from the “Punycode” box

User Input    Output
zil           zil
zıl           xn--zl-hpa
ZİL           xn--zil-9dc
ZIL           ZIL

 

Browser Behavior (Chrome)

User Input: copy and paste into the URL address bar
Output URL Bar: record the label returned in the address bar

User Input    Output URL Bar
zil.test      http://zil.test/
zıl.test*     http://xn--zl-hpa.test/
ZİL.test*     http://xn--zil-9dc.test/
ZIL.test      http://zil.test/

* Same results as ICANN’s MBPro above. The same difference in how Chrome displays the two Turkish characters was observed, as in the screenshots above.

 

 

4. ICANN’s iPhone on iOS (13.4.1)

IDNA conversion: https://www.punycoder.com/

User Input: copy and paste into the “Text” box, then “Convert to Punycode”
Output: record the output string from the “Punycode” box

User Input    Output
zil           zil
zıl           xn--zl-hpa
ZİL           xn--zil-9dc
ZIL           ZIL

 

Browser Behavior (Chrome)

User Input: copy and paste into the URL address bar
Output URL Bar: record the label returned in the address bar

User Input    Output URL Bar
zil.test      http://zil.test/
zıl.test      http://xn--zl-hpa.test/
ZİL.test      http://xn--zil-9dc.test/
ZIL.test      http://zil.test/

* Same results as ICANN’s MBPro above. The same difference in how Chrome displays the two Turkish characters was observed, as seen in the screenshots below:

“zıl.test” (LOWERCASE I WITHOUT DOT): [screenshot]

“ZİL.test” (CAPITAL I WITH DOT ON TOP): [screenshot]

 

Browser Behavior (Safari)

User Input: copy and paste into the URL address bar
Output URL Bar: record the label returned in the address bar

User Input    Output URL Bar
zil.test      http://zil.test/
zıl.test      http://zıl.test
ZİL.test*     http://zi̇l.test
ZIL.test      http://zil.test/

* Please note that the result above may not be displayed properly (due to Outlook email rendering). Please refer to the image below and the attached TXT file.

“ZİL.test” (CAPITAL I WITH DOT ON TOP): [screenshot]

 

I hope these results help the team evaluate the browsers better, and that the inline images display properly on your computer.

 

If not, please let me know and I can send them in a different format.

 

I wish you a happy COVID-free day…

 

Best regards,

 

Mert Saka

gTLD Accounts Manager 

ICANN – www.icann.org

 

 

From: Pitinan Kooarmornpatana <pitinan.koo@icann.org>
Date: Monday, May 18, 2020 at 22:04
To: Mert Saka <mert.saka@icann.org>
Cc: Sarmad Hussain <sarmad.hussain@icann.org>
Subject: FW: [Ext] Re: Re: [Latingp] Handling of down casing of I-dotted in Turkish locale

 

Hi Mert,

 

I trust this email finds you well.

 

The Latin GP is testing some behavior with the Turkish locale. I understand that ICANN laptops are set to the US locale (en_US), so I’m wondering if you could find a computer set to the Turkish locale to conduct the test below?

 

Please feel free to ping me on Slack if you have any questions.

 

Regards,

Pitinan

 

From: "Tan Tanaka, Dennis" <dtantanaka@verisign.com>
Date: Saturday, May 16, 2020 at 03:59
To: Pitinan Kooarmornpatana <pitinan.koo@icann.org>, "mats.dufberg@internetstiftelsen.se" <mats.dufberg@internetstiftelsen.se>, "Latingp@icann.org" <Latingp@icann.org>
Subject: [Ext] Re: Re: [Latingp] Handling of down casing of I-dotted in Turkish locale

 

Thanks Pitinan.

 

We would need to determine the IDNA behavior in a conversion tool and in the browser. To that end, could you ask your colleague to try the examples below? (He/she needs to copy and paste the input strings into the tool and the URL address bars, and record the output string for each test.) @Mats Dufberg: provided their machines are set up with the Turkish locale, are we missing any other test cases?

 

 

IDNA conversion: https://www.punycoder.com/

User Input: copy and paste into the “Text” box, then “Convert to Punycode”
Output: record the output string from the “Punycode” box

User Input    Output
zil
zıl
ZİL
ZIL

 

 

 

Browser Behavior (repeat for Chrome, Firefox and Safari/Edge)

User Input: copy and paste into the URL address bar
Output URL Bar: record the label returned in the address bar

User Input    Output URL Bar
zil.test
zıl.test
ZİL.test
ZIL.test

 

 

 

 

Thanks,

Dennis

 

 

 

From: Pitinan Kooarmornpatana <pitinan.koo@icann.org>
Date: Friday, May 15, 2020 at 4:36 PM
To: Mats Dufberg <mats.dufberg@internetstiftelsen.se>, Dennis Tan Tanaka <dtantanaka@verisign.com>, "Latingp@icann.org" <Latingp@icann.org>
Subject: [EXTERNAL] Re: [Latingp] Handling of down casing of I-dotted in Turkish locale

 

Dear all,

 

Please find attached the test case and the results.

1. test case file (test case.xlsx)

2. test result using the Turkish locale (testcase-turkishlocale.xlsx)

3. test result using the en_US locale (test case-en_US-locale.xlsx)

 

The test for the Turkish locale was done by someone in Turkey who uses the Turkish locale. The case-folding result seems to be stable in both directions (up and down). The test for en_US was done by me, and it cannot reproduce the original dotless i after a round trip of case folding. I understand that this is due to the absence of CAPITAL LETTER I WITH DOT ABOVE in the en_US locale.
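The round-trip loss described above can be illustrated with Python's built-in case conversions, which apply the default (locale-independent) Unicode mappings regardless of the machine's locale. This is only a sketch of the effect, not the procedure used in the attached spreadsheets.

```python
# Default Unicode case mappings (no Turkish tailoring).
dotless = "\u0131"        # LATIN SMALL LETTER DOTLESS I
upper = dotless.upper()   # maps to U+0049, LATIN CAPITAL LETTER I
back = upper.lower()      # maps to U+0069, LATIN SMALL LETTER I

print(f"U+{ord(dotless):04X} -> U+{ord(upper):04X} -> U+{ord(back):04X}")
print("round trip preserved:", back == dotless)
```

Without the Turkish tailoring, the uppercase I carries no information about whether it came from i or from dotless i, so lowercasing it again always gives the dotted form.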

 

Kindly let us know if there are any further queries.

 

Regards,

Pitinan

 

From: Latingp <latingp-bounces@icann.org> on behalf of Mats Dufberg <mats.dufberg@internetstiftelsen.se>
Date: Friday, May 15, 2020 at 05:13
To: "Tan Tanaka, Dennis" <dtantanaka@verisign.com>, "Latingp@icann.org" <Latingp@icann.org>
Subject: Re: [Latingp] Handling of down casing of I-dotted in Turkish locale

 

> If I understand this correctly, when we test the IDNA behavior of the uppercase string in a Turkish setting we should see the behavior described in the second part.

 

That is my interpretation too. I will try to test some generic tool with Turkish locale (tr_TR.UTF-8) set.

 

 

Mats

 

---

Mats Dufberg

mats.dufberg@internetstiftelsen.se

Technical Expert

Internetstiftelsen (The Swedish Internet Foundation)

Mobile: +46 73 065 3899

https://internetstiftelsen.se/

 

 

 

From: "Tan Tanaka, Dennis" <dtantanaka@verisign.com>
Date: Thursday, 14 May 2020 at 23:19
To: Mats Dufberg <mats.dufberg@internetstiftelsen.se>, ICANN Latin GP <Latingp@icann.org>
Subject: Re: [Latingp] Handling of down casing of I-dotted in Turkish locale

 

There are two sets of rules, one for non-Turkish locales

 

# Preserve canonical equivalence for I with dot. Turkic is handled below.

 

0130; 0069 0307; 0130; 0130; # LATIN CAPITAL LETTER I WITH DOT ABOVE

 

And another for Turkish

 

# Turkish and Azeri
 
# I and i-dotless; I-dot and i are case pairs in Turkish and Azeri
# The following rules handle those cases.
 
0130; 0069; 0130; 0130; tr; # LATIN CAPITAL LETTER I WITH DOT ABOVE
0130; 0069; 0130; 0130; az; # LATIN CAPITAL LETTER I WITH DOT ABOVE
 
# When lowercasing, remove dot_above in the sequence I + dot_above, which will turn into i.
# This matches the behavior of the canonically equivalent I-dot_above
 
0307; ; 0307; 0307; tr After_I; # COMBINING DOT ABOVE
0307; ; 0307; 0307; az After_I; # COMBINING DOT ABOVE
 
# When lowercasing, unless an I is before a dot_above, it turns into a dotless i.
 
0049; 0131; 0049; 0049; tr Not_Before_Dot; # LATIN CAPITAL LETTER I
0049; 0131; 0049; 0049; az Not_Before_Dot; # LATIN CAPITAL LETTER I
 
# When uppercasing, i turns into a dotted capital I
 
0069; 0069; 0130; 0130; tr; # LATIN SMALL LETTER I
0069; 0069; 0130; 0130; az; # LATIN SMALL LETTER I
 
# Note: the following case is already in the UnicodeData.txt file.
 
# 0131; 0131; 0049; 0049; tr; # LATIN SMALL LETTER DOTLESS I

 

If I understand this correctly, when we test the IDNA behavior of the uppercase string in a Turkish setting we should see the behavior described in the second part.
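To make the Turkish and Azeri rules above concrete, here is a simplified Python sketch of the tailored mappings. The function names turkish_lower and turkish_upper are hypothetical, and the After_I and Not_Before_Dot conditions are reduced to checking the immediately adjacent character, whereas SpecialCasing.txt defines them over sequences that may contain other combining marks.

```python
DOTTED_CAPITAL_I = "\u0130"   # LATIN CAPITAL LETTER I WITH DOT ABOVE
DOTLESS_SMALL_I = "\u0131"    # LATIN SMALL LETTER DOTLESS I
DOT_ABOVE = "\u0307"          # COMBINING DOT ABOVE


def turkish_lower(text: str) -> str:
    """Lowercase with the tr/az rules quoted above (simplified contexts)."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == DOTTED_CAPITAL_I:
            out.append("i")                  # 0130 -> 0069
        elif ch == "I":
            if i + 1 < len(text) and text[i + 1] == DOT_ABOVE:
                out.append("i")              # I + dot_above -> i
                i += 1                       # drop the dot (After_I rule)
            else:
                out.append(DOTLESS_SMALL_I)  # 0049 -> 0131 (Not_Before_Dot)
        else:
            out.append(ch.lower())
        i += 1
    return "".join(out)


def turkish_upper(text: str) -> str:
    """Uppercase with the tr/az rules: i -> 0130; 0131 -> I by default."""
    return "".join(DOTTED_CAPITAL_I if ch == "i" else ch.upper() for ch in text)


print(turkish_lower("Z\u0130L"), turkish_lower("ZIL"))
print(turkish_upper("zil"), turkish_upper("z\u0131l"))
```

By contrast, Python's default "Z\u0130L".lower() follows the non-Turkish rule and produces z, i, U+0307, l, which is the form behind the xn--zil-9dc outputs reported in this thread.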

 

-Dennis

 

From: Latingp <latingp-bounces@icann.org> on behalf of Mats Dufberg <mats.dufberg@internetstiftelsen.se>
Date: Thursday, May 14, 2020 at 4:27 PM
To: ICANN Latin GP <Latingp@icann.org>
Subject: [EXTERNAL] [Latingp] Handling of down casing of I-dotted in Turkish locale

 

The link below goes to the file in the Unicode Character Database that handles the special casing rules for i etc. in Turkish and Azeri. The relevant section is the last section of the file.

 

https://unicode.org/Public/UNIDATA/SpecialCasing.txt

 

 

 

---

Mats Dufberg

mats.dufberg@internetstiftelsen.se

Technical Expert

Internetstiftelsen (The Swedish Internet Foundation)

Mobile: +46 73 065 3899

https://internetstiftelsen.se/