Programming Language Hacks - UA103
Thanks for the comments on an earlier edition of UA103 – especially from Tex, Jim & Dennis. Please find a second version of the document – with some work still needed in the highlighted sections. https://docs.google.com/document/d/1Zahm4ZVS9lH9Zx8v-8PRhOnTV7fmEI8tStgLUBCk... Also attached as a PDF. Your comments would be most welcome indeed. Thanks. Don
Hi Don, I was just wondering why Arabic is considered different from Unicode@idn.idn in EAIs? Arabic characters are Unicode, and Arabic domain names are also IDNs. The only difference is the LTR versus RTL reading order, which is probably handled by the rendering part, but storing and validating are the same. Regards, Hazem Hezzah
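The point about storage and validation can be illustrated with a minimal sketch. It uses only the `punycode` codec from Python's standard library; the helper names `to_a_label` and `to_u_label` are hypothetical, and the sketch deliberately skips the IDNA2008 validity checks that a real implementation would need. It shows the same conversion being applied to a Latin label and an Arabic label, with the characters held in logical (typing) order rather than display order.

```python
def to_a_label(label: str) -> str:
    """Encode one Unicode label as an ASCII 'xn--' label (no validation)."""
    if label.isascii():
        return label  # already ASCII, nothing to encode
    return "xn--" + label.encode("punycode").decode("ascii")


def to_u_label(a_label: str) -> str:
    """Reverse of to_a_label for labels carrying the 'xn--' prefix."""
    if a_label.lower().startswith("xn--"):
        return a_label[4:].encode("ascii").decode("punycode")
    return a_label


for label in ["bücher", "مثال"]:
    encoded = to_a_label(label)
    # The round trip is the same code path whichever way the script is written.
    assert to_u_label(encoded) == label
    print(label, "->", encoded)
```

Nothing in the conversion looks at text direction, which is consistent with the observation that direction only becomes an issue at display time.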
This is a fair comment. We probably need to use the LTR and RTL nomenclature, since direction is the essence of those test cases. That also makes it generic for all RTL scripts, not only Arabic. -Dennis
I agree; good comment, and I appreciate Hazem taking the time to share it. Richard Merdinger VP, Domains - GoDaddy rmerdinger@godaddy.com
Actually, we recently discovered an Edge bug (via the browser review) where the order of the labels in an RTL.RTL.ASCII domain name was transposed during rendering. So I like calling it out explicitly.
The RTL/LTR combo, yes, but not specific to a script. Am I missing something? Richard Merdinger VP, Domains - GoDaddy rmerdinger@godaddy.com
You are right that Arabic is no more problematic than Hebrew, and that my specific mixed-script example isn’t actually in the document. Looking at the document again, I see that it might be construed that we are calling out Arabic as a uniquely problematic script. Perhaps we should edit the document to replace “Arabic.arabic@arabic” with “rtl.rtl@rtl”?
Hello All, do I understand correctly that only valid combinations should be used (not limited to the .TLD part)? If not, then it is not going to help (the code will pass combinations which cannot be used). Sincerely Yours, Maxim Alzoba Special projects manager, International Relations Department, FAITID m. +7 916 6761580(+whatsapp) skype oldfrogger Current UTC offset: +3.00 (.Moscow)
On Wed, Aug 09, 2017 at 04:13:35PM +0000, Mark Svancarek via UA-discuss wrote:
Actually, we recently discovered an Edge bug (via the browser review) where the order of labels in a RTL.RTL.ASCII domain name were transposed during rendering. So I like calling it out explicitly.
This has been a regularly recurring bug in various rendering engines since at least 2008: I recall the demonstrations of it during the idnabis WG, and then seeing it in a completely different context during the VIP work for ICANN in 2011 or '12. It's not always only Arabic: at least one of the examples was reproducible in any bidi context. I seem to recall one example where the wire order [firstlabel]RTL[secondlabel]RTL[thirdlabel]LTR[fourthlabel]NULL got rendered as RTL.LTR.RTL, which I thought was a pretty cool bug. I have no idea how it happened that way, though I recall walking myself through the bidi algorithm at the time and figuring out what the problem must have been. Bidi is hard. I therefore think it wise not to call out Arabic especially -- but maybe point out that Arabic is perhaps the most prominent writing system that uses RTL, so that programmers aren't tempted to dismiss the problem as a "corner case". Big corner, the Arabic-using population! Best regards, A -- Andrew Sullivan ajs@anvilwalrusden.com
Makes sense to me; I like mentioning the major-use writing system to make the point, but it also makes it clear that it is broader than a single case. --Rich Richard Merdinger VP, Domains - GoDaddy rmerdinger@godaddy.com
Yes, bidi is hard but fascinating. From my work with text stacks, my understanding is that the assumption that something that is rtl.rtl.ltr has a predetermined rendering order is incorrect. It really will depend upon what is seen as the first strongly typed character in the first domain name. The Arabic/Hebrew/N’Ko scripts all have an RTL script order within the RTL text direction for each language. Arabic and Hebrew both have commonly used characters (Unicode common) that the BiDi algorithm is required to treat as strongly typed LTR script order. Because of that, I doubt it’s enough to specify just the text direction for each element.
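The dependence on the first strongly typed character can be sketched as follows. This is a simplified reading of the paragraph-direction rules (P2 and P3) of the Unicode Bidirectional Algorithm: it ignores isolates, embeddings and everything a real rendering engine does afterwards, and the names `first_strong_direction` and `bidi_classes` are hypothetical. The second helper simply lists the Bidi_Class of each character so that one can check how digits and separators such as `.` and `@` are classified.

```python
import unicodedata

# Bidi_Class values that count as "strong" and the direction each implies.
STRONG = {"L": "ltr", "R": "rtl", "AL": "rtl"}


def first_strong_direction(text: str, default: str = "ltr") -> str:
    """Return the direction implied by the first strong character, if any."""
    for ch in text:
        direction = STRONG.get(unicodedata.bidirectional(ch))
        if direction:
            return direction
    return default  # no strong character found


def bidi_classes(text: str) -> list[tuple[str, str]]:
    """Pair every character with its Unicode Bidi_Class."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


print(first_strong_direction("مثال.example.com"))  # rtl
print(first_strong_direction("example.مثال"))      # ltr
print(bidi_classes("a1.@"))
```

The same two labels give a different base direction depending on which one comes first in the stored string, so a test description that only lists the direction of each element leaves the display order underdetermined.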
On Wed, Aug 09, 2017 at 08:46:27PM +0000, Stuart Stuple wrote:
From my work with text stacks, my understanding is that the assumption that something that is rtl.rtl.ltr has a predetermined rendering order is incorrect. It really will depend upon what is seen as the first strongly typed character in the first domain name. The Arabic/Hebrew/N’Ko scripts all have an RTL script order within the RTL text direction for each language. Arabic and Hebrew both have commonly used characters (Unicode common) that the BiDi algorithm is required to treat as strongly typed LTR script order. Because of that, I doubt it’s enough to specify just the text direction for each element.
Well, yes, but this is why IDNA2008 also has a bidi doc. You need to work through both to make it work. A -- Andrew Sullivan ajs@anvilwalrusden.com
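The bidi document for IDNA2008 is RFC 5893, which defines a per-label "Bidi Rule". A rough sketch of its six conditions, as summarized here from the RFC, is shown below; it should be checked against the RFC text before being relied on. The function name `satisfies_bidi_rule` is hypothetical, and the sketch covers only this one rule, not the rest of IDNA2008 validation. In the RFC the rule applies to the labels of domain names that contain at least one RTL label.

```python
import unicodedata

# Bidi_Class values permitted in each kind of label.
RTL_ALLOWED = {"R", "AL", "AN", "EN", "ES", "CS", "ET", "ON", "BN", "NSM"}
LTR_ALLOWED = {"L", "EN", "ES", "CS", "ET", "ON", "BN", "NSM"}


def satisfies_bidi_rule(label: str) -> bool:
    """Check one label against a sketch of the RFC 5893 Bidi Rule."""
    classes = [unicodedata.bidirectional(ch) for ch in label]
    # Trailing nonspacing marks are skipped when checking the last character.
    core = list(classes)
    while core and core[-1] == "NSM":
        core.pop()
    if not core:
        return False
    if classes[0] in ("R", "AL"):  # RTL label
        return (
            set(classes) <= RTL_ALLOWED
            and core[-1] in ("R", "AL", "EN", "AN")
            and not ("EN" in classes and "AN" in classes)
        )
    if classes[0] == "L":  # LTR label
        return set(classes) <= LTR_ALLOWED and core[-1] in ("L", "EN")
    return False  # a label must start with an L, R or AL character


for label in ["مثال", "example", "1abc", "אa"]:
    print(repr(label), satisfies_bidi_rule(label))
```

A check of this kind is one way to keep test data to combinations that could actually be registered, which also bears on the earlier question about using only valid combinations.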
Participants (8): Andrew Sullivan, Don Hollander, Hazem Hezzah, Mark Svancarek, Maxim Alzoba, Richard Merdinger, Stuart Stuple, Dennis Tan Tanaka