Re: [UA-EAI] [Ext] EAI Working group Archives
<minor point> Preventing confusing UI is *possibly* (not definitely) a UASG non-goal, but it's certainly my concern, so it's not something I'd like to gloss over. I suspect that a typical non-user of Arabic would find Arabic script more desirable than boxes or punycode. That's something usability folks can test. Personally, if I were expecting an email from an Arabic-scripting colleague, it would make a big difference to me.

<major point> But my goal for EAI is not to enable the people I meet at ICANN, or similar globe-trotters, to exchange emails using unmatching scripts. I am interested in allowing people who use a single script to communicate with confidence to their friends and family and government and banks and insurance companies who use the same script. We have some research indicating that this probably improves usability, confidence and usage overall. In those examples, we are less likely to see these complex scenarios of multiple users replying-all in multiple scripts. Agreeing on a least-worst downgrading solution is important, but it's always going to be a hack and I think it's okay that it will be imperfect. The goal is for full adoption of the spec, at least within non-ASCII script zones. The target users are people who read only one non-ASCII script and who communicate exclusively, or almost exclusively, with others who use that same script.

/marksv

-----Original Message----- From: John C Klensin [mailto:klensin@jck.com] Sent: Wednesday, November 8, 2017 6:47 AM To: John R. Levine <johnl@iecc.com> Cc: Don Hollander <don.hollander@icann.org>; Joseph Yee <jyee@afilias.info>; HEALTH Yao <yaojk@cnnic.cn>; Mark Svancarek <marksv@microsoft.com>; Barry Leiba <barryleiba@computer.org> Subject: Re: [Ext] EAI Working group Archives

--On Wednesday, November 8, 2017 09:22 -0500 "John R. Levine" <johnl@iecc.com> wrote:
On Wed, 8 Nov 2017, John C Klensin wrote:
To illustrate another part of the issue with an example, at least one very well-known operating system makes a large distinction between languages the user identifies to the OS as familiar (and installs special IMEs and rendering software for them) and others which may not be handled as well or even treated as potentially hostile. So you are asking for good treatment of languages and scripts that are not identified as known by the user and that is somewhat contradictory to that OS's i18n design philosophy.
It seems to me that people typically configure their computer to handle the languages they understand. If my computer is set up to handle Chinese and Korean, and I get a message in Arabic, it doesn't matter if the Arabic is displayed poorly since I can't read it anyway.
I think that is part of the OS design assumption I referred to. I think it is problematic in some ways, but still quite rational for the reason you give. But "it doesn't matter if X is displayed badly" is inconsistent with Don's "the way the originator intended" formulation; my main point was that one needs to be quite careful about such formulations and their implications. For better or worse, "ok if an unfamiliar language is displayed poorly" also opens up some spoofing risks and other attack vectors that would present less opportunity if proper rendering could be assumed, but I suppose that is not a Universal Acceptance concern (see last week's slide). john
I suspect that a typical non-user of Arabic would find Arabic script more desirable than boxes or punycode. That's something usability folks can test. Personally, if I were expecting an email from an Arabic-scripting colleague, it would make a big difference to me.
In my (mostly non-Microsoft) experience, any system that handles Unicode will show Arabic as Arabic. What it won't do is to handle messy cases like combined roman and Arabic, nor will it have a usable input method. But it shouldn't show boxes. Regards, John Levine, johnl@iecc.com, Primary Perpetrator of "The Internet for Dummies", Please consider the environment before reading this e-mail. https://jl.ly PS: EAI mailboxes are not U-labels so punycode isn't relevant.
If the fonts aren't installed, you may see boxes. That's system-specific and user-specific and app-specific. I agree with you that it's pretty weak circa 2017. You may be right that boxes have been completely eliminated at this point, but I am not sure.

While true that local parts aren't U-LABELS, and that punycode isn't defined for use outside of U-LABELS, some email services do resort to using ACE-style conversion in the local part as a downgrading technique - I've received them. They mostly work because in actual practice there are very few local parts that start with the "xn--" prefix. I agree that they are very phishy and are undesirable.

-----Original Message----- From: John R. Levine [mailto:johnl@iecc.com] Sent: Wednesday, November 8, 2017 10:09 AM To: Mark Svancarek <marksv@microsoft.com> Cc: John C Klensin <klensin@jck.com>; Don Hollander <don.hollander@icann.org>; Joseph Yee <jyee@afilias.info>; HEALTH Yao <yaojk@cnnic.cn>; Barry Leiba <barryleiba@computer.org>; ua-eai@icann.org Subject: RE: [Ext] EAI Working group Archives
I suspect that a typical non-user of Arabic would find Arabic script more desirable than boxes or punycode. That's something usability folks can test. Personally, if I were expecting an email from an Arabic-scripting colleague, it would make a big difference to me.
In my (mostly non-Microsoft) experience, any system that handles Unicode will show Arabic as Arabic. What it won't do is to handle messy cases like combined roman and Arabic, nor will it have a usable input method. But it shouldn't show boxes. Regards, John Levine, johnl@iecc.com, Primary Perpetrator of "The Internet for Dummies", Please consider the environment before reading this e-mail. https://jl.ly PS: EAI mailboxes are not U-labels so punycode isn't relevant.
On Wed, 8 Nov 2017, Mark Svancarek wrote:
If the fonts aren't installed, you may see boxes. That's system-specific and user-specific and app-specific. I agree with you that it's pretty weak circa 2017. You may be right that boxes have been completely eliminated at this point, but I am not sure.
At some point, you just lose. If I send you Arabic text and you don't have an Arabic font, unless you plan to do something heroic like send it to a faraway rendering service that returns a png image, what are you going to do?
While true that local parts aren't U-LABELS, and that punycode isn't defined for use outside of U-LABELS, some email services do resort to using ACE-style conversion in the local part as a downgrading technique - I've received them.
That's on the sending end. On the receiving end I don't see any point in inventing faux punycode just for a differently ugly display.
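To make the "faux punycode" case concrete, here is a minimal Python sketch of how a receiver might flag a local part that looks like an ACE-style downgrade. This is only an illustration: the function name `inspect_local_part` and the returned keys are made up for this example, and no standard defines punycode for local parts, so a decoded value is a guess about what the sender meant, not something to deliver to or display as authoritative.

```python
def inspect_local_part(address):
    """Report whether the local part looks like an ACE-style downgrade.

    Hypothetical helper for illustration only; nothing in the EAI
    specifications defines "xn--" encoding for local parts.
    """
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("expected an address of the form local@domain")

    # A non-ASCII local part is a genuine EAI address (it needs SMTPUTF8).
    non_ascii_local = not local.isascii()

    # An ASCII local part starting with "xn--" merely *looks* like an A-label.
    looks_ace = local.lower().startswith("xn--")

    decoded = None
    if looks_ace:
        try:
            # Python's built-in "punycode" codec handles the part after "xn--".
            decoded = local[4:].encode("ascii").decode("punycode")
        except ValueError:
            # UnicodeError is a ValueError subclass; treat garbage as undecodable.
            decoded = None

    return {
        "local": local,
        "domain": domain,
        "non_ascii_local": non_ascii_local,
        "looks_ace": looks_ace,
        "decoded": decoded,
    }


if __name__ == "__main__":
    for addr in ("xn--bcher-kva@example.com", "alice@example.com"):
        print(inspect_local_part(addr))
```

A cautious receiver would probably use the `looks_ace` flag only to warn or to log, since an ordinary ASCII mailbox could legitimately start with "xn--".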
-----Original Message----- From: John R. Levine [mailto:johnl@iecc.com] Sent: Wednesday, November 8, 2017 10:09 AM To: Mark Svancarek <marksv@microsoft.com> Cc: John C Klensin <klensin@jck.com>; Don Hollander <don.hollander@icann.org>; Joseph Yee <jyee@afilias.info>; HEALTH Yao <yaojk@cnnic.cn>; Barry Leiba <barryleiba@computer.org>; ua-eai@icann.org Subject: RE: [Ext] EAI Working group Archives
I suspect that a typical non-user of Arabic would find Arabic script more desirable than boxes or punycode. That's something usability folks can test. Personally, if I were expecting an email from an Arabic-scripting colleague, it would make a big difference to me.
In my (mostly non-Microsoft) experience, any system that handles Unicode will show Arabic as Arabic. What it won't do is to handle messy cases like combined roman and Arabic, nor will it have a usable input method. But it shouldn't show boxes.
Regards, John Levine, johnl@iecc.com, Primary Perpetrator of "The Internet for Dummies", Please consider the environment before reading this e-mail. https://jl.ly
If the fonts aren't installed, you may see boxes. That's system-specific and user-specific and app-specific. I agree with you that it's pretty weak circa 2017. You may be right that boxes have been completely eliminated at this point, but I am not sure.
At some point, you just lose. If I send you Arabic text and you don't have an Arabic font, unless you plan to do something heroic like send it to a faraway rendering service that returns a png image, what are you going to do?
Of course. But the main point is that most modern systems are able to display a good deal of the commonly used scripts without extra downloads. My iPad, configured for dead-standard English, can still display Wiki pages in Russian and Greek and Hebrew and Arabic and Chinese and Japanese and Korean and.... If I want a nicer Chinese font, I can deal with that myself, but the basic support is there and I don't see boxes. I even get Mongolian and Thai. b
I'm happy to be wrong on this topic! -----Original Message----- From: barryleiba@gmail.com [mailto:barryleiba@gmail.com] On Behalf Of Barry Leiba Sent: Wednesday, November 8, 2017 10:41 AM To: John R. Levine <johnl@iecc.com> Cc: Mark Svancarek <marksv@microsoft.com>; John C Klensin <klensin@jck.com>; Don Hollander <don.hollander@icann.org>; Joseph Yee <jyee@afilias.info>; HEALTH Yao <yaojk@cnnic.cn>; ua-eai@icann.org Subject: Re: [Ext] EAI Working group Archives
If the fonts aren't installed, you may see boxes. That's system-specific and user-specific and app-specific. I agree with you that it's pretty weak circa 2017. You may be right that boxes have been completely eliminated at this point, but I am not sure.
At some point, you just lose. If I send you Arabic text and you don't have an Arabic font, unless you plan to do something heroic like send it to a faraway rendering service that returns a png image, what are you going to do?
Of course. But the main point is that most modern systems are able to display a good deal of the commonly used scripts without extra downloads. My iPad, configured for dead-standard English, can still display Wiki pages in Russian and Greek and Hebrew and Arabic and Chinese and Japanese and Korean and.... If I want a nicer Chinese font, I can deal with that myself, but the basic support is there and I don't see boxes. I even get Mongolian and Thai. b
Of course. But the main point is that most modern systems are able to display a good deal of the commonly used scripts without extra downloads. My iPad, configured for dead-standard English, can still display Wiki pages in Russian and Greek and Hebrew and Arabic and Chinese and Japanese and Korean and.... If I want a nicer Chinese font, I can deal with that myself, but the basic support is there and I don't see boxes. I even get Mongolian and Thai.
I use alpine in the iTerm2 terminal emulator on a Mac, and it displays all sorts of Unicode just fine. If it can do it, anything can. R's, John
--On Wednesday, November 8, 2017 22:41 +0400 Barry Leiba <barryleiba@computer.org> wrote:
At some point, you just lose. If I send you Arabic text and you don't have an Arabic font, unless you plan to do something heroic like send it to a faraway rendering service that returns a png image, what are you going to do?
Of course. But the main point is that most modern systems are able to display a good deal of the commonly used scripts without extra downloads. My iPad, configured for dead-standard English, can still display Wiki pages in Russian and Greek and Hebrew and Arabic and Chinese and Japanese and Korean and.... If I want a nicer Chinese font, I can deal with that myself, but the basic support is there and I don't see boxes. I even get Mongolian and Thai.
Right... about this and several prior comments, however, three observations:

(1) A system can have (and be able to display) Arabic fonts without having the slightest clue about rendering, either at the bidi level (which is needed as soon as, e.g., digits appear in the text, not just for mixed-script or embedded script text) or with the much greater complexity (and interactions with other assumptions) implied by proposed UTR#53. Personally, I agree that it is still much better than boxes: not only is some clue about what I'm looking at useful (e.g., I can rather easily tell Arabic script from Devanagari script even though I cannot read either) but any given box is almost guaranteed to be confusable with any other box.

(2) My main concern in raising this set of issues is to illustrate that it is very important that the Universal Acceptance effort be extremely clear about what it is talking about and what its expectations are, especially given my belief (and, apparently, given other comments above about desires and preferences, the beliefs of several others) that "have it appear the way the originator intended" is not a realistic goal.

(3) Mark wrote, in part.... "...to safely allow the creation of #1, which has historically been a concern; John L could probably do it in his sleep" With no disrespect to John, this actually illustrates part of the problem I was trying to identify. Certainly I could do it in my sleep (and, if they are familiar enough with some of the subtle details of SMTP, Joseph or Jiankang probably could too). But what you would be getting is my/our interpretation of what the documents say based on our understanding and recollections of the working group's discussions and intent, not just what the documents say. That is good from some points of view and bad from others. By contrast, John L was not very active in the EAI work during most of its duration. If he did it, in his sleep or otherwise, you would probably get something closer (if there is a difference) to what the documents actually say. That is an advantage in some respects, but note that there are outstanding errata against some of those documents, identifying places where they were not sufficiently clear. Barry was more active during that work, but, again, differently involved. Were he to do a set of tests, I would expect them to fall somewhere between John L's set and my set, whatever "somewhere between" means.

Things are complicated somewhat by the observation that SMTP, and to a slightly lesser extent, the EAI/SMTPUTF8 specs were very much influenced by, and written with knowledge of, what is often referred to as the Postel Principle. That makes experience and judgment much more important than simply reading specifications as a Protocol Lawyer might (not that John, Joseph, Jiankang, Barry, or I would do that).

So, if John L does a test suite, you get his interpretation of what the standards say and what is important. If I were to do it, you would get my interpretation. Because our experiences, as well as our exposure to the development process for the protocols, are different, you would almost certainly get different tests, testing different things. That may be fine at least as long as you make it clear that what is being tested is UA's expectations or a set of expectations UA is signing off on, not tests of compliance to the standard.

john
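As an aside on observation (1): the remark that bidi handling is needed as soon as digits appear can be seen from the Unicode character properties alone. The Python sketch below uses the standard `unicodedata` module to list bidirectional classes; the helper names `bidi_classes` and `mixes_directions` are invented for this example, and the check is a rough heuristic, not an implementation of the Unicode Bidirectional Algorithm (UAX #9).

```python
import unicodedata

# Classes that run right-to-left, and classes that do not.
RTL_CLASSES = {"R", "AL"}            # Hebrew-style and Arabic letters
NON_RTL_CLASSES = {"L", "EN", "AN"}  # Latin letters, European and Arabic numbers


def bidi_classes(text):
    """Return (character, bidi class) pairs, skipping whitespace."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text if not ch.isspace()]


def mixes_directions(text):
    """Heuristic: True when RTL letters share a string with LTR text or digits.

    Such a string cannot be laid out correctly by simply drawing its
    glyphs in one direction, even if an Arabic font is installed.
    """
    classes = {cls for _, cls in bidi_classes(text)}
    return bool(classes & RTL_CLASSES) and bool(classes & NON_RTL_CLASSES)


if __name__ == "__main__":
    # Escapes are used so this source file itself needs no bidi rendering.
    arabic_only = "\u0627\u0644\u0633\u0639\u0631"      # an Arabic word
    arabic_with_digits = arabic_only + " 25"            # same word plus "25"
    for sample in (arabic_only, arabic_with_digits, "price 25"):
        print(mixes_directions(sample), bidi_classes(sample))
```

Even the single-script string with two ASCII digits mixes classes (AL and EN), which is the situation the observation describes.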
If you mean that different writers might prioritize different use cases, I agree. If you mean that different writers might consider certain robustness scenarios differently (e.g. whether one should accept punycode in the local part, where it is undefined), I can understand that, too. But I assume there is a core protocol defined in the documents which we can all agree unequivocally must be supported in a specific way... is that not the case? If that is the case, I would write tests which exercise that functionality explicitly and cover the robustness cases with caveats and warnings. -----Original Message----- From: John C Klensin [mailto:klensin@jck.com] Sent: Wednesday, November 8, 2017 12:00 PM To: Barry Leiba <barryleiba@computer.org>; John R. Levine <johnl@iecc.com> Cc: Mark Svancarek <marksv@microsoft.com>; Don Hollander <don.hollander@icann.org>; Joseph Yee <jyee@afilias.info>; HEALTH Yao <yaojk@cnnic.cn>; ua-eai@icann.org Subject: Re: [Ext] EAI Working group Archives --On Wednesday, November 8, 2017 22:41 +0400 Barry Leiba <barryleiba@computer.org> wrote:
At some point, you just lose. If I send you Arabic text and you don't have an Arabic font, unless you plan to do something heroic like send it to a faraway rendering service that returns a png image, what are you going to do?
Of course. But the main point is that most modern systems are able to display a good deal of the commonly used scripts without extra downloads. My iPad, configured for dead-standard English, can still display Wiki pages in Russian and Greek and Hebrew and Arabic and Chinese and Japanese and Korean and.... If I want a nicer Chinese font, I can deal with that myself, but the basic support is there and I don't see boxes. I even get Mongolian and Thai.
Right... about this and several prior comments, however, three observations:

(1) A system can have (and be able to display) Arabic fonts without having the slightest clue about rendering, either at the bidi level (which is needed as soon as, e.g., digits appear in the text, not just for mixed-script or embedded script text) or with the much greater complexity (and interactions with other assumptions) implied by proposed UTR#53. Personally, I agree that it is still much better than boxes: not only is some clue about what I'm looking at useful (e.g., I can rather easily tell Arabic script from Devanagari script even though I cannot read either) but any given box is almost guaranteed to be confusable with any other box.

(2) My main concern in raising this set of issues is to illustrate that it is very important that the Universal Acceptance effort be extremely clear about what it is talking about and what its expectations are, especially given my belief (and, apparently, given other comments above about desires and preferences, the beliefs of several others) that "have it appear the way the originator intended" is not a realistic goal.

(3) Mark wrote, in part.... "...to safely allow the creation of #1, which has historically been a concern; John L could probably do it in his sleep" With no disrespect to John, this actually illustrates part of the problem I was trying to identify. Certainly I could do it in my sleep (and, if they are familiar enough with some of the subtle details of SMTP, Joseph or Jiankang probably could too). But what you would be getting is my/our interpretation of what the documents say based on our understanding and recollections of the working group's discussions and intent, not just what the documents say. That is good from some points of view and bad from others. By contrast, John L was not very active in the EAI work during most of its duration. If he did it, in his sleep or otherwise, you would probably get something closer (if there is a difference) to what the documents actually say. That is an advantage in some respects, but note that there are outstanding errata against some of those documents, identifying places where they were not sufficiently clear. Barry was more active during that work, but, again, differently involved. Were he to do a set of tests, I would expect them to fall somewhere between John L's set and my set, whatever "somewhere between" means.

Things are complicated somewhat by the observation that SMTP, and to a slightly lesser extent, the EAI/SMTPUTF8 specs were very much influenced by, and written with knowledge of, what is often referred to as the Postel Principle. That makes experience and judgment much more important than simply reading specifications as a Protocol Lawyer might (not that John, Joseph, Jiankang, Barry, or I would do that).

So, if John L does a test suite, you get his interpretation of what the standards say and what is important. If I were to do it, you would get my interpretation. Because our experiences, as well as our exposure to the development process for the protocols, are different, you would almost certainly get different tests, testing different things. That may be fine at least as long as you make it clear that what is being tested is UA's expectations or a set of expectations UA is signing off on, not tests of compliance to the standard.

john
But my goal for EAI is not to enable the people I meet at ICANN, or similar globe-trotters, to exchange emails using unmatching scripts. I am interested in allowing people who use a single script to communicate with confidence to their friends and family and government and banks and insurance companies who use the same script.
Yes. This is why we abandoned attempts to do general downgrades, once it became clear how much of a mess it was. Downgrades improve many use cases, but they're not necessary for the primary use case of allowing, say, Arabs to communicate with other Arabs in Arabic, using email addresses in Arabic. Barry
+1 -----Original Message----- From: barryleiba@gmail.com [mailto:barryleiba@gmail.com] On Behalf Of Barry Leiba Sent: Wednesday, November 8, 2017 10:36 AM To: Mark Svancarek <marksv@microsoft.com> Cc: John C Klensin <klensin@jck.com>; John R. Levine <johnl@iecc.com>; Don Hollander <don.hollander@icann.org>; Joseph Yee <jyee@afilias.info>; HEALTH Yao <yaojk@cnnic.cn>; ua-eai@icann.org Subject: Re: [Ext] EAI Working group Archives
But my goal for EAI is not to enable the people I meet at ICANN, or similar globe-trotters, to exchange emails using unmatching scripts. I am interested in allowing people who use a single script to communicate with confidence to their friends and family and government and banks and insurance companies who use the same script.
Yes. This is why we abandoned attempts to do general downgrades, once it became clear how much of a mess it was. Downgrades improve many use cases, but they're not necessary for the primary use case of allowing, say, Arabs to communicate with other Arabs in Arabic, using email addresses in Arabic. Barry
participants (4)
- Barry Leiba
- John C Klensin
- John R. Levine
- Mark Svancarek