Re: [UA-EAI] FW: comments on uasg019b
Thanks for the comments. As an overall comment, I expect these slides to be used in a presentation by someone who is reasonably familiar with the technology and doesn't just read the slides aloud (aka Death by Powerpoint). The acronyms should all be in the notes if need be.
slide 12 MSA should be defined on previous slide. And I know it is ridiculously picky but it should be sending/receiving or sender/recipient MTAs.
Fixed to sender to match previous slide.
slide 14 the info about single and multiple scripts seems irrelevant.
It's come up, people ask why you can't just restrict identifiers to a single script.
The process of relating characters to values a computer can use is not mapping but character encoding.
Fixed.
slide 15 is correct that mapping relates characters to numbers. That is a different statement than the previous slide. (Those numbers are in turn encoded.)
Right, tightened up the text while I was there.
slide 17 should explain the U+hhhh nomenclature as a way of indicating the number assigned to a unicode character consisting of 4 to 6 hex digits appended to U+.
In the notes, but added it to the slide since people will doubtless see it.
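For anyone presenting this slide, a small Python sketch of the U+hhhh notation may help. The helper name `u_plus` is made up for illustration. It shows the number assigned to a character (the code point) next to one way of encoding that number as bytes (UTF-8), which is the distinction raised for slides 14 and 15.

```python
def u_plus(ch: str) -> str:
    """Return the U+hhhh label for one character (at least 4 hex digits)."""
    return f"U+{ord(ch):04X}"

# "\u00e9" is e with acute, "\u0444" is Cyrillic ef, "\U0001F600" is an emoji
for ch in ["A", "\u00e9", "\u0444", "\U0001F600"]:
    # ord() gives the code point; encode() gives one byte encoding of it
    print(ch, u_plus(ch), ch.encode("utf-8").hex(" "))
```

Characters outside the Basic Multilingual Plane need 5 or 6 hex digits, which is why the notation is described as 4 to 6 digits.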
slide 21. remove the comment that bidi text is often confusing to users. Seems pejorative.
It sure can when people deliberately mix scripts in addresses to confuse recipients.
slide 22 - I don't understand the intent of saying domain can contain any TLD. Any part of the domain can be Unicode. Maybe you mean any IDN.
Fixed to refer to IDNs.
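As a rough illustration of what an IDN looks like on the wire, here is a sketch using Python's built-in `idna` codec. Note the assumptions: that codec implements the older IDNA 2003 rules, the helper names are invented, and `пример.рф` is only an example name.

```python
def to_a_labels(domain: str) -> str:
    """Convert a Unicode domain to its ASCII (A-label) form, label by label."""
    return domain.encode("idna").decode("ascii")

def to_u_labels(domain: str) -> str:
    """Convert an A-label domain back to its Unicode form."""
    return domain.encode("ascii").decode("idna")

ascii_form = to_a_labels("пример.рф")
print(ascii_form)               # every non-ASCII label now starts with xn--
print(to_u_labels(ascii_form))  # round-trips to the Unicode form
```

Only the domain has an ASCII form like this. The local part of an EAI address has no equivalent conversion.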
23 - Is there a recommended practice for validating user parts?
Other than sending mail and seeing if anyone responds, no.
What should MUA do with a failed message? Anything different from other failures?
Nope.
24- fuzzy matching seems like a bad idea. Even case mapping is a bad idea when multiple languages are introduced, as the case rules can change, even for ascii characters.
An MTA that didn't do fuzzy matching would be unusable. Remember these are not IDNs, local parts are interpreted only by the recipient MTA and each MTA uses whatever fuzz rules make sense.
If fuzzy matching is to be recommended, the rules should be specific.
No. They can't be.
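To show what one site's fuzz rules might look like, here is a minimal sketch. It assumes the site chooses NFC normalization plus Unicode case folding; the names `mailbox_key`, `lookup` and the mailbox store are hypothetical. It is one local policy, not a rule from the EAI standards.

```python
import unicodedata

def mailbox_key(local_part: str) -> str:
    """Hypothetical lookup key: NFC-normalize, then case-fold."""
    return unicodedata.normalize("NFC", local_part).casefold()

# Made-up mailbox store, keyed by the normalized form
mailboxes = {mailbox_key("textexin"): "/var/mail/textexin"}

def lookup(local_part: str):
    """Return the mailbox path for a local part, or None if no match."""
    return mailboxes.get(mailbox_key(local_part))

print(lookup("TEXTEXIN"))
```

`casefold()` is language-independent (it always maps `I` to `i`, for example), so a site serving Turkish users might want different rules. That is the kind of reason the rules cannot be specified globally.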
what is the scenario where mail is sent from an mta to an mta that doesn't support EAI?
If this is going to an intermediate MTA wouldn't it be better to attempt to route to a different intermediate MTA?
It fails. In the SMTP world there's no such thing as an intermediate MTA.
If the MTA is the destination MTA, then why wouldn't it accept the email for the domain that it represents?
An MTA that supports EAI can accept EAI mail for all or some of its mailboxes depending on whether they expect the recipients to handle EAI. An MTA that doesn't support EAI accepts no EAI mail.
Perhaps, the scenario is I send an email to 2 people one uses an EAI, the other not. The non-EAI recipient rejects the mail because of the EAI address in the to-field, and the SMTPUTF8 commands, even if it isn't the local destination address?
Yes, that is one scenario.
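The two-recipient scenario can be sketched as a simple check. This is a simplification under stated assumptions: the function names are invented, the addresses are examples, and the EHLO keyword sets stand in for whatever the next-hop server advertises.

```python
def needs_smtputf8(envelope_addrs, header_values):
    """True if any envelope address or header value has non-ASCII text."""
    return any(not s.isascii() for s in [*envelope_addrs, *header_values])

def next_hop_accepts(ehlo_keywords, envelope_addrs, header_values):
    """An EAI message can only go to a server that advertises SMTPUTF8."""
    if not needs_smtputf8(envelope_addrs, header_values):
        return True
    return "SMTPUTF8" in {k.upper() for k in ehlo_keywords}

# Delivery to the ASCII recipient's MTA: the envelope is all ASCII,
# but the To: header still carries the other recipient's EAI address.
envelope = ["alice@example.com", "bob@example.net"]
headers = ["alice@example.com", "bob@example.net, 用户@例子.example"]
legacy = {"PIPELINING", "8BITMIME"}

print(next_hop_accepts(legacy, envelope, headers))
print(next_hop_accepts(legacy | {"SMTPUTF8"}, envelope, headers))
```

The first call fails even though the ASCII recipient's own address is plain ASCII, which is the case described above.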
27- avoiding easily confused characters seems too vague. We want to prevent "One" vs "0ne" (the letter vs the digit) but we don't want to eliminate the use of zeroes.
It has to be vague -- easily confused is ill defined. We've been living with addresses like operat0r@hotmail.com (try it) for decades but we can try not to make the situation worse.
Also, names that differ only in accents is too limiting. And by restricting accents, it implicitly makes the names ascii equivalents. If this is the recommendation then it undermines the benefit of EAI.
This also seems like a latin based viewpoint. What does this mean for languages that heavily use modifiers?
That's not what it says or what it means. The idea is that if you make bób and bøb different mailboxes, you will be sorry. Updated slightly.
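A provisioning-time check along these lines could look like the sketch below. Assumptions: the helper names are invented, and the check is Latin-oriented. It strips combining marks, so it is not suitable for scripts that rely on them, and it does not fold letters such as ø that have no Unicode decomposition; a site would need its own table for those.

```python
import unicodedata

def accent_skeleton(local_part: str) -> str:
    """Drop combining marks after NFD decomposition, then case-fold."""
    decomposed = unicodedata.normalize("NFD", local_part)
    stripped = "".join(c for c in decomposed if unicodedata.category(c) != "Mn")
    return unicodedata.normalize("NFC", stripped).casefold()

def conflicts(new_name: str, existing: list) -> list:
    """Existing mailboxes that differ from new_name only in accents or case."""
    key = accent_skeleton(new_name)
    return [name for name in existing if accent_skeleton(name) == key]

# "b\u00f3b" is b, o-acute, b: it collides with the existing "bob"
print(conflicts("b\u00f3b", ["bob", "alice"]))
```

The point is to warn an administrator before near-duplicate mailboxes are created, not to rewrite anyone's address.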
It would be better to insist on exact matches.
Trust me, that is a recipe for eternal hatred from your users and their correspondents. I am reasonably sure your MTA accepts TEXTEXIN@ as equivalent to textexin@.
Slide 29 Should discuss mailbox or address downgrading not message downgrading
Sorry, can you provide a reference for mailbox or address downgrading? They are not terms that are used anywhere in the EAI standards. Message downgrading is discussed in RFCs 6857 and 6858.
32 the summary talks about non-latin support, but EAI is about supporting languages that are latin-based too.
Perhaps refer to languages that require characters outside of ascii.
Fixed. Regards, John Levine, john.levine@standcore.com Standcore LLC
On 2 Aug 2018, at 5:41 PM, John Levine <john.levine@standcore.com> wrote:
slide 21. remove the comment that bidi text is often confusing to users. Seems pejorative.
It sure can when people deliberately mix scripts in addresses to confuse recipients.
But that's not what the slide says. Here's the bullet:
• Displaying bi-directional text is complex and often confusing to users.
Displaying bidi text is indeed complex and might be confusing to *developers* but properly displayed bidi text is not "often" confusing to *users*. Deliberately messed up bidi text is confusing, but how often are *users* seeing deliberately messed up text?
Paul
some further comments
slide 17, can we please either use characters for which there is a font or embed a font that supports these glyphs?
slide 21: • includes words from English or other scripts → English is not a script, either say "English or other languages" or "Latin or other scripts"
slide 23: "Display headings and prompts in the user's language": what are prompts? what are headings? is this supposed to say "headers"?
slide 34: Codespace - Range that define the lower → *defines
Code Points - A code point or code position is any of the numerical values that make up the code space. They are used to distinguish both, the → remove the comma after both and the comma after bits.
slide 35: .рф is an abbreviation for RF = Russian Federation. Not sure that saying .рф (Russia) is accurate.
IDN - Internationalized Domain Names. IDNs are domain names that include characters used in the local representation of languages → I'm not sure there is a need to say "the local representation". I would say "that include characters typically used to write the language". Note that one language can have multiple "local" representations.
slide 37: UA-ready Software or UA-Readiness - Universal Acceptance Ready Software. It is a software → software is uncountable; drop the article.
general: Vietnamese is a Latin-based language that many systems will gleefully choke on. I feel like it should get at least a passing mention here.
slide 17, can we please either use characters for which there is a font or embed a font that supports these glyphs?
Yeah, that was left over from the old version. Apparently those characters rendered on someone's PC but they don't work on my Mac. Can you suggest some exotic looking characters that will display reliably on the computers that are likely to run Powerpoint?
slide 21: • includes words from English or other scripts → English is not a script, either say "English or other languages" or "Latin or other scripts"
Fixed to Latin
slide 23: "Display headings and prompts in the user's language": what are prompts? what are headings? is this supposed to say "headers"?
This is the MUA. Headings like "Folders" on the list of folders or "Status" and "Date" on the list of messages. Prompts like "Compress folder now?" I don't think those are obscure terms. Whether to translate header field names like From: and Subject: when displaying messages is a religious issue that I would prefer to stay away from.
slide 34: Codespace - Range that define the lower → *defines
The term isn't used anywhere in the slides so I took it out.
Code Points - A code point or code position is any of the numerical values that make up the code space. They are used to distinguish both, the → remove the comma after both and the comma after bits.
Fixed.
slide 35: .рф is an abbreviation for RF = Russian Federation. Not sure that saying .рф (Russia) is accurate.
It's the same registry as .ru, I think we're fine.
IDN - Internationalized Domain Names. IDNs are domain names that include characters used in the local representation of languages → I’m not sure there is a need to say “the local representation”. I would say “that include characters typically used to write the language”. Note that one language can have multiple “local” representations.
slide 37: UA-ready Software or UA-Readiness - Universal Acceptance Ready Software. It is a software → software is uncountable; drop the article.
Unscrambled.
general: Vietnamese is a Latin-based language that many systems will gleefully choke on. I feel like it should get at least a passing mention here.
That seems kind of into the weeds here. The number of ways that software can screw up is unlimited. R's, John
Regards, John Levine, john.levine@standcore.com Standcore LLC
Here's a copy with the current edits. It's a PDF to avoid strangeness with different flavors of Powerpoint rendering stuff differently. https://drive.google.com/file/d/1SiWpheE_XF4VZSvEY42g36hvgi_-JWZy/view?usp=s... Regards, John Levine, john.levine@standcore.com Standcore LLC
On 3 Aug 2018, at 9:04 AM, John Levine <john.levine@standcore.com> wrote:
That seems kind of into the weeds here. The number of ways that software can screw up is unlimited.
R's, John
My point mainly was that we call out other scripts but we should mention that even in Latin things can be problematic.
I changed it to "can be confusing".
On Fri, 3 Aug 2018, Paul Borokhov wrote:
Displaying bidi text is indeed complex and might be confusing to *developers* but properly displayed bidi text is not "often" confusing to *users*. Deliberately messed up bidi text is confusing, but how often are *users* seeing deliberately messed up text?
Regards, John Levine, john.levine@standcore.com Standcore LLC
On Fri, 03 Aug 2018 23:53:05 +1000, John Levine <john.levine@standcore.com> wrote:
I changed it to can be confusing.
I think you want something like "can be used to confuse users". The sense is not that this is inherently more confusing (whether it is or not) but that there is a real situation where people deliberately confuse users, for malicious purposes... cheers
-- Chaals: Charles (McCathie) Nevile find more at https://yandex.com Using Opera's long-abandoned mail client: http://www.opera.com/mail/ Is there really still nothing better?
Yes, bidi can be confusing. It isn't to most people though, because it mostly is the normal way to show English words and Dotcom names in normal text. If you want to talk about risks, do so. If not, don't. Talking too much about risks risks making people think mostly about that. Risks are like boobs: people love to focus on them instead of the more important subject, whatever that subject may be. So be careful with mentioning them. Arnt PS: bit preoccupied at the moment, so not much mail. Sorry about any slow replies.
In article <kLPIrW9/iTLaImAdP1ajH1aBa8nhtc0tSh7dt1Itfrc=.sha-256@antelope.email> you write:
Yes, bidi can be confusing. It isn't to most people though, because it mostly is the normal way to show English words and Dotcom names in normal text.
If you want to talk about risks, do so. If not, don't.
Sheesh. It's one bullet in a 37 page deck. I said bidi "can be confusing" which is certainly true particularly if hostile parties are writing the text you're displaying, which is not unlikely if it's mail from a typical mail stream. Anything we noticed on the other 36 slides? R's, John
But can’t Cyrillic or Latin be confusing too by that metric? After all how many people can tell the difference between р and p?
Clearly I have breached some unwritten rule about bidi so I've removed all mention of it.
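The р versus p point is easy to demonstrate. The sketch below is deliberately crude: Python's standard library has no Unicode script property, so it guesses the script from the first word of each character's Unicode name. The helper names and the test string are made up for illustration.

```python
import unicodedata

def script_hint(ch: str) -> str:
    """Crude script guess: first word of the Unicode character name."""
    return unicodedata.name(ch, "UNKNOWN").split()[0]

def scripts_used(text: str) -> set:
    """Script hints for the letters in text (digits, punctuation ignored)."""
    return {script_hint(c) for c in text if c.isalpha()}

print(script_hint("p"), script_hint("\u0440"))  # Latin p, Cyrillic er
# Looks like "postmaster" but starts with Cyrillic er (U+0440)
print(scripts_used("\u0440ostmaster"))
```

A real implementation would use the script data published by Unicode rather than character names, but even this shows that two strings that look identical can mix scripts.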
Regards, John Levine, john.levine@standcore.com Standcore LLC
Here are my belated comments on ver0803
6 Quantity ambiguous in Title - "a global users"
20 Punycode in 3rd bullet ("xn----f38...") looks invalid
21 "Mixtures of LTR and RTL text in a single domain name is outside of the scope of this lesson."
24 "Fuzzy" should be mentioned in the glossary
27 The comments field provides a workable definition of "fuzzy", which could be used in slide 24 and glossary
29 Since we made an effort to establish "Downgrading with Aliasing" as a defined term last year, you might mention it in comments related to second bullet ("sometimes called downgrading, it's really just aliasing")
34 Downgrading could be mentioned here, even if it's to drive home the point that it is as much a fool's errand as the search for El Dorado
35 define fuzzy matching
-----Original Message-----
From: UA-EAI <ua-eai-bounces@icann.org> On Behalf Of John Levine
Sent: Sunday, August 5, 2018 18:00
To: Paul Borokhov <borokhov@apple.com>
Cc: ua-eai@icann.org
Subject: Re: [UA-EAI] updated on uasg019b
Clearly I have breached some unwritten rule about bidi so I've removed all mention of it.
Regards, John Levine, john.levine@standcore.com Standcore LLC
Thanks, updated. The punycode with the four hyphens is right, just rechecked it.
On Mon, 6 Aug 2018, Mark Svancarek (CELA) wrote:
Here are my belated comments on ver0803
Regards, John Levine, john.levine@standcore.com Standcore LLC
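On why an A-label can legitimately start with four hyphens: Punycode copies the ASCII characters of a label first, then a hyphen as delimiter, then the encoded non-ASCII part. If the only ASCII character in the label is itself a hyphen, the output starts with two hyphens, and the `xn--` prefix makes four. A minimal sketch, using Python's raw `punycode` codec with no IDNA validity checks; the helper name and the sample label are hypothetical, and this is not the label from the slide.

```python
def a_label(u_label: str) -> str:
    """Build an A-label by hand: ACE prefix plus raw Punycode."""
    return "xn--" + u_label.encode("punycode").decode("ascii")

# Hypothetical label whose only ASCII character is "-"
label = "\u00e9-\u00e9"
encoded = a_label(label)
print(encoded)

# Decoding the part after the prefix gives the original label back
print(encoded[4:].encode("ascii").decode("punycode") == label)
```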
On Fri, 3 Aug 2018, Paul Borokhov wrote:
Displaying bidi text is indeed complex and might be confusing to *developers* but properly displayed bidi text is not “often” confusing to *users*. Deliberately messed up bidi text is confusing, but how often are *users* seeing deliberately messed up text?
PS: all the time when malicious senders put it in their phishes Regards, John Levine, john.levine@standcore.com Standcore LLC
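For what it is worth, a mail client could flag the suspicious cases mechanically. The sketch below is an assumption about how one might do it, not something from the slides: it uses the bidi class from Python's `unicodedata` module to detect strings that mix strong left-to-right and right-to-left characters, or that contain explicit direction controls. The function names are invented.

```python
import unicodedata

# Bidi classes of explicit embedding, override and isolate controls
BIDI_CONTROLS = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}

def directions(text: str) -> set:
    """Strong directions present in text: 'LTR' and/or 'RTL'."""
    found = set()
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi == "L":
            found.add("LTR")
        elif bidi in ("R", "AL"):
            found.add("RTL")
    return found

def mixes_directions(text: str) -> bool:
    """True if text has both strong LTR and strong RTL characters."""
    return directions(text) == {"LTR", "RTL"}

def has_bidi_controls(text: str) -> bool:
    """True if text contains explicit direction control characters."""
    return any(unicodedata.bidirectional(ch) in BIDI_CONTROLS for ch in text)
```

Mixed directions are normal in ordinary text, so a check like this is only a hint for identifiers such as addresses, not a verdict.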
participants (5)
- Arnt Gulbrandsen
- Chaals Nevile
- John Levine
- Mark Svancarek (CELA)
- Paul Borokhov