Re: [UA-EAI] [UA-discuss] Issue needs discussion and closure
Here is my reasoning: We forbid script mixing in the root for well-understood reasons. A few exceptions were carved out, though. The same thought process could be applied to second level and below. It's not an obligation, but the benefits to the user are the same and I think it is safe to say that it's a good practice to apply those same restrictions and exemptions to any label in a domain name. The local part is even less restricted than the 2LD. But again, I think the same benefits to the users apply. Given the perceived benefits, is there any concern about defining a good practice on creation of local parts by a mail service provider? (It would be written more clearly than below...) -----Original Message----- From: Tan Tanaka, Dennis [mailto:dtantanaka@verisign.com] Sent: Sunday, March 11, 2018 12:49 PM To: Mark Svancarek <marksv@microsoft.com> Cc: Ajay Data <ajay@data.in>; ua-discuss@icann.org Subject: Re: [UA-discuss] Issue needs discussion and closure
On Mar 11, 2018, at 9:00 AM, Mark Svancarek via UA-discuss <ua-discuss@icann.org> wrote:
We should recognize that the local part rules are very permissive and therefore this should be an ALLOWED case per the spec. But I vote that UASG declare it as a NOT RECOMMENDED case EXCEPT for script combinations which are already allowed to be mixed in the root zone.
I would stop at the first part and add that each mail admin set its own rules as far as mailbox names. The second part is troublesome as it mixes mailbox names with the (dns) root zone. I don't see the need for a connection. Am I missing something? -Dennis
On Sun, Mar 11, 2018 at 06:43:34PM +0000, Mark Svancarek via UA-discuss wrote:
Here is my reasoning:
We forbid script mixing in the root for well-understood reasons.
But we don't forbid script mixing in the root. The LGR effort is busily trying to set the correct rules for this, but the root already has more than one script, and there are definitely potential labels corresponding to Japanese words that require script mixing. The rules are actually aimed at prohibiting mixing of _writing systems_, which in the last go round was approximated as "script".
The same thought process could be applied to second level and below.
But this gets harder the lower in the tree you go, because there is no authority to enforce it. Moreover, things that would be a very bad idea for the root, such as (say) Egyptian hieroglyphs, would be just fine at other layers of the DNS. And there's the problem of different scripts in different labels, which is a permanent and unresolvable problem because of the nature of the DNS.
It's not an obligation, but the benefits to the user are the same and I think it is safe to say that it's a good practice to apply those same restrictions and exemptions to any label in a domain name.
What is certainly safe to say is that you should not create identifiers where you don't understand what the implications are. Best regards, A -- Andrew Sullivan ajs@anvilwalrusden.com
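The distinction between mixing scripts and mixing writing systems can be illustrated with a minimal sketch. It assumes that the first word of a character's Unicode name is a usable approximation of its script (the Python standard library does not expose the Script property itself), and the allow-list is purely hypothetical, standing in for whatever exceptions a registry or mail provider might define.

```python
import unicodedata

def scripts_in(label: str) -> set:
    """Approximate the set of scripts used by the letters in a label.

    Assumption: the first word of the Unicode character name
    (e.g. "LATIN", "CYRILLIC", "HIRAGANA", "CJK") stands in for the script.
    """
    found = set()
    for ch in label:
        if not ch.isalpha():
            continue  # ignore digits, punctuation and separators
        found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found

# Hypothetical allow-list: script combinations treated as one writing system.
ALLOWED_MIXES = [{"CJK", "HIRAGANA", "KATAKANA"}]

def mixing_is_acceptable(label: str) -> bool:
    """Accept single-script labels and labels covered by an allowed mix."""
    found = scripts_in(label)
    if len(found) <= 1:
        return True
    return any(found <= allowed for allowed in ALLOWED_MIXES)
```

Under these assumptions, a Japanese word written with kanji and hiragana such as 食べる passes, while a label combining Latin and Cyrillic letters does not. A real policy would need proper Script property data and a reviewed exception list rather than this name-based heuristic.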
Discussing with Dennis, we wonder if M3AAWG already has a recommendation on this topic. If so, we should adopt theirs. -----Original Message----- From: UA-discuss [mailto:ua-discuss-bounces@icann.org] On Behalf Of Andrew Sullivan Sent: Sunday, March 11, 2018 2:59 PM To: ua-discuss@icann.org; ua-eai@icann.org Subject: Re: [UA-discuss] Issue needs discussion and closure
Mark Svancarek via UA-EAI writes:
Discussing with Dennis, we wonder if M3AAWG already has a recommendation on this topic. If so, we should adopt theirs.
Well, there already is a source of truth, you can see it if you look down: Your keyboard.

I don't know where each of you are in the world, but the keyboard, whatever it is, provides strong guidance (even if not an absolute rule) on what sort of identifiers you and your correspondents can use.

I know you (Mark/Dennis) thought about it more generally. Anchoring the question to the keyboard can help segment the general question into usefully concrete ones, though. The rule (or each part-rule, if there are several) can be stricter than the keyboard, can map the keyboarding ability exactly, or can be laxer.

1. Should there be any sort of general rule that restricts identifiers more than keyboards do? (Some such rules do exist, e.g. "can't have @ in a localpart" or "can't practically have space in a localpart".) IMO we cannot and should not expect anything on people's keyboards today to be useless or harmful, therefore we should not add rules to block anything.

2. Or one that describes the keyboard exactly, so that email providers can enforce that people can enter only what their keyboards permit, and nothing else? Personally I don't see the point. A lot of tedious work and what will it achieve? Please don't answer "block confusing glyphs", because а@gamil.com and a@gmail.com may be confusable, but gamil and gmail aren't going to coordinate anyway.

3. Or be laxer? For example: "Software and sites SHOULD support all of the following code points in localparts: τυφχψωϊϋόύώϙϛϝϟϡϸ… Sites SHOULD allow users to include any code points from that list, if they permit user choice at all." I don't see the point either. It's the kind of rule people ignore.

Arnt
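The confusable pair in point 2 can be made concrete with a small sketch. It assumes the first address in that example starts with the Cyrillic letter а (U+0430) rather than the Latin a (U+0061); the helper name `describe` is made up for illustration.

```python
import unicodedata

def describe(text: str) -> list:
    """Return (character, code point, Unicode name) for each character."""
    return [(ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")) for ch in text]

latin_a = "a"          # U+0061
cyrillic_a = "\u0430"  # usually rendered identically to the Latin letter

# The two letters look alike but are distinct code points, and
# compatibility normalization (NFKC) does not fold one into the other.
same_string = latin_a == cyrillic_a
same_after_nfkc = (
    unicodedata.normalize("NFKC", latin_a)
    == unicodedata.normalize("NFKC", cyrillic_a)
)
```

Calling `describe` on each letter shows the different names, and both comparisons come out false, so the two local parts are different mailboxes as far as software is concerned even though a reader cannot tell them apart.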
On 12 March 2018 at 14:02, Arnt Gulbrandsen <arnt@gulbrandsen.priv.no> wrote:
Mark Svancarek via UA-EAI writes:
> > Discussing with Dennis, we wonder if M3AAWG already has a recommendation on this topic. If so, we should adopt theirs.
> Well, there already is a source of truth, you can see it if you look down: Your keyboard.
> I don't know where each of you are in the world, but the keyboard, whatever it is, provides strong guidance (even if not an absolute rule) on what sort of identifiers you and your correspondents can use.
Unless I did not understand what you meant, I don't think this could work:
- you, your keyboard and your mail system can receive email from someone who is using a completely different keyboard and script, and the fact that your keyboard cannot type that email address in no way prevents you from just clicking "reply" and continuing the correspondence, as long as your mail system supports EAI;
- in any case, nowadays you have key combinations and other instruments that allow you to type whatever character on whatever keyboard; it may be easier or harder, but you can do it if you need and know how;
- also, people may use different devices with different keyboards, so the people <=> keyboard biunivocal mapping does not work;
- even if you just wanted to restrict which characters you can use when you connect to a free webmail platform and create a new email address, nothing would prevent the user from connecting again with a different keyboard;
- all in all, it would be very weird for me to think that I can only exchange email with people who are using an Italian keyboard/email address; just in my company we have a couple dozen nationalities and different keyboards, we would have to shut down the company;
- and finally, I'm not sure that a server has a way to know 100% securely and reliably which keyboard the user is typing on.
Also, re your point 3, I would be wary of any central authority telling people in country X which characters from their script can or cannot be used in email addresses. If there is a need to prevent confusion, you may introduce specific technical rules (or maybe best practices) forbidding as few things as possible, but nothing more than that.
Also, as you point out with your gmail/gamil example, most confusion/phishing in the West actually happens with Western-only characters in domains and addresses, and no one cares (or better, this is being addressed by other means, e.g. content scanning, blacklisting etc).
I just received a "Paypal" email from "donot@repaly.com", but no one is asking to outlaw "repaly.com", so I'm not sure why there is all this desire to prevent people in non-Latin-script countries, or even those of them who live in Latin-script countries, from using all the characters they want. Regards, -- Vittorio Bertola | Head of Policy & Innovation, Open-Xchange vittorio.bertola@open-xchange.com Office @ Via Treviso 12, 10144 Torino, Italy
The thing you're forgetting is that EAI is for people who aren't friends with a, b and c. I have a runic address (just for fun). If we pretend that runes are the only alphabet I know well, then yes, I could receive email from you. That part is easy: My computer can display any alphabet you can type. But what about the rest? Even if you could enter my address without being able to type runes, what would you do? I could receive mail from you, but how can you send it and what can you send? There are roughly a hundred million internet users in West Bengal and Bangladesh who can read and type Bangla, but don't know the Latin alphabet very well. EAI is for letting them send email among themselves. The keyboard they have expresses what they can do, in a rather harsh manner. Any time you start thinking about how to let Bangla people send email using runes, you've lost contact with reality and the problems you're solving aren't IMO interesting any more. Arnt
Agree that we must always remind folks that EAI is (mainly) not for ICANN people who travel the world and are multilingual and know other multilingual people. EAI is (mainly) for people who speak only one language, read one language in a single writing system, and communicate only with people who speak/read/communicate with similar people. -----Original Message----- From: UA-EAI [mailto:ua-eai-bounces@icann.org] On Behalf Of Arnt Gulbrandsen Sent: Monday, March 12, 2018 12:06 To: Vittorio Bertola <vittorio.bertola@open-xchange.com> Cc: ua-discuss@icann.org; ua-eai@icann.org Subject: Re: [UA-EAI] [UA-discuss] Issue needs discussion and closure
On 12 March 2018 at 18:50, Mark Svancarek <marksv@microsoft.com> wrote: Agree that we must always remind folks that EAI is (mainly) not for ICANN people who travel the world and are multilingual and know other multilingual people. EAI is (mainly) for people who speak only one language, read one language in a single writing system, and communicate only with people who speak/read/communicate with similar people.
Well, why? I mean, once EAI is widely supported, why should Indian or Japanese or Russian people still use ASCII email addresses at all? Of course people mostly email friends and local companies in their same script and language, which they'd do with an email address in their own script, but then, why would they need a separate ASCII email address for the rare cases when they have to email someone in a different part of the planet? In the end, the email address is a label and does not necessarily have semantic value (even today, there are people with addresses like j43@yrwx.com) - and if it has, it may even be false or misleading (paypal@example.com is not likely to be Paypal). You just use it and pass it on as it is - you may not even need to type it once, if you receive it by electronic means, including an incoming email. So what difference does it make if it is in a script that you can or cannot read? Regards, -- Vittorio Bertola | Head of Policy & Innovation, Open-Xchange vittorio.bertola@open-xchange.com Office @ Via Treviso 12, 10144 Torino, Italy
Vittorio Bertola writes:
Well, why? I mean, once EAI is widely supported, why should Indian or Japanese or Russian people still use ASCII email addresses at all?
You appear to take it as given that EAI becomes widely supported. That's not exactly certain at present. IMO, if it does become widely supported, then we'll have learnt enough to answer that question, and the one below. But I'll offer an anecdotal comment which, perhaps, hints at an answer:
... So what difference does it make if it is in a script that you can or cannot read?
JFYI, I am supposed to have an Indian address, but it's broken right now, due to someone's cut and paste error. Such mistakes happen easily when you can't read or write the letters you're trying to use. Arnt
Cut and paste errors are a really annoying problem with RTL scripts. So easy to make. -----Original Message----- From: Arnt Gulbrandsen [mailto:arnt@gulbrandsen.priv.no] Sent: Tuesday, March 13, 2018 08:04 To: Vittorio Bertola <vittorio.bertola@open-xchange.com> Cc: Mark Svancarek <marksv@microsoft.com>; ua-eai@icann.org; ua-discuss@icann.org Subject: Re: [UA-EAI] [UA-discuss] Issue needs discussion and closure
Please let me clarify - SMTPUTF8 mail is for everyone. All systems should use the up to date standards. But actually there will always be friction when attempting to use languages and scripts with someone who doesn't know that language or script. Today, that friction is felt by users of non-latinate languages, and they just have to endure it. In the future the friction will be limited to folks like you and me on those occasions when we share contact info with international people who we meet on business and at conferences. -----Original Message----- From: Vittorio Bertola [mailto:vittorio.bertola@open-xchange.com] Sent: Tuesday, March 13, 2018 07:17 To: Mark Svancarek <marksv@microsoft.com>; Arnt Gulbrandsen <arnt@gulbrandsen.priv.no> Cc: ua-eai@icann.org; ua-discuss@icann.org Subject: RE: [UA-EAI] [UA-discuss] Issue needs discussion and closure
Actually, my “keyboard” is pretty flexible. To your other point about mainly-ASCII phishing, I am happy to consider I may be overthinking things by looking for recommended good practices for this topic. [cid:image001.jpg@01D3BA08.B8323840] From: UA-discuss [mailto:ua-discuss-bounces@icann.org] On Behalf Of Vittorio Bertola Sent: Monday, March 12, 2018 09:59 To: ua-eai@icann.org; Arnt Gulbrandsen <arnt@gulbrandsen.priv.no>; ua-discuss@icann.org Subject: Re: [UA-discuss] [UA-EAI] Issue needs discussion and closure
Mark Svancarek writes:
Actually, my “keyboard” is pretty flexible.
That's your excellent OS support, not the keyboard you use. Your keycaps say "A", "S", "D" and so on, right? I see Hindi traditional on your list, is there a keycap that says आ? EAI is for the people whose knowledge of A-Z is too shaky to be comfortable emailing using A-Z, but who can read and write some other writing system, and who have a computer so they want to email. For these people, the keyboard they have is a decent proxy. They chose that keyboard because it reflects their knowledge. It isn't necessarily a hard and fast border of their skill, but it's a pretty good working description.
To your other point about mainly-ASCII phishing, I am happy to consider I may be overthinking things by looking for recommended good practices for this topic.
Phishing's a big threat nowadays. It's worth keeping in mind that exploiting similar-looking glyphs isn't a big part of that evil industry. It was a clever, novel attack once, but hasn't grown into much more in the decade that's passed since it was invented. Phishers don't need Cyrillic a to impersonate Chase, they just register chase-account-security-team.com and fool enough people. So I don't think that warrants much attention. More than anything else, that issue is an attention magnet for nerds like us. We do like clever toys and hacks like that, don't we ;) Arnt
Yeah, I am prone to overthinking stuff like this 😝 -- More than anything else, that issue is an attention magnet for nerds like us. We do like clever toys and hacks like that, don't we ;) -----Original Message----- From: Arnt Gulbrandsen [mailto:arnt@gulbrandsen.priv.no] Sent: Monday, March 12, 2018 14:24 To: Mark Svancarek <marksv@microsoft.com> Cc: Vittorio Bertola <vittorio.bertola@open-xchange.com>; ua-eai@icann.org; ua-discuss@icann.org Subject: Re: [UA-discuss] [UA-EAI] Issue needs discussion and closure
Hello Mark, others, On 2018/03/12 03:43, Mark Svancarek via UA-EAI wrote:
Here is my reasoning:
We forbid script mixing in the root for well-understood reasons. A few exceptions were carved out, though. The same thought process could be applied to second level and below. It's not an obligation, but the benefits to the user are the same and I think it is safe to say that it's a good practice to apply those same restrictions and exemptions to any label in a domain name. The local part is even less restricted than the 2LD. But again, I think the same benefits to the users apply.
For local parts, I think it depends a lot on the size of the 'operation'. The problem is easy to handle on a case-by-case basis for e.g. a university lab that's handing out email addresses to its members. Script mixtures may not be a problem at all, because it's easy to avoid conflicts and confusions. On the other end of the spectrum (big web mail providers and such), strict rules will have the benefit that they eliminate a lot of problems while keeping almost everything automatic.
Given the perceived benefits, is there any concern about defining a good practice on creation of local parts by a mail service provider? (It would be written more clearly than below...)
With regard to script mixing, the Japanese case has already been mentioned. Also, the main arguments against script mixing are visual confusion across scripts and bidirectionality issues. Latin/Greek/Cyrillic is the main case of visual confusion; there may be some well-known cases across Indic scripts, too, but for any two scripts taken at random, the chance of confusability is pretty low. Bidirectionality considerations essentially split the scripts into two groups, but will still allow e.g. mixing Arabic and Hebrew. There probably isn't much need for mixing arbitrary scripts in the LHS in the first place, but that doesn't mean it should be totally discouraged when it's not harmful. Regards, Martin.
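The bidirectionality split described above can be sketched with the Unicode bidirectional class, which Python exposes as `unicodedata.bidirectional`. This is only a rough illustration: the function names are hypothetical, and it looks solely at strongly directional letters, ignoring digits, punctuation and the fuller rules a real bidi check would apply.

```python
import unicodedata

RTL_CLASSES = {"R", "AL"}  # strong right-to-left (e.g. Hebrew) and Arabic letters
LTR_CLASSES = {"L"}        # strong left-to-right

def directions(local_part: str) -> set:
    """Return which strong directions ("LTR", "RTL") occur in a local part."""
    found = set()
    for ch in local_part:
        bidi = unicodedata.bidirectional(ch)
        if bidi in RTL_CLASSES:
            found.add("RTL")
        elif bidi in LTR_CLASSES:
            found.add("LTR")
        # weak and neutral classes (digits, punctuation) are ignored here
    return found

def mixes_directions(local_part: str) -> bool:
    """True when both left-to-right and right-to-left letters are present."""
    return len(directions(local_part)) > 1
```

With this check, a local part combining Hebrew and Arabic letters stays within one direction group, while one combining Latin and Hebrew letters is flagged as mixing directions.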
_______________________________________________ UA-EAI mailing list UA-EAI@icann.org https://mm.icann.org/mailman/listinfo/ua-eai
participants (5)
- Andrew Sullivan
- Arnt Gulbrandsen
- Mark Svancarek
- Martin J. Dürst
- Vittorio Bertola