UA EAI and Technology Working Group colleagues:
I have a pair of topics to bring up, and both topics are relevant to both working groups.
Topic 1. HTML5 <input type="email"> specification
HTML includes a form field element known as <input>, which takes a "type" attribute. One state of the "type" attribute is "email"[1]. When a page has an <input type="email"> field, the browser must check that the email address matches a syntax[2] which is limited to ASCII. Thus, by specification, this form field cannot accept EAI email addresses.
This is a problem for Universal Acceptance.
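To make the ASCII limitation concrete, here is a minimal sketch in Python. The pattern is my transcription of the regular expression that the HTML Standard gives alongside its definition of "a valid email address"[2]; the function name html_accepts and the sample addresses are only illustrative. It models the specification's syntax check alone, not what any particular browser does, and in particular not any conversion of the domain part that a user agent might perform.

```python
import re

# Transcription of the regular expression given in the HTML Standard's
# definition of "a valid email address" [2]. Every character class is
# limited to ASCII letters, digits, and a fixed set of ASCII punctuation.
HTML_VALID_EMAIL = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"
    r"@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
)


def html_accepts(address: str) -> bool:
    """Return True if the whole string matches the HTML email syntax."""
    return HTML_VALID_EMAIL.fullmatch(address) is not None


if __name__ == "__main__":
    samples = [
        "user@example.com",          # all-ASCII address
        "user@xn--fsqu00a.example",  # all-ASCII domain in A-label form
        "user@例子.example",          # non-ASCII domain label
        "用户@example.com",           # non-ASCII local part
    ]
    for address in samples:
        print(address, html_accepts(address))
```

The two all-ASCII samples match the pattern. The two with non-ASCII characters do not, whichever side of the "@" those characters appear on.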
John Levine opened an issue against the HTML 5 spec about this: Validating internationalized mail addresses in <input type="email"> #4562 [3]. Since it was opened in 2019, discussion in this issue has started and stopped. On April 30, the discussion started again[4]. I have the impression that this is a particularly opportune moment. From the comments, influential people from both the Universal Acceptance and HTML implementor groups seem to be involved.
If you are in these working groups and would like to move EAI in HTML forward, I suggest that you read through the recent discussion. Look for questions from the HTML folks which have not yet received answers that satisfy them. Then perhaps post suggested interventions on our WG lists, or suggest them to experts we know such as John Levine, or even (if you think you can do so productively) post in that issue thread.
For example, [5]
> Can you tell me what kind of inputs are there that IDNA 2008 accepts but that UTS 46 in non-transitional mode either rejects or maps to different ASCII form than IDNA 2008?
> As an UTS 46 implementor, my current understanding is that there are none, but if there are some, it would be useful for me to know.
It seems to me that one useful function UASG could serve is to be a centre of expertise which is capable of giving reliable, well-thought-out answers to questions like this.
Topic 2. A UASG White paper on Email Address Validation
I encouraged the UA Technology working group to set a 2024 goal[6] of writing a paper recommending how a universally-accepting application should best validate incoming email addresses.
Here is a very simple, high-level outline of that paper:
1. Understand why you are validating
2. The best way to validate is to send a message to the user and have the user send back a confirmation receipt
3. Simple-minded regular expressions reject valid email addresses
4. The HTML5 specification is broken
When reading the issue #4562 discussion of the past two weeks, I began to entertain the more extreme thought that, under #4, we should say: because the HTML5 specification is broken, we recommend against using <input type="email">, and instead recommend using <input type="text"> and not expecting the browser to do any significant email address validation.
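As a starting point for discussion, outline items 2 and 3 could be sketched roughly as below, in Python. This is only an illustration under stated assumptions, not proposed text for the paper: the function names, the HMAC-based token scheme, and the SECRET_KEY handling are all hypothetical, and the step of actually sending the confirmation message is left out.

```python
import hashlib
import hmac
import secrets

# Hypothetical per-deployment secret; a real application would load a
# persistent key rather than generate one at import time.
SECRET_KEY = secrets.token_bytes(32)


def looks_like_address(address: str) -> bool:
    """Deliberately permissive check: a non-empty local part and a
    non-empty domain separated by '@'. No ASCII restriction, so EAI
    addresses pass. Splits on the last '@' so a quoted local part
    containing '@' is not rejected."""
    local, sep, domain = address.rpartition("@")
    return bool(sep) and bool(local) and bool(domain)


def make_confirmation_token(address: str) -> str:
    """Derive a token to embed in the confirmation message sent to the user."""
    digest = hmac.new(SECRET_KEY, address.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def confirm(address: str, token: str) -> bool:
    """The address counts as validated only when the user returns the token."""
    expected = make_confirmation_token(address)
    return hmac.compare_digest(expected, token)


if __name__ == "__main__":
    address = "用户@例子.example"
    if looks_like_address(address):
        token = make_confirmation_token(address)  # would be mailed to the user
        print(confirm(address, token))
```

The design choice here is that the syntax check only screens out obvious typing mistakes, while the real validation is the round trip through the mailbox.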
It seems like now is a good time to solicit ideas for this paper. If you have thoughts, please send them to the UA-Tech WG email list. I will collect them, and write a draft.
Thoughts?
—Jim DeLaHunt
[1] 4.10.5.1.5 The Input Element, Email state (type=email),
<https://html.spec.whatwg.org/#email-state-(type=email)>
[2] "A valid email address"
<https://html.spec.whatwg.org/#valid-e-mail-address>
[3] Validating internationalized mail addresses in <input type="email">
<https://github.com/whatwg/html/issues/4562>
[4] Comment by hsivonen about #4562 on 2024-04-30 08:39Z
<https://github.com/whatwg/html/issues/4562#issuecomment-2084725431> and following comments
[5] Comment by hsivonen about #4562 on 2024-05-08 06:49Z
<https://github.com/whatwg/html/issues/4562#issuecomment-2099862952>
[6] UA Technology WG, 2024-02-05 meeting notes, p.4
<https://community.icann.org/display/TUA/UA+Technology+WG?preview=/58720693/322109590/Meeting%20notes%20UA%20Tech%20WG%2020240205.pdf>
--
Jim DeLaHunt, jdlh@jdlh.com     http://blog.jdlh.com/ (http://jdlh.com/)
multilingual websites consultant, Vancouver, Canada