Hi,
I promised to write something about email address syntax. It’s actually a touch more complicated than I mentioned yesterday. As John says, email addresses are more complicated than one usually sees, and there exist syntax
features that can be used to confuse people. But there are, I think, three reduced grammars:
- One might say that banning all the obs- productions in RFC 5322 produces a reduced grammar. Those productions are obsolete (that’s why they’re called obs-whatever)
and should not be used. No one uses them anyway; I’m only mentioning this for completeness.
- The WHATWG HTML specification contains a simple grammar, which web browsers implement. Since one must expect email addresses to be typed into HTML forms, IMO following
this specification is strongly advisable. The spec is at
https://html.spec.whatwg.org/multipage/input.html#email-state-(type=email) and so far isn’t UA-ready. I fear that getting it UA-ready will shortly be one of my tasks.
- \X+ is an extremely simple PCRE (a regular expression) that IMO should match all addresses. I suspect we’ll see that in some well-known web software in the near
future.
IMO it’s safe to advise anyone to use only addresses that match the WHATWG spec (when extended) and also match \X+.
The two suggestions complement each other; for example,
" "@example.com matches \X+ (though not the WHATWG grammar), but I’m not brave enough to use an address like that
😉
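For anyone who wants to try the WHATWG check, here is a minimal Python sketch. The pattern is my transcription of the regular expression given in the WHATWG spec for a valid email address, so please verify it against the spec before relying on it; the names WHATWG_EMAIL and matches_whatwg are just illustrative. It covers only the current ASCII-only grammar, not any future UA-ready extension. Python's standard re module has no \X (it matches one Unicode extended grapheme cluster in PCRE), so the \X+ half of the advice is not shown here.

```python
import re

# Transcribed from the WHATWG HTML spec's "valid email address" pattern
# (assumption: check against the spec itself before relying on it).
WHATWG_EMAIL = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"                       # local part
    r"@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"         # first domain label
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"   # further labels
)


def matches_whatwg(address: str) -> bool:
    """Return True if the whole string matches the WHATWG grammar."""
    return WHATWG_EMAIL.fullmatch(address) is not None


if __name__ == "__main__":
    for candidate in ["user@example.com", '" "@example.com', "用户@example.com"]:
        print(repr(candidate), matches_whatwg(candidate))
```

The quoted-space address is rejected because the WHATWG grammar has no quoted strings, and the non-ASCII local part is rejected because the grammar is ASCII-only so far.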
Arnt Gulbrandsen
UA Technology Sr. Manager, ICANN
+32 492 374706