On 14 Aug 2020, at 11:51, Abdelmeniem Tharwat wrote:

Could we have an example, so I could fix that?

For IDNA, find any codepoint that does not have the Unicode property Letter but is PVALID in IDNA. Digits (\p{N}) come to mind immediately, but that is not the end of the story: many PVALID codepoints have neither the Unicode property L nor N. Again, see RFC 5892 and the IANA IDNA registry. As I wrote and discussed in the Java tutorial for UA, if you want to handle IDNA fully and correctly in a regex, you end up coding the full IDNA rules into the regex, which is, if not impossible, very complicated and not worth the work.
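To make that search concrete, here is a minimal Python sketch. It does not implement IDNA; it only lists the codepoints in a label whose General_Category is not a Letter, which are exactly the ones a \p{L}-only character class rejects. The function name non_letters and the two sample labels are illustrative assumptions.

```python
import unicodedata


def non_letters(label: str) -> list:
    """List (codepoint, General_Category) pairs for every non-Letter in label."""
    # A regex class such as [\\p{L}] accepts only categories starting with "L".
    # RFC 5892 section 2.1 (LetterDigits) also admits Nd, Mn and Mc as a
    # starting point, so those categories are the obvious places to look.
    return [
        (f"U+{ord(ch):04X}", unicodedata.category(ch))
        for ch in label
        if not unicodedata.category(ch).startswith("L")
    ]


# ASCII label with a digit.
print(non_letters("example1"))
# Devanagari label containing U+093E (vowel sign AA, a spacing mark).
print(non_letters("\u0909\u0926\u093e\u0939\u0930\u0923"))
```

Each sample label contains one codepoint outside \p{L} (a digit and a Devanagari vowel sign). Whether a given codepoint is actually PVALID still has to be checked against RFC 5892 and the IANA IDNA registry; the sketch only shows where a Letter-only class falls short.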

For EAI, find any codepoint that does not have the Unicode property Letter: it won’t work with the regex below.
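As a rough illustration, the sketch below re-implements the regex from this thread, ^[\p{L}.%+-]+@[\p{L}.-]+\.[\p{L}]{2,}$, by hand in Python, because the standard re module has no \p{L}. The function names and the sample addresses are assumptions made up for the example, and the hand-written matcher is only an approximation of what a real regex engine does (for instance, it ignores how $ treats a trailing newline).

```python
import unicodedata


def is_letter(ch: str) -> bool:
    # Stand-in for the regex class \\p{L}: General_Category starts with "L".
    return unicodedata.category(ch).startswith("L")


def letter_only_match(address: str) -> bool:
    """Approximate ^[\\p{L}.%+-]+@[\\p{L}.-]+\\.[\\p{L}]{2,}$ without a regex engine."""
    parts = address.split("@")
    if len(parts) != 2:  # neither character class contains "@"
        return False
    local, domain = parts
    if not local or not all(is_letter(c) or c in ".%+-" for c in local):
        return False
    # The final group has no dot, so the literal "." must be the last dot.
    head, dot, tld = domain.rpartition(".")
    if not dot or not head:
        return False
    if not all(is_letter(c) or c in ".-" for c in head):
        return False
    return len(tld) >= 2 and all(is_letter(c) for c in tld)


samples = [
    "user@example.com",                                   # letters only
    "\u7528\u6237@\u4f8b\u5b50.\u4e2d\u56fd",             # Han letters only
    "user1@example.com",                                  # digit in mailbox
    "first_last@example.com",                             # underscore in mailbox
    "\u0909\u0926\u093e\u0939\u0930\u0923@example.com",   # spacing mark U+093E
]
for s in samples:
    print(s, letter_only_match(s))
```

The first two addresses pass; the last three are rejected only because the mailbox contains a digit, an underscore or a combining mark, none of which is in \p{L}.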

The danger here is, again, promoting a regex which « kinda works but not quite », which then becomes the « standard » everybody uses. One then essentially creates a fork of the RFCs, with more limitations from an implementation point of view.

One may argue that IDNA2008 should have been designed so that it could be implemented in a regex, but that did not happen.

The best way would be to modify the regex engine itself to embed the IDNA protocol inside it and define a new regex token for IDNA; then we would be in business… Not an easy task.

Regards, Marc.


On Aug 14, 2020, at 5:49 PM, Marc Blanchet <marc.blanchet@viagenie.ca> wrote:

On 14 Aug 2020, at 11:32, Abdelmeniem Tharwat wrote:

Dear colleagues,
I hope you are doing well. For a long time I have been trying to use a regex to validate EAI addresses for the many UA-related projects I have. I used the tool here<https://rubular.com/> with this
regex: "^[\p{L}.%+-]+@[\p{L}.-]+\.[\p{L}]{2,}$" to validate some EAI addresses, and it works well, as in the screenshot below.

\p{L} is for the Unicode property Letter. So:
- for IDNA, it is close (as the IDNA base is the Unicode Letter property) but not quite. See RFC 5892.
- for EAI, it is very restrictive, since the mailbox can be almost any UTF-8 string. See RFC 6531.

So you may want to use that regex, but be aware of its side-effects, including not accepting some domains and mailboxes.

Finally, not all regex engines support Unicode properties, so make sure the one you use supports them.
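One way to check an engine is simply to try compiling a \p{L} pattern. The sketch below does this for Python; the helper name supports_unicode_properties and its compile_fn parameter are hypothetical, introduced only for this example. To my knowledge, the standard re module rejects \p as a bad escape, so the probe should report False there, while an engine that does understand \p{L} would compile the pattern and match a letter such as é.

```python
import re


def supports_unicode_properties(compile_fn=re.compile) -> bool:
    """Return True if compile_fn accepts \\p{L} and it matches a non-ASCII letter."""
    try:
        pattern = compile_fn(r"\p{L}")
    except re.error:
        # The engine does not know the \p{...} escape at all.
        return False
    # U+00E9 (LATIN SMALL LETTER E WITH ACUTE) is a Letter.
    return pattern.fullmatch("\u00e9") is not None


# Probe the standard-library engine.
print(supports_unicode_properties())
```

If the probe fails, the same regex string pasted from a tool such as Rubular will not behave the same way in that environment, so the engine has to be checked before the pattern is reused.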

Regards, Marc.

[Screenshot: image010.png]


Thanks a lot.

All the Best,
Abdalmonem Tharwat Galila
Deputy Manager, Dot Masr Registry,
Operation Sector.

National Telecommunication Regulatory Authority

_______________________________________________
UA-discuss mailing list
UA-discuss@icann.org
https://mm.icann.org/mailman/listinfo/ua-discuss
_______________________________________________