Nov. 10, 2018
9:40 a.m.
On 11/10/2018 1:11 AM, Dr Ajay Data wrote:
Is there any encoding/decoding method, like Punycode, for these special symbols that browsers are following? What makes browsers map these symbols to three different characters?
Unicode *compatibility* decomposition. Probably the browsers are applying normalization form NF*K*C to the input data. That normalization form is defined as applying compatibility decomposition followed by *canonical* composition. As a result, data that has been through NFKC is also in NFC. Likewise, you will find that browsers accept uppercase strings for IDNs and apply case folding to lowercase before resolving. This allows users to enter IDNs in uppercase, even though IDNA 2008 permits only lowercase in IDNs.
A./
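To illustrate the mapping described above, here is a minimal Python sketch using the standard `unicodedata` module. The helper name `map_label` is hypothetical, and the use of `str.lower()` is a simplification of case folding; this only approximates what a browser might do before lookup, and actual browser behaviour depends on its IDNA implementation.

```python
import unicodedata


def map_label(label: str) -> str:
    """Hypothetical helper: apply NFKC, then lowercase the result.

    This is an approximation of the pre-lookup mapping discussed above,
    not the exact procedure any particular browser implements.
    """
    normalized = unicodedata.normalize("NFKC", label)
    return normalized.lower()


# U+2121 TELEPHONE SIGN has a compatibility decomposition to the three
# letters T, E, L, so one symbol becomes three characters under NFKC.
telephone_sign = "\u2121"
print(unicodedata.normalize("NFKC", telephone_sign))  # TEL
print(map_label(telephone_sign))                      # tel

# The output of NFKC is already in NFC: normalizing again changes nothing.
nfkc = unicodedata.normalize("NFKC", telephone_sign)
print(unicodedata.normalize("NFC", nfkc) == nfkc)     # True

# Uppercase input is mapped to lowercase before resolution.
print(map_label("EXAMPLE"))                           # example
```

The same approach can be used to check other symbols: any character whose NFKC form differs from the character itself carries a compatibility (or canonical) decomposition.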