On Tue, Apr 16, 2019 at 11:46:54AM +0000, Andre Schappo wrote:
> The individual letters of Korean are called Jamo eg ㅎ ㅏ ㄴ ㄱ ㅡ ㄹ or ㅂ ㅏ ㄴ ㅏ ㄴ ㅏ
> Jamo are formed into squared syllable blocks eg 한글 (Hangeul) or 바나나 (banana)
This is not _quite_ the way it works, because there are also precomposed forms. IDNA, for instance, uses only the precomposed forms and treats the Jamo forms as INVALID. Their invalidity is actually contrary to the principles used in setting out the IDNA2008 work, but there seemed to be very strong agreement about this. (IIRC, but I haven't gone back and checked, this had something to do with the rather tricky normalization rules.)
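To make the precomposed-versus-Jamo distinction concrete, here is a minimal sketch using Python's standard unicodedata module. The helper names to_precomposed and to_jamo are made up for illustration, and this only shows ordinary Unicode normalization, not what any IDNA implementation actually does.

```python
import unicodedata

# Conjoining Jamo for "han": choseong HIEUH, jungseong A, jongseong NIEUN.
JAMO_HAN = "\u1112\u1161\u11ab"
# The same syllable as a single precomposed code point (U+D55C).
SYLLABLE_HAN = "\ud55c"
# Compatibility Jamo (U+3130 block) spelling the same three letters.
COMPAT_HAN = "\u314e\u314f\u3134"


def to_precomposed(text: str) -> str:
    """Hypothetical helper: compose conjoining Jamo into syllables via NFC."""
    return unicodedata.normalize("NFC", text)


def to_jamo(text: str) -> str:
    """Hypothetical helper: split precomposed syllables into Jamo via NFD."""
    return unicodedata.normalize("NFD", text)


if __name__ == "__main__":
    for label, s in [("jamo", JAMO_HAN), ("syllable", SYLLABLE_HAN),
                     ("compat", COMPAT_HAN)]:
        composed = to_precomposed(s)
        print(label, [f"U+{ord(c):04X}" for c in composed])
```

Note that NFC composes the conjoining Jamo into one code point but leaves the compatibility Jamo alone, so the three spellings do not all collapse to the same string under NFC.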
> Universal Acceptance = Hangul Syllables U+AC00➔D7AF
> Uncritical Acceptance = Hangul Syllables U+AC00➔D7AF + Hangul Jamo U+1100➔11FF + Hangul Jamo Extended-A U+A960➔A97F + Hangul Jamo Extended-B U+D7B0➔D7FF + Hangul Compatibility Jamo U+3130➔318F + ...
>
> I most definitely am not an expert on Korean so those that are please check my reasoning
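The block ranges quoted above can be sketched as a small classifier. This is only an illustration under assumptions: the names hangul_block and contains_jamo are hypothetical, the ranges are the block boundaries exactly as quoted (the Hangul Syllables block includes a few unassigned code points after U+D7A3), and the trailing "..." in the quoted list is left unfilled.

```python
from typing import Optional

# Block boundaries as quoted; Python ranges exclude the end value,
# so each upper bound is the last code point plus one.
HANGUL_BLOCKS = {
    "Hangul Syllables": range(0xAC00, 0xD7B0),           # U+AC00..U+D7AF
    "Hangul Jamo": range(0x1100, 0x1200),                # U+1100..U+11FF
    "Hangul Jamo Extended-A": range(0xA960, 0xA980),     # U+A960..U+A97F
    "Hangul Jamo Extended-B": range(0xD7B0, 0xD800),     # U+D7B0..U+D7FF
    "Hangul Compatibility Jamo": range(0x3130, 0x3190),  # U+3130..U+318F
}


def hangul_block(ch: str) -> Optional[str]:
    """Hypothetical helper: name the quoted block containing ch, if any."""
    cp = ord(ch)
    for name, block in HANGUL_BLOCKS.items():
        if cp in block:
            return name
    return None


def contains_jamo(text: str) -> bool:
    """Hypothetical helper: True if any character is in a Jamo block,
    that is, in any quoted block other than Hangul Syllables."""
    return any(
        hangul_block(ch) not in (None, "Hangul Syllables") for ch in text
    )


if __name__ == "__main__":
    for sample in ["\ubc14\ub098\ub098", "\u1112\u1161\u11ab", "banana"]:
        print(repr(sample), contains_jamo(sample))
```

A stricter convention would accept a string only when contains_jamo is False; whether the input should be normalized first is a separate decision this sketch does not make.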
This was what IDNA baked into the protocol for Korean, so you could do something similar for conventions in local-parts and get a long way. I think there remains, for Korean, a problem with Han, but that is not related to this issue.

Best regards,

A

--
Andrew Sullivan
ajs@anvilwalrusden.com