Revised ASCII-Unicode Diacritics Analyzer Tool

Sept. 15, 2025

Everyone,

Following a question that came up during the review of the 
recommendations document, I have revised the tool that generates the 
list of characters that are within our scope. I am happy to report that 
the methodology still holds up, resulting in 106 characters with one 
diacritic mark, 30 characters with two diacritic marks, and 14 "other 
occurrences".

However, to clarify matters further, I have now added a list titled 
"Appendix: Latin repertoire not canonically decomposable to ASCII base", 
which holds the major examples that *seem* like they might be within 
scope but actually aren't, because they are encoded as single characters 
with no canonical decomposition to an ASCII base letter.
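
For readers who want to see the distinction concretely, here is a minimal 
sketch of the kind of check involved. It is not the code from the 
repository: the function name classify() is hypothetical, and the sample 
characters are my own illustrations, not entries taken from the report. 
It assumes that "in scope" means a character whose canonical decomposition 
(NFD) is an ASCII letter followed only by combining marks.

```python
import unicodedata


def classify(ch: str):
    """Return (ascii_base, mark_count) if `ch` canonically decomposes to an
    ASCII letter plus one or more combining marks; otherwise return None.

    Hypothetical helper for illustration only, not the tool's actual logic.
    """
    decomposed = unicodedata.normalize("NFD", ch)
    base, marks = decomposed[0], decomposed[1:]

    # The first code point after decomposition must be an ASCII letter.
    if not (base.isascii() and base.isalpha()):
        return None

    # Everything after the base must be a mark (general category M*).
    if not marks or any(not unicodedata.category(m).startswith("M") for m in marks):
        return None

    return base, len(marks)


if __name__ == "__main__":
    samples = ["\u00e9", "\u1ebf", "\u00f8", "\u0142"]  # é, ế, ø, ł
    for ch in samples:
        print(f"U+{ord(ch):04X}", classify(ch))
```

Under these assumptions, U+00E9 and U+1EBF decompose to "e" plus one and 
two combining marks respectively, while U+00F8 and U+0142 have no canonical 
decomposition at all, so the function returns None for them.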

I am attaching the new report to this email for your convenience.

You can also inspect the new code in the same repository as 
before: https://github.com/mark-wd/ASCII-Unicode-Diacritics-Analyzer-Tool

Best,

-- 
Mark W. Datysgeld
Director at Governance Primer [governanceprimer.com 
<https://governanceprimer.com>]
Project Lead Developer at ICANNWiki [icannwiki.org <https://icannwiki.org/>]
