Everyone,
Following a question that came up during the review of the
recommendations document, I have revised the tool that generates
the list of characters that are within our scope. I am happy to
report that the methodology still holds strong, resulting in 106 characters
with one diacritic mark, 30 characters with two diacritic marks, and
14 "other occurrences".
However, to clarify matters further, I have now added a list titled "Appendix: Latin repertoire not canonically decomposable to ASCII base". It holds the main examples of characters that look as if they should be within scope but are not, because they are encoded as single characters with no canonical decomposition to an ASCII base letter.
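As a rough illustration of that distinction, here is a minimal sketch using Python's standard unicodedata module. It is not the code from the repository, and the function name classify is hypothetical. It assumes "within scope" means that the canonical decomposition (NFD) of a character is an ASCII letter followed only by combining marks.

```python
import unicodedata


def classify(ch):
    """Return (base, mark_count) if ch canonically decomposes to an
    ASCII letter plus combining marks; otherwise return None.

    Hypothetical helper for illustration, not the repository's tool.
    """
    decomposed = unicodedata.normalize("NFD", ch)
    base, marks = decomposed[0], decomposed[1:]
    if (
        marks
        and base.isascii()
        and base.isalpha()
        and all(unicodedata.combining(m) for m in marks)
    ):
        return base, len(marks)
    return None


print(classify("\u00e9"))  # e with acute: ('e', 1)
print(classify("\u1ebf"))  # e with circumflex and acute: ('e', 2)
print(classify("\u00f8"))  # o with stroke: None (no canonical decomposition)
```

Under this assumption, a character such as U+00F8 (o with stroke) is the kind of entry the appendix would contain: it looks like a base letter plus a mark, but Unicode defines no canonical decomposition for it.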
I am attaching the new report to this
email for your convenience.
You can also inspect the new code in the same repository as before: https://github.com/mark-wd/ASCII-Unicode-Diacritics-Analyzer-Tool
Best,