Dear all,

Let's consider three alternatives:

(1) Decomposable diacritics only. This is technically easy: a single, straightforward rule. The downside is that it excludes several diacritical letters.

(2) Also include "LATIN SMALL LETTER [A-Z] WITH ...". This is also technically easy (how long did it take Mark to add this to his tool?), only a bit more complicated in that we'd need two different rules. It provides an unambiguous, machine-testable set of letters and of relationships between the diacritics and their ASCII counterparts, and it adds 15 letters that are used in at least 35 languages.

(3) Also add letters that aren't real diacritics, like ŋ, æ, ð, þ etc., but that could still be handled the same way. These would need to be evaluated individually, and automated processing would have to be based on tables. It would, however, cover even more languages.

I would argue that (2) is the closest match to our charter, as it covers almost(?) all diacritics. I would also argue that the extra letters and languages supported by (2) are at least symbolically significant.

I do not see a significant difference in technical difficulty between (1) and (2), but I am open to persuasion here. Staff input would be appreciated.

I don't think (3) would be technically all that difficult either, but it would certainly be at least stretching our mandate, and it would require a significant amount of extra work.

Looking forward to an interesting discussion tomorrow,

--
Tapani Tarvainen
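
For illustration, the two rules behind alternatives (1) and (2) could be machine-tested roughly as follows. This is only a minimal sketch using Python's standard unicodedata module, not Mark's tool. The function names are hypothetical, and the exact rule definitions (canonical decomposition to an ASCII letter plus combining marks for (1), a match on the Unicode character name for (2)) are assumptions about what the rules would look like in practice.

```python
import re
import unicodedata

# Rule (2), assumed form: the Unicode name is
# "LATIN SMALL LETTER <single letter> WITH <something>".
NAME_RULE = re.compile(r"^LATIN SMALL LETTER ([A-Z]) WITH ")


def base_by_decomposition(ch):
    """Rule (1): canonical decomposition gives an ASCII letter plus combining marks."""
    decomposed = unicodedata.normalize("NFD", ch)
    if len(decomposed) < 2:
        return None  # no canonical decomposition at all
    base, marks = decomposed[0], decomposed[1:]
    only_marks = all(unicodedata.category(m).startswith("M") for m in marks)
    if "a" <= base <= "z" and only_marks:
        return base
    return None


def base_by_name(ch):
    """Rule (2): derive the ASCII counterpart from the character name."""
    match = NAME_RULE.match(unicodedata.name(ch, ""))
    return match.group(1).lower() if match else None


def ascii_base(ch):
    """Apply rule (1) first, then fall back to rule (2)."""
    return base_by_decomposition(ch) or base_by_name(ch)


if __name__ == "__main__":
    for ch in "éüøłŋæðþ":
        print(ch, base_by_decomposition(ch), base_by_name(ch))
```

In this sketch, letters such as ø and ł have no canonical decomposition, so only the name rule maps them to an ASCII letter, while ŋ, æ, ð and þ match neither rule and would need the table-based handling described under (3).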