Dear Chris and colleagues,

Apologies for the late reply. I believe we don't need to exclude digraphs. We could simply set them up as variants, e.g. ij (U+0133) as the equivalent of i + j. It would be useful to check with the IP whether it is possible to declare a sequence of two code points as a variant of a single one; we had not encountered such a case with the Arabic script.

Best wishes,
Meikal

2016-03-29 9:54 GMT+02:00 Dillon, Chris <c.dillon@ucl.ac.uk>:
Dear colleagues,
Mirjana’s recent research on Montenegrin has raised some interesting issues.
One of them is digraphs.
Currently we have digraphs like æ and œ in our repertoire, but Dutch ij (U+0133), as in vijf ‘five’, is white in MSR-2 (not compatible with IDNA 2008). Certainly many digraphs, including ij, are visually similar to their component letters. We could consider adding all digraphs to the list of criteria for exclusion, or adding them with exceptions (less good from a usability point of view). Incidentally, ß and & are probably excluded for other reasons: the Longevity Principle and Punctuation, respectively.
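As a rough illustration of the point above, the following Python sketch checks whether each character is stable under NFKC normalization, which is one of the properties IDNA 2008 relies on (it is not the full IDNA 2008 derivation, and it says nothing about MSR-2 itself). The helper names `nfkc_stable` and `components` and the `VARIANTS` table are hypothetical, made up for this example; the table only pictures the idea of treating a two-code-point sequence as a variant of one code point and is not an actual LGR format.

```python
import unicodedata

# Characters mentioned in this thread: ae (U+00E6), oe (U+0153), Dutch ij (U+0133).
CANDIDATES = ["\u00e6", "\u0153", "\u0133"]


def nfkc_stable(ch: str) -> bool:
    """Return True if NFKC normalization leaves the character unchanged."""
    return unicodedata.normalize("NFKC", ch) == ch


def components(ch: str) -> str:
    """Return the NFKC form, i.e. the component sequence if the character decomposes."""
    return unicodedata.normalize("NFKC", ch)


# Hypothetical variant table (an assumption, not a real LGR structure):
# maps a component sequence to the single code point that could be its variant.
VARIANTS = {components(ch): ch for ch in CANDIDATES if not nfkc_stable(ch)}

for ch in CANDIDATES:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: NFKC-stable = {nfkc_stable(ch)}")

for sequence, single in VARIANTS.items():
    seq_points = " ".join(f"U+{ord(c):04X}" for c in sequence)
    print(f"sequence {seq_points} <-> single code point U+{ord(single):04X}")
```

Under this check only U+0133 is unstable, since it normalizes to the sequence i + j, while æ and œ are left unchanged. Whether such a sequence can actually be declared as a variant is the question for the IP.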
What do you think?
(French) What should we do with the digraphs in our repertoire: allow them or not?
Regards,
Chris.
==
Research Associate in Linguistic Computing, Centre for Digital Humanities, UCL, Gower St, London WC1E 6BT Tel +44 20 7679 1599 (int 31599) www.ucl.ac.uk/dis/people/chrisdillon
_______________________________________________ Latingp mailing list Latingp@icann.org https://mm.icann.org/mailman/listinfo/latingp