Michael,

The Middle Dot U+00B7 is special since the IDNA standard requires it to be limited to the context between 'l' and 'l' ("U+006C U+00B7 U+006C"). (https://tools.ietf.org/html/rfc5892#page-16)

Yours,
Mats

---
Mats Dufberg
DNS Specialist, IIS
Mobile: +46 73 065 3899
https://www.iis.se/en/

-----Original Message-----
From: <latingp-bounces@icann.org> on behalf of Michael Bauland <Michael.Bauland@knipp.de>
Date: Tuesday 17 January 2017 at 09:55
To: "latingp@icann.org" <latingp@icann.org>
Subject: Re: [Latingp] How should combining diacritic marks be handled?

Hi Mats, hi all,

On 16.01.2017 17:21, Mats Dufberg wrote:
> MSR2 contains a number of combining diacritic marks, e.g. U+0323 COMBINING DOT BELOW. It might be that we find that some of the languages that should be supported require that code point in combination with, say, "n", i.e. "U+006E U+0323". Let us assume that there is no pre-composed equivalent code point.
> We can then justify the inclusion of U+0323. Will the Integration Panel then accept that code point in any context, or just in the specific context?
I assume that we will need to define the contexts in which those combining marks are allowed. At least we did this for the middle dot of the "ela geminada" in the Catalan language tables (see, e.g., http://www.iana.org/domains/idn-tables/tables/sap_ca_1.0.txt). But I guess Sarmad will know for certain.
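To make the idea of a context rule concrete, here is a minimal sketch in Python. The middle dot check follows the rule Mats cites from RFC 5892 (U+00B7 only between two U+006C). The rule for U+0323 is purely hypothetical and only mirrors the "n" example from this thread; it is not something the Integration Panel or MSR2 defines. All function and variable names are made up for illustration.

```python
MIDDLE_DOT = "\u00B7"           # MIDDLE DOT
COMBINING_DOT_BELOW = "\u0323"  # COMBINING DOT BELOW


def middle_dot_ok(label: str, i: int) -> bool:
    """RFC 5892 contextual rule: U+00B7 must sit between two 'l' (U+006C)."""
    return 0 < i < len(label) - 1 and label[i - 1] == "l" and label[i + 1] == "l"


def dot_below_ok(label: str, i: int) -> bool:
    """Hypothetical rule (assumption): U+0323 only directly after 'n' (U+006E)."""
    return i > 0 and label[i - 1] == "n"


# Code points that are only valid in a specific context, with their checks.
CONTEXT_RULES = {
    MIDDLE_DOT: middle_dot_ok,
    COMBINING_DOT_BELOW: dot_below_ok,
}


def label_satisfies_context_rules(label: str) -> bool:
    """Return True if every context-restricted code point in the label
    appears in an allowed context. Other code points are not checked here."""
    for i, ch in enumerate(label):
        rule = CONTEXT_RULES.get(ch)
        if rule is not None and not rule(label, i):
            return False
    return True
```

With these rules, "col\u00B7lega" passes and "a\u00B7b" does not; likewise "n\u0323a" passes and "m\u0323a" does not. If the IP accepted a combining mark in any context, the corresponding entry would simply be dropped from the rule table.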
> If the IP requires that we justify combining diacritic marks for every context in which they will be allowed, then we have to go language by language to find all combinations to support.
> If the IP agrees to include a combining diacritic mark for any context as long as it is justified for one language, then we can go code point by code point as long as we can find justification for all Latin code points in MSR2 and we assume no more code points are needed.
If the purpose of our work is to create a Latin IDN table that supports all listed languages (EGIDS value 4 or 5 as decided) then I cannot see how we can achieve that without inspecting all those languages.
Going by language instead of going by character also has the advantage that we will be able to distribute the languages to members of the group. Then everybody can work with a certain subset of all languages. If we distributed the characters, everybody would have to get acquainted with every single language.

Cheers,

Michael

--
Dr. Michael Bauland (Dipl.-Informatiker)
Software Development
Knipp Medien und Kommunikation GmbH
Technologiepark, Martin-Schmeisser-Weg 9, 44227 Dortmund, Germany
Fon: +49 231 9703-284
Fax: +49 231 9703-200
SIP: Michael.Bauland@knipp.de
E-mail: Michael.Bauland@knipp.de
Register Court: Amtsgericht Dortmund, HRB 13728
Chief Executive Officers: Dietmar Knipp, Elmar Knipp

_______________________________________________
Latingp mailing list
Latingp@icann.org
https://mm.icann.org/mailman/listinfo/latingp