Dear All,

Thank you, Bill, for the effort. I would like to further clarify what may be needed to consider additional languages.

As you know, the Integration Panel (IP) has recommended that GPs align with the common and widespread use requirement of the LGR Procedure <https://www.icann.org/en/system/files/files/lgr-procedure-20mar13-en.pdf> (e.g. see the discussion on pp. 38-39). The IP had suggested the EGIDS scale as a possible measure to meet this requirement, where languages with EGIDS values 1-4 are clear for inclusion and those with EGIDS value 5 are borderline and may require additional evidence for inclusion. The latter means that an argument has to be made for the languages with EGIDS level 5 because these are not automatically included. The Latin GP had made an argument on the basis of a population of 1 million people using the languages with EGIDS level 5, to which the IP has not objected to date.

The LGR Procedure says: “This would be an area for judgment by the integration panel. For scripts where there is doubt that they meet the criteria of eligibility, the default action under the Conservatism and Inclusion principles would be provisional exclusion until positive evidence is brought forward that establishes widespread use for a living language.”

Please note that if the Latin GP changes the criteria for including languages in the Latin LGR proposal, this will not be automatically accepted by the IP. The IP would do the evaluations and may request evidence of “widespread use for a living language” for the languages being included with EGIDS level 5 to meet the requirements of the LGR Procedure.

Thus, in case the GP decides to expand the existing set of languages for inclusion, the additional effort by the Latin GP may not be limited to the evaluation of the code points by the GP. The work may also require finding and documenting concrete evidence of widespread usage for each of the languages being proposed from EGIDS level 5.

Regards,
Sarmad

From: Latingp <latingp-bounces@icann.org> on behalf of Latin GP <latingp@icann.org>
Reply-To: Bill Jouris <b_jouris@yahoo.com>
Date: Monday, December 6, 2021 at 7:07 AM
To: Latin GP <latingp@icann.org>
Subject: [Latingp] Repertoire Expansion

Dear colleagues,

During our meeting last week, concerns were expressed about expanding the number of languages we consider for our repertoire. In particular, the number of additional languages was thought to be large (Dennis’ draft response says “hundreds, if not thousands, of languages around the world”) and the time required to deal with them was thought to amount to weeks.

Good news:
* There aren’t
* It didn’t

There may be as many as a thousand languages which, at one time or another, were written using the Latin script. But the IDN project has been clear from early on that it would only consider scripts from “living languages.” It does not seem unreasonable, therefore, that only living languages should be considered when analyzing a given script. And the number of living languages using the Latin script is approximately 450. (Of which, we have already done over 200.)

Furthermore, to be worth including in our analysis, a language would need to have what the EGIDS 5 definition calls “literature in a standardized form”. (Dennis, in his comments in the meeting and his draft response, appears to suggest conflating EGIDS 5 and EGIDS 6a. However, it seems to me that this confuses the issue. The languages in EGIDS 6a do not (yet) involve a standard orthography; that’s why they aren’t EGIDS 5.)

What the comments are therefore suggesting, it seems to me, is eliminating the 1,000,000 native speakers threshold and including every language which is EGIDS 5. An expansion of our work; but not, as we shall see, an enormous one.

Mirjana noted during the meeting that it had taken her 3 months to compile the list of languages which we analyzed initially. The implication being that it would take as long to do the same this time.
Fortunately, the fruits of her labors then are still available to us now. In particular, her compilation of languages using the Latin script which are EGIDS 5. See https://docs.google.com/document/d/1PwUa4Tkqpp2GGz8-hYDbKz357BSlMG6vkbrAmUqB...

The total number of languages which are EGIDS 5? 110. Better yet, some 30 of those are already included in our work. Further, 4 are no longer using the Latin script (although they did in the past) and one appears to no longer have living native speakers. So we are left with 75 additional languages. Not thousands. Not hundreds. 75.

I’ve created a spreadsheet (attached), the first tab of which builds on Mirjana’s list, but includes columns for the new code points, if any, which appear in each language (as well as the Unicode and name). The first column is a flag. Languages which are already included have a green flag, languages which are no longer EGIDS 5 have a pink flag, and the languages to be added have a yellow flag.

How many new code points are there? 26. (Or possibly 28, if someone smarter than me can contrive a way to produce Latin Small Letter E with Breve and Combining Circumflex, or Latin Small Letter O with Breve and Combining Circumflex, that look like the images given in Omniglot for the Jarai language.)

Happily, the analysis didn’t take weeks. More like 2 half days.

The second tab of the spreadsheet gives the new code points, in the same format used in our repertoire tables, ready to be folded into our existing tables. Of course, that requires having the References available. So a second attachment gives all the references, in proper form, ready to be tacked on to the end of our Reference section.

Of course, we can still refuse, on principle, to include any EGIDS 5 language with less than a million native speakers. Assuming that someone can come up with such a principle.
But we need to be clear that the time and effort required to analyze the additional languages cannot be our excuse to do so.

Bill