Dear colleagues,

During our meeting last week, concerns were expressed about expanding the number of languages we consider for our repertoire. In particular, the number of additional languages was thought to be large (Dennis’ draft response says “hundreds, if not thousands, of languages around the world”) and the time required to deal with them was thought to amount to weeks.

Good news:
- There aren’t
- It didn’t

There may be as many as a thousand languages which, at one time or another, were written using the Latin script. But the IDN project has been clear from early on that it would only consider scripts from “living languages.” It does not seem unreasonable, therefore, that only living languages should be considered when analyzing a given script. And the number of living languages using the Latin script is approximately 450. (Of which, we have already done over 200.)

Furthermore, to be worth including in our analysis, a language would need to have what the EGIDS 5 definition calls “literature in a standardized form”. (Dennis, in his comments in the meeting and his draft response, appears to suggest conflating EGIDS 5 and EGIDS 6a. However, it seems to me that this confuses the issue. The languages in EGIDS 6a do not (yet) have a standard orthography; that’s why they aren’t EGIDS 5.)

What the comments are therefore suggesting, it seems to me, is eliminating the 1,000,000 native speakers threshold and including every language which is EGIDS 5. An expansion of our work; but not, as we shall see, an enormous one.

Mirjana noted during the meeting that it had taken her 3 months to compile the list of languages which we analyzed initially, the implication being that it would take as long to do the same this time. Fortunately, the fruits of her labors then are still available to us now. In particular, her compilation of languages using the Latin script which are EGIDS 5. See https://docs.google.com/document/d/1PwUa4Tkqpp2GGz8-hYDbKz357BSlMG6vkbrAmUqB...

The total number of languages which are EGIDS 5? 110. Better yet, some 30 of those are already included in our work. Further, 4 are no longer using the Latin script (although they did in the past) and one appears to no longer have living native speakers. So we are left with 75 additional languages. Not thousands. Not hundreds. 75.

I’ve created a spreadsheet (attached), the first tab of which builds on Mirjana’s list, but includes columns for the new code points, if any, which appear in each language (as well as the Unicode value and name). The first column is a flag. Languages which are already included have a green flag, languages which are no longer EGIDS 5 have a pink flag, and the languages to be added have a yellow flag.

How many new code points are there? 26. (Or possibly 28, if someone smarter than me can contrive a way to produce Latin Small Letter E with Breve and Combining Circumflex, or Latin Small Letter O with Breve and Combining Circumflex, that look like the images given in Omniglot for the Jarai language.)

Happily, the analysis didn’t take weeks. More like 2 half days.

The second tab of the spreadsheet gives the new code points, in the same format used in our repertoire tables, ready to be folded into our existing tables. Of course, that requires having the References available. So a second attachment gives all the references, in proper form, ready to be tacked on to the end of our Reference section.

Of course, we can still refuse, on principle, to include any EGIDS 5 language with fewer than a million native speakers. Assuming that someone can come up with such a principle. But we need to be clear that the time and effort required to analyze the additional languages cannot be our excuse to do so.

Bill