On 19 March 2016, at 16:40, Dusan Stojicevic <dusan@dukes.in.rs> wrote:
Dear Dmitry, all,
Sorry for the late response; I had a big DIDS event in Belgrade this week.
So, first of all, some minutes from the Cyrillic GP meeting at ICANN 55 in Marrakesh.
No problem and thank you!
with one action item. By 21 March, Iliya, Dmitry and I (and any of you who want to help) will try to finish the first step: creating a full set of national scripts, which will be the base table for our work.
Following this action item, attached you can find .txt files extracted from the IANA tables. @Dmitry, @Iliya - is the format OK?
Format is OK. I see there are some extra characters in the Ukrainian table, so let me provide the proper character set. The letters are, in order (I use uppercase variants here):

Base Cyrillic set: 0410 to 0429 (26 letters, A to Shcha), 042C (soft sign), 042E (YU), 042F (YA) - total 29; excludes 042A, 042B, 042D.
Cyrillic extensions: 0404 (Ukrainian IE), 0406 (Ukrainian-Belarusian I), 0407 (Ukrainian YI) - total 3 letters.
Extended Cyrillic: 0490 (Cyrillic Ghe with upturn) - 1 letter.

Total is 29+3+1=33 letters (excluding the modifier letter apostrophe, 02BC, which is not part of the MSR).

See also:
https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode
https://en.wikipedia.org/wiki/Ukrainian_alphabet
https://tools.ietf.org/html/rfc2319 (KOI8-U encoding for Internet use)

I must also note that RFC 5992 provides incorrect information about Ukrainian. https://tools.ietf.org/html/rfc5992#section-2 says:

<<< 2.9. Ukrainian
The character list for modern Ukrainian has apparently not completely stabilized. Some references claim 31 characters and therefore an additional 8 characters to the Base Cyrillic set of 23. Others claim 33, adding U+0438 and U+0439 and replacing U+044A (Hard Sign) with U+044C (Soft Sign), for a total of an additional 11 characters as compared to the Base Cyrillic set. Unless better information is available, the prudent registry should probably assume that all 34 characters are in use, i.e., the Base Cyrillic set plus U+0438, U+0439, U+0454, U+0456, U+0457, U+0491, U+0449, U+044A, U+044C, U+044E, U+044F.
Per RFC 5992, Base Cyrillic refers to: <<< "Base Cyrillic" consists of the following Unicode code points (names associated with these code points and those below appear in Appendix A): U+0430, U+0431, U+0432, U+0433, U+0434, U+0435, U+0436, U+0437, U+043A, U+043B, U+043C, U+043D, U+043E, U+043F, U+0440, U+0441, U+0442, U+0443, U+0444, U+0445, U+0446, U+0447, U+0448.
So, the advice is to use the 0430 to 0448 range inclusive, plus 0454, 0456, 0457, 0491, 0449, 044A, 044C, 044E, 044F. I don't know which sources the authors consulted. Since at least 1992 (independence of Ukraine), the Ukrainian alphabet has had 33 letters and the apostrophe (34 characters in total). In the Soviet era, 0490 was excluded (banned). Sometimes the soft sign was considered not a letter but a modifier, and was placed at the end of the alphabet (but was still part of it). The table at https://en.wikipedia.org/wiki/Ukrainian_alphabet includes a color-coded chart of the Unicode Cyrillic block.
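As a cross-check, the two sets discussed in this message can be compared mechanically. The following is only a minimal sketch: the helper name `code_points` and the variable names are made up for illustration, the Ukrainian set is the uppercase list given earlier in this message, and the RFC 5992 set is the "all 34 characters" advice as summarised above (0430 to 0448 plus the nine listed code points).

```python
def code_points(*parts):
    """Expand ints and inclusive (start, end) tuples into a set of code points."""
    result = set()
    for part in parts:
        if isinstance(part, tuple):
            start, end = part
            result.update(range(start, end + 1))
        else:
            result.add(part)
    return result


# Ukrainian alphabet as listed in this message (uppercase code points).
UKRAINIAN_UPPER = code_points(
    (0x0410, 0x0429),        # A to Shcha, 26 letters
    0x042C, 0x042E, 0x042F,  # soft sign, Yu, Ya
    0x0404, 0x0406, 0x0407,  # Ukrainian Ie, Ukrainian-Belarusian I, Yi
    0x0490,                  # Ghe with upturn
)

# Lowercase forms, since RFC 5992 lists lowercase code points.
UKRAINIAN_LOWER = {ord(chr(cp).lower()) for cp in UKRAINIAN_UPPER}

# RFC 5992 section 2.9 advice as summarised in this message.
RFC5992_UKRAINIAN = code_points(
    (0x0430, 0x0448),
    0x0449, 0x044A, 0x044C, 0x044E, 0x044F,
    0x0454, 0x0456, 0x0457, 0x0491,
)

# Code points the RFC advice adds beyond the alphabet, and any it omits.
extra_in_rfc = RFC5992_UKRAINIAN - UKRAINIAN_LOWER
missing_from_rfc = UKRAINIAN_LOWER - RFC5992_UKRAINIAN
```

Under these assumptions the alphabet set has 33 entries, the RFC set has 34, and the only difference is U+044A (hard sign), which the RFC advice includes and the alphabet does not.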
Regards, Dusan
Please, feel free to check and add scripts as
On 17.3.2016 21:56, Dmitry Kohmanyuk wrote:
On 3 March 2016, at 20:29, Dusan Stojicevic <dusan@dukes.in.rs> wrote:
Dear all,
Let me remind you about the work done in some friendly organizations: https://tools.ietf.org/html/rfc5992 Also, can we start sending tables?
I am ready - which format should be used? Text - one line per character -
A 0x0401 ...
works for me :)
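A minimal sketch of reading such a file, assuming each non-empty line holds exactly two whitespace-separated fields, a label and a hexadecimal code point. The function name `parse_table` and the sample lines are hypothetical; the real files extracted from the IANA tables may be laid out differently.

```python
def parse_table(text):
    """Parse lines of the form '<label> 0xNNNN' into a {code_point: label} dict."""
    table = {}
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(f"line {line_no}: expected '<label> 0xNNNN', got {raw!r}")
        label, hex_value = fields
        # int(..., 16) accepts an optional '0x' prefix.
        table[int(hex_value, 16)] = label
    return table


# Illustrative input only, not taken from the attached files.
SAMPLE = "А 0x0410\nБ 0x0411\n\nҐ 0x0490\n"
parsed = parse_table(SAMPLE)
```

Keying the result by code point rather than by label makes it easy to merge several national tables later without worrying about look-alike letters.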
We need to get this all together ASAP - I can collate tables together once we have raw data (which would be soon, right?)
so it would be like
Russian = Set 1 + Set 2
Ukrainian = Set 1 + Set 3
Kyrgyz = Set 1 + Set 2 + Set 4
...
which can simplify our work later. It is not strictly necessary; a "columnar table" (letter - which languages use it) would also work.
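Both layouts can be derived from the same raw per-language data. The sketch below is an illustration under stated assumptions: the function names are invented, only two languages are shown, the Russian set (0430 to 044F plus 0451) is my own assumption rather than something from the submitted tables, and the Ukrainian set is the lowercase form of the list discussed in this thread.

```python
from collections import defaultdict


def span(start, end):
    """Inclusive range of code points as a set."""
    return set(range(start, end + 1))


# Assumed lowercase alphabets, for illustration only.
ALPHABETS = {
    "Russian": span(0x0430, 0x044F) | {0x0451},
    "Ukrainian": span(0x0430, 0x0449)
    | {0x044C, 0x044E, 0x044F, 0x0454, 0x0456, 0x0457, 0x0491},
}


def shared_set(alphabets):
    """Code points used by every language (the 'Set 1' idea)."""
    return set.intersection(*alphabets.values())


def columnar(alphabets):
    """Map each code point to the sorted list of languages that use it."""
    usage = defaultdict(list)
    for language in sorted(alphabets):
        for cp in alphabets[language]:
            usage[cp].append(language)
    return dict(usage)


set_1 = shared_set(ALPHABETS)
# Per-language remainder once the shared set is removed.
extras = {lang: letters - set_1 for lang, letters in ALPHABETS.items()}
table = columnar(ALPHABETS)
```

With only these two languages, the shared set has 29 code points and each language keeps 4 of its own. Note that with more languages a single shared set plus one remainder per language will not reproduce overlaps such as "Set 2" being shared by Russian and Kyrgyz; the columnar table captures those directly.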
On 3.3.2016 10:34, Dmitry Kohmanyuk wrote:
Minorities or not, languages should be represented (a language that is the majority language in one country may already cover its minority in another, as Dusan said - but that is not a universal situation.)
Case in point: Tatar language.
-- dk@
On 2 March 2016, at 17:30, Dmitry Belyavsky <beldmit@gmail.com <mailto:beldmit@gmail.com>> wrote:
Dear Dusan,
I think it will be better if we narrow the task by not taking into account minorities with their own Cyrillic scripts. For now it seems a reasonable enough simplification.
Thank you!
On Mon, Feb 15, 2016 at 5:28 PM, Dusan Stojicevic <dusan@dukes.in.rs <mailto:dusan@dukes.in.rs>> wrote:
Dear all,
According to the Proposal... and working plan (0.2 and 0.3), let me suggest the first stage of the work: creating a full set of national scripts, which will be the base table for our work.
Please send your national Cyrillic script table with Unicode labels. If you have a minority in your country using its own Cyrillic script, please send that to the list too, but first check the EGIDS level (https://www.ethnologue.com/about/language-status), which has to be lower than 4. Also, check the Proposal...
One thing: there is no need, for example, for the Serbian representative to send the Bulgarian script because of the Bulgarian minority in Serbia; we already have Bulgarian representatives...
Do you agree with this first step? Any suggestions?
If yes, let us set the first deadline: 7 days from now, or Monday, 21 March 2016.
Regards, Dusan
--- This e-mail has been checked for viruses by Avast antivirus software. https://www.avast.com/antivirus
_______________________________________________ Cyrillicgp mailing list Cyrillicgp@icann.org <mailto:Cyrillicgp@icann.org> https://mm.icann.org/mailman/listinfo/cyrillicgp
-- SY, Dmitry Belyavsky
<Bulgaria.txt><Macedonia.txt><Russia.txt><Serbia.txt><Ukraine.txt>