Dear Dusan,

The Bulgarian table is OK. This is what we already use for .bg. In addition to these letters, the hyphen (U+002D) and the digits zero to nine (U+0030 .. U+0039) are permitted. So we can consider that the correct set for Bulgarian.

We also discussed the use of U+045D (CYRILLIC SMALL LETTER I WITH GRAVE), which is an interesting case. The character exists in the language, but not in the alphabet. It is considered a “stressed” I. Because of this, it usually does not exist on keyboards, although in recent years “Bulgarian” keyboards have appeared, usually in various manufacturers' laptops, that have it as a separate key! This character was used heavily in printed media a few decades ago, but with the introduction of computers, computerized typesetting and the general ASCII-sation of the world, it almost faded away. It is now sort of returning to the language.

We have not considered the use of this character in the Bulgarian IDN tables so far, although it might come in useful for some specific domain names ;-) It will probably be more useful at the second level anyway, so we could consider it later.

Best Regards,
Daniel Kalchev
Register.BG
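The permitted set described above (Bulgarian letters, hyphen, digits) can be sketched as a simple label check. This is a minimal Python sketch, not the .bg registry's implementation: the letter set is assumed to be the 30 lowercase letters of the modern Bulgarian alphabet, which may differ from the attached table, and the name `disallowed` is hypothetical.

```python
# Assumption: the 30 lowercase letters of the modern Bulgarian alphabet
# (U+0430..U+044A plus U+044C, U+044E, U+044F); the attached table may differ.
BULGARIAN_LETTERS = set(range(0x0430, 0x044A + 1)) | {0x044C, 0x044E, 0x044F}

# Hyphen (U+002D) and digits zero to nine (U+0030..U+0039), as stated above.
PERMITTED = BULGARIAN_LETTERS | {0x002D} | set(range(0x0030, 0x0039 + 1))


def disallowed(label):
    """Return the characters of `label` that are outside the permitted set."""
    return [ch for ch in label if ord(ch) not in PERMITTED]


if __name__ == "__main__":
    print(disallowed("българия-2016"))  # every character is permitted
    print(disallowed("\u045d"))         # U+045D is not in the set
```

Adding U+045D later would only require extending `PERMITTED`; whether that is desirable is the policy question raised above.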
On 29.03.2016, at 13:55, Dusan Stojicevic <dusan@dukes.in.rs> wrote:
Thanks Dmitry.
@Iliya, Daniel or Nelly, can You check the Bulgarian table?
Also, Sanja, can You check the Macedonian table?
Both are in my previous mails.
Dmitry (Belyavsky) - can we agree on the Russian table only? I am not happy about using letters without origin. Can we start with the Russian set and add the rest during the work? The .SU table can be our goal.
Also, Dmitry (Kohmanyuk) - You will compile everything into one table?
@Almaz - I assume that the Kyrgyz table is done. Correct?
Let us try to finish this stage. We will extend the table during the work; we have no tables from Mongolia, for example. Opinions, comments?
Cheers, Dusan
On 22.3.2016 17:08, Dmitry Kohmanyuk wrote:
On 19 March 2016, at 16:40, Dusan Stojicevic <dusan@dukes.in.rs> wrote:
Dear Dmitry, all,
Sorry for the late response, I had a big DIDS event in Belgrade this week.
So, first of all, some minutes from the Cyrillic GP meeting at ICANN 55 in Marrakesh.
No problem and thank you!
with one point of action. By 21 March, Iliya, Dmitry and I (and all of You who want to help) will try to finish the first step: creation of a full set of national scripts, which will be the base table for our work.
According to this action item, You can find attached .txt files extracted from the IANA tables. @Dmitry, @Iliya - is the format ok?
Format is ok - I see there are some extra characters in the Ukrainian table. Let me provide the proper character set. The letters are, in order (I use uppercase variants here):
Base Cyrillic set: 0410 to 0429 (26 letters, A to Shcha), 042C (soft sign), 042E (YU), 042F (YA) - total 29, excludes 042A, 042B, 042D.
Cyrillic extensions: 0404 (Ukrainian IE), 0406 (Ukrainian-Belarusian I), 0407 (Ukrainian YI) - total 3 letters.
Extended Cyrillic: 0490 - Cyrillic Ghe with upturn - 1 letter.
Total is 29+3+1=33 letters (excluding the modifier letter apostrophe, 02BC, which is not part of the MSR).
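The counts above can be checked mechanically. A minimal Python sketch, using only the code points listed in this message; the variable names are illustrative, and the lowercase forms are derived with Python's built-in Unicode case mapping:

```python
# Uppercase Ukrainian letters, built from the ranges listed above.
base = set(range(0x0410, 0x0429 + 1)) | {0x042C, 0x042E, 0x042F}
extensions = {0x0404, 0x0406, 0x0407}   # Ukrainian IE, I, YI
extended = {0x0490}                     # GHE WITH UPTURN

ukrainian_upper = base | extensions | extended

# Lowercase counterparts (the forms relevant for IDN labels).
ukrainian_lower = {ord(chr(cp).lower()) for cp in ukrainian_upper}

if __name__ == "__main__":
    print(len(base), len(extensions), len(extended), len(ukrainian_upper))
    print("".join(chr(cp) for cp in sorted(ukrainian_upper)))
    print(" ".join(f"{cp:04X}" for cp in sorted(ukrainian_lower)))
```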
See also: https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode https://en.wikipedia.org/wiki/Ukrainian_alphabet https://tools.ietf.org/html/rfc2319 (KOI8-U encoding for Internet use)
I must also note that RFC 5992 provides incorrect information about Ukrainian: https://tools.ietf.org/html/rfc5992#section-2 says:
<<< 2.9. Ukrainian
The character list for modern Ukrainian has apparently not completely stabilized. Some references claim 31 characters and therefore an additional 8 characters to the Base Cyrillic set of 23. Others claim 33, adding U+0438 and U+0439 and replacing U+044A (Hard Sign) with U+044C (Soft Sign), for a total of an additional 11 characters as compared to the Base Cyrillic set. Unless better information is available, the prudent registry should probably assume that all 34 characters are in use, i.e., the Base Cyrillic set plus U+0438, U+0439, U+0454, U+0456, U+0457, U+0491, U+0449, U+044A, U+044C, U+044E, U+044F.
Per RFC 5992, Base Cyrillic refers to:
<<< "Base Cyrillic" consists of the following Unicode code points (names associated with these code points and those below appear in Appendix A): U+0430, U+0431, U+0432, U+0433, U+0434, U+0435, U+0436, U+0437, U+043A, U+043B, U+043C, U+043D, U+043E, U+043F, U+0440, U+0441, U+0442, U+0443, U+0444, U+0445, U+0446, U+0447, U+0448.
So the advice is to use the range 0430 to 0448 inclusive, plus 0454, 0456, 0457, 0491, 0449, 044A, 044C, 044E, 044F.
I don't know which sources the authors consulted. Since at least 1992 (independence of Ukraine), the Ukrainian alphabet has had 33 letters and the apostrophe (34 characters in total). In the Soviet era, 0490 was excluded (banned). Sometimes the soft sign was considered a modifier rather than a letter and was placed at the end of the alphabet (but still part of it).
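The difference between the RFC 5992 advice and the 33-letter alphabet can be made explicit with a set comparison. A minimal Python sketch, using the lowercase code points quoted above; the names `rfc_advice` and `alphabet` are illustrative only:

```python
import unicodedata

# Set advised by RFC 5992 section 2.9, as summarised above (lowercase).
rfc_advice = set(range(0x0430, 0x0448 + 1)) | {
    0x0454, 0x0456, 0x0457, 0x0491, 0x0449, 0x044A, 0x044C, 0x044E, 0x044F,
}

# Lowercase forms of the 33 letters listed earlier in this message.
alphabet = set(range(0x0430, 0x0449 + 1)) | {
    0x044C, 0x044E, 0x044F, 0x0454, 0x0456, 0x0457, 0x0491,
}

extra = rfc_advice - alphabet     # advised by the RFC, not in the alphabet
missing = alphabet - rfc_advice   # in the alphabet, not advised by the RFC

if __name__ == "__main__":
    print(len(rfc_advice), len(alphabet))
    for cp in sorted(extra):
        print(f"extra   U+{cp:04X} {unicodedata.name(chr(cp))}")
    for cp in sorted(missing):
        print(f"missing U+{cp:04X} {unicodedata.name(chr(cp))}")
```

The apostrophe (02BC) is left out of both sets here, as it is in the letter count above.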
The table at https://en.wikipedia.org/wiki/Ukrainian_alphabet has color-coded area of Unicode Cyrillic block.
Regards, Dusan
Please, feel free to check and add scripts as
On 17.3.2016 21:56, Dmitry Kohmanyuk wrote:
On 3 March 2016, at 20:29, Dusan Stojicevic <dusan@dukes.in.rs> wrote:
Dear all,
Let me remind You about the work done in some friendly organizations: https://tools.ietf.org/html/rfc5992
Also, can we start to send tables?
I am ready - which format should be used? Text - one line per character -
A 0x0401 ...
works for me :)
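A one-line-per-character text file of this kind is easy to read back in. A minimal Python sketch, assuming each non-empty line is a label followed by a hexadecimal code point; the function name `parse_table` and the sample data are hypothetical, and the real attached files may use a different layout:

```python
def parse_table(text):
    """Parse lines of the form '<label> 0x<hex>' into {code_point: label}."""
    table = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(f"line {lineno}: expected '<label> 0x<hex>'")
        label, code = parts
        table[int(code, 16)] = label  # int() accepts the '0x' prefix in base 16
    return table


if __name__ == "__main__":
    sample = "Ё 0x0401\nА 0x0410\nБ 0x0411\n"  # hypothetical sample data
    for cp, label in sorted(parse_table(sample).items()):
        print(f"U+{cp:04X} {label}")
```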
We need to get this all together ASAP - I can collate tables together once we have raw data (which would be soon, right?)
so it would be like
Russian = Set 1 + Set 2
Ukrainian = Set 1 + Set 3
Kyrgyz = Set 1 + Set 2 + Set 4
...
which can simplify our work later. It is not strictly necessary; a "columnar table" (letter - which languages use it) would also work.
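Both layouts mentioned above (shared set plus language-specific sets, or a columnar letter-to-languages table) can be derived from the per-language tables. A minimal Python sketch; the Russian and Ukrainian sets here are assumptions filled in from the standard lowercase alphabets, not taken from the attached files, and "Set 1" is taken to mean the intersection of all tables:

```python
# Assumed lowercase alphabets (not taken from the attached tables).
tables = {
    "Russian": set(range(0x0430, 0x044F + 1)) | {0x0451},
    "Ukrainian": set(range(0x0430, 0x0449 + 1))
    | {0x044C, 0x044E, 0x044F, 0x0454, 0x0456, 0x0457, 0x0491},
}

# "Set 1": code points shared by every language.
shared = set.intersection(*tables.values())

# Language-specific remainders, so that language = shared + specific[language].
specific = {name: cps - shared for name, cps in tables.items()}

# Columnar view: code point -> sorted list of languages that use it.
columns = {}
for name, cps in tables.items():
    for cp in cps:
        columns.setdefault(cp, []).append(name)
columns = {cp: sorted(names) for cp, names in sorted(columns.items())}

if __name__ == "__main__":
    print("shared:", len(shared))
    for name, cps in specific.items():
        print(name, "+", " ".join(f"{cp:04X}" for cp in sorted(cps)))
    for cp, names in columns.items():
        print(f"{cp:04X} {chr(cp)} {', '.join(names)}")
```

With more languages, a plain intersection shrinks quickly, so hand-picked shared sets as proposed above may work better in practice.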
On 3.3.2016 10:34, Dmitry Kohmanyuk wrote:
Minorities or not, languages should be represented (a language with a majority in one country may cover a minority in another, as Dusan said, but that is not a universal situation).
Case in point: Tatar language.
-- dk@
On 2 March 2016, at 17:30, Dmitry Belyavsky <beldmit@gmail.com> wrote:
> Dear Dusan,
>
> I think that if we narrow the task not taking into account the
> minorities with own Cyrillic scripts, it will be better.
> For now it seems a reasonable enough simplification.
>
> Thank you!
>
> On Mon, Feb 15, 2016 at 5:28 PM, Dusan Stojicevic <dusan@dukes.in.rs> wrote:
>
> Dear all,
>
> According to the Proposal... and working plan (0.2 and 0.3), let me
> suggest first stage of the work: creation of a full set of national
> scripts, which will be the base table for our work.
>
> Please send Your national Cyrillic script table with Unicode labels.
> If You have a minority in Your country using their own Cyrillic script,
> please send them too on the list, but check first EGIDS level
> (https://www.ethnologue.com/about/language-status) which has to be
> smaller than 4. Also, check the Proposal...
>
> One thing: there is no need, for example, for Serbian representative to
> send Bulgarian script because of the Bulgarian minority in Serbia, we
> already have Bulgarian representatives...
>
> Do You agree with this first step? Any suggestion?
>
> If yes, let us set first deadline - 7 days from now, or Monday, 21 March
> 2016.
>
> Regards,
> Dusan
>
> ---
> This e-mail has been checked for viruses by Avast antivirus software.
> https://www.avast.com/antivirus
>
> _______________________________________________
> Cyrillicgp mailing list
> Cyrillicgp@icann.org
> https://mm.icann.org/mailman/listinfo/cyrillicgp
>
> --
> SY, Dmitry Belyavsky
<Bulgaria.txt><Macedonia.txt><Russia.txt><Serbia.txt><Ukraine.txt>
_______________________________________________ Cyrillicgp mailing list Cyrillicgp@icann.org https://mm.icann.org/mailman/listinfo/cyrillicgp