Hi Mats,

By marks I mean marking the code point as present in a language. Like making a check mark √ for that code point.

Regards,
Hazem Hezzah

From: Mats Dufberg
Sent: Tuesday, June 06, 2017 6:57 PM
To: Hazem Hezzah; 'Latin GP'
Subject: Re: [Latingp] Minutes from the call on May 30, 2017

> 5. For every other language analyzing, if code point found already has a
> mark, go on to next character.
>
> 6. At the end by combining all marks of all languages inspected, I think
> we can consider that the repertoire consists of all marked code points.

I think we should keep a record of each combination of Letter code point and Mark code point (or code points). We should restrict the usage of Marks to those contexts that we can justify from the language material.
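
To make the idea concrete, here is a minimal Python sketch of checking a label against recorded Letter+Mark combinations. The function name `unsupported_mark_uses`, the `allowed` set and the sample data are hypothetical, not anything the group has agreed on. It assumes the text is decomposed with NFD so that marks appear as separate code points, and it checks each mark against its base letter individually.

```python
import unicodedata


def unsupported_mark_uses(label, allowed):
    """Return the (letter, mark) pairs in label that are not in allowed.

    allowed is a set of (letter, mark) tuples taken from the language
    material (hypothetical data structure, for illustration only).
    """
    problems = []
    previous = None  # most recent non-mark character seen
    for ch in unicodedata.normalize("NFD", label):
        if unicodedata.category(ch) == "Mn":  # non-spacing mark
            if (previous, ch) not in allowed:
                problems.append((previous, ch))
        else:
            previous = ch
    return problems


# Example: only "a" + COMBINING RING ABOVE (U+030A) has been recorded.
allowed = {("a", "\u030a")}
ok = unsupported_mark_uses("bl\u00e5", allowed)   # U+00E5 decomposes to a + U+030A
bad = unsupported_mark_uses("x\u030a", allowed)   # ring above on "x" was never recorded
```

A mark that appears with no preceding letter is reported with `None` as its letter.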

Mats

---

Mats Dufberg

DNS Specialist, IIS

Mobile: +46 73 065 3899

https://www.iis.se/en/

From: <latingp-bounces@icann.org> on behalf of Hazem Hezzah <hhezzah.las@gmail.com>
Organization: LAS
Date: Tuesday 6 June 2017 at 15:12
To: 'Latin GP' <latingp@icann.org>
Subject: Re: [Latingp] Minutes from the call on May 30, 2017

Dear all,

After going through this thread of messages, let me suggest the following approach as a start, with the possibility to refine it as we go on.

1. Decide which languages we will include in the first phase of repertoire building. According to the EGIDS scale, that is either levels 1-2 (93 languages), 1-3 (134 languages), or 1-4 (180 languages). I would suggest finishing 1-2 and then going on to 3-4. The 3 Latin EGIDS 0 languages (English, French, Spanish) are included in the level 1 list.

2. Taking the MSR-2 tables as our starting pool, I see that there are already a number of code points ineligible for use in the root zone (white and pink background), so our pool will be only characters with white background. (recitation needed if pink background characters are to be included)

3. Distributing the languages among members for analysis, I would suggest that each member goes through the characters of the language at hand and makes a mark on each code point found in the MSR.

4. If any character is not found in the MSR, make a note to look into it later.

5. For every other language analyzed, if a code point found already has a mark, go on to the next character.

6. At the end, by combining all marks of all languages inspected, I think we can consider that the repertoire consists of all marked code points.

7. For the missing code points that were noted, we should decide what to do.
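
As a rough illustration of steps 3 to 7, the marking could be sketched like this in Python. The function name `build_repertoire`, the language names and the tiny pool are made up for illustration; the real pool would come from the MSR-2 tables, and the per-language character lists from the members' analysis.

```python
def build_repertoire(msr_pool, language_chars):
    """Mark code points found in the MSR pool; note the ones that are not.

    msr_pool: set of eligible code points (integers).
    language_chars: dict mapping a language name to its characters.
    Returns (marked, missing), where missing maps a code point to the
    languages that needed it but did not find it in the pool.
    """
    marked = set()
    missing = {}
    for language in sorted(language_chars):
        for ch in language_chars[language]:
            cp = ord(ch)
            if cp in marked:
                continue  # step 5: already marked by an earlier language
            if cp in msr_pool:
                marked.add(cp)  # step 3: make a mark
            else:
                # step 4: take a note to look into it later
                notes = missing.setdefault(cp, [])
                if language not in notes:
                    notes.append(language)
    return marked, missing  # step 6: marked is the candidate repertoire


# Hypothetical data: a three-code-point pool and two languages.
pool = {0x61, 0x62, 0xE5}
marked, missing = build_repertoire(
    pool, {"lang_one": "ab\u00e5", "lang_two": "a\u014b"}
)
```

Here `missing` would hold U+014B for `lang_two`, which is the list to be decided on in step 7. The sketch treats every code point separately and does nothing about combinations.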

Any suggestions for handling combinations?

Welcoming your opinions.

Regards,

Hazem Hezzah

From: latingp-bounces@icann.org [mailto:latingp-bounces@icann.org] On Behalf Of Bill Jouris
Sent: Monday, 05 June, 2017 18:19
To: Mats Dufberg
Cc: Latin GP
Subject: Re: [Latingp] Minutes from the call on May 30, 2017

Pardon my ignorance -- being with the Variant group, I didn't realize that you folks in Repertoire were considering any other approach.

Starting with the most used languages, and then working thru as many of the less used ones as time allows, seems like the obvious approach. Were you guys actually considering a different one?

Bill Jouris
Inside Products
bill.jouris@insidethestack.com
831-659-8360
925-855-9512 (direct)

From: Mats Dufberg <mats.dufberg@iis.se>
To: Bill Jouris <bill.jouris@insidethestack.com>
Cc: Latin GP <latingp@icann.org>; "ahmedbakhat@yahoo.com" <ahmedbakhat@yahoo.com>
Sent: Monday, June 5, 2017 9:11 AM
Subject: Re: [Latingp] Minutes from the call on May 30, 2017

Bill,

Inclusion of languages or code points?

The only code points that we can include are the code points that we have confirmed to be used by languages according to the criteria, i.e. the language must be high enough on the EGIDS scale (low number) and the usage in some language should be contemporary and established. All other code points are excluded.

The number of languages is high. That is a fact. The only way to reduce the number of languages is to move the border higher up in scale.

My suggestion is that we should start working by taking the languages highest up on the scale (0-2) and get some experience from that. When we see what we get, we can move into languages 3-4.
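
The phased ordering could be sketched as follows, only as an illustration. The function name `phase_languages` and the placeholder language names and levels are hypothetical; the real EGIDS levels would come from Ethnologue.

```python
def phase_languages(egids_by_language, phases=((0, 2), (3, 4))):
    """Split languages into work phases by EGIDS level (lower = stronger).

    egids_by_language: dict mapping a language name to its EGIDS level.
    phases: inclusive (low, high) level ranges, in working order.
    Languages outside every range are left out.
    """
    batches = []
    for low, high in phases:
        batch = sorted(
            name
            for name, level in egids_by_language.items()
            if low <= level <= high
        )
        batches.append(batch)
    return batches


# Placeholder names and levels, not real Ethnologue data.
first, second = phase_languages(
    {"lang_a": 0, "lang_b": 2, "lang_c": 3, "lang_d": 5}
)
```

With this placeholder data, `first` holds the languages at levels 0-2 and `second` those at levels 3-4; `lang_d` falls below the border and is excluded.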

https://www.ethnologue.com/about/language-status

Yours,

Mats

---

Mats Dufberg

DNS Specialist, IIS

Mobile: +46 73 065 3899

https://www.iis.se/en/

From: Bill Jouris <bill.jouris@insidethestack.com>
Reply-To: Bill Jouris <bill.jouris@insidethestack.com>
Date: Monday 5 June 2017 at 17:38
To: Mats Dufberg <mats.dufberg@iis.se>, "ahmedbakhat@yahoo.com" <ahmedbakhat@yahoo.com>
Cc: Latin GP <latingp@icann.org>
Subject: Re: [Latingp] Minutes from the call on May 30, 2017

Given the enormous number of languages involved, perhaps it would be better to establish which ones will be included at this time. That is, go for inclusion, rather than exclusion.

And then, separately, principles and processes for including the occasional additional codepoint, if a language which we did not get thru in this initial effort requires it.

Bill Jouris
Inside Products
bill.jouris@insidethestack.com
831-659-8360
925-855-9512 (direct)

From: Mats Dufberg <mats.dufberg@iis.se>
To: "ahmedbakhat@yahoo.com" <ahmedbakhat@yahoo.com>
Cc: Latin GP <latingp@icann.org>
Sent: Monday, June 5, 2017 3:09 AM
Subject: Re: [Latingp] Minutes from the call on May 30, 2017

Ahmed,

If you start with MSR -- or actually MSR2 -- and try to find languages that support the inclusion of its code points, you would never be able to confirm that no code points outside of MSR2 are needed to support the languages that the Latin GP wants to support. I do not say that such code points will be included, but we should be aware of any limitation in the support of the languages that are claimed to be supported.

If there is any code point in MSR2 not used by any language we would have to investigate every language anyway to confirm that the code point can be excluded.

Besides the Latin code points there are non-spacing marks that are used in combination with Latin code points. Those combinations could have different status in a language, either being considered a character in its own right or being a modified character. In the repertoire that the Latin GP proposes, such non-spacing marks should be limited to just those combinations that are really used in the languages that the group wants to support. To find those combinations we have to investigate all languages.
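
One possible way to collect such combinations from sample text is sketched below in Python. The function name `letter_mark_combinations` is made up for illustration, and the sketch assumes that NFD decomposition is an acceptable way to separate a letter from its marks; whether a precomposed character should instead count as a code point of its own is exactly the kind of status question mentioned above.

```python
import unicodedata


def letter_mark_combinations(text):
    """Collect (base, marks) pairs found in text after NFD decomposition.

    marks is a tuple of the non-spacing marks (category Mn) that follow
    the base character. Characters without marks are not recorded.
    """
    combos = set()
    base = None
    marks = []
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.category(ch) == "Mn":
            if base is not None:
                marks.append(ch)
        else:
            if base is not None and marks:
                combos.add((base, tuple(marks)))
            base = ch
            marks = []
    if base is not None and marks:
        combos.add((base, tuple(marks)))
    return combos


# U+00E9 decomposes to e + U+0301; n + U+0303 is already decomposed.
found = letter_mark_combinations("\u00e9 n\u0303")
```

Running this over the material for each language and keeping the results per language would give the list of combinations that are actually in use.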

Another aspect is that the method of going code point by code point in MSR2 requires that we already know where to find what we are looking for. And when you start studying the material for a language, the hardest step can be to find sources and understand what they say. After that it could be more straightforward to extract the characters. -- I do not claim that the task is simple. In my work for ICANN Pre-Delegation Testing, I have already done that. There are many grey areas, but it is our task to dig into them.

There is no other way than going through all the languages.

Yours,

Mats

---

Mats Dufberg

DNS Specialist, IIS

Mobile: +46 73 065 3899

https://www.iis.se/en/

From: <latingp-bounces@icann.org> on behalf of Ahmed Bakhat via Latingp <latingp@icann.org>
Reply-To: "ahmedbakhat@yahoo.com" <ahmedbakhat@yahoo.com>
Date: Sunday 4 June 2017 at 15:35
To: "textualsolutions@gmail.com" <textualsolutions@gmail.com>, Mirjana Tasić <Mirjana.Tasic@rnids.rs>
Cc: Latin GP <latingp@icann.org>
Subject: Re: [Latingp] Minutes from the call on May 30, 2017

I raised the issue during the meeting to work on Repertoire, as this group has not yet started its meetings, to devise principles for inclusion / exclusion, so that we should have solid grounds to include code points. Furthermore, someone has to present on behalf of the group what we have done and what the way forward is.

Regarding my email containing draft principles, I wanted to communicate that before going for any strategy (either inclusion of code points on the basis of language or on the basis of MSR) we should have principles for it. In my perception, going by MSR would be much easier than going by languages, as it would take years to finish 180 languages.

It doesn't mean at all that we will start from zero. Mirjana has already done most of the work, so we can quickly go through it and work on the rest of the code points.

Best Regards,

Ahmed Bakht


On Sun, 4 Jun 2017 at 3:52 pm, Textual Solutions <textualsolutions@gmail.com> wrote:

_______________________________________________
Latingp mailing list
Latingp@icann.org
https://mm.icann.org/mailman/listinfo/latingp
