> 5. For every other language analyzing, if code point found already has a mark, go on to next character.
> 6. At the end by combining all marks of all languages inspected, I think we can consider that the repertoire consists of all marked code points.
I think we should keep a record of each combination of a Letter code point and a Mark code point (or code points). I think we should restrict the usage of Marks to those contexts that we can motivate from the language material.
Mats
---
Mats Dufberg
DNS Specialist, IIS
Mobile: +46 73 065 3899
https://www.iis.se/en/
From: <latingp-bounces@icann.org> on behalf of Hazem Hezzah <hhezzah.las@gmail.com>
Organization: LAS
Date: Tuesday 6 June 2017 at 15:12
To: 'Latin GP' <latingp@icann.org>
Subject: Re: [Latingp] Minutes from the call on May 30, 2017
Dear all,
After going through this thread of messages, let me suggest the following approach as a start, with the possibility to refine it as we go on.
1. Decide which languages we will include in the first phase of repertoire building. According to the EGIDS scale, either 1-2 (93 languages), 1-3 (134 languages), or 1-4 (180 languages). I would suggest finishing 1-2, then going on to 3-4. The 3 Latin EGIDS 0 languages (English, French, Spanish) are included in the 1 list.
2. Taking the MSR-2 tables as our starting pool, I see that there are already a number of code points ineligible for use in the root zone (white and pink background), so our pool will be only characters with white background. (recitation needed if pink background characters are to be included)
3. When distributing the languages among members for analysis, I would suggest that each member goes through the characters of the language in hand and makes a mark on each code point found in the MSR.
4. If any character is not found in the MSR, make a note to look into it later.
5. For every other language being analyzed, if a code point found already has a mark, go on to the next character.
6. At the end, by combining all marks of all languages inspected, I think we can consider that the repertoire consists of all marked code points.
7. For the missing code points noted along the way, we should decide what to do with them.
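
The marking procedure in steps 3 to 7 can be sketched as below. This is only an illustration under stated assumptions: `build_repertoire`, the tiny `msr_pool` set, and the per-language character strings are made up for the example and are not the real MSR-2 tables or real language data (the tag "xx" is a placeholder).

```python
def build_repertoire(language_chars, msr_pool):
    """Mark pool code points used by any language; note the ones not in the pool.

    language_chars: dict mapping a language tag to a string of its characters.
    msr_pool: set of eligible code points (integers) taken from the MSR.
    Returns (marked, missing), where missing maps a code point to the
    languages that need it.
    """
    marked = set()   # steps 3 and 6: the union of all marks is the repertoire
    missing = {}     # steps 4 and 7: code points to decide on later
    for language, chars in language_chars.items():
        for ch in chars:
            code_point = ord(ch)
            if code_point in marked:
                continue  # step 5: already marked by an earlier language
            if code_point in msr_pool:
                marked.add(code_point)
            else:
                missing.setdefault(code_point, set()).add(language)
    return marked, missing


# Hypothetical pool: a-z plus three precomposed letters.
msr_pool = set(range(0x61, 0x7B)) | {0xE5, 0xE4, 0xF6}
language_chars = {
    "sv": "abc\u00e5\u00e4\u00f6",
    "xx": "ab\u0253",  # U+0253 is not in the hypothetical pool
}
marked, missing = build_repertoire(language_chars, msr_pool)
print(sorted(f"U+{cp:04X}" for cp in marked))
print({f"U+{cp:04X}": sorted(langs) for cp, langs in missing.items()})
```

The sketch treats every character in isolation, so it does not answer the question of combinations; sequences of a letter plus marks would need a separate record.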
Any suggestions for handling combinations?
Welcoming your opinions.
Regards,
Hazem Hezzah
From: latingp-bounces@icann.org [mailto:latingp-bounces@icann.org] On Behalf Of Bill Jouris
Sent: Monday, 05 June, 2017 18:19
To: Mats Dufberg
Cc: Latin GP
Subject: Re: [Latingp] Minutes from the call on May 30, 2017
Pardon my ignorance -- being with the Variant group, I didn't realize that you folks in Repertoire were considering any other approach.
Starting with the most used languages, and then working thru as many of the less used ones as time allows, seems like the obvious approach. Were you guys actually considering a different one?
Bill Jouris
Inside Products
bill.jouris@insidethestack.com
831-659-8360
925-855-9512 (direct)
From: Mats Dufberg <mats.dufberg@iis.se>
To: Bill Jouris <bill.jouris@insidethestack.com>
Cc: Latin GP <latingp@icann.org>; "ahmedbakhat@yahoo.com" <ahmedbakhat@yahoo.com>
Sent: Monday, June 5, 2017 9:11 AM
Subject: Re: [Latingp] Minutes from the call on May 30, 2017
Bill,
Inclusion of languages or code points?
The only code points that we can include are the code points that we have confirmed to be used by languages according to the criteria, i.e. the language must be high enough on the EGIDS scale (low number) and the usage in some language should be contemporary and established. All other code points are excluded.
The number of languages is high. That is a fact. The only way to reduce the number of languages is to move the border higher up on the scale.
My suggestion is that we should start working by taking the languages highest up on the scale (0-2) and get some experience from that. When we see what we get, we can move into languages 3-4.
Yours,
Mats
From: Bill Jouris <bill.jouris@insidethestack.com>
Reply-To: Bill Jouris <bill.jouris@insidethestack.com>
Date: Monday 5 June 2017 at 17:38
To: Mats Dufberg <mats.dufberg@iis.se>, "ahmedbakhat@yahoo.com" <ahmedbakhat@yahoo.com>
Cc: Latin GP <latingp@icann.org>
Subject: Re: [Latingp] Minutes from the call on May 30, 2017
Given the enormous number of languages involved, perhaps it would be better to establish which ones will be included at this time. That is, go for inclusion, rather than exclusion.
And then, separately, principles and processes for including the occasional additional codepoint, if a language which we did not get thru in this initial effort requires it.
Bill Jouris
Inside Products
bill.jouris@insidethestack.com
831-659-8360
925-855-9512 (direct)
From: Mats Dufberg <mats.dufberg@iis.se>
To: "ahmedbakhat@yahoo.com" <ahmedbakhat@yahoo.com>
Cc: Latin GP <latingp@icann.org>
Sent: Monday, June 5, 2017 3:09 AM
Subject: Re: [Latingp] Minutes from the call on May 30, 2017
Ahmed,
If you start with MSR -- or actually MSR2 -- and try to find languages that support the inclusion of its code points, you would never be able to confirm that no code points outside of MSR2 are needed to support the languages that the Latin GP wants to support. I do not say that such code points will be included, but we should be aware of any limitation in the support of the languages that are claimed to be supported.
If there is any code point in MSR2 not used by any language, we would have to investigate every language anyway to confirm that the code point can be excluded.
Besides the Latin code points there are non-spacing marks that are used in combination with Latin code points. Those combinations could have a different status depending on the language, either being considered a character on its own or being a modified character. In the repertoire that the Latin GP suggests, such non-spacing marks are limited to just those combinations that are really used in the languages that the group wants to support. To find those combinations we have to investigate all languages.
Another aspect is that the method of going code point by code point in MSR2 requires that we already know where to find what we are looking for. And when you start studying the material for a language, the hardest step can be to find sources and understand what they say. After that it could be more straightforward to extract the characters. -- I do not claim that the task is simple. In my work for ICANN Pre-Delegation Testing, I have already done that. There are many grey areas, but that is our task to dig into.
There is no other way than going through all the languages.
Yours,
Mats
From: <latingp-bounces@icann.org> on behalf of Ahmed Bakhat via Latingp <latingp@icann.org>
Reply-To: "ahmedbakhat@yahoo.com" <ahmedbakhat@yahoo.com>
Date: Sunday 4 June 2017 at 15:35
To: "textualsolutions@gmail.com" <textualsolutions@gmail.com>, Mirjana Tasić <Mirjana.Tasic@rnids.rs>
Cc: Latin GP <latingp@icann.org>
Subject: Re: [Latingp] Minutes from the call on May 30, 2017
During the meeting, I raised the issue of working on the Repertoire, as this group has not yet started its meetings, in order to devise principles for inclusion / exclusion, so that we have solid grounds to include code points. Furthermore, someone has to present, on behalf of the group, what we have done and what the way forward is.
Regarding my email containing draft principles, I wanted to communicate that before going for any strategy (inclusion of code points either on the basis of language or on the basis of MSR) we should have principles for it. In my perception, going by the MSR is much easier than going by languages, as it would take years to finish 180 languages.
It doesn't mean at all that we will start from zero. Mirjana has already done most of the work, so we can quickly go through it and work on the rest of the code points.
Best Regards,
Ahmed Bakht
On Sun, 4 Jun 2017 at 3:52 pm, Textual Solutions <textualsolutions@gmail.com> wrote:
_______________________________________________
Latingp mailing list
Latingp@icann.org
https://mm.icann.org/mailman/listinfo/latingp