[LSD PDP] Review Request: Draft Message for Early Input Request
Dear LSD PDP WG,
In accordance with the GNSO’s PDP requirements, the first step of the process after the WG is formed is to seek input from each Supporting Organization, Advisory Committee, and GNSO Stakeholder Group/Constituency on the topics within the WG Charter. As we prepare for the Kick-off meeting, please share any concerns or suggested edits through sidebar comments within the Google Doc <https://docs.google.com/document/d/1omNCnlnVUVecAPSrTkJgK6Ien_sAKApD/edit?us...> by Wednesday, 05 March, at 23:59 UTC.
Any substantive concerns or comments will be addressed during the Kick-off (08 March). Subsequently, the finalized message will be sent out to the SO/AC/SG/Cs by mid-to-late March.
Kind regards,
Saewon
Saewon Lee
Policy Development Support Manager for GNSO
Internet Corporation for Assigned Names and Numbers (ICANN)
Mobile: +1 (310) 463-5541
Email: saewon.lee@icann.org
Skype: saewon.lee.icann
Website: www.icann.org
Dear all,
This is not really a substantial concern, but I noticed a couple of small errors in the Proposal for Latin Root Zone document, specifically in the included code points table (5.3.), languages using the code point column:
š (latin small letter s with caron) and ž (latin small letter z with caron) are also used in Finnish, and their behaviour is unusual enough and possibly relevant to us that it should be discussed at some point.
æ (latin small letter ae) and ø (latin small letter o with stroke) are also used in Norwegian.
ü (latin small letter u with diaeresis) is *not* used in Swedish unless you count proper names of foreign origin, in which case it should also be included in Finnish along with é and some others.
Whether those actually matter for us, I'm not sure. But as an observation, Finnish has historically treated ü as a variant of y, not of u, and likewise å (a with ring above) as a variant of o rather than a, and although things have become muddled since the advent of computers, a Finnish speaker might still consider "yber" and "über" equivalent and potentially confuse them with "uber" as well. There are other similar cases, too.
Regards,
-- Tapani Tarvainen
On Mon, Feb 24, 2025 at 07:46:54PM +0000, Saewon Lee via Gnso-latin-diacritics (gnso-latin-diacritics@icann.org) wrote:
Tapani,
As far as my understanding so far goes, and this is really just my personal assessment: since the decision of the Latin RZ-LGR WG was not to handle individual cases such as the ones you mentioned, our task in this WG will be to architect a solution that does not depend on individual linguistic cases, but rather on the rationale for the double registration of a non-variant according to the RZ-LGR.
In PT-BR, an example that I can think of is my home city of São Paulo, which is honestly written both with and without the tilde, making both "Latin Small Letter A with Tilde" and ASCII table 097 "a" into equally valid representations.
But again, just my personal understanding.
Best,
On February 26, 2025 6:23:14 PM UTC, Tapani Tarvainen via Gnso-latin-diacritics <gnso-latin-diacritics@icann.org> wrote:
--- Mark W. Datysgeld from Governance Primer
Hi Mark,
Perhaps I misunderstood your point, but just to clarify: I did not mean that we should resolve individual cases, but that they would be useful for testing possible general solutions against. Indeed I'm not sure how else we could evaluate how well a possible general solution would work.
Best regards,
Tapani
On Wed, Feb 26, 2025 at 07:48:19PM +0000, Mark W. Datysgeld (mark@governanceprimer.com) wrote:
-- Tapani Tarvainen
Hi Tapani and all.
The LGR tables are “there” and (thankfully!!!) it is not our job to discuss/evaluate/amend them. They contain the list of valid code points and make some decisions on what are “variants” or, in fact, what are not. Beyond that, it is embarrassing to see the number of debatable explanations and outright mistakes they contain, but this is beyond our role here.
Our point is not whether SaoTapani or SãoTapani (with a tilde over the a or not) is correct in Portuguese, or Finnish, or Italian, or Quechua. Nor is it whether SaoTapani is a correct geographic name, a real family name or a real brand name.
Let’s take “Häagen-Dasz”, a coined term meant to mean nothing in any language and not to be correct in any language. Our point is to discuss whether .häagendasz and .haagendasz may coexist and, if yes, under which conditions. “Correctness” of the labels in any sense other than “valid for TLD labels” is out of scope.
Regards
Amadeu
On 26 Feb 2025, at 21:52, Tapani Tarvainen via Gnso-latin-diacritics <gnso-latin-diacritics@icann.org> wrote:
_______________________________________________
Gnso-latin-diacritics mailing list -- gnso-latin-diacritics@icann.org
To unsubscribe send an email to gnso-latin-diacritics-leave@icann.org
Hi all,
I may have misunderstood our mandate, but perhaps this discussion is useful just for clarifying it.
On Wed, Feb 26, 2025 at 11:30:50PM +0100, Amadeu Abril i Abril (CORE) (amadeu.abril@corenic.org) wrote:
> The LGR tables are “there” and (thankfully!!!) it is not our job to discuss/evaluate/amend them.
I understand that we're not supposed to amend them, but I don't see how we can do anything if we aren't allowed to discuss them. They do have an impact on what we're supposed to do, don't they?
> Let’s take “Häagen-Dasz”, a coined term meant to mean nothing in any language and not to be correct in any language. Our point is to discuss whether .häagendasz and .haagendasz may coexist and, if yes, under which conditions. “Correctness” of the labels in any sense other than “valid for TLD labels” is out of scope.
How can we decide, or even meaningfully discuss, what should or should not be valid for TLD labels without considering the underlying language issues? The real-life impact of the decision can be drastically different for real words and names than for such made-up words. If we can't take that into account, I don't see what we have to do at all.
-- Tapani Tarvainen
Hi Mark,
In fact, the Latin Generation Panel, as we understood our charge, looked *only* at distinguishable visual similarity. The only case of equally valid representation that we accepted was the German sharp S / double S. (Perhaps because two of the seven members were native speakers of German.)
My understanding, slightly different from yours, is that we are looking both at a paradigm for cases where ASCII has been used because diacritics were not available and at how to deal with cases like the one you mention, where there are multiple valid representations of the same characters in some languages.
Bill
On Wed, Feb 26, 2025 at 11:48 AM, Mark W. Datysgeld via Gnso-latin-diacritics <gnso-latin-diacritics@icann.org> wrote:
Hi all,
Apologies if this is a stupid question, but are situations where multiple different diacritics for the same letter are in use in some languages already decided and out of scope for us?
E.g., assume a Swedish applicant applies for ".sjö" and a Norwegian one for ".sjø" at the same time. My instinctive reaction would be to give each of them what they want and ".sjo" to nobody. Does some rule prevent that? Or could one applicant get all of ".sjö", ".sjø" and ".sjo"?
Also, I brought up š in Finnish because it behaves somewhat similarly to ß in German: it can be (and regularly is) replaced by sh, and it could be highly confusing if, say, ".šakki" and ".shakki" were given to different entities.
I guess this group is the wrong place to deal with such issues, but maybe it would be possible to just note them and mention in our report that they caused problems (or at least discussion) in our group.
Tapani
On Wed, Feb 26, 2025 at 09:54:09PM +0000, Bill Jouris (b_jouris@yahoo.com) wrote:
Hi Tapani,
To address your Swedish/Norwegian case: as the rules are currently written, "ö", "ø", and "o" are not variants of each other. So one applicant could get all of ".sjö", ".sjø" and ".sjo". Or three different applicants could each get one of them.
Bill Jouris
On Wed, Feb 26, 2025 at 10:43 PM, Tapani Tarvainen via Gnso-latin-diacritics <gnso-latin-diacritics@icann.org> wrote:
Hi all,
thanks again for all the discussion. I guess it's too early to find a solution for everything right now ... even before the group has had its official kick-off meeting. But it's good to see you're all very motivated. ;-)
On 27.02.2025 at 07:53, Bill Jouris via Gnso-latin-diacritics wrote:
> Hi Tapani,
> To address your Swedish/Norwegian case. As the rules are currently written, "ö", "ø", and "o" are not variants of each other. So one applicant could get all of ".sjö", ".sjø" and ".sjo". Or three different applicants could each get one of them.
In theory this is correct, though in practice the ICANN string similarity review panel would almost certainly put all three in the same contention set, meaning that only one of them would be handed out and the other two would be rejected and unavailable to anybody.
And that problem is essentially what this group is tasked to talk about: should there be situations in which one (or more) of the contending strings can be available, and what restrictions would have to be created to avoid user confusion?
Cheers,
Michael
--
Dr. Michael Bauland, Dipl.-Informatiker
Software Development
Knipp Medien und Kommunikation GmbH
Technologiepark, Martin-Schmeisser-Weg 9, 44227 Dortmund, Germany
Fon: +49 231 9703-0
Fax: +49 231 9703-200
SIP: Michael.Bauland@knipp.de
E-mail: Michael.Bauland@knipp.de
Register Court: Amtsgericht Dortmund, HRB 13728
Chief Executive Officers: Dietmar Knipp, Elmar Knipp
Certified according DIN ISO/IEC 27001:2017
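The contention-set effect described above can be illustrated with a rough sketch. It is not how the string similarity review panel actually works (that review is a human judgment); it only assumes, for illustration, that labels folding to the same plain-ASCII skeleton would end up in one set. The names `EXTRA_FOLDS`, `ascii_skeleton` and `contention_sets` are hypothetical, and the explicit table is needed because letters such as "ø" have no Unicode decomposition.

```python
import unicodedata
from collections import defaultdict

# Hypothetical foldings for letters that Unicode does not decompose.
EXTRA_FOLDS = {"ø": "o", "æ": "ae", "ß": "ss"}


def ascii_skeleton(label):
    """Fold a label to a plain-ASCII skeleton (illustrative assumption only)."""
    out = []
    for ch in label.lower():
        ch = EXTRA_FOLDS.get(ch, ch)
        # NFD splits e.g. "ö" into "o" plus a combining diaeresis.
        decomposed = unicodedata.normalize("NFD", ch)
        # Keep base characters, drop the combining marks.
        out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return "".join(out)


def contention_sets(labels):
    """Group labels that share a skeleton; return only groups of two or more."""
    groups = defaultdict(list)
    for label in labels:
        groups[ascii_skeleton(label)].append(label)
    return [sorted(group) for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    print(contention_sets(["sjö", "sjø", "sjo", "shakki"]))
```

Under this assumption ".sjö", ".sjø" and ".sjo" land in a single set, while ".šakki" and ".shakki" would not, which is one reason a pure folding rule cannot capture every case raised in this thread.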
To me, the core of what we are looking into is precisely that under current rules ".sjö", ".sjø" and ".sjo" would (based on what we heard from previous applicants) be forced into a decision of only one being able to exist regardless of their merit, which seems like a fundamentally insufficient solution. There needs to be a fair and informed process underlying those decisions that doesn't just result in an automatic failure.
Anyway, looking forward to working with everyone! It's great to be able to discuss these nuances with people who are also passionate about language.
Best,
On February 27, 2025 8:57:46 AM UTC, Michael Bauland via Gnso-latin-diacritics <gnso-latin-diacritics@icann.org> wrote:
--- Mark W. Datysgeld from Governance Primer
Mark,
Congratulations on getting elected as VC of the PDP.
Anil
On 27 Feb 2025, at 15:14, Mark W. Datysgeld via Gnso-latin-diacritics <gnso-latin-diacritics@icann.org> wrote:
Terve Tapani,
thanks for getting the discussion started. Just some quick clarifications about the meaning of the code points table in Section 5.3 of the RZ-LGR Proposal.
On 26.02.2025 at 19:23, Tapani Tarvainen via Gnso-latin-diacritics wrote:
> Dear all,
> This is not really a substantial concern, but I noticed a couple of small errors in the Proposal for Latin Root Zone document, specifically in the included code points table (5.3.), languages using the code point column:
> š (latin small letter s with caron) and ž (latin small letter z with caron) are also used in Finnish, and their behaviour is unusual enough and possibly relevant to us that it should be discussed at some point.
> æ (latin small letter ae) and ø (latin small letter o with stroke) are also used in Norwegian.
These are not errors in the Proposal. The list was never meant to be exhaustive, i.e., to list all languages in which a certain code point is used. The goal was to find some example that proves the actual use of the character in any language in level 0-3(4). It was a simple yes/no decision: in or out of the repertoire. For a code point to be in, we needed to show it was actually used.
> ü (latin small letter u with diaeresis) is *not* used in Swedish unless you count proper names of foreign origin, in which case it should also be included in Finnish along with é and some others.
I don't know Swedish, so indeed this might be a mistake, but there are enough other languages having ü that this does not cause any harm to the definition of the repertoire. In that sense, the RZ-LGR cannot be taken as an exhaustive list of code point usages.
Cheers,
Michael
It's good that we are able to get some initial thoughts in on the list, so that the conversation at our upcoming initial meeting can be more focused. At least to me, it's helpful to understand where the other members stand before we carry out the general consultation.
Best,
On 24 Feb 2025 16:46, Saewon Lee via Gnso-latin-diacritics wrote:
Dear LSD PDP WG,
In accordance with the GNSO’s PDP requirements, the first step of the process after the WG is formed is to seek an input from each Supporting Organization, Advisory Committee, and GNSO’s Stakeholder Group/Constituency on the topics within the WG Charter. As we prepare for the Kick-off meeting, please share any concerns/suggested edits through sidebar comments within the google doc. <https://docs.google.com/document/d/1omNCnlnVUVecAPSrTkJgK6Ien_sAKApD/edit?us...> *_by Wednesday, 05 March, at 23:59 UTC._*
Any substantive concerns/comments will be addressed during Kick-off (08 March). Subsequently, the finalized message will be sent out to SO/AC/SG/Cs by mid-end of March.
Kind regards,
Saewon
*Saewon Lee*
Policy Development Support Manager for GNSO
Internet Corporation for Assigned Names and Numbers (ICANN)
*Mobile:* +1 (310) 463-5541
*Email:* saewon.lee@icann.org <mailto:saewon.lee@icann.org>
*Skype:* saewon.lee.icann
*Website:* www.icann.org <http://www.icann.org>
_______________________________________________
Gnso-latin-diacritics mailing list -- gnso-latin-diacritics@icann.org
To unsubscribe send an email to gnso-latin-diacritics-leave@icann.org
-- Mark W. Datysgeld Director at Governance Primer [governanceprimer.com <https://governanceprimer.com>] Project Lead Developer at ICANNWiki [icannwiki.org <https://icannwiki.org/>]
If we take the narrowest gloss on our mandate, things simplify enormously. Basically, we have two broad cases:
Case 1) those who have already registered an ASCII gTLD, and wish to also have the same letters but with one or more diacritics added,
Case 2) those who do not currently have an ASCII gTLD.
For Case 1), simply say that they can have one (1) additional gTLD (at no charge), provided it uses exactly the same Latin letters, albeit with some diacritics. Doesn't matter what the diacritics are. Doesn't matter whether the diacritic version has ever been used in a known language (and who has time to research that anyway?). But you only get one diacritic version. So choose carefully.
For Case 2) we break it down a bit.
-- If an applicant wants both an ASCII version and a diacritic version, and nobody else is applying for a gTLD with diacritics which would have the same ASCII version, treat this the same as Case 1).
-- If two applicants want to register diacritic versions which reduce to the same ASCII version, neither gets it. With the caveat that, if the two diacritic gTLDs are variants of each other, whoever wins out to register theirs can also get the ASCII version.
Perhaps there are other cases that we need to consider. But those are the major ones I can see.
Bill Jouris
On Wednesday, February 26, 2025 at 03:11:29 PM PST, Mark W. Datysgeld via Gnso-latin-diacritics <gnso-latin-diacritics@icann.org> wrote:
It's good that we are able to get some initial thoughts in on the list, so that the conversation at our upcoming initial meeting can be more focused. At least to me, it's helpful to understand where the other members stand before we carry out the general consultation.
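For illustration only, Bill's case breakdown above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a proposed procedure: the names `ascii_base` and `classify`, the outcome strings, and the idea of "reducing" a label by NFD decomposition and dropping combining marks are all hypothetical and appear nowhere in the charter or the RZ-LGR. Note in particular that letters such as ø and æ have no canonical decomposition, so this simple reduction leaves them untouched; a real rule would need an explicit mapping. The variant caveat and the one-diacritic-version limit are not modelled.

```python
import unicodedata
from collections import defaultdict


def ascii_base(label: str) -> str:
    """Reduce a label to its base letters (hypothetical helper).

    Decomposes with NFD and drops combining marks. Letters without a
    canonical decomposition (e.g. ø, æ) pass through unchanged.
    """
    decomposed = unicodedata.normalize("NFD", label.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def classify(applications, ascii_holders):
    """Group applications by base label and assign a rough outcome.

    applications:  list of (applicant, label) pairs
    ascii_holders: dict mapping an already delegated ASCII label to its holder
    """
    by_base = defaultdict(set)
    for applicant, label in applications:
        by_base[ascii_base(label)].add(applicant)

    outcomes = {}
    for base, applicants in by_base.items():
        if len(applicants) > 1:
            # Two applicants reduce to the same ASCII string.
            outcomes[base] = "contention"
        elif ascii_holders.get(base) in applicants:
            # Existing ASCII holder asks for a diacritic version.
            outcomes[base] = "case 1"
        elif base in ascii_holders:
            # Someone other than the ASCII holder applies: not covered above.
            outcomes[base] = "not covered"
        else:
            # No ASCII gTLD exists yet and nobody else competes.
            outcomes[base] = "case 2"
    return outcomes
```

With applications for "café" (by the existing holder of "cafe"), "müller" and "mûller" (by two different applicants), and "señor" (no existing ASCII gTLD), this sketch would report Case 1, contention, and Case 2 respectively.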
participants (8)
Amadeu Abril i Abril (CORE)
anil Jain
Bill Jouris
Mark W. Datysgeld
Michael Bauland
Saewon Lee
Tapani Tarvainen
Tarvainen Tapani