Outcomes, Action Items, and Notes Meeting #30
Dear LD PDP WG,

Please see the attached outcomes, action items, and notes from meeting #30.

[OUTCOMES]
* Wording from IG X to be incorporated into IG 7

[ACTION ITEMS]
* Staff and Leadership to adjust language on IG8

[NOTES]
1. Welcome and SOI Updates
2. Recap of Meeting #29
* Key Outcomes and Action Items
* Reviewed the slides<https://icann-community.atlassian.net/wiki/x/AQBENw> with PR6 and IG8 updated with redlines and strikethrough
* PR 6 on slide 7
  * Wording from PR 6 in the past was turned into IG7 to make PR6 more concise while still conveying the group’s intention.
* Discussion of withdrawal of applications corresponding to existing TLDs: options outlined on slides<https://icann-community.atlassian.net/wiki/x/AQBENw> 9-10. The cases are a bit clearer in Option 2 for IG X. WG coalesced around Option 2.
  * Should Option 2 be a separate IG or be combined into a single IG?
  * Policy Team comment: PR 6 was the original recommendation, as it aligns with IDN EPDP. IG X was then divided so that there is an existing PR 6 and a separate IG 7, following the ICANN org comment to align with EPDP IDNs while explaining the necessary steps. It was framed this way as a best practice for drafting recommendations. A recommendation captures the principle; content that goes deeply into specifics should be an IG.
  * PR 6 basically just says you can get rid of strings or labels, and the IGs explain in detail how to withdraw a TLD and under what circumstances this is possible. That is why the separation is there.
  * There is a chance that things are missed in the process, and if it is in a recommendation then there is rigidity. The IRT then guides to ensure
  * Majority is indifferent, so wording from IG X will be incorporated into IG 7
* IG8 overview on slide<https://icann-community.atlassian.net/wiki/x/AQBENw> 11.
  * Query about an ASCII and two LDs: if the ASCII cannot go through, then the set will get dissolved.
    Whether the two LD TLDs can go together without a set, or whether just one LD would be processed.
  * Answer: the rule is that one cannot switch between the exceptions process and the standard process, so one would have to choose a single one and then would be able to continue.
  * Staff and leadership to improve the wording of IG8 based on the query
* Scope and characters: whether to stay with diacritics as decomposable, or to expand the characters with a limit, as outlined on slide<https://icann-community.atlassian.net/wiki/x/AQBENw> 12.
  * Two options discussed. Questions raised by the Council liaison: What is the clear, objective, and machine-testable baseline? Is the baseline sufficient for the WG to achieve its goals? Would this be an expansion of scope?
  * Christian: added an analogy, ensuring that the WG can set a floor and a ceiling. The floor is the common application and the ceiling is the outliers.
  * The floor would be the Unicode decomposable set, and the Tapani proposal would be a ceiling, which should not go too far for outliers. The definition could be finite and predictable, and could be changed if Unicode changes.
  * This is a process problem rather than a technical problem of the code points. The PDP WG cannot enlarge its remit and scope. The WG has provided a definition all along and the Public Comment did not raise this. There was a very real request from the community resulting in a narrowly scoped PDP. Option 2 can accommodate more, but there is a lot of anxiety about when the work will finish. Is the group redefining its own scope? Having worked all this time on the Unicode table, it might be difficult to change and redefine that. It was not a push from the community, and it will make the work longer.
* ICANN org points outlined on slide<https://icann-community.atlassian.net/wiki/x/AQBENw> 14, and Sarmad contextualized his points.
  * Explained the development of RZ-LGR and its design. Inclusion principle and conservatism principle work generally overall.
    Inclusion principle discusses two ways of managing characters: start with everything and exclude, or start with an empty set and include only necessary characters. Each script should start as an empty set, and the community decides the characters needed for RZ-LGR.
  * Second principle, of variants: RZ-LGR maximizes variant sets and blocks them. Wherever there is a chance of confusion, that confusion should be addressed and then they should be allowed and blocked to avoid confusion. RZ-LGR wants to maximize blocked variants. As an exceptions process some of those could be allocatable, which would be minimized (only those absolutely necessary).
  * Those are the principles. Here the WG is creating an exceptions process for what is already happening in the RZ-LGR. This PDP is an extension of the allocatable variants. Allocatable variants are an exception. This is an exception to that exception, so it should be extremely conservative. The motivation is the underlying SSR challenges.
  * Sarmad discussed the conservatism principle: this is the first time doing it at TLD level, and it has millions of second-level implications. Once the community gains experience it can be made more liberal. Using the same argumentation as in designing RZ-LGR, he proposed a third option for the allowed diacritics: the design principle of inclusion, starting with an empty set and adding the diacritics which are needed.
  * At a policy level the WG can create policy to guide how that set should be developed; a separate community WG could then be formed to document why each diacritic should be included.
  * Michael asked: Do not deal with the question in this PDP, just make the rules for some other WG or generation panel that would then be responsible for defining which characters can be used in this policy?
  * Sarmad: Yes, RZ-LGR is a process that can evolve over time.
    If you make it part of the policy, it gets stuck unless there is a mechanism to spin it off into a community WG that develops this over time, so the policy does not have to be constantly updated.
  * Response from the WG was skeptical of the third option proposed by Sarmad. There need to be criteria; Option 2 from Tapani has support and seems quite workable.
  * Sarmad clarified that RZ-LGR determines what TLD can go into the root zone. In certain cases where two strings go into the root zone, they must be managed by the same registry operator. From that perspective, the LD TLD is trying to do the same thing: determine which additional strings that are not otherwise allowed could still go into the root zone. LD and RZ-LGR are parallel and similar in allowing more strings into the root zone.
  * Overall, the WG was converging on Option 2 and did not agree with the comments from Sarmad.
  * Majority of group members in favor of Option 2
  * Sarmad: Unicode experts were consulted on Option 2. Parsing letter names is not a predictable process, as Unicode does not design names so that anything following “with” has to be a diacritic; it could be other items that may or may not be a diacritic. Unicode expert: in the context of Option 2 there is an implication for Unicode.
  * Machine readability is not 100% required, as the list created does not necessarily have to be generated over and over again. The list here could be a fixed list plus a rule on what to do if new characters get introduced.
  * Sarmad shared the Unicode expert opinion: “Not a good idea to data mine the names list for this. The way decomposition is made on the names is not regular, the concept of diacritics may also be fuzzy. You may have a ‘with’ but the following element is not a diacritic (example 00D8: LATIN CAPITAL LETTER O WITH STROKE), the description of the diacritics may also not be totally regular, thinking of the block 1E00-1EFF.
    The best way is to use UnicodeData.txt, Scripts.txt (to determine Latin), and then look into UnicodeData.txt to see which Latin characters decompose and get the exact decomposition.”
  * Sarmad: his proposal would likely yield a smaller set, but the motivation is to create a deliberate set that includes only those things that are actually needed. It may lead to the same set; the proposal was the inclusion method, not the smaller set.
3. Review of LD PDP Initial Report Public Comment
4. Next Steps
5. AOB

John R. Emery, Ph.D.
Policy Development Support Senior Specialist
Generic Names Supporting Organization (GNSO)
Internet Corporation for Assigned Names and Numbers (ICANN)
www.icann.org<http://www.icann.org>
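
As a rough illustration of the approach the Unicode expert describes in the notes (relying on decomposition data rather than on character names), here is a minimal Python sketch. It assumes Python's standard `unicodedata` module as a stand-in for reading UnicodeData.txt directly. The standard library exposes no Scripts.txt data, so the sketch approximates the Latin check by requiring the decomposition base to be a lowercase ASCII letter a-z. The helper name `decomposes_to_ascii_base` is hypothetical, the sketch does not check RZ-LGR membership, and it is not a WG-agreed definition.

```python
import unicodedata

ASCII_LETTERS = set("abcdefghijklmnopqrstuvwxyz")


def decomposes_to_ascii_base(ch: str) -> bool:
    """True if the single character ch canonically decomposes to an
    a-z base letter followed only by nonspacing combining marks.

    Hypothetical helper for illustration only.
    """
    # NFD applies the canonical decompositions from the Unicode Character
    # Database recursively, so characters carrying several marks
    # (for example U+1EBF) are fully expanded.
    parts = unicodedata.normalize("NFD", ch)
    if len(parts) < 2:
        return False  # no canonical decomposition at all
    base, marks = parts[0], parts[1:]
    # Assumption: an a-z base stands in for a Scripts.txt "Latin" lookup.
    return base in ASCII_LETTERS and all(
        unicodedata.category(m) == "Mn" for m in marks
    )


# A few sample code points: two decomposable letters, then three that
# do not decompose to an ASCII base.
for ch in "\u00e9\u1ebf\u00f8\u00e6\u01ef":
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {decomposes_to_ascii_base(ch)}")
```

In this sketch U+00F8 (the lowercase counterpart of the expert's 00D8 example) is rejected even though its name contains "WITH", because it has no canonical decomposition.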
On Wed, Apr 08, 2026 at 02:57:06PM +0000, John Emery via Gnso-latin-diacritics (gnso-latin-diacritics@icann.org) wrote:
> Inclusion principle and conservatism principle work generally overall.
> Inclusion principle discusses two ways of managing characters: start with everything and exclude, or start with an empty set and include only necessary characters.
Unfortunately the latter approach does not really help us here, because it leaves us with the question: necessary *for what*? We haven't been given any criteria for necessity. If we go that route, we'll have to come up with our own. I don't think that would make our task any easier. At least to me it seems exclusion rules would be easier to formulate and justify in a way that'd be easy to understand.

If the conservatism principle implies we should aim for the smallest possible set of characters, I'll note that first, the set of decomposable diacritics is bigger than the set of precomposed ones (latin letter with ...), and second, we could get an even smaller set from their intersection, that is, by allowing only characters that are BOTH decomposable AND precomposed. Even smaller sets would be possible.

Our charter doesn't say "all diacritics" but neither does it say "some diacritics". I don't see how that would require minimizing the set of acceptable characters.

In any case we should justify and explain our decision. Here's a tentative attempt:

* Only characters that are in RZ-LGR. This is not controversial, nobody's suggested anything to the contrary.
* Only characters that are unambiguously formed from a base Latin letter by adding a glyph or two. This excludes characters like æ, þ, ð, ŋ. This can be justified by noting that allowing them would require extra work to resolve possible issues arising from multiple potentially confusable characters, without needing to refer to any diacritic definition. We could note and maybe recommend that a future PDP could deal with such cases.
* Only characters that are based on ASCII letters (a-z). This is implied by our restriction that the base TLD must be ASCII. Besides Greek diacritics &c this also excludes Latin extensions like ǯ (latin letter ezh with caron). If a subsequent PDP takes on cases without a base ASCII TLD, our rules would probably work almost unchanged for dealing with, say, someone wanting .vuaʒʒ, .vuâǯǯ and .vuäǯǯ.

I don't see a need for anything else. All of those are satisfied equally well with both decomposable and precomposed (latin letter with ...) diacritics.

--
Tapani Tarvainen
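
The comparison in the reply above, between characters picked by name ("latin letter with ...") and characters picked by decomposition, along with their intersection, could be explored with a short Python sketch like the following. Everything here is an assumption made for illustration: the scanned ranges (U+00C0 to U+024F and U+1E00 to U+1EFF), the name pattern, and the set names are hypothetical choices, and the RZ-LGR membership criterion is not applied. The sketch only scans single precomposed code points, so it says nothing about combining sequences that have no precomposed form.

```python
import re
import unicodedata

# Assumption: names of the form "LATIN SMALL LETTER <one A-Z letter> WITH ...".
NAME_PATTERN = re.compile(r"^LATIN SMALL LETTER [A-Z] WITH ")

# Assumption: Latin-1 Supplement through Latin Extended-B, plus
# Latin Extended Additional.
SCAN_RANGES = [range(0x00C0, 0x0250), range(0x1E00, 0x1F00)]


def named_with(ch: str) -> bool:
    """Name-based test: the character name matches NAME_PATTERN."""
    return bool(NAME_PATTERN.match(unicodedata.name(ch, "")))


def decomposable(ch: str) -> bool:
    """Decomposition-based test: a-z base plus nonspacing marks under NFD."""
    parts = unicodedata.normalize("NFD", ch)
    return (
        len(parts) > 1
        and "a" <= parts[0] <= "z"
        and all(unicodedata.category(m) == "Mn" for m in parts[1:])
    )


candidates = [chr(cp) for r in SCAN_RANGES for cp in r]
by_name = {c for c in candidates if named_with(c)}
by_decomp = {c for c in candidates if decomposable(c)}
both = by_name & by_decomp

print("by name:", len(by_name))
print("by decomposition:", len(by_decomp))
print("intersection:", len(both))
print("name only:", "".join(sorted(by_name - by_decomp)))
print("decomposition only:", "".join(sorted(by_decomp - by_name)))
```

The "name only" line lists characters such as ø that a name-based rule would admit and a decomposition-based rule would not, which is one concrete way to inspect where the two definitions diverge.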