Re: [Koreangp] some questions for IP on CJK coordination
From: 王伟 [mailto:wangwei@cnic.cn]
Sent: Friday, May 15, 2015 5:47 PM
To: integrationPanel@icann.org
Cc: Sarmad Hussain; chineseGP@icann.org; japaneseGP@icann.org; koreanGP@icann.org
Subject: some questions for IP on CJK coordination
Dear IP members,
During the discussion in the CJK coordination meeting this afternoon, I checked the draft of “Representing Label Generation Rulesets using XML” (https://www.ietf.org/id/draft-davies-idntables-09.txt) again.
Here are some questions about the terminology and the label disposition rules in the draft.
1) Terminology for “variant subtype”
Besides variant types like “allocatable” and “blocked”, the draft gives some other attributes, such as “trad”, “simp”, and “both”, as follows:
-------------------------------------------------------------------------------------------------------------------
Assuming an LGR where all variants have been given suitable "type"
attributes of "block", "simplified", "traditional", or "both",
similar to the ones discussed in Appendix B. Given such an LGR, the
following example actions evaluate the disposition for the variant
label:
<action disp="block" any-variant="block" />
<action disp="allocate" only-variants="simplified both" />
<action disp="allocate" only-variants="traditional both" />
<action disp="block" all-variants="simplified traditional " />
<action disp="allocate" />
The first action matches any variant label for which at least one of
the code point variants is of type "block". The second matches any
variant label for which all of the code point variants are of type
"simplified" or "both", in other words an all-simplified label. The
third matches any label for which all variants are of type
"traditional" or "both", that is all traditional. These two actions
are not triggered by any variant labels containing some original code
points, unless each of those code points has a variant defined with a
reflexive mapping (Section 4.2.4).
----------------------------------------------------------------------------------------------------------------
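As a reading aid for the quoted passage, here is a minimal Python sketch of how the five example actions could be evaluated in order, with the first match winning. It is not the draft's normative algorithm. The representation is an assumption made for illustration: a variant label is a list with one entry per code point, holding the variant type of that code point, or None for an original code point with no (reflexive) variant mapping. The names ACTIONS, matches, and evaluate are hypothetical. The sketch also assumes that "all-variants" needs at least one variant to match, which the quoted text does not state.

```python
from typing import List, Optional, Set, Tuple

# One entry per code point: its variant type, or None for an original
# code point without a (reflexive) variant mapping. (Assumed model.)
Label = List[Optional[str]]

# (disposition, attribute name or None for the catch-all, listed types).
# Mirrors the five <action> elements quoted from the draft, in order.
ACTIONS: List[Tuple[str, Optional[str], Set[str]]] = [
    ("block", "any-variant", {"block"}),
    ("allocate", "only-variants", {"simplified", "both"}),
    ("allocate", "only-variants", {"traditional", "both"}),
    ("block", "all-variants", {"simplified", "traditional"}),
    ("allocate", None, set()),
]


def matches(attr: Optional[str], types: Set[str], label: Label) -> bool:
    """Return True if a single action applies to the label."""
    variants = [t for t in label if t is not None]
    if attr is None:
        return True  # catch-all action
    if attr == "any-variant":
        # At least one code point variant has a listed type.
        return any(t in types for t in variants)
    if attr == "all-variants":
        # Assumption: requires at least one variant; all must be listed.
        return bool(variants) and all(t in types for t in variants)
    if attr == "only-variants":
        # Every code point is a variant, and every type is listed.
        return (bool(label) and len(variants) == len(label)
                and all(t in types for t in variants))
    raise ValueError(f"unknown attribute: {attr}")


def evaluate(label: Label) -> str:
    """Return the disposition of the first matching action."""
    for disp, attr, types in ACTIONS:
        if matches(attr, types, label):
            return disp
    raise RuntimeError("no action matched")  # unreachable with a catch-all


print(evaluate(["simplified", "both"]))         # all-simplified label
print(evaluate(["simplified", "traditional"]))  # mixed label
```

Under this model, a label such as ["simplified", None] does not trigger either "only-variants" action, which corresponds to the last sentence of the quoted passage about labels containing some original code points.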
Yoneya-san suggested that we define them (“simp”, “trad”, “both”, …) as “variant subtypes”.
If it is not appropriate to call them that, is there any other terminology you would prefer?
There seems to be a desire among the CJK-GP to stick with the term "... subtype", based on the idea (perhaps) that only CPs with the variant type "allocatable" will in practice have any of these attributes. But this is misleading. The generalization rather is that these are attributes invented by the GP, which can then be transformed, through application of rules, ultimately to either "allocatable" or "blocked" variant types.
The revealing terminology would therefore be "GP-defined type", which also highlights the fact that there is no closed list of these attributes, even if [simple], [traditional], [both] are our favourite examples.
I am fine with thinking of these as GP-defined types, but in some of the IP documents we use the term subtype, so we should be aware that these two terms mean the same thing.
2) The variant type “out-of-repertoire”
In a work email last year, Asmus mentioned:
“The MSR already contains a default action <action disposition="invalid" any-variant="out-of-repertoire-var" comment="any variant label with a code point out of repertoire is invalid"/>”
However, I didn’t find a definition of “out-of-repertoire” or the corresponding action in draft-davies-idntables-09.
Will this part be added in the next version?
Can't comment on this. But no doubt Asmus or Wil can.
I'm currently revising draft-davies-idntables and will see whether it is appropriate to mention this there. The key thing to remember is that, in the XML formalism, one is free to use any "type" values whatsoever and any "disposition" values whatsoever, because the XML format is intended for ALL types of domains, not just the root. For the Root Zone LGR project, the IP has defined the type "out-of-repertoire-var" and provided a default WLE <action> to resolve it. That definition was made in the MSR. Several of the documents that the IP created, including "Packaging the MSR and LGR" (see the LGR Root Zone Project wiki), describe how to use it. (Somebody had a link for that yesterday.)
3) Suggestion for a new variant type and action type
JGP-LGR-1 doesn’t have variants, but many CGP variants will be added to JGP-LGR-2, which means that the JGP will adopt many Chinese variants in order to reach a CJK consensus.
Most of those variants have the same meaning. However, some specific code points are exchangeable variants in Chinese but mean totally different things in a Japanese-language environment.
For example, 机/機 both mean “machine” in Chinese, but in Japanese 机 means “table” and 機 means “machine”.
Though the JGP will set them as variants, both “allocatable”, I wonder whether it is possible to create a new type, “allocatable-reserved”, to help tell the difference in the WLE.
The corresponding action would look like <action disposition="allocatable-reserved" any-variant="allocatable-reserved" />
This means that when 机上 is applied for, 機上 will be generated and allocatable, but a special process is needed before the actual delegation/activation of “機上”,
because, unlike common allocatable variant labels, the activation of these “allocatable-reserved” labels might bring domain name disputes or abuse.
Does the new type suggestion make sense? Do you think it is practical?
In general, the Root Zone process only allows the two dispositions "allocatable" and "blocked" for variant labels (as well as "invalid" for labels that are not valid at all). And dispositions for the labels are not redefinable.
I understand from Yoneya-san's examples that there is a common practice of multiple labels treated as variant labels by their applicant in the .jp domain (and that the choice is not predictable by any rule). These examples are good evidence that there is a rational basis for allowing at least some of the CGP variants to be considered "allocatable" in the JGP LGR.
This suggestion of an "allocatable-reserved" disposition would indicate that there may be some other CGP variants for which considering them "allocatable" in the JGP LGR is not as well motivated. In other words, there are some facts which would make certain combinations of variants very unlikely, perhaps so unlikely that they do not need to be supported in the root. If that is the case, we need to understand how unlikely such combinations are. Would an "exception" be needed rarely, but still relatively often? Or is the desire to allow an exception mostly to prevent even the risk of disallowing a single label?
The reason I am asking these questions is that, for the root zone, we are requested to be conservative. Because of that, if some rule leads to the prevention of a few possible labels (that are unusual to begin with), it is more acceptable to not support such labels in the root than to make complicated additions to the system. Therefore, the investigation by the JGP has to be whether it isn't acceptable to treat these labels (where CGP variants are not semantically related) as "blocked" instead of "allocatable-reserved". If the result is that a small percentage of exceptional variant labels that might have been allowed under the second level cannot be allowed under the root, it would likely be better to live with such a restriction. This would be in the spirit of "conservatism".
So, whether or not the Root Zone LGR process can actually create a disposition like the proposed "allocatable-reserved" (and our hands may be tied by the procedure), it would be important to have examples of actual cases and an understanding of whether simply making these "blocked" would be unduly restrictive in practice (and not just in theory). In other words, it would be important to have the same kind of use cases as Yoneya-san sent for the semantically related (old-vs.-new ideograph) variants. A./
participants (1)
-
Asmus Freytag