NOTES | ccPDP4 IDN Confusing Similarity Subgroup (#5) | 24 May 2022 at 13:00 UTC.
1 Welcome and roll call
Welcome by Kenny
Apology by Anil
2 Administrative Items
a. IDN EPDP Update
Bart: last week they dealt with the base for comparison. What is the scope of a CS review? They have additional issues: other kinds of similarity. They work through the issues with smaller WGs.
Could someone please confirm or correct what I said?
Pitinan: Correct. String similarity is handled in a smaller group, which discusses the examples. Scope is being discussed at 3 levels: applied-for string; applied-for string and allocatable set; blocked variants and allocatable variants.
The main group talks about the next charter question, which is the objection process.
Bart: The objection process is not in ccPDP4.
b. Sessions at ICANN74
Tuesday, 14 June 2022
• ccNSO: Policy Update | 11:15-12:30 UTC
• ccPDP4 IDN Meeting | 14:30-15:30 UTC
The ccNSO is planning a policy session for ICANN74 to showcase its work. Two ccNSO PDP (ccPDP) working groups will provide updates on their progress and seek input. Both working groups will also hold their respective sessions during ICANN74.
Bart: Check the full policy, once completed, against scenarios. Do the results make sense against the original intentions?
>>>> Definition of Stress Testing
Stress Testing is defined as:
After completion of the draft process the Stress Testing was conducted through answering the following questions:
Bart: Shall I prepare draft slides for the 31 May meeting?
Kenny: Yes, please. Let's also circulate them to the WG members before we present at ICANN74.
Hadia: Do you want to know more about the string similarity review small team? It met last Wednesday. Pragmatic group. Avoids edge cases. Question whether there is a chance to appeal decisions on string similarity. Out of scope. Limited appeals process. Edmon presented an edge SBC case: strings that look alike, but are not variants. Group meets again on Wednesday. No examples that address the presented cases. 3 levels
Bart: The idea is that the groups stay informed about each other's process, and there is alignment between the 2. Some aspects do not apply to ccTLDs.
3. Basics Confusing Similarity Review – document v3
Discussion section 3. Base for comparison
Sarmad: question about conditions at top of the page.
3rd bullet. Are we also comparing them with gTLDs?
Bart: Both ccTLDs and gTLDs, hence "TLDs" at this stage.
Bart: all delegated TLDs are different delegations. Therefore, all need to be included in the base for comparison. The discussion on allocatable or blocked is secondary for delegated TLDs.
Questions or comments? We are at page 2 now.
Ai-Chin: I think one of the reasons for submitting the IDN table is to avoid CS, which may be a problem in future. Example: we would not need to submit the table if we just check or review the delegatable variant. That is easy; we just provide 2 requested strings. That is why we submit the IDN variant label. Blocked and delegated variants cause confusion. Perhaps also include the blocked and allocatable variants? Please consider.
Bart: The IDN table is not submitted anymore for TLDs. It is determined by the RZ-LGR.
Ai-Chin: We prepared a CGK, that is, the variant table. We already considered blocked variants that will cause CS with the requested IDN ccTLD string. Maybe not now, but later.
Bart: other comments or questions?
Sarmad: as a background. Shares that the reference point for this comes from SSAC16 report. 2 kinds of problems:
SSAC thinks it is not a good solution. Try from an end-user perspective: the solution should not cause misconnection.
Bart: Where does it stop? Blocked variants? Or allocatable? Moreover, we talk about delegatable. When a string is requested, we need to check for CS. Probability is needed. But where do you draw the line with respect to risk? Confusion needs to be probable, not just possible.
Sarmad: Take a string that is highly probably confusable with a blocked variant of another string. That case is potentially highly probable, and that is what we try to solve. The EPDP looks at some scenarios to try to understand the problem better.
Bart: What is already delegated is the starting point. Why do we do this?
Section 4: defining the base for comparison, taking into account arguments listed above.
Hadia: What would be the benefit of comparing the original label, in addition to the allocatable and the blocked variants, if any? Why include the blocked variants as well?
Bart: See what Sarmad said.
Sarmad: Quick summary. It is possible to generate some confusion through a transitive relationship. Say you have a TLD string A, and it has a blocked variant B. A and B are considered the same. There is a separate TLD C. Compare A and C: are they similar? Should we also compare B with C, given that B is not applied for? B is a variant, and is the same as A; that is the definition of a variant. If C is highly probably confusable with B, there is a highly confusing connection between A and C. That is what we try to solve here.
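Editor's illustration (not presented at the meeting): the transitive case Sarmad describes can be sketched as below. This is a minimal sketch under stated assumptions. The `Label` type, the `confusable_pairs` helper and the toy `is_confusable` lookup are all hypothetical; in practice the CS review is a manual assessment, not a function call, and the labels "A", "B" and "C" are the placeholders used in the discussion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    """A TLD label plus its blocked variants (hypothetical structure)."""
    name: str
    blocked_variants: frozenset = frozenset()


def confusable_pairs(a, c, is_confusable, include_blocked):
    """Compare label `a` (optionally with its blocked variants) against `c`.

    Returns the (string_from_a, c) pairs that the supplied check flags.
    """
    candidates = [a.name]
    if include_blocked:
        # Widen the base for comparison with the blocked variants of `a`.
        candidates += sorted(a.blocked_variants)
    return [(x, c.name) for x in candidates if is_confusable(x, c.name)]


# Toy stand-in for the review: only B and C are assumed to look alike.
KNOWN_CONFUSABLE = {frozenset({"B", "C"})}


def is_confusable(x, y):
    return frozenset({x, y}) in KNOWN_CONFUSABLE


a = Label("A", blocked_variants=frozenset({"B"}))
c = Label("C")

direct_only = confusable_pairs(a, c, is_confusable, include_blocked=False)
with_blocked = confusable_pairs(a, c, is_confusable, include_blocked=True)
```

With the toy data, comparing only A against C finds nothing, while including the blocked variant B surfaces the indirect B/C match. Whether that wider base is worth the extra comparisons is the open question in the discussion.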
Hadia: 1/ We cannot group blocked variants with variants that are allocatable but not requested. Blocked variants will never be allocated, unless the LGR changes. Highly improbable.
2/ What do we mean by the same? Only visually? Or also the same in meaning?
Sarmad: In variants, "same" could be anything. It is defined by the script community. Same could be based on meaning. Concrete example: a Simplified Chinese label has a blocked Traditional Chinese variant label. Another applicant applies for a Traditional Chinese label that is visually very similar to that blocked variant of the first, Simplified Chinese application.
There is a potential for misconnection indirectly.
Hadia: I do not see it happening. Indirect connection.
Ai-Chin: That's why we have the RZ-LGR IDN variant table.
Sarmad: Agrees. Extensive work has been done by the script communities. Lots of visually close cases have been pulled into the variant sets. There may be some which are on the borderline. The more obvious cases have been pulled into variant sets. The problem is now smaller.
Bart: Leave it as it is for the time being. Revisit next time. Arguments need to sink in.
Need to avoid unwanted side-effects.
Questions regarding section 4
Ai-Chin: The CS check will be operated by a program, not by a human. So the number of variants is not important. Why do we need to limit it?
Bart: To date it is not automated. It is done manually.
That is the real issue.
Hadia: If you include more labels or strings in the comparison, that will not affect the space, but you limit the possibility of a label that does not yet exist coming into existence, since it was compared to an irrelevant label.
Bart: Yes, second reason: unwanted side-effects.
This was 1st reading. Positive vibe to this type of definition?
Hadia: The scalability argument is void if the review is automated in future.
Bart: Both. Scalability, and unforeseen side-effects.
Will update sections 1 and 2 based on what Sarmad said 2 weeks ago. We will revisit again in 2 weeks' time. This sets the base for the next steps. The base for comparison has been the most complicated part.
4. AOB
none
5. Next meetings
31 May Full Group Prep/discussion ICANN74 | 13:00 UTC (first hour)
· VM SG 14:00 UTC | Second hour
7 June – Prep ICANN74 (TBD)
Bart: When does this group re-convene?
Either 2 or 3 weeks after ICANN74. 28 June, 5 July?
Kenny: Tentatively propose 28 June, same time.
6. Closure
Thank you all.
Joke Braeken
joke.braeken@icann.org