NOTES | ccPDP4 IDN Confusing Similarity Subgroup (#5) | 24 May 2022 at 13:00 UTC.

1 Welcome and roll call

Welcome by Kenny

Apology by Anil

2 Administrative Items

a. IDN EPDP Update

Bart: last week they dealt with the base for comparison. What is the scope of a CS review? They have additional issues: other kinds of similarity. They work through the issues with smaller WGs.

Could someone please confirm or correct what I said?

Pitinan: correct. String similarity in smaller group to discuss the examples. Scope being discussed in 3 levels: applied for string, applied for and allocatable set, blocked variants and allocatable variants. The main group talks about the next charter question, which is the objection process.

Bart: objection process is not in ccPDP4

b. Sessions at ICANN74

Tuesday, 14 June 2022

• ccNSO: Policy Update | 11:15-12:30 UTC

• ccPDP4 IDN Meeting | 14:30-15:30 UTC

The ccNSO is planning a policy session for ICANN74 to showcase its work. Two ccNSO PDP (ccPDP) working groups will provide updates on their progress and seek input. Both working groups will also hold their respective sessions during ICANN74.

Third ccPDP on the Retirement of ccTLDs (ccPDP3) | The ccPDP3 working group has identified which decisions related to the retirement of a country code top-level domain (ccTLD) should be subject to consideration by the group and explored requirements for the review mechanism. The working group is considering various requirements for approval, including the bindingness of the review mechanism. The working group charter, work plan, and other relevant documents are available on its website and workspace.

Fourth ccPDP Working Group on the (de)selection of Internationalized Domain Name (IDN) ccTLD strings (ccPDP4) | The ccPDP4 working group completed its review and discussion of the 2013 policy proposals for the IDN ccTLD string selection process. The ccPDP4 consists of three subgroups: the variant management subgroup, the confusing similarity review subgroup, and the de-selection subgroup.
By ICANN74, all subgroups will have completed their initial proposals, or will be in the final phases of doing so. During ICANN74, the working group will hold a session to start the stress-testing of its draft policy recommendations. The working group charter, work plan, and other relevant documents are available on its website and workspace.

Bart: check the full policy, once completed, against scenarios. Do the results make sense when set against the original intentions?

>>>> Definition of Stress Testing

Stress Testing is defined as:

Test the process as developed by applying the process to “corner case” situations and understand whether such a case results in an unwanted outcome or side effects.
If the outcome of that situation results in an unwanted outcome or side effects adjust Policy/Process as needed.

After completion of the draft process the Stress Testing was conducted through answering the following questions:

What is the outcome of this situation when the process is invoked?
Is the outcome of that situation/the result unwanted or are side effects unwanted/unacceptable?
Does the Policy/Process need to be adjusted/refined?
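As an illustration only, the three questions above can be read as a loop over corner-case scenarios. The sketch below is hypothetical: `Scenario`, `stress_test` and `toy_process` are made-up names that do not come from the ccPDP4 documents, and the toy rule (reject a request only when it exactly matches a delegated string) merely stands in for the real policy.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Scenario:
    """A hypothetical corner case to run through the process."""
    name: str
    requested: str
    is_unwanted: Callable[[str], bool]  # judges the outcome (question 2)


def stress_test(process: Callable[[str], str],
                scenarios: List[Scenario]) -> List[Tuple[str, str]]:
    """Return (scenario name, outcome) for every unwanted outcome."""
    findings = []
    for s in scenarios:
        outcome = process(s.requested)   # question 1: what is the outcome?
        if s.is_unwanted(outcome):       # question 2: is it unwanted?
            findings.append((s.name, outcome))
    return findings                      # non-empty list: question 3, adjust


# Toy stand-in for the policy (assumption): reject only exact matches.
DELEGATED = {"example"}


def toy_process(requested: str) -> str:
    return "rejected" if requested in DELEGATED else "accepted"


scenarios = [
    Scenario("exact duplicate", "example", lambda o: o != "rejected"),
    Scenario("look-alike string", "examp1e", lambda o: o != "rejected"),
]
findings = stress_test(toy_process, scenarios)
print(findings)
```

In this toy run the look-alike scenario is the only finding, which is the signal that the policy/process would need to be refined.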

Bart: Shall I prepare draft slides for the 31 May meeting?

Kenny: Yes, please. Let's also circulate them to the WG members before we present them at ICANN74.

Hadia: do you want to know more about the string similarity review small team? Meeting last Wednesday. Pragmatic group. Avoid edge cases. Question whether there is a chance to appeal decisions on string similarity. Out of scope. Limited appeals process. Edmun presented an edge SBC case. Strings that look alike, but are not variants. Group meets again on Wednesday. No examples that address the presented cases. 3 levels.

Bart: The idea is that the groups stay informed about each other's process, and that there is alignment between the two. Some aspects do not apply to ccTLDs.

3. Basics Confusing Similarity Review – document v3

Discussion section 3. Base for comparison

Sarmad: question about conditions at top of the page.

3rd bullet. Are we also comparing them with gTLDs?

Bart: both ccTLDs and gTLDs, hence "TLDs", at this stage.

Bart: all delegated TLDs are different delegations. Therefore, all need to be included in the base for comparison. The discussion on allocatable or blocked is secondary for delegated TLDs.

Questions or comments? We are on page 2 now.

Ai-Chin: I think one of the reasons for submitting the IDN table is to avoid CS. There may be a CS problem in future. Example: we do not need to submit the table if we just check or review the delegatable variant. That is easy. We just provide 2 requested strings. That is why we submit the IDN variant label. Blocked and delegated variants cause confusion. Perhaps also include the blocked and allocatable variants? Please consider.

Bart: the IDN table is not submitted anymore for TLDs. It is determined by the RZ-LGR.

Ai-Chin: we prepared a CGK. That is the variant table. We already considered blocked variants that will cause CS with the requested IDN ccTLD string. Maybe not now, but later.

Bart: other comments or questions?

Sarmad: as background, the reference point for this comes from the SSAC16 report. 2 kinds of problems:

If both of the DNs are delegated, it causes a blocked connection
Causes a denial of service.

SSAC thinks this is not a good solution. From an end-user perspective, try to ensure that the solution does not cause misconnection.

Bart: where does it stop? Blocked variants? Or allocatable? Moreover, we talk about delegatable. When a string is requested, we need to check for CS. Probability is needed. But where do you draw the line with respect to risk? Confusion needs to be probable, not just possible.

Sarmad: string that is highly probably confusable with a blocked string of another variant. Case: potentially highly probable. That is what we try to solve. EPDP looks at some scenarios. Try to understand the problem better.

Bart: what is already delegated as a starting point. Why do we do this?

Scaling. Understanding the numbers. See discussion 2 weeks ago. Examples from the staff report on IDN variants
Unwanted side-effects
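The scaling argument can be illustrated with a rough count of the pairwise comparisons a reviewer would face. This is a sketch under stated assumptions: the function name and all the figures (number of delegated TLDs, variants per TLD) are hypothetical and are not taken from the staff report.

```python
def manual_comparisons(requested: int, delegated: int,
                       allocatable_per_tld: int = 0,
                       blocked_per_tld: int = 0) -> int:
    """Count pairwise checks: each requested string against the whole base."""
    # Each delegated TLD contributes itself plus any variants in the base.
    base_size = delegated * (1 + allocatable_per_tld + blocked_per_tld)
    return requested * base_size


# Hypothetical figures, for illustration only.
only_delegated = manual_comparisons(requested=2, delegated=1500)
with_variants = manual_comparisons(requested=2, delegated=1500,
                                   allocatable_per_tld=3, blocked_per_tld=20)
print(only_delegated, with_variants)
```

With these made-up numbers the base grows by a factor of 24 once allocatable and blocked variants are added, which matters as long as the review is done manually.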

Section 4: defining the base for comparison, taking into account arguments listed above.

Hadia: what would be the benefit of comparing the original label, in addition to the allocatable, and the blocked, if any? Why include the blocked as well?

Bart: see what Sarmad said

Sarmad: quick summary. It is possible to generate some confusion through a transitive relationship. Say you have a TLD string A, and it has a blocked variant (B). A and B are considered the same. There is a separate TLD (C). Compare A and C: are they similar? Should we also compare B with C, given that B is not applied for? B is a variant, and is the same as A; that is the definition of a variant. If C is highly probably confusable with B, there is a high confusing connection between A and C. That is what we try to solve here.
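The A/B/C case can be sketched as follows. This is a minimal illustration, assuming a hand-written variant table and a hand-written list of look-alike pairs; `VARIANTS`, `CONFUSABLE_PAIRS`, `confusable` and `conflicts` are hypothetical names, and a real review relies on the RZ-LGR and on expert judgement rather than a lookup.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical data: B is a blocked variant of A; B and C look alike.
VARIANTS: Dict[str, Set[str]] = {"A": {"B"}}
CONFUSABLE_PAIRS: Set[FrozenSet[str]] = {frozenset({"B", "C"})}


def confusable(x: str, y: str) -> bool:
    """Toy stand-in for the confusing similarity judgement."""
    return frozenset({x, y}) in CONFUSABLE_PAIRS


def conflicts(applied: str, existing: str, include_variants: bool) -> bool:
    """Compare the applied-for string against the chosen base for comparison."""
    base = {existing}
    if include_variants:
        # Widen the base with the variants of the existing label.
        base |= VARIANTS.get(existing, set())
    return any(confusable(applied, label) for label in base)


direct_only = conflicts("C", "A", include_variants=False)
via_variant = conflicts("C", "A", include_variants=True)
print(direct_only, via_variant)
```

Comparing C only with A finds nothing, while adding the blocked variant B to the base surfaces the indirect link between A and C.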

Hadia: 1/ we cannot put blocked variants with allocatable but not requested. Blocked variants will never be allocated. Unless LGR changes. Highly improbable

2/ what do we mean by the same? Only visually? Or also the same in meaning?

Sarmad: in variants, “same” could be anything. It is defined by the script community. Same could be based on meaning. Concrete example: a Simplified Chinese label, which has a blocked Traditional Chinese label. Another applicant applies for a Traditional Chinese label, which is very similar to the blocked variant of the first, Simplified Chinese application. It is visually similar to the Traditional Chinese blocked label.

There is a potential for misconnection indirectly.

Hadia: I do not see it happening. Indirect connection.

Ai-Chin: That's why we have the RZ-LGR IDN variant table.

Sarmad: agrees. Extensive work done by script communities. Lots of visually close cases have been pulled into the variant sets. There may be some which are on the borderline. The more obvious cases have been pulled into variant sets. The problem is now smaller.

Bart: leave it as it is for the time being. Revisit next time. Arguments need to sink in

Need to avoid unwanted side-effects

Questions regarding section 4

Ai-Chin: for the CS issues, it will be operated by a program. Not by a human. So the number of variants is not important. Why do we need to limit?

Bart: to date not automated. Manually.

That is the real issue.

Hadia: if you include more labels or strings in the comparison, that will not affect the space, but you limit the possibility of a label coming into existence that does not exist yet, because it was compared to an irrelevant label.

Bart: yes, second reason: unwanted side-effects.

This was 1st reading. Positive vibe to this type of definition?

Hadia: the scalability argument is void if the review is automated in future.
Bart: both. Scalability, and unforeseen side-effects.

Will update sections 1 and 2 based on what Sarmad said 2 weeks ago. We will revisit again in 2 weeks' time. This sets the base for the next steps. The base for comparison has been the most complicated part.

4. AOB

none

5. Next meetings

31 May Full Group Prep/discussion ICANN74 | 13:00 UTC (first hour)

· VM SG 14:00 UTC | Second hour

7 June – Prep ICANN74 (TBD)

Bart: When does this group re-convene?

Either 2 or 3 weeks after ICANN74. 28 June, 5 July?

Kenny: tentatively propose 28 June. Same time.

6. Closure

Thank you all.

Joke Braeken

joke.braeken@icann.org