Re: [Gnso-newgtld-wg] EBERO / SLAM Data Status and next steps
Dear All,

There did not appear to be any comments, questions, or objections to how the EBERO/SLAM data topic was characterized by Jeff, including what will be sought from ICANN Org’s GDD team. The information below will be sent to GDD for their review and resolution.

Best,
Steve

From: Gnso-newgtld-wg <gnso-newgtld-wg-bounces@icann.org> on behalf of Jeff Neuman <jeff.neuman@comlaude.com>
Date: Wednesday, October 2, 2019 at 8:14 AM
To: "gnso-newgtld-wg@icann.org" <gnso-newgtld-wg@icann.org>
Subject: [Gnso-newgtld-wg] EBERO / SLAM Data Status and next steps

All,

On 23 August 2019, high-level statistics were shared from ICANN org’s SLA Monitoring program (attached). These statistics provided the monthly number of 1) DNS failures, 2) RDDS failures, and 3) DNS and RDDS failures that reached the EBERO threshold.

On 26 August 2019, Donna Austin shared a RySG document which included data previously shared by ICANN org at the Madrid DNS Symposium / GDD Summit (attached). This data provided more detail on group 3 above in particular, including elements like:

* The cause for why the EBERO threshold was reached (i.e., DNS/DNSSEC or RDDS)
* A further, more specific breakdown within each of the two categories that indicates why the EBERO threshold was reached (e.g., IPv6 transport error, broken chain of trust in DNSSEC, etc.)

Within our Working Group there has been some discussion about what impact this data may have. For example, we discussed whether past failures should have any impact on future application evaluations or RSP pre-approval for future rounds? The Working Group, however, is trending towards agreeing that evaluation and thus RSP pre-approval (which is likely to leverage the same or substantially the same evaluation criteria) should be forward looking and not be impacted by past performance. The Working Group is also trending towards requiring the pre-approval prior to each evaluation round and that the approval will be good for the duration of the then-current round.

A second area in which this data could have an impact is whether or not existing Registry Service Providers should be grandfathered into any future pre-approval processes? Based on the Working Group’s deliberations and public comment received, the answer appears to be no. Most of the comments seem to point to requiring all RSPs (whether new or existing) to go through the same processes in future rounds.

Therefore, based on where the Working Group is heading, the Co-Chairs do not believe that we are in need of additional historical data for purposes of finalizing its work on the RSP Pre-approval program.

We recognize that registry service provider performance has been a topic of discussion for matters other than the pre-approval program. We understand that there has been discussion about whether the relatively high number of registry services failures indicates that RSPs might need to improve how they operate to ensure the security and stability of the DNS. While this WG is forward looking and not intending to impose new commitments for existing RSPs, lessons may be learned to improve the evaluation process for future applicants/RSPs.

Based on the above, the WG believes it does need more context from GDD to understand the data it has been provided, since not all SLA failures are equally harmful. Therefore, in looking at the failures, for example, the WG needs clarity on whether they refer to service availability or Round Trip Time (RTT). The WG also wants to try to examine whether any harm was caused by the downtime, which depends in part on understanding the timing of the failure (e.g., prior to Sunrise, during Sunrise, before General Availability, after GA).

* For the failures that did NOT reach the EBERO threshold, the WG would like summary data of which of the SLRs was missed.
* For the failures that DID reach the EBERO threshold, the WG would like a detailed breakdown (e.g., similar specificity to what was shared at the Madrid DNS Symposium / GDD Summit), which includes the timing of the failure.

This information may help identify weaknesses in the evaluation criteria, registry system testing, and/or contractual requirements captured in the Registry Agreement.

The leadership intends to send the request for additional context next week, but we would like to know if there are any comments from the Working Group.

Thanks for your attention to this matter.

Best regards,

Jeff Neuman and Cheryl Langdon-Orr
SubPro PDP Co-Chairs

The contents of this email and any attachments are confidential to the intended recipient. They may not be disclosed, used by or copied in any way by anyone other than the intended recipient. If you have received this message in error, please return it to the sender (deleting the body of the email and attachments in your reply) and immediately and permanently delete it. Please note that the Com Laude Group does not accept any responsibility for viruses and it is your responsibility to scan or otherwise check this email and any attachments. The Com Laude Group does not accept liability for statements which are clearly the sender's own and not made on behalf of the group or one of its member entities. The Com Laude Group includes Nom-IQ Limited t/a Com Laude, a company registered in England and Wales with company number 5047655 and registered office at 28-30 Little Russell Street, London, WC1A 2HN England; Valideus Limited, a company registered in England and Wales with company number 06181291 and registered office at 28-30 Little Russell Street, London, WC1A 2HN England; Demys Limited, a company registered in Scotland with company number SC197176, having its registered office at 33 Melville Street, Edinburgh, Lothian, EH3 7JF Scotland; Consonum, Inc. dba Com Laude USA and Valideus USA, headquartered at 1751 Pinnacle Drive, Suite 600, McLean, VA 22102, USA; Com Laude (Japan) Corporation, a company registered in Japan having its registered office at Suite 319,1-3-21 Shinkawa, Chuo-ku, Tokyo, 104-0033, Japan. For further information see www.comlaude.com
Dear WG Members,

In response to the request for additional data on EBERO and SLA Monitoring (see email thread below for details and context), please find the response from ICANN org’s Global Domains Division:

Sub Pro PDP Request #1: Looking at the failures, the WG needs clarity on whether they refer to service availability or Round Trip Time (RTT).
· ICANN org Response to Request #1: The failures refer to Service Availability and NOT Round Trip Time (RTT).

Sub Pro PDP Request #2: For the failures that did NOT reach the EBERO threshold, the WG would like summary data of which of the SLRs was missed.
· ICANN org Response to Request #2: The SLRs that were missed were DNS Service Availability and RDDS Service Availability.

Sub Pro PDP Request #3: For the failures that DID reach the EBERO threshold, a detailed breakdown (e.g., similar specificity that was shared at the Madrid DNS Symposium / GDD Summit), which includes the timing of the failure.
· ICANN org Response to Request #3: See below.

The ICANN SLAM stats from Jan 2014 – Nov 2019 show that there have been 52 cases where a gTLD reached one of the emergency thresholds:
· 23 out of the 52 cases were triggered by failures in the DNS/DNSSEC services
· 29 out of the 52 cases were triggered by failures in the RDDS

Of the 52 cases, 11 occurred prior to the TLD's Sunrise period, 8 during Sunrise, 2 post Sunrise, 4 before General Availability and 27 during General Availability.

DNS Failures:
· Prior to Sunrise: 4 TLDs
· During Sunrise: 2 TLDs
· Post Sunrise: 0 TLDs
· Before General Availability: 2 TLDs
· General Availability: 15 TLDs

RDDS Failures:
· Prior to Sunrise: 7 TLDs
· During Sunrise: 6 TLDs
· Post Sunrise: 2 TLDs
· Before General Availability: 2 TLDs
· General Availability: 12 TLDs

13 RSPs, 35 gTLDs and a total of 760.7k active names were involved in the 52 cases.
The root cause, which ICANN began tracking in 2015, can be broken down as follows:

DNS: 23
· The DNS name servers were timing out: 8
· ICANN does not know the root cause: 4
· Either the DNS servers were not responding or, if they were responding, they were returning a malformed DNSSEC response where the NSEC3 records were not included: 4
· There was no response from the DNS servers (apparently a routing issue): 2
· Expired signatures followed by breakage of the chain of trust in DNSSEC: 2
· Break in the chain of trust in DNSSEC: 1
· Expired DNSSEC signatures: 1
· There were no DS records when requesting delegation from IANA: 1

RDDS: 29
· ICANN does not know the root cause: 12
· IPv6 transport failure: 12
· RDDS name servers were timing out: 2
· Web WHOIS service not responding: 2
· Broken chain of trust in DNSSEC: 1

Total: 52
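As a quick cross-check of the figures in the GDD response, the per-phase and root-cause counts can be tallied against the stated totals of 23 DNS cases, 29 RDDS cases and 52 overall. The sketch below only transcribes the numbers reported in the response; the variable and function names are illustrative and not part of any ICANN tooling.

```python
# Counts transcribed from the GDD response (SLAM stats, Jan 2014 to Nov 2019).
# All names here are illustrative; nothing below comes from an ICANN system.
PHASES = [
    "Prior to Sunrise",
    "During Sunrise",
    "Post Sunrise",
    "Before General Availability",
    "General Availability",
]

DNS_BY_PHASE = dict(zip(PHASES, [4, 2, 0, 2, 15]))
RDDS_BY_PHASE = dict(zip(PHASES, [7, 6, 2, 2, 12]))

DNS_ROOT_CAUSES = {
    "name servers timing out": 8,
    "root cause unknown": 4,
    "not responding or malformed DNSSEC response (NSEC3 missing)": 4,
    "no response (apparently a routing issue)": 2,
    "expired signatures then broken chain of trust": 2,
    "break in the chain of trust": 1,
    "expired DNSSEC signatures": 1,
    "no DS records when requesting delegation from IANA": 1,
}

RDDS_ROOT_CAUSES = {
    "root cause unknown": 12,
    "IPv6 transport failure": 12,
    "RDDS name servers timing out": 2,
    "Web WHOIS service not responding": 2,
    "broken chain of trust in DNSSEC": 1,
}


def combined_by_phase() -> dict:
    """Add the DNS and RDDS counts for each launch phase."""
    return {phase: DNS_BY_PHASE[phase] + RDDS_BY_PHASE[phase] for phase in PHASES}


if __name__ == "__main__":
    for phase, count in combined_by_phase().items():
        print(f"{phase}: {count}")
    print("DNS total:", sum(DNS_BY_PHASE.values()))    # 23
    print("RDDS total:", sum(RDDS_BY_PHASE.values()))  # 29
```

The combined per-phase counts come out to 11, 8, 2, 4 and 27, matching the sentence in the response, and both the phase breakdown and the root-cause breakdown sum to 23 (DNS) and 29 (RDDS).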
Very interesting data. The DNS part is what is known and expected: doing DNSSEC right is not easy, and even experienced DNS operators might get themselves into availability issues there.

But the RDDS part is surprising to me: all RDDS protocols use TCP, where IPv6 transport problems are unusual. There is a known problem with UDP, IPv6 and fragmentation that affects DNS, but WHOIS, RDAP and WebWHOIS are all TCP-based. It would be a very interesting phenomenon for ICANN to study and come up with best practices to avoid those issues.

Rubens
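To make the TCP point concrete, here is a minimal sketch of a reachability probe that connects over TCP using IPv6 addresses only, for example to a WHOIS server on port 43. It is not how ICANN's SLA Monitoring system works, which this thread does not describe; the function name, the result type and the default timeout are all assumptions chosen for illustration.

```python
import socket
from typing import NamedTuple


class ProbeResult(NamedTuple):
    reachable: bool
    detail: str


def probe_tcp_over_ipv6(host: str, port: int = 43, timeout: float = 5.0) -> ProbeResult:
    """Attempt a TCP connection to host:port using only IPv6 (AAAA) addresses."""
    try:
        # Restrict resolution to IPv6 stream sockets so IPv4 is never tried.
        infos = socket.getaddrinfo(
            host, port, family=socket.AF_INET6, type=socket.SOCK_STREAM
        )
    except socket.gaierror as exc:
        return ProbeResult(False, f"no IPv6 address resolved: {exc}")

    last_error = "no addresses returned"
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return ProbeResult(True, f"connected to {sockaddr[0]}")
        except OSError as exc:
            # Covers timeouts, refused connections and unreachable routes.
            last_error = f"{sockaddr[0]}: {exc}"
    return ProbeResult(False, last_error)
```

A probe like this separates "no AAAA record" from "address published but connection fails", which is the kind of distinction that would help explain an "IPv6 transport failure" on a TCP-based service.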
Steve – thanks for this. After the discussion on today’s call and looking at the data, I have a few thoughts on the RSP deliberations.

From Jeff's original email:

“The Working Group, however, is trending towards agreeing that evaluation and thus RSP pre-approval (which is likely to leverage the same or substantially the same evaluation criteria) should be forward looking and not be impacted by past performance.”

I agree, no RSP should be disqualified for new rounds based on past performance. Ability to operate a registry system should be determined based on current requirements (EPDP temp specs, for example).

“The Working Group is also trending towards requiring the pre-approval prior to each evaluation round and that the approval will be good for the duration of the then-current round.”

On this I disagree. No RSP should be “certified” or “pre-approved.” It’s very clear from the statistics that many RSPs failed to meet requirements even AFTER they passed PDT:

“Of the 52 cases, 11 occurred prior to the TLDs Sunrise period, 8 during Sunrise, 2 post Sunrise, 4 before General Availability and 27 during General Availability.”

I understand the desire for efficiencies and the frustration that 2012 round operators experienced having to repeat the same tests hundreds of times. Instead of a “pre-certification” or “pre-approval,” I suggest breaking the testing into phases, so that the elements of RSP testing that are not specific to a unique TLD could be performed prior to the application window or even during application review, one time, per provider. A good example of this is verifying EPP: are the communications between the registry and registrar working? IDN tables and LGRs could be reviewed at any time. Testing of data escrow could occur at any time. The parts of the Self-certification documents that are not specific to the TLD could be submitted at any time.
But instead of preferring to this as “Pre approval”, which to me and to future applicants who are not ICANN insiders, implies some sort of service level guarantee, I think we should refer to those RSPs as having passed “Phase 1 testing.” The elements of the test that are specific to a TLD such as DNS and DNSSEC could be proctored as soon as the winner of the Vickrey auction is determined. We could call this Phase 2 testing. What is important to point out is that based on the SLAM statistics showing failures of RSPs before, during, and after PDT testing, as well as unforeseen new requirements such as the EPDP Temp Spec, we cannot assure any applicant that an RSP is pre-certified, But we can find efficiencies in the testing process. “A second area in which this data could have an impact is whether or not existing Registry Service Providers should be grandfathered into any future pre-approval processes? Based on the Working Group’s deliberations and public comment received, the answer appears to be no. “ Agree. Past performance is not an indicator of future performance. So in summary – I suggest we should drop the term pre-approval and instead refer to it as Phase 1 and Phase 2 testing From: Gnso-newgtld-wg <gnso-newgtld-wg-bounces@icann.org> On Behalf Of Steve Chan Sent: Friday, January 17, 2020 1:54 PM To: Jeff Neuman <jeff.neuman@comlaude.com>; gnso-newgtld-wg@icann.org Subject: Re: [Gnso-newgtld-wg] EBERO / SLAM Data Status and next steps Dear WG Members, In response to the request for additional data on EBERO and SLA Monitoring (see email thread below for details and context), please find the response from ICANN org’s Global Domains Division: Sub Pro PDP Request #1: Looking at the failures, the WG needs clarity on whether they refer to service availability or Round Trip Time (RTT). · ICANN org Response to Request #1: The failures refer to Service Availability and NOT Round Trip Time (RTT). 
Sub Pro PDP Request #2: For the failures that did NOT reach the EBERO threshold, the WG would like summary data of which of the SLRs was missed. · ICANN org Response to Request #2: The SLRs that were missed were DNS Service Availability and RDDS Service Availability Sub Pro PDP Request #3: For the failures that DID reach the EBERO threshold, a detailed breakdown (e.g., similar specificity that was shared at the Madrid DNS Symposium / GDD Summit), which includes the timing of the failure. · ICANN org Response to Request #3: See Below The ICANN SLAM Stats from Jan 2014 – Nov 2019 shows that there have been 52 cases where a gTLD reached one of the emergency thresholds: · 23 out of the 52 cases were triggered by failures in the DNS/DNSSEC services · 29 out of the 52 cases were triggered by failures in the RDDS Of the 52 cases, 11 occurred prior to the TLDs Sunrise period, 8 during Sunrise, 2 post Sunrise, 4 before General Availability and 27 during General Availability. DNS Failures: · Prior to Sunrise: 4 TLDs · During Sunrise: 2 TLDs · Post Sunrise: 0 TLDs · Before General Availability: 2 TLDs · General Availability: 15 TLDs RDDS Failures: · Prior to Sunrise: 7 TLDs · During Sunrise: 6 TLDs · Post Sunrise: 2 TLDs · Before General Availability: 2 TLDs · General Availability: 12 TLDs 13 RSPs, 35 gTLDs and a total of 760.7k active names were involved in the 52 cases. 
The root cause, which ICANN began tracking in 2015, can be broken down as follows: DNS 23 The DNS name servers were timing out 8 ICANN does not know the root cause 4 Either the DNS servers were not responding or if they were responding, they were returning a malformed DNSSEC response where the NSEC3 records were not included 4 There was no response from the DNS servers (apparently a routing issue) 2 Expired signatures followed by breakage of the chain of trust in DNSSEC 2 Break in the chain of trust in DNSSEC 1 Expire DNSSEC signatures 1 There were no DS records when requesting delegation from IANA 1 RDDS 29 ICANN does not know the root cause 12 IPv6 transport failure 12 RDDS name servers were timing out 2 Web WHOIS service not responding 2 Broken chain of trust in DNSSEC 1 Total 52 From: Gnso-newgtld-wg <gnso-newgtld-wg-bounces@icann.org<mailto:gnso-newgtld-wg-bounces@icann.org>> on behalf of Steve Chan <steve.chan@icann.org<mailto:steve.chan@icann.org>> Date: Friday, October 18, 2019 at 2:00 PM To: Jeff Neuman <jeff.neuman@comlaude.com<mailto:jeff.neuman@comlaude.com>>, "gnso-newgtld-wg@icann.org<mailto:gnso-newgtld-wg@icann.org>" <gnso-newgtld-wg@icann.org<mailto:gnso-newgtld-wg@icann.org>> Subject: Re: [Gnso-newgtld-wg] EBERO / SLAM Data Status and next steps Dear All, There did not appear to be any comments, questions, or objections to how the EBERO/SLAM data topic was characterized by Jeff, including what will be sought from ICANN Org’s GDD team. The information below will be sent to GDD for their review and resolution. 
Best, Steve From: Gnso-newgtld-wg <gnso-newgtld-wg-bounces@icann.org<mailto:gnso-newgtld-wg-bounces@icann.org>> on behalf of Jeff Neuman <jeff.neuman@comlaude.com<mailto:jeff.neuman@comlaude.com>> Date: Wednesday, October 2, 2019 at 8:14 AM To: "gnso-newgtld-wg@icann.org<mailto:gnso-newgtld-wg@icann.org>" <gnso-newgtld-wg@icann.org<mailto:gnso-newgtld-wg@icann.org>> Subject: [Gnso-newgtld-wg] EBERO / SLAM Data Status and next steps All, On 23 August 2019, high-level statistics were shared from ICANN org’s SLA Monitoring program (attached). These statistics provided the monthly number of 1) DNS failures 2) RDDS failures and 3) DNS and RDDS failures that reached the EBERO threshold. On 26 August 2019, Donna Austin shared a RySG document which included data previously shared by ICANN org at the Madrid DNS Symposium / GDD Summit (attached). This data provided more detail on group 3 above in particular, including elements like: * The cause for why the EBERO threshold was reached (i.e., DNS/DNSSEC or RDDS) * A further, more specific breakdown within each of the two categories that indicate why the EBERO threshold was reached (e.g., IPv6 transport error, broken chain of trust in DNSSEC, etc.) Within our Working Group there has been some discussion about what impact this data may have. For example, we discussed whether past failures should have any impact on future application evaluations or RSP pre-approval for future rounds? The Working Group, however, is trending towards agreeing that evaluation and thus RSP pre-approval (which is likely to leverage the same or substantially the same evaluation criteria) should be forward looking and not be impacted by past performance. The Working Group is also trending towards requiring the pre-approval prior to each evaluation round and that the approval will be good for the duration of the then-current round. 
A second area in which this data could have an impact is whether existing Registry Service Providers should be grandfathered into any future pre-approval processes. Based on the Working Group’s deliberations and public comment received, the answer appears to be no. Most of the comments seem to point to requiring all RSPs (whether new or existing) to go through the same processes in future rounds.

Therefore, based on where the Working Group is heading, the Co-Chairs do not believe that the Working Group needs additional historical data to finalize its work on the RSP Pre-approval program.

We recognize that registry service provider performance has been a topic of discussion for matters other than the pre-approval program. We understand that there has been discussion about whether the relatively high number of registry services failures indicates that RSPs might need to improve how they operate to ensure the security and stability of the DNS. While this WG is forward looking and not intending to impose new commitments for existing RSPs, lessons may be learned to improve the evaluation process for future applicants/RSPs.

Based on the above, the WG believes it does need more context from GDD to understand the data it has been provided, since not all SLA failures are equally harmful. Therefore, in looking at the failures, the WG needs clarity on, for example, whether they refer to service availability or Round Trip Time (RTT). The WG also wants to examine whether the downtime caused any harm, which depends in part on understanding the timing of the failure (e.g., prior to Sunrise, during Sunrise, before General Availability, after GA).
* For the failures that did NOT reach the EBERO threshold, the WG would like summary data of which of the SLRs was missed.
* For the failures that DID reach the EBERO threshold, a detailed breakdown (e.g., similar specificity that was shared at the Madrid DNS Symposium / GDD Summit), which includes the timing of the failure.
* This information may help identify weaknesses in the evaluation criteria, registry system testing, and/or contractual requirements captured in the Registry Agreement.

The leadership intends to send the request for additional context next week, but we would like to know if there are any comments from the Working Group.

Thanks for your attention to this matter.

Best regards,
Jeff Neuman and Cheryl Langdon-Orr
SubPro PDP Co-Chairs
On 30 Jan 2020, at 19:49, Jim Prendergast <jim@GALWAYSG.COM> wrote:
Steve – thanks for this. After the discussion on today’s call and looking at the data, I have a few thoughts on the RSP deliberations.
From Jeff's original email: “The Working Group, however, is trending towards agreeing that evaluation and thus RSP pre-approval (which is likely to leverage the same or substantially the same evaluation criteria) should be forward looking and not be impacted by past performance.”
I agree: no RSP should be disqualified for new rounds based on past performance. Ability to operate a registry system should be determined based on current requirements (EPDP temp specs, for example).
That is already verified by continuous ICANN monitoring and compliance.
“The Working Group is also trending towards requiring the pre-approval prior to each evaluation round and that the approval will be good for the duration of the then-current round.”
On this I disagree. No RSP should be “certified” or “pre-approved.” It’s very clear from the statistics that many RSPs failed to meet requirements even AFTER they passed PDT: “Of the 52 cases, 11 occurred prior to the TLDs Sunrise period, 8 during Sunrise, 2 post Sunrise, 4 before General Availability and 27 during General Availability.”
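For readers who want to check the arithmetic behind the quoted timing figures, here is a minimal Python sketch. It only tallies the five counts quoted above and expresses each as a share of the 52 cases; the variable names are illustrative assumptions and do not come from ICANN's data.

```python
# Timing of the EBERO-threshold cases, copied from the figures quoted above.
# The dictionary keys are illustrative labels, not official ICANN categories.
cases_by_phase = {
    "Prior to Sunrise": 11,
    "During Sunrise": 8,
    "Post Sunrise": 2,
    "Before General Availability": 4,
    "During General Availability": 27,
}

# The five counts should add up to the 52 cases cited in the quote.
total = sum(cases_by_phase.values())

# Share of cases per phase, as a percentage of the total.
shares = {phase: 100 * count / total for phase, count in cases_by_phase.items()}

for phase, count in cases_by_phase.items():
    print(f"{phase:<30}{count:>3}  {shares[phase]:5.1f}%")
print(f"{'Total':<30}{total:>3}")
```

The counts do sum to 52, and by this tally slightly more than half of the cases fall in the General Availability phase.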
I understand the desire for efficiencies and the frustration that 2012 round operators experienced having to repeat the same tests hundreds of times. Instead of a “pre-certification” or “pre-approval” I suggest breaking the testing into phases, so that the elements of RSP testing that are not specific to a unique TLD could be performed prior to the application window or even during application review, one time, per provider. A good example of this is verifying EPP – are the communications between the registry and registrar working? IDN tables and LGRs could be reviewed at any time. Testing of data escrow could occur at any time. The parts of the Self-certification documents that are not specific to the TLD could be submitted at any time.
Those are things of a different nature. EPP can be verified in real-world testing, while IDN tables and LGRs are more of an evaluation test.
But instead of referring to this as “pre-approval,” which to me and to future applicants who are not ICANN insiders implies some sort of service level guarantee, I think we should refer to those RSPs as having passed “Phase 1 testing.”
I like the idea of not using words such as "approval", since those could carry liabilities for ICANN in case of an RSP failure. I also like the idea of slicing the tests, but, as the previous paragraph shows, I might slice them differently.
Rubens