David, thanks for the reply. Some responses below.

On 01/05/2018 10:42 AM, David Conrad wrote:
Doug,

On January 4, 2018 at 11:50:02 PM, Doug Barton (dougb@dougbarton.email) wrote:
Since a little before September when the 8145 data started rolling in 
all I've heard discussed is the risk to the deployed base if we do the 
roll and their stuff breaks. But there is another, arguably greater risk 
that is not being discussed: what happens if we get ourselves into a 
position where we are forced to do an emergency roll? (The common 
scenarios for that are key compromise, which is very unlikely but not 
impossible, and alg failure.) 

If the key gets lost or compromised, my understanding is that we cannot use RFC 5011 to do the roll and must fall back to doing an out-of-band key rollover. We aren’t really exercising this under this iteration of the community-defined KSK rollover plan.


Correct, but we're well past the point where we can run an emergency roll as an exercise. When I was leading IANA and we were in the early stages of planning for signing the root, several of us (and I include myself in that group) wanted to hold the first 90 days or so after the signing went public as a sort of "production beta" period, and perform an emergency roll during that period for precisely this reason. But that ship sailed long ago, so at this point we are limited in what we can do.

While the current exercise won't directly address any concerns related to an emergency roll, ANY roll at this point is better than no roll, because the risk of not knowing what will happen during/after a roll is great now, and increases daily as DNSSEC grows in importance.

There are only two conditions that can be true at this point:
[…]
If #1 is true we should do the roll ASAP […]
If #2 is true we should do the roll ASAP […]

As I’ve noted previously, this would appear to argue that SAC-063 rec#3 should not have been made and that the amount of “breakage” is irrelevant. It would be nice if SSAC were to weigh in on this.

I was a founding member of the SSAC before I joined the ICANN staff. While it's been some time since I participated with them, I feel I understand their remit pretty well. The recommendation you refer to reads (sorry for any copy/paste issues):
It is expected that there will be some issues during at least the first KSK rollover, and
probably the next few.  It will not be possible to anticipate all the problems that may
occur but an agreed understanding of when the rollover has affected operational stability
beyond a reasonable boundary is essential so the decision to rollback the rollover can be
made quickly and efficiently.
To me, that recommendation strikes a solid balance between acknowledging that there will be problems and taking the importance of stability into account, by asking for both a rollback plan and criteria for the rollback decision. That all seems perfectly reasonable and appropriate. The SSAC is asked to provide advice on both Security and Stability. In this case, adding security (by showing that a 5011 roll can be performed with a minimum of disruption) requires a small but necessary sacrifice in stability. That makes this issue no different from other, similar issues, like IPv6, IDNs, the new gTLD program, or even the introduction of DNSSEC itself.

At this point we have gathered as much data as we can realistically obtain (as you, Matt, and others have pointed out). That the data is both incomplete and imperfect is unfortunate, but inevitable given the state of the technology. If the reason for failing to proceed with the plan at this point is the data, I fear that we will never proceed, as there is no reasonable expectation that the data will improve in quality, no matter how long we wait.

It's also worth pointing out, with regard to your concern about the SSAC recommendation, that it does not call for perfect data. In fact, it specifically acknowledges that "It will not be possible to anticipate all the problems that may occur." That point seems to bear repeating, since I have seen several references to SAC-063 as a reason not to proceed, while my reading of that document does not support that conclusion in any way. Like you, I think it would be useful if the members of the AC who worked on that document would weigh in.

At the end of the day, I sympathize with your position, David. (Perhaps more so than most others would be able to.) But waiting makes the problem worse, not better. At this point it's entirely reasonable to conclude that ICANN has gone above and beyond in their efforts to make the transition process as smooth as possible, and now it's time to move forward.

Warmest regards,

Doug