Hi Joe,

comments in-line, below... All opinions are my own personal thoughts (again, no hats).

On Wed 2024-05-22 19:22:55+0200 jabley@strandkip.nl wrote:
It seems like the problems with c.root-servers.org (note, .org) have no material impact to the root server system.
However, the fact that C-Root has been failing to keep up with new revisions of the root zone as they are published for some period of time seems material. On the DNS-OARC dns-operations mailing list there are reports of two top-level domain DNSSEC algorithm rolls whose timing has been impacted, for example, so it doesn't seem to be much of a stretch to say that there's potential for security-related consequences of whatever this mishap turns out to be, even if they are minor.
I am not familiar with the work that Wataru mentioned and I don't know how "security incident" is defined, but I think Wataru's question is reasonable.
The work being referenced is the Security Incident Reporting work party, and the document is here: https://docs.google.com/document/d/1NvSw7PoLGYhXPuMEjiBgqjCtp_khTGGEh0DaHkNJ...

I completely agree that the question is reasonable, and I was merely stating my opinion based on my feel for the way the document has been progressing.
I know you didn't mean to suggest that spending a few minutes searching for impact is sufficient as criteria for judging whether an incident has occurred, but we have metrics defined in RSSAC002 that relate directly to serving stale data; those metrics for C are surely well beyond the expected values over this event. Perhaps it's an idea to use those metrics as quantitative measures of impact?
The statement of work for the SIR wp explicitly states that 'the work party should focus on security incidents that have a *material adverse effect* on the root service.' The work party is carefully avoiding tying any hard numbers or rules to whether an incident qualifies as 'reportable', and is avoiding trying to imagine whether any particular scenario qualifies; it explicitly states that the decision is left to the RSO(s).

Based on the information I have at the moment, my personal opinion is that this incident wouldn't qualify for security incident reporting as defined in the document.

Other interesting questions are:

- what is the impact of stale data being served from some or all of the instances of a single RSO? Does it depend on how old the stale data is?
- what would the impact have been if the rollovers had proceeded?

The answers to those questions, or other additional information, could possibly sway my opinion.

Regards,
Robert

USC Information Sciences Institute <http://www.isi.edu/>
Networking and Cybersecurity Division