Re: [RSSAC Caucus] 48 HOUR LAST CALL: UPDATED RSSAC Advisory on Metrics for the DNS Root Servers and Root Server System

Feb. 18, 2020

      On Tue, Feb 18, 2020 at 2:30 PM Geoff Huston <gih@apnic.net> wrote:
...
...
On 19 Feb 2020, at 5:57 am, Fred Baker <fred@isc.org> wrote:
...
On Feb 13, 2020, at 5:25 PM, Geoff Huston <gih@apnic.net> wrote:
I could imagine a test of sending 10 (or some other not too small, not
too large number) back-to-back queries to a root server and checking that
all queries receive a response. A highly loaded server instance would not
necessarily provide all 10 responses, while a server instance operating
within its designed query load parameters would provide all the responses.
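The back-to-back test proposed above could be sketched roughly as follows. This is only an illustration, not text from the metrics document: the function names (`build_query`, `send_train`, `missing`), the choice of a `. NS` query, the 2-second timeout, and the use of sequential query IDs are all assumptions made for the example.

```python
import socket
import struct
import time


def build_query(qid, qname=".", qtype=2):
    """Encode a minimal DNS query (default: NS for the root), RD bit clear."""
    header = struct.pack("!HHHHHH", qid, 0x0000, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in qname.strip(".").split(".")
        if label
    )
    question = labels + b"\x00" + struct.pack("!HH", qtype, 1)  # class IN
    return header + question


def send_train(server, count=10, timeout=2.0, port=53):
    """Send `count` queries back to back; return the set of IDs answered."""
    wanted = set(range(1, count + 1))
    answered = set()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Emit the whole train first, without waiting between packets.
        for qid in sorted(wanted):
            sock.sendto(build_query(qid), (server, port))
        # Then collect responses until all arrive or the timeout expires.
        deadline = time.monotonic() + timeout
        while answered != wanted:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            sock.settimeout(remaining)
            try:
                data, _ = sock.recvfrom(4096)
            except socket.timeout:
                break
            if len(data) >= 2:
                (rid,) = struct.unpack("!H", data[:2])
                if rid in wanted:
                    answered.add(rid)
    return answered


def missing(count, answered):
    """IDs in the train 1..count for which no response was seen."""
    return sorted(set(range(1, count + 1)) - set(answered))
```

A caller would run something like `missing(10, send_train(address))` against a root server address of its choosing; a non-empty result is the "missing responses" signal, although the sketch cannot tell server-side load shedding apart from loss in the network.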
I'll echo Paul and Duane's comments here. On this one, I have a question
of statistical validity. RFC 6928 recommends a TCP initial window of ten
because that is a number that can be reasonably expected to traverse the
open Internet if initiated as a burst. Matt has gone so far as to tell me
that his measurements suggest that some TCP Offload Engines appear to
successfully send bursts of 60K bytes or about 40 segments back to back. So
I wonder whether we would learn anything from a ten packet burst - do we
need a 100 packet burst, or something else, and for what reason do we need
that?
In any event, I think we would need something resembling suggested text,
and some evidence that the measurement tests a case that eluded the
existing tests. That's not "push back" as much as "what do we learn if we
add this one?"
So, to address "what do we learn if we add this one" first: we learn the
extent to which individual service instances are "coping" with the query
load that is imposed on them.
The other questions are more about the details of the measurement. If a
train of UDP packets is injected into the network, what is the anticipated
success rate of transmission through the network? Will a train of 10 enjoy
a higher probability of complete delivery than a train of 100? And so on.
The 'signal' that this measurement would be looking for is missing
responses, and the inference would be that an overloaded UDP service would
load shed by discarding incoming queries.
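The 10-versus-100 question can be given a rough baseline with a simple model. The sketch below assumes that each query/response exchange is lost independently with the same probability, which is an assumption made purely for illustration (loss within a burst is often correlated); the function name `p_all_answered` and the 1% loss figure are likewise hypothetical.

```python
def p_all_answered(train_length, p_loss):
    """Probability that every query in a train is answered.

    Assumes each query/response exchange is lost independently with
    probability p_loss (an illustrative assumption, not a measured model).
    """
    if not 0.0 <= p_loss <= 1.0:
        raise ValueError("p_loss must be between 0 and 1")
    return (1.0 - p_loss) ** train_length


if __name__ == "__main__":
    # Compare the train lengths mentioned in the thread at 1% loss.
    for n in (10, 40, 100):
        print(n, round(p_all_answered(n, 0.01), 3))
```

Under this model, at 1% background loss a train of 10 is fully answered about 90% of the time and a train of 100 only about 37% of the time, so longer trains make it harder to attribute a missing response to load shedding rather than to ordinary network loss.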
Another thing to consider (especially with regard to load shedding or
mitigation efforts):
What about sending queries with DNS Cookies set, or differentially querying
with/without those, when doing multiple UDP packet trains?
(That would potentially provide useful data on loss rates and on whether
loss is suspected to stem from non-organic traffic levels, such as DoS/DDoS
events, as well as reveal which instances support DNS Cookies.)
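As a sketch of what a cookie-bearing probe might look like on the wire, the code below builds a `. NS` query carrying an EDNS0 OPT record with a COOKIE option (option code 10, 8-byte client cookie, per RFC 7873) and checks an OPT RDATA blob for a server cookie. The helper names, the 1232-byte advertised UDP size, and the choice of query are assumptions for the example, not part of any proposed metric.

```python
import os
import struct

COOKIE_OPTION_CODE = 10  # EDNS0 option code for COOKIE (RFC 7873)
OPT_RR_TYPE = 41         # OPT pseudo-RR type


def build_cookie_query(qid, client_cookie, udp_size=1232):
    """Build a '. NS' query with an OPT RR carrying a client cookie."""
    if len(client_cookie) != 8:
        raise ValueError("client cookie must be exactly 8 bytes")
    header = struct.pack("!HHHHHH", qid, 0, 1, 0, 0, 1)  # ARCOUNT = 1
    question = b"\x00" + struct.pack("!HH", 2, 1)         # root, NS, IN
    option = struct.pack("!HH", COOKIE_OPTION_CODE, len(client_cookie))
    option += client_cookie
    # OPT RR: root name, TYPE, CLASS = UDP size, TTL = 0, RDLENGTH, RDATA
    opt_rr = b"\x00" + struct.pack(
        "!HHIH", OPT_RR_TYPE, udp_size, 0, len(option)
    ) + option
    return header + question + opt_rr


def parse_edns_options(rdata):
    """Split OPT RDATA into a {option_code: option_data} mapping."""
    options = {}
    offset = 0
    while offset + 4 <= len(rdata):
        code, length = struct.unpack_from("!HH", rdata, offset)
        offset += 4
        options[code] = rdata[offset:offset + length]
        offset += length
    return options


def has_server_cookie(rdata):
    """True if the COOKIE option holds a client cookie plus a server cookie."""
    cookie = parse_edns_options(rdata).get(COOKIE_OPTION_CODE, b"")
    # 8-byte client cookie followed by an 8 to 32 byte server cookie.
    return 16 <= len(cookie) <= 40


if __name__ == "__main__":
    query = build_cookie_query(1, os.urandom(8))
    print(len(query), "byte query with COOKIE option")
```

Sending one train built this way and one without the OPT record, then comparing the response counts, would be one way to do the differential querying described above; extracting the OPT RDATA from a real response requires a fuller DNS message parser than this sketch includes.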

Brian
...
It's late in the process for this particular incarnation of the metrics
document to bring this up. If we think that updating the metrics of the RSS
is an ongoing effort, then another response may well be to keep this in mind
for the next round of document revisions.
Geoff

Brian Dickson