Observation on Large response issue during Yeti KSK rollover
Hi ICANN KSK rollover team,
For your information, I have an observation on large response impacts during Yeti KSK rollover. Please check the article.
http://yeti-dns.org/yeti/blog/2017/08/02/large-packet-impact-during-yeti-ksk-rollover.html
Best regards, Davey
Thanks Davey,
Just to make sure I understand, these are IPv6-only measurements and results, correct?
DW
On Aug 2, 2017, at 2:31 AM, Davey Song(宋林健) <ljsong@biigroup.cn> wrote:
Hi ICANN KSK rollover team,
For your information, I have an observation on large response impacts during Yeti KSK rollover. Please check the article.
http://yeti-dns.org/yeti/blog/2017/08/02/large-packet-impact-during-yeti-ksk...
Best regards, Davey
_______________________________________________
ksk-rollover mailing list
ksk-rollover@icann.org
https://mm.icann.org/mailman/listinfo/ksk-rollover
Yes.
A comparison between IPv4 and IPv6 would have been better, but we only have IPv6 traffic.
In the initial setup, the same group of probes also queried over TCP as a control, to rule out routing problems or other network failures. However, some of the probes I chose run an old version that has bugs when sending DNS over TCP.
Do you have similar tests, or references to other people's work, that give quantitative results on this? I mean the degree of impact of large responses in IPv6 (or IPv4) networks. I'm not sure whether the result I got (less than 1% misbehaving) is typical or not.
Davey
-----Original Message-----
From: Wessels, Duane [mailto:dwessels@verisign.com]
Sent: 2 August 2017 23:16
To: Davey Song(宋林健)
Cc: ksk-rollover@icann.org
Subject: Re: [ksk-rollover] Observation on Large response issue during Yeti KSK rollover
Thanks Davey,
Just to make sure I understand, these are IPv6-only measurements and results, correct?
DW
Davey Song(宋林健) writes:
Do you have similar tests, or references to other people's work, that give quantitative results on this? I mean the degree of impact of large responses in IPv6 (or IPv4) networks. I'm not sure whether the result I got (less than 1% misbehaving) is typical or not.
We did test this for the KSK rollover plan. The interesting result was that if one doesn't fragment but just sends responses with sizes bigger than 1220 (but below 1480 or so), more queries go through. Geoff Huston has a blog post about this ("starring root servers" or so is in its title).
jaap
I'm sorry, I made a mistake in the conclusion. The failure rate is around 7%, not 0.7%, which is worse than the conclusion I drew before.
Davey
-----Original Message-----
From: Davey Song(宋林健) [mailto:ljsong@biigroup.cn]
Sent: 3 August 2017 9:28
To: 'Wessels, Duane'
Cc: 'ksk-rollover@icann.org'
Subject: Re: [ksk-rollover] Observation on Large response issue during Yeti KSK rollover
Yes.
A comparison between IPv4 and IPv6 would have been better, but we only have IPv6 traffic.
In the initial setup, the same group of probes also queried over TCP as a control, to rule out routing problems or other network failures. However, some of the probes I chose run an old version that has bugs when sending DNS over TCP.
Do you have similar tests, or references to other people's work, that give quantitative results on this? I mean the degree of impact of large responses in IPv6 (or IPv4) networks. I'm not sure whether the result I got (less than 1% misbehaving) is typical or not.
Davey
I changed the conclusion by correcting the number to 7%, and added a proposed solution: hold DNS response sizes to a 1220-octet boundary.
Davey
-----Original Message-----
From: Davey Song(宋林健) [mailto:ljsong@biigroup.cn]
Sent: 3 August 2017 9:36
To: 'Wessels, Duane'
Cc: 'ksk-rollover@icann.org'
Subject: Re: [ksk-rollover] Observation on Large response issue during Yeti KSK rollover
I'm sorry, I made a mistake in the conclusion. The failure rate is around 7%, not 0.7%, which is worse than the conclusion I drew before.
Davey
I added the following to the article:
In ICANN’s KSK rollover plan, the packet size will exceed the 1280-octet limit, reaching 1414 octets on 2017-Dec-20 and 1424 octets on 2018-Jan-11. This means around 2% of IPv6 resolvers (or IPv6 DNSKEY queries with the DO bit set) will experience timeouts. Geoff reported that 17% of resolvers cannot send a query over TCP. So in the extreme case, probably 0.34% of IPv6 resolvers around the world will fail to validate the answers. 0.34% of millions (if IPv6 becomes dominant) is not a trivial number.
Davey
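For readers following the arithmetic, the 0.34% figure is the product of the two rates quoted above. This is a minimal sketch with hypothetical names, and it assumes the two rates are independent, which the thread does not establish:

```python
# Rates quoted in the thread. Treating them as independent is an assumption
# made for this illustration only.
TIMEOUT_RATE = 0.02  # IPv6 resolvers expected to time out on responses over 1280 octets
NO_TCP_RATE = 0.17   # resolvers observed not to retry over TCP


def worst_case_failure_rate(timeout_rate: float, no_tcp_rate: float) -> float:
    """Share of resolvers that lose the large UDP response and never retry over TCP."""
    return timeout_rate * no_tcp_rate


rate = worst_case_failure_rate(TIMEOUT_RATE, NO_TCP_RATE)
print(f"{rate:.2%}")  # 0.34%
```

If the two failure modes are correlated (for example, the same middleboxes both dropping fragments and blocking TCP), the real figure could be higher than this product.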
-----Original Message-----
From: ksk-rollover-bounces@icann.org [mailto:ksk-rollover-bounces@icann.org] On behalf of Davey Song(宋林健)
Sent: 3 August 2017 9:50
To: 'Wessels, Duane'
Cc: ksk-rollover@icann.org
Subject: [ksk-rollover] Re: Observation on Large response issue during Yeti KSK rollover
I changed the conclusion by correcting the number to 7%, and added a proposed solution: hold DNS response sizes to a 1220-octet boundary.
Davey
On 3 Aug 2017, at 7:33, Davey Song wrote:
Geoff reported that 17% of resolvers cannot send a query over TCP. So in the extreme case, probably 0.34% of IPv6 resolvers around the world will fail to validate the answers. 0.34% of millions (if IPv6 becomes dominant) is not a trivial number.
Is the set of resolvers that cannot ask a TCP query (inversely) correlated with resolvers that do DNSSEC? I would assume that a DNSSEC-capable resolver will happily resolve over TCP. I can't imagine that there is a 17% prevalence of TCP-blocking firewalls. But who knows…
--Olaf
On 22 Aug 2017, at 12:23 am, Olaf Kolkman <kolkman@isoc.org> wrote:
Is the set of resolvers that cannot ask a TCP query (inversely) correlated with resolvers that do DNSSEC? I would assume that a DNSSEC-capable resolver will happily resolve over TCP. I can't imagine that there is a 17% prevalence of TCP-blocking firewalls. But who knows…
(Heh - no matter how broken and stupid the behaviour you are looking for, if you look hard enough on the Internet you _will_ find it!)
I got that number by deliberately truncating a DNS response and then looking at the resolvers that performed a follow-up query using TCP. The primary observation was that 17% of the IP addresses that queried using UDP and (presumably) received the UDP response failed to query using TCP. This appears to relate to resolvers used by some 6% of endpoints. 2/3 of these endpoints appear to then re-query using another resolver, and this other resolver performed a TCP query in response to the truncated UDP response. Some 1/3 of these endpoints (or 2% of all observed endpoints) appeared to not resolve the name, as we saw NO TCP queries for the given DNS name.
The issue with truncation is that you should get a truncated response back to the resolver to trigger the TCP re-query. For that to happen, the resolver needs to query using a small (or no) EDNS(0) UDP buffer size, or the server needs to arbitrarily truncate the UDP response even though it may be smaller than the offered EDNS(0) UDP buffer size. If we look at root server letters, it appears that B and G truncate a large UDP response in IPv4 at 1,252 bytes of payload (1,280 bytes overall IP packet size). In IPv6 it appears that A, B, G and J truncate UDP responses at 1,232 octets of DNS payload (corresponding to an IPv6 packet of 1,280 bytes in size).
So your question was: is this correlated with resolvers that do DNSSEC?
This is a hard question to answer, as it is not clear how to reliably tell if a resolver “does” DNSSEC. Merely looking for resolvers that set the DO bit in the query is highly misleading. Some 70% of all observed queries have the DO bit set, but more than half of these resolvers never follow up with validation-related queries for DNSKEY and DS RRs. So maybe we should look for DNSKEY and DS queries as well? But if I ask a query through a recursive resolver and set the CD bit, then the resolver will not query for DS and DNSKEY records. Equally, if a validating recursive resolver uses a non-validating resolver as its forwarder, then the “front end” of the resolver will issue the DNSKEY and DS queries to the server, so it will give the appearance that it is validating when in fact it is not.
It’s not an easy problem to unravel, and in the end I used a slightly different approach: I looked for a count of the proportion of endpoints that exclusively use DNSSEC-validating resolvers, rather than attempting to count the number of DNSSEC-validating resolvers.
regards,
Geoff
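The payload thresholds Geoff quotes follow from subtracting the IP and UDP headers from a 1,280-byte packet. The sketch below reproduces that arithmetic and the truncation condition he describes. The function names and the `server_cap` parameter are hypothetical, and the header sizes assume no IPv4 options and no IPv6 extension headers:

```python
# Minimum header sizes in bytes (no IPv4 options, no IPv6 extension headers).
IP_HEADER = {"ipv4": 20, "ipv6": 40}
UDP_HEADER = 8
MIN_EDNS_BUFSIZE = 512  # limit assumed when the query carries no EDNS(0) option


def max_dns_payload(packet_size, family):
    """DNS payload that fits in one unfragmented UDP packet of the given size."""
    return packet_size - IP_HEADER[family] - UDP_HEADER


def is_truncated(response_len, edns_bufsize, server_cap):
    """True if the server would send a truncated (TC=1) reply instead of the full response.

    edns_bufsize is None when the query has no EDNS(0) option; server_cap is a
    hypothetical server-side ceiling such as the 1,232-octet limit described above.
    """
    if edns_bufsize is None:
        advertised = MIN_EDNS_BUFSIZE
    else:
        advertised = max(edns_bufsize, MIN_EDNS_BUFSIZE)
    return response_len > min(advertised, server_cap)


print(max_dns_payload(1280, "ipv4"))  # 1252
print(max_dns_payload(1280, "ipv6"))  # 1232

cap_v6 = max_dns_payload(1280, "ipv6")
print(is_truncated(1400, 4096, cap_v6))  # True: server cap applies despite the large buffer
print(is_truncated(1200, 4096, cap_v6))  # False: fits under both limits
```

A server that applies no cap of its own would only truncate when the response exceeds the advertised buffer, which is why a resolver advertising 4096 octets may never see the TC bit and never be prompted to retry over TCP.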
Apologies for not answering sooner; I was travelling last week and then out of the office on Friday. I saw the post earlier, but just couldn't respond.
We've been tracking this issue. The measurements we relied upon were done some time ago, as Jaap mentions, via Geoff Huston's work. What we have been doing in addition includes:
- tracking the existing deployment of large-ish key sets across Top Level Domains. (I.e., there's already operational experience with the situation.)
One of the tests done is described at the URL coming up. (It's not as extensive as a fleet of RIPE anchors though.)
https://www.youtube.com/watch?v=sc2J5w4Zi2g&list=PLQziMT9GXafWAdedvzQYmqvW3N...
We've also been recommending, via talks at operator venues, to run TCP, and have also presented DNS-OARC's and Verisign's response size tests. See slide 27 of https://ripe74.ripe.net/presentations/25-RIPE74-lewis-submission.pdf. BTW, slide 25, leading up to that, was inspired by Yoneya(-san) Yoshiro of JPRS.
On 8/2/17, 02:31, "ksk-rollover-bounces@icann.org on behalf of Davey Song(宋林健)" <ksk-rollover-bounces@icann.org on behalf of ljsong@biigroup.cn> wrote:
Hi ICANN KSK rollover team,
For your information, I have an observation on large response impacts during Yeti KSK rollover. Please check the article.
http://yeti-dns.org/yeti/blog/2017/08/02/large-packet-impact-during-yeti-ksk...
Best regards, Davey
participants (6)
- Davey Song(宋林健)
- Edward Lewis
- Geoff Huston
- Jaap Akkerhuis
- Olaf Kolkman
- Wessels, Duane