[Ssr2-review] Recommendation 9

June 21, 2019

      On Thu, Jun 20, 2019 at 04:36:05PM +0300, Matogoro Jabera wrote:
...
Dear Colleagues,
I have started my trip to Marrakech and have already boarded. Please note
my apology to this meeting.
I have seen KC's review of Rec. 9 and I support the suggestions, which have
improved the phrasing and context.
Matogoro,

I actually did not finish a rewrite of Recommendation 9. Instead,
I spent a couple of hours reading and thinking about the
comments on the recent DAAR reports:

https://www.icann.org/en/system/files/files/octo-ssr-responses-daar-public-i...

I find the public comments and ICANN's responses to them 
to be a precious anthology of What's Wrong With This Ecosystem.

Vixie waxes his usual eloquence:

	The immediate impact of the DAAR methodology report is to enter
	into public evidence the following quite damning fact: A Whois
	query is the only means available to obtain the identity of a
	domain name's sponsoring registrar. This was an accident of
	history, overlooked during the IFWP process which separated
	registrar functions from registry functions for the first time.
	We needed a machine-readable way to determine, at scale, the
	identity of the sponsoring registrar for a domain.  The absence
	of such a facility has allowed many registrars to operate in a
	very dirty, ugly, extractive, and public-abusive way. It's common
	to register domains and then drag one's feet about complaints.
	There is no business risk to a registrar who behaves in this
	way. In the absence of such business risk, these public-abusive
	behaviors have scaled quite well and that's a problem.

Several SSR2 members, among others, have observed that ICANN
obviously has the information about sponsoring registrars of
domains these blacklists say are abusive, but won't reveal it.
ICANN claims it can't reveal this information, due to agreements
with the companies it is buying proprietary blacklist information from.  
But it also says that:

	In order to ensure reliable output, we will not be publishing
	registrar related data and analytics until we are able to collect
	comprehensive data and develop reliable metrics for registrar abuse.

So apparently it's not (just?) about proprietary data NDAs.

But it is not clear what "able to collect comprehensive
data" and "develop reliable metrics for registrar abuse" mean.
Is this research in ICANN's strategic plan?  I do not remember
ICANN taking on this task.  

This also means there is no scientific validation of the
accuracy or coverage of the blacklist information.
ICANN claims "the false-positive rates in the blocklists used
in DAAR are extremely low; some exhibit a false-positive rate
of about 0.1%, or one in a thousand", but I don't know
where this claim comes from, or who validated it against what
ground truth at what time.

ICANN also comments that "the data used by DAAR is seen by the
network operations community, e.g., email providers, ISPs,
website operators, etc., as a reliable indicator of where abuse
related to TLDs exists, and in what concentration", but again,
I don't know where this claim comes from.  

ICANN also comments that "As the DAAR methodology documents
which reputation providers and lists offered by those providers
are being used, registry and registrar operators are able
to independently consume or monitor those blocklists as part
of their anti-abuse efforts." which I think means ICANN thinks
registries and registrars should go off and pay for those 
proprietary feeds themselves, though technically ICANN has
already used revenue from registries and registrars to pay
for these feeds.  Why should the registries and registrars
have to pay for them twice?

Finally, ICANN comments that "There are still ongoing discussions
about whether and how the data will be published on the ICANN
Open Data Program, in cases where licensing permits. ICANN org
also intends to publish monthly reports based on DAAR data
including anonymized aggregates and making the DAAR data
associated with individual registries and registrars available
to those registries and registrars."

I guess if we want to learn more about this ecosystem, to
write a more informed recommendation, we could ask ICANN to
share copies of the contracts with these companies, and we
could talk with the companies about various pricing strategies
that would actually address the problem Vixie describes.

In turn, industry (RySG) claims that:

-- ICANN should provide more information about how it selects the sources
for the DAAR collection system, including selection criteria and how the
quality of the sources is assessed and measured over time.

-- Members of ICANN's Contractual Compliance staff are already on the record
as noting that they cannot use the DAAR statistics in isolation as an
enforcement tool. It would be a worrying precedent should ICANN Compliance
actually use such data, in its current form, to ground any enforcement
action, especially considering the stated lack of actionable evidence
and the decision not to perform any actual quality review or verification
of the Data Feeds (RBLs) in use

So, DNS industry players (registrars and registries) object to the fact that 
DAAR amplifies and anonymizes unvalidated proprietary blacklist data 
in an unaccountable fashion.  Researchers and security professionals
object to the coarse granularity of the reports. Although the reports
confirm that the vast majority of inferred malicious activity was 
occurring in just a few global TLDs, ICANN will not publish the
names of the registrars and registries serving these malicious domains.
The lack of attribution prevents any cleanup of the malicious activity.
ICANN has asserted that it cannot publish this information due to 
agreements with the blacklist feed operators, although apparently
epistemological issues are also in play.

This is to say nothing of the elephant in the room, which is that
even if those agreements were not in place, and even if the 
blacklists could be validated, why would ICANN publish information
that would jeopardize the reputation of any of the registries 
or registrars?  It strikes me as an unreasonable expectation.

In summary, I think there is more to say than what I put in the
doc last night.  I joined the call today hoping to discuss it,
but there were so few people online that I did not consider it worth it.

Besides, Naveed was very busy on that call speaking the truth,
and I think what he said was more important, for the moment, 
than Recommendation 9.

k

k claffy