Hi,

On Thu, Feb 15, 2018 at 02:03:49PM +0000, Greg Aaron wrote:
Well... no. We can certainly agree that a move to RDAP is sorely needed. But deficiencies in the WHOIS protocol were not the problem.
Actually, your quote shows otherwise. See below.
Rather it was failure by many registrars to implement properly and uniformly -- not just "bad actors" but the many more that were inattentive or not competent.
I used "bad actors" loosely, to include registrars who didn't do their job. I can't tell whether a registrar who doesn't do its job is incompetent, lazy, or malicious. And I don't care.
"Historically, the centralized databases of thick Whois registries are operated under a single administrator that sets conventions and standards for submission and display,
The mere fact that this talks about display _at all_ is evidence that the whois protocol itself was indeed part of the problem. Display in a data system should not be under the control of the data source, but under some formatting system. This is the same reason that my user agent (browser) is responsible for formatting things on the web according to the css file sent by the server (or, perhaps, some other css file I can use locally to override that stylesheet).

The submission standards are a different problem, because in fact there are agreements that _already_ govern such submissions. There is no need for a central database to get the submission correct, unless some participants in the system are just not doing their job. The answer to that, of course, is market discipline, including either reputation system counter-bias or deaccreditation. ICANN's dependence on fees from the registry business makes it an ineffective agent of deaccreditation, of course.
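To illustrate the point about display belonging to a formatting layer rather than to the data source, here is a minimal sketch. The record, its field names, and both renderers are made up for illustration; none of this is taken from any registry's actual output.

```python
# Hypothetical registration record as the data source would return it:
# plain data, with no opinion about how it should look.
RECORD = {
    "name": "example.com",
    "registrar": "Example Registrar, Inc.",
    "status": ["active"],
}


def _as_text(value):
    """Turn a field value into text; lists become comma-separated."""
    if isinstance(value, list):
        return ", ".join(str(item) for item in value)
    return str(value)


def render_plain(record):
    """One client's choice of display: 'key: value' lines."""
    return "\n".join(f"{key}: {_as_text(value)}" for key, value in record.items())


def render_aligned(record):
    """Another client's choice: keys padded to a common width."""
    width = max((len(key) for key in record), default=0)
    return "\n".join(
        f"{key.ljust(width)}  {_as_text(value)}" for key, value in record.items()
    )


if __name__ == "__main__":
    print(render_plain(RECORD))
    print()
    print(render_aligned(RECORD))
```

Both renderers consume the same record, so a change in how one client displays it requires nothing from the data source and cannot break any other consumer.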
The thin model is thus criticized for introducing variability among Whois services, which can be problematic for legitimate forms of automation.
This is again evidence that whois the protocol was part of the problem. You can't automate against whois because the only way to do it is to scrape screens, and that is unreliable. Indeed, in a properly formatted data output, even missing data isn't as great a problem, because your automation can cope with the missing data precisely because it is formatted for machine consumption.
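A small sketch of the difference, assuming two invented free-text whois responses and an invented structured record. The labels, field names, and sample values are hypothetical and are not drawn from any real registrar or from the RDAP specification.

```python
import json
import re

# Two hypothetical free-text responses. Each server picked its own labels.
WHOIS_A = """Domain Name: EXAMPLE.COM
Registrar: Example Registrar, Inc.
Creation Date: 2001-02-03
"""

WHOIS_B = """domain:       example.net
registrar....: Another Registrar
created on:   03-Feb-2001
"""


def scrape_creation_date(text):
    """Screen-scrape the creation date; only knows one label spelling."""
    match = re.search(r"^Creation Date:\s*(\S+)", text, re.MULTILINE)
    return match.group(1) if match else None


# Hypothetical structured response with no creation date in it.
STRUCTURED = json.loads(
    '{"name": "example.net", "registrar": "Another Registrar"}'
)


def structured_creation_date(record):
    """Look the field up by key; None means the field really is absent."""
    return record.get("created")


if __name__ == "__main__":
    print(scrape_creation_date(WHOIS_A))         # label matches the regex
    print(scrape_creation_date(WHOIS_B))         # date is there, scraper misses it
    print(structured_creation_date(STRUCTURED))  # date is genuinely missing
```

With the scraper, a `None` cannot distinguish "the data is missing" from "this server formats things differently". With the structured record, the same `None` is unambiguous, and the automation can carry on using the fields that are present.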
In other words: security, stability, and usability reasons.
Those may be the reasons people selected this path, but it has always been evident to many of us that the mistake was in relying on a protocol ill-suited to the purpose. We didn't get greater security from it: data leaks like crazy, there is no authentication of who is requesting, and people lie about their data for the perfectly reasonable end of not getting doxed just because of having a domain name. We didn't get greater stability, either, because we have increased the data maintenance burden on registries for no obvious benefit, and have increased the probability of data mismatches across two different "sources of truth" (as they say, "The man with two watches never knows what time it is"). And usability was not improved, either, because whois can't do internationalization, can't give you only the data you want, and can't do referrals reliably or effectively.
The accuracy of the data is a completely separate matter.
It is not. A significant reason for data problems is manifestly the bad protocol, which creates incentives for white lies.
A distributed system relies on the competence, robustness, and good faith of all the parties involved. Centralizing some aspects can mitigate failures, incompetence, and bad faith.
It seems to me that the current whois is, quite literally, a counterexample to your claim, whereas the DNS and its actual operation on the Internet suggest to me that when the incentives are correctly aligned a distributed system works well.

I can buy the argument that the R/R/R model was dumb, and that registrars are a needless wheel that does no work in the registration system. _That_ is a reason to centralise all data in the registry. But I doubt we are headed in that direction.

Alternatively, I can buy the argument that the registry should be the only source of data having to do with any registration, and that registrars are basically just authorized agents of the registry and must provide passthrough access to such data as is related to domain name registrations. _That_ could be a reason to centralise all data in the registry, too. It leads pretty quickly to pretty serious questions (the ones we have been debating) about exactly which data the registry really needs in support of domain name registration, and it also leads to additional questions (not yet discussed) about whether registrars may retain any of that data in their own repositories when they are collected only for domain name registration. I suspect the answer is no, at least not without consent (registrars would have access to it anyway, through the same SRS where the data would be stored, though I can't imagine any registrar being comfortable with nailing its own uptime to that of every registry in the world).

I cannot buy, however, any claim that centralising the data necessarily makes things better for the Internet. I don't think the claim has been demonstrated, and I can think of lots of ways in which it is obviously false.

Best regards,

A

-- 
Andrew Sullivan
ajs@anvilwalrusden.com