The answer is yes, and thanks for the question. I was going to jump in earlier and challenge the same assertion, but figured I had said enough recently. :-)
Furthermore, even the datestamp and registrar-generated data may reveal an association of domains that leads you to the registrant. Let's say I register ten names one day, with the same registrar, one of which is Stephanieperrin.com and another canadianconvertstoterrorism.com. Is it not possible to find that cluster of registrations, and associate all domains with me?
The data commissioners pointed out many years ago (2003 I think, I can check) that they had a problem with the reverse directory capability of the WHOIS, because it was not at all necessary for the functioning of the domain system, or at least ICANN had never made the argument. They did not think WHOIS should offer the capability of searching by registrant name. I would argue further, these days, that publication of other data should not make registrant identity reasonably retrievable.
There is a question that I have in return. I presume that much of the current configuration and policy of WHOIS and its data elements is based on simply building on a flimsy foundation. A primary drive has been to keep the costs to the registrars/registries down, since human intervention is too expensive, and the appetite for data is proving to be insatiable. I don't think either of those parties was keen on publishing the personal data of their customers, but the alternatives were not at all attractive. I realize costs are to be dealt with later, but to what extent has the technical capability increased to the point where we can stop caring about whether the registrar/registry actually publishes the data, or merely allows (for instance) a duly authorized law enforcement agent in the appropriate jurisdiction (i.e. one with a valid warrant or other judicial authorization) to have access to the data in their files?
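As a rough illustration of that last option, here is a minimal sketch of a gate that releases registrant data only to a requestor holding a verified judicial authorization. Everything in it is an assumption made for illustration: the `AccessRequest` fields, the rule that the "appropriate jurisdiction" is simply the registrar's own, and the `verified_authorizations` set standing in for whatever out-of-band process confirms a warrant is genuine. It says nothing about cost or legal sufficiency.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class AccessRequest:
    """Hypothetical request from a law enforcement agency (LEA)."""
    agency: str
    jurisdiction: str            # e.g. a country code
    authorization_id: str        # warrant or other judicial authorization reference
    authorization_expires: date


def may_disclose(request, registrar_jurisdiction, verified_authorizations, today):
    """Return True only if every check passes.

    `verified_authorizations` is an assumed stand-in for an external
    verification step; it is not part of any existing WHOIS system.
    """
    if request.jurisdiction != registrar_jurisdiction:
        return False
    if request.authorization_id not in verified_authorizations:
        return False
    return request.authorization_expires >= today


request = AccessRequest(
    agency="Example Police Service",
    jurisdiction="CA",
    authorization_id="WARRANT-0001",
    authorization_expires=date(2017, 1, 31),
)
print(may_disclose(request, "CA", {"WARRANT-0001"}, date(2016, 12, 8)))
```

The hard parts raised in this message (how fast, what proof, at what cost) all sit behind the verification step that the sketch takes as given.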
I realize we talked about this concept of tiered access extensively in the EWG, but at least one member of that group (me) never understood whether the tiered access we were speccing is something that is technically possible but financially, legally and operationally infeasible.
[Shortly before I retired, I had to deal with a lot of breathless enthusiasm about what "big data" was going to do to transform our risk management in benefits programs. Totally infeasible, in my view, given the state of our data systems, the accuracy rate, the available budget, and the availability of investigators to act on findings (a critical factor; once you know you have fraud you have to do something about it if you are a government). We won't even talk about whether such risk assessment is constitutionally acceptable.]
We have a similar situation here in my view. Much of what is going on now violates data protection law; we have plenty of input from the DPAs pointing that out. A new system ought to be attentive to that point. In the example I cited, the relevant law enforcement authority would have no legal trouble getting access to all data related to the registrant of canadianconvertstoterrorism.com, in my view. The operative question is how fast they can do it, what authority they have to show and how, and what mechanisms the registrar/registry has to build in order to permit this access securely (from all three perspectives: registrar, registrant, and LEA) and at reasonable cost.
The same applies to others with less compelling interests (i.e. domain speculators, IP and trademark owners, etc.), and here we run into complex cost and authorities issues, in my view.
Cheers
Stephanie
On 2016-12-08 06:17, Michael D. Palage wrote:
Greg,
Again I am not trying to be confrontational, but I would respectfully disagree with you on Thin Data never containing PII.
Take for example the very domain name that I am using on this email, PALAGE.COM. I believe it is possible for PII to be contained in the very domain name itself.
Take, for example, the following three domain names:
FirstName_SurName.CHRISTIAN
FirstName_SurName.HIV
FirstName_SurName.LGBT
I believe that any information that discloses a person’s religious affiliation, sexual orientation or medical condition could be deemed PII in certain jurisdictions. I will defer to Stephanie on this question; however, I believe the answer is yes.
So NOW let's come to a point where I think “we” can find some agreement.
I believe that all Thin Data (which I previously defined as all data elements necessary for the minimum operation of a gTLD SRS – including status) should be made available even if it does contain PII in the domain name itself or in the domain name of the name servers.
Domain Name:
Registrar:
Sponsoring Registrar IANA ID:
Whois Server:
Referral URL:
Name Server:
Name Server:
Status:
Updated Date:
Creation Date:
Expiration Date:
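For readers who want to see how a record with these labels might be handled by software, here is a minimal sketch of a parser for "Label: value" lines. The sample values (the `.example` hosts, registrar name, IANA ID 9999 and all dates) are made up for illustration; only the field labels come from the list above. Repeated labels such as "Name Server" are collected into a list.

```python
def parse_thin_record(text):
    """Parse 'Label: value' lines; every label maps to a list of values."""
    record = {}
    for line in text.splitlines():
        # Split on the first colon only, so URLs in the value stay intact.
        label, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines without a label separator
        record.setdefault(label.strip(), []).append(value.strip())
    return record


# Hypothetical sample output; none of these values are real registrations.
SAMPLE = """\
Domain Name: EXAMPLE.COM
Registrar: Example Registrar, Inc.
Sponsoring Registrar IANA ID: 9999
Whois Server: whois.registrar.example
Referral URL: http://www.registrar.example
Name Server: NS1.EXAMPLE.NET
Name Server: NS2.EXAMPLE.NET
Status: clientTransferProhibited
Updated Date: 2016-12-01
Creation Date: 2010-12-01
Expiration Date: 2017-12-01
"""

thin = parse_thin_record(SAMPLE)
print(thin["Name Server"])
```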
Notwithstanding the fact that PII may be contained in the domain name or the name server domain, I believe that this “thin” data is so necessary that it MUST be disclosed and there is no situation that I can foresee where this “thin” data can be withheld. Again however, I will let Stephanie answer this question. If we can all agree on this “thin” data question that could be an important first building block toward consensus.
Best regards,
Michael
*From:* Greg Aaron [mailto:gca@icginc.com] *Sent:* Wednesday, December 7, 2016 7:30 PM *To:* Michael D. Palage <michael@palage.com>; 'Gomes, Chuck' <cgomes@verisign.com>; gnso-rds-pdp-wg@icann.org *Subject:* RE: [gnso-rds-pdp-wg] key concepts: say "contact data" when that is what we mean
BTW, much of the thin data in WHOIS is not even “collected” from or provided by the registrant. Much of it is generated automatically at the registry, as a key registry function/responsibility. When you register a domain:
· The registry knows what registrar is creating the domain, and records that and associates the registrar’s IANA ID. The registry then displays those in WHOIS.
· Policy dictates what initial domain statuses there are.
· The registrar indicates how many years the registrant wants, but the create/updated/expiration timestamps are generated and maintained by the registry.
· Nameserver data is provided by the registrant. (Unless he or she didn’t specify any, in which case the registrar often provides defaults.)
· Domain statuses can be manipulated after the domain’s out of AGP. Depending on the status type and the situation, they can be added and deleted by the registrant, the registrar, and/or by the registry.
None of these thin data fields are sensitive info AFAIK.
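A later reply in this thread raises a caveat: creation dates and registrar data, taken together, may link domains to one another and so to a registrant. A minimal sketch of that grouping follows, under stated assumptions: the records are hypothetical `.example` names, the date is truncated to the day, and sharing a registrar and a creation day is treated as a candidate link only. On a busy registrar many unrelated registrants would fall into the same group, so this shows the mechanism, not a reliable identification method.

```python
from collections import defaultdict
from datetime import date

# Hypothetical thin-data records: (domain, registrar, creation date).
# No registrant contact data is included.
RECORDS = [
    ("stephanieperrin.example", "Registrar A", date(2016, 12, 1)),
    ("unrelated-shop.example", "Registrar B", date(2016, 12, 1)),
    ("second-name.example", "Registrar A", date(2016, 12, 1)),
    ("third-name.example", "Registrar A", date(2016, 12, 1)),
    ("other-day.example", "Registrar A", date(2016, 12, 3)),
]


def cluster_by_registrar_and_day(records):
    """Group domains that share a registrar and a creation date."""
    clusters = defaultdict(list)
    for domain, registrar, created in records:
        clusters[(registrar, created)].append(domain)
    # Keep only groups with more than one domain; a lone name links to nothing.
    return {key: sorted(names) for key, names in clusters.items() if len(names) > 1}


clusters = cluster_by_registrar_and_day(RECORDS)
for (registrar, created), names in clusters.items():
    print(registrar, created.isoformat(), names)
```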
All best,
--Greg
*From:* Michael D. Palage [mailto:michael@palage.com] *Sent:* Wednesday, December 7, 2016 5:04 PM *To:* 'Gomes, Chuck' <cgomes@verisign.com>; Greg Aaron <gca@icginc.com>; gnso-rds-pdp-wg@icann.org *Subject:* RE: [gnso-rds-pdp-wg] key concepts: say "contact data" when that is what we mean
Chuck,
This is where a choice/orientation of words may have significant legal distinction.
(My text) - All data associated with a domain name registration
(WG Text) – Registration Data
I am taking a much more expansive view of data associated with a domain name registration, one that includes data potentially NOT originally provided by a registrant at the time of registration, versus the potentially more restrictive definition of only data provided by the Registrant to the Registrar at the time of registration.
Take for example a .BRAND registry where licensees of that trademark owner are permitted to register in that .BRAND TLD. As part of promoting awareness to consumers, the registry operator (trademark owner) may desire to include/append authoritative data associated with each licensee's consumer ranking (e.g. rating 1 thru 5 stars) so that consumers can better choose with which licensee to conduct business. Because this ranking may change over time, the Registrant/Licensee is NOT in a position to provide this data as it appears in the RDS/WHOIS output. Only the Registry Operator (trademark owner) would be best positioned to include this authoritative data in the RDS/Whois output.
The point I am trying to make is that innovation has only just begun in connection with the new gTLD expansion. While I respect the rights of privacy advocates to safeguard registrant PII, I do not want broad policy statements to have unintended consequences in impeding future innovation.
Best regards,
Michael
*From:* Gomes, Chuck [mailto:cgomes@verisign.com] *Sent:* Wednesday, December 7, 2016 4:34 PM *To:* michael@palage.com; gca@icginc.com; gnso-rds-pdp-wg@icann.org *Subject:* RE: [gnso-rds-pdp-wg] key concepts: say "contact data" when that is what we mean
Thanks Mike. I am glad to see this discussion going on in advance of considering the first users/purposes question: “*Should gTLD registration data be accessible for any purpose or only for specific purposes?*”
Chuck
*From:* Michael D. Palage [mailto:michael@palage.com] *Sent:* Wednesday, December 07, 2016 4:13 PM *To:* Gomes, Chuck <cgomes@verisign.com>; gca@icginc.com; gnso-rds-pdp-wg@icann.org *Subject:* [EXTERNAL] RE: [gnso-rds-pdp-wg] key concepts: say "contact data" when that is what we mean
Chuck,
I appreciate Greg’s historical context of Whois data primarily being for purposes of “contacting” the registrant of a domain name using those data fields with personally identifying information. However, I think introducing/relying upon the concept of “CONTACT DATA” as proposed by Greg, while well intentioned, will only lead to greater confusion.
First, Greg acknowledges that not ALL data other than the thin technical data falls within his CONTACT DATA definition (trademark, nexus, reseller, etc.). So we begin today with a model that is less than 100% inclusive and will likely become less inclusive as more innovative uses of the RDS and Whois data are created.
Second, the use of this terminology ignores the reality in the marketplace that Registrant data is widely relied upon to make legal determinations (i.e. ownership, authority to transfer a domain name, infringement, etc.). When law enforcement is trying to shut down a counterfeit operation, they are not looking to use this data to “contact” the registrant, but instead to “arrest” him/her.
I understand how the term “contact data” provides a certain comfort level to Stephanie and the valid concerns she has. However, as someone that is involved in making legal determinations regarding the ownership rights (property/service contract) concerning domain name registrations on a regular basis, this concept of “Contact Data” will just lead to a lot of confusion.
The whole legal construct (private contractual rights) upon which the domain name system is based recognizes the Registrant and the Registrant Data that it provides. In fact ICANN’s Whois web page makes the following statement: “ICANN's WHOIS Lookup gives you the ability to lookup any generic domains, such as "icann.org" _to find out the registered domain owner_.” (emphasis added) Again this data by ICANN’s own admission is relied upon to make “ownership” decisions NOT mere “contact” information.
So I think we stick to one of the first things I learned as a young engineer. Keep It Simple Stupid (KISS)
Thin Data – the minimum technical data necessary for a registry to perform its function as a registry operator in a shared registry system.
Thick Data – All data associated with a domain name registration made available via Whois/RDS, which may include Personal Identifying Information (PII)
Again I appreciate the constructive efforts of Greg, Stephanie and others, but I just do not see this concept scaling meaningfully.
Best regards,
Michael
*From:* gnso-rds-pdp-wg-bounces@icann.org [mailto:gnso-rds-pdp-wg-bounces@icann.org] *On Behalf Of* Gomes, Chuck *Sent:* Wednesday, December 7, 2016 10:20 AM *To:* gca@icginc.com; gnso-rds-pdp-wg@icann.org *Subject:* Re: [gnso-rds-pdp-wg] key concepts: say "contact data" when that is what we mean
Thanks Greg for the helpful suggestion. I have one question for you and others: If we exclude THIN DATA, is there any data we will need to consider that could not be accurately classified as CONTACT DATA? If not, then dividing data into these two categories should suffice.
Chuck
*From:* gnso-rds-pdp-wg-bounces@icann.org [mailto:gnso-rds-pdp-wg-bounces@icann.org] *On Behalf Of* Greg Aaron *Sent:* Wednesday, December 07, 2016 9:55 AM *To:* gnso-rds-pdp-wg@icann.org *Subject:* [EXTERNAL] [gnso-rds-pdp-wg] key concepts: say "contact data" when that is what we mean
Speaking of key concepts… people often say “registration data” when they really mean “contact data.” Being plain and specific here can help discussion in our group. The concept will come up in next week’s discussion.
There are basically two kinds of “registration data”. The first is called the *THIN DATA*. This is the basic data about a domain name registration: the domain name, the sponsoring registrar name and ID, the domain’s status(es), created-updated-expiration dates, and nameservers. (https://whois.icann.org/en/what-are-thick-and-thin-entries) This data is factual, accurate, is not personally identifiable, and I think is completely noncontroversial.
The second kind of registration data is *CONTACT DATA* – contact names, postal and email addresses, phone numbers. Contact data raises issues of privacy and data protection. Contact data can be (and regularly is) inaccurate because it’s ultimately supplied by the registrants. When people talk about “registration data accuracy” and “registration data validation” they are really talking about the accuracy of *CONTACT DATA*, not all “registration data.”
In the coming discussions, one approach could be: There are good reasons to publish the thin data … is there any compelling reason /not/ to publish it? If we can take care of this low-hanging fruit, we will solve part of the puzzle and we can concentrate on the issues around contact data. This is not a proposal to publish thin data only. It’s an attempt to disentangle concepts and find a way forward. Not all data is the same, so let’s stop treating all data the same. We may not have to iterate repeatedly about thin data.
Even the EWG’s language wasn’t always clear and specific in this area. Here’s the question we will begin with next week:
/Should gTLD registration data be accessible for any purpose or only for specific purposes?/
/“The EWG unanimously recommends abandoning today’s WHOIS model of giving every user the same entirely anonymous public access to (often inaccurate) gTLD registration data. Instead, the EWG recommends a paradigm shift to a next-generation RDS that collects, validates and discloses gTLD registration data for permissible purposes only./
/While basic data would remain publicly available, the rest would be accessible only to accredited requestors who identify themselves, state their purpose, and agree to be held accountable for appropriate use.”/
What the EWG really meant was:
· Give public, anonymous access to the THIN data. (“Basic data” as the EWG called it.)
· Don’t give every user the same anonymous public access to (“often inaccurate”) gTLD CONTACT DATA.
· Shift to an RDS that collects, validates and discloses gTLD CONTACT DATA for permissible purposes only.
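The three points above can be sketched as a split of one record into a public part and a gated part. This is only an illustration under stated assumptions: the thin field labels are the ones listed elsewhere in this thread, every other field is treated as gated (a simplification, since some non-thin fields may not be contact data), and the purpose names and the sample record are hypothetical. Which purposes are permissible is a policy question the code does not answer.

```python
# Thin field labels as listed elsewhere in this thread.
THIN_FIELDS = frozenset({
    "Domain Name", "Registrar", "Sponsoring Registrar IANA ID", "Whois Server",
    "Referral URL", "Name Server", "Status",
    "Updated Date", "Creation Date", "Expiration Date",
})

# Hypothetical purpose names, for illustration only.
PERMISSIBLE_PURPOSES = frozenset({"law_enforcement", "trademark_dispute"})


def split_record(record):
    """Return (thin, gated): thin fields are public, all others are gated."""
    thin = {k: v for k, v in record.items() if k in THIN_FIELDS}
    gated = {k: v for k, v in record.items() if k not in THIN_FIELDS}
    return thin, gated


def view_for(record, accredited=False, purpose=None):
    """Anonymous users get thin data; accredited users with a permissible
    purpose also get the gated fields."""
    thin, gated = split_record(record)
    if accredited and purpose in PERMISSIBLE_PURPOSES:
        return {**thin, **gated}
    return thin


# Hypothetical record mixing thin and contact fields.
record = {
    "Domain Name": "EXAMPLE.COM",
    "Registrar": "Example Registrar, Inc.",
    "Creation Date": "2010-12-01",
    "Registrant Name": "Jane Doe",
    "Registrant Email": "jane@mail.example",
}

print(sorted(view_for(record)))
print(sorted(view_for(record, accredited=True, purpose="trademark_dispute")))
```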
All best,
--Greg
**********************************
Greg Aaron
Vice-President, Product Management
iThreat Cyber Group / Cybertoolbelt.com
mobile: +1.215.858.2257
**********************************