Re: [gtld-tech] [weirds] Search Engines Indexing RDAP Server Content

Feb. 3, 2016


      ...
-----Original Message-----
From: gtld-tech-bounces@icann.org [mailto:gtld-tech-bounces@icann.org]
On Behalf Of Stephane Bortzmeyer
Sent: Wednesday, February 03, 2016 5:05 AM
To: Francisco Arias
Cc: gtld-tech@icann.org
Subject: Re: [gtld-tech] [weirds] Search Engines Indexing RDAP Server
Content
On Wed, Feb 03, 2016 at 12:23:42AM +0000,
 Francisco Arias <francisco.arias@icann.org> wrote
 a message of 77 lines which said:
...
The search page
(https://www.google.co.uk/search?q=site:rdg.afilias.info) appears to
be the result of crawling links from the first link that appears
there (http://rdg.afilias.info/rdap/help). The help page contains
links to search and lookup examples that return several objects with
their directly-related objects, which are in turn shown in the
search results. This could have happened in web-Whois if someone
were to publish a page containing example queries.

It seems to me that having a robots.txt at the root of the RDAP server
would solve the problem (if you regard it as a problem).

    User-agent: *
    Disallow: /

That will only work if a crawler reads robots.txt and respects the published directive(s). Not all do.
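
To illustrate the point, here is a minimal sketch of the check a well-behaved crawler would run before fetching a URL, using Python's standard urllib.robotparser module. The function name may_fetch and the user agent string "ExampleBot" are made up for this example and do not come from the thread. The check runs entirely on the crawler side, so a crawler that skips it is not stopped by anything the server publishes.

```python
from urllib.robotparser import RobotFileParser

# The two directives quoted above, as they would appear in robots.txt.
DENY_ALL = "User-agent: *\nDisallow: /\n"


def may_fetch(robots_txt: str, url: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    may_fetch and "ExampleBot" are hypothetical names for illustration.
    """
    parser = RobotFileParser()
    # parse() takes the file content as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A compliant crawler would skip this URL under the deny-all rules.
    print(may_fetch(DENY_ALL, "http://rdg.afilias.info/rdap/help"))
```

A crawler that honors robots.txt would fetch the file from the server root first and skip any URL for which this check fails. Nothing on the server enforces the result.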

Scott

Hollenbeck, Scott