Re: [rssac-caucus] Handing the anonymization document off to RSSAC

April 10, 2018

On 09/04/2018 21:42, Paul Vixie wrote:
...
> anonymizing at /48 for v6 and /24 for v4 isn't enough. even the
> least capable data scientist, using data that's less than a millionth
> of the other data in google's or facebook's or cambridge analytica's
> possession, can _trivially_ deanonymize that.

Couldn't that same data scientist also reverse anything that maintains a
1:1 relationship between input and output?

At least truncating the data does ensure that some portion of the input
data is intentionally destroyed.  I think there's a balance to be struck
between the (to us desirable) property that this is prefix preserving
and the increase in difficulty of reversal that the N:1 mapping creates.
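
To make the N:1 point concrete, here is a minimal sketch of truncation
at /24 (v4) and /48 (v6) using Python's standard ipaddress module.  The
function name truncate_address and its parameters are my own invention
for illustration; this is not code from the document under discussion.

```python
import ipaddress


def truncate_address(addr: str, v4_prefix: int = 24, v6_prefix: int = 48) -> str:
    """Zero the host bits of addr, keeping only the leading prefix bits.

    Hypothetical helper: the default prefix lengths are the /24 and /48
    values discussed in this thread, not values mandated anywhere.
    """
    ip = ipaddress.ip_address(addr)
    prefix = v4_prefix if ip.version == 4 else v6_prefix
    # strict=False lets us pass a host address and have the host bits masked off
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


if __name__ == "__main__":
    # Two distinct sources in the same /24 become indistinguishable (N:1).
    print(truncate_address("192.0.2.1"))
    print(truncate_address("192.0.2.254"))
    # The same applies at /48 for v6.
    print(truncate_address("2001:db8:1234:5678::1"))
```

With these prefix lengths, 2^8 v4 addresses or 2^80 v6 addresses
collapse onto each output, and any two inputs that shared a prefix
shorter than the truncation point still share it afterwards.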

If there are arguments to be made against prefix truncation then they
should be properly documented *in the paper*.
...
> please re-think this. you're making decisions about third party
> safety

You appear to be shifting the goal posts.  The document doesn't mention
safety.

The entire documented rationale for the RSSAC study, and therefore for
anonymization, seems to be this single sentence in the Introduction:
...
> Some operators are uncomfortable sharing IP addresses of the query
> sources and some are even legally prevented from doing so.

GDPR seems to be the main driver for this right now.  I'm (currently)
satisfied that pseudonymization of IP addresses by truncation satisfies
any obligations we might have there.

Ray

Ray Bellis