Re: [vip] FW: WHITE PAPER ON "DEFINITIONS" : YOUR COMMENTS AND FEEDBACK REQUESTED

Sept. 10, 2011

      At 19:53 09/09/2011, Naela Sarras wrote:
...
From: doc <raymond.doctor@gmail.com>
Date: Fri, 19 Aug 2011 20:13:48 -0700
To: "vip@icann.org" <vip@icann.org>
Subject: WHITE PAPER ON "DEFINITIONS" : YOUR COMMENTS AND FEEDBACK REQUESTED
Dear colleagues,
                         I had drafted a white paper on the 
definitions and questionnaires put forward at the Singapore meet 
which I attended remotely.
The paper comprises three sections:
2.1 On Unicode
2.2. DNS
2.3. The main part of the paper dealing with the notion of 
variant-hood and which links variants with a script typology to 
arrive at a "unified theory" of variants.
Dear Naela,

I am currently overloaded, but I will try to read all of the 
documents sent by Raymond.

However, I wish we would first clearly and commonly understand where 
we are and what we are currently doing before we try doing anything 
else. The Internet is a communication process that aims at permitting 
every human and machine to digitally relate together. As such, it is 
the most complex system ever built by humans, and the first one that 
has attained a universal nature.

1. VIP wants to create some new order in the use of this system by 
replacing supposed bijective resolution/registration relations (one 
name -> one IP) with surjective relations (several variant names -> 
one IP). I say "supposed" because the DNS system is already 
surjective (the same IP can support several hosts - HTTP/1.1). This 
means that the bijection is today "one name -> one IP + one name -> 
one host". If we want to be complete, the communication of multicast 
addressing is intensely injective (one name -> multiple hosts).

2. we also have an additional problem, which is IDNA. IDNA introduces 
four major issues:

     - punycode does not transport the characteristics of a variance 
(what makes the variant equivalent) into ASCII. The impact of this 
has not been studied yet, to my knowledge, in terms of security and 
of the certain identification of the destination.

     - punycode is not complete. This is due to the lack of a 
definition at this time of the metadata injection method. This method 
is necessary for supporting, for example, French majuscules, which may 
or may not lead to a transliteration into uppercase.

     - IDNA is an incorrect architecture on the user side that has to 
be changed. This is because it is defined as being supported at the 
application level. On the client side, several applications with 
different versions or parameters may, therefore, resolve different 
"address+domain-name"s. On the host side one becomes dependent on the 
distant application architecture and one does not know for sure 
(otherwise, this is a VPN) what may happen on the user side. 
Anyway, the relation becomes: "one out of several names->client 
punycode -> server-punycode -> IP + one out of several names -> host 
-> application". Sometimes the dichotomy host/application will be 
reduced but we have to live with it for now and be sure that it does 
not introduce too many discrepancies or security risks.

     - IDNA is based upon Unicode. IDNA2008 has reduced the impact of 
the use of Unicode and of its versioning. However, it has not 
eliminated the noise and limitations and constraints introduced by 
the use of a middle foreign system. "Foreign" in the sense that ISO 
10646 was not designed to support IDNA. This means that the relation 
now actually becomes: "one out of several names->unicode->client 
punycode -> server-punycode->unicode -> IP + one out of several names 
-> host -> application"
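
The loss of majuscules through the ASCII form can be seen in a small 
sketch. It assumes Python's built-in "idna" codec, which implements the 
older IDNA2003 rules (nameprep + punycode) and not IDNA2008; the 
".example" names are purely illustrative.

```python
# Sketch only: Python's built-in "idna" codec follows IDNA2003, not IDNA2008.
upper = "\u00c9cole.example"   # "École.example", with a French majuscule
lower = "\u00e9cole.example"   # "école.example"

# Convert both spellings to their ASCII (punycode-based) form.
ace_upper = upper.encode("idna")
ace_lower = lower.encode("idna")

# Both spellings collapse onto the same ASCII label: nothing in the
# ASCII form records which spelling the user actually entered.
same_ace = ace_upper == ace_lower

# Converting back from ASCII can therefore return only one spelling.
round_trip = ace_upper.decode("idna")

print(ace_upper, ace_lower, same_ace, round_trip)
```

The round trip returns the lowercase spelling whichever one was typed, 
which is the kind of information that a metadata injection method 
would have to carry separately.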

3. we have another important problem, which is IPv6. IPv6 provides 
each Internet user with:

     - a way to be independently called.

     - more IIDs (the second part of the IPv6 address, which for clarity I 
name IDv6) than the number of IP addresses in the whole existing Internet. 
It is, therefore, possible that every user scales his/her naming 
scheme accordingly. There is no technical restriction to that; it is 
just a matter of the database size on his/her PC. Plug and Play will 
most probably result in such weird local name-spaces populated by 
different SDOs with their own possible support of variants. This 
should lead ICANN to publish variant support rules in a way that 
other SDOs can use and adapt, and to adopt a strategy that supports the 
transition to such a brave new naming world.

4. all this is obviously subject to information theory and to 
algorithmic information theory 
<http://en.wikipedia.org/wiki/Algorithmic_information_theory> 
that takes into account that domain-names are information to 
processes and people. Let's look at the issue as a general issue for 
the general DNS family of systems: DDDS 
<http://en.wikipedia.org/wiki/Dynamic_Delegation_Discovery_System>.

The DDDS should be reversible, like the DNS. Do we want such systems 
to be transparently reversible for variants, and how do we make them so? 
This means, if a variant is entered and results in an 
IP+host+application, how do we make sure that the reversion (reverse 
process operation) does not result in another variant? This calls for 
some additional implicit, passive, referent or active metadata (i.e. 
in the copper, in the header, in the context, or in the system's 
intelligent dynamics). Our chain architecturally becomes:

  "one out of several names->metadata->unicode->client punycode -> 
server-punycode->unicode -> metadata -> IP + one out of several names 
-> host -> application"
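
The reversibility question can be sketched as follows. Everything here 
is hypothetical: the names, the address (taken from a documentation 
range), and the "entered_as" metadata field are only there to show why 
the reverse operation is ambiguous without carried metadata.

```python
# Hypothetical zone: several variant names resolve to the same address.
FORWARD = {
    "colour.example": "192.0.2.10",
    "color.example": "192.0.2.10",
}

def resolve(name):
    """Forward step: return the address plus metadata recording what was entered."""
    return FORWARD[name], {"entered_as": name}

def reverse_without_metadata(ip):
    """Reverse step with the address alone: every variant is a candidate."""
    return sorted(n for n, addr in FORWARD.items() if addr == ip)

def reverse_with_metadata(ip, metadata):
    """Reverse step that uses the carried metadata to pick the original variant."""
    candidates = reverse_without_metadata(ip)
    entered = metadata.get("entered_as")
    return entered if entered in candidates else None

ip, meta = resolve("colour.example")
print(reverse_without_metadata(ip))     # both variants come back
print(reverse_with_metadata(ip, meta))  # only the variant that was entered
```

Where the metadata travels (header, context, or elsewhere) is left 
open here, as in the text.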

5. then, there are morphological, semantic, and pragmatic issues 
to be considered by the linguists (e.g. cf. Raymond). Not a small 
task, but one which has to be carried out within the framework I describe.

6. then, we have the multilinguistic problem of homography, i.e. 
finding a canonicalization algorithm to prevent the signs of a script 
used by one language from being confusable with signs used by the 
script of another language. We started from linguistic diversity and 
its implications, and we have to check what we decide against its 
consequences on linguistic mutuality in the linguistic ecosystem.
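
As a rough illustration of the homography problem, the sketch below 
reduces each label to a "skeleton" and compares skeletons. The 
CONFUSABLES table is a tiny hand-made assumption covering five Cyrillic 
letters; it is not an official list and not a proposed algorithm.

```python
# Tiny hand-made table, for illustration only (not an official list).
CONFUSABLES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
}

def skeleton(label):
    """Replace each sign by its look-alike representative, if it has one."""
    return "".join(CONFUSABLES.get(ch, ch) for ch in label)

def confusable(a, b):
    """Two different labels are confusable if their skeletons coincide."""
    return a != b and skeleton(a) == skeleton(b)

latin = "peace"
cyrillic = "\u0440\u0435\u0430\u0441\u0435"  # looks like "peace" on screen

print(confusable(latin, cyrillic))  # different code points, same skeleton
print(confusable(latin, "peach"))   # genuinely different labels
```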

Now, what do we have that will enable us to discuss this?
We have seven fundamental concepts that we can define "à la" Gregory Bateson:

- data: the differences necessary for a process.
- information: the differences that make a difference (Bateson).
- variants: the differences that make no difference.
- canonicalization: reducing the unnecessary differences.
- consistency: the differences do not conflict.
- protocol: what documents data interchanges.
- languages: human communication protocols.

This means that every other notion that we may need (glossary [I 
fully agree with Raymond here]) has to be referenced in relation to 
new concepts that we first have to accept as pertinent and coherent 
with the seven master concepts above.

Why so? Because we need to ensure that we do not introduce any flaws 
(logic, security, etc.) into the reasoning and its consequences. This is 
based upon RFC 1958 (we are to be ready for every possible "change" - 
here, a new kind of variant) and RFC 3439 (in a very large system, 
like the Internet, in which its naming is larger than the Internet 
itself as it may extend to other technologies, the prevalent 
principle is the principle of simplicity). Reasoning at the 
conceptual level gives us a better chance to keep things simple and 
coherent at the operational level.
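
To make the definitions of "variants" and "canonicalization" given 
above a little more concrete, here is a minimal sketch. It assumes that 
"unnecessary differences" means only Unicode compatibility forms and 
case; that choice is an illustration, not a proposal for VIP, and the 
function names are hypothetical.

```python
import unicodedata

def canonical(label):
    """Canonicalization: reduce the differences assumed to be unnecessary
    (here only compatibility forms and case, as an illustration)."""
    return unicodedata.normalize("NFKC", label).casefold()

def are_variants(a, b):
    """Variants: labels whose differences make no difference,
    i.e. labels with the same canonical form."""
    return canonical(a) == canonical(b)

print(are_variants("\u00c9COLE", "\u00e9cole"))  # case only
print(are_variants("\ufb01le", "file"))          # "fi" ligature vs. two letters
print(are_variants("ecole", "\u00e9cole"))       # the accent is kept as information
```

Whether a given difference (an accent, for instance) is information or 
a variant depends on the language, so any real rule set would have to 
be decided per script and per language.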

jfc

JFC Morfin