This message is getting a little thick, but I’ve added some comments inline while keeping most of the history.
On 13 Jan 2017, at 12:33, Amr Elsadr wrote:
Hi Jim,
Some thoughts in-line below:
On Jan 12, 2017, at 10:37 PM, James Galvin <jgalvin@afilias.info> wrote:
On 5 Jan 2017, at 14:31, Brian Aitchison wrote:
The principle that the language tag must be present is the problematic one. My thinking has evolved since both the IRD working group and the T&T working group have completed their work.
I believe it is consistent with the various recommendations from both groups to suggest the following.
It must be possible to store a language tag with a data element (we’ll have to talk about what is meant by an “element” here, e.g., postal address in total or separately for each part of a postal address) if that value is known.
When the T/T PDP was being chartered (when the working group charter was being drafted), the charter drafting team considered adding what data elements should be considered in the context of transformation of contact information. The response from staff at the time was that this should remain out of scope of this PDP, as it would be considered more thoroughly in another PDP in the future. This was presumably the next generation RDS PDP, which is indeed considering data elements, and should do so taking into account this PDP as well as the work done by the IRD WG.
My understanding of this policy is that the language/script used to enter registration data by a registrant must be easily identifiable, but I’m not sure that a tag needs to be attached to every data element. I have no technical expertise on this, so can’t say what is more technically feasible, but all that is really required is that the language/script be identified. This could be interpreted as an independent data element that is included with the rest of the registration data, couldn’t it? I am presuming that the registrant will use only one language/script to enter the contact information, not multiple ones.
A registrar should provide it if known and the registry should store it. If it is present it should be displayed whenever that element is output as part of a directory service.
I don’t see why a registrar wouldn’t know what language/script is being used. It is the business model of the registrar and registry that will dictate what languages/scripts are permitted to be used, no? Surely, it is one that the registrar is familiar with, and is offering services to customers using it?
I called the language tag problematic in part because, as you suggest, there is this perception that it ought to be straightforward but, in fact, it’s not. Just a couple of example issues to consider. Should there be a language tag for the “name”? Well, people may change their name for a number of different reasons and each part of the name may have a different language tag (now there’s a technical challenge for you). Consider though, is a name likely to be transformed anyway? Isn’t it best to just leave that alone? The postal address itself probably needs a tag separate from any other element. How about a country like India, with 21 official languages, and scripts that are used in multiple languages? You can probably guess the language by intersecting all the possible languages for each of the code points, but what do you do when you still end up with two or more languages to choose from? Registrars have argued that asking a registrant to tell them what language is being used is just asking for trouble. Geolocation for IP address source is pretty good, but what if I’m traveling and I want to buy something while away from my home but use my home address? What language do I default to? Give me some time and I may think of a few more things.
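To make the "intersecting the possible languages for each code point" idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the SCRIPT_LANGUAGES table is hypothetical and deliberately incomplete, and the script of a character is guessed from the first word of its Unicode character name, which is a rough heuristic rather than the real Unicode Script property.

```python
import unicodedata

# Hypothetical, deliberately incomplete table (an assumption for illustration):
# script name -> language subtags commonly written in that script.
SCRIPT_LANGUAGES = {
    "DEVANAGARI": {"hi", "mr", "ne", "sa"},
    "BENGALI": {"bn", "as"},
    "TAMIL": {"ta"},
}


def script_of(ch):
    """Rough script guess: the first word of the Unicode character name."""
    return unicodedata.name(ch, "UNKNOWN").split()[0]


def candidate_languages(text):
    """Intersect the candidate languages of every letter in the text.

    Returns an empty set when nothing can be inferred (no letters, or a
    script that is missing from the table).
    """
    candidates = None
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, spaces, punctuation, combining marks
        langs = set(SCRIPT_LANGUAGES.get(script_of(ch), set()))
        candidates = langs if candidates is None else candidates & langs
    return candidates or set()
```

A string written in Tamil script narrows to a single candidate with this table, while a string in Devanagari still leaves four, which is exactly the "two or more languages to choose from" case: the script alone does not settle the language.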
If a data element is transformed, then both the origin language and the destination language must be known and both must be displayed on output. If the originating language tag is not specified, the requestor of the transformation must determine the origin language through a means that is outside the scope of these recommendations.
I think we may be reading this policy a little differently. My understanding is that regardless of whether or not the authoritative original data is transformed, the language/script used must be identified, not only if it is transformed. If it is transformed, then yes…, but languages/scripts need to be identified, as well as the source of the transformation (who actually did it). There shouldn’t be any need for the requestor of the transformation to have to determine the original language.
Have I gotten this wrong?
See above regarding language. I think we already agree regarding script.
Implications:
1. If a third party (i.e., not the registrar or the registry) is doing the transformation, no additional requirements apply.
If the registrant is voluntarily doing the transformation, then there should be a field that indicates that the data is transformed, as well as an indication of who did it. Same would apply for the registrar or registry.
Agree.
2. If a registry or a registrar is doing the transformation, then upon display or storage: a) both forms must always be shown; b) the language tag must always be included.
Yes. And again…, the source of the transformation needs to be included as well. Keeping in mind of course, that the transformed data is not authoritative, and the accuracy of translation/transliteration cannot be guaranteed.
Agree. Jim
Thanks.
Amr
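As a rough illustration of the points discussed above (an optional language tag per data element, both forms shown when a transformation exists, and the source of the transformation recorded), a minimal Python sketch of one possible data model follows. All class, field, and function names are hypothetical, and the check for registrar/registry transformations reflects just one reading of implication 2, not an agreed requirement.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TaggedValue:
    value: str
    language_tag: Optional[str] = None  # None when the language is not known


@dataclass
class Transformation:
    result: TaggedValue
    performed_by: str  # e.g. "registrant", "registrar", "registry"


@dataclass
class DataElement:
    name: str
    original: TaggedValue  # the authoritative form
    transformation: Optional[Transformation] = None


def render(element: DataElement) -> List[str]:
    """Return the display lines for one element of a directory-service output."""

    def line(label: str, tv: TaggedValue) -> str:
        tag = f" [{tv.language_tag}]" if tv.language_tag else ""
        return f"{label}{tag}: {tv.value}"

    out = [line(element.name, element.original)]
    t = element.transformation
    if t is not None:
        # Assumed reading of implication 2: a registrar or registry that
        # transforms data must supply both language tags.
        if t.performed_by in ("registrar", "registry") and not (
            element.original.language_tag and t.result.language_tag
        ):
            raise ValueError("language tags are required for this transformation")
        label = f"{element.name} (transformed by {t.performed_by}, non-authoritative)"
        out.append(line(label, t.result))
    return out
```

The original value is always emitted first and the transformed value is labelled as non-authoritative, so a reader of the output can tell which form was entered by the registrant and who produced the other one.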
* Requirements for gathering language data, if any. The IRT seems to be gravitating toward automated methods for detecting script. Some discussion was had on whether parties requesting transformation should bear the burden of inferring what language was entered by a registrant based on the country he/she entered into the RDS system, script used, and any other method the requesting party deems appropriate to make their transformation.
Please see my comment above.
Jim
* Reconciling the optional provisions contained within the T/T WG recommendations, especially Rec. 1, with any requirements we identify for contracted parties to gather language and script data to enable transformations.
Some of our team is still out on holiday, so we won't have a chance to brainstorm possible approaches and solutions to these issues until next week. Once we've had a chance to do so, I'll send out a doodle poll and invite as usual. I expect our next call to take place during the week of the 16th.
Thanks all very much for your thoughtful contributions to this project. I look forward to our discussions this year.
All best,
Brian
Brian Aitchison, MRes, PhD Lead Researcher Operations & Policy Research Internet Corporation for Assigned Names and Numbers (ICANN) 12025 Waterfront Drive, Suite 300 Los Angeles, CA 90094-2536
Direct Line: +1 310 578 8688 Mobile: +1 424 353 9041 Email: brian.aitchison@icann.org Skype: brian.aitchison.icann Twitter: @BrianAitch LinkedIn: http://linkedin.com/in/baitchison Web: http://www.icann.org
_______________________________________________ Translationtransliterationirt mailing list Translationtransliterationirt@icann.org https://mm.icann.org/mailman/listinfo/translationtransliterationirt