Dear Wang Wei, Chris,

+1

Regards,
Jonathan Shea

From: Dillon, Chris [mailto:c.dillon@ucl.ac.uk]
Sent: Wednesday, 30 April 2014 3:00 PM
To: Wang Wei; Jonathan Shea
Cc: ChineseGP@icann.org
Subject: RE: [ChineseGP] Updates : Inclusion & Exclusion Principles

Dear Wang Wei,

+1

Incidentally, it is interesting to see Yoshiro Yoneya’s post (http://forum.icann.org/lists/comments-msr-03mar14/msg00000.html), which ends in a list of kanji not in MSR-1. Judging from the numbers, some of them are also in CJK-B.

Regards,
Chris.
--
Research Associate in Linguistic Computing, Centre for Digital Humanities, UCL, Gower St, London WC1E 6BT
Tel +44 20 7679 1599 (int 31599)
ucl.ac.uk/dis/people/chrisdillon

From: Wang Wei [mailto:wangwei@cnnic.cn]
Sent: 30 April 2014 04:42
To: 'Jonathan Shea'; Dillon, Chris
Cc: ChineseGP@icann.org
Subject: Re: [ChineseGP] Updates : Inclusion & Exclusion Principles

Thanks Jonathan & Chris,

The current CDNC table and JP table are both located entirely in CJK and CJK Extension A. That’s why we suggested taking the intersection of MSR & CJK+CJK-A. But I just checked HKSCS and found that there are hundreds of its characters in CJK-B.

So I’d like to change principle 1 into “the maximum range of the CGP character set would be all CJK Unified Ideographs that are included in the MSR contributed by ICANN”.

This means that if there are some characters in neither the CDNC table nor the MSR, first we push ICANN to accept them into the MSR; second, we add them to the CDNC table and the CGP table through an appropriate evaluation process.

Will this suggestion work for you?
Regards
Wang Wei

From: chinesegp-bounces@icann.org [mailto:chinesegp-bounces@icann.org] On Behalf Of Jonathan Shea
Sent: 29 April 2014 17:30
To: Dillon, Chris
Cc: ChineseGP@icann.org; wangwei
Subject: Re: [ChineseGP] Updates : Inclusion & Exclusion Principles

Dear Chris,

We (HKIRC, registry for the .HK ccTLD and .香港 IDN ccTLD) have submitted a comment to ICANN requesting that they consider adding 2,677 HKSCS (Hong Kong Supplementary Character Set) characters to MSR-1. HKSCS contains Chinese characters that are used in Hong Kong but may not be used in other Chinese-speaking communities.

http://forum.icann.org/lists/comments-msr-03mar14/msg00002.html

As these HKSCS characters are not in the CDNC variant table, HKIRC is in the process of applying to CDNC to add them to the CDNC table batch by batch. The first batch contains 21 characters and is being processed.

Adding HKSCS characters to MSR-1 is mainly for administrative convenience; otherwise, when CDNC approves the addition of some HKSCS characters to the CDNC table in the future, these characters cannot be added to the CGP table because MSR-1 does not contain them. Also, we are not sure at this stage whether the IP will produce new versions of the MSR, such as MSR-2.

As MSR-1 is a superset and the CGP is not obliged to consider all characters in the MSR, our comment asking to add HKSCS characters to MSR-1 should not have any impact on the 7 proposed principles listed by Wang Wei.

Also, I had already communicated with the CDNC co-chairs, council members and secretariat before submitting the comment to ICANN.

Regards,
Jonathan Shea
HKIRC

From: chinesegp-bounces@icann.org [mailto:chinesegp-bounces@icann.org] On Behalf Of Dillon, Chris
Sent: Tuesday, 29 April 2014 4:25 PM
To: 齐超; wangwei; ChineseGP@icann.org
Subject: Re: [ChineseGP] Updates : Inclusion & Exclusion Principles

Dear Qi Chao,

Thank you.
Your diagram makes things clearer. As you write, “sum” is not accurate either. We need to have a longer explanation, similar to what you’ve written below, but deciding questions such as:

• Are we sure we can leave out the characters in CJK exts-B?
• As you say, is anyone in the group aware of any characters outside MSR-1-HAN being required?

Regards,
Chris.
--
Research Associate in Linguistic Computing, Centre for Digital Humanities, UCL, Gower St, London WC1E 6BT
Tel +44 20 7679 1599 (int 31599)
ucl.ac.uk/dis/people/chrisdillon

From: 齐超 [mailto:qichao@cnnic.cn]
Sent: 29 April 2014 07:23
To: Dillon, Chris; wangwei; ChineseGP@icann.org
Subject: Re: Re: [ChineseGP] Updates : Inclusion & Exclusion Principles

Hello, Chris

Thank you. Your edits make the principles clear and comprehensive.

But as for 'SUM', I think the CGP Script is not a sum of CJK and MSR. Here is a picture of the CJK, MSR, CGP and CDNC Scripts (CGP Script) [Principle-1, Principle-3].

[Image: image001.jpg]

The CGP script, shown in orange, is just a part of MSR-1-Han. The CDNC script is its origin.

1. The CDNC Script does not include Hanzi from CJK exts-B;
2. The CDNC Script does not include some Hanzi code points from other registry scripts such as JP and DotAsia [Principle-6].

There are also some code points in the CDNC script that may conflict with other registries such as KR or JP [Principle-5].

So 'SUM' may confuse the relation between MSR-1 and CJK (exts-A, exts-B). Might the CGP script cover code points beyond MSR-1-Han? If so, it would be hard work for CGP members to handle thousands of CJK Hanzi case by case.

Thanks.
________________________________
齐超 via foxmail

From: Dillon, Chris <c.dillon@ucl.ac.uk>
Sent: Monday, 28 April 2014, 4:30 PM
To: Wang Wei <wangwei@cnic.cn>; ChineseGP@icann.org
Subject: Re: [ChineseGP] Updates : Inclusion & Exclusion Principles

Dear colleagues,

Please find some minor changes in the version below.
These are to make the English smoother. There is also one substantial change: the word “intersection” in paragraph one, often means the small area where two circles (in this case tables) overlap. I think here, the meaning is all characters in the three tables and so I think a word like “sum” is better.

1. Based on MSR 1 character set contributed by ICANN, with CJK Unified Ideographs and Extension A as reference, the maximum range of CGP THE character set would be the SUM.
2. CGP character set shall be programmed according to the requirements of RFC3743/4713 and [Representing] Label Generation Rulesets WILL BE REPRESENTED using XML.
3. The CDNC table widely accepted among Chinese domain name area can be employed as the initial set of CGP.
4. The initial set shall be checked following the [standard of] criteriA listed by The Normalized Hanzi Chart for General Use and IIcore.
5. Some [of the] abandoned archaic characters, for instance KOREAN Idu characterS (이두/吏读字), shall be deleted based on the consensus with CDNC.
6. Some Code Points are listed in the intersection of CJK and MSR-1, yet not included in the CGP. Such will be included in CGP only when they meet the following requirements:
   A. The Code Points of different languages shall be programmed by THEIR affiliated institutions, such as JP, KR, HK, DotAsia, etc.
   B. Each Code Point shall pass CHECKS conducted by both the language expert in the CGP panels and CDNC.
   C. All the strings in an application FOR THE CGP shall not [be] collide with existing characterS in the process of Variant evaluation.
7. THE CGP is expected to submit a unified Chinese character set under its combination with all Chinese script communities.

[] means text I have removed.

Regards,
Chris.
--
Research Associate in Linguistic Computing, Centre for Digital Humanities, UCL, Gower St, London WC1E 6BT
Tel +44 20 7679 1599 (int 31599)
ucl.ac.uk/dis/people/chrisdillon

From: chinesegp-bounces@icann.org [mailto:chinesegp-bounces@icann.org] On Behalf Of Wang Wei
Sent: 25 April 2014 13:29
To: ChineseGP@icann.org
Subject: [ChineseGP] Updates : Inclusion & Exclusion Principles

Dear CGP members,

While ICANN is reviewing the CGP proposal, some members have drafted the principles of character inclusion & exclusion as follows.

1. Based on MSR 1 character set contributed by ICANN, with CJK Unified Ideographs and Extension A as reference, the maximum range of CGP character set would be the intersection.
2. CGP character set shall be programmed according to the requirements of RFC3743/4713 and Representing Label Generation Rulesets using XML
3. The CDNC table widely accepted among Chinese domain name area can be employed as the initial set of CGP.
4. The initial set shall be checked following the standard of criterion listed by The Normalized Hanzi Chart for General Use and IIcore.
5. Some of the abandoned archaic characters, for instance Idu character (吏读字), shall be deleted based on the consensus with CDNC.
6. Some Code Points are listed in the intersection of CJK and MSR-1, yet not included in the CGP. Such will be included in CGP only when they meet the following requirements:
   A. The Code Points of different languages shall be programmed by its affiliated institutions, such as JP, KR, HK, DotAsia, etc.
   B. Each Code Point shall pass the interview conducted by both the language expert in the CGP panels and CDNC.
   C. All the strings in an application to join CGP shall not be collide with existing character in the process of Variant evaluation.
7. CGP is expected to submit a unified Chinese character set under its combination with all Chinese script communities.

Please give your comments and advice on these principles. Once we reach a consensus, the technical guys will make a character table and submit it to the Integration Panel.

Regards
Wang Wei
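
For readers following the "intersection" versus "sum" discussion of principle 1 in this thread, the set relations can be sketched roughly as below. This is only an illustration: the block ranges are the Unicode ranges for CJK Unified Ideographs, Extension A and Extension B, but `msr_sample` is a made-up handful of code points and is not taken from MSR-1, and the helper names `in_any` and `max_range` are hypothetical.

```python
# Unicode block ranges (inclusive) for the Han blocks discussed in this thread.
CJK_UNIFIED = range(0x4E00, 0x9FFF + 1)   # CJK Unified Ideographs
CJK_EXT_A = range(0x3400, 0x4DBF + 1)     # CJK Unified Ideographs Extension A
CJK_EXT_B = range(0x20000, 0x2A6DF + 1)   # CJK Unified Ideographs Extension B

# Hypothetical stand-in for the MSR repertoire. NOT real MSR-1 data.
# It holds two URO ideographs, one Ext-A, one Ext-B and one Latin letter.
msr_sample = {0x4E2D, 0x6587, 0x3400, 0x20000, 0x0041}


def in_any(cp, blocks):
    """True if code point cp falls inside at least one of the given blocks."""
    return any(cp in block for block in blocks)


def max_range(msr, blocks):
    """Intersection reading: code points that are in the MSR AND in the blocks."""
    return {cp for cp in msr if in_any(cp, blocks)}


# Principle 1 as first drafted: MSR intersected with CJK + Extension A.
original = max_range(msr_sample, [CJK_UNIFIED, CJK_EXT_A])

# Principle 1 as revised on 30 April: all CJK Unified Ideographs in the MSR,
# which also admits Extension B code points such as those used by HKSCS.
revised = max_range(msr_sample, [CJK_UNIFIED, CJK_EXT_A, CJK_EXT_B])

# "Sum" reading: the union of the tables, which is far larger than either
# intersection and would pull in code points the MSR does not contain.
union = set(CJK_UNIFIED) | set(CJK_EXT_A) | msr_sample

print(sorted(hex(cp) for cp in original))
print(sorted(hex(cp) for cp in revised))
print(len(union))
```

Under these assumptions the revised reading differs from the original only by the Extension B code point, while the union covers every code point in both blocks regardless of whether the MSR includes it.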