From: 王伟 [mailto:wangwei@cnic.cn]
Sent: Monday, 5 June 2017 1:48 PM
To: 'Qin Lu' <csluqin@comp.polyu.edu.hk>; bjlgy@bnu.edu.cn; 'Zhang Zhoucai' <joe.zhang@unihan.com.cn>
Cc: ChineseGP@icann.org
Subject: question about 〇 and 零

Dear Prof. Lu, Prof. Li and Prof. Zhang,

JGP provided its latest repertoire, in which a new character 〇 was added. Do you think 〇 should be an independent character, or a variant of 零?

Looking forward to your suggestions.

Regards,
WANG Wei
I am sorry that I have not followed the latest developments in the IRG. Has the IRG really added a new ideographic zero besides 〇 U+3007?

FYI, please find attached a document on this issue that I took part in writing in 2015.

Thanks,
Zhang
From: csluqin@comp.polyu.edu.hk
Sent: 2017-06-07 13:50

Dear Wang Wei,

What is the Unicode code point of this Japanese character? I think treating U+3007 as the simplified form of 零 is a good approach; simply treating it as a variant is not as good. However, U+3007 is classified as a CJK symbol, not a CJK ideograph.

Best regards,
Lu Qin
From: Yao HEALTH (via chinesegp-bounces@icann.org)
Sent: 2017-06-07 15:09

Hello,

I agree with Professor Lu Qin's point. Please consider the following text from section 2.6 of RFC 5892 (for IDNA), <https://tools.ietf.org/html/rfc5892#section-2.6>:

  2.6. Exceptions (F)

  F: cp is in {00B7, 00DF, 0375, 03C2, 05F3, 05F4, 0640, 0660,
               0661, 0662, 0663, 0664, 0665, 0666, 0667, 0668,
               0669, 06F0, 06F1, 06F2, 06F3, 06F4, 06F5, 06F6,
               06F7, 06F8, 06F9, 06FD, 06FE, 07FA, 0F0B, 3007,
               302E, 302F, 3031, 3032, 3033, 3034, 3035, 303B,
               30FB}

  This category explicitly lists code points for which the category
  cannot be assigned using only the core property values that exist
  in the Unicode standard. The values are according to the table
  below:

  PVALID -- Would otherwise have been DISALLOWED

  00DF; PVALID # LATIN SMALL LETTER SHARP S
  03C2; PVALID # GREEK SMALL LETTER FINAL SIGMA
  06FD; PVALID # ARABIC SIGN SINDHI AMPERSAND
  06FE; PVALID # ARABIC SIGN SINDHI POSTPOSITION MEN
  0F0B; PVALID # TIBETAN MARK INTERSYLLABIC TSHEG
  3007; PVALID # IDEOGRAPHIC NUMBER ZERO

It means that U+3007 is allowed only as the number zero; otherwise it would not be allowed in the IDNA protocols. 零 can express many meanings besides the number zero, but 〇 can be used only as the number zero in IDNA. 〇 is a special code point in IDNA, while 零 is a normal code point. So it may not be proper to regard 〇 as a variant of 零 under the IDNA protocols.

Best regards,
Jiankang Yao
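The distinction drawn in the message above can be checked mechanically. The following is only a minimal sketch, assuming Python's standard `unicodedata` module as the source of Unicode properties; the names `EXCEPTIONS_F`, `PVALID_EXCEPTIONS` and `describe` are hypothetical helpers, and the two sets simply transcribe the code points quoted from RFC 5892 section 2.6. It shows that U+3007 is on the exceptions list and has General_Category Nl (letter number), whereas U+96F6 (零) is an ordinary Lo (other letter) ideograph.

```python
import unicodedata

# Code points quoted from RFC 5892 section 2.6, Exceptions (F).
EXCEPTIONS_F = {
    0x00B7, 0x00DF, 0x0375, 0x03C2, 0x05F3, 0x05F4, 0x0640, 0x0660,
    0x0661, 0x0662, 0x0663, 0x0664, 0x0665, 0x0666, 0x0667, 0x0668,
    0x0669, 0x06F0, 0x06F1, 0x06F2, 0x06F3, 0x06F4, 0x06F5, 0x06F6,
    0x06F7, 0x06F8, 0x06F9, 0x06FD, 0x06FE, 0x07FA, 0x0F0B, 0x3007,
    0x302E, 0x302F, 0x3031, 0x3032, 0x3033, 0x3034, 0x3035, 0x303B,
    0x30FB,
}

# The "PVALID -- Would otherwise have been DISALLOWED" rows quoted above.
PVALID_EXCEPTIONS = {0x00DF, 0x03C2, 0x06FD, 0x06FE, 0x0F0B, 0x3007}


def describe(cp):
    """Summarise a code point (hypothetical helper, not part of any IDNA library)."""
    ch = chr(cp)
    return {
        "code_point": f"U+{cp:04X}",
        "name": unicodedata.name(ch, "<unnamed>"),
        "general_category": unicodedata.category(ch),
        "in_exceptions_f": cp in EXCEPTIONS_F,
        "pvalid_by_exception": cp in PVALID_EXCEPTIONS,
    }


# Compare the two characters under discussion.
for cp in (0x3007, 0x96F6):
    print(describe(cp))
```

This only illustrates where the two code points sit in the quoted RFC text; it does not implement the full derived-property algorithm of RFC 5892.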
From: Zhang Joe [mailto:JoeZhang43@hotmail.com]
Sent: Thursday, 8 June 2017 7:53 AM
Importance: High

Dear all,

Sorry, I was confused by the words "new character" in Wang Wei's email. Now it is clear that this is purely a CGP-JGP issue, not an IRG one. The character 〇 U+3007 has been encoded since the first version of Unicode/CJK.

Concerning the question about 〇 and 零, I have some points:

1. They are corresponding members of two commonly used sets: "〇一二三四五六七八九十" and "零壹贰叁肆伍陆柒捌玖拾".

2. The traditional forms of "零壹贰叁肆伍陆柒捌玖拾" are "零壹貳參肆伍陸柒捌玖拾", used in Taiwan and Hong Kong. Some characters may have additional forms.

3. 〇 and 零 may be regarded as having something like a lowercase-uppercase relation, or a simplified-unsimplified one. Whatever you consider them to be, they have the same meaning: the ideographic number zero. By the LGR definition, they are variants of each other. Even a simplified-traditional relationship still falls under the variant concept.

4. As I recall, the TLD-LGR does not require variants to share all of their meanings.

In summary: it is OK to treat 〇 as a variant of 零 within the TLD-LGR scope.

Thanks,
Zhang
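To make the practical effect of the variant proposal concrete, here is a rough sketch of a symmetric variant relation between 〇 (U+3007) and 零 (U+96F6). Everything in it is an assumption for illustration: the `VARIANTS` table, `add_variant_pair` and `variant_labels` are hypothetical names, the sample label is invented, and a real LGR would express this in its own table format with dispositions (blocked or allocatable) that are not modelled here.

```python
from itertools import product

# Hypothetical variant table: each code point maps to its variant set,
# which always includes the code point itself.
VARIANTS = {}


def add_variant_pair(a, b):
    """Record a and b as variants of each other (symmetric relation)."""
    VARIANTS.setdefault(a, {a}).add(b)
    VARIANTS.setdefault(b, {b}).add(a)


# The pair discussed in this thread: U+3007 and U+96F6.
add_variant_pair("\u3007", "\u96f6")


def variant_labels(label):
    """Return every label reachable by swapping characters for their variants."""
    choices = [sorted(VARIANTS.get(ch, {ch})) for ch in label]
    return {"".join(combo) for combo in product(*choices)}


# Invented sample label "二〇一七" (2017 written with ideographic digits).
for candidate in sorted(variant_labels("\u4e8c\u3007\u4e00\u4e03")):
    print(candidate)
```

Under this sketch, a label containing 〇 and the same label with 零 substituted fall into one variant set, which is the outcome of treating the two as variants rather than as independent characters.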
Dear Everyone,

I echo Joe's point.

Regards,
Lu Qin
From: 王伟 [mailto:wangwei@cnic.cn]
Sent: Thursday, 8 June 2017 11:10 AM

Dear all,

U+3007 does not exist in the CDNC IDN Table. I raised this question because JGP imported U+3007 〇 in their latest repertoire. That is why we need to review the relationship between 〇 and 零 again.
From: csluqin@comp.polyu.edu.hk
Sent: 2017-06-08 13:48:16

Dear Wang Wei,

Yes, I understand. Even though U+3007 is a CJK symbol (not categorized as a CJK ideograph), it is used in real text to mean zero alongside 一二三四五. Also, by adding it as the simplified/lowercase form of 零, we have a tighter rule on its use than if we did not include it. Since JGP has already imported it, it is good to include it as the simplified/lowercase form of 零.

Best regards,
Lu Qin
I personally agree with Prof. Lu's suggestion. Actually, we have no reason to reject it, because JGP has a real use case here.

2017-06-08
YAN Zhiwei
participants (5)
- csluqin@comp.polyu.edu.hk
- YAN Zhiwei
- Yao HEALTH
- Zhang Joe
- 王伟