Stackoverflow and ChatGPT


Mark Svancarek (CELA)

Feb. 1, 2023

11:58 p.m.

You may recall that one of our recent EAI projects was to update the obsolete and incorrect guidance on sites like Stackoverflow and Serverfault regarding Universal Acceptance. You may also recall that the policies and processes of such sites made it almost impossible to update old conversations like the ones discussing parsing and validating email addresses.

I wonder if we will have a new opportunity to address this problem.

StackOverflow traffic dips as developers turn to ChatGPT for solutions | TechCabal <https://techcabal.com/2023/01/31/stack-overflow-chat-gpt/>

Mark Svancarek
Principal Program Manager, Corporate Standards Group
Privacy & Regulatory Affairs
Microsoft
marksv@microsoft.com
www.linkedin.com/in/marksv
T: [+1] 425-706-2773


Marc Blanchet

February 2023

12:15 a.m.

...

On 1 Feb 2023, at 18:58, Mark Svancarek (CELA) via UA-EAI <ua-eai@icann.org> wrote:

You may recall that one of our recent EAI projects was to update the obsolete and incorrect guidance on sites like Stackoverflow and Serverfault regarding Universal Acceptance. You may also recall that the policies and processes of such sites made it almost impossible to update old conversations like the ones discussing parsing and validating email addresses.

I wonder if we will have a new opportunity to address this problem.

I’m sure you had in mind when sending your email that it would start a debate… ;-) So I’ll start.

As you may know, ChatGPT is based on a model trained on a gigantic source of information. So if ChatGPT does not give proper information, updating it won’t be any simpler, as that depends on OpenAI’s algorithm and procedure for updating its training model.

Here is an example of the “limitations” of ChatGPT as it currently stands… As you can see, not very useful!

Marc.

<PastedGraphic-1.png>


_______________________________________________ UA-EAI mailing list UA-EAI@icann.org <mailto:UA-EAI@icann.org> https://mm.icann.org/mailman/listinfo/ua-eai _______________________________________________ By submitting your personal data, you consent to the processing of your personal data for purposes of subscribing to this mailing list accordance with the ICANN Privacy Policy (https://www.icann.org/privacy/policy) and the website Terms of Service (https://www.icann.org/privacy/tos). You can visit the Mailman link above to change your membership status or configuration, including unsubscribing, setting digest-style delivery or disabling delivery altogether (e.g., for a vacation), and so on.

Marc Blanchet

12:35 a.m.

Garbage in, garbage out, as they say… :(

Marc.

<PastedGraphic-1.png>


Marc Blanchet

12:37 a.m.

Final try… not bad, but as we know, not good either. But I’m stopping here. Full stop.

Marc.


Mark Svancarek (CELA)

1:03 a.m.

New subject: [EXTERNAL] Re: Stackoverflow and ChatGPT

I was indeed being provocative, but it’s really not so bad so long as you remember to have a conversation with ChatGPT (it maintains context between questions during the same session), rather than expecting it to emit perfect responses on its first attempt. Over time this might prove useful. Or not 😊

Mark: what is the best way to validate an international email address?

ChatGPT: To validate an international email address, you can follow these steps:

1. Check the format: Ensure that the email address has a local part (username), an @ symbol, and a domain name (e.g., example@domain.com).
2. Verify the domain name: Check that the domain name has a valid format (e.g., .com, .org, .net, etc.) and is correctly registered.
3. Test the email address: Send a test email to the address and check if it bounces or if you receive a response.
4. Use a library: There are several libraries available that can help you validate an email address programmatically, such as Python's validate_email library or PHP's validator library.

It's important to keep in mind that email validation is not foolproof and even a valid email address can still bounce or be invalid.

Mark: what is the difference between a legacy email address and an internationalized email address?

ChatGPT: A legacy email address is a type of email address that adheres to the traditional ASCII character set, which includes the English alphabet (upper and lower case), numbers, and special characters like @, ., _, etc. An example of a legacy email address is example@domain.com.

An internationalized email address, on the other hand, allows for the use of non-ASCII characters such as characters from other alphabets, diacritics, and special characters specific to different languages. An example of an internationalized email address is pépé@domaine.com, where the letter é is a non-ASCII character.

Internationalized email addresses are also known as "Internationalized Domain Names" (IDN) email addresses. The use of non-ASCII characters in email addresses allows for a wider range of email addresses that are more inclusive of different languages and alphabets.

Mark: what are the challenges when coding an email system to support internationalized email addresses?

ChatGPT: There are several challenges when coding an email system to support internationalized email addresses, including:

1. Encoding: Different characters may have different encodings, which can cause issues when sending, receiving, and storing emails.
2. Validation: Validating internationalized email addresses can be more complex than validating legacy email addresses, as there may be different rules and restrictions for different languages and scripts.
3. Display: Displaying internationalized email addresses in a user-friendly manner can be challenging, as different scripts may require different font styles and sizes, and may not be supported by all email clients.
4. Spoofing: Internationalized email addresses may be more susceptible to spoofing attacks, as it can be difficult to distinguish between visually similar characters from different scripts.
5. Interoperability: Ensuring that internationalized email addresses are compatible and can be sent and received by all email systems can be a challenge, as different email systems may have different levels of support for different scripts.

To overcome these challenges, it's important to use a standardized encoding for internationalized email addresses, such as UTF-8, and to follow the guidelines set by the Internet Assigned Numbers Authority (IANA) and the Internet Engineering Task Force (IETF).

Mark: why shouldn't you use a regex to evaluate that an internationalized email address is well-formed?
ChatGPT: Using a regular expression (regex) to evaluate the well-formedness of an internationalized email address is not recommended because:

1. Complexity: Regular expressions can become very complex when trying to validate internationalized email addresses, as they need to consider the various different scripts and characters that may be used.
2. Incomplete validation: Regular expressions can only perform a basic syntax check and cannot guarantee that an internationalized email address is actually valid or will be deliverable.
3. Lack of standards: There are no widely accepted standards for the validation of internationalized email addresses, making it difficult to create a regex that will work for all cases.
4. False negatives: A regex that is designed to validate internationalized email addresses may still produce false negatives, meaning that valid email addresses will be rejected.
5. False positives: A regex that is designed to validate internationalized email addresses may produce false positives, meaning that invalid email addresses will be accepted.

Instead of using a regex, it's recommended to use a library or code that is specifically designed to validate internationalized email addresses and follows the guidelines set by the Internet Assigned Numbers Authority (IANA) and the Internet Engineering Task Force (IETF).

Mark: what are some examples of libraries to avoid when processing internationalized email addresses?

ChatGPT: There is no one-size-fits-all answer to this question, as the choice of library will depend on the specific requirements of your application. However, some libraries that are known to have limited or no support for internationalized email addresses include:

1. Java's java.util.regex: This library only supports ASCII characters and is not suitable for validating internationalized email addresses.
2. Python's re module: This library only supports ASCII characters and is not suitable for validating internationalized email addresses.
3. .NET's System.Text.RegularExpressions: This library only supports ASCII characters and is not suitable for validating internationalized email addresses.

It's important to do thorough research and testing when selecting a library for processing internationalized email addresses, and to choose a library that is specifically designed for this purpose and follows the guidelines set by the Internet Assigned Numbers Authority (IANA) and the Internet Engineering Task Force (IETF).

Mark: What is the universal Acceptance Steering Group and what is its role relative to internationalized email addresses?

ChatGPT: The Universal Acceptance Steering Group (UASG) is a non-profit organization that aims to promote and advance universal acceptance of all valid domain names and email addresses. The UASG was created in response to the growing use of internationalized domain names (IDNs) and internationalized email addresses (IDN email addresses), which can be written in different scripts and contain characters from different alphabets.

The UASG's role relative to internationalized email addresses is to promote and encourage the widespread adoption and support of these email addresses by email service providers, email clients, and other related technology companies. This includes advocating for the implementation of relevant technical standards, such as the Internet Assigned Numbers Authority (IANA) and the Internet Engineering Task Force (IETF), and providing resources and guidance to help organizations understand the importance of universal acceptance.

By promoting universal acceptance of internationalized email addresses, the UASG aims to make the internet a more inclusive and accessible place, where people can communicate using their preferred script and language.
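One claim in the answers above can be checked directly: that java.util.regex "only supports ASCII characters". As far as I can tell that is not accurate for Java, where patterns operate on Unicode text and classes such as \p{L} match non-ASCII letters. A minimal sketch follows; the class name RegexUnicodeCheck, both patterns and the method names are made up for illustration, and the Unicode-aware pattern is deliberately loose and is not an EAI validator.

```java
import java.util.regex.Pattern;

public class RegexUnicodeCheck {
    // Typical legacy-style pattern: ASCII letters, digits and a few symbols only.
    static final Pattern ASCII_ONLY =
        Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    // Illustrative Unicode-aware pattern (an assumption, NOT a complete validator):
    // letters, combining marks and digits of any script around a single '@'.
    static final Pattern UNICODE_AWARE =
        Pattern.compile("[\\p{L}\\p{M}\\p{N}._%+-]+@[\\p{L}\\p{M}\\p{N}.-]+");

    static boolean asciiOnlyAccepts(String address) {
        return ASCII_ONLY.matcher(address).matches();
    }

    static boolean unicodeAwareAccepts(String address) {
        return UNICODE_AWARE.matcher(address).matches();
    }

    public static void main(String[] args) {
        // "pépé@domaine.com", written with escapes to avoid source-encoding issues
        String eai = "p\u00e9p\u00e9@domaine.com";
        System.out.println("ASCII-only pattern accepts:    " + asciiOnlyAccepts(eai));
        System.out.println("Unicode-aware pattern accepts: " + unicodeAwareAccepts(eai));
    }
}
```

The first pattern rejects pépé@domaine.com and the second accepts it, which suggests the trouble with regex-based validation lies in the ASCII-only patterns people copy around rather than in the engine's character support.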
Mark: Where can one learn about internationalized email addresses from UASG?

ChatGPT: The Universal Acceptance Steering Group (UASG) provides a variety of resources and information on internationalized email addresses (IDN email addresses) on its website, uasg.tech. Some of the resources that you can access to learn about IDN email addresses from the UASG include:

1. Technical Guidelines: The UASG provides technical guidelines and recommendations for implementing support for IDN email addresses, based on the standards set by the Internet Assigned Numbers Authority (IANA) and the Internet Engineering Task Force (IETF).
2. Awareness Materials: The UASG provides awareness materials, such as articles, presentations, and videos, to help educate stakeholders about the benefits and importance of supporting IDN email addresses.
3. FAQs: The UASG provides answers to frequently asked questions about IDN email addresses, covering topics such as validation, display, interoperability, and security.
4. Discussion Forums: The UASG provides discussion forums and mailing lists, where stakeholders can connect with each other and discuss topics related to IDN email addresses.
5. Case Studies: The UASG provides case studies of organizations that have successfully implemented support for IDN email addresses, highlighting best practices and challenges encountered.

By visiting the UASG website and utilizing these resources, you can gain a comprehensive understanding of internationalized email addresses and the steps that you can take to support them in your own organization.

Mark: what is the difference between EAI addresses and IDN addresses?

ChatGPT: Internationalized Email Addresses (EAI addresses) and Internationalized Domain Names (IDN addresses) are related but distinct concepts.
Internationalized Email Addresses (EAI addresses) are email addresses that can contain non-ASCII characters, such as characters from different scripts and alphabets. This allows people to use email addresses that are written in their preferred script and language, rather than being limited to ASCII characters.

Internationalized Domain Names (IDN addresses) are domain names that can contain non-ASCII characters. This allows organizations and individuals to use domain names that are written in their preferred script and language, rather than being limited to ASCII characters.

The main difference between EAI addresses and IDN addresses is that EAI addresses refer specifically to email addresses, while IDN addresses refer to domain names more broadly. However, both EAI addresses and IDN addresses share the goal of making the internet a more inclusive and accessible place, where people can communicate using their preferred script and language.

Mark: what is the difference between EAI email addresses and IDN email addresses?

ChatGPT: The terms "Internationalized Email Addresses (EAI)" and "Internationalized Domain Name email addresses (IDN email addresses)" are sometimes used interchangeably to refer to email addresses that contain non-ASCII characters. However, the difference between the two terms is not well established, and their use may vary depending on the context.

In general, both terms refer to email addresses that can contain non-ASCII characters, such as characters from different scripts and alphabets. This allows people to use email addresses that are written in their preferred script and language, rather than being limited to ASCII characters.

In conclusion, both terms refer to the same concept and can be used interchangeably to describe email addresses that support non-ASCII characters.

Mark: That's not correct. An EAI address (RFC 6530 et al.) may have non-ASCII characters in its local part, regardless of its domain name. "IDN email address" implies that the domain part is internationalized, but indicates nothing about the local part. Since an IDN can be expressed as either A-Label or U-Label, an IDN may be fully represented in ASCII. That is not true for local parts.

ChatGPT: I apologize for the confusion. You are correct. An EAI address, as defined in RFC 6530 and other related documents, may have non-ASCII characters in its local part, regardless of its domain name. The term "IDN email address" specifically refers to email addresses where the domain part is internationalized, which may or may not be fully represented in ASCII, depending on the encoding used (either A-Label or U-Label). The local part, on the other hand, may contain non-ASCII characters, regardless of the encoding used for the domain part. Thank you for bringing this to my attention.
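Mark's correction to ChatGPT can be illustrated with a small sketch. It assumes java.net.IDN (which implements the IDNA2003 ToASCII operation of RFC 3490) for the domain part; the class name AddressParts and its methods are hypothetical helpers, not part of any EAI library. The domain can always be rewritten as an ASCII A-label, but a non-ASCII local part has no ASCII form, so such an address needs EAI (SMTPUTF8, RFC 6531) support end to end.

```java
import java.net.IDN;

public class AddressParts {
    // Index of the '@' separating local part and domain; rejects obviously malformed input.
    private static int splitIndex(String address) {
        int at = address.lastIndexOf('@');
        if (at <= 0 || at == address.length() - 1) {
            throw new IllegalArgumentException("expected local@domain: " + address);
        }
        return at;
    }

    // True if every UTF-16 code unit is in the ASCII range.
    static boolean isAscii(String s) {
        return s.chars().allMatch(c -> c < 128);
    }

    // Domain part converted to its A-label (ASCII) form via IDNA2003 ToASCII.
    static String domainToALabel(String address) {
        return IDN.toASCII(address.substring(splitIndex(address) + 1));
    }

    // True if the local part is non-ASCII, i.e. no ASCII fallback exists for it.
    static boolean localPartNeedsSmtpUtf8(String address) {
        return !isAscii(address.substring(0, splitIndex(address)));
    }

    public static void main(String[] args) {
        String idnOnly = "info@b\u00fccher.example";   // ASCII local part, IDN domain
        String eai = "p\u00e9p\u00e9@domaine.com";     // non-ASCII local part, ASCII domain
        System.out.println(domainToALabel(idnOnly) + " " + localPartNeedsSmtpUtf8(idnOnly));
        System.out.println(domainToALabel(eai) + " " + localPartNeedsSmtpUtf8(eai));
    }
}
```

The first address can travel over legacy mail systems once its domain is in A-label form; the second cannot, even though its domain is plain ASCII.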

Asmus Freytag

1:05 a.m.

Marc,

you've identified two of the main problems with AI based answers. One, the impossibility to fix misperceptions embedded in the training data. The other, the difficulty for a human user to "rate" the level of expertise behind an AI generated answer. You can't use plausibility, because all AI generated answers are plausible. You can't use other "tells" of inexperience, because the AI generated texts don't have them. And you can't "rate" the AI like you can human experts; nor can you rate the true experts that it plagiarizes.

If, as can be expected, significant levels of AI generated output are then being published on various sites in ways that even an AI can't be sure it's AI generated, then you not only get garbage in, garbage out, but fully recycled garbage.

Misinformation of the human generated kind is harmless by comparison.

A./

Arnt Gulbrandsen

10:24 a.m.

Hi,

This is precisely why ChatGPT is banned on Stack Overflow. The ban announcement says “its answers are frequently bad but look plausible” or words to that effect.

Arnt

Nicolas Fiumarelli

12:43 p.m.

Off-topic, but I found some cool stuff in ChatGPT regarding IDNA versions. I didn't test it, and it also confuses me because there are many cases depending on the context.

Q: why idna2003 and idna2008 are not compatible? Give an example in java

A: IDNA2003 and IDNA2008 are not compatible because they use different algorithms for encoding internationalized domain names (IDNs) into the ASCII-compatible format required by the domain name system (DNS). IDNA2003 uses a simpler, table-based algorithm, while IDNA2008 uses a more complex, context-sensitive algorithm that provides better handling of certain characters and character combinations.

Java:

    import java.net.IDN;

    public class Main {
        public static void main(String[] args) {
            String domainName = "xn--bcher-kva.example";

            // IDNA2003
            String idna2003 = IDN.toASCII(domainName, IDN.ALLOW_UNASSIGNED);
            System.out.println("IDNA2003: " + idna2003);

            // IDNA2008
            String idna2008 = IDN.toASCII(domainName, IDN.USE_STD3_ASCII_RULES | IDN.ALLOW_UNASSIGNED);
            System.out.println("IDNA2008: " + idna2008);
        }
    }

Output:

    IDNA2003: bücher.example
    IDNA2008: xn--bcher-kva.example

In this example, the domain name "bücher.example" is encoded using both IDNA2003 and IDNA2008, and the resulting ASCII-compatible strings are printed to the console. The IDNA2003 encoding results in "bücher.example", which may not be acceptable for use as a domain name in the DNS, while the IDNA2008 encoding results in "xn--bcher-kva.example", which is a valid ASCII-compatible string.

......

Regards,
Nicolas.
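For comparison with the snippet ChatGPT produced in Nicolas's message, here is a minimal sketch of what java.net.IDN actually provides. According to its Javadoc the class implements RFC 3490, i.e. IDNA2003 only; ALLOW_UNASSIGNED and USE_STD3_ASCII_RULES are options of the IDNA2003 ToASCII operation and do not select an IDNA version. The class name IdnRoundTrip and its helper methods are illustrative.

```java
import java.net.IDN;

public class IdnRoundTrip {
    // U-label (Unicode) form to A-label (ASCII "xn--") form.
    static String toALabel(String domain) {
        return IDN.toASCII(domain);
    }

    // A-label form back to U-label form.
    static String toULabel(String domain) {
        return IDN.toUnicode(domain);
    }

    // Same calls as in the ChatGPT-generated snippet; only the ToASCII options differ.
    static String withAllowUnassigned(String domain) {
        return IDN.toASCII(domain, IDN.ALLOW_UNASSIGNED);
    }

    static String withStd3AndAllowUnassigned(String domain) {
        return IDN.toASCII(domain, IDN.USE_STD3_ASCII_RULES | IDN.ALLOW_UNASSIGNED);
    }

    public static void main(String[] args) {
        String uLabel = "b\u00fccher.example";      // "bücher.example"
        String aLabel = "xn--bcher-kva.example";

        System.out.println("toASCII:   " + toALabel(uLabel));
        System.out.println("toUnicode: " + toULabel(aLabel));
        // Input that is already an A-label passes through both flagged calls.
        System.out.println("flag call 1: " + withAllowUnassigned(aLabel));
        System.out.println("flag call 2: " + withStd3AndAllowUnassigned(aLabel));
    }
}
```

As far as I can tell, both calls in the ChatGPT snippet receive a string that is already an A-label and return it unchanged, so the "Output" section shown with that snippet does not match what the code would print. Comparing IDNA2003 with IDNA2008 in Java would need a separate IDNA2008-capable library, which this sketch does not attempt.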

John Levine

5:49 p.m.

It appears that Nicolas Fiumarelli via UA-EAI <nicocamarao@gmail.com> said:

...

Q: why idna2003 and idna2008 are not compatible? Give an example in java

A: IDNA2003 and IDNA2008 are not compatible because they use different algorithms for encoding internationalized domain names (IDNs) into the ASCII-compatible format required by the domain name system (DNS).

So far, so good.

...

IDNA2003 uses a simpler, table-based algorithm, while IDNA2008 uses a more complex, context-sensitive algorithm that provides better handling of certain characters and character combinations.

Completely wrong. IDNA2008 uses tables and is not context-sensitive.

You see the problem.

R's,
John

Arnt Gulbrandsen

6:21 p.m.

Hi,

RFC 5895 discusses locale-dependent things clients might want or need to do in conjunction with IDNA2008. It defines these things as not part of the protocol. Some of them are required, though.

Is locale context? I'd say yes. Is IDNA2008 just the protocol, just the things you have to do, or all the things described in RFCs 5890-5895? I don't have an opinion on that, but GPT's source apparently thinks the latter, and I'd say it's a reasonable reading.

Arnt

Asmus Freytag

7:01 p.m.

On 2/2/2023 9:49 AM, John Levine via UA-EAI wrote:

...

It appears that Nicolas Fiumarelli via UA-EAI<nicocamarao@gmail.com> said:

...
Q: why idna2003 and idna2008 are not compatible? Give an example in java

A: IDNA2003 and IDNA2008 are not compatible because they use different algorithms for encoding internationalized domain names (IDNs) into the ASCII-compatible format required by the domain name system (DNS).

So far, so good.

...
IDNA2003 uses a simpler, table-based algorithm, while IDNA2008 uses a more complex, context-sensitive algorithm that provides better handling of certain characters and character combinations.

Completely wrong. IDNA2008 uses tables and is not context-sensitive.

Hmm. What about CONTEXTO and CONTEXTJ ?? My take is that these introduce (in-label) context into the base algorithm, but I rate the improvements as marginal; witness the extensive amount of added context restrictions required for the Root Zone LGRs. A./

...

You see the problem.

R's, John
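To make the CONTEXTO point in this exchange concrete, here is a minimal Java sketch of one in-label contextual rule, the MIDDLE DOT rule from RFC 5892, Appendix A.3 (U+00B7 is allowed only between two "l" characters, U+006C). The class and method names are made up for illustration. This is not a full IDNA2008 validator: it covers only this single rule and ignores every other code point check.

```java
class ContextRuleDemo {
    static final char MIDDLE_DOT = '\u00B7';

    // Sketch of the RFC 5892 A.3 rule: each MIDDLE DOT must have
    // 'l' (U+006C) immediately before and immediately after it.
    // Hypothetical helper; checks nothing else about the label.
    static boolean middleDotsAllowed(String label) {
        for (int i = 0; i < label.length(); i++) {
            if (label.charAt(i) != MIDDLE_DOT) {
                continue;
            }
            boolean lBefore = i > 0 && label.charAt(i - 1) == 'l';
            boolean lAfter = i + 1 < label.length() && label.charAt(i + 1) == 'l';
            if (!(lBefore && lAfter)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(middleDotsAllowed("col\u00B7legi")); // true: dot sits between two 'l'
        System.out.println(middleDotsAllowed("a\u00B7b"));      // false: neighbours are not 'l'
    }
}
```

Whether a character is acceptable here depends on its neighbours inside the same label, which is the sense of "context" raised in the reply above; the table lookup alone cannot decide it.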


Martin J. Dürst

6:43 a.m.

Hello Nicolas, others, As already said, the answer below is not too helpful. A few more details below: On 2023-02-02 21:43, Nicolas Fiumarelli via UA-EAI wrote:

...

Offtopic but I found cool stuff in chat GPT regarding IDNA versions. I didn't test it and it also confuses me because there are many cases depending on the context.

Q: why idna2003 and idna2008 are not compatible? Give an example in java

A: IDNA2003 and IDNA2008 are not compatible because they use different algorithms for encoding internationalized domain names (IDNs) into the ASCII-compatible format required by the domain name system (DNS). IDNA2003 uses a simpler, table-based algorithm, while IDNA2008 uses a more complex, context-sensitive algorithm that provides better handling of certain characters and character combinations.

Actually, IDNA2003 and IDNA2008 are mostly compatible, but there are exceptions. Depending on the language(s), scripts, characters, and computer system involved, the difference may be completely ignorable or may be relevant.

...

Java:

    import java.net.IDN;

    public class Main {
        public static void main(String[] args) {
            String domainName = "xn--bcher-kva.example";

            // IDNA2003
            String idna2003 = IDN.toASCII(domainName, IDN.ALLOW_UNASSIGNED);
            System.out.println("IDNA2003: " + idna2003);

            // IDNA2008
            String idna2008 = IDN.toASCII(domainName, IDN.USE_STD3_ASCII_RULES | IDN.ALLOW_UNASSIGNED);
            System.out.println("IDNA2008: " + idna2008);
        }
    }

Output:

    IDNA2003: bücher.example
    IDNA2008: xn--bcher-kva.example

In this example, the domain name "bücher.example" is encoded using both IDNA2003 and IDNA2008, and the resulting ASCII-compatible strings are printed to the console. The IDNA2003 encoding results in "bücher.example", which may not be acceptable for use as a domain name in the DNS, while the IDNA2008 encoding results in "xn--bcher-kva.example", which is a valid ASCII-compatible string.

According to the latest official documentation at https://docs.oracle.com/en/java/javase/19/docs/api/java.base/java/net/IDN.ht..., Java still seems to be on IDNA2003, without support for IDNA2008. That can be deduced from the fact that it references nothing newer than RFC 3490.

But wherever the above example program was run, there was either a faulty implementation involved, or somehow the output came from somewhere else (we don't know what parts of the answer came from where). There's *no way* the string idna2003 should end up as bücher.example after applying the toASCII method.

Regards, Martin.
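For anyone who wants to check this locally, here is a small sketch of what java.net.IDN can be expected to do with these inputs, assuming a standard JDK whose IDN class follows RFC 3490 as its documentation states. The class name and the toAscii wrapper are made up for illustration. The two flags control handling of unassigned code points and STD3 ASCII checks; neither one selects an IDNA version. The last call shows one of the real exceptions to compatibility: Nameprep in IDNA2003 maps sharp s to "ss", whereas IDNA2008 treats that character as valid in its own right.

```java
import java.net.IDN;

class IdnDemo {
    // Thin illustrative wrapper; flags are the java.net.IDN flag bits.
    static String toAscii(String name, int flags) {
        return IDN.toASCII(name, flags);
    }

    public static void main(String[] args) {
        String unicode = "b\u00fccher.example"; // "bücher.example"
        String ace = "xn--bcher-kva.example";
        int both = IDN.USE_STD3_ASCII_RULES | IDN.ALLOW_UNASSIGNED;

        // Unicode input: each flag combination gives the same ACE form.
        System.out.println(toAscii(unicode, 0));
        System.out.println(toAscii(unicode, IDN.ALLOW_UNASSIGNED));
        System.out.println(toAscii(unicode, both));

        // Input that is already ASCII comes back unchanged from toASCII.
        System.out.println(toAscii(ace, IDN.ALLOW_UNASSIGNED));

        // Only toUnicode turns the ACE form back into the Unicode label.
        System.out.println(IDN.toUnicode(ace));

        // IDNA2003 mapping: sharp s is folded to "ss" before encoding.
        System.out.println(toAscii("fa\u00df.example", 0));
    }
}
```

None of these calls to toASCII can return a string containing "ü", which matches the observation that the quoted output cannot have come from running the quoted program.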

...

......

Regards, Nicolas.

Asmus Freytag

6:44 p.m.

On 2/2/2023 2:24 AM, Arnt Gulbrandsen wrote:

...

Hi,

This is precisely why ChatGPT is banned on Stack Overflow. The ban announcement says “its answers are frequently bad but look plausible” or words to that effect.

Arnt, yes, I was aware of that. But plausibility is not the only metric we use when we make decisions whether to rely on advice by human experts. While none are perfect, they are still pretty decent heuristics. (see comment added below). A./

...

Arnt

From: UA-EAI <ua-eai-bounces@icann.org> on behalf of Asmus Freytag via UA-EAI <ua-eai@icann.org>
Reply to: Asmus Freytag <asmusf@ix.netcom.com>
Date: Thursday, 2 February 2023 at 02:05
To: "ua-eai@icann.org" <ua-eai@icann.org>
Subject: Re: [UA-EAI] Stackoverflow and ChatGPT

Marc,

you've identified two of the main problems with AI-based answers. One is the impossibility of fixing misperceptions embedded in the training data. The other is the difficulty for a human user to "rate" the level of expertise behind an AI-generated answer. You can't use plausibility, because all AI-generated answers are plausible. You can't use other "tells" of inexperience, because the AI-generated texts don't have them. And you can't "rate" the AI like you can human experts; nor can you rate the true experts that it plagiarizes.

What is meant is that with human experts you can look at their track record and form some qualified opinion about what level and type of expertise to expect. Part of that depends on associating individuals with a specialization. The AI is not specialized in a knowledge domain and one result doesn't predict the next.

...

If, as can be expected, significant levels of AI generated output are then being published on various sites in ways that even an AI can't be sure it's AI generated, then you not only get garbage in, garbage out, but fully recycled garbage.

Misinformation of the human generated kind is harmless by comparison.

A./

Mark W. Datysgeld

1:22 a.m.

I do like the idea behind this second answer it gave to Marc... I do wonder if they will ever open up to some sort of "request for amendments". I imagine it would be a logistical and maybe legal nightmare, but it could be very helpful.

---
Mark W. Datysgeld
Director at Governance Primer
ICANN GNSO Councilor

From: Marc Blanchet via UA-EAI <ua-eai@icann.org>
Sent: Wednesday, February 1, 2023 21:37
To: Mark Svancarek (CELA)
Cc: ua-eai@icann.org
Subject: Re: [UA-EAI] Stackoverflow and ChatGPT

...

Final try… not bad, but as we know, not good either. But I’m stopping here. Full stop. Marc.

...
On Feb. 1, 2023, at 19:35, Marc Blanchet <marc.blanchet@viagenie.ca> wrote:

Garbage in, garbage out, as they say… :( Marc.

<PastedGraphic-1.png>

...
On Feb. 1, 2023, at 19:15, Marc Blanchet <marc.blanchet@viagenie.ca> wrote:

...
On Feb. 1, 2023, at 18:58, Mark Svancarek (CELA) via UA-EAI <ua-eai@icann.org> wrote:

You may recall that one of our recent EAI projects was to update the obsolete and incorrect guidance on sites like Stackoverflow and Serverfault regarding Universal Acceptance. You may also recall that the policies and processes of such sites made it almost impossible to update old conversations like the ones discussing parsing and validating email addresses. I wonder if we will have a new opportunity to address this problem.

I’m sure you had in mind when sending your email that it will start a debate… ;-) So I’ll start.

As you may know, chatGPT is based on a trained model with a gigantic source of information. So if chatGPT does not give proper information, it won’t be more simple to update it, as it depends on OpenAI algorithm and procedure to update its training model.

Here is an example of the “limitations” of chatGPT as it currently stands… As you can see, not very useful!

Marc.

<PastedGraphic-1.png>

...
StackOverflow traffic dips as developers turn to ChatGPT for solutions | TechCabal Mark Svancarek Principal Program Manager, Corporate Standards Group Privacy & Regulatory Affairs Microsoft marksv@microsoft.com www.linkedin.com/in/marksv T: [+1] 425-706-2773

John Levine

3:55 a.m.

It appears that Mark Svancarek (CELA) via UA-EAI <marksv@microsoft.com> said:

...

You may recall that one of our recent EAI projects was to update the obsolete and incorrect guidance on sites like Stackoverflow and Serverfault regarding Universal Acceptance. You may also recall that the policies and processes of such sites made it almost impossible to update old conversations like the ones discussing parsing and validating email addresses.

I wonder if we will have a new opportunity to address this problem.

I think there are much easier ways to get kicked off Stackoverflow than sending in GPT blather. I see why they lock old conversations, even though as we all know it sometimes preserves wrongness in amber, but I doubt it'll change. Perhaps ask a new similar question, provide correct answers (written by non-hallucinating humans) and get all your friends to upvote it. R's, John
