Quoting Andrew:
..., many people do not realise that there is no strict technical restriction at all on which eight-bit octets you can put in your zone. DNS labels are made of octets. If you want to put any series of bits you like in your zone, you can do so. This means that you could just plop UTF-8 directly into the zone, and some people have done this. But the RFCs (STD 13) also say that it would be better to stick to the hostname rules ("letter, digit, hyphen"). So, in the TLD space, we have mostly stuck to this, for maximum interoperability on the Internet. The closer you are to the root, the more conservative you need to be, since things will break otherwise.
The root is constrained by more than the hostname rule. To ensure that a domain name cannot be confused with a numerical IP address, RFC 1123 requires that "at least the highest-level component label will be alphabetic". Read strictly, this bars the ASCII-encoded form of an IDN TLD label (the "A-label") from the root, since every A-label contains hyphens (in its "xn--" prefix) and many also include digits. The necessary clarification of what is meant by "alphabetic", explicitly to permit the inclusion of A-labels in the root, is provided in <http://tools.ietf.org/html/draft-liman-tld-names-04>. (That draft recently expired, so I assume subsequent action has been taken to address the underlying issue. Can anyone confirm?) /Cary
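
To make the two rules above concrete, here is a rough Python sketch. It is only an illustration under stated assumptions: the helper names `is_ldh_label` and `is_strictly_alphabetic` are made up for this example, "alphabetic" is read in the strictest sense (ASCII letters only), and the A-label is produced with Python's built-in `idna` codec, which implements the older IDNA 2003 rules. The Cyrillic label is just sample input.

```python
import re

# Hostname ("letter, digit, hyphen") rule for a single label:
# 1 to 63 octets, ASCII letters, digits or hyphens, no leading or trailing hyphen.
LDH_LABEL = re.compile(rb"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_ldh_label(label: bytes) -> bool:
    """Hypothetical helper: does the label follow the hostname (LDH) rule?"""
    return LDH_LABEL.fullmatch(label) is not None


def is_strictly_alphabetic(label: bytes) -> bool:
    """Hypothetical helper: strictest reading of "alphabetic" (ASCII letters only)."""
    return label.isalpha()


u_label = "испытание"               # sample Unicode label, for illustration only

raw_utf8 = u_label.encode("utf-8")  # octets you could put directly into a zone
a_label = u_label.encode("idna")    # ASCII-compatible form, begins with b"xn--"

print(is_ldh_label(raw_utf8))           # raw UTF-8 octets fall outside the LDH rule
print(is_ldh_label(a_label))            # the A-label satisfies the LDH rule
print(is_strictly_alphabetic(a_label))  # but it is not purely alphabetic
print(is_strictly_alphabetic(b"com"))   # a traditional TLD label is
```

The point of the sketch is only that an A-label passes the hostname rule yet fails a letters-only reading of "alphabetic", which is the gap the draft sets out to close.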