This JEP specifies a mechanism that enables the display of Jabber Identifiers (JIDs) with characters disallowed by the Nodeprep profile of stringprep.
NOTICE: This JEP is currently within Last Call or under consideration by the Jabber Council for advancement to the next stage in the JSF standards process. For further details, visit <http://www.jabber.org/council/queue.shtml>.
Type: Standards Track
Last Updated: 2005-04-21
JIG: Standards JIG
Approving Body: Jabber Council
Dependencies: XMPP Core, JEP-0030
Superseded By: None
Short Name: jid#20;escaping
This Jabber Enhancement Proposal is copyright 1999 - 2005 by the Jabber Software Foundation (JSF) and is in full conformance with the JSF's Intellectual Property Rights Policy <http://www.jabber.org/jsf/ipr-policy.shtml>. This material may be distributed only subject to the terms and conditions set forth in the Open Publication License, v1.0 or later (the latest version is presently available at <http://www.opencontent.org/openpub/>).
The preferred venue for discussion of this document is the Standards-JIG discussion list: <http://mail.jabber.org/mailman/listinfo/standards-jig>.
The Extensible Messaging and Presence Protocol (XMPP) is defined in the XMPP Core (RFC 3920) and XMPP IM (RFC 3921) specifications contributed by the Jabber Software Foundation to the Internet Standards Process, which is managed by the Internet Engineering Task Force in accordance with RFC 2026. Any protocols defined in this JEP have been developed outside the Internet Standards Process and are to be understood as extensions to XMPP rather than as an evolution, development, or modification of XMPP itself.
The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
XMPP Core defines the Nodeprep profile of stringprep (RFC 3454), which specifies that the following Unicode code points are disallowed in the node identifier portion of a JID (hereafter we refer to these as "the disallowed characters"):
U+0020 (space)
U+0022 (")
U+0026 (&)
U+0027 (')
U+002F (/)
U+003A (:)
U+003C (<)
U+003E (>)
U+0040 (@)
This restriction is an inconvenience for users who have one or more of the foregoing nine disallowed characters in their desired usernames, particularly in the case of the ' character, which is common in names like O'Hara and D'Artagnan. The restriction is a positive hardship if existing email addresses are mapped to JIDs, since some of the disallowed characters are allowed in the username portion of an email address (e.g., the characters & ' / as described in sections 3.2.4 and 3.2.5 of RFC 2822).
If the & character had not been in the list of disallowed characters, then normal XML escaping conventions (as specified in XML 1.0) could have been used, with the result that D'Artagnan (for example) could have been rendered as D&apos;artagnan [sic]. Since there are good reasons for each of the disallowed characters, another escaping mechanism is needed.
It might have been desirable to use percent-encoding (e.g., %27 for the ' character) as specified in Section 2.1 of RFC 3986. However, that approach was rejected since the % character is an often-used character in JIDs (e.g., to replace the @ character in gateway addresses) and the resulting ambiguity would have caused confusion and, most likely, misdelivered or undeliverable XML stanzas. Therefore, a new mechanism is described herein to escape only the disallowed characters and only in the node identifier portion of Jabber IDs.
This JEP addresses the following requirements:
Rather than encoding a disallowed character as %hexhex as in URI syntax, this JEP specifies encoding such a character as #hexhex; where "hexhex" is the hexadecimal value of the Unicode code point in question, ignoring the leading "00" in the code point (e.g., 27 for the ' character, resulting in an encoding of #27;). Full encoding and decoding transformations for all nine disallowed characters are provided in the following sections.
Note: All transformations are exactly as specified below. CASE IS SIGNIFICANT. Lowercase was selected since Nodeprep will case fold to lowercase for US-ASCII characters such as A, C, E, and F.
The encoding transformations are defined in the following table. Typically, encoding is performed only by a client that is processing information provided by a human user in unescaped form, or by a gateway to some external system (e.g., email or LDAP) that needs to generate a JID.
|Unescaped Character||Encoded Character|
|<space>||#20;|
|"||#22;|
|&||#26;|
|'||#27;|
|/||#2f;|
|:||#3a;|
|<||#3c;|
|>||#3e;|
|@||#40;|
<message from='email@example.com/gate' to='d#27;firstname.lastname@example.org' type='chat'>
  <body>And do you always forget your eyes when you run?</body>
</message>
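The encoding rule can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `escape_node` and the constant `DISALLOWED` are hypothetical, and the set of nine characters is assumed from the Nodeprep profile in RFC 3920. The sketch derives each escape sequence from the code point using the #hexhex; rule with lowercase hexadecimal digits.

```python
# The nine characters disallowed by Nodeprep (assumption: taken from
# RFC 3920): space " & ' / : < > @
DISALLOWED = ' "&\'/:<>@'


def escape_node(node: str) -> str:
    """Replace each disallowed character with #hexhex; (lowercase hex).

    Only the node identifier portion of a JID should be passed in;
    the domain and resource portions are not escaped.
    """
    return "".join(
        f"#{ord(ch):02x};" if ch in DISALLOWED else ch
        for ch in node
    )


if __name__ == "__main__":
    print(escape_node("d'artagnan"))
```

Because the hexadecimal digits are emitted in lowercase, the result is not altered by the case folding that Nodeprep applies afterwards.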
The decoding transformations are defined in the following table. Typically, decoding is performed only by a client that wants to display JIDs containing encoded characters to a human user, or by a gateway to some external system (e.g., email or LDAP) that needs to generate identifiers for foreign systems.
|Encoded Character||Decoded Character|
|#20;||<space>|
|#22;||"|
|#26;||&|
|#27;||'|
|#2f;||/|
|#3a;||:|
|#3c;||<|
|#3e;||>|
|#40;||@|
<message from='d#27;email@example.com/elder' to='trévillefirstname.lastname@example.org'>
  <body>I recommend my son to you.</body>
</message>
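Decoding can be sketched as a single pass over the node identifier that replaces only the nine recognized sequences. The names `ESCAPES` and `unescape_node` are hypothetical, and the mapping is an assumption derived from applying the #hexhex; rule to the nine Nodeprep-disallowed characters. Matching is case-sensitive, so a sequence such as #2F; is not treated as an escape.

```python
import re

# Assumed mapping: the nine escape sequences and the characters they
# stand for, with lowercase hexadecimal digits only.
ESCAPES = {
    "#20;": " ", "#22;": '"', "#26;": "&",
    "#27;": "'", "#2f;": "/", "#3a;": ":",
    "#3c;": "<", "#3e;": ">", "#40;": "@",
}

# One alternation over the exact sequences; anything else never matches.
_PATTERN = re.compile("|".join(re.escape(seq) for seq in ESCAPES))


def unescape_node(node: str) -> str:
    """Replace each recognized escape sequence with its character."""
    return _PATTERN.sub(lambda m: ESCAPES[m.group(0)], node)


if __name__ == "__main__":
    print(unescape_node("d#27;artagnan"))
```

A single-pass substitution is used so that text produced by one replacement is never rescanned. Sequences that are not in the mapping, including partial ones, pass through unchanged.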
If an entity needs to discover whether another entity supports JID escaping, it MUST send a disco#info request to the other entity as specified in Service Discovery.
<iq type='get' from='email@example.com/gate' to='irc.shakespeare.lit' id='info1'>
  <query xmlns='http://jabber.org/protocol/disco#info'/>
</iq>
If the queried entity supports JID escaping, it MUST return a jid#20;escaping [sic] feature in its reply.
<iq type='result' to='firstname.lastname@example.org/gate' from='irc.shakespeare.lit' id='info1'>
  <query xmlns='http://jabber.org/protocol/disco#info'>
    ...
    <feature var='jid#20;escaping'/>
  </query>
</iq>
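A client might inspect the disco#info reply along the following lines. This is only a sketch using Python's standard `xml.etree.ElementTree` module; the function name `supports_jid_escaping` is hypothetical, and the stanza is assumed to arrive as a complete XML string rather than as part of a live stream. Anything other than a well-formed result that advertises the feature is treated as "not supported".

```python
import xml.etree.ElementTree as ET

DISCO_INFO = "http://jabber.org/protocol/disco#info"
FEATURE = "jid#20;escaping"


def supports_jid_escaping(iq_xml: str) -> bool:
    """Return True only if the reply is a result advertising the feature."""
    try:
        iq = ET.fromstring(iq_xml)
    except ET.ParseError:
        return False  # malformed reply: assume no support
    if iq.get("type") != "result":
        return False  # error replies: assume no support
    query = iq.find(f"{{{DISCO_INFO}}}query")
    if query is None:
        return False
    return any(
        feature.get("var") == FEATURE
        for feature in query.findall(f"{{{DISCO_INFO}}}feature")
    )


if __name__ == "__main__":
    reply = (
        "<iq type='result' from='irc.shakespeare.lit' id='info1'>"
        "<query xmlns='http://jabber.org/protocol/disco#info'>"
        "<feature var='jid#20;escaping'/>"
        "</query></iq>"
    )
    print(supports_jid_escaping(reply))
```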
In order to maintain as much backward compatibility as possible, partial escape sequences and escape sequences corresponding to characters not on the list of disallowed characters MUST be ignored.
the#1solution is not modified by encoding or decoding transformations.
#1;#2;#3; is not modified by encoding or decoding transformations.
foo#ba;r is not modified (to fooºr) by encoding or decoding transformations.
foob#41;r is not modified (to foobAr) by encoding or decoding transformations.
The following processing rules apply:
When a client attempts to communicate with another entity through a gateway, it needs to know which encoding mechanism to use. A client MUST assume that the gateway does not support the JID escaping mechanism unless it explicitly discovers support via Service Discovery as shown above. If there are any errors in the service discovery exchange or if support for the jid#20;escaping [sic] feature is not discovered, the client SHOULD proceed as follows:
Entities that enforce JID escaping MUST compare unescaped/decoded versions; otherwise, stanzas could be directed to an incorrect JID.
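The comparison requirement can be illustrated with a small sketch. The names `decode_node` and `same_node` are hypothetical, the mapping of nine sequences is assumed from the #hexhex; rule, and the sketch deliberately ignores every other aspect of Nodeprep normalization (such as case folding), which a real implementation would apply first.

```python
# Assumed mapping of the nine escape sequences to their characters.
ESCAPES = {
    "#20;": " ", "#22;": '"', "#26;": "&",
    "#27;": "'", "#2f;": "/", "#3a;": ":",
    "#3c;": "<", "#3e;": ">", "#40;": "@",
}


def decode_node(node: str) -> str:
    """Decode by plain substitution of each recognized sequence.

    None of the decoded characters is '#', ';' or a hex digit, so one
    substitution cannot create a new escape sequence for a later one.
    """
    for seq, ch in ESCAPES.items():
        node = node.replace(seq, ch)
    return node


def same_node(a: str, b: str) -> bool:
    """Compare two node identifiers in their decoded form."""
    return decode_node(a) == decode_node(b)


if __name__ == "__main__":
    print(same_node("d#27;artagnan", "d'artagnan"))
```

Comparing the raw strings instead would treat d#27;artagnan and d'artagnan as two different nodes, which is the misdirection the rule is meant to prevent.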
This JEP requires no interaction with the Internet Assigned Numbers Authority (IANA).
The Jabber Registrar shall include the jid#20;escaping [sic] feature in its registry of service discovery features.
1. RFC 3920: Extensible Messaging and Presence Protocol (XMPP): Core <http://www.ietf.org/rfc/rfc3920.txt>.
2. RFC 3454: Preparation of Internationalized Strings (stringprep) <http://www.ietf.org/rfc/rfc3454.txt>.
3. RFC 2822: Internet Message Format <http://www.ietf.org/rfc/rfc2822.txt>.
4. Extensible Markup Language (XML) 1.0 (Third Edition) <http://www.w3.org/TR/REC-xml/>.
5. RFC 3986: Uniform Resource Identifiers (URI): Generic Syntax <http://www.ietf.org/rfc/rfc3986.txt>.
6. JEP-0030: Service Discovery <http://www.jabber.org/jeps/jep-0030.html>.
7. JEP-0100: Gateway Interaction <http://www.jabber.org/jeps/jep-0100.html>.
8. The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique parameter values for Internet protocols, such as port numbers and URI schemes. For further information, see <http://www.iana.org/>.
9. The Jabber Registrar maintains a list of reserved Jabber protocol namespaces as well as registries of parameters used in the context of protocols approved by the Jabber Software Foundation. For further information, see <http://www.jabber.org/registrar/>.