JEP-0106: JID Escaping

This JEP specifies a mechanism that enables the display of Jabber Identifiers (JIDs) with characters prohibited by the Nodeprep profile of stringprep.


NOTICE: This JEP is currently within Last Call or under consideration by the Jabber Council for advancement to the next stage in the JSF standards process. For further details, visit <http://www.jabber.org/council/queue.shtml>.


JEP Information

Status: Proposed
Type: Standards Track
Number: 0106
Version: 0.4
Last Updated: 2005-04-04
JIG: Standards JIG
Approving Body: Jabber Council
Dependencies: XMPP Core, JEP-0030
Supersedes: None
Superseded By: None
Short Name: jid#20;escaping

Author Information

Joe Hildebrand

Email: jhildebrand@jabber.com
JID: hildjj@jabber.org

Peter Saint-Andre

Email: stpeter@jabber.org
JID: stpeter@jabber.org

Legal Notice

This Jabber Enhancement Proposal is copyright 1999 - 2005 by the Jabber Software Foundation (JSF) and is in full conformance with the JSF's Intellectual Property Rights Policy <http://www.jabber.org/jsf/ipr-policy.shtml>. This material may be distributed only subject to the terms and conditions set forth in the Open Publication License, v1.0 or later (the latest version is presently available at <http://www.opencontent.org/openpub/>).

Discussion Venue

The preferred venue for discussion of this document is the Standards-JIG discussion list: <http://mail.jabber.org/mailman/listinfo/standards-jig>.

Relation to XMPP

The Extensible Messaging and Presence Protocol (XMPP) is defined in the XMPP Core (RFC 3920) and XMPP IM (RFC 3921) specifications contributed by the Jabber Software Foundation to the Internet Standards Process, which is managed by the Internet Engineering Task Force in accordance with RFC 2026. Any protocols defined in this JEP have been developed outside the Internet Standards Process and are to be understood as extensions to XMPP rather than as an evolution, development, or modification of XMPP itself.

Conformance Terms

The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.


Table of Contents

1. Introduction
2. Requirements
3. Use Cases
3.1. Encoding Transformation
3.2. Decoding Transformation
3.3. Discovery
4. Business Rules
4.1. Exceptions
4.2. Processing
5. Security Considerations
6. IANA Considerations
7. Jabber Registrar Considerations
Notes
Revision History


1. Introduction

XMPP Core [1] defines the Nodeprep profile of stringprep (RFC 3454 [2]), which specifies that the following characters are invalid in the node identifier portion of a JID:

  " & ' / : < > @

This restriction is a hardship for users who have these characters in their chosen usernames, particularly in the case of the ' character, which is common in names like O'Hara and D'Artagnan. The restriction is especially onerous if existing email addresses are mapped to JIDs, since some of the foregoing characters are allowed in the username portion of an email address (e.g., the characters & ' / as described in sections 3.2.4 and 3.2.5 of RFC 2822 [3]).

If the & character had not been in the foregoing list, then normal XML escaping conventions could have been used, with the result that D'Artagnan (for example) could have been rendered as D&apos;artagnan [sic]. However, since there are good reasons for each of the prohibited characters shown above, another escaping mechanism is needed. Although it might have been desirable to use URL encoding (e.g., %27 for the ' character), that approach was rejected since the % character is so frequently used in JIDs (e.g., to replace the @ character in gateway addresses). Therefore, a new mechanism is described herein to escape only the foregoing characters and only in the node identifier portion of Jabber IDs.

2. Requirements

This JEP addresses the following requirements only:

  1. The escaping mechanism shall apply to the node identifier portion of a JID only, and MUST NOT be applied to domain identifiers or resource identifiers.
  2. Escaped JIDs MUST conform to the definition of a Jabber ID as specified in RFC 3920, including the Nodeprep profile of stringprep. In particular, this means that even after passing through Nodeprep, the JID MUST be valid, with the result that Unicode look-alikes like U+02BC (Modifier Letter Apostrophe) cannot be used.
  3. It MUST NOT be possible for clients to use this escaping mechanism to avoid the goal of stringprep; namely, that JIDs that look alike should have the same character representation after being processed by stringprep. Therefore, this mechanism SHOULD NOT be applied to any characters that are not in the foregoing list of characters forbidden in node identifiers.
  4. Existing JIDs that include portions of the escaping mechanism MUST continue to be valid.
  5. The escaping mechanism SHOULD NOT place undue strain upon server implementations; implementations or deployments that do not need to unescape SHOULD be able to ignore the escaping mechanism.

3. Use Cases

All transformations are exactly as specified. CASE IS SIGNIFICANT. Lowercase was selected since Nodeprep will case fold to lowercase.

3.1 Encoding Transformation

The following escaping transformations MAY be used by a conforming entity. Typically, this is done only by a client that receives input from a user in unescaped form, or by a gateway to some external system (e.g., email or LDAP) that needs to generate a JID.

Table 1: Mapping from Unescaped to Encoded Characters

Unescaped Character    Encoded Character
<space>                #20;
"                      #22;
#                      #23;
&                      #26;
'                      #27;
/                      #2f;
:                      #3a;
<                      #3c;
>                      #3e;
@                      #40;

Example 1. JID Encoding: Porthos starts a chat, typing into his client the JID d'artagnan@musketeers.bourbon.gov:

<message 
    from='porthos@musketeers.bourbon.gov/gate'
    to='d#27;artagnan@musketeers.bourbon.gov'
    type='chat'>
  <body>And do you always forget your eyes when you run?</body>
</message>
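
The encoding transformation can be sketched in Python as follows, assuming the node identifier has already been separated from the domain and resource identifiers. The names ESCAPES, VALID_SEQUENCES, and escape_node are illustrative and are not defined by this JEP. One assumption goes beyond Table 1: a literal # character is encoded as #23; only where it would otherwise be read as one of the ten valid escape sequences, so that a string such as foo#bar passes through unmodified.

```python
# Table 1: unescaped character -> encoded sequence (lowercase hex, as specified).
ESCAPES = {
    " ": "#20;", '"': "#22;", "#": "#23;", "&": "#26;", "'": "#27;",
    "/": "#2f;", ":": "#3a;", "<": "#3c;", ">": "#3e;", "@": "#40;",
}
VALID_SEQUENCES = frozenset(ESCAPES.values())


def escape_node(node: str) -> str:
    """Escape a node identifier only; never pass a domain or resource here."""
    out = []
    for i, ch in enumerate(node):
        if ch == "#":
            # Assumption (not stated in Table 1): encode "#" only when the
            # next four input characters would decode as a valid sequence.
            out.append("#23;" if node[i:i + 4] in VALID_SEQUENCES else "#")
        else:
            out.append(ESCAPES.get(ch, ch))
    return "".join(out)


# The JID that Porthos typed, escaped in the node portion only.
print(escape_node("d'artagnan") + "@musketeers.bourbon.gov")
# prints d#27;artagnan@musketeers.bourbon.gov
```

Under this assumption, a node that already contains the literal text #27; is encoded as #23;27; so that it is not later mistaken for an escaped apostrophe.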

3.2 Decoding Transformation

The opposite unescaping transformations MAY be used by a conforming entity. Typically, this is done only by a client that wants to display JIDs, or by a gateway to some external system (e.g., email or LDAP) that needs to generate identifiers for foreign systems.

Table 2: Mapping from Encoded to Decoded Characters

Encoded Character    Decoded Character
#20;                 <space>
#22;                 "
#23;                 #
#26;                 &
#27;                 '
#2f;                 /
#3a;                 :
#3c;                 <
#3e;                 >
#40;                 @

Example 2. JID Decoding: D'Artagnan the elder sends SMTP mail through a gateway:

<message 
    from='d#27;artagnan@gascon.fr/elder'
    to='tréville%musketeers.bourbon.gov@smtp.jabber.org'>
  <body>I recommend my son to you.</body>
</message>
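
A minimal sketch of the decoding transformation in Python, as a gateway or a displaying client might apply it to the node identifier. The names UNESCAPES, ESCAPE_PATTERN, and unescape_node are illustrative and are not defined by this JEP. The sketch scans left to right in a single pass and replaces only the ten sequences in Table 2; anything else, including uppercase hex digits, is left as it is.

```python
import re

# Table 2: encoded sequence -> decoded character.
UNESCAPES = {
    "#20;": " ", "#22;": '"', "#23;": "#", "#26;": "&", "#27;": "'",
    "#2f;": "/", "#3a;": ":", "#3c;": "<", "#3e;": ">", "#40;": "@",
}

# Candidate sequences: "#", two lowercase hex digits, ";" (case is significant).
ESCAPE_PATTERN = re.compile(r"#[0-9a-f]{2};")


def unescape_node(node: str) -> str:
    """Decode the listed sequences in one pass; leave everything else alone."""
    return ESCAPE_PATTERN.sub(
        lambda m: UNESCAPES.get(m.group(0), m.group(0)), node
    )


print(unescape_node("d#27;artagnan"))  # prints d'artagnan
```

Because the substitution is a single pass, the output of one replacement is never rescanned: #23;27; decodes to the literal text #27; rather than to an apostrophe.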

3.3 Discovery

If an entity wants to discover whether another entity supports JID escaping, it MUST send a disco#info request to the other entity as specified in Service Discovery [4].

Example 3. Client requests features

<iq
    type='get'
    from='porthos@musketeers.bourbon.gov/gate'
    to='irc.shakespeare.lit'
    id='info1'>
  <query xmlns='http://jabber.org/protocol/disco#info'/>
</iq>

If the queried entity supports JID escaping, it MUST return a jid#20;escaping [sic] feature in its reply.

Example 4. Service responds with features

<iq
    type='result'
    to='porthos@musketeers.bourbon.gov/gate'
    from='irc.shakespeare.lit'
    id='info1'>
  <query xmlns='http://jabber.org/protocol/disco#info'>
...
    <feature var='jid#20;escaping'/>
  </query>
</iq>
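
As an illustration, a client might inspect a disco#info reply for the feature roughly as follows. This is a sketch using Python's standard xml.etree.ElementTree module; the function name supports_jid_escaping and the sample reply are assumptions made for this example, and anything other than an IQ of type 'result' is treated as lack of support.

```python
import xml.etree.ElementTree as ET

DISCO_INFO_NS = "http://jabber.org/protocol/disco#info"
JID_ESCAPING_FEATURE = "jid#20;escaping"


def supports_jid_escaping(iq_xml: str) -> bool:
    """Return True only if a disco#info result advertises JID escaping."""
    iq = ET.fromstring(iq_xml)
    if iq.get("type") != "result":
        return False  # errors (or anything unexpected) count as "no support"
    query = iq.find(f"{{{DISCO_INFO_NS}}}query")
    if query is None:
        return False
    features = query.findall(f"{{{DISCO_INFO_NS}}}feature")
    return any(f.get("var") == JID_ESCAPING_FEATURE for f in features)


# Hypothetical reply, modeled on the service response example.
reply = """<iq type='result'
    to='porthos@musketeers.bourbon.gov/gate'
    from='irc.shakespeare.lit'
    id='info1'>
  <query xmlns='http://jabber.org/protocol/disco#info'>
    <feature var='jid#20;escaping'/>
  </query>
</iq>"""

print(supports_jid_escaping(reply))  # prints True
```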

When a client attempts to communicate with another entity through a gateway, it needs to know which encoding mechanism to use. A client MUST assume that the gateway does not support the JID escaping mechanism unless it explicitly discovers such support via Service Discovery as shown above. If there are any errors in the service discovery exchange or if support for the jid#20;escaping feature is not discovered, the client SHOULD proceed as follows:

  1. If the gateway supports the 'jabber:iq:gateway' protocol (as specified in Gateway Interaction [5]), use that protocol.
  2. If the gateway does not support the 'jabber:iq:gateway' protocol, use traditional escaping mechanisms (such as the transformation of the @ character to the % character).
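
The fallback order above could be expressed as a small decision function. This is only a sketch: the function name choose_addressing and the returned labels are hypothetical, and it assumes that the client learns about 'jabber:iq:gateway' support from the same disco#info feature list, which this JEP does not specify.

```python
from typing import Optional, Set


def choose_addressing(features: Optional[Set[str]]) -> str:
    """Pick an addressing strategy for a gateway.

    `features` is the set of disco#info feature vars returned by the
    gateway, or None if the service discovery exchange failed.
    """
    if features and "jid#20;escaping" in features:
        return "jid-escaping"       # support was explicitly discovered
    if features and "jabber:iq:gateway" in features:
        return "iq-gateway"         # step 1: use the jabber:iq:gateway protocol
    return "traditional"            # step 2: e.g., replace @ with %


print(choose_addressing({"jabber:iq:gateway"}))  # prints iq-gateway
```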

4. Business Rules

4.1 Exceptions

In order to maintain as much backward compatibility as possible, partial escape sequences and escape sequences that are not listed in the tables above MUST be ignored; that is, the escaping and unescaping transformations leave them unmodified.

Example 5. Partial escape sequence

foo#bar is not modified by escaping or unescaping transformations

Example 6. Invalid escape sequence

foob#41;r is not modified by escaping or unescaping transformations

4.2 Processing

As far as the bulk of the system is concerned, an escaped JID has no special processing associated with it. Clients SHOULD render escaped JIDs in unescaped form. Servers MAY unescape them for communication with external systems (e.g., LDAP), but only AFTER stringprep has been applied. The unescape transformation MUST be NFKC-safe -- i.e., it must conform to Unicode normalization form KC (see Appendix B.3 of RFC 3454). An entity MUST NOT use the unescaped version in any protocol sent to another entity, and MUST NOT use the unescaped version to compare with another JID.

5. Security Considerations

Entities that enforce JID escaping MUST compare unescaped versions; otherwise, a JID conflict could occur.

6. IANA Considerations

This JEP requires no interaction with the Internet Assigned Numbers Authority (IANA) [6].

7. Jabber Registrar Considerations

The Jabber Registrar [7] shall include the jid#20;escaping [sic] feature in its registry of service discovery features.


Notes

1. RFC 3920: Extensible Messaging and Presence Protocol (XMPP): Core <http://www.ietf.org/rfc/rfc3920.txt>.

2. RFC 3454: Preparation of Internationalized Strings (stringprep) <http://www.ietf.org/rfc/rfc3454.txt>.

3. RFC 2822: Internet Message Format <http://www.ietf.org/rfc/rfc2822.txt>.

4. JEP-0030: Service Discovery <http://www.jabber.org/jeps/jep-0030.html>.

5. JEP-0100: Gateway Interaction <http://www.jabber.org/jeps/jep-0100.html>.

6. The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique parameter values for Internet protocols, such as port numbers and URI schemes. For further information, see <http://www.iana.org/>.

7. The Jabber Registrar maintains a list of reserved Jabber protocol namespaces as well as registries of parameters used in the context of protocols approved by the Jabber Software Foundation. For further information, see <http://www.jabber.org/registrar/>.


Revision History

Version 0.4 (2005-04-04)

Corrected several small textual errors and ambiguities; slightly reorganized textual flow. (psa)

Version 0.3 (2005-03-16)

Clarified relationship between JID escaping and traditional client proxy gateway behavior; fixed several small errors. (psa)

Version 0.2 (2003-10-21)

Editorial cleanup; added security considerations. (psa)

Version 0.1 (2003-07-21)

Initial version. (jjh)

