RFC 5122
This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the “Internet Official Protocol Standards” (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited.
This document defines the use of Internationalized Resource Identifiers (IRIs) and Uniform Resource Identifiers (URIs) in identifying or interacting with entities that can communicate via the Extensible Messaging and Presence Protocol (XMPP).
This document obsoletes RFC 4622.
1. Introduction
   1.1. Terminology
2. Use of XMPP IRIs and URIs
   2.1. Rationale
   2.2. Form
   2.3. Authority Component
   2.4. Path Component
   2.5. Query Component
   2.6. Fragment Identifier Component
   2.7. Generation of XMPP IRIs/URIs
   2.8. Processing of XMPP IRIs/URIs
   2.9. Internationalization
3. IANA Registration of xmpp URI Scheme
   3.1. URI Scheme Name
   3.2. Status
   3.3. URI Scheme Syntax
   3.4. URI Scheme Semantics
   3.5. Encoding Considerations
   3.6. Applications/Protocols That Use This URI Scheme Name
   3.7. Interoperability Considerations
   3.8. Security Considerations
   3.9. Contact
   3.10. Author/Change Controller
   3.11. References
4. IANA Considerations
5. Security Considerations
   5.1. Reliability and Consistency
   5.2. Malicious Construction
   5.3. Back-End Transcoding
   5.4. Sensitive Information
   5.5. Semantic Attacks
   5.6. Spoofing
6. Acknowledgements
7. References
   7.1. Normative References
   7.2. Informative References
Appendix A. Differences From RFC 4622
Appendix B. Copying Conditions
1. Introduction
The Extensible Messaging and Presence Protocol (XMPP) is a streaming XML technology that enables any two entities on a network to exchange well-defined but extensible XML elements (called "XML stanzas") at a rate close to real time.
As specified in [XMPP‑CORE] (Saint-Andre, P., “Extensible Messaging and Presence Protocol (XMPP): Core,” October 2004.), entity addresses as used in communications over an XMPP network must not be prepended with a Uniform Resource Identifier (URI) scheme (as specified in [URI] (Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifier (URI): Generic Syntax,” January 2005.)). However, applications external to an XMPP network may need to identify XMPP entities either as URIs or, in a more modern fashion, as Internationalized Resource Identifiers (IRIs; see [IRI] (Duerst, M. and M. Suignard, “Internationalized Resource Identifiers (IRIs),” January 2005.)). Examples of such external applications include databases that need to store XMPP addresses and non-native user agents such as web browsers and calendaring applications that provide interfaces to XMPP services.
The format for an XMPP address is defined in [XMPP‑CORE] (Saint-Andre, P., “Extensible Messaging and Presence Protocol (XMPP): Core,” October 2004.). Such an address may contain nearly any Unicode character [UNICODE] (The Unicode Consortium, “The Unicode Standard, Version 3.2.0,” 2000.) and must adhere to various profiles of stringprep [STRINGPREP] (Hoffman, P. and M. Blanchet, “Preparation of Internationalized Strings ("STRINGPREP"),” December 2002.). The result is that an XMPP address is fully internationalizable and is very close to being an IRI without a scheme. However, given that there is no freestanding registry of IRI schemes, it is necessary to define XMPP identifiers primarily as URIs rather than as IRIs, and to register an XMPP URI scheme instead of an IRI scheme. Therefore, this document does the following:
1.1. Terminology
This document inherits terminology from [IRI] (Duerst, M. and M. Suignard, “Internationalized Resource Identifiers (IRIs),” January 2005.), [URI] (Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifier (URI): Generic Syntax,” January 2005.), and [XMPP‑CORE] (Saint-Andre, P., “Extensible Messaging and Presence Protocol (XMPP): Core,” October 2004.).
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 (Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” March 1997.) [TERMS].
2. Use of XMPP IRIs and URIs
2.1. Rationale
As described in [XMPP‑IM] (Saint-Andre, P., “Extensible Messaging and Presence Protocol (XMPP): Instant Messaging and Presence,” October 2004.), instant messaging and presence applications of XMPP must handle im: and pres: URIs (as specified by [CPIM] (Peterson, J., “Common Profile for Instant Messaging (CPIM),” August 2004.) and [CPP] (Peterson, J., “Common Profile for Presence (CPP),” August 2004.)). However, there are many other applications of XMPP (including network management, workflow systems, generic publish-subscribe, remote procedure calls, content syndication, gaming, and middleware), and these applications do not implement instant messaging and presence semantics. Neither does a generic XMPP entity implement the semantics of any existing URI scheme, such as the http:, ftp:, or mailto: scheme. Therefore, it is appropriate to define a new URI scheme that makes it possible to identify or interact with any XMPP entity (not just instant messaging and presence entities) as an IRI or URI.
XMPP IRIs and URIs are defined for use by non-native interfaces and applications. In order to ensure interoperability on XMPP networks, when data is routed to an XMPP entity (e.g., when an XMPP address is contained in the 'to' or 'from' attribute of an XML stanza) or an XMPP entity is otherwise identified in standard XMPP protocol elements, the entity MUST be addressed as <[node@]domain[/resource]> (i.e., without a prepended scheme), where the "node identifier", "domain identifier", and "resource identifier" portions of an XMPP address conform to the definitions provided in Section 3 of [XMPP‑CORE] (Saint-Andre, P., “Extensible Messaging and Presence Protocol (XMPP): Core,” October 2004.).
Note: For historical reasons, the term "resource identifier" is used in XMPP to refer to the optional portion of an XMPP address that follows the domain identifier and the "/" separator character (for details, refer to Section 3.4 of [XMPP‑CORE] (Saint-Andre, P., “Extensible Messaging and Presence Protocol (XMPP): Core,” October 2004.)); this use of the term "resource identifier" is not to be confused with the meanings of "resource" and "identifier" provided in Section 1.1 of [URI] (Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifier (URI): Generic Syntax,” January 2005.).
XMPP IRIs and URIs are defined primarily for the purpose of identification rather than of interaction (regarding this distinction, see Section 1.2.2 of [URI] (Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifier (URI): Generic Syntax,” January 2005.)). The "Internet resource" identified by an XMPP IRI or URI is an entity that can communicate via XMPP over a network. An XMPP IRI or URI can contain additional information above and beyond the identified resource; in particular, as described under Section 2.5 (Query Component) a query component can be included to specify suggested semantics for an interaction with the identified resource. It is envisioned that when an XMPP application resolves an XMPP IRI or URI containing suggested interaction semantics, the application will generate an XMPP stanza and send it to the identified resource, where the generated stanza may include user or application inputs that are consistent with the suggested interaction semantics (for details, see Section 2.8.1 (Processing Method)).
2.2. Form
As described in [XMPP‑CORE] (Saint-Andre, P., “Extensible Messaging and Presence Protocol (XMPP): Core,” October 2004.), an XMPP address used natively on an XMPP network is a string of Unicode characters that (1) conforms to a certain set of stringprep [STRINGPREP] (Hoffman, P. and M. Blanchet, “Preparation of Internationalized Strings ("STRINGPREP"),” December 2002.) profiles and IDNA restrictions [IDNA] (Faltstrom, P., Hoffman, P., and A. Costello, “Internationalizing Domain Names in Applications (IDNA),” March 2003.), (2) follows a certain set of syntax rules, and (3) is encoded as UTF-8 [UTF‑8] (Yergeau, F., “UTF-8, a transformation format of ISO 10646,” November 2003.). The form of such an address can be represented using Augmented Backus-Naur Form [ABNF] (Crocker, D. and P. Overell, “Augmented BNF for Syntax Specifications: ABNF,” October 2005.) as:
[ node "@" ] domain [ "/" resource ]
In this context, the "node" and "resource" rules rely on distinct profiles of stringprep [STRINGPREP] (Hoffman, P. and M. Blanchet, “Preparation of Internationalized Strings ("STRINGPREP"),” December 2002.), and the "domain" rule relies on the concept of an internationalized domain name as described in [IDNA] (Faltstrom, P., Hoffman, P., and A. Costello, “Internationalizing Domain Names in Applications (IDNA),” March 2003.). (Note: There is no need to refer to punycode in the IRI syntax itself, since any punycode representation would occur only inside an XMPP application in order to represent internationalized domain names. However, it is the responsibility of the processing application to convert IRI syntax [IRI] (Duerst, M. and M. Suignard, “Internationalized Resource Identifiers (IRIs),” January 2005.) into IDNA syntax [IDNA] (Faltstrom, P., Hoffman, P., and A. Costello, “Internationalizing Domain Names in Applications (IDNA),” March 2003.) before addressing XML stanzas to the specified entity on an XMPP network.)
Certain characters are allowed in XMPP node identifiers and XMPP resource identifiers but not in the relevant portion of an IRI or URI. The characters are as follows:
- In node identifiers:
- [ \ ] ^ ` { | }
- In resource identifiers:
- " < > [ \ ] ^ ` { | }
The node identifier characters are not allowed in userinfo by the sub-delims rule and the resource identifier characters are not allowed in segment by the pchar rule. These characters MUST be percent-encoded when transforming an XMPP address into an XMPP IRI or URI.
Naturally, in order to be converted into an IRI or URI, an XMPP address must be prepended with a scheme (specifically, the xmpp scheme) and may also need to undergo transformations that adhere to the rules defined in [IRI] (Duerst, M. and M. Suignard, “Internationalized Resource Identifiers (IRIs),” January 2005.) and [URI] (Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifier (URI): Generic Syntax,” January 2005.). Furthermore, in order to enable more advanced interaction with an XMPP entity rather than simple identification, it is desirable to take advantage of additional aspects of URI syntax and semantics, such as authority components, query components, and fragment identifier components.
Therefore, the ABNF syntax for an XMPP IRI is defined as shown below using Augmented Backus-Naur Form as specified by [ABNF] (Crocker, D. and P. Overell, “Augmented BNF for Syntax Specifications: ABNF,” October 2005.), where the "ifragment", "ihost", and "iunreserved" rules are defined in [IRI] (Duerst, M. and M. Suignard, “Internationalized Resource Identifiers (IRIs),” January 2005.) and the "pct-encoded" rule is defined in [URI] (Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifier (URI): Generic Syntax,” January 2005.):
xmppiri    = "xmpp" ":" ihierxmpp [ "?" iquerycomp ] [ "#" ifragment ]
ihierxmpp  = iauthpath / ipathxmpp
iauthpath  = "//" iauthxmpp [ "/" ipathxmpp ]
iauthxmpp  = inodeid "@" ihost
ipathxmpp  = [ inodeid "@" ] ihost [ "/" iresid ]
inodeid    = *( iunreserved / pct-encoded / nodeallow )
nodeallow  = "!" / "$" / "(" / ")" / "*" / "+" / "," / ";" / "="
iresid     = *( iunreserved / pct-encoded / resallow )
resallow   = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ":" / ";" / "="
iquerycomp = iquerytype [ *ipair ]
iquerytype = *iunreserved
ipair      = ";" ikey "=" ivalue
ikey       = *iunreserved
ivalue     = *( iunreserved / pct-encoded )
However, the foregoing syntax is not appropriate for inclusion in the registration of the xmpp URI scheme, since the IANA recognizes only URI schemes and not IRI schemes. Therefore, the ABNF syntax for an XMPP URI rather than for IRI is defined as shown in Section 3.3 (URI Scheme Syntax) of this document (see below under "IANA Registration"). If it is necessary to convert the IRI syntax into URI syntax, an application MUST adhere to the mapping procedure specified in Section 3.1 of [IRI] (Duerst, M. and M. Suignard, “Internationalized Resource Identifiers (IRIs),” January 2005.).
The following is an example of a basic XMPP IRI/URI used for purposes of identifying a node associated with an XMPP server:
xmpp:node@example.com
Descriptions of the various components of an XMPP IRI/URI are provided in the following sections.
2.3. Authority Component
As explained in Section 2.8 (Processing of XMPP IRIs/URIs) of this document, in the absence of an authority component, the processing application would authenticate as a configured user at a configured XMPP server. That is, the authority component section is unnecessary and should be ignored if the processing application has been configured with a set of default credentials.
In accordance with Section 3.2 of RFC 3986, the authority component is preceded by a double slash ("//") and is terminated by the next slash ("/"), question mark ("?"), or number sign ("#") character, or by the end of the IRI/URI. As explained more fully in Section 2.8.1 (Processing Method) of this document, the presence of an authority component signals the processing application to authenticate as the node@domain specified in the authority component rather than as a configured node@domain (see the Security Considerations section of this document regarding authentication). (While it is unlikely that the authority component will be included in most XMPP IRIs or URIs, the scheme allows for its inclusion, if appropriate.) Thus, the following XMPP IRI/URI indicates to authenticate as "guest@example.com":
xmpp://guest@example.com
Note well that this is quite different from the following XMPP IRI/URI, which identifies a node "guest@example.com" but does not signal the processing application to authenticate as that node:
xmpp:guest@example.com
Similarly, using a possible query component of "?message" to trigger an interface for sending a message, the following XMPP IRI/URI signals the processing application to authenticate as "guest@example.com" and to send a message to "support@example.com":
xmpp://guest@example.com/support@example.com?message
By contrast, the following XMPP IRI/URI signals the processing application to authenticate as its configured default account and to send a message to "support@example.com":
xmpp:support@example.com?message
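As an illustration of the distinction drawn above, the following Python sketch separates the authority component (the account to authenticate as) from the path component (the entity identified or addressed). The function name `split_xmpp_iri` and the `XmppParts` type are hypothetical and not defined by this document; the sketch performs no percent-decoding or validation.

```python
from typing import NamedTuple, Optional


class XmppParts(NamedTuple):
    authority: Optional[str]  # account to authenticate as, if "//" is present
    path: Optional[str]       # XMPP address being identified or addressed
    query: Optional[str]
    fragment: Optional[str]


def split_xmpp_iri(iri: str) -> XmppParts:
    """Split an xmpp IRI/URI into its top-level components (hypothetical helper)."""
    scheme, sep, rest = iri.partition(":")
    if not sep or scheme.lower() != "xmpp":
        raise ValueError("not an xmpp IRI/URI")
    # The fragment starts at the first "#"; the query at the first "?" before it.
    rest, hash_sep, fragment = rest.partition("#")
    rest, q_sep, query = rest.partition("?")
    authority = None
    if rest.startswith("//"):
        # The authority runs from "//" up to the next "/" (or the end).
        authority, _, rest = rest[2:].partition("/")
    return XmppParts(
        authority,
        rest or None,
        query if q_sep else None,
        fragment if hash_sep else None,
    )
```

Under these assumptions, `split_xmpp_iri("xmpp://guest@example.com/support@example.com?message")` yields an authority of `guest@example.com` and a path of `support@example.com`, whereas `split_xmpp_iri("xmpp:support@example.com?message")` yields no authority, so a processing application would fall back to its configured account.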
2.4. Path Component
The path component of an XMPP IRI/URI identifies an XMPP address or specifies the XMPP address to which an XML stanza shall be directed at the end of IRI/URI processing.
For example, the following XMPP IRI/URI identifies a node associated with an XMPP server:
xmpp:example-node@example.com
The following XMPP IRI/URI identifies a node associated with an XMPP server along with a particular XMPP resource identifier associated with that node:
xmpp:example-node@example.com/some-resource
Inclusion of a node is optional in XMPP addresses, so the following XMPP IRI/URI simply identifies an XMPP server:
xmpp:example.com
2.5. Query Component
There are many potential use cases for encapsulating information in the query component of an XMPP IRI/URI for the purpose of specifying suggested interaction semantics (see Section 2.1 (Rationale)); examples include, but are not limited to:
Many of these potential use cases are application specific, and the full range of such applications cannot be foreseen in advance given the continued expansion in XMPP development; however, there is agreement within the Jabber/XMPP developer community that all the uses envisioned to date can be encapsulated via a "query type", optionally supplemented by one or more "key-value" pairs (this is similar to the "application/x-www-form-urlencoded" MIME type described in [HTML] (Raggett, D., “HTML 4.0 Specification,” April 1998.)).
As an example, an XMPP IRI/URI intended to launch an interface for sending a message to the XMPP entity "example-node@example.com" might be represented as follows:
xmpp:example-node@example.com?message
Similarly, an XMPP IRI/URI intended to launch an interface for sending a message to the XMPP entity "example-node@example.com" with a particular subject might be represented as follows:
xmpp:example-node@example.com?message;subject=Hello%20World
If the processing application does not understand query components or the specified query type, it MUST ignore the query component and treat the IRI/URI as consisting of, for example, <xmpp:example-node@example.com> rather than <xmpp:example-node@example.com?query>. If the processing application does not understand a particular key within the query component, it MUST ignore that key and its associated value.
As noted, there exist many kinds of XMPP applications (both actual and potential), and such applications may define query types and keys for use in the query component portion of XMPP URIs. The XMPP Registrar function (see [XEP‑0053] (Saint-Andre, P., “XMPP Registrar Function,” December 2006.)) of the XMPP Standards Foundation maintains a registry of such query types and keys at <http://www.xmpp.org/registrar/querytypes.html>. To help ensure interoperability, any application using the formats defined in this document SHOULD submit any associated query types and keys to that registry in accordance with the procedures specified in [XEP‑0147] (Saint-Andre, P., “XMPP URI Scheme Query Components,” September 2006.).
Note: The delimiter between key-value pairs is the ";" character instead of the "&" character used in many other URI schemes. This delimiter was chosen in order to avoid problems with escaping of the & character in HTML and XML applications.
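The query handling rules in this section can be sketched in Python as follows. This is only an illustration: the function name `parse_query` and the `known` parameter (a mapping from understood query types to the keys understood for each) are assumptions made for the example, not part of this specification or of the registry.

```python
from urllib.parse import unquote


def parse_query(query, known=None):
    """Split 'querytype;key=value;...' into (querytype, {key: value}).

    `known` is a hypothetical mapping of understood query types to the
    set of keys understood for each type; None means "accept everything".
    """
    querytype, *pairs = query.split(";")
    if known is not None and querytype not in known:
        # Unknown query type: the whole query component is ignored.
        return None, {}
    params = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep:
            continue  # not a key=value pair; skip it
        key = unquote(key)
        if known is not None and key not in known[querytype]:
            continue  # unknown key: ignore the key and its value
        params[key] = unquote(value)
    return querytype, params
```

For example, `parse_query("message;subject=Hello%20World")` returns the query type `message` with a `subject` of `Hello World`. Note that pairs are split on ";" rather than "&", matching the delimiter choice described above.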
2.6. Fragment Identifier Component
As stated in Section 3.5 of [URI] (Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifier (URI): Generic Syntax,” January 2005.), "The fragment identifier component of a URI allows indirect identification of a secondary resource by reference to a primary resource and additional identifying information." Because the resource identified by an XMPP IRI/URI does not make available any media type (see [MIME] (Freed, N. and N. Borenstein, “Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types,” November 1996.)) and therefore (in the terminology of [URI] (Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifier (URI): Generic Syntax,” January 2005.)) no representation exists at an XMPP resource, the semantics of the fragment identifier component in XMPP IRIs/URIs are to be "considered unknown and, effectively, unconstrained" (ibid.). Particular XMPP applications MAY make use of the fragment identifier component for their own purposes. However, if a processing application does not understand fragment identifier components or the syntax of a particular fragment identifier component included in an XMPP IRI/URI, it MUST ignore the fragment identifier component.
2.7. Generation of XMPP IRIs/URIs
2.7.1. Generation Method
In order to form an XMPP IRI from an XMPP node identifier, domain identifier, and resource identifier, the generating application MUST first ensure that the XMPP address conforms to the rules specified in [XMPP‑CORE] (Saint-Andre, P., “Extensible Messaging and Presence Protocol (XMPP): Core,” October 2004.), including encoding as a UTF-8 [UTF‑8] (Yergeau, F., “UTF-8, a transformation format of ISO 10646,” November 2003.) string and application of the relevant stringprep profiles [STRINGPREP] (Hoffman, P. and M. Blanchet, “Preparation of Internationalized Strings ("STRINGPREP"),” December 2002.). Because IRI syntax [IRI] (Duerst, M. and M. Suignard, “Internationalized Resource Identifiers (IRIs),” January 2005.) specifies that the characters in an IRI are the original Unicode characters themselves [UNICODE] (The Unicode Consortium, “The Unicode Standard, Version 3.2.0,” 2000.), when generating an XMPP IRI the generating application MUST then decode the UTF-8 [UTF‑8] (Yergeau, F., “UTF-8, a transformation format of ISO 10646,” November 2003.) characters of a native XMPP address to their original Unicode form. The generating application then MUST concatenate the following:
In order to form an XMPP URI from the resulting IRI, an application MUST adhere to the mapping procedure specified in Section 3.1 of [IRI] (Duerst, M. and M. Suignard, “Internationalized Resource Identifiers (IRIs),” January 2005.).
2.7.2. Generation Notes
Certain characters are allowed in the node identifier, domain identifier, and resource identifier portions of a native XMPP address but prohibited by the "inodeid", "ihost", and "iresid" rules of an XMPP IRI. Specifically, the "#" and "?" characters are allowed in node identifiers, and the "/", "?", "#", and "@" characters are allowed in resource identifiers, but these characters are used as delimiters in XMPP IRIs. In addition, the " " ([US‑ASCII] (American National Standards Institute, “Coded Character Set - 7-bit American Standard Code for Information Interchange,” 1986.) space) character is allowed in resource identifiers but prohibited in IRIs. Therefore, all the foregoing characters MUST be percent-encoded when transforming an XMPP address into an XMPP IRI.
Consider the following nasty node in an XMPP address:
nasty!#$%()*+,-.;=?[\]^_`{|}~node@example.com
That address would be transformed into the following XMPP IRI (split into two lines for layout purposes):
xmpp:nasty!%23$%25()*+,-.;=%3F%5B%5C%5D%5E_%60%7B%7C%7D~node
@example.com
Consider the following repulsive resource in an XMPP address (split into two lines for layout purposes):
node@example.com
/repulsive !#"$%&'()*+,-./:;<=>?@[\]^_`{|}~resource
That address would be transformed into the following XMPP IRI (split into three lines for layout purposes):
xmpp:node@example.com
/repulsive%20!%23%22$%25&'()*+,-.%2F:;%3C=
%3E%3F%40%5B%5C%5D%5E_%60%7B%7C%7D~resource
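The two transformations above can be reproduced with a short Python sketch. It is a minimal illustration under stated assumptions: the helper names are hypothetical, the node, domain, and resource are supplied separately and are assumed to have already passed the relevant stringprep profiles, the domain is copied unchanged, and every character at or above U+00A0 is treated as allowed in an IRI (a simplification of the "ucschar" rule in [IRI]).

```python
import string

# ASCII characters permitted unescaped by the "inodeid" and "iresid" rules.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
NODE_ALLOW = set("!$()*+,;=")
RES_ALLOW = set("!$&'()*+,:;=")


def _escape(text, allowed):
    """Percent-encode every character not permitted by the given rule."""
    out = []
    for ch in text:
        if ch in allowed or ord(ch) >= 0xA0:  # assumption: approximates ucschar
            out.append(ch)
        else:
            out.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
    return "".join(out)


def address_to_iri(domain, node=None, resource=None):
    """Build an XMPP IRI from address parts (hypothetical helper)."""
    iri = "xmpp:"
    if node is not None:
        iri += _escape(node, UNRESERVED | NODE_ALLOW) + "@"
    iri += domain  # assumed to be a valid "ihost" already
    if resource is not None:
        iri += "/" + _escape(resource, UNRESERVED | RES_ALLOW)
    return iri
```

Taking the parts separately avoids having to guess which "@" or "/" in a native address is a delimiter, since both characters may legitimately appear inside a resource identifier.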
Furthermore, virtually any character outside the US-ASCII range [US‑ASCII] (American National Standards Institute, “Coded Character Set - 7-bit American Standard Code for Information Interchange,” 1986.) is allowed in an XMPP address and therefore also in an XMPP IRI, but URI syntax forbids such characters directly and specifies that such characters MUST be percent-encoded. In order to determine the URI associated with an XMPP IRI, an application MUST adhere to the mapping procedure specified in Section 3.1 of [IRI] (Duerst, M. and M. Suignard, “Internationalized Resource Identifiers (IRIs),” January 2005.).
The following table may assist implementors in understanding the respective encodings and "carrier units" of the identifiers discussed in this document, namely: (1) native XMPP addresses, (2) IRIs, and (3) URIs. For details, refer to Section 3.5 (Encoding Considerations) of this document as well as Section 3 of [XMPP‑CORE] (Saint-Andre, P., “Extensible Messaging and Presence Protocol (XMPP): Core,” October 2004.), Section 6.4 of [IRI] (Duerst, M. and M. Suignard, “Internationalized Resource Identifiers (IRIs),” January 2005.), and Section 2 of [URI] (Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifier (URI): Generic Syntax,” January 2005.).
+--------------+-----------+-----------+
| Identifier   | Encoding  | Units     |
+--------------+-----------+-----------+
| XMPP address | UTF-8     | Octets    |
+--------------+-----------+-----------+
| IRI          | Unicode   | 16/32-bit |
|              |           | values    |
+--------------+-----------+-----------+
| URI          | Percent-  | US-ASCII  |
|              | encoded   |           |
|              | UTF-8     |           |
+--------------+-----------+-----------+
2.7.3. Generation Example
Consider the following XMPP address:
<jiři@čechy.example/v Praze>
Note: The string "ř" stands for the Unicode character LATIN SMALL LETTER R WITH CARON, and the string "č" stands for the Unicode character LATIN SMALL LETTER C WITH CARON. The "&#x..." form is used in this document as a notational device to represent Unicode characters, following the "XML Notation" used in [IRI] (Duerst, M. and M. Suignard, “Internationalized Resource Identifiers (IRIs),” January 2005.) to represent characters that cannot be rendered in ASCII-only documents. An XMPP IRI MUST contain the Unicode characters themselves, not the representation in XML Notation (in particular, note that the "#" character is forbidden in IRI syntax). An XMPP URI MUST properly escape such characters, as described below. The '<' and '>' characters are not part of the address itself but are provided to set off the address for legibility. (For those who do not understand the Czech language, this example could be Anglicized as "george@czech-lands.example/In Prague".)
In accordance with the process specified above, the generating application would do the following to generate a valid XMPP IRI from this address:
The result is the following XMPP IRI (note again that, in accordance with the "XML Notation" used in [IRI] (Duerst, M. and M. Suignard, “Internationalized Resource Identifiers (IRIs),” January 2005.), the string "ř" stands for the Unicode character LATIN SMALL LETTER R WITH CARON and the string "č" stands for the Unicode character LATIN SMALL LETTER C WITH CARON; an XMPP IRI would contain the Unicode characters themselves).
<xmpp:jiři@čechy.example/v%20Praze>
In order to generate a valid XMPP URI from the foregoing IRI, the application MUST adhere to the procedure specified in Section 3.1 of [IRI] (Duerst, M. and M. Suignard, “Internationalized Resource Identifiers (IRIs),” January 2005.), resulting in the following URI:
<xmpp:ji%C5%99i@%C4%8Dechy.example/v%20Praze>
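The IRI-to-URI step in this example can be approximated in Python as shown below. This is a sketch only: the function name `iri_to_uri` is hypothetical, the input is assumed to already be a sequence of Unicode characters (so the preliminary conversion steps of Section 3.1 of [IRI] are taken as done), and every non-ASCII character is simply replaced by the percent-encoded octets of its UTF-8 form.

```python
def iri_to_uri(iri: str) -> str:
    """Percent-encode each non-ASCII character as UTF-8 octets (sketch)."""
    out = []
    for ch in iri:
        if ord(ch) < 0x80:
            out.append(ch)  # US-ASCII characters pass through unchanged
        else:
            out.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
    return "".join(out)


# The example IRI from this section, written with Python escapes for the
# two non-ASCII characters (U+0159 and U+010D).
example_iri = "xmpp:ji\u0159i@\u010dechy.example/v%20Praze"
example_uri = iri_to_uri(example_iri)
```

Existing percent-encoded sequences such as "%20" consist solely of US-ASCII characters, so they are left untouched by this step.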
2.8. Processing of XMPP IRIs/URIs
2.8.1. Processing Method
If a processing application is presented with an XMPP URI and not with an XMPP IRI, it MUST first convert the URI into an IRI by following the procedure specified in Section 3.2 of [IRI] (Duerst, M. and M. Suignard, “Internationalized Resource Identifiers (IRIs),” January 2005.).
In order to decompose an XMPP IRI for interaction with the entity it identifies, a processing application MUST separate:
At this point, the processing application MUST ensure that the resulting XMPP address conforms to the rules specified in [XMPP‑CORE] (Saint-Andre, P., “Extensible Messaging and Presence Protocol (XMPP): Core,” October 2004.), including application of the relevant stringprep profiles [STRINGPREP] (Hoffman, P. and M. Blanchet, “Preparation of Internationalized Strings ("STRINGPREP"),” December 2002.). The processing application then would either (1) complete further XMPP handling itself or (2) invoke a helper application to complete XMPP handling; such XMPP handling would most likely consist of the following steps:
2.8.2. Processing Notes
It may help implementors to note that the first two steps of "further XMPP handling", as described at the end of Section 2.8.1 (Processing Method), are similar to HTTP authentication [HTTP‑AUTH] (Franks, J., Hallam-Baker, P., Hostetler, J., Lawrence, S., Leach, P., Luotonen, A., and L. Stewart, “HTTP Authentication: Basic and Digest Access Authentication,” June 1999.), while the next three steps are similar to the handling of mailto: URIs [MAILTO] (Hoffman, P., Masinter, L., and J. Zawinski, “The mailto URL scheme,” July 1998.).
As noted in Section 2.7.2 (Generation Notes) of this document, certain characters are allowed in the node identifier, domain identifier, and resource identifier portions of a native XMPP address but prohibited by the "inodeid", "ihost", and "iresid" rules of an XMPP IRI. The percent-encoded octets corresponding to these characters in XMPP IRIs MUST be transformed into the characters allowed in XMPP addresses when processing an XMPP IRI for interaction with the represented XMPP entity.
Consider the following nasty node in an XMPP IRI (split into two lines for layout purposes):
xmpp:nasty!%23$%25()*+,-.;=%3F%5B%5C%5D%5E_%60%7B%7C%7D~node
@example.com
That IRI would be transformed into the following XMPP address:
nasty!#$%()*+,-.;=?[\]^_`{|}~node@example.com
Consider the following repulsive resource in an XMPP IRI (split into three lines for layout purposes):
xmpp:node@example.com
/repulsive%20!%23%22$%25&'()*+,-.%2F:;%3C
=%3E%3F%40%5B%5C%5D%5E_%60%7B%7C%7D~resource
That IRI would be transformed into the following XMPP address (split into two lines for layout purposes):
node@example.com
/repulsive !#"$%&'()*+,-./:;<=>?@[\]^_`{|}~resource
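The reverse transformation illustrated above can be sketched in Python as follows. The function name `iri_to_address` is hypothetical; the sketch discards any authority, query, and fragment components, relies on the fact that literal "@", "/", "?", and "#" characters inside node and resource identifiers are percent-encoded in the IRI, and leaves stringprep verification of the result to the caller.

```python
from urllib.parse import unquote


def iri_to_address(iri: str) -> str:
    """Recover the native XMPP address from the path of an XMPP IRI (sketch)."""
    if iri[:5].lower() != "xmpp:":
        raise ValueError("not an xmpp IRI")
    rest = iri[5:]
    # Drop the fragment and query components, which are not part of the address.
    rest = rest.split("#", 1)[0].split("?", 1)[0]
    if rest.startswith("//"):
        # Skip the authority component; the path follows the next "/".
        _, _, rest = rest[2:].partition("/")
    bare, slash, resource = rest.partition("/")
    if "@" in bare:
        node, _, domain = bare.partition("@")
        address = unquote(node) + "@" + unquote(domain)
    else:
        address = unquote(bare)
    if slash:
        address += "/" + unquote(resource)
    return address
```

Components are split on their delimiters before any percent-decoding takes place; decoding first would turn "%2F" or "%40" into characters that could be mistaken for delimiters.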
2.8.3. Processing Example
Consider the XMPP URI that resulted from the previous example:
<xmpp:ji%C5%99i@%C4%8Dechy.example/v%20Praze>
In order to generate a valid XMPP IRI from that URI, the application MUST adhere to the procedure specified in Section 3.2 of [IRI] (Duerst, M. and M. Suignard, “Internationalized Resource Identifiers (IRIs),” January 2005.), resulting in the following IRI:
<xmpp:jiři@čechy.example/v%20Praze>
In accordance with the process specified above, the processing application would remove the "xmpp" scheme and ":" character to extract the XMPP address from this XMPP IRI, converting any percent-encoded octets from the "inodeid", "ihost", and "iresid" rules into their character equivalents (e.g., "%20" into the space character).
The result is this XMPP address:
<jiři@čechy.example/v Praze>
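The URI-to-IRI step used in this example can be approximated in Python as shown below. It is a simplified sketch of the procedure in Section 3.2 of [IRI], under stated assumptions: the function name `uri_to_iri` is hypothetical, only percent-encoded octets that form valid multi-octet UTF-8 sequences are converted, percent-encoded US-ASCII characters (such as "%20") and invalid sequences are left encoded, and the additional exclusions that [IRI] places on certain characters are not checked.

```python
import re


def _decode_run(match):
    """Convert valid multi-octet UTF-8 sequences in a run of %XX escapes."""
    raw = bytes.fromhex(match.group(0).replace("%", ""))
    out, i = [], 0
    while i < len(raw):
        lead = raw[i]
        # Expected sequence length, judged from the lead octet.
        if 0xC2 <= lead <= 0xDF:
            size = 2
        elif 0xE0 <= lead <= 0xEF:
            size = 3
        elif 0xF0 <= lead <= 0xF4:
            size = 4
        else:
            size = 1  # US-ASCII or an invalid lead octet: keep it escaped
        chunk = raw[i:i + size]
        if size > 1 and len(chunk) == size:
            try:
                out.append(chunk.decode("utf-8"))
                i += size
                continue
            except UnicodeDecodeError:
                pass  # not valid UTF-8: fall through and keep the escape
        out.append("%{:02X}".format(lead))
        i += 1
    return "".join(out)


def uri_to_iri(uri: str) -> str:
    return re.sub(r"(?:%[0-9A-Fa-f]{2})+", _decode_run, uri)
```

Applied to the URI in this example, the sketch restores the two non-ASCII characters while leaving "%20" in place; converting "%20" into a space belongs to the later step of extracting the XMPP address.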
2.9. Internationalization
Because XMPP addresses are UTF-8 strings [UTF‑8] (Yergeau, F., “UTF-8, a transformation format of ISO 10646,” November 2003.) and because octets outside the US-ASCII range [US‑ASCII] (American National Standards Institute, “Coded Character Set - 7-bit American Standard Code for Information Interchange,” 1986.) within XMPP addresses can be easily converted to percent-encoded octets, XMPP addresses are designed to work well with Internationalized Resource Identifiers [IRI] (Duerst, M. and M. Suignard, “Internationalized Resource Identifiers (IRIs),” January 2005.). In particular, with the exceptions of stringprep verification, the conversion of syntax-relevant US-ASCII characters (e.g., "?"), and the conversion of percent-encoded octets from the "inodeid", "ihost", and "iresid" rules into their character equivalents (e.g., "%20" into the US-ASCII space character), an XMPP IRI can be constructed directly by prepending the "xmpp" scheme and ":" character to an XMPP address. Furthermore, an XMPP IRI can be converted into URI syntax by adhering to the procedure specified in Section 3.1 of [IRI] (Duerst, M. and M. Suignard, “Internationalized Resource Identifiers (IRIs),” January 2005.), and an XMPP URI can be converted into IRI syntax by adhering to the procedure specified in Section 3.2 of [IRI] (Duerst, M. and M. Suignard, “Internationalized Resource Identifiers (IRIs),” January 2005.), thus ensuring interoperability with applications that are able to process URIs but unable to process IRIs.
3. IANA Registration of xmpp URI Scheme
In accordance with [URI‑SCHEMES] (Hansen, T., Hardie, T., and L. Masinter, “Guidelines and Registration Procedures for New URI Schemes,” February 2006.), this section provides the information required to register the xmpp URI scheme.
3.1. URI Scheme Name
xmpp
3.2. Status
permanent
3.3. URI Scheme Syntax
The syntax for an xmpp URI is defined below using Augmented Backus-Naur Form as specified by [ABNF] (Crocker, D. and P. Overell, “Augmented BNF for Syntax Specifications: ABNF,” October 2005.), where the "fragment", "host", "pct-encoded", and "unreserved" rules are defined in [URI] (Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifier (URI): Generic Syntax,” January 2005.):
xmppuri   = "xmpp" ":" hierxmpp [ "?" querycomp ] [ "#" fragment ]
hierxmpp  = authpath / pathxmpp
authpath  = "//" authxmpp [ "/" pathxmpp ]
authxmpp  = nodeid "@" host
pathxmpp  = [ nodeid "@" ] host [ "/" resid ]
nodeid    = *( unreserved / pct-encoded / nodeallow )
nodeallow = "!" / "$" / "(" / ")" / "*" / "+" / "," / ";" / "="
resid     = *( unreserved / pct-encoded / resallow )
resallow  = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ":" / ";" / "="
querycomp = querytype [ *pair ]
querytype = *( unreserved / pct-encoded )
pair      = ";" key "=" value
key       = *( unreserved / pct-encoded )
value     = *( unreserved / pct-encoded )
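As a rough aid to reading the "nodeid" and "resid" rules, the following Python sketch expresses them as regular expressions. The names are hypothetical, and the sketch deliberately covers only these two rules: the "host" rule is defined in [URI] and is not reproduced here.

```python
import re

# unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~" (as defined in [URI])
_UNRESERVED = r"A-Za-z0-9\-._~"
_PCT = r"%[0-9A-Fa-f]{2}"

# nodeid = *( unreserved / pct-encoded / nodeallow )
NODEID_RE = re.compile(r"(?:[" + _UNRESERVED + r"!$()*+,;=]|" + _PCT + r")*")
# resid = *( unreserved / pct-encoded / resallow )
RESID_RE = re.compile(r"(?:[" + _UNRESERVED + r"!$&'()*+,:;=]|" + _PCT + r")*")


def is_valid_nodeid(text: str) -> bool:
    return NODEID_RE.fullmatch(text) is not None


def is_valid_resid(text: str) -> bool:
    return RESID_RE.fullmatch(text) is not None
```

The only difference between the two patterns is the set of additional literal characters: "&", "'", and ":" are permitted in a "resid" but not in a "nodeid".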
The xmpp URI scheme identifies entities that natively communicate using the Extensible Messaging and Presence Protocol (XMPP), and is mainly used for identification rather than for resource location. However, if an application that processes an xmpp URI enables interaction with the XMPP address identified by the URI, it MUST follow the methodology defined in Section 2 of this document, Use of XMPP IRIs and URIs, to reconstruct the encapsulated XMPP address, connect to an appropriate XMPP server, and send an appropriate XMPP "stanza" (XML fragment) to the XMPP address. (Note: There is no MIME type associated with the xmpp URI scheme.)
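The first step named above, reconstructing the encapsulated XMPP address, might look roughly like the following sketch. It only strips the scheme, fragment, query, and authority and then percent-decodes the path; stringprep verification, connection to a server, and stanza generation are out of scope. The function name xmpp_uri_to_address is hypothetical, and raising ValueError when no path component is present is a choice made for this example only.

```python
from urllib.parse import unquote


def xmpp_uri_to_address(uri: str) -> str:
    """Recover the XMPP address carried in the path of an xmpp URI (sketch)."""
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() != "xmpp":
        raise ValueError("not an xmpp URI")
    rest = rest.split("#", 1)[0]   # drop the fragment identifier, if any
    rest = rest.split("?", 1)[0]   # drop the query component, if any
    if rest.startswith("//"):
        # An authority component is present; the path follows the next "/".
        _authority, _slash, rest = rest[2:].partition("/")
    if not rest:
        raise ValueError("no path component, so no address to reconstruct")
    # Percent-encoded octets are decoded as UTF-8 (the default of unquote).
    return unquote(rest)
```

For instance, under these assumptions "xmpp:ji%C5%99i@%C4%8Dechy.example/v%20Praze" yields an address whose resource identifier is "v Praze", with the space restored.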
In addition to XMPP URIs, there will also be XMPP Internationalized Resource Identifiers (IRIs). Prior to converting an Extensible Messaging and Presence Protocol (XMPP) address into an IRI (and in accordance with [XMPP‑CORE] (Saint-Andre, P., “Extensible Messaging and Presence Protocol (XMPP): Core,” October 2004.)), the XMPP address must be represented as a string of UTF-8 characters [UTF‑8] (Yergeau, F., “UTF-8, a transformation format of ISO 10646,” November 2003.) by the generating application (e.g., by transforming an application's internal representation of the address as a UTF-16 string into a UTF-8 string). Because IRI syntax [IRI] (Duerst, M. and M. Suignard, “Internationalized Resource Identifiers (IRIs),” January 2005.) specifies that the characters in an IRI are the original Unicode characters themselves [UNICODE] (The Unicode Consortium, “The Unicode Standard, Version 3.2.0,” 2000.), when generating an XMPP IRI the generating application MUST decode the UTF-8 characters of a native XMPP address to their original Unicode form. Because URI syntax [URI] (Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifier (URI): Generic Syntax,” January 2005.) specifies that the characters in a URI are US-ASCII characters [US‑ASCII] (American National Standards Institute, “Coded Character Set - 7-bit American Standard Code for Information Interchange,” 1986.) only, when generating an XMPP URI the generating application MUST escape the Unicode characters of an XMPP IRI to US-ASCII characters by adhering to the procedure specified in RFC 3987.
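A generating application could approximate the escaping described above as follows. The sketch assumes that the address has already been split into node, domain, and resource parts and that Python 3.7 or later is used (so that "~" is left unescaped by urllib.parse.quote). The function name address_to_xmpp_uri and the constants NODE_SAFE and RES_SAFE are illustrative assumptions; the characters left unescaped are those of the "nodeallow" and "resallow" rules.

```python
from urllib.parse import quote

NODE_SAFE = "!$()*+,;="       # characters of the "nodeallow" rule
RES_SAFE = "!$&'()*+,:;="     # characters of the "resallow" rule


def address_to_xmpp_uri(domain: str, node: str = "", resource: str = "") -> str:
    """Build an xmpp URI from the parts of an XMPP address (sketch).

    quote() encodes each character as UTF-8 and percent-encodes every octet
    that is neither unreserved nor listed in `safe`.
    """
    uri = "xmpp:"
    if node:
        uri += quote(node, safe=NODE_SAFE) + "@"
    uri += quote(domain, safe="")
    if resource:
        uri += "/" + quote(resource, safe=RES_SAFE)
    return uri


# U+0159 and U+010D are written as escapes to avoid normalization ambiguity.
uri = address_to_xmpp_uri("\u010dechy.example", node="ji\u0159i", resource="v Praze")
```

Syntax-relevant characters such as "#", "%", and "?" inside a node identifier are escaped as well, so they cannot be mistaken for the start of a fragment, an escape sequence, or a query component.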
The xmpp URI scheme is intended to be used by interfaces to an XMPP network from non-native user agents, such as web browsers, as well as by non-native applications that need to identify XMPP entities as full URIs or IRIs.
There are no known interoperability concerns related to use of the xmpp URI scheme. In order to help ensure interoperability, the XMPP Registrar function of the XMPP Standards Foundation maintains a registry of query types and keys that can be used in the query components of XMPP URIs and IRIs, located at <http://www.xmpp.org/registrar/querytypes.html>.
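To illustrate how a registered query type and its keys appear in a query component, the following sketch splits a query component into its type and its key/value pairs according to the "querycomp" and "pair" rules. The function name parse_querycomp is hypothetical, and the "message" query type with "subject" and "body" keys is used here only as an illustrative assumption about the registry's contents.

```python
from urllib.parse import unquote


def parse_querycomp(query: str):
    """Split 'querytype;key=value;...' into (querytype, {key: value}) (sketch)."""
    querytype, *pairs = query.split(";")
    params = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep:
            # The "pair" rule requires an "=" between key and value.
            raise ValueError(f"malformed pair: {pair!r}")
        # Keys and values may carry percent-encoded octets.
        params[unquote(key)] = unquote(value)
    return unquote(querytype), params


qtype, params = parse_querycomp(
    "message;subject=Test%20Message;body=Here%27s%20a%20test%20message"
)
```

An application would then look up the query type and keys it recognizes and ignore or reject the rest, depending on its own policy.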
See Section 5 of this document, Security Considerations.
Peter Saint-Andre [mailto:stpeter@jabber.org, xmpp:stpeter@jabber.org]
This scheme is registered under the IETF tree. As such, the IETF maintains change control.
This document updates the URI scheme registration created by RFC 4622. The registration template can be found in Section 3 (IANA Registration of xmpp URI Scheme) of this document. In order to help ensure interoperability, the XMPP Registrar function of the XMPP Standards Foundation maintains a registry of query types and keys that can be used in the query components of XMPP URIs and IRIs, located at <http://www.xmpp.org/registrar/querytypes.html>.
Providing an interface to XMPP services from non-native applications introduces new security concerns. The security considerations discussed in [IRI] (Duerst, M. and M. Suignard, “Internationalized Resource Identifiers (IRIs),” January 2005.), [URI] (Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifier (URI): Generic Syntax,” January 2005.), and [XMPP‑CORE] (Saint-Andre, P., “Extensible Messaging and Presence Protocol (XMPP): Core,” October 2004.) apply to XMPP IRIs, and the security considerations discussed in [URI] (Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifier (URI): Generic Syntax,” January 2005.) and [XMPP‑CORE] (Saint-Andre, P., “Extensible Messaging and Presence Protocol (XMPP): Core,” October 2004.) apply to XMPP URIs. In accordance with Section 2.7 of [URI‑SCHEMES] (Hansen, T., Hardie, T., and L. Masinter, “Guidelines and Registration Procedures for New URI Schemes,” February 2006.) and Section 7 of [URI] (Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifier (URI): Generic Syntax,” January 2005.), particular security considerations are specified in the following sections.
Given that XMPP addresses of the form node@domain.tld are typically created via registration at an XMPP server or provisioned by an administrator of such a server, it is possible that such addresses may also be unregistered or deprovisioned. Therefore, the XMPP IRI/URI that identifies such an XMPP address may not reliably and consistently be associated with the same principal, account owner, application, or device.
XMPP addresses of the form node@domain.tld/resource are typically even more ephemeral (since a given XMPP resource identifier is typically associated with a particular, temporary session of an XMPP client at an XMPP server). Therefore, the XMPP IRI/URI that identifies such an XMPP address probably will not reliably and consistently be associated with the same session. However, the procedures specified in Section 10 of [XMPP‑CORE] (Saint-Andre, P., “Extensible Messaging and Presence Protocol (XMPP): Core,” October 2004.) effectively eliminate any potential confusion that might be introduced by the lack of reliability and consistency for the XMPP IRI/URI that identifies such an XMPP address.
XMPP addresses of the form domain.tld are typically long-lived XMPP servers or associated services; although naturally it is possible for server or service administrators to de-commission the server or service at any time, typically the IRIs/URIs that identify such servers or services are the most reliable and consistent of XMPP IRIs/URIs.
XMPP addresses of the form domain.tld/resource are not yet common on XMPP networks; however, the reliability and consistency of XMPP IRIs/URIs that identify such XMPP addresses would likely fall somewhere between those that identify XMPP addresses of the form domain.tld and those that identify XMPP addresses of the form node@domain.tld.
Malicious construction of XMPP IRIs/URIs is made less likely by the prohibition on port numbers in XMPP IRIs/URIs (since port numbers are to be discovered using DNS SRV records [DNS‑SRV] (Gulbrandsen, A., Vixie, P., and L. Esibov, “A DNS RR for specifying the location of services (DNS SRV),” February 2000.), as specified in [XMPP‑CORE] (Saint-Andre, P., “Extensible Messaging and Presence Protocol (XMPP): Core,” October 2004.)).
Because the base XMPP protocol is designed to implement the exchange of messages and presence information and not the retrieval of files or invocation of similar system functions, it is deemed unlikely that the use of XMPP IRIs/URIs would result in harmful dereferencing. However, if an XMPP protocol extension defines methods for information retrieval, it MUST define appropriate controls over access to that information. In addition, XMPP servers SHOULD NOT natively parse XMPP IRIs/URIs but instead SHOULD accept only the XML wire protocol specified in [XMPP‑CORE] (Saint-Andre, P., “Extensible Messaging and Presence Protocol (XMPP): Core,” October 2004.) and any desired extensions thereto.
The ability to interact with XMPP entities via a web browser or other non-native application may expose sensitive information (such as support for particular XMPP application protocol extensions) and thereby make it possible to launch attacks that are not possible or that are unlikely on a native XMPP network. Due care must be taken in deciding what information is appropriate for representation in XMPP IRIs or URIs.
In particular, advertising XMPP IRIs/URIs in publicly accessible locations (e.g., on websites) may make it easier for malicious users to harvest XMPP addresses from the authority and path components of XMPP IRIs/URIs and therefore to send unsolicited bulk communications to the users or applications represented by those addresses. Due care should be taken in balancing the benefits of open information exchange against the potential costs of unwanted communications.
To help prevent leaking of sensitive information, passwords and other user credentials are forbidden in the authority component of XMPP IRIs/URIs. Nor are they needed: because authentication in XMPP occurs via the Simple Authentication and Security Layer [SASL] (Melnikov, A. and K. Zeilenga, “Simple Authentication and Security Layer (SASL),” June 2006.), it is possible to use the SASL ANONYMOUS mechanism, if desired.
Despite the existence of non-hierarchical URI schemes such as [MAILTO] (Hoffman, P., Masinter, L., and J. Zawinski, “The mailto URL scheme,” July 1998.), by association human users may expect all URIs to include the "//" characters after the scheme name and ":" character. However, in XMPP IRIs/URIs, the "//" characters precede the authority component rather than the path component. Thus, xmpp://guest@example.com directs the processing application to authenticate as "guest@example.com", whereas xmpp:guest@example.com identifies the node "guest@example.com". Processing applications MUST clearly differentiate between these forms, and user agents SHOULD discourage human users from including the "//" characters in XMPP IRIs/URIs since use of the authority component is envisioned to be helpful only in specialized scenarios, not more generally.
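One way for a processing application to keep the two forms apart is to report the authority component and the path component separately, as in this minimal sketch. The function name split_xmpp_uri and the use of None for an absent component are assumptions made for the example.

```python
def split_xmpp_uri(uri: str):
    """Return (authority, path) for an xmpp IRI/URI; either may be None (sketch)."""
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() != "xmpp":
        raise ValueError("not an xmpp IRI/URI")
    # Query and fragment play no role in telling the two forms apart.
    rest = rest.split("#", 1)[0].split("?", 1)[0]
    if rest.startswith("//"):
        # "//" introduces the authority: the account to authenticate as.
        authority, _slash, path = rest[2:].partition("/")
        return authority, (path or None)
    # Without "//" the whole remainder is the path: the entity identified.
    return None, rest
```

With this split, "xmpp://guest@example.com" yields an authority and no path, while "xmpp:guest@example.com" yields a path and no authority, so the application never confuses the account to authenticate as with the entity being identified.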
The ability to include effectively the full range of Unicode characters in an XMPP IRI may make it easier to execute certain forms of address mimicking (also called "spoofing"). However, XMPP IRIs are no different from other IRIs in this regard, and applications that will present XMPP IRIs to human users must adhere to best practices regarding address mimicking in order to help prevent attacks that result from spoofed addresses (e.g., the phenomenon known as "phishing"). For details, refer to the Security Considerations of [IRI] (Duerst, M. and M. Suignard, “Internationalized Resource Identifiers (IRIs),” January 2005.).
Thanks to Martin Duerst, Lisa Dusseault, Frank Ellerman, Roy Fielding, Joe Hildebrand, and Ralph Meijer for their comments.
[ABNF] Crocker, D. and P. Overell, “Augmented BNF for Syntax Specifications: ABNF,” RFC 4234, October 2005.
[IRI] Duerst, M. and M. Suignard, “Internationalized Resource Identifiers (IRIs),” RFC 3987, January 2005.
[TERMS] Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” BCP 14, RFC 2119, March 1997.
[URI] Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifier (URI): Generic Syntax,” STD 66, RFC 3986, January 2005.
[XMPP-CORE] Saint-Andre, P., “Extensible Messaging and Presence Protocol (XMPP): Core,” RFC 3920, October 2004.
[CPIM] Peterson, J., “Common Profile for Instant Messaging (CPIM),” RFC 3860, August 2004.
[CPP] Peterson, J., “Common Profile for Presence (CPP),” RFC 3859, August 2004.
[DNS-SRV] Gulbrandsen, A., Vixie, P., and L. Esibov, “A DNS RR for specifying the location of services (DNS SRV),” RFC 2782, February 2000.
[HTML] Raggett, D., “HTML 4.0 Specification,” W3C REC REC-html40-19980424, April 1998.
[HTTP-AUTH] Franks, J., Hallam-Baker, P., Hostetler, J., Lawrence, S., Leach, P., Luotonen, A., and L. Stewart, “HTTP Authentication: Basic and Digest Access Authentication,” RFC 2617, June 1999.
[IDNA] Faltstrom, P., Hoffman, P., and A. Costello, “Internationalizing Domain Names in Applications (IDNA),” RFC 3490, March 2003.
[MAILTO] Hoffman, P., Masinter, L., and J. Zawinski, “The mailto URL scheme,” RFC 2368, July 1998.
[MIME] Freed, N. and N. Borenstein, “Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types,” RFC 2046, November 1996.
[SASL] Melnikov, A. and K. Zeilenga, “Simple Authentication and Security Layer (SASL),” RFC 4422, June 2006.
[STRINGPREP] Hoffman, P. and M. Blanchet, “Preparation of Internationalized Strings ("STRINGPREP"),” RFC 3454, December 2002.
[UNICODE] The Unicode Consortium, “The Unicode Standard, Version 3.2.0,” 2000. The Unicode Standard, Version 3.2.0 is defined by The Unicode Standard, Version 3.0 (Reading, MA, Addison-Wesley, 2000. ISBN 0-201-61633-5), as amended by the Unicode Standard Annex #27: Unicode 3.1 (http://www.unicode.org/reports/tr27/) and by the Unicode Standard Annex #28: Unicode 3.2 (http://www.unicode.org/reports/tr28/).
[URI-SCHEMES] Hansen, T., Hardie, T., and L. Masinter, “Guidelines and Registration Procedures for New URI Schemes,” RFC 4395, February 2006.
[US-ASCII] American National Standards Institute, “Coded Character Set - 7-bit American Standard Code for Information Interchange,” ANSI X3.4, 1986.
[UTF-8] Yergeau, F., “UTF-8, a transformation format of ISO 10646,” STD 63, RFC 3629, November 2003.
[XEP-0009] Adams, D., “Jabber-RPC,” XSF XEP 0009, February 2006.
[XEP-0030] Hildebrand, J., Millard, P., Eatmon, R., and P. Saint-Andre, “Service Discovery,” XSF XEP 0030, February 2007.
[XEP-0045] Saint-Andre, P., “Multi-User Chat,” XSF XEP 0045, April 2007.
[XEP-0053] Saint-Andre, P., “XMPP Registrar Function,” XSF XEP 0053, December 2006.
[XEP-0060] Millard, P., Saint-Andre, P., and R. Meijer, “Publish-Subscribe,” XSF XEP 0060, September 2006.
[XEP-0072] Forno, F. and P. Saint-Andre, “SOAP Over XMPP,” XSF XEP 0072, December 2005.
[XEP-0077] Saint-Andre, P., “In-Band Registration,” XSF XEP 0077, January 2006.
[XEP-0147] Saint-Andre, P., “XMPP URI Scheme Query Components,” XSF XEP 0147, September 2006.
[XMPP-IM] Saint-Andre, P., “Extensible Messaging and Presence Protocol (XMPP): Instant Messaging and Presence,” RFC 3921, October 2004.
Several errors were found in RFC 4622. This document corrects those errors. The resulting differences from RFC 4622 are as follows:
Regarding this entire document or any portion of it, the author makes no guarantees and is not responsible for any damage resulting from its use. The author grants irrevocable permission to anyone to use, modify, and distribute it in any way that does not diminish the rights of anyone else to use, modify, and distribute it, provided that redistributed derivative works do not contain misleading author or version information. Derivative works need not be licensed under similar terms.
Peter Saint-Andre
XMPP Standards Foundation
EMail: stpeter@jabber.org
URI: xmpp:stpeter@jabber.org
Copyright © The IETF Trust (2008).
This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.
This document and the information contained herein are provided on an “AS IS” basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.