Network Working Group                                          J. Miller
Internet-Draft                                            P. Saint-Andre
Expires: April 23, 2003                       Jabber Software Foundation
                                                        October 23, 2002

XMPP Core
draft-miller-xmpp-core-01

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on April 23, 2003.

Copyright Notice

Copyright (C) The Internet Society (2002). All Rights Reserved.

Abstract

This document describes the core features of the eXtensible Messaging and Presence Protocol (XMPP), which is used by numerous applications that are compatible with the open-source Jabber instant messaging system.




1. Introduction

1.1 Overview

The eXtensible Messaging and Presence Protocol (XMPP) is an open, XML-based protocol for near-real-time messaging and presence. Currently, there exist multiple implementations of the protocol, mostly offered under the name of Jabber. In addition, there are countless deployments of these implementations, which provide instant messaging and presence services at thousands of domains to millions of end users. The current document defines the core constituents of XMPP; the specific protocols necessary to provide basic instant messaging and presence functionality are defined in XMPP IM[2].

1.2 Conventions Used in this Document

The capitalized key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119[1].

1.3 Discussion Venue

The authors welcome discussion and comments related to the topics presented in this document, preferably on the "xmppwg@jabber.org" mailing list (archives and subscription information are available at http://www.jabber.org/cgi-bin/mailman/listinfo/xmppwg/).

1.4 Intellectual Property Notice

This document is in full compliance with all provisions of Section 10 of RFC 2026. Parts of this specification use the term "jabber" for identifying URI schemes, namespaces, and other protocol syntax. Jabber[tm] is a registered trademark of Jabber, Inc. Jabber, Inc. grants permission to the IETF for use of the Jabber trademark in association with this specification and its successors, if any.




2. Generalized Architecture

2.1 Overview

Although XMPP is not wedded to any specific network architecture, to this point it has usually been implemented via a typical client-server architecture, wherein a client utilizing XMPP accesses a server over a TCP[4] socket. While it can be helpful to keep that specific architecture in mind when seeking to understand XMPP, we have herein abstracted from any specific architecture and have described the architecture in a more generalized fashion.

The following diagram provides a high-level overview of this generalized architecture (where "-" represents communications that use XMPP and "=" represents communications that use any other protocol).

Connection Map

    S1       S2
     \      /
N1 -  H1 - H2 - N3
     /  \
N2 -     G1 = F1 = C1
        

The symbols are as follows:

2.2 Host

A host acts as an intelligent abstraction layer for XMPP communications. Its primary responsibility is to manage connections from or sessions for other entities (authorized nodes, services, and other hosts) and to route appropriately-addressed XML data among such entities. Most XMPP-compliant (Jabber) hosts also assume responsibility for the storage of data that is used by nodes or services (e.g., the contact list for each IM user, called a "roster"); in this case, the XML data is processed directly by the host itself on behalf of the node or service and is not routed to another entity.

2.3 Node

Most nodes connect directly to a host over a TCP socket and use XMPP to take full advantage of the functionality provided by a host and its associated services. (Clients on foreign messaging networks are also part of the architecture, made accessible via a gateway to that network.) Multiple resources (e.g., devices or locations) MAY connect simultaneously to a host on behalf of each authorized node, with each resource connecting over a discrete TCP socket and differentiated by the resource identifier of a JID (e.g., node@host/home vs. node@host/work). The port assigned by the IANA[5] for connections between a node and a host is 5222. For further details about node-to-host communications for the purpose of instant messaging and presence, refer to XMPP IM[2].

2.4 Service

In addition to the basic functionality provided by a host, additional functionality is made possible by connecting trusted services to a host. Examples include multi-user chat (a.k.a. conferencing), real-time alert systems, custom authentication modules, database connectivity, and translation to foreign messaging protocols. There is no set port on which services communicate with hosts; this is left up to the administrator of the service or host.

2.4.1 Gateway

A gateway is a special-purpose service whose primary function is to translate XMPP into the protocol(s) of another messaging system, as well as translate the return data back into XMPP. Examples are gateways to Internet Relay Chat (IRC), Short Message Service (SMS), SMTP, and foreign instant messaging networks such as Yahoo!, MSN, ICQ, and AIM.

2.5 Network

Because each host is identified by a network address (typically a DNS hostname) and because host-to-host communications are a simple extension of the node-to-host protocol, in practice the system consists of a network of hosts that inter-communicate. Thus node-a@host1 is able to exchange messages, presence, and other information with node-b@host2. This pattern is familiar from messaging protocols (such as SMTP) that make use of network addressing standards. The usual method for providing a connection between two hosts is to open a TCP socket on the IANA-assigned port 5269 and negotiate a connection using the Dialback Protocol.




3. Addressing Scheme

3.1 Overview

Any entity that can be considered a network endpoint (i.e., an ID on the network) and that can communicate using XMPP is considered a Jabber Entity. All such entities are uniquely addressable in a form that is consistent with the URI specification[11]. In particular, a valid Jabber Identifier (JID) contains a set of ordered elements formed of a domain identifier, node identifier, and resource identifier in the following format: [node@]domain[/resource].

All JIDs are based on the foregoing structure. The most common use of this structure is to identify an IM user, the host to which the user connects, and the user's active session or connection in the form of user@host/resource. However, other nodes are possible; for example, room-name@conference-service is a specific conference room that is offered by a multi-user chat service.
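The [node@]domain[/resource] format can be illustrated with a short Python sketch. It is not part of the protocol; the names JID and parse_jid are hypothetical, and the sketch assumes that neither the node identifier nor the domain identifier contains '@' or '/', so that the first '/' starts the resource and the first '@' before it ends the node.

```python
from typing import NamedTuple, Optional


class JID(NamedTuple):
    node: Optional[str]
    domain: str
    resource: Optional[str]


def parse_jid(text: str) -> JID:
    """Split '[node@]domain[/resource]' into its three identifiers."""
    # A resource may itself contain '/' or '@', so split it off first,
    # at the first '/'.
    rest, slash, resource = text.partition("/")
    # Whatever precedes the first '@' in the remainder is the node identifier.
    node, at, domain = rest.partition("@")
    if not at:
        node, domain = "", rest
    if not domain:
        raise ValueError("a JID requires a domain identifier")
    if at and not node:
        raise ValueError("empty node identifier before '@'")
    if slash and not resource:
        raise ValueError("empty resource identifier after '/'")
    return JID(node or None, domain, resource if slash else None)
```

For example, parse_jid("node@host/home") yields the node "node", the domain "host", and the resource "home", while a bare "host" yields only a domain identifier.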

3.2 Domain Identifier

The domain identifier is the primary identifier and is the only required element of a JID (a simple domain identifier is a valid JID). It usually represents the network gateway or "primary" host to which other entities connect for XML routing and data management capabilities. However, the entity referenced by a domain identifier is not always a host, and may be a service that is addressed as a subdomain of a host and that provides functionality above and beyond the capabilities of a host (e.g., a multi-user chat service or a gateway to a foreign messaging system).

The domain identifier for every host or service that will communicate over a network SHOULD resolve to a Fully Qualified Domain Name, and a domain identifier SHOULD conform to RFC 952[6] and RFC 1123[7]. Specifically, it is case-insensitive 7-bit ASCII and is limited to 255 bytes.

3.3 Node Identifier

The node identifier is an optional secondary identifier. It usually represents the entity requesting and using network access provided by the host (e.g., a client), although it can also represent other kinds of entities (e.g., a multi-user chat room associated with a conference service). The entity represented by a node identifier is addressed within the context of a specific domain. Node identifiers are restricted to 256 bytes. A node identifier may contain any Unicode character higher than #x20 with the exception of the following:

Case is preserved, but comparisons are made in case-normalized canonical form.

3.4 Resource Identifier

The resource identifier is an optional third identifier. It represents a specific session, connection (e.g., a device or location), or object (e.g., a participant in a multi-user chat room) belonging to a node. A node may maintain multiple resources simultaneously. A resource identifier is restricted to 256 bytes in length. A resource identifier MAY include any Unicode character greater than #x20, except #xFFFE and #xFFFF; if the Unicode character is a valid XML character as defined in Section 2.2 of the XML 1.0 specification[8], it MUST be suitably escaped for inclusion within an XML stream. Resource identifiers are case sensitive.
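The differing case rules for the three identifiers can be sketched in Python as follows. The function name canonical_jid is hypothetical. The sketch assumes that the byte limits are measured in UTF-8 and that NFKC normalization followed by case folding is an acceptable stand-in for the "case-normalized canonical form" of a node identifier; this document does not define either point.

```python
import unicodedata


def canonical_jid(node, domain, resource):
    """Return a tuple suitable for comparing two JIDs for equality."""
    if not domain.isascii() or len(domain.encode("utf-8")) > 255:
        raise ValueError("domain identifier must be 7-bit ASCII, at most 255 bytes")
    for name, value in (("node", node), ("resource", resource)):
        if value is not None and len(value.encode("utf-8")) > 256:
            raise ValueError(f"{name} identifier exceeds 256 bytes")
    # Assumption: NFKC plus casefold() approximates the case-normalized form.
    folded_node = None
    if node is not None:
        folded_node = unicodedata.normalize("NFKC", node).casefold()
    # Domains compare case-insensitively; resources are case sensitive.
    return (folded_node, domain.lower(), resource)
```

Under these assumptions, Node@HOST/home and node@host/home compare as equal, whereas node@host/home and node@host/Home do not.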

3.5 URIs

Full conformance with RFC 2396[11] would be valuable. This would most likely be effected through use of the 'im:' and 'pres:' URI schemes, resulting in URIs of the form "im:node@host" for exchanging instant messages and "pres:node@host" for exchanging presence. However, the use of such URIs has not yet been standardized.




4. XML Streams

4.1 Overview

Two fundamental concepts make possible the rapid, asynchronous exchange of relatively small payloads of structured information between presence-aware entities: XML streams and, as a result, discrete units of structured information that are referred to as "XML chunks". (Note: in this overview we use the example of communications between a node and host; however, XML streams are more generalized and are used for communications between a wide range of entities [see Scope].)

On connecting to a host, a node initiates an XML stream by sending a properly namespaced <stream:stream> tag, and the host replies with a second XML stream back to the node. Within the context of an XML stream, a sender is able to route a discrete semantic unit of structured information to any recipient. This unit of structured information is a well-balanced XML chunk, such as a message, presence, or IQ chunk (a chunk of an XML document is said to be well-balanced if it matches production [43] content of the XML 1.0 specification[8]). These chunks exist at the direct child level (depth=1) of the root stream element. The start of any XML chunk is unambiguously denoted by the element start tag at depth=1 (e.g., <presence>) and the end of any XML chunk is unambiguously denoted by the corresponding close tag at depth=1 (e.g., </presence>). Each XML chunk may contain child elements or CDATA sections as necessary in order to convey the desired information from the sender to the recipient. The session is closed at the node's request by sending a closing </stream:stream> tag to the host.

Thus a node's session with a host can be seen as two open-ended XML documents that are built up through the accumulation of the XML chunks that are sent over the course of the session (one from the node to the host and one from the host to the node). In essence, an XML stream acts as an envelope for all the XML chunks sent during a session. We can represent this graphically as follows:

|-------------------|
| open stream       |
|-------------------|
| <message to=''>   |
|   <body/>         |
| </message>        |
|-------------------|
| <presence to=''>  |
|   <show/>         |
| </presence>       |
|-------------------|
| <iq to=''>        |
|   <query/>        |
| </iq>             |
|-------------------|
| close stream      |
|-------------------|
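The depth=1 rule lends itself to incremental parsing. The following Python sketch is one possible way for a recipient to recover chunks from one direction of a session; the class name ChunkReader is hypothetical, and the sketch relies only on the standard library's XMLPullParser. It counts element depth and hands back each element whose close tag is seen at depth=1.

```python
from xml.etree.ElementTree import XMLPullParser


class ChunkReader:
    """Collect complete depth=1 children ("XML chunks") from one XML stream."""

    def __init__(self):
        self._parser = XMLPullParser(events=("start", "end"))
        self._depth = 0
        self._root = None
        self.closed = False

    def feed(self, data):
        """Feed raw stream text; return the chunks completed by this data."""
        self._parser.feed(data)
        chunks = []
        for event, elem in self._parser.read_events():
            if event == "start":
                if self._depth == 0:
                    self._root = elem        # the <stream:stream> element
                self._depth += 1
            else:
                self._depth -= 1
                if self._depth == 1:
                    chunks.append(elem)      # close tag seen at depth=1
                    self._root.remove(elem)  # keep the open-ended document small
                elif self._depth == 0:
                    self.closed = True       # closing </stream:stream>
        return chunks
```

Because the parser is namespace-aware, a <message> sent under the default namespace 'jabber:client' is reported with the tag "{jabber:client}message". Removing each finished chunk from the root keeps memory use bounded over a long-lived session.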
        

4.2 Scope

XML streams function as containers for any XML chunks sent asynchronously between network endpoints. (We now generalize those endpoints by using the terms "initiating entity" and "receiving entity".) XML streams are used for the following types of communication:

These usages are differentiated through the inclusion of a namespace declaration in the stream from the initiating entity, which is mirrored in the reply from the receiving entity:

4.3 Restrictions

XML streams are used to transport a subset of XML. Specifically, XML streams SHOULD NOT contain processing instructions, non-predefined entities (as defined in Section 4.6 of the XML 1.0 specification[8]), comments, or DTDs. Any such XML data SHOULD be ignored.

4.4 Elements and Attributes

The attributes of the stream element are as follows:

We can summarize these values as follows:

      |  initiating to receiving  |  receiving to initiating
------------------------------------------------------------
to    |  JID of receiver          |  ignored
from  |  ignored                  |  JID of receiver
id    |  ignored                  |  session key
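As an illustration of the table above, the following Python sketch renders an opening stream tag. The function name stream_header and its parameters are hypothetical; only attributes that are given a value are emitted, so an initiating entity would pass 'to' and a receiving entity would pass 'from' and 'id'.

```python
from xml.sax.saxutils import quoteattr

STREAM_NS = "http://etherx.jabber.org/streams"


def stream_header(default_ns, to=None, from_=None, stream_id=None):
    """Render the opening <stream:stream> tag (deliberately left unclosed)."""
    attrs = [
        ("to", to),
        ("from", from_),
        ("id", stream_id),
        ("xmlns", default_ns),
        ("xmlns:stream", STREAM_NS),
    ]
    # quoteattr() escapes the value and wraps it in quotes.
    rendered = " ".join(f"{k}={quoteattr(v)}" for k, v in attrs if v is not None)
    return f"<stream:stream {rendered}>"
```

A node would send stream_header("jabber:client", to="host"), and the host would answer with stream_header("jabber:client", from_="host", stream_id="id_123456789").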
        

The stream element also contains the following namespace declarations:

In addition to the common data elements, the stream element MAY also contain <stream:error/> as a child element signifying that a stream-level error has occurred.

4.5 Stream Errors

Errors may occur at the level of the stream. Examples include the sending of invalid XML, the shutdown of a host, an internal server error such as the shutdown of a session manager, and an attempt by a node to authenticate as the same resource that is currently connected. If an error occurs at the level of the stream, the entity (initiating entity or receiving entity) that detects the error should send a stream error to the other entity specifying why the streams are being closed and then send a closing </stream:stream> tag. XML of the following form is sent within the context of an existing stream:

<stream:stream ...>
...
<stream:error>
  Error message (e.g., "Invalid XML")
</stream:error>
</stream:stream>
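A minimal Python sketch of the sending side follows. The function name stream_error is hypothetical; the sketch simply escapes the human-readable message so that the error itself cannot introduce invalid XML, and appends the closing stream tag.

```python
from xml.sax.saxutils import escape


def stream_error(message):
    """Return the text that reports a stream-level error and closes the stream."""
    # escape() replaces '&', '<', and '>' so the message is safe as character data.
    return f"<stream:error>{escape(message)}</stream:error></stream:stream>"
```

The returned text is meant to be written to an already-open stream, after which the underlying connection can be closed.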
        

4.6 Example

The following is a simple stream-based session of a node on a host (where the NODE lines are sent from the node to the host, and the HOST lines are sent from the host to the node):

A simple session:

NODE: <stream:stream 
          to='host' 
          xmlns='jabber:client' 
          xmlns:stream='http://etherx.jabber.org/streams'>
HOST: <stream:stream 
          from='host' 
          id='id_123456789' 
          xmlns='jabber:client' 
          xmlns:stream='http://etherx.jabber.org/streams'>
NODE:   <message from='node@host' to='receiving-ID'> 
NODE:     <body>Watson come here, I need you!</body> 
NODE:   </message> 
HOST:   <message from='receiving-ID' to='node@host'> 
HOST:     <body>I'm on my way!</body> 
HOST:   </message> 
NODE: </stream:stream> 
HOST: </stream:stream>
        

These are in actuality a sending stream and a receiving stream, which can be viewed a-chronologically as two XML documents:

NODE: <stream:stream 
          to='host' 
          xmlns='jabber:client' 
          xmlns:stream='http://etherx.jabber.org/streams'>
NODE:   <message from='node@host' to='receiving-ID'> 
NODE:     <body>Watson come here, I need you!</body> 
NODE:   </message> 
NODE: </stream:stream> 

HOST: <stream:stream 
          from='host' 
          id='id_123456789' 
          xmlns='jabber:client' 
          xmlns:stream='http://etherx.jabber.org/streams'>
HOST:   <message from='receiving-ID' to='node@host'> 
HOST:     <body>I'm on my way!</body> 
HOST:   </message> 
HOST: </stream:stream>
        

A session gone bad:

NODE: <stream:stream 
          to='host' 
          xmlns='jabber:client' 
          xmlns:stream='http://etherx.jabber.org/streams'>
HOST: <stream:stream 
          from='host' 
          id='id_123456789' 
          xmlns='jabber:client' 
          xmlns:stream='http://etherx.jabber.org/streams'>
NODE: <message><body>Bad XML, no closing body tag!</message> 
HOST: <stream:error>Invalid XML</stream:error>
HOST: </stream:stream>
        

4.7 DTD

<!ELEMENT stream (#PCDATA | error)*>
<!ATTLIST stream
  to            CDATA  #REQUIRED
  from          CDATA  #IMPLIED
  id            CDATA  #IMPLIED
  xml:lang      CDATA  #IMPLIED>
<!ELEMENT error (#PCDATA)>
        

4.8 Schema

<?xml version='1.0' encoding='UTF-8'?>
<xsd:schema
    xmlns:xsd='http://www.w3.org/2001/XMLSchema'
    targetNamespace='http://etherx.jabber.org/streams'
    xmlns='http://etherx.jabber.org/streams'
    elementFormDefault='qualified'>

  <xsd:element name='stream'>
    <xsd:complexType mixed='true'>
      <xsd:element ref='error' minOccurs='0' maxOccurs='1'/>
      <xsd:choice>
        <xsd:any 
             namespace='jabber:client' 
             maxOccurs='1'/>
        <xsd:any 
             namespace='jabber:component:accept' 
             maxOccurs='1'/>
        <xsd:any 
             namespace='jabber:component:connect' 
             maxOccurs='1'/>
        <xsd:any 
             namespace='jabber:server' 
             maxOccurs='1'/>
        <xsd:any 
             namespace='http://www.iana.org/assignments/sasl-mechanisms' 
             maxOccurs='1'/>
      </xsd:choice>
      <xsd:attribute name='to' type='xsd:string' use='optional'/>
      <xsd:attribute name='from' type='xsd:string' use='optional'/>
      <xsd:attribute name='id' type='xsd:string' use='optional'/>
      <xsd:attribute name='xml:lang' type='xsd:string' use='optional'/>
    </xsd:complexType>
  </xsd:element>

  <xsd:element name='error' type='xsd:string'/>

</xsd:schema>
        



5. Stream Authentication

XMPP includes two methods for enforcing authentication at the level of XML streams. When one entity is already known to another (i.e., there is an existing trust relationship between the entities such as that established when a node registers with a host or an administrator configures a host to trust a service), the preferred method for authenticating streams between the two entities uses an XMPP adaptation of the Simple Authentication and Security Layer (SASL)[9]. When there is no existing trust relationship between the two entities, such trust MAY be established based on existing trust in DNS; the authentication method used when two such entities are hosts is the server dialback protocol that is native to XMPP. Both of these methods are described in this section.

5.1 SASL Authentication

5.1.1 Overview

The Simple Authentication and Security Layer (SASL) provides a generalized method for adding authentication support to connection-based protocols. XMPP uses a generic XML namespace profile for SASL that conforms to section 4 ("Profiling Requirements") of RFC 2222[9] (the namespace identifier for this protocol is http://www.iana.org/assignments/sasl-mechanisms). If an entity (node, host, or service) is capable of authenticating by means of SASL, it MUST include the agreed-upon SASL namespace within the <stream:stream> tag it uses to initiate communications.

The following example shows the use of SASL in node authentication with a host, for which the steps involved are as follows:

This series of challenge/response pairs continues until one of three things happens:

Any character data contained within these elements MUST be encoded using base64.

5.1.2 Example

The following example shows the data flow for a node authenticating with a host using SASL.

Step 1: Node initiates stream to host:

<stream:stream 
    xmlns='jabber:client'
    xmlns:stream='http://etherx.jabber.org/streams'
    xmlns:sasl='http://www.iana.org/assignments/sasl-mechanisms'
    to='capulet.com'
    version='1.0'>
          

Step 2: Host responds with a stream tag sent to the node:

<stream:stream 
    xmlns='jabber:client'
    xmlns:stream='http://etherx.jabber.org/streams'
    xmlns:sasl='http://www.iana.org/assignments/sasl-mechanisms'
    id='12345678'
    version='1.0'>
          

Step 3: Host informs node of available authentication mechanisms as well as support for TLS:

<sasl:features>
  <mechanisms xmlns='http://www.iana.org/assignments/sasl-mechanisms'>
    <mechanism>DIGEST-MD5</mechanism>
    <mechanism>PLAIN</mechanism>
  </mechanisms>
  <starttls xmlns='http://www.ietf.org/rfc/rfc2246.txt'/>
</sasl:features>
          

Step 4: Node selects an authentication mechanism:

<sasl:auth>DIGEST-MD5</sasl:auth>
          

Step 5: Host sends a challenge to the node:

<sasl:challenge>
    cmVhbG09ImNhdGFjbHlzbS5jeCIsbm9uY2U9Ik9BNk1HOXRFUUdtMmhoIi
    xxb3A9ImF1dGgiLGNoYXJzZXQ9dXRmLTgsYWxnb3JpdGhtPW1kNS1zZXNz
</sasl:challenge>
          

Step 6: Node responds to the challenge:

<sasl:response>
    dXNlcm5hbWU9InJvYiIscmVhbG09ImNhdGFjbHlzbS5jeCIsbm9uY2U9Ik
    9BNk1HOXRFUUdtMmhoIixjbm9uY2U9Ik9BNk1IWGg2VnFUclJrIixuYz0w
    MDAwMDAwMSxxb3A9YXV0aCxkaWdlc3QtdXJpPSJqYWJiZXIvY2F0YWNseX
    NtLmN4IixyZXNwb25zZT1kMzg4ZGFkOTBkNGJiZDc2MGExNTIzMjFmMjE0
    M2FmNyxjaGFyc2V0PXV0Zi04
</sasl:response>
          

Step 7: Host sends another challenge to the node:

<sasl:challenge>
    cnNwYXV0aD1lYTQwZjYwMzM1YzQyN2I1NTI3Yjg0ZGJhYmNkZmZmZA==
</sasl:challenge>
          

Step 8: Node responds to the challenge:

<sasl:response/>
          

Step 9: Host informs node of successful authentication:

<sasl:success/>
          

Step 9 (alt): Host informs node of failed authentication:

<sasl:failure/>
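The character data in the challenge and response elements above is base64-encoded. The following Python sketch shows how such a payload might be decoded and split into its comma-separated directives. The function names are hypothetical, and the directive parser is deliberately simplified: it strips surrounding quotes but does not process backslash escapes.

```python
import base64
import re

# key=value pairs where the value is either a quoted string or runs to the next comma
_PAIR = re.compile(r'([A-Za-z][\w-]*)=("(?:[^"\\]|\\.)*"|[^,]*)')


def decode_sasl_payload(character_data):
    """Base64-decode the character data of a challenge or response element."""
    compact = "".join(character_data.split())  # drop line breaks and indentation
    return base64.b64decode(compact).decode("utf-8")


def encode_sasl_payload(text):
    """Base64-encode text for inclusion in a challenge or response element."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def parse_directives(decoded):
    """Split a 'key=value,key="value"' string into a dict."""
    result = {}
    for key, value in _PAIR.findall(decoded):
        if value.startswith('"'):
            value = value[1:-1]
        result[key] = value
    return result
```

Applied to the challenge in Step 7, decode_sasl_payload returns a single "rspauth" directive, which is the host's proof that it also knows the shared credentials.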
          

5.1.3 DTD

The DTD for the sasl: namespace is as follows:

<!ELEMENT features ((mechanisms | starttls)*)>
<!ELEMENT mechanisms (mechanism)*>
<!ELEMENT mechanism (#PCDATA)>
<!ELEMENT starttls (#PCDATA)>
<!ELEMENT auth (#PCDATA)>
<!ELEMENT challenge (#PCDATA)>
<!ELEMENT response (#PCDATA)>
<!ELEMENT abort (#PCDATA)>
<!ELEMENT success (#PCDATA)>
<!ELEMENT failure (#PCDATA)>
        

5.1.4 Schema

<?xml version='1.0' encoding='UTF-8'?>
<xsd:schema
    xmlns:xsd='http://www.w3.org/2001/XMLSchema'
    targetNamespace='http://www.iana.org/assignments/sasl-mechanisms'
    xmlns='http://www.iana.org/assignments/sasl-mechanisms'
    elementFormDefault='qualified'>

  <xsd:element name='features'>
    <xsd:complexType>
      <xsd:choice> 
        <xsd:element ref='mechanisms' minOccurs='0' maxOccurs='1'/>
        <xsd:element ref='starttls' minOccurs='0' maxOccurs='1'/>
      </xsd:choice>
    </xsd:complexType>
  </xsd:element>

  <xsd:element name='mechanisms'>
    <xsd:complexType>
      <xsd:sequence minOccurs='0' maxOccurs='unbounded'>
        <xsd:element ref='mechanism'/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

  <xsd:element name='mechanism' type='xsd:string'/>
  <xsd:element name='starttls' type='xsd:string'/>
  <xsd:element name='auth' type='xsd:string'/>
  <xsd:element name='challenge' type='xsd:string'/>
  <xsd:element name='response' type='xsd:string'/>
  <xsd:element name='abort' type='xsd:string'/>
  <xsd:element name='success' type='xsd:string'/>
  <xsd:element name='failure' type='xsd:string'/>

</xsd:schema>
        

5.2 Dialback Authentication

XMPP includes a protocol-level method for verifying that a connection between two hosts may be trusted. The method is called dialback and is used only within XML streams that are declared under the "jabber:server" namespace.

The purpose of the dialback protocol is to make server spoofing more difficult, and thus to make it more difficult to forge XML chunks. Dialback is not intended as a mechanism for securing or encrypting the streams between servers, only for helping to prevent the spoofing of a hostname and the sending of false data from it. Dialback is made possible by the existence of DNS, since one host can verify that another host which is connecting to it is authorized to represent a given host on the Jabber network. All DNS host resolutions must first resolve the host using an SRV[10] record of _jabber._tcp.host. If the SRV lookup fails, the fallback is a normal A lookup to determine the IP address, using the jabber-server port of 5269 assigned by the Internet Assigned Numbers Authority[5].
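The resolution order just described (SRV first, then a plain A lookup on port 5269) can be sketched in Python as follows. The Python standard library has no SRV resolver, so the sketch takes two hypothetical callables supplied by the caller: srv_lookup(name), which returns a list of (target, port) pairs or raises LookupError, and a_lookup(name), which returns a list of IP address strings. SRV priority and weight handling are omitted.

```python
JABBER_SERVER_PORT = 5269


def resolve_host(hostname, srv_lookup, a_lookup):
    """Return (address, port) candidates for a host-to-host connection."""
    try:
        records = srv_lookup(f"_jabber._tcp.{hostname}")
    except LookupError:
        records = []
    if records:
        # Each SRV target still has to be resolved to addresses.
        return [(addr, port) for target, port in records for addr in a_lookup(target)]
    # SRV lookup failed: fall back to a normal A lookup on port 5269.
    return [(addr, JABBER_SERVER_PORT) for addr in a_lookup(hostname)]
```

Passing the lookups in as parameters keeps the fallback logic independent of any particular DNS library and makes it easy to exercise with canned answers.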

Note that the method used to generate and verify the keys used in the dialback protocol must take into account the hostnames being used, along with a secret known only within the originating host's network and the random id that the receiving host assigns per stream. Generating unique but verifiable keys is important to prevent common man-in-the-middle attacks and host spoofing.
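This document does not mandate a particular key algorithm. As one possible sketch, a keyed hash over the two hostnames and the stream id satisfies the properties listed above. The function names are hypothetical, and the choice of HMAC-SHA256 and of a space-separated input is an assumption made for illustration only.

```python
import hashlib
import hmac


def dialback_key(secret, receiving_host, originating_host, stream_id):
    """Derive a dialback key from the hostnames, the stream id, and a secret."""
    # secret is a bytes value that never leaves the network that verifies keys.
    message = " ".join((receiving_host, originating_host, stream_id)).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_dialback_key(secret, receiving_host, originating_host, stream_id, key):
    """Recompute the key and compare it in constant time."""
    expected = dialback_key(secret, receiving_host, originating_host, stream_id)
    return hmac.compare_digest(expected.encode("utf-8"), key.encode("utf-8"))
```

Because the stream id is part of the input, a key captured from one stream does not verify on another, and because the secret is never transmitted, a third party cannot mint valid keys for the hostname.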

In the description that follows we use the following terminology:

The following is a brief summary of the order of events in dialback:

  1. Originating Host establishes a connection to Receiving Host.
  2. Originating Host sends a 'key' value over the connection to Receiving Host.
  3. Receiving Host establishes a connection to Authoritative Host.
  4. Receiving Host sends the same 'key' value to Authoritative Host.
  5. Authoritative Host replies that key is valid or invalid.
  6. Receiving Host tells Originating Host whether it is authenticated or not.

We can represent this flow of events graphically as follows:

Originating               Receiving
   Host                     Host
-----------               ---------
    |                         |
    |  establish connection   |
    | ----------------------> |
    |                         |
    |   send stream header    |
    | ----------------------> |
    |                         |
    |  establish connection   |
    | <---------------------- |
    |                         |
    |   send stream header    |
    | <---------------------- |
    |                         |                   Authoritative
    |   send dialback key     |                       Host
    | ----------------------> |                   -------------
    |                         |                         |
                              |  establish connection   |
                              | ----------------------> |
                              |                         |
                              |   send stream header    |
                              | ----------------------> |
                              |                         |
                              |   send stream header    |
                              | <---------------------- |
                              |                         |
                              |   send dialback key     |
                              | ----------------------> |
                              |                         |
                              |  validate dialback key  |
                              | <---------------------- |
                              |
    |  report dialback result |
    | <---------------------- |
    |                         |
        

5.2.1 Dialback Protocol

The traffic sent between the hosts is as follows:

  1. Originating Host establishes connection to Receiving Host
  2. Originating Host sends a stream header to Receiving Host (the 'to' and 'from' attributes are not required):
    <stream:stream 
        xmlns:stream='http://etherx.jabber.org/streams'
        xmlns='jabber:server' 
        xmlns:db='jabber:server:dialback'>
                  

    Note: the value of the xmlns:db namespace declaration indicates to Receiving Host that the Originating Host supports dialback.

  3. Receiving Host sends a stream header back to Originating Host (the 'to' and 'from' attributes are not required):
    <stream:stream 
        xmlns:stream='http://etherx.jabber.org/streams'
        xmlns='jabber:server' 
        xmlns:db='jabber:server:dialback'
        id='457F9224A0...'>
                  
  4. Originating Host sends a dialback key to Receiving Host:
    <db:result 
        to='Receiving Host' 
        from='Originating Host'>
      98AF014EDC0...
    </db:result>
                  

    Note: this key is not examined by Receiving Host, since the Receiving Host does not keep information about Originating Host between sessions.

  5. Receiving Host now establishes a connection back to the domain asserted by Originating Host, and thereby connects to the Authoritative Host.
  6. Receiving Host sends Authoritative Host a stream header (the 'to' and 'from' attributes are not required):
    <stream:stream 
        xmlns:stream='http://etherx.jabber.org/streams'
        xmlns='jabber:server' 
        xmlns:db='jabber:server:dialback'>
                  
  7. Authoritative Host sends Receiving Host a stream header:
    <stream:stream 
        xmlns:stream='http://etherx.jabber.org/streams'
        xmlns='jabber:server' 
        xmlns:db='jabber:server:dialback' 
        id='1251A342B...'>
                    
  8. Receiving Host sends Authoritative Host a chunk indicating it wants Authoritative Host to verify a key:
    <db:verify 
        from='Receiving Host' 
        to='Originating Host' 
        id='457F9224A0...'>
      98AF014EDC0...
    </db:verify>
                    

    Note: passed here are the hostnames, the original identifier from Receiving Host's stream header to Originating Host in step 3, and the key Originating Host gave Receiving Host in step 4. Based on this information and shared secret information within the 'Originating Host' network, the key is verified. Any verifiable method can be used to generate the key.

  9. Authoritative Host sends a chunk back to Receiving Host indicating whether the key was valid or invalid:
    <db:result 
        from='Originating Host' 
        to='Receiving Host' 
        type='valid'
        id='457F9224A0...'/>
                    
    or
    <db:result 
        from='Originating Host' 
        to='Receiving Host' 
        type='invalid'
        id='457F9224A0...'/>
                    
  10. Receiving Host informs Originating Host of the result:
    <db:result 
        from='Receiving Host' 
        to='Originating Host' 
        type='valid'/>
                    

    Note: At this point the connection has either been validated via a type='valid', or reported as invalid. Once the connection is validated, data can be sent by the Originating Host and read by the Receiving Host; before that, all data chunks sent to Receiving Host SHOULD be dropped. As a final guard against domain spoofing, the Receiving Host MUST verify that all XML chunks received from the Originating Host include a 'from' attribute and that the 'from' address of each chunk includes the validated domain. In addition, all XML chunks of type message, presence, and IQ MUST include a 'to' attribute.
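The final checks in the note above can be sketched in Python as follows. The function name accept_chunk is hypothetical, and the sketch assumes that "includes the validated domain" means the domain identifier of the 'from' address equals the validated domain, compared case-insensitively.

```python
def accept_chunk(validated_domain, kind, attrs):
    """Apply the final anti-spoofing checks to a chunk from the Originating Host.

    kind is the element name (e.g. "message"); attrs is its attribute dict.
    """
    sender = attrs.get("from")
    if not sender:
        return False  # a 'from' attribute is mandatory
    # Isolate the domain identifier of '[node@]domain[/resource]'.
    domain = sender.partition("/")[0].rpartition("@")[2]
    if domain.lower() != validated_domain.lower():
        return False  # 'from' lies outside the validated domain
    if kind in ("message", "presence", "iq") and not attrs.get("to"):
        return False  # these chunks also need a 'to' attribute
    return True
```

A Receiving Host would run such a check on every chunk that arrives over a validated connection and drop any chunk for which it returns False.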




6. Common Data Elements

6.1 Overview

The common data elements for XMPP communications are <message/>, <presence/>, and <iq/>. These data elements are sent as direct children of the root <stream:stream/> element.

6.2 The Message Element

This section describes the valid attributes and child elements of the message element.

6.2.1 Attributes

A message chunk may possess the following attributes:

6.2.2 Children

A message chunk MAY contain zero or one of each of the following child elements (which may not contain mixed content):

As previously described under extended namespaces, a message chunk MAY also contain any properly-namespaced child element (other than the common data elements, stream elements, or defined children thereof).
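As an illustration, the following Python sketch serializes a message chunk with at most one of each child element. The function name build_message is hypothetical; the child element names and the permitted 'type' values are taken from the DTD in the next section.

```python
import xml.etree.ElementTree as ET

MESSAGE_TYPES = {"normal", "chat", "groupchat", "headline", "error"}


def build_message(to, body=None, subject=None, thread=None, msg_type=None):
    """Serialize a <message/> chunk with at most one of each child element."""
    if msg_type is not None and msg_type not in MESSAGE_TYPES:
        raise ValueError(f"unknown message type: {msg_type!r}")
    message = ET.Element("message", {"to": to})
    if msg_type is not None:
        message.set("type", msg_type)
    for tag, text in (("subject", subject), ("body", body), ("thread", thread)):
        if text is not None:
            # Character data is escaped automatically on serialization.
            ET.SubElement(message, tag).text = text
    return ET.tostring(message, encoding="unicode")
```

The result carries no namespace declaration of its own, since a chunk inherits the default namespace declared on the enclosing stream element.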

6.2.3 DTD

<!ELEMENT message (#PCDATA | body | subject | thread | error)*>

<!ATTLIST message
  to CDATA #IMPLIED
  from CDATA #IMPLIED
  id CDATA #IMPLIED
  type ( normal | chat | groupchat | headline | error ) #IMPLIED
  xml:lang CDATA #IMPLIED
>

<!ELEMENT body (#PCDATA)>
<!ELEMENT subject (#PCDATA)>
<!ELEMENT thread (#PCDATA)>
<!ELEMENT error (#PCDATA)>
<!ATTLIST error code CDATA #REQUIRED>
          

6.2.4 Schema

<?xml version='1.0' encoding='UTF-8'?>
<xsd:schema
    xmlns:xsd='http://www.w3.org/2001/XMLSchema'
    targetNamespace='http://www.jabber.org/protocol'
    xmlns='http://www.jabber.org/protocol'
    elementFormDefault='qualified'>

  <xsd:element name='message'>
     <xsd:complexType mixed='true'>
        <xsd:choice> 
           <xsd:element ref='body' minOccurs='0' maxOccurs='1'/>
           <xsd:element ref='subject' minOccurs='0' maxOccurs='1'/>
           <xsd:element ref='thread' minOccurs='0' maxOccurs='1'/>
           <xsd:element ref='error' minOccurs='0' maxOccurs='1'/>
           <xsd:any 
               namespace='##other' 
               minOccurs='0' 
               maxOccurs='unbounded'/>
        </xsd:choice>
        <xsd:attribute name='to' type='xsd:string' use='optional'/>
        <xsd:attribute name='from' type='xsd:string' use='optional'/>
        <xsd:attribute name='id' type='xsd:string' use='optional'/>
        <xsd:attribute name='type' use='optional' default='normal'>
          <xsd:simpleType>
            <xsd:restriction base='xsd:NCName'>
              <xsd:enumeration value='normal'/>
              <xsd:enumeration value='chat'/>
              <xsd:enumeration value='groupchat'/>
              <xsd:enumeration value='headline'/>
              <xsd:enumeration value='error'/>
            </xsd:restriction>
          </xsd:simpleType>
        </xsd:attribute>
        <xsd:attribute name='xml:lang' type='xsd:string' use='optional'/>
     </xsd:complexType>
  </xsd:element>

  <xsd:element name='body' type='xsd:string'/>
  <xsd:element name='subject' type='xsd:string'/>
  <xsd:element name='thread' type='xsd:string'/>
  <xsd:element name='error'>
    <xsd:complexType>
      <xsd:attribute 
          name='code' 
          type='xsd:nonNegativeInteger' 
          use='required'/>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
          

6.3 The Presence Element

The <presence/> element is used to express an entity's current availability status (offline or online, along with various sub-states of the latter) and to communicate that status to other entities. It is also used to negotiate and manage subscriptions to the presence of other entities.

6.3.1 Attributes

A presence chunk MAY possess the following attributes:

6.3.2 Children

A presence chunk may contain zero or one of each of the following child elements:

As previously described under extended namespaces, a presence chunk MAY also contain any properly-namespaced child element (other than the common data elements, stream elements, or defined children thereof).
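The following Python sketch shows one way an application might construct a presence chunk. It is illustrative only: the function name build_presence is hypothetical, and the permitted values for 'type', <show/>, and <priority/> are taken from the DTD and schema in this section.

```python
import xml.etree.ElementTree as ET

# Values permitted by the DTD and schema in this section.
PRESENCE_TYPES = {"subscribe", "subscribed", "unsubscribe",
                  "unsubscribed", "unavailable", "error"}
SHOW_VALUES = {"away", "chat", "xa", "dnd"}


def build_presence(presence_type=None, to=None, show=None, status=None, priority=None):
    """Build a <presence/> element, rejecting values the schema does not allow."""
    if presence_type is not None and presence_type not in PRESENCE_TYPES:
        raise ValueError("unknown presence type: %r" % presence_type)
    if show is not None and show not in SHOW_VALUES:
        raise ValueError("unknown show value: %r" % show)
    if priority is not None and (not isinstance(priority, int) or priority < 0):
        raise ValueError("priority must be a non-negative integer")
    attrs = {}
    if presence_type is not None:
        attrs["type"] = presence_type
    if to is not None:
        attrs["to"] = to
    presence = ET.Element("presence", attrs)
    if show is not None:
        ET.SubElement(presence, "show").text = show
    if status is not None:
        ET.SubElement(presence, "status").text = status
    if priority is not None:
        ET.SubElement(presence, "priority").text = str(priority)
    return presence


print(ET.tostring(build_presence(show="away", status="At lunch", priority=1),
                  encoding="unicode"))
```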

6.3.3 DTD

<!ELEMENT presence (( show? | status? | priority? | error? )*)>

<!ATTLIST presence
  to CDATA #IMPLIED
  from CDATA #IMPLIED
  id CDATA #IMPLIED
  type ( subscribe | subscribed | unsubscribe | 
         unsubscribed | unavailable | error ) #IMPLIED
  xml:lang CDATA #IMPLIED
>

<!ELEMENT show (#PCDATA)>
<!ELEMENT status (#PCDATA)>
<!ELEMENT priority (#PCDATA)>
<!ELEMENT error (#PCDATA)>
<!ATTLIST error code CDATA #REQUIRED>
          

6.3.4 Schema

<?xml version='1.0' encoding='UTF-8'?>
<xsd:schema
    xmlns:xsd='http://www.w3.org/2001/XMLSchema'
    targetNamespace='http://www.jabber.org/protocol'
    xmlns='http://www.jabber.org/protocol'
    elementFormDefault='qualified'>

  <xsd:element name='presence'>
    <xsd:complexType>
      <xsd:choice>
        <xsd:element ref='show' minOccurs='0' maxOccurs='1'/>
        <xsd:element ref='status' minOccurs='0' maxOccurs='1'/>
        <xsd:element ref='priority' minOccurs='0' maxOccurs='1'/>
        <xsd:element ref='error' minOccurs='0' maxOccurs='1'/>
        <xsd:any 
            namespace='##other' 
            minOccurs='0' 
            maxOccurs='unbounded'/>
      </xsd:choice>
      <xsd:attribute name='to' type='xsd:string' use='optional'/>
      <xsd:attribute name='from' type='xsd:string' use='optional'/>
      <xsd:attribute name='id' type='xsd:string' use='optional'/>
      <xsd:attribute name='type' use='optional'>
        <xsd:simpleType>
          <xsd:restriction base='xsd:string'>
            <xsd:enumeration value='unavailable'/>
            <xsd:enumeration value='subscribe'/>
            <xsd:enumeration value='subscribed'/>
            <xsd:enumeration value='unsubscribe'/>
            <xsd:enumeration value='unsubscribed'/>
            <xsd:enumeration value='error'/>
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:attribute>
      <xsd:attribute name='xml:lang' type='xsd:string' use='optional'/>
    </xsd:complexType>
  </xsd:element>

  <xsd:element name='show'>
    <xsd:simpleType>
      <xsd:restriction base='xsd:string'>
        <xsd:enumeration value='away'/>
        <xsd:enumeration value='chat'/>
        <xsd:enumeration value='xa'/>
        <xsd:enumeration value='dnd'/>
      </xsd:restriction>
    </xsd:simpleType>
  </xsd:element>
  <xsd:element name='status' type='xsd:string'/>
  <xsd:element name='priority' type='xsd:nonNegativeInteger'/>
  <xsd:element name='error'>
    <xsd:complexType>
      <xsd:attribute 
          name='code' 
          type='xsd:nonNegativeInteger' 
          use='required'/>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
          

6.4 The IQ Element

6.4.1 Overview

Info/Query, or IQ, is a simple request-response mechanism. Just as HTTP is a request-response medium, the iq element enables an entity to make a request of, and receive a response from, another entity. The data content of the request and response is defined by the namespace declaration of a direct child element of the iq element.

Most IQ interactions follow a common pattern of structured data exchange such as get/result or set/result:

Requesting               Responding
  Entity                   Entity
----------               ----------
    |                        |
    |    <iq type="get">     |
    | ---------------------> |
    |                        |
    |   <iq type="result">   |
    | <--------------------- |
    |                        |
    |    <iq type="set">     |
    | ---------------------> |
    |                        |
    |   <iq type="result">   |
    | <--------------------- |
    |                        |
          
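The exchange in the diagram above can be sketched from the Responding Entity's side as follows. This Python fragment is an illustration rather than a normative procedure. The function name make_result is hypothetical, and the sketch assumes that a response echoes the 'id' of the request and swaps its 'to' and 'from' addresses.

```python
import xml.etree.ElementTree as ET


def make_result(request):
    """Build an <iq type='result'/> answering an iq chunk of type get or set."""
    if request.tag != "iq" or request.get("type") not in ("get", "set"):
        raise ValueError("only iq chunks of type get or set are answered with a result")
    attrs = {"type": "result"}
    # Assumption: the result carries the request's id so the requester can match it.
    if request.get("id") is not None:
        attrs["id"] = request.get("id")
    # Assumption: the addresses are swapped so the result travels back to the requester.
    if request.get("from") is not None:
        attrs["to"] = request.get("from")
    if request.get("to") is not None:
        attrs["from"] = request.get("to")
    return ET.Element("iq", attrs)


request = ET.fromstring(
    "<iq type='get' id='q1' from='romeo@montague.net/orchard' to='montague.net'/>")
print(ET.tostring(make_result(request), encoding="unicode"))
```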

6.4.2 Attributes

An IQ chunk MAY possess the following attributes:

6.4.3 Children

In the strictest terms, the iq element contains no children since it is a vessel for XML in another namespace. As previously described under extended namespaces, an IQ chunk MAY contain any properly-namespaced child element (other than the common data elements, stream elements, or defined children thereof).

If the IQ is of type="error", the <iq/> chunk MUST include an <error/> child, which in turn MUST have a 'code' attribute corresponding to one of the standard error codes and MAY also contain PCDATA corresponding to a natural-language description of the error.
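The requirement on IQ chunks of type "error" can be checked mechanically. The Python sketch below is one possible check and is not normative. The function name satisfies_error_rule is hypothetical, and the sketch only verifies that the 'code' attribute is a non-negative integer; it does not compare the value against the list of standard error codes.

```python
import xml.etree.ElementTree as ET


def satisfies_error_rule(chunk_xml):
    """Return False only when an iq chunk of type 'error' lacks a usable <error/> child."""
    iq = ET.fromstring(chunk_xml)
    if iq.tag != "iq" or iq.get("type") != "error":
        return True  # the rule constrains only iq chunks of type "error"
    error = iq.find("error")
    if error is None:
        return False  # the <error/> child is mandatory
    code = error.get("code")
    # The 'code' attribute is mandatory; any PCDATA description is optional.
    return code is not None and code.isascii() and code.isdigit()
```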

6.4.4 DTD

<!ELEMENT iq ( #PCDATA | error )*>

<!ATTLIST iq
  to CDATA #IMPLIED
  from CDATA #IMPLIED
  id CDATA #IMPLIED
  type ( get | set | result | error ) #REQUIRED
  xml:lang CDATA #IMPLIED
>

<!ELEMENT error (#PCDATA)>
<!ATTLIST error code CDATA #REQUIRED>
          

6.4.5 Schema

<?xml version='1.0' encoding='UTF-8'?>
<xsd:schema
    xmlns:xsd='http://www.w3.org/2001/XMLSchema'
    targetNamespace='http://www.jabber.org/protocol'
    xmlns='http://www.jabber.org/protocol'
    elementFormDefault='qualified'>

  <xsd:element name='iq'>
    <xsd:complexType mixed='true'>
      <xsd:choice> 
        <xsd:element ref='error' minOccurs='0' maxOccurs='1'/>
        <xsd:any 
            namespace='##other' 
            minOccurs='0' 
            maxOccurs='unbounded'/>
      </xsd:choice>
      <xsd:attribute name='to' type='xsd:string' use='optional'/>
      <xsd:attribute name='from' type='xsd:string' use='optional'/>
      <xsd:attribute name='id' type='xsd:string' use='optional'/>
      <xsd:attribute name='type' use='required'>
        <xsd:simpleType>
          <xsd:restriction base='xsd:string'>
            <xsd:enumeration value='get'/>
            <xsd:enumeration value='set'/>
            <xsd:enumeration value='result'/>
            <xsd:enumeration value='error'/>
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:attribute>
      <xsd:attribute name='xml:lang' type='xsd:string' use='optional'/>
    </xsd:complexType>
  </xsd:element>

  <xsd:element name='error'>
    <xsd:complexType>
      <xsd:attribute 
          name='code' 
          type='xsd:nonNegativeInteger' 
          use='required'/>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
          



7. XML Usage within XMPP

7.1 Overview

In essence, XMPP core consists of three interrelated parts:

  1. XML streams, which provide a stateful means for transporting data in an asynchronous manner from one entity to another
  2. stream authentication using SASL authentication or the dialback protocol
  3. common data elements (message, presence, and iq), which provide a framework for communications between entities

XML[8] is used to define each of these protocols, as described in detail in the following sections.

In addition, XMPP contains protocol extensions (such as extended namespaces) that address the specific functionality required to create a basic instant messaging and presence application; these non-core protocol extensions are defined in XMPP IM[2].

7.2 Namespaces

XML Namespaces[12] are used within all XMPP-compliant XML to create strict boundaries of data ownership. The basic function of namespaces is to separate different vocabularies of XML elements that are structurally mixed together. Ensuring that XMPP-compliant XML is namespace-aware enables any XML to be structurally mixed with any data element within XMPP. This feature is relied upon frequently within XMPP to separate the XML that is processed by different services.

Additionally, XMPP is more strict about namespace prefixes than the XML namespace specification requires.

7.3 Validation

A host is not responsible for validating the XML elements forwarded to a node; an implementation MAY choose to provide only validated data elements but is not REQUIRED to do so. Nodes and services SHOULD NOT rely on the ability to send data which does not conform to the schemas, and SHOULD ignore any non-conformant elements or attributes on the incoming XML stream.

7.4 Extended Namespaces

While the common data elements defined in this document provide a basic level of functionality for messaging and presence, XMPP uses XML namespaces to extend the common data elements for the purpose of providing additional functionality. Thus a message, presence, or iq element may house an optional element containing content that extends the meaning of the message (e.g., an encrypted form of the message body). In XMPP usage this child element is often the <x/> element (in message and presence chunks) or the <query/> element (in IQ chunks), but it MAY be any element (other than the common data elements, stream elements, or defined children thereof). The child element MUST possess an 'xmlns' namespace declaration (other than those defined for XML streams) that defines all data contained within the child element. Note that the extended namespaces accepted by the Jabber Software Foundation[13] all begin with the string 'jabber:x' or 'jabber:iq' (e.g., jabber:iq:register).

7.5 Handling of Extended Namespaces

Support for extended namespaces is OPTIONAL on the part of any implementation. If an entity does not understand such a child element or its namespace, it must ignore the associated XML data. If an entity receives an IQ chunk in a namespace it does not understand, the entity SHOULD return an IQ chunk of type "error" with an error element of code 400 (bad request). If an entity receives a message or presence chunk that contains XML data in an extended namespace it does not understand, the portion of the chunk that is in the unknown namespace SHOULD be ignored. If an entity receives a message chunk without a <body/> element but with a child element bound by a namespace it does not understand, it MUST ignore that chunk.
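The handling rules above can be summarized in a short Python sketch. It is an illustration under stated assumptions, not a reference implementation: the function names and the returned action strings are hypothetical, the chunks are parsed on their own so the common data elements carry no namespace, and the set of understood namespaces is supplied by the caller.

```python
import xml.etree.ElementTree as ET


def namespace_of(element):
    """Return the namespace URI of an element, or '' if it has none."""
    tag = element.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def handle(chunk_xml, understood):
    """Return (action, chunk) for a chunk given the set of understood namespaces."""
    chunk = ET.fromstring(chunk_xml)
    unknown = [child for child in chunk
               if namespace_of(child) and namespace_of(child) not in understood]
    if chunk.tag == "iq":
        # An IQ in an unknown namespace SHOULD be answered with error code 400.
        return ("error-400" if unknown else "process", chunk)
    if chunk.tag == "message" and unknown and chunk.find("body") is None:
        # A message with no <body/> and an unknown extension is ignored entirely.
        return ("ignore-chunk", chunk)
    # Otherwise only the portion in the unknown namespace is ignored.
    for child in unknown:
        chunk.remove(child)
    return ("process", chunk)
```

For example, a message that carries both a <body/> and an extension in an unknown namespace is processed with the extension stripped, while the same message without the <body/> is dropped.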

7.6 Inclusion of Prolog

The prolog to an XML document is not a processing instruction. Applications MAY send a prolog. Applications MUST follow the rules in [8] concerning the circumstances in which a prolog is included.




8. Internationalization Considerations

8.1 Character Encodings

Software implementing XML streams MUST support the UTF-8 and UTF-16 encodings for received data. Software MUST NOT attempt to use any other encoding for transmitted data. The encodings of the transmit and receive streams are independent. Software may select either UTF-8 or UTF-16 for the transmitted stream, and should deduce the encoding of the received stream as described in [8].
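A receiver might deduce the encoding of an incoming stream from its first bytes, in the spirit of the autodetection approach outlined in the XML 1.0 specification. The Python sketch below is a simplified illustration limited to the two encodings permitted here. The function name deduce_encoding is hypothetical, and the sketch assumes the stream begins with an optional byte order mark followed by a '<' character.

```python
import codecs


def deduce_encoding(first_bytes):
    """Guess 'utf-8', 'utf-16-be', or 'utf-16-le' from the start of a stream."""
    # A byte order mark, when present, settles the question.
    if first_bytes.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if first_bytes.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    if first_bytes.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    # Without a mark, the position of the zero byte around '<' reveals UTF-16.
    if first_bytes.startswith(b"\x00<"):
        return "utf-16-be"
    if first_bytes.startswith(b"<\x00"):
        return "utf-16-le"
    # Anything else is treated as UTF-8.
    return "utf-8"
```

Because the encodings of the two directions are independent, a receiver would apply this check only to the inbound stream and choose its outbound encoding separately.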

8.2 Language Declarations

Message, presence, and IQ chunks sent over XML streams may contain text in many different languages. Therefore it is important to explicitly identify the language in use. This is done using the xml:lang attribute described in [8]. The following rules apply to use of the xml:lang attribute on XML streams and XML chunks.




9. IANA Considerations

The IANA registers "jabber-client" and "jabber-server" as GSS-API[16] service names, as specified in Section 6.1.1.




10. Security Considerations

10.1 Node-to-Host Communications

The SASL protocol for authenticating XML streams negotiated between a node and a host (defined under SASL Authentication above) provides a reliable mechanism for validating that a node connecting to a host is who it claims to be.

10.2 Host-to-Host Communications

It is OPTIONAL for any given host to communicate with other hosts, and host-to-host communications MAY be disabled by the administrator of any given deployment.

If two hosts would like to enable communications between themselves, they MUST form a relationship of trust at some level, either based on trust in DNS or based on a pre-existing trust relationship (e.g., through exchange of certificates). If two hosts have a pre-existing trust relationship, they MAY use SASL Authentication for the purpose of authenticating each other. If they do not have a pre-existing relationship, they MUST use the Dialback Protocol, which provides a reliable mechanism for preventing the spoofing of hosts.

10.3 Use of SASL

Although service provisioning is a policy matter, at a minimum, all implementations must provide:

  for authentication: the SASL DIGEST-MD5 mechanism

Further, node implementations may choose to offer MIME-based security services providing message integrity and confidentiality, such as OpenPGP[14] or S/MIME[15].




References

[1] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[2] Miller, J. and P. Saint-Andre, "XMPP Instant Messaging (draft-miller-jabber-xmpp-im-00, work in progress)", June 2002.
[3] Postel, J., "Simple Mail Transfer Protocol", STD 10, RFC 821, August 1982.
[4] University of Southern California, "Transmission Control Protocol", RFC 793, September 1981.
[5] Internet Assigned Numbers Authority, "Internet Assigned Numbers Authority", January 1998.
[6] Harrenstien, K., Stahl, M. and E. Feinler, "DoD Internet host table specification", RFC 952, October 1985.
[7] Braden, R., "Requirements for Internet Hosts - Application and Support", STD 3, RFC 1123, October 1989.
[8] World Wide Web Consortium, "Extensible Markup Language (XML) 1.0 (Second Edition)", W3C xml, October 2000.
[9] Myers, J., "Simple Authentication and Security Layer (SASL)", RFC 2222, October 1997.
[10] Gulbrandsen, A. and P. Vixie, "A DNS RR for specifying the location of services (DNS SRV)", RFC 2052, October 1996.
[11] Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform Resource Identifiers (URI): Generic Syntax", RFC 2396, August 1998.
[12] World Wide Web Consortium, "Namespaces in XML", W3C xml-names, January 1999.
[13] Jabber Software Foundation, "Jabber Software Foundation", August 2001.
[14] Elkins, M., Del Torto, D., Levien, R. and T. Roessler, "MIME Security with OpenPGP", RFC 3156, August 2001.
[15] Ramsdell, B., "S/MIME Version 3 Message Specification", RFC 2633, June 1999.
[16] Linn, J., "Generic Security Service Application Program Interface, Version 2", RFC 2078, January 1997.
[17] Day, M., Rosenberg, J. and H. Sugano, "A Model for Presence and Instant Messaging", RFC 2778, February 2000.
[18] Day, M., Aggarwal, S., Mohr, G. and J. Vincent, "Instant Messaging / Presence Protocol Requirements", RFC 2779, February 2000.



Authors' Addresses

  Jeremie Miller
  Jabber Software Foundation
  1899 Wynkoop Street, Suite 600
  Denver, CO 80202
  US
EMail:  jeremie@jabber.org
URI:  http://www.jabber.org/
  
  Peter Saint-Andre
  Jabber Software Foundation
  1899 Wynkoop Street, Suite 600
  Denver, CO 80202
  US
EMail:  stpeter@jabber.org
URI:  http://www.jabber.org/



Appendix A. Standard Error Codes

A standard error element is used for failed processing of XML chunks. This element is a child of the failed element and MUST include a 'code' attribute corresponding to one of the following error codes.




Full Copyright Statement

Acknowledgement