JEP-0166: Jingle Signalling

This document defines signalling methods for initiating and managing peer-to-peer sessions (e.g., voice and video exchanges) between XMPP clients in a way that is interoperable with existing Internet standards.


WARNING: This Standards-Track JEP is Experimental. Publication as a Jabber Enhancement Proposal does not imply approval of this proposal by the Jabber Software Foundation. Implementation of the protocol described herein is encouraged in exploratory implementations, but production systems should not deploy implementations of this protocol until it advances to a status of Draft.


JEP Information

Status: Experimental
Type: Standards Track
Number: 0166
Version: 0.1
Last Updated: 2005-12-15
JIG: Standards JIG
Approving Body: Jabber Council
Dependencies: XMPP Core
Supersedes: None
Superseded By: None
Short Name: jingle
Wiki Page: <http://wiki.jabber.org/index.php/Jingle Signalling (JEP-0166)>

Author Information

Scott Ludwig

Email: scottlu@google.com
JID: scottlu@google.com

Peter Saint-Andre

Email: stpeter@jabber.org
JID: stpeter@jabber.org

Joe Beda

Email: jbeda@google.com
JID: jbeda@google.com

Joe Hildebrand

Email: jhildebrand@jabber.com
JID: hildjj@jabber.org

Legal Notice

This Jabber Enhancement Proposal is copyright 1999 - 2005 by the Jabber Software Foundation (JSF) and is in full conformance with the JSF's Intellectual Property Rights Policy <http://www.jabber.org/jsf/ipr-policy.shtml>. This material may be distributed only subject to the terms and conditions set forth in the Creative Commons Attribution License (<http://creativecommons.org/licenses/by/2.5/>).

Discussion Venue

The preferred venue for discussion of this document is the Standards-JIG discussion list: <http://mail.jabber.org/mailman/listinfo/standards-jig>.

Relation to XMPP

The Extensible Messaging and Presence Protocol (XMPP) is defined in the XMPP Core (RFC 3920) and XMPP IM (RFC 3921) specifications contributed by the Jabber Software Foundation to the Internet Standards Process, which is managed by the Internet Engineering Task Force in accordance with RFC 2026. Any protocol defined in this JEP has been developed outside the Internet Standards Process and is to be understood as an extension to XMPP rather than as an evolution, development, or modification of XMPP itself.

Conformance Terms

The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.


Table of Contents

1. Introduction
2. Requirements
3. Glossary
4. Concepts and Approach
4.1. Session Management State Machine
4.2. Session Descriptions
5. Explanatory Example: Basic One-to-One Signalling
5.1. Resource Determination (Example)
5.2. Initiation (Example)
5.3. Negotiation (Example)
5.4. Acceptance (Example)
5.5. Termination (Example)
6. Protocol Description
6.1. Resource Determination
6.2. Required and Optional Actions
6.3. Initiation
6.4. Redirection
6.5. Negotiation
6.5.1. Candidate Format
6.5.2. Negotiation Modes
6.5.2.1. Dribble Mode
6.5.2.2. Burst Mode
6.5.3. Checking Connectivity
6.6. Acceptance
6.7. Termination
6.8. Informational Messages
7. Error Handling
7.1. Initiation-Related Error Conditions
7.2. Other Error Conditions
8. Security Considerations
8.1. Denial of Service
8.2. Communication Through Gateways
9. IANA Considerations
10. Jabber Registrar Considerations
10.1. Protocol Namespaces
10.2. Service Discovery Features
10.3. Jingle Session Types Registry
10.3.1. Registration Process
11. XML Schemas
11.1. Signalling
11.2. Errors
12. Open Issues
13. Acknowledgements
Notes
Revision History


1. Introduction

There exists no widely-adopted standard for initiating and managing peer-to-peer (p2p) multimedia interactions (such as voice and video exchanges) from within Jabber/XMPP clients. Although several large service providers and Jabber/XMPP clients have written and implemented their own proprietary XMPP extensions for p2p signalling (usually only for voice), those technologies are not open and do not always take into account requirements to interoperate with the Public Switched Telephone Network (PSTN) or emerging SIP-based Internet voice networks. By contrast, the only existing open protocol has been A Transport for Initiating and Negotiating Sessions (TINS) [1], which made it possible to initiate and manage p2p sessions, but which did not provide enough of the key signalling semantics to be easily implemented in Jabber/XMPP clients. [2]

The result has been an unfortunate fragmentation within the XMPP community regarding signalling protocols. There are, essentially, two approaches to solving the problem:

  1. Recommend that all client developers implement a dual-stack (XMPP + SIP) solution.
  2. Define a full-featured protocol for XMPP signalling.

Implementation experience indicates that a dual-stack approach may not be feasible on all the computing platforms for which Jabber clients have been written, or even desirable on platforms where it is feasible. [3] Therefore, it seems reasonable to define an XMPP signalling protocol that can provide the necessary signalling semantics while also making it possible to interoperate with existing Internet standards.

As a result of feedback received on JEP-0111, the second and fourth authors of this document began to define such a signalling protocol, code-named Jingle. Upon communication with members of the Google Talk team, it was discovered that the emerging Jingle approach was conceptually (and even syntactically) quite similar to the signalling protocol used in the Google Talk application. Therefore, in the interest of interoperability and adoption, we decided to harmonize the two approaches. The signalling protocol specified herein is, therefore, substantially equivalent to the existing Google Talk protocol, with several adjustments based on feedback received from implementors as well as for publication within the Jabber Software Foundation's standards process.

2. Requirements

The protocol defined herein is designed to meet the following requirements:

  1. Make it possible to manage a wide variety of peer-to-peer sessions (not limited to voice and video) within XMPP. [4]
  2. Make it relatively easy to implement support for the protocol in standard Jabber/XMPP clients.
  3. Where communication with non-XMPP entities is needed, push as much complexity as possible onto server-side gateways between the XMPP network and the non-XMPP network.

This document defines the signalling protocol only. Additional documents will specify the following:

3. Glossary

The following terminology is used in this document.

Table 1: Terminology

Term Definition
Session A set of 1+ negotiated channels between endpoints for the purpose of exchanging data related to 1+ session types, delimited in time by a session initiation request and session ending event.
Session Type A formal description of the purpose of the session. Common session types are voice, voice+video, and file sharing. A session consists of one and only one session type, for which 1+ channels will be negotiated and used.
Channel A direct communication channel in the context of a session. The channel is ideally a peer-to-peer connection. A channel ends when the session ends.

4. Concepts and Approach

4.1 Session Management State Machine

A simplified state machine for basic session management is shown below:

         START
           o  
           |   
           | initiate
           |   
           |  _____________________
           | /                     \
[PENDING]  o________                |
           |  |     | info,         |
           |  |_____| negotiate     | 
           |                        |
           | accept                 | decline,
           |                        | redirect,
 [ACTIVE]  o________                | terminate
           |  |     | info,         |
           |  |     | modify,       |
           |  |     | join,         |
           |  |     | replace,      |
           |  |_____| transfer      |
           |                        |
            \_______________________o [ENDED]
                  decline, 
                  redirect, 
                  terminate
    

There are three basic states:

  1. PENDING
  2. ACTIVE
  3. ENDED

There are ten basic "verbs" or actions:

  1. accept
  2. info
  3. initiate
  4. join
  5. modify
  6. negotiate
  7. redirect
  8. replace
  9. terminate
  10. transfer

Many of these states and actions correspond to (or can be mapped to) the states and actions defined in the Session Initiation Protocol (SIP), but such a mapping is out of scope here and will be provided in a separate specification.
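As a non-normative illustration, the transitions in the diagram above can be written as a lookup table. The sketch below is in Python; the names State, TRANSITIONS, and next_state are hypothetical and are not defined by this specification. It assumes, as the diagram suggests, that decline, redirect, and terminate end the session from either PENDING or ACTIVE (note that "decline" appears in the diagram but not in the list of ten actions).

```python
from enum import Enum


class State(Enum):
    START = "START"
    PENDING = "PENDING"
    ACTIVE = "ACTIVE"
    ENDED = "ENDED"


# Actions that end a session, as read from the diagram (an assumption).
ENDING_ACTIONS = ("decline", "redirect", "terminate")

# Transition table: (current state, action) -> next state.
TRANSITIONS = {
    (State.START, "initiate"): State.PENDING,
    (State.PENDING, "info"): State.PENDING,
    (State.PENDING, "negotiate"): State.PENDING,
    (State.PENDING, "accept"): State.ACTIVE,
}
# Actions that leave an ACTIVE session in the ACTIVE state.
for _action in ("info", "modify", "join", "replace", "transfer"):
    TRANSITIONS[(State.ACTIVE, _action)] = State.ACTIVE
# Ending actions apply in both PENDING and ACTIVE.
for _state in (State.PENDING, State.ACTIVE):
    for _action in ENDING_ACTIONS:
        TRANSITIONS[(_state, _action)] = State.ENDED


def next_state(state: State, action: str) -> State:
    """Return the state reached by applying action, or raise ValueError."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(
            f"action {action!r} is not valid in state {state.name}"
        ) from None
```

A table keeps the sketch easy to compare against the diagram: each edge in the figure corresponds to one entry, and any action not listed for a state is rejected.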

4.2 Session Descriptions

Parallel to the signalling flows outlined above, the entities involved in the session need to exchange descriptions of the desired session. While it is possible to send raw Session Description Protocol (SDP) data for the session descriptions (the approach taken in TINS), this is not necessarily helpful, since in practice (1) not all SDP data is needed or used in the most common use cases and (2) SDP has been heavily extended in several useful directions, especially for NAT traversal (see RFC 3489 [8] and Interactive Connectivity Establishment (ICE) [9]). The approach taken herein is to specify pure session-description information in separate documents, one for each session type (audio, video, etc.). However, we include the NAT traversal semantics as a native part of Jingle signalling, since they are necessary for any kind of peer-to-peer session (no matter what the session description is).

5. Explanatory Example: Basic One-to-One Signalling

5.1 Resource Determination (Example)

To illustrate the basic concepts, we use our standard characters Romeo and Juliet and assume that Romeo wants to initiate a one-to-one audio session with Juliet. (See the Protocol Description section of this document for a more formal description.)

First Romeo must discover which of Juliet's XMPP resources is best for audio interaction. If Juliet has only one XMPP resource, this task is best completed using Service Discovery [10] or the presence-based profile of service discovery specified in Entity Capabilities [11]:

Example 1. Romeo Requests Service Discovery Information

<iq from='romeo@montague.net/orchard' 
    id='disco1'
    to='juliet@capulet.com/balcony' 
    type='get'>
  <query xmlns='http://jabber.org/protocol/disco#info'/>
</iq>
    

Example 2. Juliet Provides Service Discovery Information

<iq from='juliet@capulet.com/balcony' 
    id='disco1'
    to='romeo@montague.net/orchard' 
    type='result'>
  <query xmlns='http://jabber.org/protocol/disco#info'>
    ...
    <feature var='http://jabber.org/protocol/jingle'/>
    <feature var='http://jabber.org/protocol/jingle/sessions/audio'/>
    <feature var='http://jabber.org/protocol/jingle?mode=dribble'/>
    <feature var='http://jabber.org/protocol/jingle?mode=burst'/>
    <feature var='http://jabber.org/protocol/jingle?stun=inline'/>
    <feature var='http://jabber.org/protocol/jingle?action=initiate'/>
    <feature var='http://jabber.org/protocol/jingle?action=redirect'/>
    <feature var='http://jabber.org/protocol/jingle?action=accept'/>
    <feature var='http://jabber.org/protocol/jingle?action=negotiate'/>
    <feature var='http://jabber.org/protocol/jingle?action=terminate'/>
    ...
  </query>
</iq>
    

If Juliet has more than one XMPP resource, it may be that only one of the resources supports Jingle and the audio session type, in which case Romeo would initiate Jingle signalling with that resource.

If Juliet has more than one XMPP resource that supports Jingle and the audio session type, Romeo's client should use Resource Application Priority [12] in order to determine the best resource with which to initiate a Jingle audio session.
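The check Romeo's client performs on the service discovery result can be sketched as follows. This is a minimal, non-normative Python illustration using only the standard library; the function names disco_features and supports_jingle_audio are hypothetical, and the sketch assumes the reply is available as a standalone XML string like Example 2.

```python
import xml.etree.ElementTree as ET

DISCO_INFO_NS = "http://jabber.org/protocol/disco#info"
JINGLE = "http://jabber.org/protocol/jingle"
JINGLE_AUDIO = "http://jabber.org/protocol/jingle/sessions/audio"


def disco_features(iq_xml: str) -> set[str]:
    """Collect the 'var' values of all <feature/> elements in a disco#info result."""
    iq = ET.fromstring(iq_xml)
    query = iq.find(f"{{{DISCO_INFO_NS}}}query")
    if query is None:
        return set()
    features = query.findall(f"{{{DISCO_INFO_NS}}}feature")
    return {f.get("var") for f in features if f.get("var")}


def supports_jingle_audio(features: set[str]) -> bool:
    """True if both the Jingle namespace and the audio session type are advertised."""
    return JINGLE in features and JINGLE_AUDIO in features
```

A client with several candidate resources could run the same check against each resource's reply and keep only those for which supports_jingle_audio returns True.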

5.2 Initiation (Example)

Once Romeo has discovered which of Juliet's XMPP resources is ideal for audio interaction, he sends a session initiation request to Juliet and specifies an audio session (see Jingle Audio [13]):

Example 3. Romeo Initiates Call

<iq to='juliet@capulet.com/balcony' from='romeo@montague.net/orchard' id='jingle1' type='set'>
  <jingle xmlns='http://jabber.org/protocol/jingle' 
          action='initiate' 
          initiator='romeo@montague.net/orchard'
          sid='a73sjjvkla37jfea'>
    <description xmlns='http://jabber.org/protocol/jingle/sessions/audio'>
      <payload-type id='18' name='G729'/>
      <payload-type id='97' name='IPCMWB'/>
      <payload-type id='98' name='L16'/>
      <payload-type id='103' name='ISAC'/>
      <payload-type id='102' name='iLBC'/>
      <payload-type id='4' name='G723'/>
      <payload-type id='100' name='EG711U'/>
      <payload-type id='101' name='EG711A'/>
      <payload-type id='0' name='PCMU'/>
      <payload-type id='8' name='PCMA'/>
      <payload-type id='13' name='CN'/>
    </description>
  </jingle>
</iq>
    

At this point, Juliet can do one of three things:

Here we assume that Juliet provisionally accepts the session:

Example 4. Juliet Provisionally Accepts the Session Request

<iq type='result' from='juliet@capulet.com/balcony' to='romeo@montague.net/orchard' id='jingle1'/>
    

5.3 Negotiation (Example)

As soon as Juliet provisionally accepts the session initiation request, the next phase of the session flow begins: negotiation of connectivity. Here we assume that both Romeo and Juliet support dribble mode for connectivity negotiation (for burst mode, see the Burst Mode section of this document) and that therefore their clients immediately begin sending candidate transport mechanisms to each other.

Example 5. Romeo Sends a Candidate Transport

<iq to='juliet@capulet.com/balcony' from='romeo@montague.net/orchard' id='candidate1' type='set'>
  <jingle xmlns='http://jabber.org/protocol/jingle' 
          action='negotiate'
          initiator='romeo@montague.net/orchard'
          sid='a73sjjvkla37jfea'>
    <candidate name='rtp'
               protocol='udp'
               preference='1.0'
               username='/38UHtocC941jdS4' 
               password='pcd+Z/WmsthSFIcz'
               type='local'
               network='0'
               generation='0' 
               ip='10.1.1.104' 
               port='13540'/>
  </jingle>
</iq>
    

If Juliet successfully receives the candidate, she returns an IQ-result (if not, for example because the candidate data is improperly formatted, she returns an error):

Example 6. Juliet Indicates Receipt of the First Candidate

<iq type='result' from='juliet@capulet.com/balcony' to='romeo@montague.net/orchard' id='candidate1'/>
    

Note well that Juliet is only indicating receipt of the candidate, not telling Romeo that the candidate will be used.

Romeo keeps sending candidates, one after the other (without stopping to receive an acknowledgement of receipt from the target entity for each candidate) until he has exhausted his supply of possible or desirable candidate transports:

Example 7. Romeo Sends a Second Candidate Transport

<iq to='juliet@capulet.com/balcony' from='romeo@montague.net/orchard' id='candidate2' type='set'>
  <jingle xmlns='http://jabber.org/protocol/jingle' 
          action='negotiate'
          initiator='romeo@montague.net/orchard'
          sid='a73sjjvkla37jfea'>
    <candidate name='rtp'
               protocol='udp'
               preference='0.8'
               type='stun'
               username='ld6Hi+PfVtnmU8cf'
               password='gzoufy3aMXBRtiWs'
               network='1'
               generation='0' 
               ip='1.2.3.4' 
               port='6459'/>
  </jingle>
</iq>
    

Example 8. Romeo Sends a Third Candidate Transport

<iq to='juliet@capulet.com/balcony' from='romeo@montague.net/orchard' id='candidate3' type='set'>
  <jingle xmlns='http://jabber.org/protocol/jingle' 
          action='negotiate'
          initiator='romeo@montague.net/orchard'
          sid='a73sjjvkla37jfea'>
    <candidate name='rtp'
               protocol='udp'
               preference='0.1'
               type='relay'
               username='XKqUmqiftjPUYAbF'
               password='G4116MkgTzb8+1N/'
               network='2'
               generation='0' 
               ip='5.6.7.8' 
               port='9823'/>
  </jingle>
</iq>
    

As above, Juliet keeps acknowledging receipt of the candidates:

Example 9. Juliet Indicates Receipt of the Second and Third Candidates

<iq type='result' from='juliet@capulet.com/balcony' to='romeo@montague.net/orchard' id='candidate2'/>

<iq type='result' from='juliet@capulet.com/balcony' to='romeo@montague.net/orchard' id='candidate3'/>
    

At the same time (i.e., immediately after provisionally accepting the session, not waiting for Romeo to begin or finish sending candidates), Juliet also begins sending candidates that may work for her:

Example 10. Juliet Sends a First Candidate

<iq from='juliet@capulet.com/balcony' to='romeo@montague.net/orchard' id='can1' type='set'>
  <jingle xmlns='http://jabber.org/protocol/jingle' 
          action='negotiate'
          initiator='romeo@montague.net/orchard'
          sid='a73sjjvkla37jfea'>
    <candidate name='rtp'
               protocol='udp'
               preference='0.7'
               type='stun'
               username='5ilRe0u+EF17aUQU'
               password='VXrUejbQILvnEMIJ'
               network='0'
               generation='0' 
               ip='3.4.5.6' 
               port='7676'/>
  </jingle>
</iq>
    

Example 11. Juliet Sends a Second Candidate

<iq from='juliet@capulet.com/balcony' to='romeo@montague.net/orchard' id='can2' type='set'>
  <jingle xmlns='http://jabber.org/protocol/jingle' 
          action='negotiate'
          initiator='romeo@montague.net/orchard'
          sid='a73sjjvkla37jfea'>
    <candidate name='rtp'
               protocol='udp'
               preference='0.3'
               type='relay'
               username='ph+H8epib3+I8aB8'
               password='o+bUsKt+SzkBPlOF'
               network='0'
               generation='0' 
               ip='4.5.6.7' 
               port='8135'/>
  </jingle>
</iq>
    

As above, Romeo acknowledges receipt of the candidates:

Example 12. Romeo Indicates Receipt of Candidates from Juliet

<iq type='result' from='romeo@montague.net/orchard' to='juliet@capulet.com/balcony' id='can1'/>

<iq type='result' from='romeo@montague.net/orchard' to='juliet@capulet.com/balcony' id='can2'/>
    

As Romeo and Juliet receive candidates, they probe the various candidate transports for connectivity. (This process is described in the Checking Connectivity section of this document.)

5.4 Acceptance (Example)

If, based on the connectivity checks, Juliet determines that she will be able to establish a connection, she sends a definitive acceptance to Romeo:

Example 13. Juliet Definitively Accepts the Call

<iq type='set' from='juliet@capulet.com/balcony' to='romeo@montague.net/orchard' id='accept1'>
  <jingle xmlns='http://jabber.org/protocol/jingle'
          action='accept' 
          initiator='romeo@montague.net/orchard'
          responder='juliet@capulet.com/balcony'
          sid='a73sjjvkla37jfea'/>
</iq>
    

Romeo then acknowledges Juliet's definitive acceptance:

Example 14. Romeo Acknowledges Definitive Acceptance

<iq type='result' to='juliet@capulet.com/balcony' from='romeo@montague.net/orchard' id='accept1'/>
    

Now Romeo and Juliet can begin sending media over the negotiated connection.

5.5 Termination (Example)

We assume that after a pleasant voice chat, Juliet decides to gracefully end the session, so she sends a "terminate" action to Romeo:

Example 15. Juliet Terminates the Session

<iq from='juliet@capulet.com/balcony' 
    id='term1' 
    to='romeo@montague.net/orchard' 
    type='set'>
  <jingle xmlns='http://jabber.org/protocol/jingle'
          action='terminate' 
          initiator='romeo@montague.net/orchard'
          sid='a73sjjvkla37jfea'/>
</iq>
    

Romeo then acknowledges termination of the session:

Example 16. Romeo Acknowledges Termination

<iq type='result' to='juliet@capulet.com/balcony' from='romeo@montague.net/orchard' id='term1'/>
    

6. Protocol Description

6.1 Resource Determination

In order to initiate a Jingle session, the initiating entity must determine which of the target entity's XMPP resources is best for the desired session type. If a contact has only one XMPP resource, this task MUST be completed using Service Discovery or the presence-based profile of service discovery specified in Entity Capabilities (see example above).

Naturally, instead of sending service discovery requests to every contact in a user's roster, it is more efficient to use Entity Capabilities, whereby support for the Jingle protocol (including negotiation mode and STUN style as described below) and various Jingle session types is determined for a client version in general (rather than on a per-JID basis) and then cached. Refer to JEP-0115 for details.

If a contact has more than one XMPP resource, it may be that only one of the resources supports Jingle and the desired session type, in which case the user MUST initiate the Jingle signalling with that resource.

If a contact has more than one XMPP resource that supports Jingle and the desired session type, it is RECOMMENDED for a client to use Resource Application Priority (RAP) in order to determine which is the best resource with which to initiate the desired Jingle session.

6.2 Required and Optional Actions

Support for the 'accept', 'info', 'initiate', 'negotiate', 'redirect', and 'terminate' actions is REQUIRED.

Support for the 'join', 'modify', 'replace', and 'transfer' actions is OPTIONAL.

An entity MUST disclose which actions it supports in its response to service discovery information requests via the following features:

6.3 Initiation

Once the initiating entity has discovered which of the target entity's XMPP resources is best for the desired session type, it sends a session initiation request to the target entity. This request is an IQ-set containing a <jingle/> element qualified by the 'http://jabber.org/protocol/jingle' namespace. The <jingle/> element MUST possess the 'action', 'initiator', and 'sid' attributes as described below. For initiation, the 'action' attribute MUST have a value of "initiate", the 'mode' attribute SHOULD be included (if not, its value defaults to "dribble"), the 'stun' attribute SHOULD be included (if not, its value defaults to "inline"), and the <jingle/> element MUST contain one and only one child element that describes the desired session (different elements will be defined for different session types). Here is an example:

Example 17. Initiation Example

<iq to='juliet@capulet.com/balcony' from='romeo@montague.net/orchard' id='jingle1' type='set'>
  <jingle xmlns='http://jabber.org/protocol/jingle' 
          action='initiate' 
          initiator='romeo@montague.net/orchard'
          sid='a73sjjvkla37jfea'>
    <description xmlns='http://jabber.org/protocol/jingle/sessions/audio'>
      ...
    </description>
  </jingle>
</iq>
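For implementors, constructing such a request might look like the following non-normative Python sketch. The function names build_initiate and negotiation_settings are hypothetical. The sketch always writes the 'mode' and 'stun' attributes explicitly and leaves the session description empty, since its content is defined by the relevant session-type specification. Note that the standard-library serializer emits generated namespace prefixes, which are equivalent to the default-namespace form used in the examples.

```python
import xml.etree.ElementTree as ET

JINGLE_NS = "http://jabber.org/protocol/jingle"
AUDIO_NS = "http://jabber.org/protocol/jingle/sessions/audio"


def build_initiate(to, sender, initiator, iq_id, sid, description_ns,
                   mode="dribble", stun="inline"):
    """Build an IQ-set carrying a <jingle action='initiate'/> element."""
    iq = ET.Element("iq", {"to": to, "from": sender, "id": iq_id, "type": "set"})
    jingle = ET.SubElement(
        iq,
        f"{{{JINGLE_NS}}}jingle",
        {
            "action": "initiate",
            "initiator": initiator,
            "sid": sid,
            "mode": mode,  # SHOULD be included
            "stun": stun,  # SHOULD be included
        },
    )
    # One child element describing the desired session; its payload is
    # defined elsewhere and is left empty in this sketch.
    ET.SubElement(jingle, f"{{{description_ns}}}description")
    return iq


def negotiation_settings(jingle):
    """Return (mode, stun) for a <jingle/> element, applying the stated defaults."""
    return jingle.get("mode", "dribble"), jingle.get("stun", "inline")
```

On the receiving side, negotiation_settings shows how the defaults apply when the initiator omits either attribute.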
    

The attributes of the <jingle/> element are as follows:

At this point, the target entity can do one of three things:

To decline the session initiation request, the target entity MUST return an XMPP <not-acceptable/> error. There are five defined error cases:

Example 18. Juliet Declines the Session Request

<iq from='juliet@capulet.com/balcony' to='romeo@montague.net/orchard' id='jingle1' type='error'>
  <error code='406' type='modify'>
    <not-acceptable xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/>
    <busy xmlns='http://jabber.org/protocol/jingle#errors'/>
  </error>
</iq>
    

To immediately redirect the session initiation request to another address (e.g., because of a change in resource application priority generated right before the session initiation request was received), the target entity returns an XMPP <redirect/> error:

Example 19. Juliet Redirects the Session Request

<iq from='juliet@capulet.com/balcony' to='romeo@montague.net/orchard' id='jingle1' type='error'>
  <error code='302' type='modify'>
    <redirect xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'>xmpp:juliet@capulet.com/chamber</redirect>
  </error>
</iq>
    

(Note: RFC 3920 specifies that the optional XML character data of the XMPP <redirect/> error shall be an XMPP address (JID) rather than a full URI/IRI; however, since it is possible for an XMPP entity to desire redirection to a non-XMPP address (e.g., a SIP address or telephone number), in this specification we recommend inclusion of a full URI/IRI such as a sip: or sips: URI, a tel: URI or an XMPP URI/IRI that is consistent with XMPP URI Scheme [15].)

If the session initiation request is declined or redirected, the original session MUST be considered in the ENDED state and the new session initiation request (if any) sent to the target entity or redirected entity MUST possess a newly-generated session ID.

To provisionally accept the session initiation request, the target entity returns an IQ-result:

Example 20. Juliet Provisionally Accepts the Session Request

<iq type='result' from='juliet@capulet.com/balcony' to='romeo@montague.net/orchard' id='jingle1'/>
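The three possible outcomes of a session initiation request (decline, redirect, provisional acceptance) can be distinguished by inspecting the reply stanza. The following Python sketch is non-normative; classify_initiation_reply and its return labels are hypothetical names, and the sketch assumes the reply is available as a standalone XML string shaped like the examples in this section.

```python
import xml.etree.ElementTree as ET

STANZA_ERR_NS = "urn:ietf:params:xml:ns:xmpp-stanzas"


def classify_initiation_reply(iq_xml: str):
    """Return (outcome, redirect_target) for a reply to a session initiation.

    outcome is one of 'accepted', 'declined', 'redirected', or 'unknown'.
    redirect_target is the character data of <redirect/>, if any.
    """
    iq = ET.fromstring(iq_xml)
    if iq.get("type") == "result":
        # Provisional acceptance: the session is now PENDING.
        return ("accepted", None)
    error = iq.find("error")
    if iq.get("type") == "error" and error is not None:
        redirect = error.find(f"{{{STANZA_ERR_NS}}}redirect")
        if redirect is not None:
            target = (redirect.text or "").strip() or None
            return ("redirected", target)
        if error.find(f"{{{STANZA_ERR_NS}}}not-acceptable") is not None:
            return ("declined", None)
    return ("unknown", None)
```

For both 'declined' and 'redirected', the caller would treat the original session as ENDED and generate a new session ID for any follow-up request.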
      

6.4 Redirection

After provisionally accepting the session, the target entity MAY redirect the session to another address (e.g., because the principal is not answering at the original resource). This is done by sending a Jingle redirect action to the initiating entity:

Example 21. Juliet Redirects the Session

<iq from='juliet@capulet.com/balcony' 
    id='jingle2' 
    to='romeo@montague.net/orchard' 
    type='set'>
  <jingle xmlns='http://jabber.org/protocol/jingle' 
          action='redirect' 
          initiator='romeo@montague.net/orchard'
          sid='a73sjjvkla37jfea'>
    <redirect>xmpp:voicemail@capulet.com</redirect>
  </jingle>
</iq>
    

The recipient then acknowledges the redirection:

Example 22. Romeo Acknowledges Redirection

<iq from='romeo@montague.net/orchard' 
    id='jingle2' 
    to='juliet@capulet.com/balcony' 
    type='result'/>
    

Both entities MUST now consider the original session to be in the ENDED state, and if the initiating entity wishes to initiate a session with the redirected address it MUST do so by sending a session initiation request to that address with a new session ID.

6.5 Negotiation

As soon as the target entity provisionally accepts the session initiation request, the next phase of the session flow begins: negotiation of connectivity by exchanging XML-formatted candidate transports for the channel. The process for this negotiation is largely the same in Jingle as it is in Interactive Connectivity Establishment (ICE). The main exception is that, when operating in dribble mode, Jingle takes advantage of the request-response semantics of the XMPP <iq/> stanza type by sending each candidate transport in a separate IQ exchange. Jingle burst negotiation is more similar to ICE in this respect. The candidate format and negotiation modes are described below.

6.5.1 Candidate Format

In contrast to ICE, in Jingle candidates are encoded into XML rather than into SDP. In addition, in Jingle a candidate is a single XML element (rather than the candidate pairs recommended in ICE) to save bandwidth.

The following is an example of the candidate format:

Example 23. Romeo Sends a Candidate Transport

<iq to='juliet@capulet.com/balcony' from='romeo@montague.net/orchard' id='candidate1' type='set'>
  <jingle xmlns='http://jabber.org/protocol/jingle' 
          action='negotiate'
          initiator='romeo@montague.net/orchard'
          sid='a73sjjvkla37jfea'>
    <candidate name='rtp'
               protocol='udp'
               preference='1.0'
               username='/38UHtocC941jdS4' 
               password='pcd+Z/WmsthSFIcz'
               type='local'
               network='0'
               generation='0' 
               ip='10.1.1.104' 
               port='13540'/>
  </jingle>
</iq>
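A receiving client might read candidates into a typed structure as in the non-normative Python sketch below. The Candidate class and parse_candidates function are hypothetical. The numeric types chosen for 'preference', 'generation', and 'port' are assumptions based on the example values, as is the ordering rule (higher 'preference' first), which the example values suggest but this section does not state.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

JINGLE_NS = "http://jabber.org/protocol/jingle"


@dataclass(frozen=True)
class Candidate:
    name: str
    protocol: str
    preference: float  # assumed numeric, per the example values
    type: str
    network: str
    generation: int    # assumed integer
    ip: str
    port: int
    username: str
    password: str


def parse_candidates(iq_xml: str) -> list[Candidate]:
    """Extract all <candidate/> elements, highest preference first."""
    iq = ET.fromstring(iq_xml)
    found = []
    for el in iq.iter(f"{{{JINGLE_NS}}}candidate"):
        a = el.attrib  # a missing attribute raises KeyError
        found.append(
            Candidate(
                name=a["name"],
                protocol=a["protocol"],
                preference=float(a["preference"]),
                type=a["type"],
                network=a["network"],
                generation=int(a["generation"]),
                ip=a["ip"],
                port=int(a["port"]),
                username=a["username"],
                password=a["password"],
            )
        )
    # Assumption: a higher preference value means a more desirable candidate.
    return sorted(found, key=lambda c: c.preference, reverse=True)
```

Because the function iterates over every candidate element in the stanza, the same code handles a single-candidate stanza and a stanza that carries several candidates at once.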
      

The attributes of the <candidate/> element are as follows:

6.5.2 Negotiation Modes

An implementation SHOULD support both modes: dribble and burst. An entity MUST disclose which mode(s) it supports in its response to service discovery information requests via the "http://jabber.org/protocol/jingle?mode=dribble" and "http://jabber.org/protocol/jingle?mode=burst" features. If the target entity supports both modes, the initiating entity MUST specify its desired mode in the session initiation request, either explicitly by including the 'mode' attribute with a value of "burst" or "dribble", or implicitly specifying dribble mode by not including the 'mode' attribute (since the default value of the 'mode' attribute is "dribble"). If the target entity supports only one mode, then the initiating entity MUST specify that mode in the session initiation request; however, if the only mode supported by the target entity is not supported by the initiating entity, then the initiating entity MUST NOT send a session initiation request. An intermediary such as a SIP-to-XMPP gateway MUST insert its supported negotiation mode or modes (both in service discovery responses and session initiation requests) on behalf of entities it represents on the XMPP network.
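The mode selection rules in the preceding paragraph can be sketched in Python as follows. This is a non-normative illustration; modes_from_features and choose_mode are hypothetical names, and the 'preferred' parameter is an assumption standing in for whatever local policy the initiating entity applies when both modes are available.

```python
from typing import Iterable, Optional

MODE_PREFIX = "http://jabber.org/protocol/jingle?mode="


def modes_from_features(features: Iterable[str]) -> set[str]:
    """Extract negotiation modes from advertised service discovery features."""
    return {f[len(MODE_PREFIX):] for f in features if f.startswith(MODE_PREFIX)}


def choose_mode(initiator_modes: Iterable[str],
                target_modes: Iterable[str],
                preferred: str = "dribble") -> Optional[str]:
    """Return the mode to request, or None if no request may be sent."""
    common = set(initiator_modes) & set(target_modes)
    if not common:
        # The target's only mode(s) are unsupported locally:
        # the initiating entity must not send a session initiation request.
        return None
    if preferred in common:
        return preferred
    # Only one usable mode remains; pick deterministically.
    return sorted(common)[0]
```

The default of "dribble" for 'preferred' mirrors the default value of the 'mode' attribute, so a caller that omits it ends up with the same result as omitting 'mode' from the request.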

6.5.2.1 Dribble Mode

In dribble mode, each candidate is sent in a separate IQ stanza and acknowledged by the other party.

If dribble mode is used, the first step in negotiating connectivity is for each client to immediately begin sending candidate transport mechanisms to the other client. These candidates SHOULD be gathered by following the procedure specified in Section 7.1 of ICE and prioritized by following the procedure specified in Section 7.2 of ICE.

If the target entity successfully receives the candidate, it returns an IQ-result (if not, for example because the candidate data is improperly formatted, it returns an error).

Note well that the target entity is only indicating receipt of the candidate, not telling the initiating entity that the candidate will be used.

The initiating entity keeps sending candidates, one after the other (without stopping to receive an acknowledgement of receipt from the target entity for each candidate) until it has exhausted its supply of possible or desirable candidate transports. [17] For each candidate, the target entity acknowledges receipt.

At the same time (i.e., immediately after provisionally accepting the session, not waiting for the initiating entity to begin or finish sending candidates), the target entity also begins sending candidates that may work for it. As above, the initiating entity acknowledges receipt of the candidates.

As the initiating entity and target entity receive candidates, they probe the various candidate transports for connectivity. In performing these connectivity checks, clients SHOULD follow the procedure specified in Section 7.6 of ICE.
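The sending side of dribble mode can be sketched as a generator that emits one IQ-set per candidate, each with a distinct 'id' so that acknowledgements can be matched to the candidates they refer to. This Python sketch is non-normative; dribble_stanzas, unacknowledged, and the 'id_prefix' parameter are hypothetical, and candidates are assumed to be plain attribute dictionaries.

```python
import xml.etree.ElementTree as ET

JINGLE_NS = "http://jabber.org/protocol/jingle"


def dribble_stanzas(to, sender, initiator, sid, candidates, id_prefix="candidate"):
    """Yield one IQ-set per candidate, each with a distinct id."""
    for n, attrs in enumerate(candidates, start=1):
        iq = ET.Element(
            "iq", {"to": to, "from": sender, "id": f"{id_prefix}{n}", "type": "set"}
        )
        jingle = ET.SubElement(
            iq,
            f"{{{JINGLE_NS}}}jingle",
            {"action": "negotiate", "initiator": initiator, "sid": sid},
        )
        # Exactly one candidate per stanza in dribble mode.
        ET.SubElement(jingle, f"{{{JINGLE_NS}}}candidate", dict(attrs))
        yield iq


def unacknowledged(sent_ids, acked_ids):
    """Return ids of candidate IQs that have not yet received an IQ-result."""
    acked = set(acked_ids)
    return [i for i in sent_ids if i not in acked]
```

Since the sender does not wait for each acknowledgement before sending the next candidate, tracking outstanding ids separately (as unacknowledged does) keeps sending and acknowledgement handling independent.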

6.5.2.2 Burst Mode

Burst mode closely follows the offer-answer model specified in RFC 3264 [18]. In particular, the initiating entity MUST send a complete set of candidate transports in the session initiation request, as in the following example:

Example 24. Romeo Initiates Call (Burst Mode)

<iq to='juliet@capulet.com/balcony' from='romeo@montague.net/orchard' id='jingle1' type='set'>
  <jingle xmlns='http://jabber.org/protocol/jingle' 
          action='initiate' 
          initiator='romeo@montague.net/orchard'
          sid='a73sjjvkla37jfea'>
    <candidate name='rtp'
               protocol='udp'
               preference='1.0'
               username='/38UHtocC941jdS4' 
               password='pcd+Z/WmsthSFIcz'
               type='local'
               network='0'
               generation='0' 
               ip='10.1.1.104' 
               port='13540'/>
    <candidate name='rtp'
               protocol='udp'
               preference='0.8'
               type='stun'
               username='ld6Hi+PfVtnmU8cf'
               password='gzoufy3aMXBRtiWs'
               network='1'
               generation='0' 
               ip='1.2.3.4' 
               port='6459'/>
    <candidate name='rtp'
               protocol='udp'
               preference='0.1'
               type='relay'
               username='XKqUmqiftjPUYAbF'
               password='G4116MkgTzb8+1N/'
               network='2'
               generation='0' 
               ip='5.6.7.8' 
               port='9823'/>
    <description xmlns='http://jabber.org/protocol/jingle/sessions/audio'>
      <payload-type id='18' name='G729'/>
      <payload-type id='97' name='IPCMWB'/>
      <payload-type id='98' name='L16'/>
      <payload-type id='103' name='ISAC'/>
      <payload-type id='102' name='iLBC'/>
      <payload-type id='4' name='G723'/>
      <payload-type id='100' name='EG711U'/>
      <payload-type id='101' name='EG711A'/>
      <payload-type id='0' name='PCMU'/>
      <payload-type id='8' name='PCMA'/>
      <payload-type id='13' name='CN'/>
    </description>
  </jingle>
</iq>
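
A burst-mode initiation request of this shape could be assembled programmatically along the following lines. This is a minimal sketch using Python's standard xml.etree.ElementTree module; the function name build_burst_initiate and its parameters are hypothetical, and setting mode='burst' is an assumption based on the 'mode' attribute defined in the schema in this document.

```python
import xml.etree.ElementTree as ET

JINGLE_NS = "http://jabber.org/protocol/jingle"
AUDIO_NS = "http://jabber.org/protocol/jingle/sessions/audio"


def build_burst_initiate(initiator, target, sid, candidates, payload_types, iq_id="jingle1"):
    """Return an <iq/> element carrying every candidate in one initiation request."""
    iq = ET.Element("iq", {"to": target, "from": initiator, "id": iq_id, "type": "set"})
    jingle = ET.SubElement(
        iq,
        f"{{{JINGLE_NS}}}jingle",
        {"action": "initiate", "mode": "burst", "initiator": initiator, "sid": sid},
    )
    for cand in candidates:
        # ElementTree requires attribute values to be strings.
        ET.SubElement(jingle, f"{{{JINGLE_NS}}}candidate", {k: str(v) for k, v in cand.items()})
    desc = ET.SubElement(jingle, f"{{{AUDIO_NS}}}description")
    for pt_id, name in payload_types:
        ET.SubElement(desc, f"{{{AUDIO_NS}}}payload-type", {"id": str(pt_id), "name": name})
    return iq
```

Calling ET.tostring() on the returned element yields the serialized stanza; ElementTree chooses its own namespace prefixes, so the output is equivalent to, but not character-for-character identical with, the example above.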
        

6.5.3 Checking Connectivity

Connectivity can be checked via STUN only "upfront" (before the session is negotiated) or also "inline" (during the life of the session, e.g., to determine if a connectivity method other than that chosen initially is now more efficient); it is also possible to perform no STUN connectivity checking at all (e.g., this may be true of older SIP implementations that do not support STUN). An implementation SHOULD support inline STUN checking but MAY support only upfront STUN checking or no STUN checking at all. An entity MUST disclose which method(s) of connectivity checking it supports in its response to service discovery information requests via the "http://jabber.org/protocol/jingle?stun=inline", "http://jabber.org/protocol/jingle?stun=none", and "http://jabber.org/protocol/jingle?stun=upfront" features.

If the target entity supports both upfront and inline STUN checking, the initiating entity MUST specify its desired mode in the session initiation request, either explicitly by including the 'stun' attribute with a value of "inline" or "upfront", or implicitly specifying inline checking by not including the 'stun' attribute (since the default value of the 'stun' attribute is "inline"). If the target entity supports only one of upfront or inline, then the initiating entity MUST specify that checking method in the session initiation request; however, if the only method supported by the target entity is not supported by the initiating entity, then the initiating entity MUST fall back to no STUN checking at all.

An intermediary such as a SIP-to-XMPP gateway MUST insert its supported STUN checking method or methods (both in service discovery responses and session initiation requests) on behalf of entities it represents on the XMPP network.
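
The selection rules above can be sketched as a small Python function. This is an illustration only: the function name choose_stun_mode and its parameters are hypothetical, and the behaviour when neither party shares a checking method (returning "none") is an interpretation of the fallback rule rather than something this specification spells out for every case.

```python
STUN_FEATURE_PREFIX = "http://jabber.org/protocol/jingle?stun="


def choose_stun_mode(target_features, initiator_modes, preferred="inline"):
    """Pick the STUN checking mode to request in a session initiation.

    target_features: service discovery feature strings advertised by the target.
    initiator_modes: modes ("inline", "upfront") that the initiator implements.
    preferred: the initiator's choice when more than one mode is usable.
    """
    target_modes = {
        feature[len(STUN_FEATURE_PREFIX):]
        for feature in target_features
        if feature.startswith(STUN_FEATURE_PREFIX)
    }
    # Only "inline" and "upfront" involve actual checking; "none" is the fallback.
    usable = target_modes & {"inline", "upfront"} & set(initiator_modes)
    if not usable:
        return "none"
    if preferred in usable:
        return preferred
    # The preferred mode is not usable; take the remaining usable mode.
    return sorted(usable)[0]
```

For example, if the target advertises only the "upfront" feature and the initiator implements only inline checking, the function returns "none".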

6.6 Acceptance

If, based on the connectivity checks, the target entity determines that it will be able to establish a connection, it sends a definitive acceptance to the initiating entity:

Example 25. Juliet Definitively Accepts the Call

<iq type='set' from='juliet@capulet.com/balcony' to='romeo@montague.net/orchard' id='accept1'>
  <jingle xmlns='http://jabber.org/protocol/jingle'
          action='accept' 
          initiator='romeo@montague.net/orchard'
          responder='juliet@capulet.com/balcony'
          sid='a73sjjvkla37jfea'/>
</iq>
    

The <jingle/> element in the accept stanza MAY contain a <description/> element that specifies the supported codecs (and other session description details) and SHOULD possess a 'responder' attribute that explicitly specifies the full JID of the responding entity. If the 'responder' attribute is provided, all future communications SHOULD be sent to the JID it specifies.
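
The handling of the 'responder' attribute can be sketched as follows. The function name peer_after_accept is hypothetical, and falling back to the stanza's 'from' address when no 'responder' attribute is present is an assumption, since this specification does not state what to do in that case.

```python
import xml.etree.ElementTree as ET

JINGLE_NS = "http://jabber.org/protocol/jingle"


def peer_after_accept(iq_xml: str) -> str:
    """Return the JID to which future communications for the session are sent."""
    iq = ET.fromstring(iq_xml)
    jingle = iq.find(f"{{{JINGLE_NS}}}jingle")
    if jingle is None or jingle.get("action") != "accept":
        raise ValueError("not a Jingle accept stanza")
    # Prefer the explicit 'responder' JID; otherwise (assumption) use 'from'.
    return jingle.get("responder") or iq.get("from")
```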

The initiating entity then acknowledges the target entity's definitive acceptance:

Example 26. Romeo Acknowledges Definitive Acceptance

<iq type='result' to='juliet@capulet.com/balcony' from='romeo@montague.net/orchard' id='accept1'/>
    

Now the initiating entity and target entity can begin sending media over the negotiated connection.

In the unlikely event that the target entity cannot find a suitable candidate transport, it SHOULD terminate the session as described below.

6.7 Termination

In order to gracefully end the session, either the target entity or the initiating entity MUST send a "terminate" action to the other party:

Example 27. Juliet Terminates the Session

<iq from='juliet@capulet.com/balcony' 
    id='term1' 
    to='romeo@montague.net/orchard' 
    type='set'>
  <jingle xmlns='http://jabber.org/protocol/jingle'
          action='terminate' 
          initiator='romeo@montague.net/orchard'
          sid='a73sjjvkla37jfea'/>
</iq>
    

The initiating entity then acknowledges termination of the session:

Example 28. Romeo Acknowledges Termination

<iq type='result' to='juliet@capulet.com/balcony' from='romeo@montague.net/orchard' id='term1'/>
    

Unfortunately, not all sessions end gracefully. The following events MUST be considered session-ending events, and any further communication for the session type MUST be completed through negotiation of a new session:

In particular, one party MUST consider the session to be in the ENDED state if it receives presence of type "unavailable" from the other party:

Example 29. Juliet Goes Offline

<presence from='juliet@capulet.com/balcony' to='romeo@montague.net/orchard' type='unavailable'/>
    

Naturally, in this case there is nothing for the initiating entity to acknowledge.
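
The two ways a session can end (a "terminate" action or unavailable presence from the other party) can be sketched as a small state holder. The Session class and the ACTIVE state name are hypothetical; only the ENDED state is taken from the text above.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "ACTIVE"  # hypothetical name for a session in progress
    ENDED = "ENDED"    # the state named in the text above


class Session:
    """Hypothetical per-session record kept by one party."""

    def __init__(self, sid: str, peer: str):
        self.sid = sid
        self.peer = peer  # full JID of the other party
        self.state = State.ACTIVE

    def on_jingle(self, sid: str, action: str) -> None:
        """Graceful end: the other party sent a "terminate" action."""
        if sid == self.sid and action == "terminate":
            self.state = State.ENDED

    def on_presence(self, from_jid: str, presence_type: str) -> None:
        """Ungraceful end: the other party's resource became unavailable."""
        if from_jid == self.peer and presence_type == "unavailable":
            self.state = State.ENDED
```

Once the state is ENDED, any further communication of that session type requires negotiating a new session with a new session ID.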

6.8 Informational Messages

At any point after initiation of a Jingle session, either entity MAY send an informational message to the other party, for example to inform the other party that a session initiation request is queued, that a device is ringing, or that a scheduled event has occurred or will occur. An informational message takes the form of an IQ-set containing a <jingle/> element whose 'action' attribute is set to a value of "info"; the <jingle/> element MUST further contain a "payload" child element that specifies the information being communicated. The syntax of this payload element is undefined in this signalling specification and shall be defined by the appropriate session type specification.

7. Error Handling

The following sections describe error handling related to Jingle signalling. For general information regarding XMPP error handling, refer to RFC 3920 and Error Condition Mappings [19].

7.1 Initiation-Related Error Conditions

There are several possible reasons why the target entity might specify an error of <not-acceptable/> in response to the session initiation request. The target entity SHOULD specify the precise reason for not accepting the session by including an appropriate error condition element qualified by the 'http://jabber.org/protocol/jingle#errors' namespace along with the <not-acceptable/> error. [20] The defined initiation-related error conditions are:

Table 2: Initiation-Related Error Conditions

Jingle Condition | XMPP Condition | Description
<busy/> | <not-acceptable/> | The target entity or principal is busy with other sessions or tasks and therefore is unable or unwilling to accept the session; this maps to SIP codes 486 and 600. [21]
<unsupported-media/> | <not-acceptable/> | The target entity does not support any of the payload-types (e.g., codecs) offered by the initiating entity; this maps to SIP code 415.
<unsupported-negotiation-mode/> | <not-acceptable/> | The target entity does not support the session negotiation mode (dribble or burst) specified in the session initiation request.
<unsupported-session-type/> | <not-acceptable/> | The target entity does not support the session type specified in the session initiation request.
<unsupported-stun-method/> | <not-acceptable/> | The target entity does not support the STUN testing method (inline, upfront, or none) specified in the session initiation request.
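
An error reply combining <not-acceptable/> with one of the Jingle-specific conditions could be built as in the following sketch. The function name build_initiation_error is hypothetical. The error type of "modify" follows the type that RFC 3920 associates with <not-acceptable/>; this specification does not itself state which error type to use, so treat that value as an assumption.

```python
import xml.etree.ElementTree as ET

STANZA_ERR_NS = "urn:ietf:params:xml:ns:xmpp-stanzas"
JINGLE_ERR_NS = "http://jabber.org/protocol/jingle#errors"

# Jingle conditions that accompany <not-acceptable/> on session initiation.
INITIATION_CONDITIONS = {
    "busy",
    "unsupported-media",
    "unsupported-negotiation-mode",
    "unsupported-session-type",
    "unsupported-stun-method",
}


def build_initiation_error(request_iq, condition):
    """Return an IQ-error that declines the given session initiation request."""
    if condition not in INITIATION_CONDITIONS:
        raise ValueError(f"not an initiation-related condition: {condition}")
    reply = ET.Element("iq", {
        "type": "error",
        "id": request_iq.get("id"),
        "to": request_iq.get("from"),    # swap the addresses of the request
        "from": request_iq.get("to"),
    })
    error = ET.SubElement(reply, "error", {"type": "modify"})  # assumed error type
    ET.SubElement(error, f"{{{STANZA_ERR_NS}}}not-acceptable")
    ET.SubElement(error, f"{{{JINGLE_ERR_NS}}}{condition}")
    return reply
```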

7.2 Other Error Conditions

At any point during a session, one of the entities may send a faulty signalling stanza to the other party. The error conditions defined for such cases are:

Table 3: Other Error Conditions

Jingle Condition | XMPP Condition | Description
<out-of-order/> | <unexpected-request/> | The request cannot be processed at this point in the state machine (e.g., initiate after accept).
<unknown-session/> | <bad-request/> | The 'sid' attribute specifies a session that is unknown to the recipient.
<unsupported-action/> | <feature-not-implemented/> | The 'action' attribute specifies an action that is not supported by the recipient.
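
A recipient might classify a faulty signalling stanza along the following lines. This is a simplified sketch: the function name check_request and its parameters are hypothetical, and the only out-of-order case it detects is a repeated "initiate" for a session that already exists, whereas a full implementation would consult its complete state machine.

```python
def check_request(action, sid, known_sids, supported_actions):
    """Return (jingle_condition, xmpp_condition) for a faulty request, else None."""
    if action not in supported_actions:
        return ("unsupported-action", "feature-not-implemented")
    if action == "initiate":
        # An initiate for a session that already exists arrives too late.
        if sid in known_sids:
            return ("out-of-order", "unexpected-request")
        return None
    if sid not in known_sids:
        return ("unknown-session", "bad-request")
    return None
```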

8. Security Considerations

Note: This section is not yet complete.

8.1 Denial of Service

Media sessions are resource-intensive. Therefore, it is possible to launch a denial-of-service attack against an entity by burdening it with too many media sessions. Care must be taken to accept media sessions only from known entities.
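
One way to apply this advice is an admission check run before a session initiation request is provisionally accepted. The sketch below is an assumption-laden illustration: the function name admit_session, the use of a set of known bare JIDs (for instance, taken from the roster), the session limit, and the returned labels are all hypothetical. Reporting an over-limit condition as "busy" follows the <busy/> condition in Table 2, which deliberately does not reveal that the refusal is due to resource constraints.

```python
def admit_session(initiator_jid, known_bare_jids, active_sessions, max_sessions=2):
    """Decide whether to consider a session initiation request.

    Returns "accept", "reject-unknown", or "reject-busy" (hypothetical labels).
    """
    # Strip the resource identifier to compare against known bare JIDs.
    bare_jid = initiator_jid.split("/", 1)[0]
    if bare_jid not in known_bare_jids:
        return "reject-unknown"
    if active_sessions >= max_sessions:
        return "reject-busy"
    return "accept"
```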

8.2 Communication Through Gateways

Jingle communications may be enabled through gateways to non-XMPP networks, whose security characteristics may be quite different from those of XMPP networks. (For example, on some SIP networks authentication is optional and "from" addresses can be easily forged.) Care must be taken in communicating through such gateways.

9. IANA Considerations

This JEP requires no interaction with the Internet Assigned Numbers Authority (IANA) [22].

10. Jabber Registrar Considerations

10.1 Protocol Namespaces

The Jabber Registrar [23] shall include 'http://jabber.org/protocol/jingle' in its registry of protocol namespaces.

10.2 Service Discovery Features

The Jabber Registrar shall include the following values in its registry of service discovery features.

10.3 Jingle Session Types Registry

The Jabber Registrar shall maintain a registry of Jingle session types. All session type registrations shall be defined in separate documents (not in this JEP). Session types defined within the JEP series MUST be registered with the Jabber Registrar, resulting in protocol URIs of the form "http://jabber.org/protocol/jingle/session/name" (where "name" is the registered name of the session type).

10.3.1 Registration Process

In order to submit new values to this registry, the registrant must define an XML fragment of the following form and either include it in the relevant Jabber Enhancement Proposal or send it to the email address <registrar@jabber.org>:

<session>
  <name>the name of the session type (e.g., "audio")</name>
  <desc>a natural-language description of the session type</desc>
  <doc>the document in which this session type is specified</doc>
</session>
      

11. XML Schemas

11.1 Signalling

<?xml version='1.0' encoding='UTF-8'?>

<xs:schema
    xmlns:xs='http://www.w3.org/2001/XMLSchema'
    targetNamespace='http://jabber.org/protocol/jingle'
    xmlns='http://jabber.org/protocol/jingle'
    elementFormDefault='qualified'>

  <xs:element name='jingle'>
    <xs:complexType>
      <xs:choice>
        <xs:sequence>
          <xs:element ref='candidate' minOccurs='0' maxOccurs='unbounded'/>
          <xs:any namespace='##other' minOccurs='0' maxOccurs='1'/>
        </xs:sequence>
        <xs:element name='redirect' type='xs:anyURI'/>
      </xs:choice>
      <xs:attribute name='action' use='required'>
        <xs:simpleType>
          <xs:restriction base='xs:NCName'>
            <xs:enumeration value='accept'/>
            <xs:enumeration value='info'/>
            <xs:enumeration value='initiate'/>
            <xs:enumeration value='join'/>
            <xs:enumeration value='modify'/>
            <xs:enumeration value='negotiate'/>
            <xs:enumeration value='redirect'/>
            <xs:enumeration value='replace'/>
            <xs:enumeration value='terminate'/>
            <xs:enumeration value='transfer'/>
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
      <xs:attribute name='initiator' type='xs:string' use='required'/>
      <xs:attribute name='mode' use='optional' default='dribble'>
        <xs:simpleType>
          <xs:restriction base='xs:NCName'>
            <xs:enumeration value='burst'/>
            <xs:enumeration value='dribble'/>
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
      <xs:attribute name='responder' type='xs:string' use='optional'/>
      <xs:attribute name='sid' type='xs:NMTOKEN' use='required'/>
    </xs:complexType>
  </xs:element>

  <xs:element name='candidate'>
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base='empty'>
          <xs:attribute name='generation' type='xs:unsignedByte' use='required'/>
          <xs:attribute name='ip' type='xs:string' use='required'/>
          <xs:attribute name='name' type='xs:string' use='optional'/>
          <xs:attribute name='network' type='xs:unsignedByte' use='required'/>
          <xs:attribute name='password' type='xs:string' use='required'/>
          <xs:attribute name='port' type='xs:unsignedShort' use='required'/>
          <xs:attribute name='preference' type='xs:decimal' use='required'/>
          <xs:attribute name='protocol' use='required'>
            <xs:simpleType>
              <xs:restriction base='xs:NCName'>
                <xs:enumeration value='ssltcp'/>
                <xs:enumeration value='tcp'/>
                <xs:enumeration value='udp'/>
              </xs:restriction>
            </xs:simpleType>
          </xs:attribute>
          <xs:attribute name='type' use='required'>
            <xs:simpleType>
              <xs:restriction base='xs:NCName'>
                <xs:enumeration value='local'/>
                <xs:enumeration value='relay'/>
                <xs:enumeration value='stun'/>
              </xs:restriction>
            </xs:simpleType>
          </xs:attribute>
          <xs:attribute name='username' type='xs:string' use='required'/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>

  <xs:simpleType name='empty'>
    <xs:restriction base='xs:string'>
      <xs:enumeration value=''/>
    </xs:restriction>
  </xs:simpleType>

</xs:schema>
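
Python's standard library does not include an XML Schema validator, so an implementation without one might hand-check the attributes of a <candidate/> element against the constraints above. The following sketch is illustrative only: the function name validate_candidate is hypothetical, the 'preference' attribute is not range-checked, and the port check assumes the usual 0 to 65535 range of UDP and TCP port numbers.

```python
REQUIRED = ("generation", "ip", "network", "password", "port",
            "preference", "protocol", "type", "username")
PROTOCOLS = {"ssltcp", "tcp", "udp"}
TYPES = {"local", "relay", "stun"}


def validate_candidate(attrs):
    """Return a list of problems found in a candidate's attributes (all strings)."""
    problems = [f"missing '{name}'" for name in REQUIRED if name not in attrs]
    if "protocol" in attrs and attrs["protocol"] not in PROTOCOLS:
        problems.append("bad 'protocol'")
    if "type" in attrs and attrs["type"] not in TYPES:
        problems.append("bad 'type'")
    # Unsigned integer attributes and their assumed upper bounds.
    for name, upper in (("generation", 255), ("network", 255), ("port", 65535)):
        value = attrs.get(name)
        if value is None:
            continue
        if not (value.isascii() and value.isdigit() and int(value) <= upper):
            problems.append(f"bad '{name}'")
    return problems
```

An empty list means that no problem was detected by these particular checks; it does not prove full schema validity.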
    

11.2 Errors

<?xml version='1.0' encoding='UTF-8'?>

<xs:schema
    xmlns:xs='http://www.w3.org/2001/XMLSchema'
    targetNamespace='http://jabber.org/protocol/jingle#errors'
    xmlns='http://jabber.org/protocol/jingle#errors'
    elementFormDefault='qualified'>

  <xs:element name='busy' type='empty'/>
  <xs:element name='out-of-order' type='empty'/>
  <xs:element name='unknown-session' type='empty'/>
  <xs:element name='unsupported-action' type='empty'/>
  <xs:element name='unsupported-media' type='empty'/>
  <xs:element name='unsupported-negotiation-mode' type='empty'/>
  <xs:element name='unsupported-session-type' type='empty'/>
  <xs:element name='unsupported-stun-method' type='empty'/>

  <xs:simpleType name='empty'>
    <xs:restriction base='xs:string'>
      <xs:enumeration value=''/>
    </xs:restriction>
  </xs:simpleType>

</xs:schema>
    

12. Open Issues

The open issues include:

13. Acknowledgements

The authors would like to thank Rohan Mahy for his helpful feedback on this document. Thanks also to those who have commented on the Standards JIG [24] and (earlier) Jingle [25] mailing lists.


Notes

1. JEP-0111: A Transport for Initiating and Negotiating Sessions (TINS) <http://www.jabber.org/jeps/jep-0111.html>.

2. It is true that TINS made it relatively easy to implement an XMPP-to-SIP gateway; however, in line with the long-time Jabber philosophy of "simple clients, complex servers", it would be better to force complexity onto the server-side gateway and to keep the client as simple as possible.

3. For example, one large ISP recently decided to switch to a pure XMPP approach after having implemented and deployed a dual-stack client for several years.

4. Possible other session types include file sharing, application casting, application sharing, whiteboarding, torrent broadcasting, shared real-time editing, and distributed musical performance, to name but a few.

5. RFC 2327: SDP: Session Description Protocol <http://www.ietf.org/rfc/rfc2327.txt>.

6. RFC 3261: Session Initiation Protocol (SIP) <http://www.ietf.org/rfc/rfc3261.txt>.

7. ITU Recommendation H.323: Packet-based Multimedia Communications Systems (September 1999).

8. RFC 3489: STUN - Simple Traversal of User Datagram Protocol (UDP) Through Network Address Translators (NATs) <http://www.ietf.org/rfc/rfc3489.txt>.

9. Interactive Connectivity Establishment (ICE): A Methodology for Network Address Translator (NAT) Traversal for Offer/Answer Protocols <http://www.ietf.org/internet-drafts/draft-ietf-mmusic-ice-06.txt>. Work in progress.

10. JEP-0030: Service Discovery <http://www.jabber.org/jeps/jep-0030.html>.

11. JEP-0115: Entity Capabilities <http://www.jabber.org/jeps/jep-0115.html>.

12. JEP-0168: Resource Application Priority <http://www.jabber.org/jeps/jep-0168.html>.

13. JEP-0167: Jingle Audio <http://www.jabber.org/jeps/jep-0167.html>.

14. See <http://www.w3.org/TR/2000/WD-xml-2e-20000814#NT-Nmtoken>.

15. Internationalized Resource Identifiers (IRIs) and Uniform Resource Identifiers (URIs) for the Extensible Messaging and Presence Protocol (XMPP) <http://www.ietf.org/internet-drafts/draft-saintandre-xmpp-iri-03.txt> (work in progress).

16. RFC 3489: STUN - Simple Traversal of User Datagram Protocol (UDP) Through Network Address Translators (NATs) <http://www.ietf.org/rfc/rfc3489.txt>.

17. Because certain candidates may be more "expensive" in terms of bandwidth or processing power, the initiator may not want to advertise their existence unless necessary.

18. RFC 3264: An Offer/Answer Model with the Session Description Protocol (SDP) <http://www.ietf.org/rfc/rfc3264.txt>.

19. JEP-0086: Error Condition Mappings <http://www.jabber.org/jeps/jep-0086.html>.

20. While it may seem that some of these Jingle-specific conditions would be more appropriate in conjunction with other XMPP conditions, such as <feature-not-implemented/>, the <not-acceptable/> XMPP condition is used for all cases of declining a session initiation request in order to simplify handling of session initiation.

21. We deliberately do not specify a separate error of "resource-constrained" in order to help prevent denial of service attacks.

22. The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique parameter values for Internet protocols, such as port numbers and URI schemes. For further information, see <http://www.iana.org/>.

23. The Jabber Registrar maintains a list of reserved Jabber protocol namespaces as well as registries of parameters used in the context of protocols approved by the Jabber Software Foundation. For further information, see <http://www.jabber.org/registrar/>.

24. The Standards JIG is a standing Jabber Interest Group devoted to discussion of Jabber Enhancement Proposals. The discussion list of the Standards JIG is the primary venue for discussion of Jabber protocol development, as well as for announcements by the JEP Editor and Jabber Registrar. To subscribe to the list or view the list archives, visit <http://mail.jabber.org/mailman/listinfo/standards-jig/>.

25. Before this specification was accepted as a Jabber Enhancement Proposal, it was discussed on the semi-private <jingle@jabber.org> mailing list; although that list is no longer used (the Standards-JIG list is the preferred discussion venue), for historical purposes it is publicly archived at <http://mail.jabber.org/pipermail/jingle/>.


Revision History

Version 0.1 (2005-12-15)

Initial JEP version. (psa)

Version 0.0.10 (2005-12-11)

More fully documented burst mode, connectivity checks, error cases, etc. (psa)

Version 0.0.9 (2005-12-08)

Restructured document flow; provided example of burst mode. (psa)

Version 0.0.8 (2005-12-05)

Distinguished between dribble mode and burst mode, including mode attribute, service discovery features, and implementation notes; provided detailed resource discovery examples; corrected state chart; specified session termination; specified error conditions; specified semantics of informational messages; began to define security considerations; added Joe Beda as co-author. (psa/sl/jb)

Version 0.0.7 (2005-11-08)

Added more detail to basic session flow; harmonized candidate negotiation process with ICE. (psa)

Version 0.0.6 (2005-10-27)

Added Jabber Registrar considerations; defined schema; completed slight syntax cleanup. (psa)

Version 0.0.5 (2005-10-21)

Separated description formats from signalling protocol. (psa/sl)

Version 0.0.4 (2005-10-19)

Harmonized basic session flow with Google Talk protocol; added Scott Ludwig as co-author. (psa/sl)

Version 0.0.3 (2005-10-10)

Added more detail to basic session flow. (psa)

Version 0.0.2 (2005-10-07)

Protocol cleanup. (psa/jjh)

Version 0.0.1 (2005-10-06)

First draft. (psa/jjh)

