XEP-0166: Jingle

This document defines a framework for initiating and managing peer-to-peer multimedia sessions (e.g., voice and video chat) between two Jabber/XMPP endpoints in a way that is interoperable with existing Internet standards.


WARNING: This Standards-Track document is Experimental. Publication as an XMPP Extension Protocol does not imply approval of this proposal by the XMPP Standards Foundation. Implementation of the protocol described herein is encouraged in exploratory implementations, but production systems should not deploy implementations of this protocol until it advances to a status of Draft.


Document Information

Series: XEP
Number: 0166
Publisher: XMPP Standards Foundation
Status: Experimental
Type: Standards Track
Version: 0.18
Last Updated: 2007-11-08
Approving Body: XMPP Council
Dependencies: XMPP Core
Supersedes: None
Superseded By: None
Short Name: TO BE ASSIGNED
Wiki Page: <http://wiki.jabber.org/index.php/Jingle (XEP-0166)>

Author Information

Scott Ludwig

Email: scottlu@google.com
JabberID: scottlu@google.com

Joe Beda

Email: jbeda@google.com
JabberID: jbeda@google.com

Peter Saint-Andre

JabberID: stpeter@jabber.org
URI: https://stpeter.im/

Robert McQueen

Email: robert.mcqueen@collabora.co.uk
JabberID: robert.mcqueen@collabora.co.uk

Sean Egan

Email: seanegan@google.com
JabberID: seanegan@google.com

Joe Hildebrand

Email: jhildebrand@jabber.com
JabberID: hildjj@jabber.org

Legal Notices

IPR Conformance

This XMPP Extension Protocol has been contributed in full conformance with the XSF's Intellectual Property Rights Policy (a copy of which may be found at <http://www.xmpp.org/extensions/ipr-policy.shtml> or obtained by writing to XSF, P.O. Box 1641, Denver, CO 80201 USA).

Copyright

This XMPP Extension Protocol is copyright (c) 1999 - 2007 by the XMPP Standards Foundation (XSF).

Permissions

This material may be distributed only subject to the terms and conditions set forth in the Creative Commons Attribution License (<http://creativecommons.org/licenses/by/2.5/>).

Discussion Venue

The preferred venue for discussion of this document is the Standards discussion list: <http://mail.jabber.org/mailman/listinfo/standards>.

Relation to XMPP

The Extensible Messaging and Presence Protocol (XMPP) is defined in the XMPP Core (RFC 3920) and XMPP IM (RFC 3921) specifications contributed by the XMPP Standards Foundation to the Internet Standards Process, which is managed by the Internet Engineering Task Force in accordance with RFC 2026. Any protocol defined in this document has been developed outside the Internet Standards Process and is to be understood as an extension to XMPP rather than as an evolution, development, or modification of XMPP itself.

Conformance Terms

The following keywords as used in this document are to be interpreted as described in RFC 2119: "MUST", "SHALL", "REQUIRED"; "MUST NOT", "SHALL NOT"; "SHOULD", "RECOMMENDED"; "SHOULD NOT", "NOT RECOMMENDED"; "MAY", "OPTIONAL".


Table of Contents


1. Introduction
2. How It Works
3. Requirements
4. Glossary
5. Concepts and Approach
    5.1. Overall Session Management
6. Session Flow
    6.1. Resource Determination
    6.2. Initiation
    6.3. Receiver Response
    6.4. Decline
    6.5. Negotiation
    6.6. Acceptance
    6.7. Modifying an Active Session
    6.8. Termination
    6.9. Informational Messages
7. Scenarios
    7.1. Jingle Audio via RTP/AVP, Negotiated with ICE-UDP
    7.2. Jingle Audio and Video via RTP/AVP, Negotiated with ICE-UDP
    7.3. Secure Jingle Audio via UDP/TLS/RTP/AVP, Negotiated with ICE-UDP
8. Formal Definition
    8.1. Jingle Element
    8.2. Content Element
9. Error Handling
10. Determining Support
11. Conformance by Using Protocols
    11.1. Application Types
    11.2. Transport Methods
12. Security Considerations
    12.1. Denial of Service
    12.2. Communication Through Gateways
13. IANA Considerations
14. XMPP Registrar Considerations
    14.1. Protocol Namespaces
    14.2. Jingle Content Description Formats Registry
    14.3. Jingle Content Transport Methods Registry
    14.4. Jingle Reason Codes Registry
       14.4.1. Process
       14.4.2. Initial Registration
15. XML Schemas
    15.1. Jingle
    15.2. Jingle Errors
16. History
17. Acknowledgements
Notes
Revision History


1. Introduction

The purpose of Jingle is to enable one-to-one, peer-to-peer media sessions between XMPP entities, with the negotiation occurring over XMPP and the media being exchanged outside the XMPP band using technologies such as the Real-time Transport Protocol (RTP; RFC 3550 [1]), the User Datagram Protocol (UDP; RFC 768 [2]), and Interactive Connectivity Establishment (ICE) [3].

One target application for Jingle is simple voice chat (see Jingle Audio via RTP [4]). We stress the word "simple". The purpose of Jingle is not to build a full-fledged telephony application that supports call waiting, call forwarding, call transfer, hold music, IVR systems, find-me-follow-me functionality, conference calls, and the like. These features are of interest to some user populations, but building in support for these features would introduce unnecessary complexity into a technology that is designed for basic multimedia interaction.

The purpose of Jingle is not to supplant or replace technologies based on Session Initiation Protocol (SIP; RFC 3261 [5]). Because dual-stack XMPP+SIP clients are difficult to build, Jingle was designed as a pure XMPP signalling protocol. However, Jingle is at the same time designed to interwork with SIP so that the millions of deployed XMPP clients can be added onto existing Voice over Internet Protocol (VoIP) networks, rather than limiting XMPP users to a separate and distinct network.

Jingle is designed in a modular way so that developers can easily add support for multimedia session types other than voice chat, such as video chat (see Jingle Video via RTP [6]), application sharing, file sharing, collaborative editing, whiteboarding, and torrent broadcasting. The transport methods are also modular, so that Jingle implementations can use any appropriate media transport (including proprietary methods not standardized through the XMPP Standards Foundation).

2. How It Works

This section provides a friendly introduction to Jingle.

In essence, Jingle enables two XMPP entities (e.g., romeo@montague.lit and juliet@capulet.lit) to set up, manage, and tear down a multimedia session. The negotiation takes place over XMPP, and the media transfer takes place outside of XMPP. The simplest session flow is as follows:

Romeo                         Juliet
  |                             |
  |   session-initiate          |
  |---------------------------->|
  |   ack                       |
  |<----------------------------|
  |   session-accept            |
  |<----------------------------|
  |   ack                       |
  |---------------------------->|
  |   MEDIA SESSION (RTP)       |
  |<===========================>|
  |   session-terminate         |
  |<----------------------------|
  |   ack                       |
  |---------------------------->|
  |                             |
  

Naturally, more complex scenarios are possible (indeed, likely).

The simplest flow might happen as follows. The example is that of a voice chat (see XEP-0167) initiated by Romeo, where the transport is Jingle Raw UDP Transport Method [7].

Example 1. Initiator sends session-initiate

<iq from='romeo@montague.lit/orchard' to='juliet@capulet.lit/balcony' id='jingle1' type='set'>
  <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns'
          action='session-initiate'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='this-is-the-audio-content' profile='RTP/AVP'>
      <description xmlns='http://www.xmpp.org/extensions/xep-0167.html#ns'>
        <payload-type id='96' name='speex' clockrate='16000'/>
        <payload-type id='97' name='speex' clockrate='8000'/>
        <payload-type id='18' name='G729'/>
        <payload-type id='103' name='L16' clockrate='16000' channels='2'/>
        <payload-type id='98' name='x-ISAC' clockrate='8000'/>
      </description>
      <transport xmlns='http://www.xmpp.org/extensions/xep-0177.html#ns'>
        <candidate ip='10.1.1.104' port='13540' generation='0'/>
      </transport>
    </content>
  </jingle>
</iq>
  

Example 2. Receiver sends provisional acceptance

<iq from='juliet@capulet.lit/balcony'
    id='jingle1'
    to='romeo@montague.lit/orchard'
    type='result'/>
  

Example 3. Receiver sends session-accept

<iq type='set' from='juliet@capulet.lit/balcony' to='romeo@montague.lit/orchard' id='accept1'>
  <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns'
          action='session-accept'
          initiator='romeo@montague.lit/orchard'
          responder='juliet@capulet.lit/balcony'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='this-is-the-audio-content' profile='RTP/AVP'>
      <description xmlns='http://www.xmpp.org/extensions/xep-0167.html#ns'>
        <payload-type id='97' name='speex' clockrate='8000'/>
        <payload-type id='18' name='G729'/>
        <payload-type id='0' name='PCMU' />
        <payload-type id='102' name='iLBC'/>
        <payload-type id='4' name='G723'/>
        <payload-type id='8' name='PCMA'/>
        <payload-type id='13' name='CN'/>
      </description>
      <transport xmlns='http://www.xmpp.org/extensions/xep-0177.html#ns'>
        <candidate ip='208.245.212.67' port='9876' generation='0'/>
      </transport>
    </content>
  </jingle>
</iq>
  

If the foregoing payload types and transport candidates can be successfully used, then the parties would begin to exchange media. In this case they would exchange audio using the Speex codec at a clockrate of 8000 since that is the highest-priority codec for the responder (as determined by the XML order of the <payload-type/> children).
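
The selection rule described above can be sketched in Python. This is a minimal illustration, not part of the protocol: the helper names `payload_types` and `select_payload_type` are hypothetical, and matching payload types by name and clockrate is a simplifying assumption (the actual matching rules belong to the content description specification). The responder's XML order is treated as its priority order, as stated in the text.

```python
import xml.etree.ElementTree as ET

NS_AUDIO = "http://www.xmpp.org/extensions/xep-0167.html#ns"

def payload_types(description_xml):
    """Return (id, name, clockrate) tuples in XML document order."""
    root = ET.fromstring(description_xml)
    return [
        (pt.get("id"), pt.get("name"), pt.get("clockrate"))
        for pt in root.findall("{%s}payload-type" % NS_AUDIO)
    ]

def select_payload_type(offer_xml, answer_xml):
    """Pick the first payload type in the responder's order that was also offered.

    Assumption: two payload types match when name and clockrate are equal.
    """
    offered = {(name, rate) for _id, name, rate in payload_types(offer_xml)}
    for pt in payload_types(answer_xml):
        if (pt[1], pt[2]) in offered:
            return pt
    return None

# Payload types copied from Example 1 (initiator) and Example 3 (responder).
OFFER = """<description xmlns='http://www.xmpp.org/extensions/xep-0167.html#ns'>
  <payload-type id='96' name='speex' clockrate='16000'/>
  <payload-type id='97' name='speex' clockrate='8000'/>
  <payload-type id='18' name='G729'/>
  <payload-type id='103' name='L16' clockrate='16000' channels='2'/>
  <payload-type id='98' name='x-ISAC' clockrate='8000'/>
</description>"""

ANSWER = """<description xmlns='http://www.xmpp.org/extensions/xep-0167.html#ns'>
  <payload-type id='97' name='speex' clockrate='8000'/>
  <payload-type id='18' name='G729'/>
  <payload-type id='0' name='PCMU'/>
  <payload-type id='102' name='iLBC'/>
  <payload-type id='4' name='G723'/>
  <payload-type id='8' name='PCMA'/>
  <payload-type id='13' name='CN'/>
</description>"""

print(select_payload_type(OFFER, ANSWER))  # ('97', 'speex', '8000')
```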

The parties may continue the session as long as desired.

Eventually, one of the parties terminates the session.

Example 4. Receiver terminates the session

<iq from='juliet@capulet.lit/balcony'
    id='term1'
    to='romeo@montague.lit/orchard'
    type='set'>
  <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns'
          action='session-terminate'
          initiator='romeo@montague.lit/orchard'
          reason='Sorry, gotta go!'
          sid='a73sjjvkla37jfea'/>
</iq>
  

The other party MUST then acknowledge termination of the session:

Example 5. Initiator Acknowledges Termination

<iq from='romeo@montague.lit/orchard' 
    id='term1'
    to='juliet@capulet.lit/balcony' 
    type='result'/>
  

3. Requirements

The protocol defined herein is designed to meet the following requirements:

  1. Make it possible to manage a wide variety of peer-to-peer sessions (not limited to voice and video) within XMPP.
  2. Clearly separate the signalling channel (XMPP) from the data channel.
  3. Clearly separate the content description formats (e.g., for voice chat) from the content transport methods.
  4. Make it possible to add, modify, and remove content types from an existing session.
  5. Make it relatively easy to implement support for the protocol in standard Jabber/XMPP clients.
  6. Where communication with non-XMPP entities is needed, push as much complexity as possible onto server-side gateways between the XMPP network and the non-XMPP network.

This document defines the signalling protocol only. Content description formats and content transport methods are specified in additional documents.

4. Glossary

Table 1: Glossary

Term: Definition
Session: A number of pairs of negotiated content transport methods and content description formats connecting two entities. It is delimited in time by an initiate request and session ending events. During the lifetime of a session, pairs of content descriptions and content transport methods can be added or removed. A session consists of at least one active negotiated content type at a time.
Content Type: The combination of one content description and one content transport method.
Content Description: The format of the content type being established, which formally declares one purpose of the session (e.g., "voice" or "video"). This is the 'what' of the session (i.e., the bits to be transferred), such as the acceptable codecs when establishing a voice conversation. In Jingle XML syntax the content description is the namespace of the <description/> element.
Transport Method: The method for establishing data stream(s) between entities. Possible transports might include ICE-TCP, Raw UDP, inband data, etc. This is the 'how' of the session. In Jingle XML syntax this is the namespace of the <transport/> element. The content transport method defines how to transfer bits from one host to another. Each transport method must specify whether it is lossy (thus suitable for applications where some packet loss is tolerable) or reliable (thus suitable for applications where packet loss is not tolerable).
Component: A numbered stream of data which needs to be transmitted between the endpoints for a given content type in the context of a given session. It is up to the transport to negotiate the details of each component. Depending on the content type and the content description, one content description may require multiple components to be communicated (e.g., the audio content type might use two components: one to transmit an RTP stream and another to transmit RTCP timing information).

5. Concepts and Approach

Jingle consists of three parts, each with its own syntax, semantics, and state machine:

  1. Overall session management
  2. Content description formats (the "what")
  3. Content transport methods (the "how")

This document defines the semantics and syntax for overall session management. It also provides pluggable "slots" for content description formats and content transport methods, which are specified in separate documents; however, for the sake of completeness, this document also includes examples for all of the actions related to description formats and transport methods.

At the most basic level, the process for negotiating a Jingle session is as follows:

  1. One user (the "initiator") sends to another user (the "receiver") a session request that includes at least one content type.
  2. If the receiver wants to proceed, it provisionally accepts the request by sending an IQ result.
  3. Both the initiator and receiver start exchanging possible transport candidates as quickly as possible (these are sent in quick succession before further negotiation in order to minimize the time required before media data can flow).
  4. These candidates are checked for connectivity.
  5. As soon as the receiver finds a candidate over which media can flow, the receiver sends to the initiator a "session-accept" action.
  6. The parties start sending media over the negotiated candidate.

If the parties later discover a better candidate, they perform a "content-modify" negotiation and then switch to the better candidate. Naturally they can also modify various other parameters related to the session (e.g., adding video to a voice chat).

5.1 Overall Session Management

The state machine for overall session management (i.e., the state per Session ID) is as follows:

         o
         |
         | session-initiate
         |
         | +-----------------------+
         |/                        |
PENDING  o---------------------+   |
         |  | content-accept,  |   |
         |  | content-modify,  |   |
         |  | content-remove,  |   |
         |  | session-info,    |   |
         |  | transport-info   |   |
         |  +------------------+   |
         |                         |
         | session-accept          |
         |                         |
 ACTIVE  o---------------------+   |
         |  | content-accept,  |   |
         |  | content-add,     |   |
         |  | content-modify,  |   |
         |  | content-remove,  |   |
         |  | session-info,    |   |
         |  | transport-info   |   |
         |  +------------------+   |
         |                         |
         +-------------------------+
                                   |
                                   | session-terminate
                                   |
                                   o ENDED
    

There are three overall session states:

  1. PENDING
  2. ACTIVE
  3. ENDED
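
The state diagram above can be expressed as a small transition function. The following Python sketch is illustrative only: the function name `next_state` and the use of `None` to mean "no session exists yet" are assumptions made for this example, and the sketch models only which actions are shown for each state in the diagram.

```python
PENDING, ACTIVE, ENDED = "PENDING", "ACTIVE", "ENDED"

# Actions that loop back to the same state, as drawn in the diagram.
_LOOP_IN_PENDING = {
    "content-accept", "content-modify", "content-remove",
    "session-info", "transport-info",
}
# content-add appears only in the ACTIVE loop.
_LOOP_IN_ACTIVE = _LOOP_IN_PENDING | {"content-add"}

def next_state(state, action):
    """Return the state that follows `action`, or raise ValueError.

    `state` is None before any session-initiate has been processed.
    """
    if state is None and action == "session-initiate":
        return PENDING
    if state == PENDING:
        if action == "session-accept":
            return ACTIVE
        if action == "session-terminate":
            return ENDED
        if action in _LOOP_IN_PENDING:
            return PENDING
    if state == ACTIVE:
        if action == "session-terminate":
            return ENDED
        if action in _LOOP_IN_ACTIVE:
            return ACTIVE
    raise ValueError("action %r is not allowed in state %r" % (action, state))

# Walk through the simplest flow from the How It Works section.
state = None
for action in ("session-initiate", "session-accept", "session-terminate"):
    state = next_state(state, action)
print(state)  # ENDED
```

Keeping the transition table in one place makes it straightforward to reject out-of-order actions, such as a content-add received while the session is still PENDING.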

The actions related to management of the overall Jingle session are as follows:

Table 2: Jingle Actions

Action: Description
content-accept: Accept a content-add or content-remove action received from another party.
content-add: Add one or more new content types to the session. The sender MUST specify only the added content-type(s), not the added content-type(s) plus the existing content-type(s). Therefore it is the responsibility of the recipient to maintain a local copy of the content definition. This action MUST NOT be sent while the session is in the PENDING state. When a party sends a content-add, it MUST ignore any actions received from the other party until it receives acknowledgement of the content-add. [11]
content-modify: Change an existing content type. The sender SHOULD specify only the aspects for which a modification is desired (e.g., if the sender wishes to change only the profile then it would send an empty <content/> element with a modified value for the 'profile' attribute; if the sender wishes to change only the transport, then it would send a <content/> element that contains only a <transport/> child; etc.). Therefore it is the responsibility of the recipient to maintain a local copy of the content definition. The recipient MUST NOT reply to a content-modify action with another content-modify action. [12]
content-remove: Remove one or more content types from the session. The sender MUST specify only the removed content-type(s), not the removed content-type(s) plus the remaining content-type(s). Therefore it is the responsibility of the recipient to maintain a local copy of the content definition. [13] [14]
session-accept: Definitively accept a session negotiation (implicitly this action also serves as a content-accept).
session-info: Send session-level information / messages, such as (for Jingle audio) a ringing message.
session-initiate: Request negotiation of a new Jingle session.
session-terminate: End an existing session.
transport-info: Exchange transport candidates; it is mainly used in XEP-0176 but may be used in other transport specifications.

6. Session Flow

This section defines the high-level flow of a Jingle session. More detailed descriptions are provided in the Scenarios section of this document.

6.1 Resource Determination

In order to initiate a Jingle session, the initiator must determine which of the receiver's XMPP resources is best for the desired content description format. There are several possible scenarios:

  1. If the intended responder shares presence with the initiator (see XMPP IM [15]) and has only one available resource, this task SHOULD be completed using Service Discovery [16] or the presence-based profile of service discovery specified in Entity Capabilities [17]. [18]

  2. If the intended responder shares presence with the initiator and has more than one available resource but only one of the resources supports Jingle and the desired content description format, the initiator SHOULD initiate the Jingle signalling with that resource.

  3. If the intended responder shares presence with the initiator and more than one of its available resources supports Jingle and the desired content description format, the initiator SHOULD use Resource Application Priority [19] in order to determine which is the best resource with which to initiate the desired Jingle session.

  4. If the intended responder does not share presence with the initiator, the initiator SHOULD first send a Stanza Session Negotiation [20] request to the responder in order to initiate the exchange of XMPP stanzas. The request SHOULD include a RAP routing hint as specified in XEP-0168 and the <message/> stanza containing the request SHOULD be of type "headline" so that (typically) it is not stored offline for later delivery.
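
The four scenarios above amount to a small decision procedure, sketched here in Python. Everything named in this sketch is hypothetical: the function `choose_resource`, the returned labels, and the input format (a mapping from full JID to a boolean saying whether service discovery or entity capabilities showed support for Jingle and the desired content description format). The Resource Application Priority step itself is not modeled; the sketch only reports that it is needed.

```python
def choose_resource(shares_presence, resources):
    """Decide how to proceed with resource determination.

    `resources` maps each available full JID to True if that resource
    supports Jingle and the desired content description format.
    Returns a (next_step, detail) tuple; the step labels are illustrative.
    """
    if not shares_presence:
        # Scenario 4: no presence, so negotiate a stanza session first.
        return ("stanza-session-negotiation", None)
    capable = sorted(jid for jid, ok in resources.items() if ok)
    if len(capable) == 1:
        # Scenarios 1 and 2: exactly one suitable resource.
        return ("initiate", capable[0])
    if len(capable) > 1:
        # Scenario 3: several suitable resources; RAP picks among them.
        return ("resource-application-priority", capable)
    return ("no-suitable-resource", None)

print(choose_resource(True, {
    "juliet@capulet.lit/balcony": True,
    "juliet@capulet.lit/chamber": False,
}))
# ('initiate', 'juliet@capulet.lit/balcony')
```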

6.2 Initiation

Once the initiator has discovered which of the receiver's XMPP resources is ideal for the desired content description format, it sends a session initiation request to the receiver. This request is an IQ-set containing a <jingle/> element qualified by the 'http://www.xmpp.org/extensions/xep-0166.html#ns' namespace (see Protocol Namespaces regarding issuance of one or more permanent namespaces), where the value of the 'action' attribute is "session-initiate" and where the <jingle/> element contains one or more <content/> elements. Each <content/> element defines a content type to be transferred during the session, and each <content/> element in turn contains one <description/> child element that specifies a desired content description format and one <transport/> child element that specifies a potential content transport method. If either party wishes to propose the use of multiple transport methods for the same content description, it must send multiple <content/> elements.

Note: The syntax and semantics of the <description/> and <transport/> elements are out of scope for this specification, since they are defined in related specifications. The syntax and semantics of the <jingle/> and <content/> elements are specified in this document under Formal Definition.
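
The structural rules for a session initiation request can be checked mechanically. The Python sketch below uses only the standard library; the function names `local_name` and `check_session_initiate` are hypothetical, and the checks cover only the rules stated in this section (an IQ-set, a <jingle/> element with action "session-initiate", one or more <content/> elements, and exactly one <description/> and one <transport/> per <content/>). It does not validate the contents of <description/> or <transport/>, which are defined elsewhere.

```python
import xml.etree.ElementTree as ET

NS_JINGLE = "http://www.xmpp.org/extensions/xep-0166.html#ns"

def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]

def check_session_initiate(iq_xml):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    iq = ET.fromstring(iq_xml)
    if local_name(iq.tag) != "iq" or iq.get("type") != "set":
        problems.append("not an IQ-set")
    jingle = iq.find("{%s}jingle" % NS_JINGLE)
    if jingle is None:
        return problems + ["no <jingle/> child in the Jingle namespace"]
    if jingle.get("action") != "session-initiate":
        problems.append("action is not session-initiate")
    contents = jingle.findall("{%s}content" % NS_JINGLE)
    if not contents:
        problems.append("no <content/> element")
    for content in contents:
        # <description/> and <transport/> live in other namespaces,
        # so they are matched by local name only.
        children = [local_name(child.tag) for child in content]
        for required in ("description", "transport"):
            if children.count(required) != 1:
                problems.append("content %r needs exactly one <%s/>"
                                % (content.get("name"), required))
    return problems

SAMPLE = """<iq from='romeo@montague.lit/orchard' to='juliet@capulet.lit/balcony' id='jingle1' type='set'>
  <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns'
          action='session-initiate'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='this-is-the-audio-content' profile='RTP/AVP'>
      <description xmlns='http://www.xmpp.org/extensions/xep-0167.html#ns'/>
      <transport xmlns='http://www.xmpp.org/extensions/xep-0176.html#ns-udp'/>
    </content>
  </jingle>
</iq>"""

print(check_session_initiate(SAMPLE))  # []
```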

Note: In order to expedite session establishment, the initiator MAY send transport candidates (e.g., for negotiation of the ICE transport) immediately after sending the "session-initiate" message and before receiving acknowledgement from the receiver (i.e., the initiator MUST consider the session to be live even before receiving acknowledgement). Given in-order delivery, the receiver should receive such "transport-info" messages after receiving the "session-initiate" message (if not, it is appropriate for the receiver to return <unknown-session/> errors since according to its state machine the session does not exist).

6.3 Receiver Response

Unless an error occurs, the receiver MUST acknowledge receipt of the initiation request.

If the receiver acknowledges receipt of the initiation request, both parties must consider the session to be in the PENDING state.

There are several reasons why the receiver might return an error instead of acknowledging receipt of the initiation request:

If the initiator is not known to the receiver (e.g., through a presence subscription) and the receiver has a policy of not communicating via Jingle with unknown entities, it SHOULD return a <service-unavailable/> error.

Example 6. Initiator Unknown to Receiver

<iq type='error' from='juliet@capulet.lit/balcony' to='romeo@montague.lit/orchard' id='jingle1'>
  <error type='cancel'>
    <service-unavailable xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/>
  </error>
</iq>
    

If the receiver wishes to redirect to another address, it SHOULD return a <redirect/> error.

Example 7. Receiver Redirection

<iq type='error' from='juliet@capulet.lit/balcony' to='romeo@montague.lit/orchard' id='jingle1'>
  <error type='cancel'>
    <redirect xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'>xmpp:voicemail@capulet.lit</redirect>
  </error>
</iq>
    

If the receiver does not support Jingle, it MUST return a <service-unavailable/> error.

Example 8. Receiver Does Not Support Jingle

<iq type='error' from='juliet@capulet.lit/balcony' to='romeo@montague.lit/orchard' id='jingle1'>
  <error type='cancel'>
    <service-unavailable xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/>
  </error>
</iq>
    

If the receiver does not support any of the specified content description formats, it MUST return a <feature-not-implemented/> error with a Jingle-specific error condition of <unsupported-content/>.

Example 9. Receiver Does Not Support Any Content Description Formats

<iq type='error' from='juliet@capulet.lit/balcony' to='romeo@montague.lit/orchard' id='jingle1'>
  <error type='cancel'>
    <feature-not-implemented xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/>
    <unsupported-content xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns-errors'/>
  </error>
</iq>
    

If the receiver does not support any of the specified content transport methods, it MUST return a <feature-not-implemented/> error with a Jingle-specific error condition of <unsupported-transports/>.

Example 10. Receiver Does Not Support Any Transport Methods

<iq type='error' from='juliet@capulet.lit/balcony' to='romeo@montague.lit/orchard' id='jingle1'>
  <error type='cancel'>
    <feature-not-implemented xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/>
    <unsupported-transports xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns-errors'/>
  </error>
</iq>
    

If the initiation request was malformed, the receiver MUST return a <bad-request/> error.

Example 11. Initiation Request Malformed

<iq type='error' from='juliet@capulet.lit/balcony' to='romeo@montague.lit/orchard' id='jingle1'>
  <error type='cancel'>
    <bad-request xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/>
  </error>
</iq>
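
The error replies in Examples 9 and 10 follow one pattern: a stanza error condition plus an optional Jingle-specific condition. The Python sketch below shows one way a receiver might pick and build such a reply. The function names `pick_error` and `build_error` are hypothetical, the inputs are assumed to be collections of namespace strings, and checking content descriptions before transports is an arbitrary ordering chosen for this example.

```python
import xml.etree.ElementTree as ET

NS_STANZAS = "urn:ietf:params:xml:ns:xmpp-stanzas"
NS_JINGLE_ERRORS = "http://www.xmpp.org/extensions/xep-0166.html#ns-errors"

def pick_error(offered_descriptions, offered_transports,
               supported_descriptions, supported_transports):
    """Return (stanza_condition, jingle_condition), or None if no error applies."""
    if not set(offered_descriptions) & set(supported_descriptions):
        return ("feature-not-implemented", "unsupported-content")
    if not set(offered_transports) & set(supported_transports):
        return ("feature-not-implemented", "unsupported-transports")
    return None

def build_error(request_id, sender, recipient, condition, jingle_condition=None):
    """Build an IQ error that reuses the 'id' of the request being answered."""
    iq = ET.Element("iq", {"type": "error", "from": sender,
                           "to": recipient, "id": request_id})
    error = ET.SubElement(iq, "error", {"type": "cancel"})
    ET.SubElement(error, "{%s}%s" % (NS_STANZAS, condition))
    if jingle_condition is not None:
        ET.SubElement(error, "{%s}%s" % (NS_JINGLE_ERRORS, jingle_condition))
    return iq

# The receiver supports the offered audio description but not ICE-UDP.
conditions = pick_error(
    ["http://www.xmpp.org/extensions/xep-0167.html#ns"],
    ["http://www.xmpp.org/extensions/xep-0176.html#ns-udp"],
    ["http://www.xmpp.org/extensions/xep-0167.html#ns"],
    ["http://www.xmpp.org/extensions/xep-0177.html#ns"],
)
reply = build_error("jingle1", "juliet@capulet.lit/balcony",
                    "romeo@montague.lit/orchard", *conditions)
print(ET.tostring(reply, encoding="unicode"))
```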
    

6.4 Decline

In order to decline the session initiation request, the receiver MUST acknowledge receipt of the session initiation request, then terminate the session as described under Termination.

6.5 Negotiation

In general, negotiation will be necessary before the parties can agree on an acceptable set of content types, content description formats, and content transport methods. The potential combinations of parameters to be negotiated are many, and not all are shown herein. Some are defined in the relevant specifications for various content description formats and content transport methods, and illustrated in the Scenarios section of this document.

The allowable negotiations (including content-level and transport-level negotiations) are as follows:

6.6 Acceptance

If (after negotiation of content transport methods and content description formats) the receiver determines that it will be able to establish a connection, it sends a definitive acceptance to the initiator.

Note: In the accept stanza, the <jingle/> element MUST contain one or more <content/> elements, each of which MUST contain one <description/> element and one <transport/> element. The <jingle/> element SHOULD possess a 'responder' attribute that explicitly specifies the full JID of the responding entity, and the initiator SHOULD send all future communications about this Jingle session to the JID provided in the 'responder' attribute.

The initiator then acknowledges the receiver's definitive acceptance, after which the parties can exchange content over the negotiated connection.

If one of the parties cannot find a suitable content transport method, it SHOULD terminate the session as described below.

6.7 Modifying an Active Session

Once a session is in the ACTIVE state, it may be modified. Potential modifications are shown in the Scenarios section of this document.

6.8 Termination

In order to gracefully end the session (which MAY be done at any point after acknowledging receipt of the initiation request, including immediately thereafter in order to decline the request), either the receiver or the initiator MUST send a "session-terminate" action to the other party.

The other party MUST then acknowledge termination of the session by returning an IQ result, as in Example 5.

Note: As soon as an entity sends a "session-terminate" action, it MUST consider the session to be ended (even before receiving acknowledgement from the other party). If the terminating entity receives additional IQ-sets from the other party after sending the "session-terminate" action, it MUST reply with an <unknown-session/> error.
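
The rule in the note above (the session is ended the moment "session-terminate" is sent, not when it is acknowledged) can be sketched as follows. The class `SessionTable` and its method names are hypothetical, and the return values are plain labels standing in for the IQ result or <unknown-session/> error that a real implementation would send.

```python
class SessionTable:
    """Tracks which session IDs this entity still considers live."""

    def __init__(self):
        self._live = set()

    def open(self, sid):
        """Record a session once its initiation has been acknowledged."""
        self._live.add(sid)

    def send_terminate(self, sid):
        """Called when we send session-terminate for `sid`.

        The session is dropped immediately, before any acknowledgement
        arrives from the other party.
        """
        self._live.discard(sid)

    def handle_incoming_set(self, sid):
        """Classify an incoming IQ-set that carries a <jingle/> for `sid`."""
        return "result" if sid in self._live else "unknown-session"

table = SessionTable()
table.open("a73sjjvkla37jfea")
table.send_terminate("a73sjjvkla37jfea")
# A late transport-info from the other party now gets an error.
print(table.handle_incoming_set("a73sjjvkla37jfea"))  # unknown-session
```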

Unfortunately, not all sessions end gracefully. In applications of Jingle that also involve the exchange of presence information, receipt of <presence type='unavailable'/> from the other party MAY be considered a session-ending event. However, in this case there is nothing for the party to acknowledge.

6.9 Informational Messages

At any point after initiation of a Jingle session, either entity MAY send an informational message to the other party, for example to change a content transport method or content description format parameter, inform the other party that a session initiation request is queued, that a device is ringing, or that a scheduled event has occurred or will occur.

An informational message MUST be an IQ-set containing a <jingle/> element whose 'action' attribute is set to a value of "session-info" or "transport-info"; the <jingle/> element MUST further contain a payload child element (specific to the session or to a transport method) that specifies the information being communicated. If the party that receives an informational message does not understand the payload, it MUST return a <feature-not-implemented/> error with a Jingle-specific error condition of <unsupported-info/>.

If either party receives an empty "session-info" message for an active session, it MUST send an empty IQ result; this way, an empty "session-info" message may be used as a "ping" to determine session vitality.
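
The "ping" use of an empty "session-info" message can be illustrated with a short Python sketch. The helper names `make_ping` and `make_pong` are hypothetical; the stanza shape follows the examples in this document, and the reply reuses the request's 'id' as required for IQ results.

```python
import xml.etree.ElementTree as ET

NS_JINGLE = "http://www.xmpp.org/extensions/xep-0166.html#ns"

def make_ping(sender, recipient, initiator, sid, iq_id):
    """Build an IQ-set whose <jingle/> element has no payload child."""
    iq = ET.Element("iq", {"from": sender, "to": recipient,
                           "id": iq_id, "type": "set"})
    ET.SubElement(iq, "{%s}jingle" % NS_JINGLE,
                  {"action": "session-info", "initiator": initiator, "sid": sid})
    return iq

def make_pong(ping):
    """Build the empty IQ result that answers `ping`."""
    return ET.Element("iq", {"from": ping.get("to"), "to": ping.get("from"),
                             "id": ping.get("id"), "type": "result"})

ping = make_ping("romeo@montague.lit/orchard", "juliet@capulet.lit/balcony",
                 "romeo@montague.lit/orchard", "a73sjjvkla37jfea", "ping1")
pong = make_pong(ping)
print(ET.tostring(pong, encoding="unicode"))
```

A sender that receives no result (or receives an error) within a timeout of its own choosing could then treat the session as no longer alive; the timeout policy is outside the scope of this sketch.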

Most informational messages are specific to a particular description format or transport method and therefore are described in specifications other than this one.

7. Scenarios

The very simple scenario described in the How It Works section of this document is just that: very simple. Typically, the session flow is more complex. The following sections show some more complex scenarios, in order of complexity.

7.1 Jingle Audio via RTP/AVP, Negotiated with ICE-UDP

In this scenario, Romeo initiates a voice chat with Juliet using a transport method of ICE-UDP.

The session flow is as follows:

Romeo                         Juliet
  |                             |
  |   session-initiate          |
  |---------------------------->|
  |   ack                       |
  |<----------------------------|
  |   transport-info (X times)  |
  |   (with acks)               |
  |<--------------------------->|
  |   session-accept            |
  |<----------------------------|
  |   ack                       |
  |---------------------------->|
  |   AUDIO (RTP)               |
  |<===========================>|
  |   session-terminate         |
  |<----------------------------|
  |   ack                       |
  |---------------------------->|
  |                             |
    

The protocol flow is as follows.

Example 12. Initiator sends session-initiate

<iq from='romeo@montague.lit/orchard' to='juliet@capulet.lit/balcony' id='jingle1' type='set'>
  <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns'
          action='session-initiate'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='this-is-the-audio-content' profile='RTP/AVP'>
      <description xmlns='http://www.xmpp.org/extensions/xep-0167.html#ns'>
        <payload-type id='96' name='speex' clockrate='16000'/>
        <payload-type id='97' name='speex' clockrate='8000'/>
        <payload-type id='18' name='G729'/>
        <payload-type id='103' name='L16' clockrate='16000' channels='2'/>
        <payload-type id='98' name='x-ISAC' clockrate='8000'/>
      </description>
      <transport xmlns='http://www.xmpp.org/extensions/xep-0176.html#ns-udp'/>
    </content>
  </jingle>
</iq>
    

Example 13. Receiver sends provisional acceptance

<iq from='juliet@capulet.lit/balcony'
    id='jingle1'
    to='romeo@montague.lit/orchard'
    type='result'/>
    

Example 14. Initiator Sends a Candidate

<iq from='romeo@montague.lit/orchard' 
    id='info1' 
    to='juliet@capulet.lit/balcony' 
    type='set'>
  <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns' 
          action='transport-info'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='this-is-the-audio-content' profile='RTP/AVP'>
      <transport xmlns='http://www.xmpp.org/extensions/xep-0176.html#ns-udp'>
        <candidate component='1'
                   foundation='1'
                   generation='0' 
                   ip='10.0.1.1' 
                   network='0'
                   port='8998'
                   priority='2114978302'
                   protocol='udp'
                   pwd='asd88fgpdd777uzjYhagZg'
                   type='host'
                   ufrag='8hhy'/>
      </transport>
    </content>
  </jingle>
</iq>
    

Example 15. Initiator Sends a Second Candidate

<iq from='romeo@montague.lit/orchard' 
    id='info2' 
    to='juliet@capulet.lit/balcony' 
    type='set'>
  <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns' 
          action='transport-info' 
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='this-is-the-audio-content' profile='RTP/AVP'>
      <transport xmlns='http://www.xmpp.org/extensions/xep-0176.html#ns-udp'>
        <candidate component='1'
                   foundation='1'
                   generation='0' 
                   ip='192.0.2.3' 
                   network='1'
                   port='45664'
                   priority='1678246398'
                   protocol='udp'
                   pwd='asd88fgpdd777uzjYhagZg'
                   type='srflx'
                   ufrag='8hhy'/>
      </transport>
    </content>
  </jingle>
</iq>
    

Example 16. Initiator Sends a Third Candidate

<iq from='romeo@montague.lit/orchard' 
    id='info3' 
    to='juliet@capulet.lit/balcony' 
    type='set'>
  <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns' 
          action='transport-info' 
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='this-is-the-audio-content' profile='RTP/AVP'>
      <transport xmlns='http://www.xmpp.org/extensions/xep-0176.html#ns-udp'>
        <candidate component='1'
                   foundation='1'
                   generation='0' 
                   ip='208.245.212.67' 
                   network='2'
                   port='53267'
                   priority='1677984254'
                   protocol='udp'
                   pwd='asd88fgpdd777uzjYhagZg'
                   type='srflx'
                   ufrag='8hhy'/>
      </transport>
    </content>
  </jingle>
</iq>
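
On the receiving side, each transport-info stanza has to be unpacked into candidate data before connectivity checks can start. The sketch below shows one way to do that in Python; the function names extract_candidates and by_priority are hypothetical, and ordering by the 'priority' attribute alone is a simplification of the pairing procedure in draft-ietf-mmusic-ice.

```python
import xml.etree.ElementTree as ET

# Provisional ICE-UDP transport namespace used in the examples above.
ICE_UDP_NS = "http://www.xmpp.org/extensions/xep-0176.html#ns-udp"


def extract_candidates(iq_xml):
    """Return the attributes of every ICE-UDP <candidate/> in a stanza."""
    iq = ET.fromstring(iq_xml)
    return [dict(c.attrib) for c in iq.iter(f"{{{ICE_UDP_NS}}}candidate")]


def by_priority(candidates):
    """Order candidates from highest to lowest 'priority' value.

    Assumption for illustration only: a real ICE agent forms and sorts
    candidate pairs, not individual candidates.
    """
    return sorted(candidates, key=lambda c: int(c["priority"]), reverse=True)
```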
    

For each candidate received, the other party MUST acknowledge receipt or return an error:

Example 17. Responder Acknowledges Receipt

<iq from='juliet@capulet.lit/balcony' to='romeo@montague.lit/orchard' id='info1' type='result'/>

<iq from='juliet@capulet.lit/balcony' to='romeo@montague.lit/orchard' id='info2' type='result'/>

<iq from='juliet@capulet.lit/balcony' to='romeo@montague.lit/orchard' id='info3' type='result'/>
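
Each acknowledgement is an empty IQ-result that swaps the 'from' and 'to' addresses of the request and repeats its 'id'. A minimal Python sketch of that rule follows; the helper name make_ack is hypothetical.

```python
import xml.etree.ElementTree as ET


def make_ack(request):
    """Build the empty IQ-result that acknowledges an IQ-set.

    The result mirrors the request: addresses are swapped and the 'id'
    is copied unchanged so the sender can correlate the reply.
    """
    if request.tag != "iq" or request.get("type") != "set":
        raise ValueError("only IQ-set stanzas are acknowledged this way")
    return ET.Element("iq", {
        "from": request.get("to"),
        "to": request.get("from"),
        "id": request.get("id"),
        "type": "result",
    })
```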
    

At the same time (i.e., immediately after provisionally accepting the session, not waiting for the initiator to begin or finish sending candidates), the responder also begins sending candidates that may work for it. As above, the initiator acknowledges receipt of the candidates.

As the initiator and responder receive candidates, they probe the various candidate transports for connectivity. In performing these connectivity checks, the parties follow the procedure specified in Section 7 of draft-ietf-mmusic-ice.

If one of the candidate transports is found to work, the receiver accepts the session.

Example 18. Receiver sends session-accept

<iq type='set' from='juliet@capulet.lit/balcony' to='romeo@montague.lit/orchard' id='accept1'>
  <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns'
          action='session-accept'
          initiator='romeo@montague.lit/orchard'
          responder='juliet@capulet.lit/balcony'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='this-is-the-audio-content' profile='RTP/AVP'>
      <description xmlns='http://www.xmpp.org/extensions/xep-0167.html#ns'>
        <payload-type id='97' name='speex' clockrate='8000'/>
        <payload-type id='18' name='G729'/>
        <payload-type id='0' name='PCMU' />
        <payload-type id='102' name='iLBC'/>
        <payload-type id='4' name='G723'/>
        <payload-type id='8' name='PCMA'/>
        <payload-type id='13' name='CN'/>
      </description>
      <transport xmlns='http://www.xmpp.org/extensions/xep-0176.html#ns-udp'>
        <candidate ip='208.245.212.67' port='9876' generation='0'/>
      </transport>
    </content>
  </jingle>
</iq>
    

If the payload types and transport candidate can be successfully used by both parties, then the initiator acknowledges the session-accept.

Example 19. Initiator acknowledges session-accept

<iq type='result' to='juliet@capulet.lit/balcony' from='romeo@montague.lit/orchard' id='accept1'/>
    

The parties now begin to exchange media. In this case they would exchange audio using the Speex codec at a clockrate of 8000 since that is the highest-priority codec for the responder (as determined by the XML order of the <payload-type/> children).
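
The codec choice described here can be sketched as follows. This is an illustration under stated assumptions, not a normative algorithm: the function name select_codec is hypothetical, and matching payload types on the pair (name, clockrate) is an assumption made for this example.

```python
def select_codec(offered, answered):
    """Return the first payload type in the responder's list that the
    initiator also offered, or None if the lists do not overlap.

    Both arguments are lists of attribute dicts in XML document order.
    """
    offered_keys = {(p["name"].lower(), p.get("clockrate")) for p in offered}
    for payload in answered:
        if (payload["name"].lower(), payload.get("clockrate")) in offered_keys:
            return payload
    return None


# Payload types from the session-initiate (Example 12).
OFFER = [
    {"id": "96", "name": "speex", "clockrate": "16000"},
    {"id": "97", "name": "speex", "clockrate": "8000"},
    {"id": "18", "name": "G729"},
    {"id": "103", "name": "L16", "clockrate": "16000", "channels": "2"},
    {"id": "98", "name": "x-ISAC", "clockrate": "8000"},
]

# Payload types from the session-accept (Example 18).
ANSWER = [
    {"id": "97", "name": "speex", "clockrate": "8000"},
    {"id": "18", "name": "G729"},
    {"id": "0", "name": "PCMU"},
    {"id": "102", "name": "iLBC"},
    {"id": "4", "name": "G723"},
    {"id": "8", "name": "PCMA"},
    {"id": "13", "name": "CN"},
]
```

With these inputs, select_codec(OFFER, ANSWER) returns the Speex entry with a clockrate of 8000, matching the outcome described in the text.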

The parties may continue the session as long as desired.

Eventually, one of the parties terminates the session.

Example 20. Receiver terminates the session

<iq from='juliet@capulet.lit/balcony'
    id='term1'
    to='romeo@montague.lit/orchard'
    type='set'>
  <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns'
          action='session-terminate'
          initiator='romeo@montague.lit/orchard'
          reasontext='Sorry, gotta go!'
          sid='a73sjjvkla37jfea'/>
</iq>
    

The other party MUST then acknowledge termination of the session:

Example 21. Initiator Acknowledges Termination

<iq from='romeo@montague.lit/orchard' 
    id='term1'
    to='juliet@capulet.lit/balcony' 
    type='result'/>
    

7.2 Jingle Audio and Video via RTP/AVP, Negotiated with ICE-UDP

In this scenario, Romeo initiates a combined audio and video chat with Juliet using a transport method of ICE. Juliet at first refuses the video portion, then later offers to add video, which Romeo accepts.

The session flow is as follows:

Romeo                         Juliet
  |                             |
  |   session-initiate          |
  |---------------------------->|
  |   ack                       |
  |<----------------------------|
  |   content-remove            |
  |<----------------------------|
  |   ack                       |
  |---------------------------->|
  |   content-accept            |
  |---------------------------->|
  |   ack                       |
  |<----------------------------|
  |   transport-info (X times)  |
  |   (with acks)               |
  |<--------------------------->|
  |   session-accept            |
  |<----------------------------|
  |   ack                       |
  |---------------------------->|
  |   AUDIO (RTP)               |
  |<===========================>|
  |   content-add               |
  |<----------------------------|
  |   ack                       |
  |---------------------------->|
  |   content-accept            |
  |---------------------------->|
  |   ack                       |
  |<----------------------------|
  |   AUDIO + VIDEO (RTP)       |
  |<===========================>|
  |   session-terminate         |
  |<----------------------------|
  |   ack                       |
  |---------------------------->|
  |                             |
    

The protocol flow is as follows.

Example 22. Initiation

<iq from='romeo@montague.lit/orchard' to='juliet@capulet.lit/balcony' id='jingle1' type='set'>
  <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns'
          action='session-initiate'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='this-is-the-audio-content' profile='RTP/AVP'>
      <description xmlns='http://www.xmpp.org/extensions/xep-0167.html#ns'>
        <payload-type id='96' name='speex' clockrate='16000'/>
        <payload-type id='97' name='speex' clockrate='8000'/>
        <payload-type id='18' name='G729'/>
        <payload-type id='103' name='L16' clockrate='16000' channels='2'/>
        <payload-type id='98' name='x-ISAC' clockrate='8000'/>
      </description>
      <transport xmlns='http://www.xmpp.org/extensions/xep-0176.html#ns'/>
    </content>
    <content creator='initiator' name='this-is-the-video-content' profile='RTP/AVP'>
      <description xmlns='http://www.xmpp.org/extensions/xep-0180.html#ns'>
        <payload-type id='96' name='theora' clockrate='90000' height='720' width='1280'>
          <parameter name='delivery-method' value='inline'/>
          <parameter name='configuration' value='somebase16string'/>
          <parameter name='sampling' value='YCbCr-4:2:2'/>
        </payload-type>
        <payload-type id='28' name='nv' clockrate='90000'/>
        <payload-type id='25' name='CelB' clockrate='90000'/>
        <payload-type id='32' name='MPV' clockrate='90000'/>
      </description>
      <transport xmlns='http://www.xmpp.org/extensions/xep-0176.html#ns'/>
    </content>
  </jingle>
</iq>
    

Example 23. Receiver Acknowledges Receipt of Initiation Request

<iq type='result' from='juliet@capulet.lit/balcony' to='romeo@montague.lit/orchard' id='jingle1'/>
    

However, Juliet doesn't want to do video because she is having a bad hair day, so she sends a "content-remove" request to Romeo.

Example 24. Receiver requests content-remove

<iq from='juliet@capulet.lit/balcony' to='romeo@montague.lit/orchard' id='remove1' type='set'>
  <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns'
          action='content-remove'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='this-is-the-video-content' profile='RTP/AVP'/>
  </jingle>
</iq>
    

Romeo then acknowledges the content-remove request and, if it is acceptable, returns a content-accept:

Example 25. Initiator acknowledges content-remove

<iq from='romeo@montague.lit/orchard' to='juliet@capulet.lit/balcony' id='remove1' type='result'/>
    

Example 26. Initiator accepts content definition

<iq from='romeo@montague.lit/orchard' to='juliet@capulet.lit/balcony' id='remove2' type='set'>
  <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns'
          action='content-accept'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='this-is-the-video-content' profile='RTP/AVP'>
      <description xmlns='http://www.xmpp.org/extensions/xep-0180.html#ns'/>
      <transport xmlns='http://www.xmpp.org/extensions/xep-0176.html#ns'/>
    </content>
  </jingle>
</iq>
    

The other party then acknowledges the acceptance.

Example 27. Receiver acknowledges content-accept

<iq from='juliet@capulet.lit/balcony' to='romeo@montague.lit/orchard' id='remove2' type='result'/>
    

As in Scenario #1, the parties exchange ICE candidates (see above for examples).

Once the parties find candidate transports that work, the receiver accepts the session.

Example 28. Receiver sends session-accept

<iq type='set' from='juliet@capulet.lit/balcony' to='romeo@montague.lit/orchard' id='accept1'>
  <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns'
          action='session-accept'
          initiator='romeo@montague.lit/orchard'
          responder='juliet@capulet.lit/balcony'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='this-is-the-audio-content' profile='RTP/AVP'>
      <description xmlns='http://www.xmpp.org/extensions/xep-0167.html#ns'>
        <payload-type id='97' name='speex' clockrate='8000'/>
        <payload-type id='18' name='G729'/>
        <payload-type id='0' name='PCMU' />
        <payload-type id='102' name='iLBC'/>
        <payload-type id='4' name='G723'/>
        <payload-type id='8' name='PCMA'/>
        <payload-type id='13' name='CN'/>
      </description>
      <transport xmlns='http://www.xmpp.org/extensions/xep-0176.html#ns'>
        <candidate ip='208.245.212.67' port='9876' generation='0'/>
      </transport>
    </content>
  </jingle>
</iq>
    

As above, if the payload types and transport candidate can be successfully used by both parties, then the initiator acknowledges the session-accept.

Example 29. Initiator acknowledges session-accept

<iq type='result' to='juliet@capulet.lit/balcony' from='romeo@montague.lit/orchard' id='accept1'/>
    

The parties now begin to exchange media. In this case they would exchange audio using the Speex codec at a clockrate of 8000 since that is the highest-priority codec for the responder (as determined by the XML order of the <payload-type/> children).

Once Juliet gets her hair in order, she decides that she is presentable for a video chat so she sends a content-add request to Romeo.

Example 30. Receiver sends a content-add

<iq from='juliet@capulet.lit/balcony' to='romeo@montague.lit/orchard' id='add1' type='set'>
  <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns'
          action='content-add'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <content creator='responder' name='video-is-back' profile='RTP/AVP'>
      <description xmlns='http://www.xmpp.org/extensions/xep-0180.html#ns'>
        <payload-type id='96' name='theora' height='720' width='1280'>
          <parameter name='delivery-method' value='inline'/>
          <parameter name='configuration' value='somebase16string'/>
          <parameter name='sampling' value='YCbCr-4:2:2'/>
        </payload-type>
        <payload-type id='32' name='MPV' clockrate='90000'/>
        <payload-type id='33' name='MP2T' clockrate='90000'/>
      </description>
      <transport xmlns='http://www.xmpp.org/extensions/xep-0176.html#ns'/>
    </content>
  </jingle>
</iq>
    

The entity receiving the content-add request then acknowledges the request and, if it is acceptable, returns a content-accept:

Example 31. Initiator acknowledges content-add

<iq from='romeo@montague.lit/orchard' to='juliet@capulet.lit/balcony' id='add1' type='result'/>
    

Example 32. Initiator accepts content definition

<iq from='romeo@montague.lit/orchard' to='juliet@capulet.lit/balcony' id='add2' type='set'>
  <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns'
          action='content-accept'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <content creator='responder' name='video-is-back' profile='RTP/AVP'>
      <description xmlns='http://www.xmpp.org/extensions/xep-0180.html#ns'>
        <payload-type id='96' name='theora' height='720' width='1280'>
          <parameter name='delivery-method' value='inline'/>
          <parameter name='configuration' value='somebase16string'/>
          <parameter name='sampling' value='YCbCr-4:2:2'/>
        </payload-type>
        <payload-type id='32' name='MPV' clockrate='90000'/>
        <payload-type id='33' name='MP2T' clockrate='90000'/>
      </description>
      <transport xmlns='http://www.xmpp.org/extensions/xep-0176.html#ns'/>
    </content>
  </jingle>
</iq>
    

The other party then acknowledges the acceptance.

Example 33. Receiver acknowledges content-accept

<iq from='juliet@capulet.lit/balcony' to='romeo@montague.lit/orchard' id='add2' type='result'/>
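
An implementation has to track which contents are part of the session as content-remove and content-add requests are accepted. The following Python sketch illustrates that bookkeeping for this scenario; the ContentSet class is hypothetical, and keying contents on the pair (creator, name) is an assumption made for illustration.

```python
class ContentSet:
    """Hypothetical record of the contents currently in a session."""

    def __init__(self):
        self._contents = {}  # (creator, name) -> profile

    def add(self, creator, name, profile):
        """Record a content after session-initiate or an accepted content-add."""
        key = (creator, name)
        if key in self._contents:
            raise ValueError(f"content already present: {key}")
        self._contents[key] = profile

    def remove(self, creator, name):
        """Drop a content after an accepted content-remove."""
        key = (creator, name)
        if key not in self._contents:
            raise KeyError(f"unknown content: {key}")
        del self._contents[key]

    def names(self):
        """Return the names of all current contents in sorted order."""
        return sorted(name for _creator, name in self._contents)
```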
    

The media session proceeds. The parties now exchange both audio and video: the audio is exchanged using the Speex codec at a clockrate of 8000, and the video is exchanged using the Theora codec with a height of 720 pixels, a width of 1280 pixels, and so on.

The parties may continue the session as long as desired.

Eventually, one of the parties terminates the session.

Example 34. Initiator sends session-terminate

<iq from='romeo@montague.lit/orchard'
    id='term1'
    to='juliet@capulet.lit/balcony'
    type='set'>
  <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns'
          action='session-terminate'
          initiator='romeo@montague.lit/orchard'
          reasontext='I&apos;m outta here!'
          sid='a73sjjvkla37jfea'/>
</iq>
    

Example 35. Receiver acknowledges session-terminate

<iq from='juliet@capulet.lit/balcony' 
    id='term1'
    to='romeo@montague.lit/orchard' 
    type='result'/>
    

7.3 Secure Jingle Audio via UDP/TLS/RTP/AVP, Negotiated with ICE-UDP

In this scenario, Romeo initiates a voice chat with Juliet using a transport method of ICE-UDP and an unencrypted profile of "RTP/AVP", but Juliet wants to chat securely so she requests the use of a secure transport as specified in RTP Over DTLS [21] (via a profile of "UDP/TLS/RTP/AVP").

The session flow is as follows:

Romeo                         Juliet
  |                             |
  |   session-initiate          |
  |---------------------------->|
  |   ack                       |
  |<----------------------------|
  |   content-modify            |
  |<----------------------------|
  |   ack                       |
  |---------------------------->|
  |   content-accept            |
  |---------------------------->|
  |   ack                       |
  |<----------------------------|
  |   transport-info (X times)  |
  |   (with acks)               |
  |<--------------------------->|
  |   session-accept            |
  |<----------------------------|
  |   ack                       |
  |---------------------------->|
  |   AUDIO (RTP)               |
  |<===========================>|
  |   session-terminate         |
  |<----------------------------|
  |   ack                       |
  |---------------------------->|
  |                             |
    

The protocol flow is as follows.

Example 36. Initiator sends session-initiate

<iq from='romeo@montague.lit/orchard' to='juliet@capulet.lit/balcony' id='jingle1' type='set'>
  <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns'
          action='session-initiate'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='this-is-the-audio-content' profile='RTP/AVP'>
      <description xmlns='http://www.xmpp.org/extensions/xep-0167.html#ns'>
        <payload-type id='96' name='speex' clockrate='16000'/>
        <payload-type id='97' name='speex' clockrate='8000'/>
        <payload-type id='18' name='G729'/>
        <payload-type id='103' name='L16' clockrate='16000' channels='2'/>
        <payload-type id='98' name='x-ISAC' clockrate='8000'/>
      </description>
      <transport xmlns='http://www.xmpp.org/extensions/xep-0176.html#ns-udp'/>
    </content>
  </jingle>
</iq>
    

Example 37. Receiver sends provisional acceptance

<iq from='juliet@capulet.lit/balcony'
    id='jingle1'
    to='romeo@montague.lit/orchard'
    type='result'/>
    

However, Juliet wants to make sure that the communications are encrypted, so she sends a "content-modify" request to Romeo.

Example 38. Receiver requests content-modify

<iq from='juliet@capulet.lit/balcony' to='romeo@montague.lit/orchard' id='mod1' type='set'>
  <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns'
          action='content-modify'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='this-is-the-audio-content' profile='UDP/TLS/RTP/AVP'/>
  </jingle>
</iq>
    

Romeo then acknowledges the content-modify request and, if it is acceptable, returns a content-accept:

Example 39. Initiator acknowledges content-modify

<iq from='romeo@montague.lit/orchard' to='juliet@capulet.lit/balcony' id='mod1' type='result'/>
    

Example 40. Initiator accepts content definition

<iq from='romeo@montague.lit/orchard' to='juliet@capulet.lit/balcony' id='mod2' type='set'>
  <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns'
          action='content-accept'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='this-is-the-audio-content' profile='UDP/TLS/RTP/AVP'/>
  </jingle>
</iq>
    

The other party then acknowledges the acceptance.

Example 41. Receiver acknowledges content-accept

<iq from='juliet@capulet.lit/balcony' to='romeo@montague.lit/orchard' id='mod2' type='result'/>
    

As in Scenario #1, the parties exchange ICE candidates (see above for examples).

If one of the candidate transports is found to work, the receiver accepts the session.

Example 42. Receiver sends session-accept

<iq type='set' from='juliet@capulet.lit/balcony' to='romeo@montague.lit/orchard' id='accept1'>
  <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns'
          action='session-accept'
          initiator='romeo@montague.lit/orchard'
          responder='juliet@capulet.lit/balcony'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='this-is-the-audio-content' profile='UDP/TLS/RTP/AVP'>
      <description xmlns='http://www.xmpp.org/extensions/xep-0167.html#ns'>
        <payload-type id='97' name='speex' clockrate='8000'/>
        <payload-type id='18' name='G729'/>
        <payload-type id='0' name='PCMU' />
        <payload-type id='102' name='iLBC'/>
        <payload-type id='4' name='G723'/>
        <payload-type id='8' name='PCMA'/>
        <payload-type id='13' name='CN'/>
      </description>
      <transport xmlns='http://www.xmpp.org/extensions/xep-0176.html#ns-udp'>
        <candidate ip='208.245.212.67' port='9876' generation='0'/>
      </transport>
    </content>
  </jingle>
</iq>
    

If the payload types and transport candidate can be successfully used by both parties, then the initiator acknowledges the session-accept.

Example 43. Initiator acknowledges session-accept

<iq type='result' to='juliet@capulet.lit/balcony' from='romeo@montague.lit/orchard' id='accept1'/>
    

The parties now begin to exchange media. In this case they would exchange audio using the Speex codec at a clockrate of 8000 since that is the highest-priority codec for the responder (as determined by the XML order of the <payload-type/> children).

The parties may continue the session as long as desired.

Eventually, one of the parties terminates the session.

Example 44. Receiver terminates the session

<iq from='juliet@capulet.lit/balcony'
    id='term1'
    to='romeo@montague.lit/orchard'
    type='set'>
  <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns'
          action='session-terminate'
          initiator='romeo@montague.lit/orchard'
          reasontext='Sorry, gotta go!'
          sid='a73sjjvkla37jfea'/>
</iq>
    

The other party MUST then acknowledge termination of the session:

Example 45. Initiator Acknowledges Termination

<iq from='romeo@montague.lit/orchard' 
    id='term1'
    to='juliet@capulet.lit/balcony' 
    type='result'/>
    

8. Formal Definition

8.1 Jingle Element

The <jingle/> element MAY be empty or contain one or more <content/> elements (for which see Content Element).

The attributes of the <jingle/> element are as follows.

Table 3: Attributes of Jingle Element

Attribute | Definition | Inclusion
action | A Jingle action as listed in this document (e.g., "session-terminate"). | REQUIRED
initiator | The full JID of the entity that has initiated the session flow (which may be different from the 'from' address on the IQ-set). | REQUIRED
reasoncode | A machine-readable purpose for the action being sent (e.g., "connectivity-error" for a session-terminate action). | OPTIONAL
reasontext | A human-readable purpose for the action being sent (e.g., "Sorry, gotta go!" for a session-terminate action). | OPTIONAL
responder | The full JID of the entity that has replied to the initiation, which may be different from the 'to' address on the IQ-set. | RECOMMENDED
sid | A random session identifier generated by the initiator, which effectively maps to the SIP "Call-ID" parameter; this SHOULD match the XML Nmtoken production [22] so that XML character escaping is not needed for characters such as &. | REQUIRED
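
The inclusion rules in Table 3 lend themselves to a simple check before a stanza is sent or processed. The sketch below is an illustration only: the names new_sid and jingle_attribute_problems are hypothetical, and the regular expression covers only the ASCII subset of the Nmtoken production (letters, digits, '.', '-', '_', and ':').

```python
import re
import secrets

# ASCII-only approximation of the XML Nmtoken production (an assumption
# made for this sketch; the full production also allows non-ASCII characters).
NMTOKEN_ASCII = re.compile(r"[A-Za-z0-9._:\-]+")

# Attributes marked REQUIRED in Table 3.
REQUIRED_ATTRIBUTES = ("action", "initiator", "sid")


def new_sid():
    """Generate a random session identifier made of hexadecimal digits."""
    return secrets.token_hex(8)


def jingle_attribute_problems(attrs):
    """Return a list of human-readable problems found in a dict holding
    the attributes of a <jingle/> element (empty if none are found)."""
    problems = [f"missing {name}" for name in REQUIRED_ATTRIBUTES
                if not attrs.get(name)]
    sid = attrs.get("sid")
    if sid and not NMTOKEN_ASCII.fullmatch(sid):
        problems.append("sid is not an Nmtoken")
    return problems
```

Hexadecimal identifiers are used here because every hexadecimal digit is a valid Nmtoken character, so no XML escaping is needed.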

8.2 Content Element

The attributes of the <content/> element are as follows:

Table 4: Attributes of Content Element

Attribute | Definition | Inclusion
creator | Which party originally generated the content description (used to prevent race conditions regarding modifications). | REQUIRED
name | A unique name or identifier for the content type (this identifier is opaque and does not have semantic meaning). | REQUIRED
profile | The profile in use (e.g., "RTP/AVP" in the context of the Real-time Transport Protocol). | RECOMMENDED
senders | Which entities in the session will be generating content; the allowable values are "initiator", "recipient", or "both" (where "both" is the default). | RECOMMENDED

9. Error Handling

The Jingle-specific error conditions are as follows.

Table 5: Other Error Conditions

Jingle Condition | XMPP Condition | Description
<out-of-order/> | <unexpected-request/> | The request cannot occur at this point in the state machine (e.g., initiate after accept).
<unknown-session/> | <bad-request/> | The 'sid' attribute specifies a session that is unknown to the recipient (e.g., no longer live according to the recipient's state machine because the recipient previously terminated the session).
<unsupported-content/> | <not-acceptable/> | The recipient does not support any of the desired content description formats.
<unsupported-info/> | <feature-not-implemented/> | The recipient does not support the informational payload of a session-info message.
<unsupported-transports/> | <not-acceptable/> | The recipient does not support any of the desired content transport methods.
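
The mapping in Table 5 can be turned into an IQ-error reply along the following lines. This Python sketch rests on several assumptions: the helper name make_error is hypothetical, the error 'type' values are the ones XMPP Core associates with each general condition rather than values stated in this document, and the Jingle-specific condition is qualified by the provisional "#ns-errors" namespace listed under Protocol Namespaces.

```python
import xml.etree.ElementTree as ET

STANZAS_NS = "urn:ietf:params:xml:ns:xmpp-stanzas"
JINGLE_ERRORS_NS = "http://www.xmpp.org/extensions/xep-0166.html#ns-errors"

# Jingle condition -> (XMPP condition, error type).
# The error types are assumptions taken from the XMPP Core defaults.
CONDITIONS = {
    "out-of-order": ("unexpected-request", "wait"),
    "unknown-session": ("bad-request", "modify"),
    "unsupported-content": ("not-acceptable", "modify"),
    "unsupported-info": ("feature-not-implemented", "cancel"),
    "unsupported-transports": ("not-acceptable", "modify"),
}


def make_error(request, jingle_condition):
    """Build an IQ-error for a request, pairing the general XMPP
    condition with the Jingle-specific condition from Table 5."""
    xmpp_condition, error_type = CONDITIONS[jingle_condition]
    iq = ET.Element("iq", {
        "from": request.get("to"),
        "to": request.get("from"),
        "id": request.get("id"),
        "type": "error",
    })
    error = ET.SubElement(iq, "error", {"type": error_type})
    ET.SubElement(error, f"{{{STANZAS_NS}}}{xmpp_condition}")
    ET.SubElement(error, f"{{{JINGLE_ERRORS_NS}}}{jingle_condition}")
    return iq
```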

10. Determining Support

If an entity supports Jingle, it MUST advertise that fact by returning a feature of "http://www.xmpp.org/extensions/xep-0166.html#ns" (see Protocol Namespaces regarding issuance of one or more permanent namespaces) in response to Service Discovery [23] information requests.

Example 46. Service Discovery Information Request

<iq from='romeo@montague.lit/orchard'
    id='disco1'
    to='juliet@capulet.lit/balcony'
    type='get'>
  <query xmlns='http://jabber.org/protocol/disco#info'/>
</iq>
  

Example 47. Service Discovery Information Response

<iq from='juliet@capulet.lit/balcony'
    id='disco1'
    to='romeo@montague.lit/orchard'
    type='result'>
  <query xmlns='http://jabber.org/protocol/disco#info'>
    ...
    <feature var='http://www.xmpp.org/extensions/xep-0166.html#ns'/>
    ...
  </query>
</iq>
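
A client can inspect the service discovery response before attempting to initiate a session. The following Python sketch shows one way to perform that check; the function name supports_jingle is hypothetical, and the feature string is the provisional namespace used throughout this document.

```python
import xml.etree.ElementTree as ET

DISCO_INFO_NS = "http://jabber.org/protocol/disco#info"
JINGLE_NS = "http://www.xmpp.org/extensions/xep-0166.html#ns"


def supports_jingle(iq_result_xml):
    """Return True if a disco#info IQ-result advertises the Jingle feature."""
    iq = ET.fromstring(iq_result_xml)
    if iq.get("type") != "result":
        return False
    features = iq.findall(
        f"{{{DISCO_INFO_NS}}}query/{{{DISCO_INFO_NS}}}feature"
    )
    return any(f.get("var") == JINGLE_NS for f in features)
```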
  

11. Conformance by Using Protocols

11.1 Application Types

A document that specifies a Jingle application type (e.g., audio via RTP) MUST define:

  1. How successful content negotiation occurs for encapsulation into Jingle.
  2. A <description/> element and associated semantics for representing the content.
  3. If and how the content description can be mapped to the Session Description Protocol.
  4. Whether the content should be sent over a reliable or lossy transport type (or both).
  5. Exactly how the content is to be sent and received over a reliable or lossy transport.

11.2 Transport Methods

A document that specifies a Jingle transport method (e.g., Raw UDP) MUST define:

  1. How successful transport negotiation occurs for encapsulation into Jingle.
  2. A <transport/> element and associated semantics for representing the transport type.
  3. Whether the transport is reliable or lossy.
  4. If and how the transport handles components as defined herein (e.g., for the Real Time Control Protocol).

12. Security Considerations

12.1 Denial of Service

Jingle sessions may be resource-intensive. Therefore, it is possible to launch a denial-of-service attack against an entity by burdening it with too many Jingle sessions. Care must be taken to accept content sessions only from known entities and only if the entity's device is able to process such sessions.
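
One possible way to apply this advice is to gate incoming session-initiate requests on both the sender's identity and the current load. The Python sketch below is purely illustrative: the SessionGate class, the notion of a list of known bare JIDs (for example, roster contacts), and the fixed session limit are all assumptions and are not defined by this specification.

```python
class SessionGate:
    """Hypothetical admission check for incoming Jingle sessions."""

    def __init__(self, known_peers, max_sessions):
        self.known_peers = set(known_peers)  # bare JIDs the user trusts
        self.max_sessions = max_sessions     # what the device can handle
        self.active = set()                  # (full JID, sid) pairs

    def admit(self, full_jid, sid):
        """Return True and record the session if it may be accepted."""
        bare_jid = full_jid.split("/", 1)[0]
        if bare_jid not in self.known_peers:
            return False  # unknown entity
        if len(self.active) >= self.max_sessions:
            return False  # device is already at capacity
        self.active.add((full_jid, sid))
        return True

    def release(self, full_jid, sid):
        """Forget a session once it has been terminated."""
        self.active.discard((full_jid, sid))
```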

12.2 Communication Through Gateways

Jingle communications may be enabled through gateways to non-XMPP networks, whose security characteristics may be quite different from those of XMPP networks. (For example, on some SIP networks authentication is optional and "from" addresses can be easily forged.) Care must be taken in communicating through such gateways.

13. IANA Considerations

This document requires no interaction with the Internet Assigned Numbers Authority (IANA) [24].

14. XMPP Registrar Considerations

14.1 Protocol Namespaces

Until this specification advances to a status of Draft, its associated namespaces shall be "http://www.xmpp.org/extensions/xep-0166.html#ns" and "http://www.xmpp.org/extensions/xep-0166.html#ns-errors"; upon advancement of this specification, the XMPP Registrar [25] shall issue permanent namespaces in accordance with the process defined in Section 4 of XMPP Registrar Function [26].

14.2 Jingle Content Description Formats Registry

The XMPP Registrar shall maintain a registry of Jingle content description formats. All content description format registrations shall be defined in separate specifications (not in this document). Content description formats defined within the XEP series MUST be registered with the XMPP Registrar, resulting in protocol URNs of the form "urn:xmpp:jingle:description:name" (where "name" is the registered name of the content description format).

In order to submit new values to this registry, the registrant must define an XML fragment of the following form and either include it in the relevant XMPP Extension Protocol or send it to the email address <registrar@xmpp.org>:

<content>
  <name>the name of the content description format</name>
  <desc>a natural-language summary of the content description format</desc>
  <transport>whether the content should be sent over a "reliable" or "lossy" transport</transport>
  <doc>the document in which this content description format is specified</doc>
</content>
    

14.3 Jingle Content Transport Methods Registry

The XMPP Registrar shall maintain a registry of Jingle content transport methods. All content transport method registrations shall be defined in separate specifications (not in this document). Content transport methods defined within the XEP series MUST be registered with the XMPP Registrar, resulting in protocol URNs of the form "urn:xmpp:jingle:transport:name" (where "name" is the registered name of the content transport method).

In order to submit new values to this registry, the registrant must define an XML fragment of the following form and either include it in the relevant XMPP Extension Protocol or send it to the email address <registrar@xmpp.org>:

<transport>
  <name>the name of the content transport method</name>
  <desc>a natural-language summary of the content transport method</desc>
  <type>whether the transport method is "reliable" or "lossy"</type>
  <doc>the document in which this content transport method is specified</doc>
</transport>
    

14.4 Jingle Reason Codes Registry

14.4.1 Process

The XMPP Registrar shall maintain a registry of reasons for Jingle actions.

In order to submit new values to this registry, the registrant must define an XML fragment of the following form and either include it in the relevant XMPP Extension Protocol or send it to the email address <registrar@xmpp.org>:

<reason>
  <code>the value of the 'reasoncode' attribute</code>
  <desc>a natural-language summary of the reason code</desc>
  <doc>the document in which this reason code is specified</doc>
</reason>
      

14.4.2 Initial Registration

The following submission registers reasoncodes in use as of April 2007. Refer to the registry itself for a complete and current list of reasoncodes.

<reason>
  <code>connectivity-error</code>
  <desc>the action (e.g., session-terminate) is related to connectivity problems</desc>
  <doc>XEP-0166</doc>
</reason>

<reason>
  <code>general-error</code>
  <desc>the action (e.g., session-terminate) is related to a non-specific application error</desc>
  <doc>XEP-0166</doc>
</reason>

<reason>
  <code>media-error</code>
  <desc>the action (e.g., session-terminate) is related to media processing problems</desc>
  <doc>XEP-0166</doc>
</reason>

<reason>
  <code>no-error</code>
  <desc>the action is generated during the normal course of state management</desc>
  <doc>XEP-0166</doc>
</reason>
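
As a non-normative illustration, an implementation might load these registrations into a lookup table so that a received 'reasoncode' value can be mapped to a human-readable description. The sketch below is written in Python under stated assumptions: the <registry/> wrapper element, the function names, and the fallback string are hypothetical and are not defined by this document or by the XMPP Registrar.

```python
import xml.etree.ElementTree as ET

# The four registrations above, wrapped in a hypothetical <registry/> root
# so that they form a single well-formed document.
INITIAL_REGISTRATION = """
<registry>
  <reason>
    <code>connectivity-error</code>
    <desc>the action (e.g., session-terminate) is related to connectivity problems</desc>
    <doc>XEP-0166</doc>
  </reason>
  <reason>
    <code>general-error</code>
    <desc>the action (e.g., session-terminate) is related to a non-specific application error</desc>
    <doc>XEP-0166</doc>
  </reason>
  <reason>
    <code>media-error</code>
    <desc>the action (e.g., session-terminate) is related to media processing problems</desc>
    <doc>XEP-0166</doc>
  </reason>
  <reason>
    <code>no-error</code>
    <desc>the action is generated during the normal course of state management</desc>
    <doc>XEP-0166</doc>
  </reason>
</registry>
"""


def load_reason_codes(xml_text):
    """Map each registered code to its natural-language description."""
    root = ET.fromstring(xml_text)
    return {r.findtext("code"): r.findtext("desc") for r in root.iter("reason")}


def describe(reasoncode, registry):
    """Return the description of a code, or a fallback for unregistered values."""
    return registry.get(reasoncode, "unregistered reason code")


REASONS = load_reason_codes(INITIAL_REGISTRATION)
print(describe("media-error", REASONS))
```

Because the registry itself is the authoritative list, a table such as this one should be treated as a snapshot; unknown values fall through to the fallback rather than raising an error.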
      

15. XML Schemas

15.1 Jingle

<?xml version='1.0' encoding='UTF-8'?>

<xs:schema
    xmlns:xs='http://www.w3.org/2001/XMLSchema'
    targetNamespace='http://www.xmpp.org/extensions/xep-0166.html#ns'
    xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns'
    elementFormDefault='qualified'>

  <xs:element name='jingle'>
    <xs:complexType>
      <xs:sequence minOccurs='1' maxOccurs='unbounded'>
        <xs:element ref='content'/>
      </xs:sequence>
      <xs:attribute name='action' use='required'>
        <xs:simpleType>
          <xs:restriction base='xs:NCName'>
            <xs:enumeration value='content-accept'/>
            <xs:enumeration value='content-add'/>
            <xs:enumeration value='content-modify'/>
            <xs:enumeration value='content-remove'/>
            <xs:enumeration value='session-accept'/>
            <xs:enumeration value='session-info'/>
            <xs:enumeration value='session-initiate'/>
            <xs:enumeration value='session-terminate'/>
            <xs:enumeration value='transport-info'/>
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
      <xs:attribute name='initiator' type='xs:string' use='required'/>
      <xs:attribute name='reasoncode' type='xs:string' use='optional'/>
      <xs:attribute name='reasontext' type='xs:string' use='optional'/>
      <xs:attribute name='responder' type='xs:string' use='optional'/>
      <xs:attribute name='sid' type='xs:NMTOKEN' use='required'/>
    </xs:complexType>
  </xs:element>

  <xs:element name='content'>
    <xs:complexType>
      <xs:choice minOccurs='0'>
        <xs:sequence>
          <xs:any namespace='##other' minOccurs='0' maxOccurs='unbounded'/>
        </xs:sequence>
      </xs:choice>
      <xs:attribute name='creator' use='required'>
        <xs:simpleType>
          <xs:restriction base='xs:NCName'>
            <xs:enumeration value='initiator'/>
            <xs:enumeration value='responder'/>
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
      <xs:attribute name='name' use='required' type='xs:string'/>
      <xs:attribute name='profile' use='optional' type='xs:string'/>
      <xs:attribute name='senders' use='optional' default='both'>
        <xs:simpleType>
          <xs:restriction base='xs:NCName'>
            <xs:enumeration value='both'/>
            <xs:enumeration value='initiator'/>
            <xs:enumeration value='responder'/>
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
    </xs:complexType>
  </xs:element>

</xs:schema>
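
As a non-normative illustration, the structural constraints expressed by this schema can be approximated with a few hand-written checks. The Python sketch below is not a schema validator (the Python standard library does not provide one); it only mirrors the required attributes and enumerated values listed above. The function name and the sample JID and session ID are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = "http://www.xmpp.org/extensions/xep-0166.html#ns"
ACTIONS = {
    "content-accept", "content-add", "content-modify", "content-remove",
    "session-accept", "session-info", "session-initiate",
    "session-terminate", "transport-info",
}
PARTIES = {"initiator", "responder"}
SENDERS = PARTIES | {"both"}


def check_jingle(xml_text):
    """Return a list of problems found in a <jingle/> element (empty if none)."""
    root = ET.fromstring(xml_text)
    if root.tag != "{%s}jingle" % NS:
        return ["unexpected root element %s" % root.tag]
    problems = []
    # Attributes declared with use='required' on <jingle/>.
    for attr in ("action", "initiator", "sid"):
        if attr not in root.attrib:
            problems.append("missing required attribute '%s'" % attr)
    action = root.get("action")
    if action is not None and action not in ACTIONS:
        problems.append("unknown action '%s'" % action)
    # The schema as written requires at least one <content/> child.
    contents = root.findall("{%s}content" % NS)
    if not contents:
        problems.append("no <content/> child")
    for content in contents:
        for attr in ("creator", "name"):
            if attr not in content.attrib:
                problems.append("content missing '%s'" % attr)
        creator = content.get("creator")
        if creator is not None and creator not in PARTIES:
            problems.append("invalid creator '%s'" % creator)
        # 'senders' is optional and defaults to 'both'.
        senders = content.get("senders", "both")
        if senders not in SENDERS:
            problems.append("invalid senders '%s'" % senders)
    return problems


# Hypothetical stanza payload, for illustration only.
example = (
    "<jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns'"
    " action='session-initiate' initiator='romeo@montague.net/orchard'"
    " sid='a73sjjvkla37jfea'>"
    "<content creator='initiator' name='voice'/>"
    "</jingle>")
print(check_jingle(example))
```

Returning a list of problems rather than raising on the first one makes it easier to log everything that is wrong with a malformed payload before replying with an error.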
    

15.2 Jingle Errors

<?xml version='1.0' encoding='UTF-8'?>

<xs:schema
    xmlns:xs='http://www.w3.org/2001/XMLSchema'
    targetNamespace='http://www.xmpp.org/extensions/xep-0166.html#ns-errors'
    xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns-errors'
    elementFormDefault='qualified'>

  <xs:element name='out-of-order' type='empty'/>
  <xs:element name='unknown-session' type='empty'/>
  <xs:element name='unsupported-content' type='empty'/>
  <xs:element name='unsupported-transports' type='empty'/>

  <xs:simpleType name='empty'>
    <xs:restriction base='xs:string'>
      <xs:enumeration value=''/>
    </xs:restriction>
  </xs:simpleType>

</xs:schema>
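
As a non-normative illustration, a receiving entity might extract the Jingle-specific condition from a stanza error by looking for a child element qualified by the errors namespace. The Python sketch below assumes the usual XMPP convention of a generic stanza error condition accompanied by an application-specific condition; the function name is hypothetical, and the pairing of <bad-request/> with <unknown-session/> in the sample is chosen for illustration only.

```python
import xml.etree.ElementTree as ET

ERRORS_NS = "http://www.xmpp.org/extensions/xep-0166.html#ns-errors"
JINGLE_CONDITIONS = {
    "out-of-order", "unknown-session",
    "unsupported-content", "unsupported-transports",
}


def jingle_condition(error_xml):
    """Return the Jingle-specific condition in an <error/> element, or None."""
    error = ET.fromstring(error_xml)
    prefix = "{" + ERRORS_NS + "}"
    for child in error:
        # Only children in the Jingle errors namespace are of interest.
        if child.tag.startswith(prefix):
            name = child.tag[len(prefix):]
            if name in JINGLE_CONDITIONS:
                return name
    return None


# Illustrative error payload; the generic condition shown is an assumption.
sample = (
    "<error type='cancel'>"
    "<bad-request xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/>"
    "<unknown-session"
    " xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns-errors'/>"
    "</error>")
print(jingle_condition(sample))
```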
    

16. History

Until Jingle was developed, there existed no widely-adopted standard for initiating and managing peer-to-peer interactions between XMPP entities. Although several large service providers and Jabber client teams had written and implemented their own proprietary XMPP extensions for peer-to-peer signalling (usually only for voice), those technologies were not open and did not always take into account requirements to interoperate with SIP-based technologies. The only existing open protocol was A Transport for Initiating and Negotiating Sessions [27], which made it possible to initiate and manage peer-to-peer sessions, but which did not provide enough of the key signalling semantics to be easily implemented in Jabber/XMPP clients. [28]

The result was an unfortunate fragmentation within the XMPP community regarding signalling protocols. Essentially, there were two possible approaches to solving the problem:

  1. Recommend that all client developers implement a dual-stack (XMPP + SIP) solution.
  2. Define a full-featured protocol for XMPP signalling.

Implementation experience indicates that a dual-stack approach may not be feasible on all the computing platforms for which Jabber clients have been written, or even desirable on platforms where it is feasible. [29] Therefore, it seemed reasonable to define an XMPP signalling protocol that could provide the necessary session management semantics while also making it relatively straightforward to interoperate with existing Internet standards.

As a result of feedback received on XEP-0111, the original authors of this document (Joe Hildebrand and Peter Saint-Andre) began to define such a signalling protocol, code-named Jingle. Upon communication with members of the Google Talk team, [30] it was discovered that the emerging Jingle approach was conceptually (and even syntactically) quite similar to the signalling protocol used in the Google Talk application. Therefore, in the interest of interoperability and adoption, we decided to harmonize the two approaches. The signalling protocol specified herein is, therefore, substantially equivalent to the original Google Talk protocol, with several adjustments based on feedback received from implementors as well as for publication within the XMPP Standards Foundation's standards process.

17. Acknowledgements

The authors would like to thank Rohan Mahy for his valuable input on early versions of this document. Thiago Camargo, Dafydd Harries, Antti Ijäs, Lauri Kaila, Justin Karneges, Jussi Laako, Anthony Minessale, Matt O'Gorman, Rob Taylor, Matt Tucker, Saku Vainio, Brian West, and others have also provided helpful input. Thanks also to those who have commented on the Standards SIG [31] and (earlier) Jingle [32] mailing lists.


Notes

1. RFC 3550: RTP: A Transport Protocol for Real-Time Applications <http://tools.ietf.org/html/rfc3550>.

2. RFC 768: User Datagram Protocol <http://tools.ietf.org/html/rfc0768>.

3. Interactive Connectivity Establishment (ICE): A Methodology for Network Address Translator (NAT) Traversal for Offer/Answer Protocols <http://tools.ietf.org/html/draft-ietf-mmusic-ice>. Work in progress.

4. XEP-0167: Jingle Audio via RTP <http://www.xmpp.org/extensions/xep-0167.html>.

5. RFC 3261: Session Initiation Protocol (SIP) <http://tools.ietf.org/html/rfc3261>.

6. XEP-0180: Jingle Video via RTP <http://www.xmpp.org/extensions/xep-0180.html>.

7. XEP-0177: Jingle Raw UDP Transport Method <http://www.xmpp.org/extensions/xep-0177.html>.

8. RFC 4566: SDP: Session Description Protocol <http://tools.ietf.org/html/rfc4566>.

9. XEP-0176: Jingle ICE Transport Method <http://www.xmpp.org/extensions/xep-0176.html>.

10. ITU Recommendation H.323: Packet-based Multimedia Communications Systems (September 1999).

11. In the event that a session contains two unidirectional streams of the same type because a content-add was issued simultaneously by both parties, it is RECOMMENDED that participants close the duplicate stream in favour of that created by the session initiator, which should be made bidirectional with a 'content-modify' action by the responder.

12. If both parties send modify messages at the same time, the modify message from the session initiator MUST trump the modify message from the recipient and the initiator SHOULD return an <unexpected-request/> error to the other party.

13. A client MUST NOT return an error upon receipt of a 'content-remove' action for a content description that is received after a 'content-remove' action has been sent but before the action has been acknowledged by the peer.

14. If the content-remove results in no more content types for the session, the entity that receives the content-remove SHOULD send a session-terminate action to the other party (since a session with no content types is void).

15. RFC 3921: Extensible Messaging and Presence Protocol (XMPP): Instant Messaging and Presence <http://tools.ietf.org/html/rfc3921>.

16. XEP-0030: Service Discovery <http://www.xmpp.org/extensions/xep-0030.html>.

17. XEP-0115: Entity Capabilities <http://www.xmpp.org/extensions/xep-0115.html>.

18. Naturally, instead of sending service discovery requests to every contact in a user's roster, it is more efficient to use Entity Capabilities, whereby support for Jingle and various Jingle content description formats and content transport methods is determined for a client version in general (rather than on a per-JID basis) and then cached. Refer to XEP-0115 for details.

19. XEP-0168: Resource Application Priority <http://www.xmpp.org/extensions/xep-0168.html>.

20. XEP-0155: Stanza Session Negotiation <http://www.xmpp.org/extensions/xep-0155.html>.

21. Real-Time Transport Protocol (RTP) over Datagram Transport Layer Security (DTLS) <http://tools.ietf.org/html/draft-fischl-mmusic-sdp-dtls>. Work in progress.

22. See <http://www.w3.org/TR/2000/WD-xml-2e-20000814#NT-Nmtoken>.

23. XEP-0030: Service Discovery <http://www.xmpp.org/extensions/xep-0030.html>.

24. The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique parameter values for Internet protocols, such as port numbers and URI schemes. For further information, see <http://www.iana.org/>.

25. The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used in the context of XMPP extension protocols approved by the XMPP Standards Foundation. For further information, see <http://www.xmpp.org/registrar/>.

26. XEP-0053: XMPP Registrar Function <http://www.xmpp.org/extensions/xep-0053.html>.

27. XEP-0111: A Transport for Initiating and Negotiating Sessions <http://www.xmpp.org/extensions/xep-0111.html>.

28. It is true that TINS made it relatively easy to implement an XMPP-to-SIP gateway; however, in line with the long-time Jabber philosophy of "simple clients, complex servers", it would be better to force complexity onto the server-side gateway and to keep the client as simple as possible.

29. For example, one large ISP decided to switch to a pure XMPP approach after having implemented and deployed a dual-stack client for several years.

30. Google Talk is a messaging and voice chat service and client provided by Google; see <http://www.google.com/talk/>.

31. The Standards SIG is a standing Special Interest Group devoted to development of XMPP Extension Protocols. The discussion list of the Standards SIG is the primary venue for discussion of XMPP protocol extensions, as well as for announcements by the XMPP Extensions Editor and XMPP Registrar. To subscribe to the list or view the list archives, visit <http://mail.jabber.org/mailman/listinfo/standards/>.

32. Before this specification was accepted as an XMPP Extension Protocol specification, it was discussed on the semi-private <jingle@jabber.org> mailing list; although that list is no longer used (the Standards list is the preferred discussion venue), for historical purposes it is publicly archived at <http://mail.jabber.org/pipermail/jingle/>.


Revision History

Version 0.18 (2007-11-08)

Added scenarios for various session flows; clarified handling of content-add, content-modify, and content-remove actions; clarified rules for codec priority.

(psa)

Version 0.17 (2007-06-20)

Added <unsupported-info/> error.

(psa)

Version 0.16 (2007-06-06)

Clarified resource determination process and updated text to reflect modifications to XEP-0168.

(psa)

Version 0.15 (2007-05-25)

Rewrote introduction and moved historical text to separate section.

(psa)

Version 0.14 (2007-04-17)

Clarified session lifetime; defined reason attribute and associated registry; further specified conformance requirements.

(psa)

Version 0.13 (2007-03-23)

Simplified signalling process and state chart; removed session-redirect action (use redirect error instead); removed content-decline action; removed transport-* actions (except transport-info for ICE negotiation); removed description-* actions; simplified syntax to allow only one transport per content type; corrected syntax of creator attribute to be either initiator or responder (not JIDs); added profile attribute to content element in order to specify RTP profile in use.

(psa/ram)

Version 0.12 (2006-12-21)

Added creator attribute to content element for prevention of race condition; modified spec to use provisional namespace before advancement to Draft (per XEP-0053).

(psa/ram)

Version 0.11 (2006-10-31)

Completed clarifications and corrections throughout; added section on Jingle Actions.

(psa)

Version 0.10 (2006-09-29)

Made several corrections to the state machines and examples.

(ram/psa)

Version 0.9 (2006-09-08)

Further cleaned up state machines and state-related actions.

(ram/psa)

Version 0.8 (2006-08-23)

Changed channels to components in line with ICE; changed various action names for consistency; added session-extend and session-reduce actions to add and remove description/transport pairs; added description-modify action; added sender attribute to specify directionality.

(ram/psa)

Version 0.7 (2006-07-17)

Added implementation note about handling multiple content types.

(psa)

Version 0.6 (2006-07-12)

Changed media type to content type.

(se/psa)

Version 0.5 (2006-03-20)

Further clarified state machine diagrams; specified that session accept must include agreed-upon media format and transport description; moved deployment notes to appropriate transport spec.

(psa/jb)

Version 0.4 (2006-03-01)

Added glossary; clarified state machines; updated to reflect publication of XEP-0176 and XEP-0177.

(psa/jb)

Version 0.3 (2006-02-24)

Provided more detail about modify scenarios; defined media-specific and transport-specific actions and adjusted state machine accordingly.

(psa/jb)

Version 0.2 (2006-02-13)

Per agreement among the co-authors, moved transport definition to separate specification, simplified state machine, and made other associated changes to the text, examples, and schemas; also harmonized redirect, decline, and terminate processes.

(psa/jb)

Version 0.1 (2005-12-15)

Initial version.

(psa)

Version 0.0.10 (2005-12-11)

More fully documented burst mode, connectivity checks, error cases, etc.

(psa)

Version 0.0.9 (2005-12-08)

Restructured document flow; provided example of burst mode.

(psa)

Version 0.0.8 (2005-12-05)

Distinguished between dribble mode and burst mode, including mode attribute, service discovery features, and implementation notes; provided detailed resource discovery examples; corrected state chart; specified session termination; specified error conditions; specified semantics of informational messages; began to define security considerations; added Joe Beda as co-author.

(psa/sl/jb)

Version 0.0.7 (2005-11-08)

Added more detail to basic session flow; harmonized candidate negotiation process with ICE.

(psa)

Version 0.0.6 (2005-10-27)

Added XMPP Registrar considerations; defined schema; completed slight syntax cleanup.

(psa)

Version 0.0.5 (2005-10-21)

Separated method description formats from signalling protocol.

(psa/sl)

Version 0.0.4 (2005-10-19)

Harmonized basic session flow with Google Talk protocol; added Scott Ludwig as co-author.

(psa/sl)

Version 0.0.3 (2005-10-10)

Added more detail to basic session flow.

(psa)

Version 0.0.2 (2005-10-07)

Protocol cleanup.

(psa/jjh)

Version 0.0.1 (2005-10-06)

First draft.

(psa/jjh)

END