XEP-0167: Jingle Audio Content Description Format

This document defines a content description format for Jingle audio sessions.


WARNING: This Standards-Track document is Experimental. Publication as an XMPP Extension Protocol does not imply approval of this proposal by the XMPP Standards Foundation. Implementation of the protocol described herein is encouraged in exploratory implementations, but production systems should not deploy implementations of this protocol until it advances to a status of Draft.


Document Information

Series: XEP
Number: 0167
Publisher: XMPP Standards Foundation
Status: Experimental
Type: Standards Track
Version: 0.7
Last Updated: 2006-12-21
Approving Body: XMPP Council
Dependencies: XMPP Core, XEP-0166
Supersedes: None
Superseded By: None
Short Name: TO BE ASSIGNED
Wiki Page: <http://wiki.jabber.org/index.php/Jingle Audio Content Description Format (XEP-0167)>

Author Information

Scott Ludwig

Email: scottlu@google.com
JabberID: scottlu@google.com

Peter Saint-Andre

Email: stpeter@jabber.org
JabberID: stpeter@jabber.org

Sean Egan

Email: seanegan@google.com
JabberID: seanegan@google.com

Legal Notice

This XMPP Extension Protocol is copyright 1999 - 2007 by the XMPP Standards Foundation (XSF) and is in full conformance with the XSF's Intellectual Property Rights Policy <http://www.xmpp.org/extensions/ipr-policy.shtml>. This material may be distributed only subject to the terms and conditions set forth in the Creative Commons Attribution License (<http://creativecommons.org/licenses/by/2.5/>).

Discussion Venue

The preferred venue for discussion of this document is the Standards discussion list: <http://mail.jabber.org/mailman/listinfo/standards>.

Relation to XMPP

The Extensible Messaging and Presence Protocol (XMPP) is defined in the XMPP Core (RFC 3920) and XMPP IM (RFC 3921) specifications contributed by the XMPP Standards Foundation to the Internet Standards Process, which is managed by the Internet Engineering Task Force in accordance with RFC 2026. Any protocol defined in this document has been developed outside the Internet Standards Process and is to be understood as an extension to XMPP rather than as an evolution, development, or modification of XMPP itself.

Conformance Terms

The following keywords as used in this document are to be interpreted as described in RFC 2119: "MUST", "SHALL", "REQUIRED"; "MUST NOT", "SHALL NOT"; "SHOULD", "RECOMMENDED"; "SHOULD NOT", "NOT RECOMMENDED"; "MAY", "OPTIONAL".


Table of Contents


1. Introduction
2. Requirements
3. Content Description Format
4. Negotiating a Jingle-Audio Session
5. Mapping to Session Description Protocol
6. Service Discovery
7. Informational Messages
    7.1. Format
    7.2. Examples
8. Error Handling
9. Implementation Notes
    9.1. Codecs
    9.2. DTMF
    9.3. When to Listen
10. Security Considerations
11. IANA Considerations
12. XMPP Registrar Considerations
    12.1. Protocol Namespaces
    12.2. Jingle Content Description Formats
13. XML Schemas
    13.1. Content Description Format
    13.2. Informational Messages
Notes
Revision History


1. Introduction

Jingle [1] can be used to initiate and negotiate a wide range of peer-to-peer sessions. One session type of interest is audio (voice) chat. This document specifies a format for describing Jingle audio sessions.

2. Requirements

The Jingle content description format defined herein is designed to meet the following requirements:

  1. Enable negotiation of parameters necessary for audio chat over Real-time Transport Protocol (RTP; see RFC 3550 [2]).
  2. Map these parameters to Session Description Protocol (SDP; see RFC 4566 [3]) to enable interoperability.
  3. Define informational messages related to audio chat (e.g., busy and ringing).

3. Content Description Format

A Jingle audio session is described by one or more encodings contained within a wrapper <description/> element. In the language of RFC 4566 these encodings are payload-types; therefore, each <payload-type/> element specifies an encoding that can be used for the audio stream. In Jingle Audio, these encodings are used in the context of RTP. The most common encodings for the Audio/Video Profile (AVP) of RTP are listed in RFC 3551 [4] (these "static" types are reserved from payload ID 0 through payload ID 95), although other encodings are allowed (these "dynamic" types use payload IDs 96 to 127) in accordance with the dynamic assignment rules described in Section 3 of RFC 3551.
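
The static/dynamic split described above can be sketched as a small helper. This is only an illustration: the function name `classify_payload_id` is hypothetical and not defined by this specification.

```python
def classify_payload_id(payload_id: int) -> str:
    """Classify an RTP payload ID using the ranges given in the text:
    0 through 95 are "static", 96 through 127 are "dynamic"."""
    if not 0 <= payload_id <= 127:
        raise ValueError(f"payload ID out of range: {payload_id}")
    return "static" if payload_id <= 95 else "dynamic"
```

For instance, payload ID 0 (PCMU) is classified as static, while payload ID 96 is classified as dynamic.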

The allowable attributes are as follows:

Table 1: Defined Attributes

Attribute | Description | Inclusion
channels | The number of channels; if omitted, the payload MUST be assumed to contain one channel | RECOMMENDED
clockrate | The sampling frequency in Hertz | RECOMMENDED
id | The payload identifier | REQUIRED
maxptime | Maximum packet time as specified in RFC 4566 | OPTIONAL
name | The appropriate subtype of the audio MIME type | RECOMMENDED for static payload types, REQUIRED for dynamic payload types
ptime | Packet time as specified in RFC 4566 | OPTIONAL

The encodings SHOULD be provided in order of preference.

Example 1. Audio Description Format

    <description xmlns='http://www.xmpp.org/extensions/xep-0167.html#ns'>
      <payload-type id='96' name='speex' clockrate='16000'/>
      <payload-type id='97' name='speex' clockrate='8000'/>
      <payload-type id='18' name='G729'/>
      <payload-type id='103' name='L16' clockrate='16000' channels='2'/>
      <payload-type id='98' name='x-ISAC' clockrate='8000'/>
      <payload-type id='102' name='iLBC'/>
      <payload-type id='4' name='G723'/>
      <payload-type id='0' name='PCMU' clockrate='8000'/>
      <payload-type id='8' name='PCMA'/>
      <payload-type id='13' name='CN'/>
    </description>
  

The <description/> element is intended to be a child of a <jingle/> element as specified in XEP-0166. (See Protocol Namespaces regarding issuance of a permanent namespace.)

Each <payload-type/> element MAY contain one or more child elements that specify particular parameters related to the payload. For example, as described in draft-ietf-avt-rtp-speex [5], the "ebw", "eng", "mode", "sr", and "vbr" parameters may be specified in relation to usage of the Speex [6] codec. Where such parameters are encoded via the "fmtp" SDP attribute, they shall be represented in Jingle via the following format:

<parameter name='foo' value='bar'/>
  

Note: The parameter names are effectively guaranteed to be unique, since the Internet Assigned Numbers Authority (IANA) [7] maintains a registry of SDP parameters (see <http://www.iana.org/assignments/sdp-parameters>).
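
As a minimal sketch of how a receiving application might read this format, the following parses a <description/> element into simple records, applying the attribute rules from Table 1 (one channel when 'channels' is omitted, 'name' required for dynamic payload types). The `PayloadType` class and `parse_description` function are hypothetical names chosen for illustration; only the namespace and attribute names come from this document.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Provisional namespace used by this specification.
NS = "http://www.xmpp.org/extensions/xep-0167.html#ns"


@dataclass
class PayloadType:
    id: int
    name: Optional[str] = None
    clockrate: Optional[int] = None
    channels: int = 1  # one channel is assumed when the attribute is omitted
    ptime: Optional[int] = None
    maxptime: Optional[int] = None
    parameters: Dict[str, str] = field(default_factory=dict)


def _opt_int(element: ET.Element, attr: str) -> Optional[int]:
    """Return an optional integer attribute, or None if it is absent."""
    value = element.get(attr)
    return int(value) if value is not None else None


def parse_description(xml_text: str) -> List[PayloadType]:
    """Parse a <description/> element, keeping the offered order of preference."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{NS}}}description":
        raise ValueError("not a Jingle Audio <description/> element")
    payload_types = []
    for el in root.findall(f"{{{NS}}}payload-type"):
        pt = PayloadType(
            id=int(el.attrib["id"]),  # 'id' is REQUIRED
            name=el.get("name"),
            clockrate=_opt_int(el, "clockrate"),
            channels=int(el.get("channels", "1")),
            ptime=_opt_int(el, "ptime"),
            maxptime=_opt_int(el, "maxptime"),
            parameters={
                p.attrib["name"]: p.attrib["value"]
                for p in el.findall(f"{{{NS}}}parameter")
            },
        )
        if pt.id >= 96 and pt.name is None:
            raise ValueError(f"dynamic payload type {pt.id} lacks a name")
        payload_types.append(pt)
    return payload_types
```

Because the list preserves document order, the first entry is the sender's most preferred encoding.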

4. Negotiating a Jingle-Audio Session

Upon receiving a Jingle initiate stanza containing a Jingle Audio content description as defined in this document, a target entity iterates through the list of offered payload types, composing an appropriate Jingle Audio response description according to the following rules:

If, after applying these rules, the target entity determines it does not support any of the offered encodings, the target entity MUST reject the session by sending an <unsupported-codecs/> error in response to the initiator's "initiate" message. Otherwise, it MUST provisionally accept the session by sending an empty IQ result. If the response content description differs from the one offered, the target entity MUST then propose the change in a "description-modify" message as defined in XEP-0166. If the description is identical, the target entity MUST send a "description-accept" message (either explicitly, or implicitly as part of a "content-accept" message).
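
The decision described in this section can be sketched as follows. The selection rules themselves are not reproduced in this text, so the sketch assumes they amount to keeping each offered payload type that the target supports, in the offered order. Matching on the encoding name alone is a further simplification (a real implementation would likely also compare clock rate and channels). The function name `negotiate` and the returned labels are illustrative only.

```python
from typing import Iterable, List, Tuple

# (payload ID, encoding name), listed in the initiator's order of preference.
PayloadOffer = Tuple[int, str]


def negotiate(
    offered: List[PayloadOffer], supported_names: Iterable[str]
) -> Tuple[str, List[PayloadOffer]]:
    """Return the follow-up step and the response payload types.

    ASSUMPTION: a payload type is kept when its name is supported;
    offered order is preserved.
    """
    supported = {name.lower() for name in supported_names}
    accepted = [(pid, name) for pid, name in offered if name.lower() in supported]
    if not accepted:
        # No common encoding: reject with an <unsupported-codecs/> error.
        return "unsupported-codecs", []
    if accepted == offered:
        # Identical description: provisionally accept, then description-accept.
        return "description-accept", accepted
    # Different description: provisionally accept, then description-modify.
    return "description-modify", accepted
```

With an offer of speex and PCMU and a target that supports only PCMU, this sketch selects "description-modify" with PCMU as the sole payload type, which matches the flow in the examples that follow.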

Following is an example of this negotiation:

Example 2. Initiation Example

  <iq to='juliet@capulet.com/balcony' from='romeo@montague.net/orchard' id='jingleaudio1' type='set'>
    <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns'
            action='session-initiate'
            initiator='romeo@montague.net/orchard'
            sid='a73sjjvkla37jfea'>
      <content name='audio'>
        <description xmlns='http://www.xmpp.org/extensions/xep-0167.html#ns'>
          <payload-type id='96' name='speex' clockrate='16000'/>
          <payload-type id='0' name='PCMU' />
        </description>
        <transport xmlns='http://jabber.org/protocol/jingle/transport/ice'>
          ...
        </transport>
      </content>
    </jingle>
  </iq>
    

The target entity now follows the rules provided in this section and determines it can only support PCMU. It provisionally accepts the session:

Example 3. Target Provisionally Accepts Session

  <iq to='romeo@montague.net/orchard' from='juliet@capulet.com/balcony' id='jingleaudio1' type='result' />
  

It then offers the new content description in a 'description-modify' message:

Example 4. Target Proposes New Content Description

  <iq to='romeo@montague.net/orchard' from='juliet@capulet.com/balcony' id='jingleaudio2' type='set'>
    <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns'
            action='description-modify'
            initiator='romeo@montague.net/orchard'
            sid='a73sjjvkla37jfea'>
      <content name='audio'>
        <description xmlns='http://www.xmpp.org/extensions/xep-0167.html#ns'>
          <payload-type id='0' name='PCMU' />
        </description> 
      </content>
    </jingle>
  </iq>
    

The initiator acknowledges the 'description-modify' with an empty IQ result, and sends a 'description-accept' to accept the new Jingle Audio content description.

Example 5. Initiator Accepts New Content Description

  <iq to='juliet@capulet.com/balcony' from='romeo@montague.net/orchard' id='jingleaudio2' type='result' />

  <iq to='juliet@capulet.com/balcony' from='romeo@montague.net/orchard' id='jingleaudio3' type='set'>
    <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns'
     action='description-accept' initiator='romeo@montague.net/orchard' sid='a73sjjvkla37jfea'>
       <content name='audio'>
         <description xmlns='http://www.xmpp.org/extensions/xep-0167.html#ns'>
           <payload-type id='0' name='PCMU' />
         </description>
       </content>
    </jingle>
  </iq>
  

Finally, the target acknowledges the 'description-accept'.

Example 6. Target Acknowledges Description-Accept

  <iq to='romeo@montague.net/orchard' from='juliet@capulet.com/balcony' id='jingleaudio3' type='result' />
  

5. Mapping to Session Description Protocol

If the payload type is static (payload-type IDs 0 through 95 inclusive), it MUST be mapped to a media field defined in RFC 4566: Session Description Protocol (SDP). The generic format for the media field is as follows:

m=<media> <port> <transport> <fmt list>
  

In the context of Jingle audio sessions, the <media> is "audio", the <port> is the preferred port for such communications (which may be determined dynamically), the <transport> is whatever transport method is negotiated via the Jingle negotiation (e.g., "RTP/AVP"), and the <fmt list> is the payload-type ID.

For example, consider the following static payload-type:

Example 7. Jingle Format for Static Payload-Type

<payload-type id="13" name="CN"/>
  

Example 8. SDP Mapping of Static Payload-Type

m=audio 9999 RTP/AVP 13
  

If the payload type is dynamic (payload-type IDs 96 through 127 inclusive), it SHOULD be mapped to an SDP media field plus an SDP attribute field named "rtpmap".

For example, consider a payload of Speex-encoded audio sampled at 16 kHz associated with dynamic payload-type 96:

Example 9. Jingle Format for Dynamic Payload-Type

<payload-type id='96' name='speex' clockrate='16000'/>
  

Example 10. SDP Mapping of Dynamic Payload-Type

m=audio 9999 RTP/AVP 96
a=rtpmap:96 speex/16000
  

As noted, if additional parameters are to be specified, they shall be represented as attributes of the <payload-type/> element or of the child <parameter/> element, as in the following example.

Example 11. Jingle Format for Dynamic Payload-Type With Parameters

<payload-type id='96' name='speex' clockrate='16000' ptime='40'>
  <parameter name='vbr' value='on'/>
  <parameter name='cng' value='on'/>
</payload-type>
  

Example 12. SDP Mapping of Dynamic Payload-Type With Parameters

m=audio 9999 RTP/AVP 96
a=rtpmap:96 speex/16000
a=ptime:40
a=fmtp:96 vbr=on;cng=on
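
The mapping illustrated in Examples 7 through 12 can be sketched as a single function. This is an illustration under stated assumptions rather than a normative algorithm: the name `to_sdp` is hypothetical, the default port 9999 is simply the placeholder used in the examples, the transport is fixed to RTP/AVP, and raising an error when a dynamic payload type lacks a name or clock rate is a choice of this sketch (the rtpmap syntax in RFC 4566 requires both).

```python
from typing import Dict, List, Optional


def to_sdp(
    payload_id: int,
    name: Optional[str] = None,
    clockrate: Optional[int] = None,
    channels: int = 1,
    ptime: Optional[int] = None,
    maxptime: Optional[int] = None,
    parameters: Optional[Dict[str, str]] = None,
    port: int = 9999,  # placeholder port taken from the examples
) -> List[str]:
    """Map one Jingle <payload-type/> to SDP lines."""
    lines = [f"m=audio {port} RTP/AVP {payload_id}"]
    if payload_id >= 96:
        # Dynamic payload types additionally get an rtpmap attribute.
        if name is None or clockrate is None:
            raise ValueError("rtpmap needs both a name and a clock rate")
        rtpmap = f"a=rtpmap:{payload_id} {name}/{clockrate}"
        if channels != 1:
            # For audio, the optional rtpmap encoding parameter is the channel count.
            rtpmap += f"/{channels}"
        lines.append(rtpmap)
    if ptime is not None:
        lines.append(f"a=ptime:{ptime}")
    if maxptime is not None:
        lines.append(f"a=maxptime:{maxptime}")
    if parameters:
        # <parameter/> children are joined into a single fmtp attribute.
        fmtp = ";".join(f"{key}={value}" for key, value in parameters.items())
        lines.append(f"a=fmtp:{payload_id} {fmtp}")
    return lines
```

Calling `to_sdp(96, name="speex", clockrate=16000, ptime=40, parameters={"vbr": "on", "cng": "on"})` yields the four lines shown in Example 12.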
  

6. Service Discovery

If an entity supports the Jingle audio content description format, it MUST advertise that fact by returning a feature of "http://www.xmpp.org/extensions/xep-0167.html#ns" (see Protocol Namespaces) in response to Service Discovery [8] information requests.

Example 13. Service Discovery Information Request

<iq from='romeo@montague.net/orchard'
    id='disco1'
    to='juliet@capulet.com/balcony'
    type='get'>
  <query xmlns='http://jabber.org/protocol/disco#info'/>
</iq>
  

Example 14. Service Discovery Information Response

<iq from='juliet@capulet.com/balcony'
    id='disco1'
    to='romeo@montague.net/orchard'
    type='result'>
  <query xmlns='http://jabber.org/protocol/disco#info'>
    ...
    <feature var='http://www.xmpp.org/extensions/xep-0166.html#ns'/>
    <feature var='http://www.xmpp.org/extensions/xep-0167.html#ns'/>
    ...
  </query>
</iq>
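
A requesting entity might check a service discovery response for the Jingle audio feature along the following lines. This is a sketch only: the function name `supports_jingle_audio` is hypothetical, and the stanza is assumed to be available as a standalone XML string.

```python
import xml.etree.ElementTree as ET

DISCO_INFO_NS = "http://jabber.org/protocol/disco#info"
JINGLE_AUDIO_NS = "http://www.xmpp.org/extensions/xep-0167.html#ns"


def supports_jingle_audio(iq_xml: str) -> bool:
    """Return True if a disco#info result advertises the Jingle audio feature."""
    iq = ET.fromstring(iq_xml)
    if iq.get("type") != "result":
        return False
    query = iq.find(f"{{{DISCO_INFO_NS}}}query")
    if query is None:
        return False
    features = query.findall(f"{{{DISCO_INFO_NS}}}feature")
    return any(feature.get("var") == JINGLE_AUDIO_NS for feature in features)
```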
  

7. Informational Messages

7.1 Format

Informational messages may be sent by either party within the context of Jingle to communicate the status of a Jingle audio session, device, or principal. The informational message MUST be an IQ-set containing a <jingle/> element whose 'action' attribute is "description-info", where the informational message is a payload element qualified by the 'http://www.xmpp.org/extensions/xep-0167.html#ns-info' namespace; the following payload elements are defined: [9]

Table 2: Information Payload Elements

Element | Meaning
<busy/> | The principal or device is currently unavailable for a session because it is busy with another (audio or other) session.
<hold/> | The principal is temporarily pausing the chat (i.e., putting the other party on hold).
<mute/> | The principal is temporarily stopping audio input but continues to accept audio output.
<ringing/> | The device is ringing but the principal has not yet interacted with it to answer (maps to the SIP 180 response code).

Note: Because the informational message is sent in an IQ-set, the receiving party MUST return either an IQ-result or an IQ-error (normally only an IQ-result to acknowledge receipt; no error flows are defined or envisioned at this time).

7.2 Examples

Example 15. Receiver Sends Busy Message

<iq from='juliet@capulet.com/balcony'
    to='romeo@montague.net/orchard'
    id='busy1'
    type='set'>
  <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns'
          action='description-info'
          initiator='romeo@montague.net/orchard'
          sid='a73sjjvkla37jfea'>
    <busy xmlns='http://www.xmpp.org/extensions/xep-0167.html#ns-info'/>
  </jingle>
</iq>
    

Example 16. Receiver Sends Hold Message

<iq from='juliet@capulet.com/balcony'
    to='romeo@montague.net/orchard'
    id='hold1'
    type='set'>
  <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns'
          action='description-info'
          initiator='romeo@montague.net/orchard'
          sid='a73sjjvkla37jfea'>
    <hold xmlns='http://www.xmpp.org/extensions/xep-0167.html#ns-info'/>
  </jingle>
</iq>
    

Example 17. Receiver Sends Mute Message

<iq from='juliet@capulet.com/balcony'
    to='romeo@montague.net/orchard'
    id='mute1'
    type='set'>
  <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns'
          action='description-info'
          initiator='romeo@montague.net/orchard'
          sid='a73sjjvkla37jfea'>
    <mute xmlns='http://www.xmpp.org/extensions/xep-0167.html#ns-info'/>
  </jingle>
</iq>
    

Example 18. Receiver Sends Ringing Message

<iq from='juliet@capulet.com/balcony'
    to='romeo@montague.net/orchard'
    id='ringing1'
    type='set'>
  <jingle xmlns='http://www.xmpp.org/extensions/xep-0166.html#ns'
          action='description-info'
          initiator='romeo@montague.net/orchard'
          sid='a73sjjvkla37jfea'>
    <ringing xmlns='http://www.xmpp.org/extensions/xep-0167.html#ns-info'/>
  </jingle>
</iq>
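
The four examples above differ only in the payload element, so an implementation might build them with one helper. The following is a sketch under that assumption; the function name `build_info_message` and its parameter names are hypothetical, while the namespaces, the 'description-info' action, and the element names are those used in this section.

```python
import xml.etree.ElementTree as ET

JINGLE_NS = "http://www.xmpp.org/extensions/xep-0166.html#ns"
INFO_NS = "http://www.xmpp.org/extensions/xep-0167.html#ns-info"
INFO_ELEMENTS = {"busy", "hold", "mute", "ringing"}


def build_info_message(
    sender: str, recipient: str, stanza_id: str,
    initiator: str, sid: str, info: str,
) -> ET.Element:
    """Build an IQ-set carrying one informational payload element."""
    if info not in INFO_ELEMENTS:
        raise ValueError(f"undefined informational element: {info}")
    iq = ET.Element(
        "iq",
        {"from": sender, "to": recipient, "id": stanza_id, "type": "set"},
    )
    jingle = ET.SubElement(
        iq,
        f"{{{JINGLE_NS}}}jingle",
        {"action": "description-info", "initiator": initiator, "sid": sid},
    )
    ET.SubElement(jingle, f"{{{INFO_NS}}}{info}")
    return iq
```

Serializing the result with `ET.tostring` produces generated namespace prefixes instead of the default-namespace declarations shown in the examples; the two forms are equivalent XML.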
    

8. Error Handling

The Jingle Audio-specific error conditions are as follows:

Table 3: Other Error Conditions

Jingle Condition | XMPP Condition | Description
<unsupported-codecs/> | <not-acceptable/> | The recipient does not support any of the offered audio encodings.

9. Implementation Notes

9.1 Codecs

Support for the Speex codec is RECOMMENDED.

9.2 DTMF

If it is necessary to send Dual Tone Multi-Frequency (DTMF) tones, it is REQUIRED to use the XML format specified in Jingle DTMF [10].

9.3 When to Listen

When the Jingle Audio content is accepted, either by a 'content-accept' action or by a combination of 'description-accept' and 'transport-accept' actions, both the receiving and the sending entity SHOULD start listening for audio as defined by the negotiated transport method and audio description. For interoperability with telephony systems, each entity SHOULD at this point, before the receiver sends a 'session-accept' action, both play any audio received and send a ringing tone.

10. Security Considerations

The description of a format for audio sessions introduces no known security vulnerabilities.

11. IANA Considerations

This document requires no interaction with the Internet Assigned Numbers Authority (IANA) [11].

12. XMPP Registrar Considerations

12.1 Protocol Namespaces

Until this specification advances to a status of Draft, its associated namespaces shall be "http://www.xmpp.org/extensions/xep-0167.html#ns" and "http://www.xmpp.org/extensions/xep-0167.html#ns-info"; upon advancement of this specification, the XMPP Registrar [12] shall issue permanent namespaces in accordance with the process defined in Section 4 of XMPP Registrar Function [13].

12.2 Jingle Content Description Formats

The XMPP Registrar shall include "audio" in its registry of Jingle content description formats. The registry submission is as follows:

<content>
  <name>audio</name>
  <desc>Jingle sessions that support audio exchanges</desc>
  <doc>XEP-0167</doc>
</content>
    

13. XML Schemas

13.1 Content Description Format

<?xml version='1.0' encoding='UTF-8'?>

<xs:schema
    xmlns:xs='http://www.w3.org/2001/XMLSchema'
    targetNamespace='http://www.xmpp.org/extensions/xep-0167.html#ns'
    xmlns='http://www.xmpp.org/extensions/xep-0167.html#ns'
    elementFormDefault='qualified'>

  <xs:element name='description'>
    <xs:complexType>
      <xs:sequence minOccurs='0' maxOccurs='unbounded'>
        <xs:element ref='payload-type'/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:element name='payload-type'>
    <xs:complexType>
      <xs:sequence minOccurs='0' maxOccurs='unbounded'>
        <xs:element ref='parameter'/>
      </xs:sequence>
      <xs:attribute name='channels' type='xs:byte' use='optional' default='1'/>
      <xs:attribute name='clockrate' type='xs:short' use='optional'/>
      <xs:attribute name='id' type='xs:unsignedByte' use='required'/>
      <xs:attribute name='maxptime' type='xs:short' use='optional'/>
      <xs:attribute name='name' type='xs:string' use='optional'/>
      <xs:attribute name='ptime' type='xs:short' use='optional'/>
    </xs:complexType>
  </xs:element>

  <xs:element name='parameter'>
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base='empty'>
          <xs:attribute name='name' type='xs:string' use='required'/>
          <xs:attribute name='value' type='xs:string' use='required'/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>

  <xs:simpleType name='empty'>
    <xs:restriction base='xs:string'>
      <xs:enumeration value=''/>
    </xs:restriction>
  </xs:simpleType>

</xs:schema>
    

13.2 Informational Messages

<?xml version='1.0' encoding='UTF-8'?>

<xs:schema
    xmlns:xs='http://www.w3.org/2001/XMLSchema'
    targetNamespace='http://www.xmpp.org/extensions/xep-0167.html#ns-info'
    xmlns='http://www.xmpp.org/extensions/xep-0167.html#ns-info'
    elementFormDefault='qualified'>

  <xs:element name='busy' type='empty'/>
  <xs:element name='hold' type='empty'/>
  <xs:element name='mute' type='empty'/>
  <xs:element name='ringing' type='empty'/>

  <xs:simpleType name='empty'>
    <xs:restriction base='xs:string'>
      <xs:enumeration value=''/>
    </xs:restriction>
  </xs:simpleType>

</xs:schema>
    

Notes

1. XEP-0166: Jingle <http://www.xmpp.org/extensions/xep-0166.html>.

2. RFC 3550: RTP: A Transport Protocol for Real-Time Applications <http://tools.ietf.org/html/rfc3550>.

3. RFC 4566: SDP: Session Description Protocol <http://tools.ietf.org/html/rfc4566>.

4. RFC 3551: RTP Profile for Audio and Video Conferences with Minimal Control <http://tools.ietf.org/html/rfc3551>.

5. This Internet-Draft has expired; see <http://www.watersprings.org/pub/id/draft-ietf-avt-rtp-speex-00.txt> for an archived version.

6. See <http://www.speex.org/>.

7. The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique parameter values for Internet protocols, such as port numbers and URI schemes. For further information, see <http://www.iana.org/>.

8. XEP-0030: Service Discovery <http://www.xmpp.org/extensions/xep-0030.html>.

9. A <trying/> element (equivalent to the SIP 100 Trying response code) is not necessary, since each session-level action is acknowledged via XMPP IQ semantics.

10. XEP-0181: Jingle DTMF <http://www.xmpp.org/extensions/xep-0181.html>.

11. The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique parameter values for Internet protocols, such as port numbers and URI schemes. For further information, see <http://www.iana.org/>.

12. The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used in the context of XMPP extension protocols approved by the XMPP Standards Foundation. For further information, see <http://www.xmpp.org/registrar/>.

13. XEP-0053: XMPP Registrar Function <http://www.xmpp.org/extensions/xep-0053.html>.


Revision History

Version 0.7 (2006-12-21)

Modified spec to use provisional namespace before advancement to Draft (per XEP-0053).

(psa)

Version 0.6 (2006-10-31)

Specified how to include SDP parameters and codec-specific parameters; clarified negotiation process; added Speex examples; removed queued info message.

(psa/se)

Version 0.5 (2006-08-23)

Modified namespace to track XEP-0166.

(psa)

Version 0.4 (2006-07-12)

Specified when to play received audio (early media); specified that DTMF must use in-band signalling (XEP-0181).

(se/psa)

Version 0.3 (2006-03-20)

Defined info messages for hold and mute.

(psa)

Version 0.2 (2006-02-13)

Defined info message for busy; added info message examples; recommended use of Speex; updated schema and XMPP Registrar considerations.

(psa)

Version 0.1 (2005-12-15)

Initial version.

(psa)

Version 0.0.3 (2005-12-05)

Described service discovery usage; defined initial informational messages.

(psa)

Version 0.0.2 (2005-10-27)

Added SDP mapping, security considerations, IANA considerations, XMPP Registrar considerations, and XML schema.

(psa)

Version 0.0.1 (2005-10-21)

First draft.

(psa/sl)

END