JEP-0167: Jingle Audio Content Description Format

This document defines a content description format for Jingle audio sessions.


WARNING: This Standards-Track JEP is Experimental. Publication as a Jabber Enhancement Proposal does not imply approval of this proposal by the Jabber Software Foundation. Implementation of the protocol described herein is encouraged in exploratory implementations, but production systems should not deploy implementations of this protocol until it advances to a status of Draft.


JEP Information

Status: Experimental
Type: Standards Track
Number: 0167
Version: 0.4
Last Updated: 2006-07-12
JIG: Standards JIG
Approving Body: Jabber Council
Dependencies: XMPP Core, JEP-0166
Supersedes: None
Superseded By: None
Short Name: jingle-audio
Wiki Page: <http://wiki.jabber.org/index.php/Jingle Audio Content Description Format (JEP-0167)>

Author Information

Scott Ludwig

Email: scottlu@google.com
JID: scottlu@google.com

Peter Saint-Andre

Email: stpeter@jabber.org
JID: stpeter@jabber.org

Sean Egan

Email: seanegan@google.com
JID: seanegan@google.com

Legal Notice

This Jabber Enhancement Proposal is copyright 1999 - 2006 by the Jabber Software Foundation (JSF) and is in full conformance with the JSF's Intellectual Property Rights Policy <http://www.jabber.org/jsf/ipr-policy.shtml>. This material may be distributed only subject to the terms and conditions set forth in the Creative Commons Attribution License (<http://creativecommons.org/licenses/by/2.5/>).

Discussion Venue

The preferred venue for discussion of this document is the Standards-JIG discussion list: <http://mail.jabber.org/mailman/listinfo/standards-jig>.

Relation to XMPP

The Extensible Messaging and Presence Protocol (XMPP) is defined in the XMPP Core (RFC 3920) and XMPP IM (RFC 3921) specifications contributed by the Jabber Software Foundation to the Internet Standards Process, which is managed by the Internet Engineering Task Force in accordance with RFC 2026. Any protocol defined in this JEP has been developed outside the Internet Standards Process and is to be understood as an extension to XMPP rather than as an evolution, development, or modification of XMPP itself.

Conformance Terms

The following keywords as used in this document are to be interpreted as described in RFC 2119: "MUST", "SHALL", "REQUIRED"; "MUST NOT", "SHALL NOT"; "SHOULD", "RECOMMENDED"; "SHOULD NOT", "NOT RECOMMENDED"; "MAY", "OPTIONAL".


Table of Contents

1. Introduction
2. Requirements
3. Content Description Format
4. Mapping to Session Description Protocol
5. Service Discovery
6. Informational Messages
6.1. Format
6.2. Examples
7. Implementation Notes
7.1. Codecs
7.2. DTMF
8. Security Considerations
9. IANA Considerations
10. Jabber Registrar Considerations
10.1. Protocol Namespaces
10.2. Jingle Content Description Formats
11. XML Schemas
11.1. Content Description Format
11.2. Informational Messages
Notes
Revision History


1. Introduction

Jingle [1] can be used to initiate and negotiate a wide range of peer-to-peer sessions. The first session type of interest is audio chat. This document specifies a format for describing Jingle audio sessions.

2. Requirements

The Jingle content description format defined herein is designed to meet the following requirements:

  1. Enable negotiation of parameters necessary for audio chat over the Real-time Transport Protocol (RTP; see RFC 3550 [2]).
  2. Map these parameters to Session Description Protocol (SDP; see RFC 2327 [3]) to enable interoperability.
  3. Define informational messages related to audio chat (e.g., busy and ringing).

3. Content Description Format

A Jingle audio session is described by one or more encodings contained within a wrapper <description/> element. In the language of RFC 2327 these encodings are payload-types; therefore, each <payload-type/> element specifies an encoding that can be used for the audio stream. In Jingle Audio, these encodings are used in the context of RTP. The most common encodings for the Audio/Video Profile (AVP) of RTP are listed in RFC 3551 [4] (these "static" types are reserved from payload ID 0 through payload ID 95), although other encodings are allowed (these "dynamic" types use payload IDs 96 to 127) in accordance with the dynamic assignment rules described in Section 3 of RFC 3551.

The 'id' attribute is REQUIRED. The 'name' attribute is RECOMMENDED for static payload types and REQUIRED for dynamic payload types. The 'clockrate' attribute is RECOMMENDED and should specify the sampling frequency in hertz. The 'channels' attribute is RECOMMENDED and should specify the number of channels; if it is omitted, the encoding SHOULD be assumed to contain one channel.

The encodings SHOULD be provided in order of preference.

Example 1. Audio Description Format

    <description xmlns='http://jabber.org/protocol/jingle/content/audio'>
      <payload-type id='18' name='G729'/>
      <payload-type id="97" name="IPCMWB"/>
      <payload-type id='98' name='L16' clockrate='16000' channels='2'/>
      <payload-type id="96" name="ISAC" clockrate="8000"/>
      <payload-type id="102" name="iLBC"/>
      <payload-type id="4" name="G723"/>
      <payload-type id="100" name="EG711U"/>
      <payload-type id="101" name="EG711A"/>
      <payload-type id="0" name="PCMU" clockrate="8000"/>
      <payload-type id="8" name="PCMA"/>
      <payload-type id="13" name="CN"/>
    </description>
  

The <description/> element is intended to be a child of a <jingle/> element as specified in JEP-0166.
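The attribute rules above can be checked mechanically. The following is a minimal sketch in Python, not part of the protocol: the function name parse_description and the dictionary layout are hypothetical, and the code assumes the RFC 3551 convention that dynamic payload types begin at ID 96. It returns the payload types in document order, which is the sender's order of preference.

```python
import xml.etree.ElementTree as ET

AUDIO_NS = "http://jabber.org/protocol/jingle/content/audio"

def parse_description(xml_text):
    """Parse a <description/> element into a list of payload-type dicts.

    Hypothetical helper: the list keeps document order, i.e. the
    sender's order of preference.
    """
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{AUDIO_NS}}}description":
        raise ValueError("not a Jingle audio <description/> element")

    payloads = []
    for pt in root.findall(f"{{{AUDIO_NS}}}payload-type"):
        if "id" not in pt.attrib:
            raise ValueError("the 'id' attribute is REQUIRED")
        pt_id = int(pt.get("id"))
        if not 0 <= pt_id <= 127:
            raise ValueError(f"payload ID {pt_id} is outside 0..127")

        name = pt.get("name")
        # Assumption: IDs 96..127 are dynamic (RFC 3551), so 'name' is REQUIRED.
        if pt_id >= 96 and name is None:
            raise ValueError("'name' is REQUIRED for dynamic payload types")

        clockrate = pt.get("clockrate")
        payloads.append({
            "id": pt_id,
            "name": name,
            "clockrate": int(clockrate) if clockrate is not None else None,
            # An omitted 'channels' attribute is treated as one channel.
            "channels": int(pt.get("channels", "1")),
        })
    return payloads
```

A receiver could walk the returned list from the front and pick the first encoding it also supports.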

When the session is provisionally accepted, as indicated by the target entity sending an empty IQ result in response to an 'initiate' message, both receiving and sending entities SHOULD start listening for audio as defined by the negotiated transport method. For interoperability with telephony systems, each entity SHOULD play any audio received at this time, before the target sends an 'accept' message.

4. Mapping to Session Description Protocol

If the payload type is static (payload-type IDs 0 through 95 inclusive), it MUST be mapped to a media field defined in RFC 2327: Session Description Protocol (SDP). The generic format for the media field is as follows:

m=<media> <port> <transport> <fmt list>
  

In the context of Jingle audio sessions, the <media> is "audio", the <port> is the preferred port for such communications (which may be determined dynamically), the <transport> is whatever transport method is negotiated via the Jingle negotiation (e.g., "RTP/AVP"), and the <fmt list> is the payload-type ID.

For example, consider the following static payload-type:

Example 2. Jingle Format for Static Payload-Type

<payload-type id="13" name="CN"/>
  

Example 3. SDP Mapping of Static Payload-Type

m=audio 9999 RTP/AVP 13
  

If the payload type is dynamic (payload-type IDs 96 through 127 inclusive), it SHOULD be mapped to an SDP media field plus an SDP attribute field named "rtpmap".

For example, consider a payload of 16-bit linear-encoded stereo audio sampled at 16 kHz associated with dynamic payload-type 98:

Example 4. Jingle Format for Dynamic Payload-Type

<payload-type id='98' name='L16' clockrate='16000' channels='2'/>
  

Example 5. SDP Mapping of Dynamic Payload-Type

m=audio 9999 RTP/AVP 98
a=rtpmap:98 L16/16000/2
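The two mapping rules can be combined into one routine. The sketch below is illustrative only: the function name payload_to_sdp is hypothetical, the default port 9999 is simply the placeholder used in the examples, and the code assumes that dynamic payload types begin at ID 96 as in RFC 3551. The rtpmap value follows the SDP form "<payload type> <encoding name>/<clock rate>[/<channels>]".

```python
def payload_to_sdp(pt_id, name=None, clockrate=None, channels=1,
                   port=9999, transport="RTP/AVP"):
    """Return the SDP lines for one Jingle audio payload-type.

    Hypothetical helper; 'port' and 'transport' would normally come
    from the negotiated Jingle transport, not from defaults.
    """
    if not 0 <= pt_id <= 127:
        raise ValueError(f"payload ID {pt_id} is outside 0..127")

    # Every payload type maps to a media field.
    lines = [f"m=audio {port} {transport} {pt_id}"]

    # Assumption: IDs 96..127 are dynamic and also get an rtpmap attribute.
    if pt_id >= 96:
        if name is None or clockrate is None:
            raise ValueError("rtpmap needs an encoding name and a clock rate")
        rtpmap = f"a=rtpmap:{pt_id} {name}/{clockrate}"
        if channels != 1:
            # The channel count is only appended when it is not one.
            rtpmap += f"/{channels}"
        lines.append(rtpmap)
    return lines
```

Calling payload_to_sdp(13, name="CN") yields the single media line of Example 3, and payload_to_sdp(98, "L16", 16000, 2) yields the two lines of Example 5.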
  

5. Service Discovery

If an entity supports the Jingle audio content description format, it MUST advertise that fact by returning a feature of "http://jabber.org/protocol/jingle/content/audio" in response to Service Discovery [5] information requests.

Example 6. Service Discovery Information Request

<iq from='romeo@montague.net/orchard'
    id='disco1'
    to='juliet@capulet.com/balcony'
    type='get'>
  <query xmlns='http://jabber.org/protocol/disco#info'/>
</iq>
  

Example 7. Service Discovery Information Response

<iq from='juliet@capulet.com/balcony'
    id='disco1'
    to='romeo@montague.net/orchard'
    type='result'>
  <query xmlns='http://jabber.org/protocol/disco#info'>
    ...
    <feature var='http://jabber.org/protocol/jingle'/>
    <feature var='http://jabber.org/protocol/jingle/content/audio'/>
    ...
  </query>
</iq>
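
An entity that receives such a response might test for audio support as follows. This is a minimal sketch using only the Python standard library; the function name supports_jingle_audio is hypothetical, and a real client would normally obtain the parsed stanza from its XMPP library rather than from a string.

```python
import xml.etree.ElementTree as ET

DISCO_INFO_NS = "http://jabber.org/protocol/disco#info"
JINGLE_AUDIO_FEATURE = "http://jabber.org/protocol/jingle/content/audio"

def supports_jingle_audio(iq_xml):
    """Return True if a disco#info result advertises Jingle audio."""
    iq = ET.fromstring(iq_xml)
    if iq.get("type") != "result":
        return False
    query = iq.find(f"{{{DISCO_INFO_NS}}}query")
    if query is None:
        return False
    # Collect the 'var' value of every advertised feature.
    features = {f.get("var")
                for f in query.findall(f"{{{DISCO_INFO_NS}}}feature")}
    return JINGLE_AUDIO_FEATURE in features
```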
  

6. Informational Messages

6.1 Format

Informational messages may be sent by either party within the context of Jingle to communicate the status of a Jingle audio session, device, or principal. The informational message MUST be an IQ-set containing a <jingle/> element whose 'action' attribute has a value of "content-info", where the informational message is a payload element qualified by the 'http://jabber.org/protocol/jingle/info/audio' namespace; the following payload elements are defined:

Table 1: Information Payload Elements

Element      Meaning
<busy/>      The principal or device is currently unavailable for a session because it is busy with another (audio or other) session.
<hold/>      The principal is temporarily pausing the chat (i.e., putting the other party on hold).
<mute/>      The principal is temporarily stopping audio input but continues to accept audio output.
<queued/>    The audio session request is queued for pickup by the principal.
<ringing/>   The device is ringing but the principal has not yet interacted with it to answer.

Note: Because the informational message is sent in an IQ-set, the receiving party MUST return either an IQ-result or an IQ-error (normally only an IQ-result to acknowledge receipt; no error flows are defined or envisioned at this time).
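
The format described above can be assembled programmatically. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the function name build_info_iq and its parameter names are hypothetical. It restricts the payload to the five elements defined in Table 1.

```python
import xml.etree.ElementTree as ET

JINGLE_NS = "http://jabber.org/protocol/jingle"
INFO_NS = "http://jabber.org/protocol/jingle/info/audio"
INFO_ELEMENTS = {"busy", "hold", "mute", "queued", "ringing"}

def build_info_iq(sender, recipient, iq_id, initiator, sid, info):
    """Build an IQ-set carrying one informational payload element."""
    if info not in INFO_ELEMENTS:
        raise ValueError(f"unknown informational element: {info!r}")
    iq = ET.Element("iq", {
        "from": sender, "to": recipient, "id": iq_id, "type": "set",
    })
    jingle = ET.SubElement(iq, f"{{{JINGLE_NS}}}jingle", {
        "action": "content-info", "initiator": initiator, "sid": sid,
    })
    # The payload is an empty element in the info/audio namespace.
    ET.SubElement(jingle, f"{{{INFO_NS}}}{info}")
    return iq
```

The result can be serialized with ET.tostring(iq, encoding="unicode"). ElementTree may emit generated namespace prefixes rather than default namespace declarations; the resulting XML is equivalent.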

6.2 Examples

Example 8. Target Entity Sends Busy Message

<iq from='juliet@capulet.com/balcony'
    to='romeo@montague.net/orchard'
    id='busy1'
    type='set'>
  <jingle xmlns='http://jabber.org/protocol/jingle'
          action='content-info'
          initiator='romeo@montague.net/orchard'
          sid='a73sjjvkla37jfea'>
    <busy xmlns='http://jabber.org/protocol/jingle/info/audio'/>
  </jingle>
</iq>
    

Example 9. Target Entity Sends Hold Message

<iq from='juliet@capulet.com/balcony'
    to='romeo@montague.net/orchard'
    id='hold1'
    type='set'>
  <jingle xmlns='http://jabber.org/protocol/jingle'
          action='content-info'
          initiator='romeo@montague.net/orchard'
          sid='a73sjjvkla37jfea'>
    <hold xmlns='http://jabber.org/protocol/jingle/info/audio'/>
  </jingle>
</iq>
    

Example 10. Target Entity Sends Mute Message

<iq from='juliet@capulet.com/balcony'
    to='romeo@montague.net/orchard'
    id='mute1'
    type='set'>
  <jingle xmlns='http://jabber.org/protocol/jingle'
          action='content-info'
          initiator='romeo@montague.net/orchard'
          sid='a73sjjvkla37jfea'>
    <mute xmlns='http://jabber.org/protocol/jingle/info/audio'/>
  </jingle>
</iq>
    

Example 11. Target Entity Sends Queued Message

<iq from='juliet@capulet.com/balcony'
    to='romeo@montague.net/orchard'
    id='queued1'
    type='set'>
  <jingle xmlns='http://jabber.org/protocol/jingle'
          action='content-info'
          initiator='romeo@montague.net/orchard'
          sid='a73sjjvkla37jfea'>
    <queued xmlns='http://jabber.org/protocol/jingle/info/audio'/>
  </jingle>
</iq>
    

Example 12. Target Entity Sends Ringing Message

<iq from='juliet@capulet.com/balcony'
    to='romeo@montague.net/orchard'
    id='ringing1'
    type='set'>
  <jingle xmlns='http://jabber.org/protocol/jingle'
          action='content-info'
          initiator='romeo@montague.net/orchard'
          sid='a73sjjvkla37jfea'>
    <ringing xmlns='http://jabber.org/protocol/jingle/info/audio'/>
  </jingle>
</iq>
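
On the receiving side, each of these messages needs to be recognized and acknowledged with an IQ-result. The sketch below shows one possible way to do that in Python; the function name handle_info_iq is hypothetical, and error handling is reduced to a single ValueError because no error flows are defined for these messages.

```python
import xml.etree.ElementTree as ET

JINGLE_NS = "http://jabber.org/protocol/jingle"
INFO_NS = "http://jabber.org/protocol/jingle/info/audio"

def handle_info_iq(iq_xml):
    """Return (info_name, ack) for an incoming informational message.

    'info_name' is the local name of the payload element (or None if
    no element in the info/audio namespace is present); 'ack' is an
    empty IQ-result addressed back to the sender.
    """
    iq = ET.fromstring(iq_xml)
    jingle = iq.find(f"{{{JINGLE_NS}}}jingle")
    if (iq.get("type") != "set" or jingle is None
            or jingle.get("action") != "content-info"):
        raise ValueError("not a Jingle content-info IQ-set")

    info_name = None
    prefix = f"{{{INFO_NS}}}"
    for child in jingle:
        if child.tag.startswith(prefix):
            info_name = child.tag[len(prefix):]
            break

    # Swap 'from' and 'to' and echo the 'id'; skip absent attributes.
    attrs = {"from": iq.get("to"), "to": iq.get("from"),
             "id": iq.get("id"), "type": "result"}
    ack = ET.Element("iq", {k: v for k, v in attrs.items() if v is not None})
    return info_name, ack
```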
    

7. Implementation Notes

7.1 Codecs

Support for the Speex codec [6] is RECOMMENDED.

7.2 DTMF

Support for Dual Tone Multi-Frequency (DTMF) MUST use the protocol described in Jingle DTMF [7].

8. Security Considerations

The description of a format for audio sessions introduces no known security vulnerabilities.

9. IANA Considerations

This JEP requires no interaction with the Internet Assigned Numbers Authority (IANA) [8].

10. Jabber Registrar Considerations

10.1 Protocol Namespaces

The Jabber Registrar [9] shall include 'http://jabber.org/protocol/jingle/content/audio' and 'http://jabber.org/protocol/jingle/info/audio' in its registry of protocol namespaces.

10.2 Jingle Content Description Formats

The Jabber Registrar shall include the name "audio" in its registry of Jingle content description formats. The registration is as follows:

<content>
  <name>audio</name>
  <desc>Jingle sessions that support audio exchanges</desc>
  <doc>JEP-0167</doc>
</content>
    

11. XML Schemas

11.1 Content Description Format

<?xml version='1.0' encoding='UTF-8'?>

<xs:schema
    xmlns:xs='http://www.w3.org/2001/XMLSchema'
    targetNamespace='http://jabber.org/protocol/jingle/content/audio'
    xmlns='http://jabber.org/protocol/jingle/content/audio'
    elementFormDefault='qualified'>

  <xs:element name='description'>
    <xs:complexType>
      <xs:sequence>
        <xs:element ref='payload-type' minOccurs='0' maxOccurs='unbounded'/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:element name='payload-type'>
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base='empty'>
          <xs:attribute name='channels' type='xs:byte' use='optional'/>
          <xs:attribute name='id' type='xs:unsignedByte' use='required'/>
          <xs:attribute name='name' type='xs:string' use='optional'/>
          <xs:attribute name='clockrate' type='xs:short' use='optional'/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>

  <xs:simpleType name='empty'>
    <xs:restriction base='xs:string'>
      <xs:enumeration value=''/>
    </xs:restriction>
  </xs:simpleType>

</xs:schema>
    

11.2 Informational Messages

<?xml version='1.0' encoding='UTF-8'?>

<xs:schema
    xmlns:xs='http://www.w3.org/2001/XMLSchema'
    targetNamespace='http://jabber.org/protocol/jingle/info/audio'
    xmlns='http://jabber.org/protocol/jingle/info/audio'
    elementFormDefault='qualified'>

  <xs:element name='busy' type='empty'/>
  <xs:element name='hold' type='empty'/>
  <xs:element name='mute' type='empty'/>
  <xs:element name='queued' type='empty'/>
  <xs:element name='ringing' type='empty'/>

  <xs:simpleType name='empty'>
    <xs:restriction base='xs:string'>
      <xs:enumeration value=''/>
    </xs:restriction>
  </xs:simpleType>

</xs:schema>
    


Notes

1. JEP-0166: Jingle <http://www.jabber.org/jeps/jep-0166.html>.

2. RFC 3550: RTP: A Transport Protocol for Real-Time Applications <http://www.ietf.org/rfc/rfc3550.txt>.

3. RFC 2327: SDP: Session Description Protocol <http://www.ietf.org/rfc/rfc2327.txt>.

4. RFC 3551: RTP Profile for Audio and Video Conferences with Minimal Control <http://www.ietf.org/rfc/rfc3551.txt>.

5. JEP-0030: Service Discovery <http://www.jabber.org/jeps/jep-0030.html>.

6. See <http://www.speex.org/>.

7. JEP-0181: Jingle DTMF <http://www.jabber.org/jeps/jep-0181.html>.

8. The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique parameter values for Internet protocols, such as port numbers and URI schemes. For further information, see <http://www.iana.org/>.

9. The Jabber Registrar maintains a list of reserved Jabber protocol namespaces as well as registries of parameters used in the context of protocols approved by the Jabber Software Foundation. For further information, see <http://www.jabber.org/registrar/>.


Revision History

Version 0.4 (2006-07-12)

Specified when to play received audio (early media); specified that DTMF must use in-band signalling (JEP-0181).

(se/psa)

Version 0.3 (2006-03-20)

Defined info messages for hold and mute.

(psa)

Version 0.2 (2006-02-13)

Defined info message for busy; added info message examples; recommended use of Speex; updated schema and Jabber Registrar considerations.

(psa)

Version 0.1 (2005-12-15)

Initial JEP version.

(psa)

Version 0.0.3 (2005-12-05)

Described service discovery usage; defined initial informational messages.

(psa)

Version 0.0.2 (2005-10-27)

Added SDP mapping, security considerations, IANA considerations, Jabber Registrar considerations, and XML schema.

(psa)

Version 0.0.1 (2005-10-21)

First draft.

(psa/sl)


END