XEP-0167: Jingle RTP Sessions

This specification defines a Jingle application type for negotiating a session that uses the Real-time Transport Protocol (RTP) to exchange media such as voice or video. The application type includes a straightforward mapping to Session Description Protocol (SDP) for interworking with SIP media endpoints.


NOTICE: This document is currently within Last Call or under consideration by the XMPP Council for advancement to the next stage in the XSF standards process.


Document Information

Series: XEP
Number: 0167
Publisher: XMPP Standards Foundation
Status: Proposed
Type: Standards Track
Version: 0.21
Last Updated: 2008-06-09
Approving Body: XMPP Council
Dependencies: XMPP Core, XEP-0166
Supersedes: None
Superseded By: None
Short Name: NOT_YET_ASSIGNED
Wiki Page: <http://wiki.jabber.org/index.php/Jingle RTP Sessions (XEP-0167)>


Author Information

Scott Ludwig

Email: scottlu@google.com
JabberID: scottlu@google.com

Peter Saint-Andre

JabberID: stpeter@jabber.org
URI: https://stpeter.im/

Sean Egan

Email: seanegan@google.com
JabberID: seanegan@google.com

Robert McQueen

Email: robert.mcqueen@collabora.co.uk
JabberID: robert.mcqueen@collabora.co.uk


Legal Notices

Copyright

This XMPP Extension Protocol is copyright (c) 1999 - 2008 by the XMPP Standards Foundation (XSF).

Permissions

Permission is hereby granted, free of charge, to any person obtaining a copy of this specification (the "Specification"), to make use of the Specification without restriction, including without limitation the rights to implement the Specification in a software program, deploy the Specification in a network service, and copy, modify, merge, publish, translate, distribute, sublicense, or sell copies of the Specification, and to permit persons to whom the Specification is furnished to do so, subject to the condition that the foregoing copyright notice and this permission notice shall be included in all copies or substantial portions of the Specification. Unless separate permission is granted, modified works that are redistributed shall not contain misleading information regarding the authors, title, number, or publisher of the Specification, and shall not claim endorsement of the modified works by the authors, any organization or project to which the authors belong, or the XMPP Standards Foundation.

Disclaimer of Warranty

NOTE WELL: This Specification is provided on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. In no event shall the XMPP Standards Foundation or the authors of this Specification be liable for any claim, damages, or other liability, whether in an action of contract, tort, or otherwise, arising from, out of, or in connection with the Specification or the implementation, deployment, or other use of the Specification.

Limitation of Liability

In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall the XMPP Standards Foundation or any author of this Specification be liable for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising out of the use or inability to use the Specification (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if the XMPP Standards Foundation or such author has been advised of the possibility of such damages.

IPR Conformance

This XMPP Extension Protocol has been contributed in full conformance with the XSF's Intellectual Property Rights Policy (a copy of which may be found at <http://www.xmpp.org/extensions/ipr-policy.shtml> or obtained by writing to XSF, P.O. Box 1641, Denver, CO 80201 USA).

Discussion Venue

The preferred venue for discussion of this document is the Standards discussion list: <http://mail.jabber.org/mailman/listinfo/standards>.

Errata may be sent to <editor@xmpp.org>.

Relation to XMPP

The Extensible Messaging and Presence Protocol (XMPP) is defined in the XMPP Core (RFC 3920) and XMPP IM (RFC 3921) specifications contributed by the XMPP Standards Foundation to the Internet Standards Process, which is managed by the Internet Engineering Task Force in accordance with RFC 2026. Any protocol defined in this document has been developed outside the Internet Standards Process and is to be understood as an extension to XMPP rather than as an evolution, development, or modification of XMPP itself.

Conformance Terms

The following keywords as used in this document are to be interpreted as described in RFC 2119: "MUST", "SHALL", "REQUIRED"; "MUST NOT", "SHALL NOT"; "SHOULD", "RECOMMENDED"; "SHOULD NOT", "NOT RECOMMENDED"; "MAY", "OPTIONAL".


Table of Contents


1. Introduction
2. Requirements
3. Jingle Conformance
4. Application Format
5. Negotiating a Jingle RTP Session
6. Mapping to Session Description Protocol
7. Informational Messages
    7.1. Format
    7.2. Examples
8. Determining Support
9. Scenarios
    9.1. Responder is Busy
    9.2. Jingle Audio via RTP/AVP, Negotiated with ICE-UDP
    9.3. Jingle Audio and Video via RTP/AVP, Negotiated with ICE-UDP
    9.4. Secure Jingle Audio via UDP/TLS/RTP/SAVP, Negotiated with ICE-UDP
10. Implementation Notes
    10.1. Audio Sessions
       10.1.1. Codecs
         10.1.1.1. Speex
         10.1.1.2. G.711
       10.1.2. DTMF
       10.1.3. When to Listen for Audio
    10.2. Video Sessions
       10.2.1. Codecs
11. Security Considerations
12. IANA Considerations
13. XMPP Registrar Considerations
    13.1. Protocol Namespaces
    13.2. Service Discovery Features
    13.3. Jingle Application Formats
14. XML Schemas
    14.1. Application Format
    14.2. Informational Messages
15. Acknowledgements
Notes
Revision History


1. Introduction

Jingle [1] can be used to initiate and negotiate a wide range of peer-to-peer sessions. One session type of interest is media such as voice or video. This document specifies an application format for negotiating Jingle media sessions, where the media is exchanged over the Real-time Transport Protocol (RTP; see RFC 3550 [2]).

2. Requirements

The Jingle application format defined herein is designed to meet the following requirements:

  1. Enable negotiation of parameters necessary for media sessions using the Real-time Transport Protocol (RTP).
  2. Map these parameters to Session Description Protocol (SDP; see RFC 4566 [3]) to enable interoperability.
  3. Define informational messages related to typical RTP uses such as audio chat and video chat (e.g., ringing, on hold, on mute).

3. Jingle Conformance

In accordance with Section 8 of XEP-0166, this document specifies the following information related to the Jingle RTP application type:

  1. The application format negotiation process is defined in the Negotiating a Jingle RTP Session section of this document.

  2. The semantics of the <description/> element are defined in the Application Format section of this document.

  3. A mapping of Jingle semantics to the Session Description Protocol is provided in the Mapping to Session Description Protocol section of this document.

  4. A Jingle RTP session SHOULD use a lossy transport method such as Jingle Raw UDP Transport Method [4] or the "ice-udp" method specified in Jingle ICE-UDP Transport Method [5], but MAY use a reliable transport such as "ice-tcp" if a low-bandwidth codec is employed.

  5. Content is to be sent and received as follows:

4. Application Format

A Jingle RTP session is described by a content type that contains one application format and one transport method. The application format consists of one or more encodings contained within a wrapper <description/> element qualified by the 'urn:xmpp:tmp:jingle:apps:rtp' namespace (see Protocol Namespaces regarding issuance of one or more permanent namespaces). In the language of RFC 4566 each encoding is a payload-type; therefore, each <payload-type/> element specifies an encoding that can be used for the RTP stream, as illustrated in the following example.

Example 1. RTP description format

    <description xmlns='urn:xmpp:tmp:jingle:apps:rtp' media='audio'>
      <payload-type id='96' name='speex' clockrate='16000'/>
      <payload-type id='97' name='speex' clockrate='8000'/>
      <payload-type id='18' name='G729'/>
      <payload-type id='103' name='L16' clockrate='16000' channels='2'/>
      <payload-type id='98' name='x-ISAC' clockrate='8000'/>
      <payload-type id='102' name='iLBC'/>
      <payload-type id='4' name='G723'/>
      <payload-type id='0' name='PCMU' clockrate='8000'/>
      <payload-type id='8' name='PCMA'/>
      <payload-type id='13' name='CN'/>
    </description>
  

The <description/> element is intended to be a child of a <content/> element as specified in XEP-0166.

The <description/> element MUST possess a 'media' attribute that specifies the media type, such as "audio" or "video".

The <description/> element SHOULD possess a 'profile' attribute that specifies the profile of RTP in use as would be encapsulated in SDP (e.g., "RTP/AVP" or "UDP/TLS/RTP/SAVP"). If not included, the default value of "RTP/AVP" MUST be assumed.

The encodings SHOULD be provided in order of preference by placing the most-preferred <payload-type/> element as the first child of the <description/> element (etc.).

The allowable attributes of the <payload-type/> element are as follows:

Table 1: Payload-Type Attributes

Attribute | Description | Datatype | Inclusion
----------|-------------|----------|----------
channels | The number of channels; if omitted, it MUST be assumed to contain one channel | positiveInteger (defaults to 1) | RECOMMENDED
clockrate | The sampling frequency in Hertz | positiveInteger | RECOMMENDED
id | The payload identifier | positiveInteger | REQUIRED
maxptime | Maximum packet time as specified in RFC 4566 | positiveInteger | OPTIONAL
name | The appropriate subtype of the MIME type | string | RECOMMENDED for static payload types, REQUIRED for dynamic payload types
ptime | Packet time as specified in RFC 4566 | positiveInteger | OPTIONAL

In Jingle RTP, the encodings are used in the context of RTP. The most common encodings for the Audio/Video Profile (AVP) of RTP are listed in RFC 3551 [6] (these "static" types are reserved from payload ID 0 through payload ID 95), although other encodings are allowed (these "dynamic" types use payload IDs 96 to 127) in accordance with the dynamic assignment rules described in Section 3 of RFC 3551. The payload IDs are represented in the 'id' attribute.

Each <payload-type/> element MAY contain one or more child elements that specify particular parameters related to the payload. For example, as described in RTP Payload Format for the Speex Codec [7], the "cng", "mode", and "vbr" parameters may be specified in relation to usage of the Speex [8] codec. Where such parameters are encoded via the "fmtp" SDP attribute, they shall be represented in Jingle via the following format:

<parameter name='foo' value='bar'/>
  

Note: The parameter names are effectively guaranteed to be unique, since the Internet Assigned Numbers Authority (IANA) [9] maintains a registry of SDP parameters (see <http://www.iana.org/assignments/sdp-parameters>).
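
The attribute rules above can be illustrated with a short, non-normative sketch in Python (standard library only). The function name `read_description` and the dictionary layout it returns are assumptions made for illustration and are not defined by this specification; the sketch applies the default profile of "RTP/AVP", the default of one channel, and the requirement that dynamic payload types (IDs 96 to 127) carry a 'name'.

```python
import xml.etree.ElementTree as ET

RTP_NS = "urn:xmpp:tmp:jingle:apps:rtp"  # temporary namespace used in this document


def read_description(xml_text):
    """Return (media, profile, payload_types) for one <description/> element."""
    desc = ET.fromstring(xml_text)
    if desc.tag != f"{{{RTP_NS}}}description":
        raise ValueError("not a Jingle RTP <description/> element")
    media = desc.get("media")
    if media is None:
        raise ValueError("the 'media' attribute is required")
    profile = desc.get("profile", "RTP/AVP")  # default when 'profile' is absent
    payload_types = []
    for pt in desc.findall(f"{{{RTP_NS}}}payload-type"):
        if pt.get("id") is None:
            raise ValueError("the 'id' attribute is required")
        pt_id = int(pt.get("id"))
        name = pt.get("name")
        if 96 <= pt_id <= 127 and name is None:
            raise ValueError(f"dynamic payload type {pt_id} requires a 'name'")
        clockrate = pt.get("clockrate")
        payload_types.append({
            "id": pt_id,
            "name": name,
            "clockrate": int(clockrate) if clockrate is not None else None,
            "channels": int(pt.get("channels", "1")),  # one channel if omitted
            # <parameter name='...' value='...'/> children, keyed by name
            "parameters": {p.get("name"): p.get("value")
                           for p in pt.findall(f"{{{RTP_NS}}}parameter")},
        })
    return media, profile, payload_types
```

The payload types are returned in document order, so the first entry is the sender's most-preferred encoding.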

5. Negotiating a Jingle RTP Session

In general, the process for negotiating a Jingle RTP session is as follows:

Initiator                    Responder
  |                             |
  |   session-initiate          |
  |---------------------------->|
  |   ack                       |
  |<----------------------------|
  |   [transport negotiation]   |
  |<--------------------------->|
  |   session-accept            |
  |<----------------------------|
  |   ack                       |
  |---------------------------->|
  |   AUDIO (RTP)               |
  |<===========================>|
  |                             |
  

When the initiator sends a session-initiate stanza to the responder, the <description/> element includes all of the payload types that the initiator can send and/or receive for Jingle RTP, each one encapsulated in a separate <payload-type/> element (the rules specified in RFC 3264 [10] SHOULD be followed regarding inclusion of payload types).

Example 2. Initiation

<iq from='romeo@montague.net/orchard'
    id='jingle1'
    to='juliet@capulet.com/balcony'
    type='set'>
  <jingle xmlns='urn:xmpp:tmp:jingle'
          action='session-initiate'
          initiator='romeo@montague.net/orchard'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='voice'>
      <description xmlns='urn:xmpp:tmp:jingle:apps:rtp' media='audio' profile='RTP/AVP'>
        <payload-type id='96' name='speex' clockrate='16000'/>
        <payload-type id='97' name='speex' clockrate='8000'/>
        <payload-type id='18' name='G729'/>
        <payload-type id='0' name='PCMU' />
        <payload-type id='103' name='L16' clockrate='16000' channels='2'/>
        <payload-type id='98' name='x-ISAC' clockrate='8000'/>
      </description>
      <transport xmlns='urn:xmpp:tmp:jingle:transports:ice-udp'/>
    </content>
  </jingle>
</iq>
  

Upon receiving the session-initiate stanza, the responder determines whether it can proceed with the negotiation. The general Jingle error cases are specified in XEP-0166 and illustrated in the Scenarios section of this document.

If there is no error, the responder acknowledges the session initiation request.

Example 3. Responder acknowledges session-initiate

  <iq from='juliet@capulet.com/balcony'
      id='jingle1'
      to='romeo@montague.net/orchard'
      type='result'/>
  

After successful transport negotiation (not shown here), the responder accepts the session by sending a session-accept action to the initiator. The session-accept SHOULD include a subset of the payload types sent by the initiator, i.e., a list of the offered payload types that the responder can send and/or receive. The list that the responder sends SHOULD retain the ID numbers specified by the initiator. The order of the <payload-type/> elements indicates the responder's preferences, with the most-preferred types first.

In the following example, we imagine that the responder supports Speex at a clockrate of 8000 but not 16000, supports G729, and supports PCMA but not PCMU. Because PCMA was not offered and PCMU is not supported, the responder returns only two payload types.

Example 4. Responder definitively accepts the session

<iq from='juliet@capulet.com/balcony'
    id='accept1'
    to='romeo@montague.net/orchard'
    type='set'>
  <jingle xmlns='urn:xmpp:tmp:jingle'
          action='session-accept'
          initiator='romeo@montague.net/orchard'
          responder='juliet@capulet.com/balcony'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='voice'>
      <description xmlns='urn:xmpp:tmp:jingle:apps:rtp' media='audio' profile='RTP/AVP'>
        <payload-type id='97' name='speex' clockrate='8000'/>
        <payload-type id='18' name='G729'/>
      </description>
      <transport xmlns='urn:xmpp:tmp:jingle:transports:ice-udp'>
        <candidate component='1'
                   foundation='1'
                   generation='0'
                   ip='192.0.2.3'
                   network='1'
                   port='45664'
                   priority='1678246398'
                   protocol='udp'
                   pwd='asd88fgpdd777uzjYhagZg'
                   type='srflx'
                   ufrag='8hhy'/>
      </transport>
    </content>
  </jingle>
</iq>
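
The way a responder might derive the payload types for its session-accept can be sketched as follows (non-normative Python). The function name `build_answer`, the tuple layout, and the idea of a local list of supported codecs are assumptions for illustration only. The sketch keeps the IDs chosen by the initiator and orders the result by the responder's own preference; comparing encoding names case-insensitively is also an assumption of this sketch.

```python
def build_answer(offered, supported):
    """Select the offered payload types that the responder supports.

    offered   -- (id, name, clockrate) tuples taken from the session-initiate;
                 clockrate is None where the attribute was omitted
    supported -- (name, clockrate) pairs in the responder's preference order
                 (a hypothetical local configuration); None matches any rate
    """
    answer = []
    for name, clockrate in supported:
        for pt_id, pt_name, pt_rate in offered:
            same_name = pt_name.lower() == name.lower()
            same_rate = clockrate is None or pt_rate is None or pt_rate == clockrate
            if same_name and same_rate:
                # Retain the ID number specified by the initiator.
                answer.append((pt_id, pt_name, pt_rate))
    return answer


# Payload types offered in Example 2.
offered = [
    (96, "speex", 16000),
    (97, "speex", 8000),
    (18, "G729", None),
    (0, "PCMU", None),
    (103, "L16", 16000),
    (98, "x-ISAC", 8000),
]
# A responder that prefers Speex at 8000 Hz, then G729.
answer = build_answer(offered, [("speex", 8000), ("G729", None)])
```

With these inputs the result contains payload types 97 and 18 in that order, which matches the two <payload-type/> elements shown in Example 4.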
  

And the initiator acknowledges session acceptance:

Example 5. Initiator acknowledges session acceptance

  <iq from='romeo@montague.net/orchard'
      id='accept1'
      to='juliet@capulet.com/balcony'
      type='result'/>
  

The initiator and responder would then exchange media using any of the codecs that meet the following criteria:

6. Mapping to Session Description Protocol

The SDP media type for Jingle RTP is "audio" (see Section 8.2.1 of RFC 4566) for audio media, "video" (see Section 8.2.1 of RFC 4566) for video media, etc.

If the payload type is static (payload-type IDs 0 through 95 inclusive), it MUST be mapped to a media field defined in RFC 4566. The generic format for the media field is as follows:

m=<media> <port> <transport> <fmt list>
  

In the context of Jingle RTP sessions, the <media> is "audio" or "video" or some other media type, the <port> is the preferred port for such communications (which may be determined dynamically), the <transport> is whatever profile is negotiated via the 'profile' attribute of the <description/> element in the Jingle negotiation (e.g., "RTP/AVP"), and the <fmt list> is the payload-type ID.

For example, consider the following static payload-type:

Example 6. Jingle format for static payload-type

<payload-type id="13" name="CN"/>
  

That Jingle-formatted information would be mapped to SDP as follows:

Example 7. SDP mapping of static payload-type

m=audio 9999 RTP/AVP 13
  

If the payload type is dynamic (payload-type IDs 96 through 127 inclusive), it SHOULD be mapped to an SDP media field plus an SDP attribute field named "rtpmap".

For example, consider a payload of Speex-encoded audio sampled at 16 kHz associated with dynamic payload-type 96:

Example 8. Jingle format for dynamic payload-type

<payload-type id='96' name='speex' clockrate='16000'/>
  

That Jingle-formatted information would be mapped to SDP as follows:

Example 9. SDP mapping of dynamic payload-type

m=audio 9999 RTP/AVP 96
a=rtpmap:96 speex/16000
  

As noted, if additional parameters are to be specified, they shall be represented as attributes of the <parameter/> child of the <payload-type/> element, as in the following example.

Example 10. Dynamic audio payload-type with parameters

<payload-type id='96' name='speex' clockrate='16000' ptime='40'>
  <parameter name='vbr' value='on'/>
  <parameter name='cng' value='on'/>
</payload-type>
  

That Jingle-formatted information would be mapped to SDP as follows:

Example 11. SDP mapping of dynamic audio payload-type with parameters

m=audio 9999 RTP/AVP 96
a=rtpmap:96 speex/16000
a=ptime:40
a=fmtp:96 vbr=on;cng=on
  

The formatting is similar for video parameters, as shown in the following example.

Example 12. Dynamic video payload-type with parameters

<payload-type id='96' name='theora' clockrate='90000'>
  <parameter name='height' value='720'/>
  <parameter name='width' value='1280'/>
  <parameter name='delivery-method' value='inline'/>
  <parameter name='configuration' value='somebase16string'/>
  <parameter name='sampling' value='YCbCr-4:2:2'/>
</payload-type>
  

That Jingle-formatted information would be mapped to SDP as follows:

Example 13. SDP mapping of dynamic video payload-type with parameters

m=video 49170 RTP/AVP 96
a=rtpmap:96 theora/90000
a=fmtp:96 sampling=YCbCr-4:2:2; width=1280; height=720;
delivery-method=inline; configuration=somebase16string;
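
The mapping rules in this section can be sketched in a few lines of non-normative Python. The function name `payload_type_to_sdp` and the dictionary keys are assumptions for illustration; the sketch handles one payload type at a time, as the examples above do, and emits an "rtpmap" line only for dynamic payload types (IDs 96 to 127).

```python
def payload_type_to_sdp(media, port, profile, pt):
    """Map one payload-type (a dict of its attributes) to a list of SDP lines."""
    pt_id = pt["id"]
    lines = [f"m={media} {port} {profile} {pt_id}"]
    if 96 <= pt_id <= 127:
        # Dynamic payload type: media field plus an "rtpmap" attribute field.
        rtpmap = f"a=rtpmap:{pt_id} {pt['name']}/{pt['clockrate']}"
        channels = pt.get("channels", 1)
        if channels != 1:
            rtpmap += f"/{channels}"
        lines.append(rtpmap)
    if pt.get("ptime") is not None:
        lines.append(f"a=ptime:{pt['ptime']}")
    parameters = pt.get("parameters", {})
    if parameters:
        # <parameter/> children become a single "fmtp" attribute field.
        joined = ";".join(f"{name}={value}" for name, value in parameters.items())
        lines.append(f"a=fmtp:{pt_id} {joined}")
    return lines


# The payload-type from Example 10.
speex = {
    "id": 96,
    "name": "speex",
    "clockrate": 16000,
    "ptime": 40,
    "parameters": {"vbr": "on", "cng": "on"},
}
sdp = payload_type_to_sdp("audio", 9999, "RTP/AVP", speex)
```

For the Example 10 input, the returned lines are the four SDP lines shown in Example 11; for the static payload-type of Example 6, only the media field is produced.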
  

7. Informational Messages

7.1 Format

Informational messages may be sent by either party within the context of Jingle to communicate the status of a Jingle RTP session, device, or principal. The informational message MUST be an IQ-set containing a <jingle/> element of type "session-info", where the informational message is a payload element qualified by the 'urn:xmpp:tmp:jingle:apps:rtp:info' namespace; the following payload elements are defined: [11]

Table 2: Information Payload Elements

Element | Meaning
--------|--------
<active/> | The principal or device is again actively participating in the session after having been on hold or on mute.
<hold/> | The principal is temporarily pausing the chat (i.e., putting the other party on hold).
<mute/> | The principal is temporarily stopping media output but continues to accept media input. The <mute/> element MAY possess a 'name' attribute whose value specifies a particular content-type to be muted (e.g., muting the video aspect but not the voice aspect of a voice+video chat). If no 'name' attribute is included, the recipient MUST assume that all content-types are to be muted.
<ringing/> | The device is ringing but the principal has not yet interacted with it to answer (this maps to the SIP 180 response code).

Note: Because the informational message is sent in an IQ-set, the receiving party MUST return either an IQ-result or an IQ-error (normally only an IQ-result to acknowledge receipt; no error flows are defined or envisioned at this time).
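
The format described above can be assembled programmatically, as in the following non-normative Python sketch. The function name `session_info` and its argument names are assumptions for illustration; the <iq/> element is left unqualified for brevity, whereas on a real XMPP stream it would be qualified by the stream's default namespace.

```python
import xml.etree.ElementTree as ET

JINGLE_NS = "urn:xmpp:tmp:jingle"
INFO_NS = "urn:xmpp:tmp:jingle:apps:rtp:info"
INFO_ELEMENTS = {"active", "hold", "mute", "ringing"}  # payloads from Table 2


def session_info(sender, recipient, stanza_id, initiator, sid, info, name=None):
    """Build an IQ-set carrying one informational payload element."""
    if info not in INFO_ELEMENTS:
        raise ValueError(f"unknown informational element: {info}")
    if name is not None and info != "mute":
        # Table 2 describes a 'name' attribute for <mute/> only.
        raise ValueError("only <mute/> takes a 'name' attribute")
    iq = ET.Element("iq", {"from": sender, "id": stanza_id,
                           "to": recipient, "type": "set"})
    jingle = ET.SubElement(iq, f"{{{JINGLE_NS}}}jingle", {
        "action": "session-info",
        "initiator": initiator,
        "sid": sid,
    })
    payload = ET.SubElement(jingle, f"{{{INFO_NS}}}{info}")
    if name is not None:
        payload.set("name", name)
    return iq


mute_iq = session_info("juliet@capulet.com/balcony", "romeo@montague.net/orchard",
                       "mute1", "romeo@montague.net/orchard",
                       "a73sjjvkla37jfea", "mute", name="webcam")
```

The resulting element tree corresponds to the mute message shown in the examples of the next section.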

7.2 Examples

Example 14. Responder sends active message

<iq from='juliet@capulet.com/balcony'
    id='active1'
    to='romeo@montague.net/orchard'
    type='set'>
  <jingle xmlns='urn:xmpp:tmp:jingle'
          action='session-info'
          initiator='romeo@montague.net/orchard'
          sid='a73sjjvkla37jfea'>
    <active xmlns='urn:xmpp:tmp:jingle:apps:rtp:info'/>
  </jingle>
</iq>
    

Example 15. Responder sends hold message

<iq from='juliet@capulet.com/balcony'
    id='hold1'
    to='romeo@montague.net/orchard'
    type='set'>
  <jingle xmlns='urn:xmpp:tmp:jingle'
          action='session-info'
          initiator='romeo@montague.net/orchard'
          sid='a73sjjvkla37jfea'>
    <hold xmlns='urn:xmpp:tmp:jingle:apps:rtp:info'/>
  </jingle>
</iq>
    

Example 16. Responder sends mute message

<iq from='juliet@capulet.com/balcony'
    id='mute1'
    to='romeo@montague.net/orchard'
    type='set'>
  <jingle xmlns='urn:xmpp:tmp:jingle'
          action='session-info'
          initiator='romeo@montague.net/orchard'
          sid='a73sjjvkla37jfea'>
    <mute xmlns='urn:xmpp:tmp:jingle:apps:rtp:info' name='webcam'/>
  </jingle>
</iq>
    

Example 17. Responder sends ringing message

<iq from='juliet@capulet.com/balcony'
    id='ringing1'
    to='romeo@montague.net/orchard'
    type='set'>
  <jingle xmlns='urn:xmpp:tmp:jingle'
          action='session-info'
          initiator='romeo@montague.net/orchard'
          sid='a73sjjvkla37jfea'>
    <ringing xmlns='urn:xmpp:tmp:jingle:apps:rtp:info'/>
  </jingle>
</iq>
    

8. Determining Support

If an entity supports Jingle RTP sessions, it MUST advertise that fact by returning a feature of "urn:xmpp:tmp:jingle:apps:rtp" (see Protocol Namespaces regarding issuance of one or more permanent namespaces) in response to Service Discovery [12] information requests.

Example 18. Service discovery information request

<iq from='romeo@montague.net/orchard'
    id='disco1'
    to='juliet@capulet.com/balcony'
    type='get'>
  <query xmlns='http://jabber.org/protocol/disco#info'/>
</iq>
  

Example 19. Service discovery information response

<iq from='juliet@capulet.com/balcony'
    id='disco1'
    to='romeo@montague.net/orchard'
    type='result'>
  <query xmlns='http://jabber.org/protocol/disco#info'>
    ...
    <feature var='urn:xmpp:tmp:jingle'/>
    <feature var='urn:xmpp:tmp:jingle:apps:rtp'/>
    ...
  </query>
</iq>
  

Naturally, support MAY also be determined via the dynamic, presence-based profile of Service Discovery defined in Entity Capabilities [13].
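
A client-side check of a service discovery information response might look like the following non-normative Python sketch. The function name `supports_jingle_rtp` is an assumption for illustration; the sketch simply looks for the feature named above among the <feature/> children of the <query/> element.

```python
import xml.etree.ElementTree as ET

DISCO_INFO_NS = "http://jabber.org/protocol/disco#info"
RTP_FEATURE = "urn:xmpp:tmp:jingle:apps:rtp"


def supports_jingle_rtp(iq_xml):
    """Return True if a disco#info result advertises the Jingle RTP feature."""
    iq = ET.fromstring(iq_xml)
    if iq.get("type") != "result":
        return False
    query = iq.find(f"{{{DISCO_INFO_NS}}}query")
    if query is None:
        return False
    features = {f.get("var") for f in query.findall(f"{{{DISCO_INFO_NS}}}feature")}
    return RTP_FEATURE in features
```

Applied to the response in Example 19, the function returns True because the 'urn:xmpp:tmp:jingle:apps:rtp' feature is present.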

9. Scenarios

The following sections show a number of Jingle RTP scenarios, in relative order of complexity.

9.1 Responder is Busy

In this scenario, Romeo initiates a voice chat with Juliet but she is otherwise engaged.

The session flow is as follows:

Romeo                         Juliet
  |                             |
  |   session-initiate          |
  |---------------------------->|
  |   ack                       |
  |<----------------------------|
  |   session-info (ringing)    |
  |<----------------------------|
  |   ack                       |
  |---------------------------->|
  |   terminate                 |
  |   (reason = busy)           |
  |<----------------------------|
  |   ack                       |
  |---------------------------->|
  |                             |
    

The protocol flow is as follows.

Example 20. Initiator sends session-initiate

<iq from='romeo@montague.lit/orchard'
    id='jingle1'
    to='juliet@capulet.lit/balcony'
    type='set'>
  <jingle xmlns='urn:xmpp:tmp:jingle'
          action='session-initiate'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='voice'>
      <description xmlns='urn:xmpp:tmp:jingle:apps:rtp' media='audio' profile='RTP/AVP'>
        <payload-type id='96' name='speex' clockrate='16000'/>
        <payload-type id='97' name='speex' clockrate='8000'/>
        <payload-type id='18' name='G729'/>
        <payload-type id='103' name='L16' clockrate='16000' channels='2'/>
        <payload-type id='98' name='x-ISAC' clockrate='8000'/>
      </description>
      <transport xmlns='urn:xmpp:tmp:jingle:transports:ice-udp'/>
    </content>
  </jingle>
</iq>
    

Example 21. Responder sends provisional acceptance

<iq from='juliet@capulet.lit/balcony'
    id='jingle1'
    to='romeo@montague.lit/orchard'
    type='result'/>
    

Example 22. Responder sends ringing message

<iq from='juliet@capulet.lit/balcony'
    id='ringing1'
    to='romeo@montague.lit/orchard'
    type='set'>
  <jingle xmlns='urn:xmpp:tmp:jingle'
          action='session-info'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <ringing xmlns='urn:xmpp:tmp:jingle:apps:rtp:info'/>
  </jingle>
</iq>
    

Example 23. Initiator acknowledges ringing message

<iq from='romeo@montague.lit/orchard'
    id='ringing1'
    to='juliet@capulet.lit/balcony'
    type='result'/>
    

Example 24. Responder terminates the session

<iq from='juliet@capulet.lit/balcony'
    id='term1'
    to='romeo@montague.lit/orchard'
    type='set'>
  <jingle xmlns='urn:xmpp:tmp:jingle'
          action='session-terminate'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <reason>
      <condition><busy/></condition>
      <text>No time to chat right now!</text>
    </reason>
  </jingle>
</iq>
    

The other party then acknowledges termination of the session:

Example 25. Initiator acknowledges termination

<iq from='romeo@montague.lit/orchard'
    id='term1'
    to='juliet@capulet.lit/balcony'
    type='result'/>
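
On the receiving side, the reason for termination can be read from the session-terminate action as in the following non-normative Python sketch. The function name `termination_reason` and its return convention are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

JINGLE_NS = "urn:xmpp:tmp:jingle"


def termination_reason(iq_xml):
    """Return (condition, text) from a session-terminate, or None otherwise."""
    iq = ET.fromstring(iq_xml)
    jingle = iq.find(f"{{{JINGLE_NS}}}jingle")
    if jingle is None or jingle.get("action") != "session-terminate":
        return None
    reason = jingle.find(f"{{{JINGLE_NS}}}reason")
    if reason is None:
        return (None, None)
    condition = reason.find(f"{{{JINGLE_NS}}}condition")
    name = None
    if condition is not None and len(condition) > 0:
        # The condition is the first child element, e.g. <busy/>;
        # strip the "{namespace}" prefix that ElementTree adds to the tag.
        name = condition[0].tag.split("}", 1)[-1]
    text = reason.findtext(f"{{{JINGLE_NS}}}text")
    return (name, text)
```

For the stanza in Example 24, the function returns the condition "busy" together with the human-readable text.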
    

9.2 Jingle Audio via RTP/AVP, Negotiated with ICE-UDP

In this scenario, Romeo initiates a voice chat with Juliet using a transport method of ICE-UDP. The parties also exchange informational messages.

The session flow is as follows:

Romeo                         Juliet
  |                             |
  |   session-initiate          |
  |---------------------------->|
  |   ack                       |
  |<----------------------------|
  |   session-info (ringing)    |
  |<----------------------------|
  |   ack                       |
  |---------------------------->|
  |   transport-info (X times)  |
  |   (with acks)               |
  |<--------------------------->|
  |   STUN connectivity checks  |
  |<--------------------------->|
  |   session-accept            |
  |<----------------------------|
  |   ack                       |
  |---------------------------->|
  |   AUDIO (RTP)               |
  |<===========================>|
  |   session-terminate         |
  |<----------------------------|
  |   ack                       |
  |---------------------------->|
  |                             |
    

The protocol flow is as follows.

Example 26. Initiator sends session-initiate

<iq from='romeo@montague.lit/orchard'
    id='jingle1'
    to='juliet@capulet.lit/balcony'
    type='set'>
  <jingle xmlns='urn:xmpp:tmp:jingle'
          action='session-initiate'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='voice'>
      <description xmlns='urn:xmpp:tmp:jingle:apps:rtp' media='audio' profile='RTP/AVP'>
        <payload-type id='96' name='speex' clockrate='16000'/>
        <payload-type id='97' name='speex' clockrate='8000'/>
        <payload-type id='18' name='G729'/>
        <payload-type id='103' name='L16' clockrate='16000' channels='2'/>
        <payload-type id='98' name='x-ISAC' clockrate='8000'/>
      </description>
      <transport xmlns='urn:xmpp:tmp:jingle:transports:ice-udp'/>
    </content>
  </jingle>
</iq>
    

Example 27. Responder sends provisional acceptance

<iq from='juliet@capulet.lit/balcony'
    id='jingle1'
    to='romeo@montague.lit/orchard'
    type='result'/>
    

Example 28. Responder sends ringing message

<iq from='juliet@capulet.lit/balcony'
    id='ringing1'
    to='romeo@montague.lit/orchard'
    type='set'>
  <jingle xmlns='urn:xmpp:tmp:jingle'
          action='session-info'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <ringing xmlns='urn:xmpp:tmp:jingle:apps:rtp:info'/>
  </jingle>
</iq>
    

Example 29. Initiator acknowledges ringing message

<iq from='romeo@montague.lit/orchard'
    id='ringing1'
    to='juliet@capulet.lit/balcony'
    type='result'/>
    

Because the parties have chosen the Jingle ICE-UDP Transport Method, the initiator and responder exchange an open-ended number of possible candidate transports, perform connectivity checks, and agree upon a candidate transport as explained in XEP-0176. Once ICE negotiation is completed, the responder sends a session-accept action to the initiator.

Example 30. Responder sends session-accept

<iq from='juliet@capulet.lit/balcony'
    id='accept1'
    to='romeo@montague.lit/orchard'
    type='set'>
  <jingle xmlns='urn:xmpp:tmp:jingle'
          action='session-accept'
          initiator='romeo@montague.lit/orchard'
          responder='juliet@capulet.lit/balcony'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='voice'>
      <description xmlns='urn:xmpp:tmp:jingle:apps:rtp' media='audio' profile='RTP/AVP'>
        <payload-type id='97' name='speex' clockrate='8000'/>
        <payload-type id='18' name='G729'/>
      </description>
      <transport xmlns='urn:xmpp:tmp:jingle:transports:ice-udp'>
        <candidate component='1'
                   foundation='1'
                   generation='0'
                   ip='192.0.2.3'
                   network='1'
                   port='45664'
                   priority='1694498815'
                   protocol='udp'
                   pwd='asd88fgpdd777uzjYhagZg'
                   rel-addr='10.0.1.1'
                   rel-port='8998'
                   rem-addr='192.0.2.1'
                   rem-port='3478'
                   type='srflx'
                   ufrag='8hhy'/>
      </transport>
    </content>
  </jingle>
</iq>
    

If the payload types and transport candidate can be successfully used by both parties, then the initiator acknowledges the session-accept action.

Example 31. Initiator acknowledges session-accept

<iq from='romeo@montague.lit/orchard'
    id='accept1'
    to='juliet@capulet.lit/balcony'
    type='result'/>
    

The parties now begin to exchange media. In this case they would exchange audio using the Speex codec at a clockrate of 8000 since that is the highest-priority codec for the responder (as determined by the XML order of the <payload-type/> children).
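
The selection rule described above (the first <payload-type/> child in XML order is the responder's highest-priority codec) can be sketched in non-normative Python as follows. The function name `preferred_payload_type` and the prefix map `NS` are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Prefixes are local to this sketch; only the namespace names matter.
NS = {"j": "urn:xmpp:tmp:jingle", "rtp": "urn:xmpp:tmp:jingle:apps:rtp"}


def preferred_payload_type(session_accept_xml):
    """Return (id, name, clockrate) of the first payload type, or None."""
    iq = ET.fromstring(session_accept_xml)
    # find() returns the first match in document order.
    first = iq.find("j:jingle/j:content/rtp:description/rtp:payload-type", NS)
    if first is None:
        return None
    clockrate = first.get("clockrate")
    return (
        int(first.get("id")),
        first.get("name"),
        int(clockrate) if clockrate is not None else None,
    )
```

Given the session-accept in Example 30, the function returns payload type 97 (Speex at a clockrate of 8000).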

The parties may continue the session as long as desired.

Eventually, one of the parties terminates the session.

Example 32. Responder terminates the session

<iq from='juliet@capulet.lit/balcony'
    id='term1'
    to='romeo@montague.lit/orchard'
    type='set'>
  <jingle xmlns='urn:xmpp:tmp:jingle'
          action='session-terminate'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <reason>
      <condition><no-error/></condition>
      <text>Sorry, gotta go!</text>
    </reason>
  </jingle>
</iq>
    

The other party then acknowledges termination of the session:

Example 33. Initiator acknowledges termination

<iq from='romeo@montague.lit/orchard'
    id='term1'
    to='juliet@capulet.lit/balcony'
    type='result'/>
    

9.3 Jingle Audio and Video via RTP/AVP, Negotiated with ICE-UDP

In this scenario, Romeo initiates a combined audio and video chat with Juliet using a transport method of ICE-UDP. Juliet at first refuses the video portion, then later offers to add video, which Romeo accepts. The parties also exchange various informational messages.

The session flow is as follows:

Romeo                         Juliet
  |                             |
  |   session-initiate          |
  |---------------------------->|
  |   ack                       |
  |<----------------------------|
  |   session-info (ringing)    |
  |<----------------------------|
  |   ack                       |
  |---------------------------->|
  |   content-remove            |
  |<----------------------------|
  |   ack                       |
  |---------------------------->|
  |   transport-info (X times)  |
  |   (with acks)               |
  |<--------------------------->|
  |   STUN connectivity checks  |
  |<--------------------------->|
  |   session-accept            |
  |<----------------------------|
  |   ack                       |
  |---------------------------->|
  |   AUDIO (RTP)               |
  |<===========================>|
  |   session-info (hold)       |
  |<----------------------------|
  |   ack                       |
  |---------------------------->|
  |   session-info (active)     |
  |<----------------------------|
  |   ack                       |
  |---------------------------->|
  |   content-add               |
  |<----------------------------|
  |   ack                       |
  |---------------------------->|
  |   content-accept            |
  |---------------------------->|
  |   ack                       |
  |<----------------------------|
  |   AUDIO + VIDEO (RTP)       |
  |<===========================>|
  |   session-terminate         |
  |---------------------------->|
  |   ack                       |
  |<----------------------------|
  |                             |
    

The protocol flow is as follows.

Example 34. Initiation

<iq from='romeo@montague.lit/orchard'
    id='jingle1'
    to='juliet@capulet.lit/balcony'
    type='set'>
  <jingle xmlns='urn:xmpp:tmp:jingle'
          action='session-initiate'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='voice'>
      <description xmlns='urn:xmpp:tmp:jingle:apps:rtp' media='audio' profile='RTP/AVP'>
        <payload-type id='96' name='speex' clockrate='16000'/>
        <payload-type id='97' name='speex' clockrate='8000'/>
        <payload-type id='18' name='G729'/>
        <payload-type id='103' name='L16' clockrate='16000' channels='2'/>
        <payload-type id='98' name='x-ISAC' clockrate='8000'/>
      </description>
      <transport xmlns='urn:xmpp:tmp:jingle:transports:ice-udp'/>
    </content>
    <content creator='initiator' name='webcam'>
      <description xmlns='urn:xmpp:tmp:jingle:apps:rtp' media='video' profile='RTP/AVP'>
        <payload-type id='96' name='theora' clockrate='90000' height='720' width='1280'>
          <parameter name='delivery-method' value='inline'/>
          <parameter name='configuration' value='somebase16string'/>
          <parameter name='sampling' value='YCbCr-4:2:2'/>
        </payload-type>
        <payload-type id='28' name='nv' clockrate='90000'/>
        <payload-type id='25' name='CelB' clockrate='90000'/>
        <payload-type id='32' name='MPV' clockrate='90000'/>
      </description>
      <transport xmlns='urn:xmpp:tmp:jingle:transports:ice-udp'/>
    </content>
  </jingle>
</iq>
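
Because this application type is meant to map onto SDP for interworking with SIP endpoints, it can help to see how an offered <description/> might translate into SDP "a=rtpmap" attribute lines (RFC 4566). The following Python sketch is illustrative only: the function name rtpmap_lines is hypothetical, and skipping payload types that lack a 'name' or 'clockrate' (such as the static G729 entry) is a simplifying assumption rather than a rule from this specification.

```python
import xml.etree.ElementTree as ET
from typing import List

RTP_NS = "urn:xmpp:tmp:jingle:apps:rtp"


def rtpmap_lines(description_xml: str) -> List[str]:
    """Build SDP a=rtpmap lines from the <payload-type/> children."""
    description = ET.fromstring(description_xml.strip())
    lines = []
    for pt in description.findall(f"{{{RTP_NS}}}payload-type"):
        name, rate = pt.get("name"), pt.get("clockrate")
        if name is None or rate is None:
            # Assumption: static payload types without an explicit name
            # and clock rate are left to their RFC 3551 defaults.
            continue
        encoding = f"{name}/{rate}"
        channels = pt.get("channels")
        if channels is not None and channels != "1":
            encoding += f"/{channels}"
        lines.append(f"a=rtpmap:{pt.get('id')} {encoding}")
    return lines


# The audio <description/> offered in the session-initiate above.
offer = """
<description xmlns='urn:xmpp:tmp:jingle:apps:rtp' media='audio' profile='RTP/AVP'>
  <payload-type id='96' name='speex' clockrate='16000'/>
  <payload-type id='97' name='speex' clockrate='8000'/>
  <payload-type id='18' name='G729'/>
  <payload-type id='103' name='L16' clockrate='16000' channels='2'/>
  <payload-type id='98' name='x-ISAC' clockrate='8000'/>
</description>
"""

sdp_lines = rtpmap_lines(offer)
print("\n".join(sdp_lines))
```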
    

Example 35. Responder acknowledges session-initiate request

<iq from='juliet@capulet.lit/balcony'
    id='jingle1'
    to='romeo@montague.lit/orchard'
    type='result'/>
    

Example 36. Responder sends ringing message

<iq from='juliet@capulet.lit/balcony'
    id='ringing1'
    to='romeo@montague.lit/orchard'
    type='set'>
  <jingle xmlns='urn:xmpp:tmp:jingle'
          action='session-info'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <ringing xmlns='urn:xmpp:tmp:jingle:apps:rtp:info'/>
  </jingle>
</iq>
    

Example 37. Initiator acknowledges ringing message

<iq from='romeo@montague.lit/orchard'
    id='ringing1'
    to='juliet@capulet.lit/balcony'
    type='result'/>
    

However, Juliet doesn't want to do video because she is having a bad hair day, so she sends a "content-remove" request to Romeo.

Example 38. Responder requests content-remove

<iq from='juliet@capulet.lit/balcony'
    id='remove1'
    to='romeo@montague.lit/orchard'
    type='set'>
  <jingle xmlns='urn:xmpp:tmp:jingle'
          action='content-remove'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='webcam'/>
  </jingle>
</iq>
    

Romeo then acknowledges the content-remove request:

Example 39. Initiator acknowledges content-remove

<iq from='romeo@montague.lit/orchard'
    id='remove1'
    to='juliet@capulet.lit/balcony'
    type='result'/>
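
A client has to keep track of which content definitions are currently part of the session as content-remove and content-add actions arrive. The following Python sketch shows one possible bookkeeping approach. The class SessionContents and its methods are hypothetical, and keying contents by the (creator, name) pair is an assumption based on how the <content/> elements are identified in these examples.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SessionContents:
    # Maps (creator, name) to the media type of that content definition.
    contents: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def add(self, creator: str, name: str, media: str) -> None:
        """Record a content definition (session-initiate or content-add)."""
        self.contents[(creator, name)] = media

    def remove(self, creator: str, name: str) -> None:
        """Drop a content definition (content-remove)."""
        try:
            del self.contents[(creator, name)]
        except KeyError:
            raise KeyError(f"no content {name!r} created by {creator}") from None

    def media_types(self) -> List[str]:
        """Return the distinct media types still in the session, sorted."""
        return sorted(set(self.contents.values()))


session = SessionContents()
# session-initiate: Romeo offers audio and video.
session.add("initiator", "voice", "audio")
session.add("initiator", "webcam", "video")
# content-remove: Juliet declines the video content.
session.remove("initiator", "webcam")
audio_only = session.media_types()
# content-add: Juliet later offers video herself.
session.add("responder", "webcam", "video")
audio_and_video = session.media_types()
```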
    

Because the parties have chosen the Jingle ICE-UDP Transport Method, the initiator and responder exchange an open-ended number of possible candidate transports, perform connectivity checks, and agree upon a candidate transport as explained in XEP-0176. Once ICE negotiation is completed, the responder sends a session-accept action to the initiator.

Example 40. Responder sends session-accept

<iq from='juliet@capulet.lit/balcony'
    id='accept1'
    to='romeo@montague.lit/orchard'
    type='set'>
  <jingle xmlns='urn:xmpp:tmp:jingle'
          action='session-accept'
          initiator='romeo@montague.lit/orchard'
          responder='juliet@capulet.lit/balcony'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='voice'>
      <description xmlns='urn:xmpp:tmp:jingle:apps:rtp' media='audio' profile='RTP/AVP'>
        <payload-type id='97' name='speex' clockrate='8000'/>
        <payload-type id='18' name='G729'/>
      </description>
      <transport xmlns='urn:xmpp:tmp:jingle:transports:ice-udp'>
        <candidate component='1'
                   foundation='1'
                   generation='0'
                   ip='192.0.2.3'
                   network='1'
                   port='45664'
                   priority='1694498815'
                   protocol='udp'
                   pwd='asd88fgpdd777uzjYhagZg'
                   rel-addr='10.0.1.1'
                   rel-port='8998'
                   rem-addr='192.0.2.1'
                   rem-port='3478'
                   type='srflx'
                   ufrag='8hhy'/>
      </transport>
    </content>
  </jingle>
</iq>
    

As above, if the payload types and transport candidate can be successfully used by both parties, then the initiator acknowledges the session-accept action.

Example 41. Initiator acknowledges session-accept

<iq from='romeo@montague.lit/orchard'
    id='accept1'
    to='juliet@capulet.lit/balcony'
    type='result'/>
    

The parties now begin to exchange media. In this case they would exchange audio using the Speex codec at a clockrate of 8000 since that is the highest-priority codec for the responder (as determined by the XML order of the <payload-type/> children).

The parties chat for a while. Eventually Juliet wants to get her hair in order so she puts Romeo on hold.

Example 42. Responder sends hold message

<iq from='juliet@capulet.lit/balcony'
    id='hold1'
    to='romeo@montague.lit/orchard'
    type='set'>
  <jingle xmlns='urn:xmpp:tmp:jingle'
          action='session-info'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <hold xmlns='urn:xmpp:tmp:jingle:apps:rtp:info'/>
  </jingle>
</iq>
    

Example 43. Initiator acknowledges hold message

<iq from='romeo@montague.lit/orchard'
    id='hold1'
    to='juliet@capulet.lit/balcony'
    type='result'/>
    

Juliet returns so she informs Romeo that she is actively engaged in the call again.

Example 44. Responder sends active message

<iq from='juliet@capulet.lit/balcony'
    id='active1'
    to='romeo@montague.lit/orchard'
    type='set'>
  <jingle xmlns='urn:xmpp:tmp:jingle'
          action='session-info'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <active xmlns='urn:xmpp:tmp:jingle:apps:rtp:info'/>
  </jingle>
</iq>
    

Example 45. Initiator acknowledges active message

<iq from='romeo@montague.lit/orchard'
    id='active1'
    to='juliet@capulet.lit/balcony'
    type='result'/>
    

The parties now continue the audio chat.

Finally Juliet decides that she is presentable for a video chat so she sends a content-add request to Romeo.

Example 46. Responder sends a content-add

<iq from='juliet@capulet.lit/balcony'
    id='add1'
    to='romeo@montague.lit/orchard'
    type='set'>
  <jingle xmlns='urn:xmpp:tmp:jingle'
          action='content-add'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <content creator='responder' name='webcam'>
      <description xmlns='urn:xmpp:tmp:jingle:apps:rtp' media='video' profile='RTP/AVP'>
        <payload-type id='96' name='theora' height='720' width='1280'>
          <parameter name='delivery-method' value='inline'/>
          <parameter name='configuration' value='somebase16string'/>
          <parameter name='sampling' value='YCbCr-4:2:2'/>
        </payload-type>
        <payload-type id='32' name='MPV' clockrate='90000'/>
        <payload-type id='33' name='MP2T' clockrate='90000'/>
      </description>
      <transport xmlns='urn:xmpp:tmp:jingle:transports:ice-udp'/>
    </content>
  </jingle>
</iq>
    

The entity receiving the content-add request then acknowledges the request and, if it is acceptable, returns a content-accept action:

Example 47. Initiator acknowledges content-add

<iq from='romeo@montague.lit/orchard'
    id='add1'
    to='juliet@capulet.lit/balcony'
    type='result'/>
    

Example 48. Initiator accepts new content definition

<iq from='romeo@montague.lit/orchard'
    id='add2'
    to='juliet@capulet.lit/balcony'
    type='set'>
  <jingle xmlns='urn:xmpp:tmp:jingle'
          action='content-accept'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <content creator='responder' name='webcam'>
      <description xmlns='urn:xmpp:tmp:jingle:apps:rtp' media='video' profile='RTP/AVP'>
        <payload-type id='96' name='theora' height='720' width='1280'>
          <parameter name='delivery-method' value='inline'/>
          <parameter name='configuration' value='somebase16string'/>
          <parameter name='sampling' value='YCbCr-4:2:2'/>
        </payload-type>
        <payload-type id='32' name='MPV' clockrate='90000'/>
      </description>
      <transport xmlns='urn:xmpp:tmp:jingle:transports:ice-udp'/>
    </content>
  </jingle>
</iq>
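
In the content-accept above, Romeo returns only the payload types from Juliet's content-add that he can handle (theora and MPV, but not MP2T). A rough Python sketch of that filtering step follows. The function name answer_payload_types is hypothetical, and both matching on the codec name alone and keeping the order of the offer are simplifying assumptions, not requirements taken from this specification.

```python
from typing import Dict, Iterable, List


def answer_payload_types(offered: List[Dict[str, str]],
                         supported_names: Iterable[str]) -> List[Dict[str, str]]:
    """Keep the offered payload types whose codec name is supported locally.

    Names are compared case-insensitively; the order of the offer is kept.
    """
    supported = {name.lower() for name in supported_names}
    return [pt for pt in offered if pt.get("name", "").lower() in supported]


# Payload types from the content-add request.
offered = [
    {"id": "96", "name": "theora", "height": "720", "width": "1280"},
    {"id": "32", "name": "MPV", "clockrate": "90000"},
    {"id": "33", "name": "MP2T", "clockrate": "90000"},
]

accepted = answer_payload_types(offered, ["theora", "MPV"])
accepted_ids = [pt["id"] for pt in accepted]
```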
    

The other party then acknowledges the acceptance.

Example 49. Responder acknowledges content-accept

<iq from='juliet@capulet.lit/balcony'
    id='add2'
    to='romeo@montague.lit/orchard'
    type='result'/>
    

The media session proceeds. Now they would exchange both audio and video, where the audio is exchanged via the Speex codec at a clockrate of 8000 and the video is exchanged using the Theora codec with a height of 720 pixels, a width of 1280 pixels, and so on.

The parties may continue the session as long as desired.

Eventually, one of the parties terminates the session.

Example 50. Initiator sends session-terminate

<iq from='romeo@montague.lit/orchard'
    id='term1'
    to='juliet@capulet.lit/balcony'
    type='set'>
  <jingle xmlns='urn:xmpp:tmp:jingle'
          action='session-terminate'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <reason>
      <condition><busy/></condition>
      <text>I&apos;m outta here!</text>
    </reason>
  </jingle>
</iq>
    

Example 51. Responder acknowledges session-terminate

<iq from='juliet@capulet.lit/balcony'
    id='term1'
    to='romeo@montague.lit/orchard'
    type='result'/>
    

9.4 Secure Jingle Audio via UDP/TLS/RTP/SAVP, Negotiated with ICE-UDP

In this scenario, Romeo initiates a voice chat with Juliet using a secure transport as specified in RTP Over DTLS [14] (via a profile of "UDP/TLS/RTP/SAVP").

The session flow is as follows:

Romeo                         Juliet
  |                             |
  |   session-initiate          |
  |---------------------------->|
  |   ack                       |
  |<----------------------------|
  |   session-info (ringing)    |
  |<----------------------------|
  |   ack                       |
  |---------------------------->|
  |   transport-info (X times)  |
  |   (with acks)               |
  |<--------------------------->|
  |   STUN connectivity checks  |
  |<--------------------------->|
  |   session-accept            |
  |<----------------------------|
  |   ack                       |
  |---------------------------->|
  |   AUDIO (RTP)               |
  |<===========================>|
  |   session-terminate         |
  |<----------------------------|
  |   ack                       |
  |---------------------------->|
  |                             |
    

The protocol flow is as follows.

Example 52. Initiator sends session-initiate

<iq from='romeo@montague.lit/orchard'
    id='jingle1'
    to='juliet@capulet.lit/balcony'
    type='set'>
  <jingle xmlns='urn:xmpp:tmp:jingle'
          action='session-initiate'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='voice'>
      <description xmlns='urn:xmpp:tmp:jingle:apps:rtp' media='audio' profile='UDP/TLS/RTP/SAVP'>
        <payload-type id='96' name='speex' clockrate='16000'/>
        <payload-type id='97' name='speex' clockrate='8000'/>
        <payload-type id='18' name='G729'/>
        <payload-type id='103' name='L16' clockrate='16000' channels='2'/>
        <payload-type id='98' name='x-ISAC' clockrate='8000'/>
      </description>
      <transport xmlns='urn:xmpp:tmp:jingle:transports:ice-udp'/>
    </content>
  </jingle>
</iq>
    

Example 53. Responder acknowledges session-initiate

<iq from='juliet@capulet.lit/balcony'
    id='jingle1'
    to='romeo@montague.lit/orchard'
    type='result'/>
    

Example 54. Responder sends ringing message

<iq from='juliet@capulet.lit/balcony'
    id='ringing1'
    to='romeo@montague.lit/orchard'
    type='set'>
  <jingle xmlns='urn:xmpp:tmp:jingle'
          action='session-info'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <ringing xmlns='urn:xmpp:tmp:jingle:apps:rtp:info'/>
  </jingle>
</iq>
    

Example 55. Initiator acknowledges ringing message

<iq from='romeo@montague.lit/orchard'
    id='ringing1'
    to='juliet@capulet.lit/balcony'
    type='result'/>
    

Because the parties have chosen the Jingle ICE-UDP Transport Method, the initiator and responder exchange an open-ended number of possible candidate transports, perform connectivity checks, and agree upon a candidate transport as explained in XEP-0176. Once ICE negotiation is completed, the responder sends a session-accept action to the initiator.

Example 56. Responder sends session-accept

<iq from='juliet@capulet.lit/balcony'
    id='accept1'
    to='romeo@montague.lit/orchard'
    type='set'>
  <jingle xmlns='urn:xmpp:tmp:jingle'
          action='session-accept'
          initiator='romeo@montague.lit/orchard'
          responder='juliet@capulet.lit/balcony'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='voice'>
      <description xmlns='urn:xmpp:tmp:jingle:apps:rtp' media='audio' profile='UDP/TLS/RTP/SAVP'>
        <payload-type id='97' name='speex' clockrate='8000'/>
        <payload-type id='18' name='G729'/>
      </description>
      <transport xmlns='urn:xmpp:tmp:jingle:transports:ice-udp'>
        <candidate component='1'
                   foundation='1'
                   generation='0'
                   ip='192.0.2.3'
                   network='1'
                   port='45664'
                   priority='1694498815'
                   protocol='udp'
                   pwd='asd88fgpdd777uzjYhagZg'
                   rel-addr='10.0.1.1'
                   rel-port='8998'
                   rem-addr='192.0.2.1'
                   rem-port='3478'
                   type='srflx'
                   ufrag='8hhy'/>
      </transport>
    </content>
  </jingle>
</iq>
    

If the payload types and transport candidate can be successfully used by both parties, then the initiator acknowledges the session-accept action.

Example 57. Initiator acknowledges session-accept

<iq from='romeo@montague.lit/orchard'
    id='accept1'
    to='juliet@capulet.lit/balcony'
    type='result'/>
    

The parties now begin to exchange media. In this case they would exchange audio using the Speex codec at a clockrate of 8000 since that is the highest-priority codec for the responder (as determined by the XML order of the <payload-type/> children).

The parties may continue the session as long as desired.

Eventually, one of the parties terminates the session.

Example 58. Responder terminates the session

<iq from='juliet@capulet.lit/balcony'
    id='term1'
    to='romeo@montague.lit/orchard'
    type='set'>
  <jingle xmlns='urn:xmpp:tmp:jingle'
          action='session-terminate'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <reason>
      <condition><no-error/></condition>
      <text>Sorry, gotta go!</text>
    </reason>
  </jingle>
</iq>
    

The other party then acknowledges termination of the session:

Example 59. Initiator acknowledges termination

<iq from='romeo@montague.lit/orchard'
    id='term1'
    to='juliet@capulet.lit/balcony'
    type='result'/>
    

10. Implementation Notes

10.1 Audio Sessions

10.1.1 Codecs

10.1.1.1 Speex

For the sake of interoperability with a wide variety of free and open-source voice systems as well as deployment of patent-free technologies, support for the Speex codec is RECOMMENDED.

10.1.1.2 G.711

For the sake of interoperability with the public switched telephone network (PSTN) and most VoIP providers, support for the Pulse Code Modulation (PCM) codec defined in International Telecommunication Union (ITU) [15] recommendation G.711 is RECOMMENDED, including both the μ-law ("U-law") version widely deployed in North America and Japan and the A-law version widely deployed in the rest of the world.

10.1.2 DTMF

If it is necessary to send Dual Tone Multi-Frequency (DTMF) tones in the context of audio exchanges, it is RECOMMENDED to use the XML format specified in Jingle DTMF [16]. However, an implementation MAY also support native RTP methods, specifically the "audio/telephone-event" and "audio/tone" media types.

10.1.3 When to Listen for Audio

When the Jingle RTP content type is accepted via a session-accept action, both initiator and responder SHOULD start listening for audio as defined by the negotiated transport method and audio application format. For interoperability with telephony systems, after the responder acknowledges the session initiation request, the responder SHOULD send a "ringing" message and both parties SHOULD play any audio received.

10.2 Video Sessions

10.2.1 Codecs

Support for the Theora codec is RECOMMENDED.

11. Security Considerations

In order to secure the data stream, implementations SHOULD use encryption methods appropriate to the transport method and media being exchanged; for example, in the case of UDP, that would include Datagram Transport Layer Security (DTLS) as specified in RFC 4347 [17]. The work-in-progress draft-fischl-mmusic-sdp-dtls defines such methods for the Session Description Protocol; the relevant RTP profile (e.g., "UDP/TLS/RTP/SAVP" for transporting the RTP stream over DTLS with UDP) shall be specified as the value of the <description/> element's 'profile' attribute.
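
As an illustration of how a receiving client might inspect the RTP profile before starting media, consider the following Python sketch. The helper names negotiated_profile and uses_dtls are hypothetical. The default of "RTP/AVP" follows the schema in this document, and treating "UDP/TLS/RTP/SAVP" as the only secure profile is an assumption made because it is the only one named here.

```python
import xml.etree.ElementTree as ET

DEFAULT_PROFILE = "RTP/AVP"  # default value given in the application format schema
SECURE_PROFILES = {"UDP/TLS/RTP/SAVP"}  # assumption: the only secure profile named here


def negotiated_profile(description_xml: str) -> str:
    """Return the 'profile' attribute, falling back to the schema default."""
    description = ET.fromstring(description_xml.strip())
    return description.get("profile", DEFAULT_PROFILE)


def uses_dtls(description_xml: str) -> bool:
    """Report whether the description asks for RTP over DTLS."""
    return negotiated_profile(description_xml) in SECURE_PROFILES


secure = "<description xmlns='urn:xmpp:tmp:jingle:apps:rtp' media='audio' profile='UDP/TLS/RTP/SAVP'/>"
plain = "<description xmlns='urn:xmpp:tmp:jingle:apps:rtp' media='audio'/>"

secure_result = uses_dtls(secure)
plain_profile = negotiated_profile(plain)
```

A client that requires encrypted media could refuse to accept a content definition for which uses_dtls returns False.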

12. IANA Considerations

This document requires no interaction with the Internet Assigned Numbers Authority (IANA) [18].

13. XMPP Registrar Considerations

13.1 Protocol Namespaces

Until this specification advances to a status of Draft, its associated namespaces shall be:

urn:xmpp:tmp:jingle:apps:rtp
urn:xmpp:tmp:jingle:apps:rtp:info

Upon advancement of this specification, the XMPP Registrar [19] shall issue permanent namespaces in accordance with the process defined in Section 4 of XMPP Registrar Function [20].

The following namespaces are requested, and are thought to be unique per the XMPP Registrar's requirements:

13.2 Service Discovery Features

For each RTP media type that an entity supports, it MUST advertise support for the "urn:xmpp:tmp:jingle:apps:rtp#[media]" feature, where the string "[media]" is replaced by the appropriate media type such as "audio" or "video".

The initial registry submission is as follows.

Registry Submission

<var>
  <name>urn:xmpp:tmp:jingle:apps:rtp#audio</name>
  <desc>Signals support for audio sessions via RTP</desc>
  <doc>XEP-0167</doc>
</var>
<var>
  <name>urn:xmpp:tmp:jingle:apps:rtp#video</name>
  <desc>Signals support for video sessions via RTP</desc>
  <doc>XEP-0167</doc>
</var>
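
The feature names follow a simple pattern, which the following Python sketch spells out. The function names are hypothetical, and the base string is the provisional namespace used in this document.

```python
from typing import Iterable, List

RTP_FEATURE_BASE = "urn:xmpp:tmp:jingle:apps:rtp"  # provisional namespace


def rtp_disco_features(media_types: Iterable[str]) -> List[str]:
    """Build one service discovery feature per supported RTP media type."""
    return [f"{RTP_FEATURE_BASE}#{media}" for media in media_types]


def supports_media(advertised: Iterable[str], media: str) -> bool:
    """Check whether a peer's advertised features include the given media type."""
    return f"{RTP_FEATURE_BASE}#{media}" in set(advertised)


features = rtp_disco_features(["audio", "video"])
can_do_video = supports_media(features, "video")
```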
    

13.3 Jingle Application Formats

The XMPP Registrar shall include "rtp" in its registry of Jingle application formats. The registry submission is as follows:

<application>
  <name>rtp</name>
  <desc>Jingle sessions that support media exchange via the Real-time Transport Protocol</desc>
  <transport>lossy</transport>
  <doc>XEP-0167</doc>
</application>
    

14. XML Schemas

14.1 Application Format

<?xml version='1.0' encoding='UTF-8'?>

<xs:schema
    xmlns:xs='http://www.w3.org/2001/XMLSchema'
    targetNamespace='urn:xmpp:tmp:jingle:apps:rtp'
    xmlns='urn:xmpp:tmp:jingle:apps:rtp'
    elementFormDefault='qualified'>

  <xs:element name='description'>
    <xs:complexType>
      <xs:sequence minOccurs='0' maxOccurs='unbounded'>
        <xs:element ref='payload-type'/>
      </xs:sequence>
      <xs:attribute name='profile' use='optional' type='xs:string' default='RTP/AVP'/>
    </xs:complexType>
  </xs:element>

  <xs:element name='payload-type'>
    <xs:complexType>
      <xs:sequence minOccurs='0' maxOccurs='unbounded'>
        <xs:element ref='parameter'/>
      </xs:sequence>
      <xs:attribute name='channels' type='xs:byte' use='optional' default='1'/>
      <xs:attribute name='clockrate' type='xs:unsignedInt' use='optional'/>
      <xs:attribute name='id' type='xs:unsignedByte' use='required'/>
      <xs:attribute name='maxptime' type='xs:short' use='optional'/>
      <xs:attribute name='name' type='xs:string' use='optional'/>
      <xs:attribute name='ptime' type='xs:short' use='optional'/>
    </xs:complexType>
  </xs:element>

  <xs:element name='parameter'>
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base='empty'>
          <xs:attribute name='name' type='xs:string' use='required'/>
          <xs:attribute name='value' type='xs:string' use='required'/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>

  <xs:simpleType name='empty'>
    <xs:restriction base='xs:string'>
      <xs:enumeration value=''/>
    </xs:restriction>
  </xs:simpleType>

</xs:schema>
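
The constraints this schema places on the <payload-type/> element can also be checked by hand when a full schema validator is not available. The Python sketch below is one way to do that, not a replacement for schema validation. The PayloadType class and parse_payload_type function are hypothetical; the checks it applies (a required 'id' in the xs:unsignedByte range and a default of 1 for 'channels') are taken from the schema above.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Dict, Optional

RTP_NS = "urn:xmpp:tmp:jingle:apps:rtp"


@dataclass
class PayloadType:
    id: int
    name: Optional[str]
    clockrate: Optional[int]
    channels: int
    parameters: Dict[str, str]


def parse_payload_type(xml_text: str) -> PayloadType:
    """Parse one <payload-type/> element, applying the schema defaults."""
    element = ET.fromstring(xml_text.strip())
    raw_id = element.get("id")
    if raw_id is None:
        raise ValueError("the 'id' attribute is required")
    payload_id = int(raw_id)
    if not 0 <= payload_id <= 255:
        raise ValueError("'id' must fit in xs:unsignedByte (0-255)")
    raw_rate = element.get("clockrate")
    parameters = {
        p.get("name"): p.get("value")
        for p in element.findall(f"{{{RTP_NS}}}parameter")
    }
    return PayloadType(
        id=payload_id,
        name=element.get("name"),
        clockrate=int(raw_rate) if raw_rate is not None else None,
        channels=int(element.get("channels", "1")),  # schema default is 1
        parameters=parameters,
    )


theora = parse_payload_type("""
<payload-type xmlns='urn:xmpp:tmp:jingle:apps:rtp' id='96' name='theora' clockrate='90000'>
  <parameter name='delivery-method' value='inline'/>
  <parameter name='sampling' value='YCbCr-4:2:2'/>
</payload-type>
""")
```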
    

14.2 Informational Messages

<?xml version='1.0' encoding='UTF-8'?>

<xs:schema
    xmlns:xs='http://www.w3.org/2001/XMLSchema'
    targetNamespace='urn:xmpp:tmp:jingle:apps:rtp:info'
    xmlns='urn:xmpp:tmp:jingle:apps:rtp:info'
    elementFormDefault='qualified'>

  <xs:element name='active' type='empty'/>

  <xs:element name='hold' type='empty'/>

  <xs:element name='mute'>
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base='empty'>
          <xs:attribute name='name' type='xs:string' use='optional'/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>

  <xs:element name='ringing' type='empty'/>

  <xs:simpleType name='empty'>
    <xs:restriction base='xs:string'>
      <xs:enumeration value=''/>
    </xs:restriction>
  </xs:simpleType>

</xs:schema>
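
To show how the four informational elements fit into a session-info action, here is a small Python sketch that builds the <jingle/> payload. The function session_info is hypothetical, the namespaces are the provisional ones used in this document, and treating the 'name' attribute of <mute/> as the name of a content definition is an assumption.

```python
import xml.etree.ElementTree as ET
from typing import Optional

JINGLE_NS = "urn:xmpp:tmp:jingle"
INFO_NS = "urn:xmpp:tmp:jingle:apps:rtp:info"
INFO_ELEMENTS = {"active", "hold", "mute", "ringing"}  # elements defined in the schema


def session_info(info: str, initiator: str, sid: str,
                 mute_name: Optional[str] = None) -> ET.Element:
    """Build a <jingle/> element carrying one informational message."""
    if info not in INFO_ELEMENTS:
        raise ValueError(f"unknown informational message: {info}")
    jingle = ET.Element(
        f"{{{JINGLE_NS}}}jingle",
        {"action": "session-info", "initiator": initiator, "sid": sid},
    )
    payload = ET.SubElement(jingle, f"{{{INFO_NS}}}{info}")
    if info == "mute" and mute_name is not None:
        # Assumption: 'name' identifies the content definition being muted.
        payload.set("name", mute_name)
    return jingle


hold = session_info("hold", "romeo@montague.lit/orchard", "a73sjjvkla37jfea")
mute = session_info("mute", "romeo@montague.lit/orchard", "a73sjjvkla37jfea",
                    mute_name="voice")
print(ET.tostring(hold, encoding="unicode"))
```

The returned element would still need to be wrapped in an <iq type='set'/> stanza by whatever XMPP library the client uses.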
    

15. Acknowledgements

Thanks to Milton Chen, Diana Cionoiu, Olivier Crête, Tim Julien, Steffen Larsen, Robert McQueen, Jeff Muller, Mike Ruprecht, and Paul Witty for their feedback.


Notes

1. XEP-0166: Jingle <http://www.xmpp.org/extensions/xep-0166.html>.

2. RFC 3550: RTP: A Transport Protocol for Real-Time Applications <http://tools.ietf.org/html/rfc3550>.

3. RFC 4566: SDP: Session Description Protocol <http://tools.ietf.org/html/rfc4566>.

4. XEP-0177: Jingle Raw UDP Transport Method <http://www.xmpp.org/extensions/xep-0177.html>.

5. XEP-0176: Jingle ICE-UDP Transport Method <http://www.xmpp.org/extensions/xep-0176.html>.

6. RFC 3551: RTP Profile for Audio and Video Conferences with Minimal Control <http://tools.ietf.org/html/rfc3551>.

7. RTP Payload Format for the Speex Codec <http://tools.ietf.org/html/draft-ietf-avt-rtp-speex>. Work in progress.

8. See <http://www.speex.org/>.

9. The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique parameter values for Internet protocols, such as port numbers and URI schemes. For further information, see <http://www.iana.org/>.

10. RFC 3264: An Offer/Answer Model with the Session Description Protocol (SDP) <http://tools.ietf.org/html/rfc3264>.

11. A <trying/> element (equivalent to the SIP 100 Trying response code) is not necessary, since each session-level action is acknowledged via XMPP IQ semantics.

12. XEP-0030: Service Discovery <http://www.xmpp.org/extensions/xep-0030.html>.

13. XEP-0115: Entity Capabilities <http://www.xmpp.org/extensions/xep-0115.html>.

14. Real-Time Transport Protocol (RTP) over Datagram Transport Layer Security (DTLS) <http://tools.ietf.org/html/draft-fischl-mmusic-sdp-dtls>. Work in progress.

15. The International Telecommunication Union develops technical and operating standards (such as H.323) for international telecommunication services. For further information, see <http://www.itu.int/>.

16. XEP-0181: Jingle DTMF <http://www.xmpp.org/extensions/xep-0181.html>.

17. RFC 4347: Datagram Transport Layer Security <http://tools.ietf.org/html/rfc4347>.

18. The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique parameter values for Internet protocols, such as port numbers and URI schemes. For further information, see <http://www.iana.org/>.

19. The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used in the context of XMPP extension protocols approved by the XMPP Standards Foundation. For further information, see <http://www.xmpp.org/registrar/>.

20. XEP-0053: XMPP Registrar Function <http://www.xmpp.org/extensions/xep-0053.html>.


Revision History

Version 0.21 (2008-06-09)

Added name attribute to mute element for more precise handling of informational messages.

(psa)

Version 0.20 (2008-06-04)

In accordance with list consensus, generalized to cover all RTP media, not just audio; corrected text regarding payload types sent by responder in order to match SDP approach.

(psa)

Version 0.19 (2008-05-28)

Specified default value for profile attribute; clarified relationship to SDP offer-answer model.

(psa)

Version 0.18 (2008-05-28)

Removed content-replace from ICE-UDP examples per XEP-0176.

(psa)

Version 0.17 (2008-02-29)

Corrected use of content-replace action per XEP-0166.

(psa)

Version 0.16 (2008-02-28)

Moved profile attribute from XEP-0166 to this specification.

(psa)

Version 0.15 (2008-01-11)

Removed content-accept after content-remove per XEP-0166.

(psa)

Version 0.14 (2008-01-03)

Modified examples to track changes to XEP-0176.

(psa)

Version 0.13 (2007-12-06)

To track changes to XEP-0166, modified busy scenario and removed unsupported-codecs error.

(psa)

Version 0.12 (2007-11-27)

Further editorial review.

(psa)

Version 0.11 (2007-11-15)

Editorial review and consistency check; moved voice chat scenarios from XEP-0166 to this specification.

(psa)

Version 0.10 (2007-11-13)

Removed info message for busy since it is now a Jingle-specific error condition defined in XEP-0166; defined info message for active.

(psa)

Version 0.9 (2007-04-17)

Specified Jingle conformance, including the preference for lossy transports over reliable transports and the process of sending and receiving audio content over each transport type.

(psa)

Version 0.8 (2007-03-23)

Renamed to mention RTP as the associated transport; corrected negotiation flow to be consistent with SIP/SDP (each party specifies a list of the payload types it can receive); added profile attribute to content element in order to specify RTP profile in use.

(psa/ram)

Version 0.7 (2006-12-21)

Modified spec to use provisional namespace before advancement to Draft (per XEP-0053).

(psa)

Version 0.6 (2006-10-31)

Specified how to include SDP parameters and codec-specific parameters; clarified negotiation process; added Speex examples; removed queued info message.

(psa/se)

Version 0.5 (2006-08-23)

Modified namespace to track XEP-0166.

(psa)

Version 0.4 (2006-07-12)

Specified when to play received audio (early media); specified that DTMF must use in-band signalling (XEP-0181).

(se/psa)

Version 0.3 (2006-03-20)

Defined info messages for hold and mute.

(psa)

Version 0.2 (2006-02-13)

Defined info message for busy; added info message examples; recommended use of Speex; updated schema and XMPP Registrar considerations.

(psa)

Version 0.1 (2005-12-15)

Initial version.

(psa)

Version 0.0.3 (2005-12-05)

Described service discovery usage; defined initial informational messages.

(psa)

Version 0.0.2 (2005-10-27)

Added SDP mapping, security considerations, IANA considerations, XMPP Registrar considerations, and XML schema.

(psa)

Version 0.0.1 (2005-10-21)

First draft.

(psa/sl)
