Abstract: | This specification defines an XMPP protocol extension for initiating and managing multiparty voice and video conferences within an XMPP MUC |
Authors: | Sjoerd Simons, Dafydd Harries |
Copyright: | © 1999 – 2018 XMPP Standards Foundation. SEE LEGAL NOTICES. |
Status: | Deferred |
Type: | Standards Track |
Version: | 0.1.1 |
Last Updated: | 2018-11-03 |
WARNING: This document has been automatically Deferred after 12 months of inactivity in its previous Experimental state. Implementation of the protocol described herein is not recommended for production systems. However, exploratory implementations are encouraged to resume the standards process.
1. Introduction
2. How it Works
3. Joining a Conference
4. Leaving a Conference
5. Adding a Content Type
6. Removing a Content Type
7. Relays and Mixers
Appendices
A: Document Information
B: Author Information
C: Legal Notices
D: Relation to XMPP
E: Discussion Venue
F: Requirements Conformance
G: Notes
H: Revision History
Jingle (XEP-0166) [1] is used to negotiate peer-to-peer media sessions. Muji (short for Multiparty Jingle) is a way to coordinate Jingle sessions between a group of people. Muji conferences are held in Multi-User Chat (XEP-0045) [2] rooms.
A Muji conference has a number of contents, each of which has a unique name, a content type, and an encoding. Each participant may provide a stream for each content, and communicates in their MUC presence which contents they are willing to provide streams for, along with encoding information. This serves two purposes. First, each participant knows which contents every other participant provides. Second, there is a global payload type (PT) mapping for the various contents, so that clients need to encode and payload each content they provide only once.
Participants are not required to participate in all the contents that are available. For example, a Muji client might choose to only request audio streams.
Joining a conference is done in two stages. First, the client declares that it is preparing to either join or start a Muji session inside the MUC. It does so by sending a presence stanza to the MUC with a preparing element in the Muji section.
<presence from='wiccarocks@shakespeare.lit/laptop' to='darkcave@chat.shakespeare.lit/oldhag'>
  <c xmlns="http://jabber.org/protocol/caps"
     node="http://telepathy.freedesktop.org/wiki/Muji"
     ver="48QdBuXRCJFb8qIzgy1FOHSGO0U="
     hash="sha-1" />
  <muji xmlns='http://telepathy.freedesktop.org/muji'>
    <preparing />
  </muji>
</presence>
The client MUST then wait until the MUC rebroadcasts its presence message, after which it MUST wait for all other participants that had a preparing element in their presence to finish preparation. Afterwards it should finish its own preparation by updating its presence with the contents it wants to take part in.
<presence from='wiccarocks@shakespeare.lit/laptop' to='darkcave@chat.shakespeare.lit/oldhag'>
  <c xmlns="http://jabber.org/protocol/caps"
     node="http://telepathy.freedesktop.org/wiki/Muji"
     ver="48QdBuXRCJFb8qIzgy1FOHSGO0U="
     hash="sha-1" />
  <muji xmlns='http://telepathy.freedesktop.org/muji'>
    <content name='video'>
      <description xmlns='urn:xmpp:jingle:apps:rtp:0' media='video'>
        <payload-type id='97' name='theora' clockrate='90000'/>
      </description>
    </content>
    <content creator='initiator' name='voice'>
      <description xmlns='urn:xmpp:jingle:apps:rtp:0' media='audio'>
        <payload-type id='97' name='speex' clockrate='8000'/>
        <payload-type id='18' name='G729'/>
      </description>
    </content>
  </muji>
</presence>
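The waiting rule that precedes the example above can be sketched in Python. This is only an illustration under stated assumptions: the helper names `is_preparing` and `still_preparing` are hypothetical, presences are assumed to be available as XML strings keyed by occupant JID, and a participant that has left the room is assumed to no longer block the join.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Set

MUJI_NS = "http://telepathy.freedesktop.org/muji"


def is_preparing(presence_xml: str) -> bool:
    """Return True if the presence has <preparing/> inside its Muji section."""
    presence = ET.fromstring(presence_xml)
    muji = presence.find(f"{{{MUJI_NS}}}muji")
    if muji is None:
        return False
    return muji.find(f"{{{MUJI_NS}}}preparing") is not None


def still_preparing(snapshot: Set[str], latest: Dict[str, str]) -> Set[str]:
    """Return the participants from `snapshot` that are still preparing.

    `snapshot` holds the occupant JIDs that were preparing when our own
    presence was rebroadcast (hypothetical bookkeeping done by the caller).
    `latest` maps occupant JID to the most recent presence XML; occupants
    that left the room are assumed to be absent from it.
    """
    return {
        jid
        for jid in snapshot
        if jid in latest and is_preparing(latest[jid])
    }
```

Under these assumptions, a client would publish its own content descriptions once `still_preparing` returns an empty set.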
When a client adds a payload ID to a content description, it MUST have the same codec name and receiving parameters as the corresponding entries in other participants' payload maps for that content. For instance, if Alice defines a payload type with ID 98, codec Speex and a clock rate of 8000 for a content called “voice0”, then Bob must define payload type 98 identically or not at all for that content.
Furthermore, each content description MUST include at least one payload type that every other participant supports. In other words, the intersection of payload type mappings in descriptions for a content must not be the empty set. This avoids clients having to encode the same stream multiple times, which can be very costly, and also allows sending the encoded data only once where the transport makes this possible (e.g. IP multicast).
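The two payload type rules above can be checked mechanically. The following Python sketch is a simplification, not part of the protocol: it models a payload map as a dictionary from payload type ID to a (codec name, clock rate) pair, ignores any further receiving parameters, and assumes codec names compare case-insensitively. The names `PayloadMap`, `conflicting_ids` and `common_ids` are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

# Payload type ID -> (codec name, clock rate or None if unspecified).
PayloadMap = Dict[int, Tuple[str, Optional[int]]]


def _normalize(entry: Tuple[str, Optional[int]]) -> Tuple[str, Optional[int]]:
    # Assumption: codec names are compared case-insensitively.
    name, clockrate = entry
    return (name.lower(), clockrate)


def conflicting_ids(mine: PayloadMap, others: Dict[str, PayloadMap]) -> List[int]:
    """IDs in `mine` that some other participant defines differently."""
    bad = set()
    for pmap in others.values():
        for pt_id, entry in mine.items():
            if pt_id in pmap and _normalize(pmap[pt_id]) != _normalize(entry):
                bad.add(pt_id)
    return sorted(bad)


def common_ids(mine: PayloadMap, others: Dict[str, PayloadMap]) -> Set[int]:
    """IDs present in every participant's map; must not be empty."""
    common = set(mine)
    for pmap in others.values():
        common &= set(pmap)
    return common
```

With Alice's map `{98: ("speex", 8000), 18: ("G729", None)}`, a description from Bob that defines ID 98 with a clock rate of 16000 would be reported by `conflicting_ids`, and a description that shares no ID with Alice would yield an empty set from `common_ids`.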
Once a client has constructed content descriptions and advertised them in its MUC presence, it MUST initiate a Jingle session with every other participant. The requirement that it is the joining participant that initiates sessions avoids race conditions.
Jingle sessions are initiated between the MUC JIDs of participants. That is, the Jingle session-initiate stanza is sent from one MUC JID to another. This allows participants to easily identify sessions as belonging to a Muji conference. Content names inside Muji-related Jingle sessions always refer to the content with the same name inside the Muji conference.
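As a rough illustration of the two paragraphs above, the Python sketch below decides which occupants a joining client would initiate sessions with and whether an incoming Jingle stanza comes from an occupant of the conference room. The function names are hypothetical, JIDs are compared as plain strings without normalization, and skipping occupants with no content in common is an assumption of this sketch rather than something the text states.

```python
from typing import Dict, List, Set


def belongs_to_conference(sender_jid: str, room_jid: str) -> bool:
    """True if `sender_jid` is an occupant JID (room@service/nick) of the room."""
    bare, _, nick = sender_jid.partition("/")
    return bare == room_jid and bool(nick)


def sessions_to_initiate(
    own_jid: str,
    own_contents: Set[str],
    advertised: Dict[str, Set[str]],
) -> Dict[str, List[str]]:
    """Map each other occupant to the content names to put in its session.

    `advertised` maps occupant JID to the content names found in that
    occupant's Muji presence; occupants without Muji contents map to an
    empty set and are skipped (an assumption of this sketch).
    """
    plan = {}
    for jid, contents in advertised.items():
        if jid == own_jid:
            continue
        shared = sorted(own_contents & contents)
        if shared:
            plan[jid] = shared
    return plan
```

Because content names in a Muji-related Jingle session refer to the contents of the same name in the conference, the plan only needs to carry names.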
To leave a conference, a participant MUST first remove the Muji information from its presence; it SHOULD then terminate all Jingle sessions related to that conference. Updating the presence first reduces the likelihood of situations where new participants initiate sessions with participants who are leaving the conference.
Adding a stream follows a process similar to joining a conference. As a first step, an updated presence stanza MUST be sent which contains a preparing element as part of the Muji section.
<presence from='wiccarocks@shakespeare.lit/laptop' to='darkcave@chat.shakespeare.lit/oldhag'>
  <c xmlns="http://jabber.org/protocol/caps"
     node="http://telepathy.freedesktop.org/wiki/Muji"
     ver="48QdBuXRCJFb8qIzgy1FOHSGO0U="
     hash="sha-1" />
  <muji xmlns='http://telepathy.freedesktop.org/muji'>
    <content creator='initiator' name='voice'>
      <description xmlns='urn:xmpp:jingle:apps:rtp:0' media='audio'>
        <payload-type id='97' name='speex' clockrate='8000'/>
        <payload-type id='18' name='G729'/>
      </description>
    </content>
    <preparing/>
  </muji>
</presence>
The client MUST then wait until the MUC rebroadcasts its presence message, after which it MUST wait for all other participants that had a preparing element in their presence to finish their changes.
Afterwards the client should add the new content to the Muji section of its presence and add the content to every Jingle session it has with a participant that shares the content.
<presence from='wiccarocks@shakespeare.lit/laptop' to='darkcave@chat.shakespeare.lit/oldhag'>
  <c xmlns="http://jabber.org/protocol/caps"
     node="http://telepathy.freedesktop.org/wiki/Muji"
     ver="48QdBuXRCJFb8qIzgy1FOHSGO0U="
     hash="sha-1" />
  <muji xmlns='http://telepathy.freedesktop.org/muji'>
    <content name='video'>
      <description xmlns='urn:xmpp:jingle:apps:rtp:0' media='video'>
        <payload-type id='97' name='theora' clockrate='90000'/>
      </description>
    </content>
    <content creator='initiator' name='voice'>
      <description xmlns='urn:xmpp:jingle:apps:rtp:0' media='audio'>
        <payload-type id='97' name='speex' clockrate='8000'/>
        <payload-type id='18' name='G729'/>
      </description>
    </content>
  </muji>
</presence>
To remove a content type, the participant SHOULD first send an updated presence without the content in its Muji section. Afterwards it MUST remove the content from all the Jingle sessions it has open.
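The presence update in the first step might look like the following Python sketch, which drops a named content from the Muji section of an already parsed presence. The helper names `drop_content` and `content_names` are hypothetical, and removing the content from the open Jingle sessions is left to the Jingle layer and not shown.

```python
import xml.etree.ElementTree as ET
from typing import List

MUJI_NS = "http://telepathy.freedesktop.org/muji"


def content_names(presence: ET.Element) -> List[str]:
    """Names of the contents advertised in the Muji section, in order."""
    muji = presence.find(f"{{{MUJI_NS}}}muji")
    if muji is None:
        return []
    return [c.get("name") for c in muji.findall(f"{{{MUJI_NS}}}content")]


def drop_content(presence: ET.Element, name: str) -> bool:
    """Remove the content called `name` in place; return True if it was found."""
    muji = presence.find(f"{{{MUJI_NS}}}muji")
    if muji is None:
        return False
    removed = False
    # findall() returns a list, so removing children while looping is safe.
    for content in muji.findall(f"{{{MUJI_NS}}}content"):
        if content.get("name") == name:
            muji.remove(content)
            removed = True
    return removed
```

A client would send the modified presence to the room before touching its Jingle sessions, matching the order given above.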
When scaling to conferences with a large number of participants, it is no longer viable for all participants to have direct connections to each other. On connections where upstream bandwidth is the limiting factor, an RTP relay can be used: it relays the client's stream to multiple participants on behalf of the client and forwards the streams of other participants back to the client. If the limiting factor is either CPU or downstream bandwidth, a mixer can be used: it receives the media streams from other participants and mixes them on behalf of the client, so that the client only has to receive and decode a single stream for each media type. On the sending side, a mixer acts like a relay and relays the client's stream to all other participants. Both of these services can be provided either by dedicated services or by other clients.
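The trade-off can be made concrete by counting streams per client. The Python sketch below assumes every one of the n participants provides exactly one stream for a single content; the function name `stream_counts` and the topology labels are illustrative only.

```python
from typing import Tuple


def stream_counts(n: int, topology: str) -> Tuple[int, int]:
    """Return (upstream, downstream) streams one client handles for one content.

    Assumes n participants that each provide one stream for the content.
    """
    if n < 2:
        raise ValueError("a conference needs at least two participants")
    peers = n - 1
    if topology == "mesh":
        # Direct connections: send to and receive from every other participant.
        return (peers, peers)
    if topology == "relay":
        # The relay fans out our single stream and forwards all others back.
        return (1, peers)
    if topology == "mixer":
        # The mixer relays our stream and returns one mixed stream.
        return (1, 1)
    raise ValueError(f"unknown topology: {topology}")
```

For ten participants this gives nine streams in each direction with direct connections, one up and nine down with a relay, and one in each direction with a mixer.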
Series: XEP
Number: 0272
Publisher: XMPP Standards Foundation
Status: Deferred
Type: Standards Track
Version: 0.1.1
Last Updated: 2018-11-03
Approving Body: XMPP Council
Dependencies: XMPP Core, XEP-0045, XEP-0166
Supersedes: None
Superseded By: None
Short Name: muji
Sjoerd Simons
Email: sjoerd.simons@collabora.co.uk
JabberID: sjoerd.simons@collabora.co.uk
Dafydd Harries
Email: dafydd.harries@collabora.co.uk
JabberID: dafydd.harries@collabora.co.uk
The Extensible Messaging and Presence Protocol (XMPP) is defined in the XMPP Core (RFC 6120) and XMPP IM (RFC 6121) specifications contributed by the XMPP Standards Foundation to the Internet Standards Process, which is managed by the Internet Engineering Task Force in accordance with RFC 2026. Any protocol defined in this document has been developed outside the Internet Standards Process and is to be understood as an extension to XMPP rather than as an evolution, development, or modification of XMPP itself.
The primary venue for discussion of XMPP Extension Protocols is the <standards@xmpp.org> discussion list.
Discussion on other xmpp.org discussion lists might also be appropriate; see <http://xmpp.org/about/discuss.shtml> for a complete list.
Errata can be sent to <editor@xmpp.org>.
The following requirements keywords as used in this document are to be interpreted as described in RFC 2119: "MUST", "SHALL", "REQUIRED"; "MUST NOT", "SHALL NOT"; "SHOULD", "RECOMMENDED"; "SHOULD NOT", "NOT RECOMMENDED"; "MAY", "OPTIONAL".
1. XEP-0166: Jingle <https://xmpp.org/extensions/xep-0166.html>.
2. XEP-0045: Multi-User Chat <https://xmpp.org/extensions/xep-0045.html>.
Note: Older versions of this specification might be available at http://xmpp.org/extensions/attic/
Initial published version as accepted for publication by the XMPP Council. (psa)
Second rough draft. (sjoerd)