XEP-xxxx: Jingle Remote Control

Abstract
This specification defines a way to remotely control a device using local peripheral inputs.
Author
Jérôme Poisson
Copyright
© 2024 – 2024 XMPP Standards Foundation. SEE LEGAL NOTICES.
Status

ProtoXEP

WARNING: This document has not yet been accepted for consideration or approved in any official manner by the XMPP Standards Foundation, and this document is not yet an XMPP Extension Protocol (XEP). If this document is accepted as a XEP by the XMPP Council, it will be published at <https://xmpp.org/extensions/> and announced on the <standards@xmpp.org> mailing list.
Type
Standards Track
Version
0.0.1 (2024-05-11)
Document Lifecycle
  1. Experimental
  2. Proposed
  3. Stable
  4. Final

1. Introduction

Thanks to Jingle (XEP-0166) [1], Jingle RTP Sessions (XEP-0167) [2], and associated XEPs, it is now possible to have video calls between devices through XMPP. Several XMPP clients also support desktop sharing, i.e., using screen capture of the current desktop session or of a specific application window instead of the usual webcam stream.

With this in place, we are one step away from having a remote desktop feature: if one "controlling" peer sends data from input peripherals to another "controlled" device and this other device uses them as input, we have a remote control system. This specification proposes ways to transmit input data to achieve remote control, with or without attached video call sessions. It can be used to remotely control a whole desktop or specific applications, or to provide or simulate input devices for a device that may lack them (e.g., single-board computers or IoT devices). It also establishes a framework that can be extended to share other data (e.g., clipboard content).

2. Requirements

The design goals of this XEP are:

3. Glossary

4. Overview

Remote control works by defining a new Jingle application, with the namespace 'urn:xmpp:jingle:apps:remote-control:0'. Once permission is granted, a streaming transport is established between the "controlling" and "controlled" devices. Input peripheral data are sent using the CBOR serialization format (RFC 8949 [3]). The controlled device receives these data, decodes them, and uses them to simulate inputs.

Audio or video streams can optionally be used in the same session to transmit the audio and/or video content of the controlled application. Audio/video streams can also be used for communication between the controlling entity and the user of the controlled device; this can be useful for explaining what is currently being done, for teaching, and so on.

5. Jingle Conformance

In accordance with Section 12 of XEP-0166, this document specifies the following information related to the Remote Control application type:

  1. The application format negotiation process is defined in the Negotiating a Remote Control Session section of this document.

  2. The semantics of the <description/> element are defined in the Application Format section of this document.

  3. There is no mapping of Remote Control semantics to the Session Description Protocol.

  4. A Remote Control session MUST use a streaming transport method, not a datagram transport method.

    Although in theory any streaming transport could be used, Signaling WebRTC datachannels in Jingle (XEP-0343) [4] SHOULD be used as the preferred transport method, given that Remote Control is often used in conjunction with Jingle RTP Sessions (XEP-0167) [2] and should be accessible to web clients.

  5. Transport components are not used in Remote Control.

  6. Sending and receiving of content is described in the Exchanging Input Data section of this document.

6. Negotiating a Remote Control Session

A Remote Control session is requested when one of the Jingle <content/> elements has a child <description/> element qualified by the 'urn:xmpp:jingle:apps:remote-control:0' namespace. This content MAY be the only content of the session, or it MAY be associated with Jingle RTP Sessions (XEP-0167) [2] media contents.
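The rule above can be illustrated with a minimal Python sketch using the standard library's ElementTree. The helper names (build_remote_control_content, is_remote_control_content) are hypothetical and not part of this specification; the sketch only builds the <content/> and <description/> elements and deliberately omits the <transport/> child that a real stanza needs.

```python
import xml.etree.ElementTree as ET

JINGLE_NS = "urn:xmpp:jingle:1"
RC_NS = "urn:xmpp:jingle:apps:remote-control:0"


def build_remote_control_content(name: str, devices: list[str]) -> ET.Element:
    """Build a Jingle <content/> carrying a Remote Control <description/>.

    Hypothetical helper: the <transport/> child is intentionally left out.
    """
    content = ET.Element(
        f"{{{JINGLE_NS}}}content",
        {"creator": "initiator", "name": name, "senders": "initiator"},
    )
    description = ET.SubElement(content, f"{{{RC_NS}}}description")
    for device_type in devices:
        # One <device/> per input device the controlling entity requests.
        ET.SubElement(description, f"{{{RC_NS}}}device", {"type": device_type})
    return content


def is_remote_control_content(content: ET.Element) -> bool:
    """Return True if <content/> has a <description/> in the Remote Control namespace."""
    return content.find(f"{{{RC_NS}}}description") is not None
```

A receiving client could run a check like is_remote_control_content on each <content/> of an incoming session-initiate or content-add to decide whether remote control is being requested.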

If a Remote Control session content is associated with Jingle RTP Sessions (XEP-0167) [2] and if WebRTC is used as transport (via Jingle ICE-UDP Transport Method (XEP-0176) [5] and Signaling WebRTC datachannels in Jingle (XEP-0343) [4]), the same WebRTC session MUST be used for all contents.

A Remote Control content MAY be added to an existing Jingle RTP Sessions (XEP-0167) [2] session by using a Jingle content-add as specified in XEP-0166's "content-add" section. If this new content appears and uses WebRTC for transport, and a WebRTC session is already used for Jingle RTP Sessions (XEP-0167) [2], the same WebRTC session MUST be used and a data-channel MUST be created inside, using the content's name as the Data Channel label. Permission SHOULD be requested from the user to allow remote control, before accepting the <content/>.

As a rule of thumb, Remote Control sessions, in particular in the remote desktop use case, are generally associated with a single unidirectional video stream (with the <content/> 'senders' attribute set to "responder"). This stream MUST carry the content of the controlled application. An additional unidirectional audio stream may also be used to transmit the sounds emitted by the controlled application. If those streams are bidirectional (i.e., with the 'senders' attribute not set, or set to "both"), the content emitted by the controlling device SHOULD be similar to a normal A/V call, i.e., webcam and/or desktop sharing with sound from the microphone. This is notably useful if the Remote Control feature is used for teaching, or for explaining what is done on the controlled application. Additional video and/or audio streams may be used to communicate with the user of the controlled device, resulting in up to 2 video and 2 audio streams. In this case, the first video and audio streams MUST be used for controlled application content, and optionally controlling entity video, while the last 2 MUST have the 'senders' attribute set to "responder" and be used for controlled device user feedback.

The following table summarizes the video stream usage in Remote Control sessions.

Table 1: Video Stream Usage in Remote Control Sessions

Number of Streams | Use Case | Senders Attribute | Remarks
0 | Remote Control without A/V feedback | - | No A/V streams, inputs are simply sent to the controlled device.
1 video (optional 1 audio) | Simple controlled application video feedback | senders="responder" | The first video stream is mandatory if Jingle RTP Sessions (XEP-0167) [2] is used.
1 video, 1 audio | Controlled application feedback + controlling entity video and audio | senders="both" | -
2 video, 2 audio | Controlled application feedback + bidirectional communication | senders="both" (first 2 streams), senders="responder" (last 2 streams) | First 2 streams: controlled application content and controlling entity video. Last 2 streams: controlled device user feedback.

6.1 Negotiation example

This is a negotiation example: Juliet wants to control Romeo's device to help him with new software. After she uses her client to initiate a Remote Control session, her client sends the Jingle initiation stanza:

Example 1. Juliet's Client Sends session-initiate
<iq id="IQ_1" type="set" from="juliet@capulet.lit/balcony" to="romeo@montague.lit/orchard">
  <jingle xmlns="urn:xmpp:jingle:1" sid="42d6beee-b51d-4a4b-8333-405051a33a10" action="session-initiate" initiator="juliet@capulet.lit/balcony">
    <content creator="initiator" name="1" senders="initiator">
      <description xmlns="urn:xmpp:jingle:apps:remote-control:0">
        <device type="keyboard"/>
        <device type="mouse"/>
        <device type="touch"/>
        <device type="wheel"/>
      </description>
      <transport xmlns="urn:xmpp:jingle:transports:webrtc-datachannel:1" sctp-port="5000" max-message-size="1073741823">
        <!-- XEP-0343 payload -->
      </transport>
    </content>
    <content creator="initiator" name="0" senders="responder">
      <description xmlns="urn:xmpp:jingle:apps:rtp:1" media="video">
        <!-- XEP-0167 payload -->
      </description>
      <transport xmlns="urn:xmpp:jingle:transports:ice-udp:1" ufrag="f0a1620c" pwd="6fb807d5f37ca6f7f248dae57fe3da02">
        <!-- XEP-0176 payload -->
      </transport>
    </content>
  </jingle>
</iq>
Example 2. Romeo's Client Sends Acknowledgement.
<iq to="juliet@capulet.lit/balcony" from="romeo@montague.lit/orchard" id="IQ_1" type="result"/>

Romeo accepts the session, but his device doesn't handle touch inputs, so his client accepts every device but the "touch" one (<device type="touch"/> is missing from its response):

Example 3. Romeo's Client Sends session-accept
<iq id="IQ_2" type="set" from="romeo@montague.lit/orchard" to="juliet@capulet.lit/balcony">
  <jingle xmlns="urn:xmpp:jingle:1" sid="42d6beee-b51d-4a4b-8333-405051a33a10" action="session-accept" responder="romeo@montague.lit/orchard">
    <content creator="initiator" name="1">
      <description xmlns="urn:xmpp:jingle:apps:remote-control:0">
        <device type="keyboard"/>
        <device type="mouse"/>
        <device type="wheel"/>
      </description>
      <transport xmlns="urn:xmpp:jingle:transports:webrtc-datachannel:1" sctp-port="5000" max-message-size="1073741823">
        <!-- XEP-0343 payload -->
      </transport>
    </content>
    <content creator="initiator" name="0">
      <description xmlns="urn:xmpp:jingle:apps:rtp:1" media="video">
        <!-- XEP-0167 payload -->
      </description>
      <transport xmlns="urn:xmpp:jingle:transports:ice-udp:1" ufrag="UozLzExwo0c2lw2yfCm2CSX0RgPeocCT" pwd="RDMWbP4Rh/nFU54Q5UfuFR27oUQ1oALR">
        <!-- XEP-0176 payload -->
      </transport>
    </content>
  </jingle>
</iq>
Example 4. Juliet's Client Sends Acknowledgement.
<iq to="romeo@montague.lit/orchard" from="juliet@capulet.lit/balcony" id="IQ_2" type="result"/>

From this point the WebRTC session is established and a Data Channel is opened. Input events are then sent on the wire using CBOR as explained below.

7. Application Format

The <description/> element contains 0, 1, or more <device> child elements, indicating which devices the controlling entity wishes to access.

When the responder is sending back a Remote Control <description/>, it MUST add a <device> element for each device that has been accepted (i.e., that the controlling device may control). The responder MUST NOT add a <device> element for a specific device if authorization has not been granted to control it. That means that the initiator's list of <device> elements MAY differ from the responder's (if the responder is not willing to let the controller control all devices, or if a specific device is not present on the controlled device).

If no <device> element is specified, the session is a simple screen-sharing session. Note that in this case, at least one Jingle RTP Sessions (XEP-0167) [2] <content/> element MUST be present; otherwise, the Jingle session MUST be terminated.
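The two rules above can be sketched as follows. This is a minimal illustration, not a normative algorithm: the function names and the idea of representing "supported" and "authorized" devices as Python sets are assumptions made for the example.

```python
def accepted_devices(requested: list[str], supported: set[str], authorized: set[str]) -> list[str]:
    """Return the device types the responder may echo back in its <description/>.

    A device is kept only if the controlled device has it ("supported") and
    the user granted permission for it ("authorized"). The order of the
    initiator's request is preserved.
    """
    return [d for d in requested if d in supported and d in authorized]


def must_terminate(accepted: list[str], has_rtp_content: bool) -> bool:
    """A <description/> without any <device/> is only valid next to RTP content."""
    return not accepted and not has_rtp_content
```

With the request from Example 1 and a controlled device without touch input, accepted_devices returns the keyboard, mouse and wheel devices, which matches the response shown in Example 3.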

<device> elements may have child elements specific to this device type and how it should be managed. Those elements may be specified by future specifications extending this one.

Note: the Remote Control content may be unidirectional or bidirectional. All devices described here are unidirectional, meaning that the 'senders' attribute of the Remote Control <content/> SHOULD be set to "initiator". However, future specifications may add devices that communicate in both directions (e.g., for haptic feedback, clipboard transmission, event pulling, LED feedback, etc.). In this case, the 'senders' attribute will be unset or set to "both".

A <device> element has a mandatory 'type' attribute, whose value is the kind of device that is requested by the controlling entity.

The current specification defines 4 <device> elements, described below.

7.1 mouse device

<device type="mouse"/> is used when the controlling entity wants to control the mouse pointer.

7.2 wheel device

<device type="wheel"/> is used when the controlling entity wants to send wheel events (e.g. for scrolling). Most of the time, this is a mouse wheel, but it can be an equivalent device not related to the mouse.

Note: when the "mouse" device is accepted, the responder SHOULD also accept the "wheel" device. The wheel has been made a separate device only to handle the case of independent wheel devices.

7.3 touch device

<device type="touch"/> is used when the controlling entity wants to send touch events (e.g. screen taps, gestures).

Note: the "touch" device MUST NOT be requested or accepted if no Jingle RTP Sessions (XEP-0167) [2] video stream is present in the session.

7.4 keyboard device

<device type="keyboard"/> is used when the controlling entity wants to send keyboard events.

8. Exchanging Input Data

Event data are transmitted on the wire using CBOR. The reason is that input data must be sent as fast and as efficiently as possible, and CBOR is a standard, extensible, and efficient binary format well suited for this task. Furthermore, it is easily mapped to JSON, which makes it ideal to work with Web APIs and other APIs using JSON. Each event is encoded as a map, resulting in a stream of CBOR map objects on the wire.

To make understanding of the format easier and this document more readable, JSON is used in examples of this specification, but data needs to be ultimately serialized to/from CBOR to go on the wire.
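To make the JSON-to-CBOR mapping concrete, here is a minimal, illustrative encoder written in Python for the handful of types used by the events in this document (maps, arrays, text strings, integers, doubles, booleans and null). It is a sketch under the assumption that only these types are needed; the function names are hypothetical, and a real implementation would more likely rely on a maintained CBOR library.

```python
import struct


def _head(major: int, value: int) -> bytes:
    """Encode a CBOR initial byte (major type) followed by its argument."""
    if value < 24:
        return bytes([(major << 5) | value])
    if value < 1 << 8:
        return bytes([(major << 5) | 24, value])
    if value < 1 << 16:
        return bytes([(major << 5) | 25]) + struct.pack(">H", value)
    if value < 1 << 32:
        return bytes([(major << 5) | 26]) + struct.pack(">I", value)
    return bytes([(major << 5) | 27]) + struct.pack(">Q", value)


def cbor_encode(obj) -> bytes:
    """Encode a JSON-like Python object to CBOR (RFC 8949), subset only."""
    if obj is None:
        return b"\xf6"
    if isinstance(obj, bool):  # checked before int: bool is a subclass of int
        return b"\xf5" if obj else b"\xf4"
    if isinstance(obj, int):
        # Major type 0 for unsigned, major type 1 for negative integers.
        return _head(0, obj) if obj >= 0 else _head(1, -1 - obj)
    if isinstance(obj, float):
        return b"\xfb" + struct.pack(">d", obj)  # always a 64-bit double
    if isinstance(obj, str):
        raw = obj.encode("utf-8")
        return _head(3, len(raw)) + raw
    if isinstance(obj, (list, tuple)):
        return _head(4, len(obj)) + b"".join(cbor_encode(item) for item in obj)
    if isinstance(obj, dict):
        out = _head(5, len(obj))
        for key, value in obj.items():  # keys are emitted in insertion order
            out += cbor_encode(key) + cbor_encode(value)
        return out
    raise TypeError(f"unsupported type: {type(obj).__name__}")
```

Each event map encoded this way can be written to the data channel one after another, giving the stream of CBOR maps described above.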

The input event data format here is inspired by those found in Web APIs. This makes events straightforward to use in web-based applications, but also easy to adapt to underlying platforms, as the Web APIs are abstract and designed to work anywhere.

The general format of an input event looks like this:

    {
        "type": "keydown",
        "device_id": "device_123",
        "timestamp": 1712678325.0,
        "key": "A"
    }
  

The first three keys ("type", "device_id", and "timestamp") are common to all device events. Other keys are device-specific.

The common keys are specified as follows:

type

Type of event (e.g., "mousemove", "keydown").
Field type: string
This key is REQUIRED.

device_id

Unique identifier of this device. If not specified, the receiver MUST assume that there is a single device of this kind (e.g., single mouse, single keyboard, single touch screen).
Field type: string
This key is OPTIONAL.

timestamp

Unix timestamp (time since Unix Epoch) of when the event occurred.
Field type: double
This key is REQUIRED.
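A receiver may want to check the common keys before dispatching an event. The following Python sketch shows one possible check on an already decoded event map; the function name is hypothetical, and requiring a Python float for "timestamp" is a strict reading of "Field type: double" chosen for this example.

```python
def check_common_keys(event: dict) -> list[str]:
    """Return a list of problems found in the common keys (empty if none)."""
    problems = []
    if not isinstance(event.get("type"), str):
        problems.append('"type" is REQUIRED and must be a string')
    # "device_id" is OPTIONAL: only its type is checked when it is present.
    if "device_id" in event and not isinstance(event["device_id"], str):
        problems.append('"device_id" must be a string when present')
    if not isinstance(event.get("timestamp"), float):
        problems.append('"timestamp" is REQUIRED and must be a double')
    return problems
```

An event without "device_id" passes this check; the receiver then assumes there is a single device of that kind.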

9. Device Events

Below is the description of the events for the 4 types of devices specified here. Future specifications may add new types of events.

Note that a controlling device doesn't need to have the device in question to send those events: for instance, a touch device doesn't have to send touch events, and can instead send mouse events to simulate a mouse, or an automation mechanism can simulate input devices.

9.1 Mouse Events

Several events are used to describe mouse movement and buttons being pressed or released. They are inspired by DOM Events and have similar fields.

9.1.1 common keys

The following keys are common to all mouse events.

Note: if a Jingle RTP Sessions (XEP-0167) [2] video stream is attached to the session, the "x" and "y" keys are relative to the video stream (x=0 and y=0 means the upper left corner of the video stream). If no video stream is attached, the "x" and "y" keys MUST NOT be sent; "movementX" and "movementY" MUST be used instead. These two pairs are mutually exclusive: an event carries either "x" and "y" or "movementX" and "movementY", and the controlled device MUST handle both cases.

x

The X coordinate of the mouse pointer for the event, relative to the video stream.
Field type: double
This key is REQUIRED if "movementX" and "movementY" are not sent, and MUST NOT be present if there is no Jingle RTP Sessions (XEP-0167) [2] video stream.

y

The Y coordinate of the mouse pointer for the event, relative to the video stream.
Field type: double
This key is REQUIRED if "movementX" and "movementY" are not sent, and MUST NOT be present if there is no Jingle RTP Sessions (XEP-0167) [2] video stream.

movementX

Relative X coordinate (difference between the X coordinate of this event and that of the previous mouse event).
Field type: double
This key is REQUIRED if "x" and "y" are not sent, and MUST NOT be present otherwise.

movementY

Relative Y coordinate (difference between the Y coordinate of this event and that of the previous mouse event).
Field type: double
This key is REQUIRED if "x" and "y" are not sent, and MUST NOT be present otherwise.
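The mutual exclusion between absolute and relative coordinates can be checked on the receiving side with a sketch like the one below. The function name and the has_video parameter (whether a Jingle RTP video stream is attached to the session) are assumptions made for this example.

```python
def mouse_position_mode(event: dict, has_video: bool) -> str:
    """Return "absolute" or "relative" for a mouse event, or raise ValueError."""
    has_abs = "x" in event or "y" in event
    has_rel = "movementX" in event or "movementY" in event
    if has_abs and has_rel:
        raise ValueError("x/y and movementX/movementY are mutually exclusive")
    if has_abs:
        if not has_video:
            raise ValueError("x/y MUST NOT be sent without a video stream")
        if not ("x" in event and "y" in event):
            raise ValueError('both "x" and "y" are required')
        return "absolute"
    if has_rel:
        if not ("movementX" in event and "movementY" in event):
            raise ValueError('both "movementX" and "movementY" are required')
        return "relative"
    raise ValueError("mouse event carries no position keys")
```

The returned mode tells the controlled device whether to place the pointer at a position scaled from the video stream or to move it by an offset.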

9.1.2 mousedown

The "mousedown" event means that one or more mouse buttons have been pressed.

The only key used in addition to common ones is the "buttons" key:

buttons

Indicates which buttons have been pressed. This uses the same values as the equivalent DOM MouseEvent "buttons" [6], except for value "0", which is not used here. Possible values are:

  • 1: Primary button (usually the left button)
  • 2: Secondary button (usually the right button)
  • 4: Auxiliary button (usually the mouse wheel button or middle button)
  • 8: 4th button (typically the "Browser Back" button)
  • 16: 5th button (typically the "Browser Forward" button)

Field type: int
This key is REQUIRED.
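Since the values above are powers of two, they can be combined as in the DOM, where "buttons" is a bitmask. Assuming the same holds here, a receiver could decode the value with a sketch such as the following (the class and function names are hypothetical):

```python
from enum import IntFlag


class MouseButtons(IntFlag):
    PRIMARY = 1
    SECONDARY = 2
    AUXILIARY = 4
    FOURTH = 8
    FIFTH = 16


def decode_buttons(value: int) -> list[str]:
    """Return the names of the buttons set in a "buttons" value.

    The value 0 is rejected because it is not used in this specification.
    """
    if not 1 <= value <= 31:
        raise ValueError(f"invalid buttons value: {value}")
    return [button.name for button in MouseButtons if value & button]
```

For instance, a value of 5 would then describe the primary and auxiliary buttons together.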

9.1.3 mouseup

The "mouseup" event means that one or more mouse buttons have been released.

The only key used in addition to common ones is the "buttons" key. Its definition is the same as for the mousedown event.

9.1.4 mousemove

The "mousemove" event means that the mouse pointer has been moved. It does not contain any custom keys in addition to common ones.

9.2 Wheel Events

One event is used to describe wheel movement along all 3 axes (X, Y and Z).

9.2.1 wheel

The "wheel" event indicates that the wheel has been moved along one or more of the 3 axes.

Note that even if all keys are OPTIONAL, at least one "delta*" key MUST be set.

The following keys are used in addition to the common ones:

deltaX

Horizontal scroll amount.
Field type: double
This key is OPTIONAL and defaults to 0.

deltaY

Vertical scroll amount.
Field type: double
This key is OPTIONAL and defaults to 0.

deltaZ

Scroll amount for the Z axis.
Field type: double
This key is OPTIONAL and defaults to 0.

deltaMode

Indicates the unit of delta* scroll values. This uses the same values as the equivalent DOM WheelEvent "deltaMode" [7]. Possible values are:

  • 0: The delta values are specified in pixels.
  • 1: The delta values are specified in lines.
  • 2: The delta values are specified in pages.

Field type: int
This key is OPTIONAL and defaults to 0.
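The defaults and the "at least one delta" rule can be applied on reception with a small normalization step. The sketch below is one possible way to do it; the function name is hypothetical.

```python
def normalize_wheel_event(event: dict) -> dict:
    """Return a copy of a "wheel" event with the default values filled in."""
    deltas = ("deltaX", "deltaY", "deltaZ")
    if not any(key in event for key in deltas):
        raise ValueError('at least one "delta*" key MUST be set')
    normalized = dict(event)  # do not modify the caller's event
    for key in deltas:
        normalized.setdefault(key, 0.0)
    normalized.setdefault("deltaMode", 0)
    if normalized["deltaMode"] not in (0, 1, 2):
        raise ValueError(f'invalid "deltaMode": {normalized["deltaMode"]}')
    return normalized
```

After normalization, the code that simulates scrolling can read all three deltas and the unit without checking for missing keys.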

9.3 Touch Events

Three events are used to describe touches. They all have the same single "touches" key, which is a (potentially empty) array of "touch" objects, as described below.

9.3.1 common key

All touch events have a "touches" key which is an array of "touch" maps. They represent the touch objects that are currently in contact with the surface. That means that even for the "touchend" event, it's the list of touch objects still in contact, not the removed ones.

The keys usable in a touch map are specified below:

identifier

A unique identifier for this touch object. This uses the same values as the equivalent DOM Touch "identifier" [8]. A given touch point will have the same identifier for the duration of its movement around the surface.
Field type: string
This key is REQUIRED.

x

The X coordinate of the touch point for the event, relative to the video stream.
Field type: double
This key is REQUIRED.

y

The Y coordinate of the touch point for the event, relative to the video stream.
Field type: double
This key is REQUIRED.

radiusX

This uses the same values as the equivalent DOM Touch "radiusX" [8]. Returns the X radius of the ellipse that most closely circumscribes the area of contact with the screen. The value is in pixels of the same scale as screenX.
Field type: float
This key is OPTIONAL.

radiusY

This uses the same values as the equivalent DOM Touch "radiusY" [8]. Returns the Y radius of the ellipse that most closely circumscribes the area of contact with the screen. The value is in pixels of the same scale as screenY.
Field type: float
This key is OPTIONAL.

rotationAngle

This uses the same values as the equivalent DOM Touch "rotationAngle" [8]. Returns the angle (in degrees) that the ellipse described by radiusX and radiusY must be rotated, clockwise, to most accurately cover the area of contact between the user and the surface.
Field type: float
This key is OPTIONAL.

force

This uses the same values as the equivalent DOM Touch "force" [8]. Returns the amount of pressure being applied to the surface by the user, as a float between 0.0 (no pressure) and 1.0 (maximum pressure).
Field type: float
This key is OPTIONAL.

9.3.2 touchstart

One or more touch objects are placed on the surface.

9.3.3 touchend

One or more touch objects are removed from the surface.

9.3.4 touchmove

One or more touch objects are moved along the surface.
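Because "touches" always lists the touch objects still in contact, a receiver that needs to know which touches ended or started has to compare the list with the one from the previous touch event. A minimal sketch of that comparison, with hypothetical function names, could look like this:

```python
def ended_touches(previous: list[dict], current: list[dict]) -> set[str]:
    """Identifiers present in the previous "touches" list but not in the current one."""
    return {t["identifier"] for t in previous} - {t["identifier"] for t in current}


def started_touches(previous: list[dict], current: list[dict]) -> set[str]:
    """Identifiers that appear in the current "touches" list for the first time."""
    return {t["identifier"] for t in current} - {t["identifier"] for t in previous}
```

For a "touchend" event whose "touches" array is empty, ended_touches returns every identifier that was in contact before the event.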

9.4 Keyboard Events

There are two keyboard events, which use the same keys.

9.4.1 common key

The following keys are common to all keyboard events:

key

A string representing the key value. This uses DOM key values and is the same as the equivalent DOM KeyboardEvent "key" [9]. The possible values are listed at https://developer.mozilla.org/en-US/docs/Web/API/UI_Events/Keyboard_event_key_values. The W3C Keyboard Event Viewer can also be useful to check key values while working with this specification.
Field type: string
This key is REQUIRED.

location

A number representing the location of the key on the keyboard. This uses the same values as the equivalent DOM KeyboardEvent "location" [10], except for value "0", which is not used here. Possible values are (the value descriptions are excerpted from Mozilla Developer Network [11]):

  • 1: The key was the left-hand version of the key; for example, the left-hand Control key was pressed on a standard 101 key US keyboard. This value is only used for keys that have more than one possible location on the keyboard.
  • 2: The key was the right-hand version of the key; for example, the right-hand Control key is pressed on a standard 101 key US keyboard. This value is only used for keys that have more than one possible location on the keyboard.
  • 3: The key was on the numeric keypad, or has a virtual key code that corresponds to the numeric keypad.
  • 4: The key was on a mobile device; this can be on either a physical keypad or a virtual keyboard.
  • 5: The key was a button on a game controller or a joystick on a mobile device.

Field type: int
This key is OPTIONAL.

9.4.2 keydown

A key is pressed.

9.4.3 keyup

A key is released.
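As an illustration of the keyboard event format, the following Python sketch builds a "keydown" or "keyup" event map ready to be serialized to CBOR. The helper name and its parameters are hypothetical; the timestamp is passed in explicitly so the example stays deterministic.

```python
VALID_LOCATIONS = {1, 2, 3, 4, 5}  # 0 is not used in this specification


def make_key_event(event_type, key, timestamp, location=None, device_id=None):
    """Build a keyboard event map with the common and keyboard-specific keys."""
    if event_type not in ("keydown", "keyup"):
        raise ValueError(f"not a keyboard event type: {event_type!r}")
    if location is not None and location not in VALID_LOCATIONS:
        raise ValueError(f"invalid location: {location!r}")
    event = {"type": event_type, "timestamp": float(timestamp), "key": key}
    # OPTIONAL keys are only added when a value was provided.
    if device_id is not None:
        event["device_id"] = device_id
    if location is not None:
        event["location"] = location
    return event
```

A key press followed by its release is then simply two maps with the same "key" value, one of type "keydown" and one of type "keyup".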

9.5 Summary of Event Keys

This section summarizes the possible event key values for the devices specified here.

Table 2: Summary of Events and Keys

Key | Device | Event Types | Description | Requirement | Remarks | Data Type
buttons | Mouse | mousedown, mouseup | Indicates which buttons have been pressed or released | Required | - | int
deltaMode | Wheel | wheel | Indicates the unit of delta* scroll values | Optional | Defaults to 0 | int
deltaX | Wheel | wheel | Horizontal scroll amount | Optional | Defaults to 0. At least one "delta*" key must be set | double
deltaY | Wheel | wheel | Vertical scroll amount | Optional | Defaults to 0. At least one "delta*" key must be set | double
deltaZ | Wheel | wheel | Scroll amount for the Z axis | Optional | Defaults to 0. At least one "delta*" key must be set | double
key | Keyboard | keydown, keyup | A string representing the key value. Uses DOM key values | Required | - | string
location | Keyboard | keydown, keyup | A number representing the location of the key on the keyboard | Optional | - | int
movementX | Mouse | mousemove | Relative X coordinate (difference between the X coordinate of this event and the previous mouse event) | Required* | Only required if "x" and "y" are not sent | double
movementY | Mouse | mousemove | Relative Y coordinate (difference between the Y coordinate of this event and the previous mouse event) | Required* | Only required if "x" and "y" are not sent | double
touches | Touch | touchstart, touchend, touchmove | A list of touch objects | Required | - | List of touch objects
type | All | All events | Type of event (e.g., "mousemove", "keydown") | Required | - | string
device_id | All | All events | Unique identifier of this device | Optional | Not needed when there is a single device of this type | string
timestamp | All | All events | Unix timestamp (time since Unix Epoch) of when the event occurred | Required | - | double
x | Mouse, Touch | mousedown, mouseup, mousemove, touchstart, touchend, touchmove | X coordinate of the mouse pointer or touch point, relative to video stream | Required* | Only required if "movementX" and "movementY" are not sent and if there is a Jingle RTP Sessions (XEP-0167) [2] video stream | double
y | Mouse, Touch | mousedown, mouseup, mousemove, touchstart, touchend, touchmove | Y coordinate of the mouse pointer or touch point, relative to video stream | Required* | Only required if "movementX" and "movementY" are not sent and if there is a Jingle RTP Sessions (XEP-0167) [2] video stream | double

Table 3: Touch Object

Key | Description | Requirement | Remarks | Data Type
identifier | A unique identifier for this touch object | Required | - | string
force | Returns the amount of pressure being applied to the surface by the user, as a float between 0.0 (no pressure) and 1.0 (maximum pressure) | Optional | - | float
radiusX | Returns the X radius of the ellipse that most closely circumscribes the area of contact with the screen | Optional | - | float
radiusY | Returns the Y radius of the ellipse that most closely circumscribes the area of contact with the screen | Optional | - | float
rotationAngle | Returns the angle (in degrees) that the ellipse must be rotated, clockwise, to most accurately cover the area of contact between the user and the surface | Optional | - | float
x | X coordinate of the touch object | Required | Related to the video stream (if present) | double
y | Y coordinate of the touch object | Required | Related to the video stream (if present) | double

10. Extending this Specification

It is expected that future specifications will extend this one. To make an extension, we recommend following these rules:

11. Business Rules

A controlled device MAY send data back to the controlling device, for instance, to send haptic feedback such as vibration or force feedback. All devices in this specification are unidirectional, but future specifications may add bidirectional ones. If any bidirectional device is requested, the 'senders' attribute of the Remote Control content MUST be unset or set to "both".

It is up to the controlling entity to optimize or not the data sent with whatever appropriate algorithm. For instance, if a mouse is moved twice without a button or keyboard being pressed in-between, the controlling entity could send a single mouse event that is a combination of the two moves.
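The mouse example above can be sketched as follows. This is only one possible optimization, under the assumption that consecutive "mousemove" events from the same device can be merged by summing relative movements, or by keeping the latest position when absolute coordinates are used; the function names are hypothetical.

```python
def _mode(event: dict) -> str:
    """Tell whether a mouse event uses relative or absolute coordinates."""
    return "relative" if "movementX" in event else "absolute"


def coalesce_mousemoves(events: list[dict]) -> list[dict]:
    """Merge runs of consecutive "mousemove" events into a single event each."""
    out: list[dict] = []
    for event in events:
        last = out[-1] if out else None
        mergeable = (
            last is not None
            and last["type"] == "mousemove"
            and event["type"] == "mousemove"
            and last.get("device_id") == event.get("device_id")
            and _mode(last) == _mode(event)
        )
        if not mergeable:
            out.append(dict(event))
            continue
        merged = dict(event)  # keeps the latest timestamp and absolute x/y
        if _mode(event) == "relative":
            merged["movementX"] = last["movementX"] + event["movementX"]
            merged["movementY"] = last["movementY"] + event["movementY"]
        out[-1] = merged
    return out
```

Any other event between two moves (a button or key event, for instance) ends the run, so the order of events seen by the controlled device is preserved.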

If a Remote Control <content/> is used with Jingle RTP Sessions (XEP-0167) [2] <content/>, the session is a Remote Control session, and not a video call; it SHOULD be presented as a Remote Control session to the user, and the media streams are used as feedback and/or support as described above.

12. Discovering Support

If a client supports the protocol specified in this XEP, it MUST advertise it by including the "urn:xmpp:jingle:apps:remote-control:0" discovery feature in response to a Service Discovery (XEP-0030) [12] information request.

Example 5. Service Discovery information request
<iq type='get'
    from='juliet@example.org/balcony'
    to='romeo@example.org/orchard'
    id='disco1'>
  <query xmlns='http://jabber.org/protocol/disco#info'/>
</iq>
Example 6. Service Discovery information response
<iq type='result'
    from='romeo@example.org/orchard'
    to='juliet@example.org/balcony'
    id='disco1'>
  <query xmlns='http://jabber.org/protocol/disco#info'>
    ...
    <feature var='urn:xmpp:jingle:apps:remote-control:0'/>
    ...
  </query>
</iq>
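On the requesting side, support can be detected by looking for the feature in the disco#info result. The following Python sketch parses a result stanza given as a string with the standard library's ElementTree; the function name is hypothetical, and a real client would normally use the service discovery facilities of its XMPP library instead.

```python
import xml.etree.ElementTree as ET

DISCO_INFO_NS = "http://jabber.org/protocol/disco#info"
RC_NS = "urn:xmpp:jingle:apps:remote-control:0"


def supports_remote_control(iq_xml: str) -> bool:
    """Return True if a disco#info result advertises the Remote Control feature."""
    iq = ET.fromstring(iq_xml)
    query = iq.find(f"{{{DISCO_INFO_NS}}}query")
    if iq.get("type") != "result" or query is None:
        return False
    features = query.findall(f"{{{DISCO_INFO_NS}}}feature")
    return any(feature.get("var") == RC_NS for feature in features)
```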

13. Security Considerations

Allowing an entity to remotely control a device grants a high and dangerous level of access, similar to having the person using the controlling device sitting in front of the controlled device.

Before starting the remote control session, the client SHOULD ask for permission from the user of the controlled device in a clear and understandable way, explaining in non-technical terms that the other person will fully control the inputs of the device. A client MAY allow for automatic permission (e.g., to control an IoT device or a work computer from home by well-known entities), but this must be done clearly or stated somewhere, and the controlling entity SHOULD be duly checked.

In addition, a controlled device SHOULD display an obvious, clear, and always visible indicator that the inputs are currently being remotely controlled, with an obvious and easily accessible button to stop remote control immediately.

14. IANA Considerations

TODO

15. XMPP Registrar Considerations

TODO

16. XML Schema

16.1 urn:xmpp:jingle:apps:remote-control:0

<?xml version='1.0' encoding='UTF-8'?>

<xs:schema
    xmlns:xs='http://www.w3.org/2001/XMLSchema'
    targetNamespace='urn:xmpp:jingle:apps:remote-control:0'
    xmlns='urn:xmpp:jingle:apps:remote-control:0'
    elementFormDefault='qualified'>

  <xs:element name='description'>
    <xs:complexType>
      <xs:sequence>
        <xs:element name='device' type='deviceElementType' minOccurs='0' maxOccurs='unbounded'/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

<xs:complexType name='deviceElementType'>
  <xs:sequence>
    <xs:any namespace='##other' processContents='lax' minOccurs='0' maxOccurs='unbounded'/>
  </xs:sequence>
  <xs:attribute name='type' use='required'>
    <xs:simpleType>
      <xs:restriction base='xs:NMTOKEN'/>
    </xs:simpleType>
  </xs:attribute>
</xs:complexType>

</xs:schema>

17. Acknowledgements

Thanks to the NLnet Foundation/NGI Assure for funding the work on this specification.


Appendices

Appendix A: Document Information

Series
XEP
Number
xxxx
Publisher
XMPP Standards Foundation
Status
ProtoXEP
Type
Standards Track
Version
0.0.1
Last Updated
2024-05-11
Approving Body
XMPP Council
Dependencies
XMPP Core, XEP-0001, XEP-0166, XEP-0343
Supersedes
None
Superseded By
None
Short Name
remote-control


Appendix B: Author Information

Jérôme Poisson
Email
goffi@goffi.org
JabberID
goffi@jabber.fr

Copyright

This XMPP Extension Protocol is copyright © 1999 – 2024 by the XMPP Standards Foundation (XSF).

Permissions

Permission is hereby granted, free of charge, to any person obtaining a copy of this specification (the "Specification"), to make use of the Specification without restriction, including without limitation the rights to implement the Specification in a software program, deploy the Specification in a network service, and copy, modify, merge, publish, translate, distribute, sublicense, or sell copies of the Specification, and to permit persons to whom the Specification is furnished to do so, subject to the condition that the foregoing copyright notice and this permission notice shall be included in all copies or substantial portions of the Specification. Unless separate permission is granted, modified works that are redistributed shall not contain misleading information regarding the authors, title, number, or publisher of the Specification, and shall not claim endorsement of the modified works by the authors, any organization or project to which the authors belong, or the XMPP Standards Foundation.

Disclaimer of Warranty

## NOTE WELL: This Specification is provided on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. ##

Limitation of Liability

In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall the XMPP Standards Foundation or any author of this Specification be liable for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising from, out of, or in connection with the Specification or the implementation, deployment, or other use of the Specification (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if the XMPP Standards Foundation or such author has been advised of the possibility of such damages.

IPR Conformance

This XMPP Extension Protocol has been contributed in full conformance with the XSF's Intellectual Property Rights Policy (a copy of which can be found at <https://xmpp.org/about/xsf/ipr-policy> or obtained by writing to XMPP Standards Foundation, P.O. Box 787, Parker, CO 80134 USA).

Visual Presentation

The HTML representation (you are looking at) is maintained by the XSF. It is based on the YAML CSS Framework, which is licensed under the terms of the CC-BY-SA 2.0 license.

Appendix D: Relation to XMPP

The Extensible Messaging and Presence Protocol (XMPP) is defined in the XMPP Core (RFC 6120) and XMPP IM (RFC 6121) specifications contributed by the XMPP Standards Foundation to the Internet Standards Process, which is managed by the Internet Engineering Task Force in accordance with RFC 2026. Any protocol defined in this document has been developed outside the Internet Standards Process and is to be understood as an extension to XMPP rather than as an evolution, development, or modification of XMPP itself.

Appendix E: Discussion Venue

The primary venue for discussion of XMPP Extension Protocols is the <standards@xmpp.org> discussion list.

Discussion on other xmpp.org discussion lists might also be appropriate; see <https://xmpp.org/community/> for a complete list.

Errata can be sent to <editor@xmpp.org>.

Appendix F: Requirements Conformance

The following requirements keywords as used in this document are to be interpreted as described in RFC 2119: "MUST", "SHALL", "REQUIRED"; "MUST NOT", "SHALL NOT"; "SHOULD", "RECOMMENDED"; "SHOULD NOT", "NOT RECOMMENDED"; "MAY", "OPTIONAL".

Appendix G: Notes

1. XEP-0166: Jingle <https://xmpp.org/extensions/xep-0166.html>.

2. XEP-0167: Jingle RTP Sessions <https://xmpp.org/extensions/xep-0167.html>.

3. RFC 8949: Concise Binary Object Representation (CBOR) <http://tools.ietf.org/html/rfc8949>.

4. XEP-0343: Signaling WebRTC datachannels in Jingle <https://xmpp.org/extensions/xep-0343.html>.

5. XEP-0176: Jingle ICE-UDP Transport Method <https://xmpp.org/extensions/xep-0176.html>.

6. MouseEvent: buttons property <https://developer.mozilla.org/en-US/docs/Web/API/MouseEvent/buttons>.

7. WheelEvent: deltaMode property <https://developer.mozilla.org/en-US/docs/Web/API/WheelEvent/deltaMode>.

8. Touch <https://developer.mozilla.org/en-US/docs/Web/API/Touch>.

9. KeyboardEvent: key property <https://developer.mozilla.org/en-US/docs/Web/API/KeyboardEvent/key>.

10. KeyboardEvent: location property <https://developer.mozilla.org/en-US/docs/Web/API/KeyboardEvent/location>.

11. KeyboardEvent: location property <https://developer.mozilla.org/en-US/docs/Web/API/KeyboardEvent/location>, available under the Creative Commons Attribution-ShareAlike license (CC-BY-SA), v2.5 or any later version, as specified at <https://developer.mozilla.org/en-US/docs/MDN/Writing_guidelines/Attrib_copyright_license>.

12. XEP-0030: Service Discovery <https://xmpp.org/extensions/xep-0030.html>.

Appendix H: Revision History

Note: Older versions of this specification might be available at https://xmpp.org/extensions/attic/

  1. Version 0.0.1 (2024-05-11)

    First draft.

    jp

Appendix I: Bib(La)TeX Entry

@report{poisson2024remote-control,
  title = {Jingle Remote Control},
  author = {Poisson, Jérôme},
  type = {XEP},
  number = {xxxx},
  version = {0.0.1},
  institution = {XMPP Standards Foundation},
  url = {https://xmpp.org/extensions/xep-xxxx.html},
  date = {2024-05-11/2024-05-11},
}

END