This document defines mechanisms for server-side archiving of XMPP messages.
WARNING: This Standards-Track JEP is Experimental. Publication as a Jabber Enhancement Proposal does not imply approval of this proposal by the Jabber Software Foundation. Implementation of the protocol described herein is encouraged in exploratory implementations, but production systems should not deploy implementations of this protocol until it advances to a status of Draft.
Status: Experimental
Type: Standards Track
Number: 0136
Version: 0.7
Last Updated: 2006-09-08
JIG: Standards JIG
Approving Body: Jabber Council
Dependencies: XMPP Core, XMPP IM, JEP-0030, JEP-0059, JEP-0155
Supersedes: None
Superseded By: None
Short Name: archive
Wiki Page: <http://wiki.jabber.org/index.php/Message Archiving (JEP-0136)>
Email: justin@affinix.com
JID: justin@andbit.net
Email: ian.paterson@clientside.co.uk
JID: ian@zoofy.com
Email: jonp@google.com
JID: jonp@google.com
Email: stpeter@jabber.org
JID: stpeter@jabber.org
This Jabber Enhancement Proposal is copyright 1999 - 2006 by the Jabber Software Foundation (JSF) and is in full conformance with the JSF's Intellectual Property Rights Policy <http://www.jabber.org/jsf/ipr-policy.shtml>. This material may be distributed only subject to the terms and conditions set forth in the Creative Commons Attribution License (<http://creativecommons.org/licenses/by/2.5/>).
The preferred venue for discussion of this document is the Standards-JIG discussion list: <http://mail.jabber.org/mailman/listinfo/standards-jig>.
The Extensible Messaging and Presence Protocol (XMPP) is defined in the XMPP Core (RFC 3920) and XMPP IM (RFC 3921) specifications contributed by the Jabber Software Foundation to the Internet Standards Process, which is managed by the Internet Engineering Task Force in accordance with RFC 2026. Any protocol defined in this JEP has been developed outside the Internet Standards Process and is to be understood as an extension to XMPP rather than as an evolution, development, or modification of XMPP itself.
The following keywords as used in this document are to be interpreted as described in RFC 2119: "MUST", "SHALL", "REQUIRED"; "MUST NOT", "SHALL NOT"; "SHOULD", "RECOMMENDED"; "SHOULD NOT", "NOT RECOMMENDED"; "MAY", "OPTIONAL".
Many XMPP clients implement some form of client-side message archiving. However, it is not always convenient or even possible to archive messages locally, e.g., because it is easier to keep all archives in one universally accessible place (not scattered around on multiple computers or devices) or because the client operates in a web browser or resides on a mobile device that does not have sufficient local storage for message archiving. In addition, server-side archiving makes it possible to offer new services such as integration of IM and email. Therefore it is beneficial to define methods for server-side archiving of XMPP messages.
There are two main approaches to this problem:
So that client and server developers can refer to one specification, both approaches are defined in this document. In addition, this document defines common methods for retrieving and managing archived messages.
Complying with XMPP Core, the server MUST respond to all <iq/> elements of type 'get' or 'set'. However, most successful responses have been omitted from this document in the interest of conciseness.
A client discovers whether its server supports this protocol using Service Discovery [1].
<iq type='get' to='montague.net'> <query xmlns='http://jabber.org/protocol/disco#info'/> </iq>
For each feature defined herein, if the server supports that feature it MUST return a <feature/> element with the 'var' attribute set to 'http://jabber.org/protocol/archive#name', where "name" is "auto" for the Automated Archiving feature, "encrypt" for the server-side encryption feature (see Automated Archiving), "manage" for the Archive Management feature, "manual" for the Manual Archiving feature, or "pref" for the Archiving Preferences feature.
<iq type='result' from='montague.net' to='romeo@montague.net/orchard'> <query xmlns='http://jabber.org/protocol/disco#info'> ... <feature var='http://jabber.org/protocol/archive#auto'/> <feature var='http://jabber.org/protocol/archive#encrypt'/> <feature var='http://jabber.org/protocol/archive#manage'/> <feature var='http://jabber.org/protocol/archive#manual'/> <feature var='http://jabber.org/protocol/archive#pref'/> ... </query> </iq>
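The feature check described above can be sketched in Python. This is a minimal illustration, not part of the protocol: the function name `archive_features` is hypothetical, and the stanza is assumed to arrive as a complete XML string.

```python
import xml.etree.ElementTree as ET

DISCO_INFO_NS = "http://jabber.org/protocol/disco#info"
ARCHIVE_FEATURE_PREFIX = "http://jabber.org/protocol/archive#"


def archive_features(iq_xml: str) -> set:
    """Return the short feature names ('auto', 'pref', ...) found in a disco#info result."""
    iq = ET.fromstring(iq_xml)
    query = iq.find(f"{{{DISCO_INFO_NS}}}query")
    if query is None:
        return set()
    names = set()
    for feature in query.findall(f"{{{DISCO_INFO_NS}}}feature"):
        var = feature.get("var", "")
        # Keep only features in the archive namespace and strip the prefix.
        if var.startswith(ARCHIVE_FEATURE_PREFIX):
            names.add(var[len(ARCHIVE_FEATURE_PREFIX):])
    return names
```

A client could then test, for example, `"manual" in archive_features(reply)` before attempting to upload a collection.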
Not all users want to archive messages. A client SHOULD save its user's default archiving preference (or "Save Mode") to its own server (i.e., specify whether by default all conversations should be archived or not). In addition, a client MAY save different preferences for particular contacts.
Some users may also prefer that the messages they exchange with contacts are "Off The Record" (OTR). A client SHOULD save its user's default and contact-specific OTR preferences (or "OTR Modes") to its own server.
Whichever archiving method a client uses (e.g., local archiving, or automatic or manual archiving to a server), it SHOULD adhere to its user's archiving preferences.
This section addresses the following use cases:
In order to determine its user's current Save Mode(s) and OTR Mode(s), a client sends this query to its server:
<iq type='get' id='save1' from='juliet@capulet.com/chamber'> <pref xmlns='http://jabber.org/protocol/archive'/> </iq>
The server responds with the default Save Mode and OTR Mode (<default/> element) and any Save Modes and OTR Modes for specific contacts (<item/> elements).
Each child element in the response MUST include a 'save' attribute, whose value MAY be 'false' (the client MUST save no messages), 'body' (the client SHOULD save only <body/> elements) or 'all' (the client SHOULD save the full XML content of each <message/> element).
Note: Support for the 'all' value is optional and, to conserve bandwidth and storage space, it is RECOMMENDED that client implementations do not specify the 'all' value.
Note: When archiving locally a client MAY save the full XML content of each <message/> element even if the Save Mode is 'body'.
Each child element in the response MUST include an 'otr' attribute, whose value MAY be 'deny' (if Off The Record is required by the contact the client MUST send no messages), 'allow' (the client MAY save messages unless the contact requests OTR), 'try' (the client MUST try to negotiate OTR with the contact) or 'require' (the client MUST send no messages unless the contact explicitly agrees to OTR).
Note: If the OTR Mode is 'require' then the Save Mode MUST be 'false'.
<iq type='result' id='save1' to='juliet@capulet.com/chamber'> <pref xmlns='http://jabber.org/protocol/archive'> <default save='body' otr='allow'/> <item jid='romeo@montague.net' save='false' otr='require'/> <item jid='benvolio@montague.net' save='all' otr='deny'/> </pref> </iq>
If the user has never set the default Modes, the 'save' and 'otr' attributes SHOULD specify the server's default settings, and the 'unset' attribute SHOULD be set to "true". Note: The 'unset' attribute defaults to "false".
<iq type='result' id='save1' to='juliet@capulet.com/chamber'> <pref xmlns='http://jabber.org/protocol/archive'> <default save='false' otr='allow' unset='true'/> </pref> </iq>
Once it has received a request for archiving preferences from the client, the server MUST send any subsequent changes to any of the user's archiving preferences to the client until the stream is closed (see below).
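The structure of the preferences response can be illustrated with a small parser. This is a sketch under stated assumptions: the names `Modes` and `parse_prefs` are hypothetical, and rejecting a response in which OTR Mode 'require' is paired with a Save Mode other than 'false' is one possible way for a client to react to that rule.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

ARCHIVE_NS = "http://jabber.org/protocol/archive"
SAVE_MODES = {"false", "body", "all"}
OTR_MODES = {"deny", "allow", "try", "require"}


@dataclass(frozen=True)
class Modes:
    save: str
    otr: str
    unset: bool = False  # the 'unset' attribute defaults to "false"


def _read_modes(element: ET.Element) -> Modes:
    save, otr = element.get("save"), element.get("otr")
    if save not in SAVE_MODES or otr not in OTR_MODES:
        raise ValueError(f"unexpected modes: save={save!r} otr={otr!r}")
    if otr == "require" and save != "false":
        raise ValueError("OTR Mode 'require' needs Save Mode 'false'")
    return Modes(save, otr, element.get("unset", "false") == "true")


def parse_prefs(iq_xml: str):
    """Return (default Modes or None, {jid: Modes}) from a <pref/> result."""
    pref = ET.fromstring(iq_xml).find(f"{{{ARCHIVE_NS}}}pref")
    if pref is None:
        raise ValueError("no <pref/> element in response")
    default, items = None, {}
    for child in pref:
        if child.tag == f"{{{ARCHIVE_NS}}}default":
            default = _read_modes(child)
        elif child.tag == f"{{{ARCHIVE_NS}}}item":
            items[child.get("jid")] = _read_modes(child)
    return default, items
```

Because the server pushes later changes with the same <pref/> payload, the same parser could be reused for those pushes.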
A client may set the default Modes:
<iq type='set' id='save2' from='juliet@capulet.com/chamber'> <pref xmlns='http://jabber.org/protocol/archive'> <default save='false' otr='try'/> </pref> </iq>
If the server can process the request, it acknowledges the change:
<iq type='result' id='save2' to='juliet@capulet.com/chamber'/>
The server then MUST inform all of the user's connected resources that have previously requested the user's archiving preferences:
<iq type='set' id='savepush1' to='juliet@capulet.com/chamber'> <pref xmlns='http://jabber.org/protocol/archive'> <default save='false' otr='try'/> </pref> </iq> <iq type='set' id='savepush2' to='juliet@capulet.com/pda'> <pref xmlns='http://jabber.org/protocol/archive'> <default save='false' otr='try'/> </pref> </iq>
The server MAY be configured to return a <feature-not-implemented/> error in the following cases:
If it does not allow the saving of full message stanza content, and the client set the value of the 'save' attribute to "all", and any of the user's connected resources have Automated Archiving enabled.
If administrator policies require that at least the <body/> (or the full content) of every message is logged automatically, and the client sets the value of the 'save' attribute to "false" (or "body").
Note: More error cases to follow.
A client may use a similar protocol to set the Modes for a particular contact or domain of contacts (bare JID, full JID or domain). Note: It is STRONGLY RECOMMENDED for the value of the 'jid' attribute to be a bare JID (<node@domain.tld>).
<iq type='set' id='save3' from='juliet@capulet.com/chamber'> <pref xmlns='http://jabber.org/protocol/archive'> <item jid='romeo@montague.net' save='body' otr='allow'/> </pref> </iq>
<iq type='result' id='save3' to='juliet@capulet.com/chamber'/>
<iq type='set' id='savepush3' to='juliet@capulet.com/chamber'> <pref xmlns='http://jabber.org/protocol/archive'> <item jid='romeo@montague.net' save='body' otr='allow'/> </pref> </iq> <iq type='set' id='savepush4' to='juliet@capulet.com/pda'> <pref xmlns='http://jabber.org/protocol/archive'> <item jid='romeo@montague.net' save='body' otr='allow'/> </pref> </iq>
The same error cases apply as when Setting Default Modes.
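Both kinds of preference update (default Modes and per-contact Modes) share one payload shape, which the following sketch builds. The function name `build_pref_set` is hypothetical; the local validation simply mirrors the attribute values and the 'require'/'false' rule stated earlier, so that an invalid request is caught before it is sent.

```python
import xml.etree.ElementTree as ET
from typing import Optional

ARCHIVE_NS = "http://jabber.org/protocol/archive"


def build_pref_set(iq_id: str, save: str, otr: str, jid: Optional[str] = None) -> str:
    """Build an <iq type='set'/> that sets the default Modes, or the Modes for `jid`."""
    if save not in {"false", "body", "all"}:
        raise ValueError(f"unknown Save Mode: {save!r}")
    if otr not in {"deny", "allow", "try", "require"}:
        raise ValueError(f"unknown OTR Mode: {otr!r}")
    if otr == "require" and save != "false":
        raise ValueError("OTR Mode 'require' needs Save Mode 'false'")
    iq = ET.Element("iq", {"type": "set", "id": iq_id})
    # Writing xmlns as a plain attribute keeps the output free of ns0: prefixes.
    pref = ET.SubElement(iq, "pref", {"xmlns": ARCHIVE_NS})
    if jid is None:
        ET.SubElement(pref, "default", {"save": save, "otr": otr})
    else:
        ET.SubElement(pref, "item", {"jid": jid, "save": save, "otr": otr})
    return ET.tostring(iq, encoding="unicode")
```

For example, `build_pref_set("save3", "body", "allow", jid="romeo@montague.net")` produces a request equivalent to the per-contact example above.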
A user will sometimes exchange messages with contacts who prefer that their conversations are not archived by either party. Any client that archives messages SHOULD support Chat Session Negotiation [2] and its 'logging' field both to give other contacts the opportunity to indicate this preference, and to negotiate an "Off The Record" (OTR) policy that complies with its user's own Archiving Preferences.
If a Chat Session Negotiation agreed to enable OTR (disable logging) then the clients MUST NOT allow messages sent in either direction to be archived in any way (including Manual Archiving and Automated Archiving). [3]
Note: If a contact does not include a 'logging' field in its initial Chat Session Negotiation request, and a user's Archiving Preferences indicate that OTR is required, then the client MUST refuse the request. It MAY then send its own Chat Session Negotiation request with a 'logging' field.
Note: Chat Session Negotiation messages SHOULD NOT be saved, since they are exchanged before clients have been able to negotiate OTR mode and disable archiving.
If a user's OTR preference for a contact changes during a Chat Session that has been negotiated with the contact, and if the new preference would affect the value of the 'logging' field that was previously negotiated, then the client MUST immediately terminate the Chat Session and try to negotiate a new one according to the user's new OTR preference.
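The two client-side decisions in this section can be written as small predicates. This is only a sketch of the rules as stated here: the function names are hypothetical, and the details of Chat Session Negotiation itself are outside its scope.

```python
from typing import Optional


def archiving_scope(save_mode: str, otr_negotiated: bool) -> Optional[str]:
    """Return None (archive nothing), 'body' or 'all' for one chat session."""
    if save_mode not in ("false", "body", "all"):
        raise ValueError(f"unknown Save Mode: {save_mode!r}")
    # Once OTR has been agreed, nothing may be archived in any way.
    if otr_negotiated or save_mode == "false":
        return None
    return save_mode


def must_refuse_request(otr_mode: str, request_has_logging_field: bool) -> bool:
    """True if an incoming negotiation request has to be refused."""
    # A request without a 'logging' field cannot satisfy a 'require' preference.
    return otr_mode == "require" and not request_has_logging_field
```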
While automated archiving is easy for the client and server to implement, there are many contexts in which manual archiving is required. For example, when:
Therefore, often a client will want to send or receive a sequence of messages, optionally add private notes to the sequence, optionally encrypt the sequence, and then ask the server to store it.
Such messages and notes SHOULD be stored on the server in the form of a "collection", i.e., a set of messages to/from the same user that are received near each other in time or as part of the same conversation thread. A collection is intended to mimic the natural flow of human conversations, which in instant messaging (IM) systems tend to occur in bursts (e.g., a five-minute conversation one day, followed by a ten-minute conversation the next).
The client uniquely specifies a collection using a pair of attributes:
A friendly name for the collection MAY be specified with a 'subject' attribute. Note the Security Considerations regarding the subject attribute.
Each collection MAY contain <note/>, <to/> or <from/> elements (or <crypt/> elements - see Encryption).
The text of each individual private note MUST be encapsulated in a <note/> element. The absolute time the note was created SHOULD be specified with a 'utc' attribute (which MUST be UTC and adhere to the DateTime format specified in Jabber Date and Time Profiles).
The content of each individual message MUST be encapsulated in a <to/> or <from/> element. The time in seconds of the message relative to the previous message in the collection (or, for the first message, relative to the start of the collection) SHOULD be specified with a 'secs' attribute. The content of each <to/> or <from/> element SHOULD include a <body/> element.
Note: Other elements MAY be included, but they are NOT RECOMMENDED. To conserve bandwidth and storage space [6], elements qualified by the 'http://jabber.org/protocol/xhtml-im' namespace SHOULD NOT be included. <thread/> elements and elements qualified by the 'jabber:x:delay', 'jabber:x:event' and 'http://jabber.org/protocol/chatstates' namespaces MUST NOT be included. The server MAY be configured to return a <feature-not-implemented/> error if any <to/> or <from/> element contains anything other than a single <body/> element.
The collection of messages and notes to be uploaded are encapsulated in the <store/> element.
<iq type='set' to='montague.net' id='up1'> <store xmlns='http://jabber.org/protocol/archive' with='juliet@capulet.com/chamber' start='1469-07-21T02:56:15Z' subject='She speaks!'> <from secs='0'><body>Art thou not Romeo, and a Montague?</body></from> <to secs='11'><body>Neither, fair saint, if either thee dislike.</body></to> <from secs='14'><body>How cam'st thou hither, tell me, and wherefore?</body></from> <note utc='1469-07-21T03:04:35Z'>I think she might fancy me.</note> </store> </iq>
If the collection does not exist then the server MUST create a new collection. If the collection already exists then the server MUST append the messages to the existing collection.
Note: Clients MUST take care to append each sequence of messages to the collection before the sequence becomes so large that uploading it may violate common rate limiting restrictions (in Jabber systems, often called "karma").
<iq type='result' to='romeo@montague.net/orchard' id='up1'/>
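Since each 'secs' value is relative to the previous message (or to the start of the collection for the first message), a client has to convert its own absolute timestamps before uploading. The sketch below shows one way to do this. The names `build_store` and `utc_stamp` are hypothetical, timestamps are assumed to be timezone-aware `datetime` objects in chronological order, and fractional seconds are truncated.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

ARCHIVE_NS = "http://jabber.org/protocol/archive"


def utc_stamp(moment: datetime) -> str:
    """Format an aware datetime as a UTC DateTime string, e.g. 2006-09-08T12:00:00Z."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_store(iq_id, to, with_jid, start, messages, subject=None) -> str:
    """Build a <store/> upload; `messages` is a list of (direction, when, body)."""
    iq = ET.Element("iq", {"type": "set", "to": to, "id": iq_id})
    attrs = {"xmlns": ARCHIVE_NS, "with": with_jid, "start": utc_stamp(start)}
    if subject is not None:
        attrs["subject"] = subject
    store = ET.SubElement(iq, "store", attrs)
    previous = start
    for direction, when, body in messages:
        if direction not in ("to", "from"):
            raise ValueError(f"direction must be 'to' or 'from', got {direction!r}")
        secs = int((when - previous).total_seconds())
        if secs < 0:
            raise ValueError("messages must be in chronological order")
        element = ET.SubElement(store, direction, {"secs": str(secs)})
        ET.SubElement(element, "body").text = body
        # Advance by the truncated value so rounding errors do not accumulate.
        previous = previous + timedelta(seconds=secs)
    return ET.tostring(iq, encoding="unicode")
```

Messages sent 0, 11 and 25 seconds after the collection start would be written with 'secs' values of 0, 11 and 14, matching the pattern of the upload example above.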
If the server cannot service a store request because the collection is too large then it MUST return a <not-acceptable/> error:
<iq type='error' to='romeo@montague.net/orchard'> <error code='406' type='modify'> <not-acceptable xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/> </error> </iq>
The client MAY specify an absolute time for any message by providing a longer 'utc' attribute (which MUST be UTC and adhere to the DateTime format specified in Jabber Date and Time Profiles) instead of a 'secs' attribute:
<iq type='set' to='montague.net' id='up2'> <store xmlns='http://jabber.org/protocol/archive' with='juliet@capulet.com/chamber' start='1469-07-21T02:56:15Z' subject='She speaks!'> <from utc='1469-07-21T00:32:29Z'><body>Art thou not Romeo, and a Montague?</body></from> <to secs='11'><body>Neither, fair saint, if either thee dislike.</body></to> <from secs='14'><body>How cam'st thou hither, tell me, and wherefore?</body></from> </store> </iq>
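When reading a collection that mixes 'utc' and 'secs' attributes, the absolute time of every message can be recovered by walking the elements in order. The following sketch assumes a hypothetical helper name `message_times`, handles only the `YYYY-MM-DDThh:mm:ssZ` form of the DateTime profile (no fractional seconds or numeric offsets), and treats a missing 'secs' attribute as 0.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone


def parse_utc(stamp: str) -> datetime:
    """Parse a UTC DateTime string of the form 2006-09-08T12:00:00Z."""
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def message_times(store: ET.Element) -> list:
    """Return the absolute time of each <to/> or <from/> child of a <store/> element."""
    previous = parse_utc(store.get("start"))
    times = []
    for child in store:
        if child.tag.rsplit("}", 1)[-1] not in ("to", "from"):
            continue  # notes carry their own 'utc' and do not shift the sequence
        if child.get("utc") is not None:
            previous = parse_utc(child.get("utc"))
        else:
            # Assumption: a missing 'secs' attribute is treated as an offset of 0.
            previous = previous + timedelta(seconds=int(child.get("secs", "0")))
        times.append(previous)
    return times
```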
A client MAY archive messages that it receives from Multi-User Chat [7] rooms. The 'with' attribute MUST be the bare JID of the room. The client MUST include a 'name' attribute for each <from/> element to specify the room nickname of the message sender:
<iq type='set' to='montague.net' id='up3'> <store xmlns='http://jabber.org/protocol/archive' with='balcony@house.capulet.com' start='1469-07-21T03:16:37Z'> <from secs='0' name='benvolio'><body>She will indite him to some supper.</body></from> <from secs='5' name='mercutio'><body>A bawd, a bawd, a bawd! So ho!</body></from> <from secs='11' name='romeo'><body>What hast thou found?</body></from> </store> </iq>
For clarity, the examples above are not encrypted. However, clients SHOULD encrypt manually-archived collections, and the encryption of auto-archived collections (see Automated Archiving) is strongly RECOMMENDED.
Before uploading a sequence of messages to a collection, the client SHOULD encrypt and encapsulate the joined sequence of <to/>, <from/> and <note/> elements, base64 encode the resulting sequence of bytes, and wrap it inside a <crypt/> element. If a randomly generated unique public label was used to encapsulate the encrypted messages then the 'label' attribute of the <crypt/> element MUST be set to that base64 encoded label.
Note: Entities MUST support the SHA-256 hash [8] and the DEM1 (with SC2/SHA-256) data encapsulation mechanism (see ISO 18033-2 at http://www.shoup.net/iso/std6.pdf, or ANSI-X9.44). Entities MAY support other hashes and algorithms (see Jabber Registrar Considerations). [9]
Clients MAY add one or more <crypt/> elements to a collection using exactly the same method as for <to/>, <from/> and <note/> elements (see Uploading Messages to a Collection). Note: a collection that contains <crypt/> elements MUST NOT contain <to/> or <from/> or <note/> elements.
When an encrypted collection is created (or when a <crypt/> element is added to an empty non-encrypted collection) four extra attributes MUST be specified for the <store/> element:
Note: Clients MUST support the SHA-256 hash and the RSA-KEM (with KDF2/SHA-256) key encapsulation scheme (see ISO 18033-2 at http://www.shoup.net/iso/std6.pdf, or ANSI-X9.44). [11] The client MAY support other hashes and algorithms (see Jabber Registrar Considerations). [12]
<iq type='set' to='montague.net' id='crypt1'> <store xmlns='http://jabber.org/protocol/archive' with='juliet@capulet.com/chamber' start='1469-07-21T02:56:15Z' subject='She speaks!' dataalg='DEM1-SC2-KDF2-SHA256' key='bfXv33i+Ybqypa4ETLyorGkVl73v67SMvzX41MPRKA5cOp9wGDMgd8SirwIDAQAB' keyalg='RSA-KEM-KDF2-SHA256' master='acc3594e844c77696f7a7ba9367ae324b6b958ad'> <crypt label='VROLURBVEFDb3JwU0dDL'>E5Qbvfa2gI5lBZMAHryv4g+OGQ0SR+ysraP6LnD43m77VkIVni5c7yPeIbkFdicZ</crypt> </store> </iq>
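The wrapping step (base64 encoding and the optional 'label' attribute) can be sketched independently of the cipher. In the sketch below the ciphertext is treated as opaque bytes produced elsewhere; no encryption is performed, and the function names are hypothetical.

```python
import base64
import xml.etree.ElementTree as ET
from typing import Optional, Tuple


def build_crypt(ciphertext: bytes, label: Optional[bytes] = None) -> ET.Element:
    """Wrap already-encrypted bytes in a <crypt/> element."""
    crypt = ET.Element("crypt")
    if label is not None:
        crypt.set("label", base64.b64encode(label).decode("ascii"))
    crypt.text = base64.b64encode(ciphertext).decode("ascii")
    return crypt


def read_crypt(crypt: ET.Element) -> Tuple[bytes, Optional[bytes]]:
    """Return (ciphertext, label or None) from a <crypt/> element."""
    label = crypt.get("label")
    return (base64.b64decode(crypt.text or ""),
            base64.b64decode(label) if label is not None else None)


def is_well_formed_collection(store: ET.Element) -> bool:
    """False if <crypt/> is mixed with <to/>, <from/> or <note/> children."""
    names = {child.tag.rsplit("}", 1)[-1] for child in store}
    return not ("crypt" in names and names & {"to", "from", "note"})
```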
If the client specifies a new value for the 'subject' attribute of any existing collection, or new values for the 'key', 'keyalg' or 'master' attributes of an existing encrypted collection then the server MUST update the existing values. This enables a client to change the subject of a collection, to remove any dependencies on an obsolete private key, or to change the decryption key encryption algorithm.
Note: The client MUST NOT specify new values for the 'dataalg', 'with' or 'start' attributes. The only way to change these values is to delete the collection (see Removing a Collection) and then create a new one.
<iq type='set' to='montague.net' id='private1'> <store xmlns='http://jabber.org/protocol/archive' with='juliet@capulet.com/chamber' start='1469-07-21T02:56:15Z' key='IHNpbmd1bGFyIHBhc3Npb24gZnJvbSBvdGhlciBhbmltYWxzLCB3aGljaCBpcyBhIGx1c3Qgb2Yg' master='3fb37d8e817c673dfa60637351926d564ce65629'/> </iq>
<iq type='set' to='montague.net' id='subject1'> <store xmlns='http://jabber.org/protocol/archive' with='juliet@capulet.com/chamber' start='1469-07-21T02:56:15Z' subject='She speaks twice!'> <crypt label='TWFuIGlzIGRpc3Rpbmd1'>dGhlIG1pbmQsIHRoYXQgYnkgYSBwZXJzZXZlcmFuY2Ugb2YgZGVsaWdodCBpbiB0aGUgY29udGlu</crypt> </store> </iq>
A client MAY enable or disable automatic archiving for messages sent over its stream. Automatic archiving SHOULD default to disabled for each new stream that is opened. Once automatic archiving is switched on then the server MUST automatically archive messages only according to the user's general Archiving Preferences.
Note: Both parties to an ESession (see Encrypted Sessions [13]) MUST disable automatic archiving, since ESession decryption keys are short-lived - making it impossible to decrypt automatically archived messages.
<iq type='set' id='auto1'> <auto save='true' xmlns='http://jabber.org/protocol/archive'/> </iq>
The server MAY be configured to return a <feature-not-implemented/> error in the following cases:
If the client is trying to enable automatic archiving, but the server does not allow the saving of full message stanza content, and the user has specified the 'all' Save Mode in one of its Archiving Preferences.
If administrator policies require that every message is logged automatically, and the client is trying to disable automatic archiving.
Whenever the client enables auto-archiving it MAY also specify a 'secret' attribute. This is a base64 encoded secret symmetric encryption key that the server SHOULD use to encrypt all new automatically archived collections. It should also specify the 'dataalg', 'key', 'keyalg' and 'master' attributes (see Encryption) that will eventually be downloaded by clients and used to decrypt collections (the 'dataalg' attribute is also required by the server to encrypt collections).
<iq type='set' id='auto2'> <auto save='true' secret='dWVkIGFuZCBpbmRlZmF0aWdhYmxl' dataalg='DEM1-SC2-KDF2-SHA256' key='bfXv33i+Ybqypa4ETLyorGkVl73v67SMvzX41MPRKA5cOp9wGDMgd8SirwIDAQAB' keyalg='RSA-KEM-KDF2-SHA256' master='acc3594e844c77696f7a7ba9367ae324b6b958ad' xmlns='http://jabber.org/protocol/archive'/> </iq>
As soon as the client specifies a new secret symmetric encryption key (or switches off all auto-archiving - thus completing all active collections), the server MUST securely destroy all its copies of the old secret key.
Note: If the client uses this protocol to change the secret key regularly (e.g. immediately after the start of every conversation) then, if the server is compromised, only the messages stored during the attack will be compromised (i.e. only those messages that would have been compromised even if they had not been stored).
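A client that rotates the secret key regularly needs to generate a fresh key and send it with the <auto/> request. The sketch below illustrates this. The names `new_secret` and `build_auto` are hypothetical, the 32-byte key length is an assumption (this document does not fix a length), and rejecting a secret when disabling archiving reflects one reading of the text above.

```python
import base64
import secrets
import xml.etree.ElementTree as ET
from typing import Optional

ARCHIVE_NS = "http://jabber.org/protocol/archive"
CRYPTO_ATTRS = ("dataalg", "key", "keyalg", "master")


def new_secret(num_bytes: int = 32) -> bytes:
    """Generate a random symmetric key (the length is an assumption)."""
    return secrets.token_bytes(num_bytes)


def build_auto(iq_id: str, save: bool, secret: Optional[bytes] = None, **crypto: str) -> str:
    """Build the <auto/> request that switches automatic archiving on or off."""
    unknown = set(crypto) - set(CRYPTO_ATTRS)
    if unknown:
        raise ValueError(f"unexpected attributes: {sorted(unknown)}")
    if secret is None and crypto:
        raise ValueError("encryption attributes given without a secret")
    if secret is not None and not save:
        raise ValueError("a secret is only sent when enabling auto-archiving")
    attrs = {"xmlns": ARCHIVE_NS, "save": "true" if save else "false"}
    if secret is not None:
        attrs["secret"] = base64.b64encode(secret).decode("ascii")
        attrs.update(crypto)
    iq = ET.Element("iq", {"type": "set", "id": iq_id})
    ET.SubElement(iq, "auto", attrs)
    return ET.tostring(iq, encoding="unicode")
```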
Manually uploaded and automatically saved collections are managed in the same way. There are three main areas of functionality related to archive management:
Requirements and protocol flows for each of these use cases are defined below. The protocols to retrieve a list of collections and an individual collection both make extensive use of Result Set Management [14]. Clients and servers SHOULD support all the features defined in that protocol.
To request a list of collections the client sends a <list/> element. The 'start' and 'end' attributes MAY be specified to indicate a date range (the values of these attributes MUST be UTC and adhere to the DateTime format specified in Jabber Date and Time Profiles). The 'with' attribute MAY be specified to limit the list to a single participating full JID, bare JID or domain.
If the 'with' attribute is omitted then collections with any JID are returned. If only 'start' is specified then all collections on or after that date should be returned. If only 'end' is specified then all collections prior to that date should be returned.
The client SHOULD use Result Set Management to limit the number of collections returned by the server in a single stanza, taking care not to request a page of collections that is so big it might exceed karma limits.
<iq type='get' to='montague.net' id='juliet1'> <list xmlns='http://jabber.org/protocol/archive' with='juliet@capulet.com'> <set xmlns='http://jabber.org/protocol/rsm'> <max>30</max> </set> </list> </iq>
<iq type='get' to='montague.net' id='period1'> <list xmlns='http://jabber.org/protocol/archive' with='juliet@capulet.com' start='1469-07-21T02:00:00Z' end='1479-07-21T04:00:00Z'> <set xmlns='http://jabber.org/protocol/rsm'> <max>30</max> </set> </list> </iq>
<iq type='get' to='montague.net' id='list1'> <list xmlns='http://jabber.org/protocol/archive' start='1469-07-21T02:00:00Z'> <set xmlns='http://jabber.org/protocol/rsm'> <max>30</max> </set> </list> </iq>
The server MUST list the collections (empty <store/> elements including all attributes) in chronological order when responding to any request:
<iq type='result' to='romeo@montague.net/orchard' from='montague.net' id='list1'> <list xmlns='http://jabber.org/protocol/archive'> <store with='juliet@capulet.com/chamber' start='1469-07-21T02:56:15Z' subject='She speaks!' dataalg='DEM1-SC2-KDF2-SHA256' key='bfXv33i+Ybqypa4ETLyorGkVl73v67SMvzX41MPRKA5cOp9wGDMgd8SirwIDAQAB' keyalg='RSA-KEM-KDF2-SHA256' master='acc3594e844c77696f7a7ba9367ae324b6b958ad'/> . [28 more collections] . <store with='balcony@house.capulet.com' start='1469-07-21T03:16:37Z'/> <set xmlns='http://jabber.org/protocol/rsm'> <first index='0'>1469-07-21T02:56:15Zjuliet@capulet.com</first> <last>1469-07-21T03:16:37Zbalcony@house.capulet.com</last> <count>1372</count> </set> </list> </iq>
Note: In accordance with Result Set Management, the client MUST assume the unique IDs it receives in the <first/> and <last/> elements are opaque. Servers MAY adopt a unique ID format other than the one suggested in the example above.
If no collections correspond to the request the server MUST return an empty <list/> element:
<iq type='result' to='romeo@montague.net/orchard' from='montague.net' id='list1'> <list xmlns='http://jabber.org/protocol/archive'/> </iq>
<iq type='get' to='montague.net' id='list2'> <list xmlns='http://jabber.org/protocol/archive' start='1469-07-21T02:00:00Z'> <set xmlns='http://jabber.org/protocol/rsm'> <max>30</max> <after>1469-07-21T03:16:37Zbalcony@house.capulet.com</after> </set> </list> </iq>
Refer to Result Set Management to learn more about the various ways that the pages of the list may be accessed.
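The request and response handling for one page of the collection list can be sketched as follows. The function names are hypothetical, and the <last/> value is passed through unchanged as an opaque ID, as the note above requires.

```python
import xml.etree.ElementTree as ET

ARCHIVE_NS = "http://jabber.org/protocol/archive"
RSM_NS = "http://jabber.org/protocol/rsm"


def build_list_request(iq_id, to, *, with_jid=None, start=None, end=None,
                       master=None, max_items=30, after=None) -> str:
    """Build a <list/> request for one page of collections."""
    iq = ET.Element("iq", {"type": "get", "to": to, "id": iq_id})
    attrs = {"xmlns": ARCHIVE_NS}
    for name, value in (("with", with_jid), ("start", start), ("end", end), ("master", master)):
        if value is not None:
            attrs[name] = value
    listing = ET.SubElement(iq, "list", attrs)
    rsm = ET.SubElement(listing, "set", {"xmlns": RSM_NS})
    ET.SubElement(rsm, "max").text = str(max_items)
    if after is not None:
        ET.SubElement(rsm, "after").text = after  # opaque ID from a previous <last/>
    return ET.tostring(iq, encoding="unicode")


def parse_list_result(iq_xml: str):
    """Return (list of <store/> attribute dicts, last ID or None, count or None)."""
    listing = ET.fromstring(iq_xml).find(f"{{{ARCHIVE_NS}}}list")
    if listing is None:
        raise ValueError("no <list/> element in response")
    collections = [dict(s.attrib) for s in listing.findall(f"{{{ARCHIVE_NS}}}store")]
    last = listing.findtext(f"{{{RSM_NS}}}set/{{{RSM_NS}}}last")
    count = listing.findtext(f"{{{RSM_NS}}}set/{{{RSM_NS}}}count")
    return collections, last, (int(count) if count is not None else None)
```

To fetch the next page, a client would pass the returned `last` value as `after` in the following request.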
The 'master' attribute MAY be included to limit the list to encrypted collections whose message decryption key was encrypted with the specified private key. This feature enables a client to find any collections that depend on an obsolete private key:
<iq type='get' to='montague.net' id='master1'> <list xmlns='http://jabber.org/protocol/archive' master='acc3594e844c77696f7a7ba9367ae324b6b958ad'> <set xmlns='http://jabber.org/protocol/rsm'> <max>30</max> </set> </list> </iq>
<iq type='result' to='romeo@montague.net/orchard' from='montague.net' id='master1'> <list xmlns='http://jabber.org/protocol/archive'> <store xmlns='http://jabber.org/protocol/archive' with='juliet@capulet.com/chamber' start='1469-07-21T02:56:15Z' subject='She speaks!' dataalg='DEM1-SC2-KDF2-SHA256' key='bfXv33i+Ybqypa4ETLyorGkVl73v67SMvzX41MPRKA5cOp9wGDMgd8SirwIDAQAB' keyalg='RSA-KEM-KDF2-SHA256' master='acc3594e844c77696f7a7ba9367ae324b6b958ad'/> . [29 more encrypted collections] . <set xmlns='http://jabber.org/protocol/rsm'> <first index='0'>1469-07-21T02:56:15Zjuliet@capulet.com</first> <last>1469-07-21T03:16:37Zbalcony@house.capulet.com</last> <count>142</count> </set> </list> </iq>
To request a page of messages from a collection the client sends a <retrieve/> element. The 'with' and 'start' attributes specify the participating full JID and the start time (see Jabber Date and Time Profiles). Both attributes MUST be included to uniquely identify a collection:
The client SHOULD use Result Set Management to limit the number of messages returned by the server in a single stanza, taking care not to request a page of messages that is so big it might exceed karma limits.
<iq type='get' to='montague.net' id='page1'> <retrieve xmlns='http://jabber.org/protocol/archive' with='juliet@capulet.com/chamber' start='1469-07-21T02:56:15Z'> <set xmlns='http://jabber.org/protocol/rsm'> <max>100</max> </set> </retrieve> </iq>
<iq type='result' to='romeo@montague.net/orchard' from='montague.net' id='page1'> <store xmlns='http://jabber.org/protocol/archive' with='juliet@capulet.com/chamber' start='1469-07-21T02:56:15Z' subject='She speaks!'> <from secs='0'><body>Art thou not Romeo, and a Montague?</body></from> <to secs='11'><body>Neither, fair saint, if either thee dislike.</body></to> . [98 more messages] . <from secs='14'><body>How cam'st thou hither, tell me, and wherefore?</body></from> <set xmlns='http://jabber.org/protocol/rsm'> <first index='0'>0</first> <last>99</last> <count>217</count> </set> </store> </iq>
Note: In accordance with Result Set Management, the client MUST assume the unique IDs it receives in the <first/> and <last/> elements are opaque. Servers MAY adopt a unique ID format other than the one suggested in the example above.
If the specified collection does not exist then the server MUST return an <item-not-found/> error:
<iq type='error' to='romeo@montague.net/orchard' from='montague.net' id='page1'> <error code='404' type='cancel'> <item-not-found xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/> </error> </iq>
If the requested collection is empty the server MUST return an empty <store/> element:
<iq type='result' to='romeo@montague.net/orchard' from='montague.net' id='page1'> <store xmlns='http://jabber.org/protocol/archive' with='juliet@capulet.com/chamber' start='1469-07-21T02:56:15Z' subject='She speaks!'/> </iq>
<iq type='get' to='montague.net' id='page2'> <retrieve xmlns='http://jabber.org/protocol/archive' with='juliet@capulet.com/chamber' start='1469-07-21T02:56:15Z'> <set xmlns='http://jabber.org/protocol/rsm'> <max>100</max> <after>99</after> </set> </retrieve> </iq>
Refer to Result Set Management to learn more about the various ways that the pages of the collection may be accessed.
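Retrieving a whole collection page by page can be sketched as a loop that feeds each <last/> value back as <after/>. In this sketch `send_iq` is a hypothetical callable that sends a stanza string and returns the reply string. Stopping on an empty page, on a missing <last/>, or once <count/> items have been received is an assumption about how the end of the result set is signalled.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Iterator

ARCHIVE_NS = "http://jabber.org/protocol/archive"
RSM_NS = "http://jabber.org/protocol/rsm"


def iter_messages(send_iq: Callable[[str], str], to: str, with_jid: str,
                  start: str, page_size: int = 100) -> Iterator[ET.Element]:
    """Yield every child element of a collection, requesting one page at a time."""
    after, fetched, page_no = None, 0, 0
    while True:
        page_no += 1
        iq = ET.Element("iq", {"type": "get", "to": to, "id": f"page{page_no}"})
        retrieve = ET.SubElement(iq, "retrieve",
                                 {"xmlns": ARCHIVE_NS, "with": with_jid, "start": start})
        rsm = ET.SubElement(retrieve, "set", {"xmlns": RSM_NS})
        ET.SubElement(rsm, "max").text = str(page_size)
        if after is not None:
            ET.SubElement(rsm, "after").text = after
        reply = ET.fromstring(send_iq(ET.tostring(iq, encoding="unicode")))
        store = reply.find(f"{{{ARCHIVE_NS}}}store")
        if store is None:
            raise RuntimeError("no <store/> in reply (possibly an error response)")
        result_set = store.find(f"{{{RSM_NS}}}set")
        page = [child for child in store if child is not result_set]
        yield from page
        fetched += len(page)
        if result_set is None or not page:
            return
        last = result_set.findtext(f"{{{RSM_NS}}}last")
        count = result_set.findtext(f"{{{RSM_NS}}}count")
        if last is None or (count is not None and fetched >= int(count)):
            return
        after = last  # opaque ID, passed back unchanged
```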
To request the removal of a single collection the client sends an empty <remove/> element. The 'with' (full JID) and 'start' attributes MUST be included to uniquely identify the collection.
<iq type='set' to='montague.net'> <remove xmlns='http://jabber.org/protocol/archive' with='juliet@capulet.com/chamber' start='1469-07-21T02:56:15Z'/> </iq>
The client may remove several collections at once. The 'start' and 'end' attributes MAY be specified to indicate a date range. The 'with' attribute MAY be a full JID, bare JID or domain.
<iq type='set' to='montague.net'> <remove xmlns='http://jabber.org/protocol/archive' with='juliet@capulet.com' start='1469-07-21T02:00:00Z' end='1469-07-21T04:00:00Z'/> </iq>
If the 'with' attribute is omitted then collections with any JID are removed.
If the end date is in the future then all collections after the start date are removed.
<iq type='set' to='montague.net'> <remove xmlns='http://jabber.org/protocol/archive' start='1469-07-21T02:00:00Z' end='2038-01-01T00:00:00Z'/> </iq>
If the start date is before all the collections in the archive then all collections prior to the end date are removed.
<iq type='set' to='montague.net'> <remove xmlns='http://jabber.org/protocol/archive' start='0000-01-01T00:00:00Z' end='1469-07-21T04:00:00Z'/> </iq>
<iq type='set' to='montague.net'> <remove xmlns='http://jabber.org/protocol/archive'/> </iq>
If the specified collection (or collections) do not exist then the server MUST return an <item-not-found/> error:
<iq type='error' to='romeo@montague.net/orchard'> <error code='404' type='cancel'> <item-not-found xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/> </error> </iq>
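The removal variants above differ only in which attributes are present, so one builder can cover them. This is a sketch with hypothetical names; the `allow_remove_all` guard is a design choice of the sketch (not a protocol requirement), added because a request without any attributes appears to match every collection.

```python
import xml.etree.ElementTree as ET

ARCHIVE_NS = "http://jabber.org/protocol/archive"


def build_remove(iq_id, to, *, with_jid=None, start=None, end=None,
                 allow_remove_all=False) -> str:
    """Build a <remove/> request for one collection or a range of collections."""
    if with_jid is None and start is None and end is None and not allow_remove_all:
        raise ValueError("refusing to build a request that matches every collection")
    attrs = {"xmlns": ARCHIVE_NS}
    for name, value in (("with", with_jid), ("start", start), ("end", end)):
        if value is not None:
            attrs[name] = value
    iq = ET.Element("iq", {"type": "set", "to": to, "id": iq_id})
    ET.SubElement(iq, "remove", attrs)
    return ET.tostring(iq, encoding="unicode")
```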
This section describes how a client MAY replicate an archive locally. [15] The existence of a local copy of the archive enables clients to search the content of all messages (including collections saved by another client machine). [16]
The client MAY 'synchronize' its local copy of the archive with the 'master' archive on the server at any time. The first step is to request the list of collections that the server has changed (created, modified or removed) in its master archive since the last update to the client's copy of the archive.
The client MUST request each page of the list using the Result Set Management protocol embedded in a <modified/> element. The content of the <after/> element SHOULD be a UTC time (see Jabber Date and Time Profiles) that it has previously received from the server (see below). When synchronizing for the first time, the client MAY choose a suitable time for the first page request (e.g. 1970-01-01T00:00:00Z).
<iq type='get' to='montague.net' id='sync1'> <modified xmlns='http://jabber.org/protocol/archive'> <set xmlns='http://jabber.org/protocol/rsm'> <max>50</max> <after>1469-07-21T01:14:47Z</after> </set> </modified> </iq>
The server MUST return the changed collections in the chronological order in which they were changed (most recent last). If a collection has been modified, created or removed after the time specified by the <after/> element then the server MUST include it in the returned result set page of collections (unless the specified maximum page size would be exceeded). Each <changed/> or <removed/> collection element (for modified/created, or removed collections respectively) in the returned list MUST include only 'with' and 'start' attributes. The server MUST set the content of the <last/> element to the UTC time (see Jabber Date and Time Profiles) that the last collection on the page was modified.
<iq type='result' to='romeo@montague.net/orchard' from='montague.net' id='sync1'> <modified xmlns='http://jabber.org/protocol/archive'> <changed with='juliet@capulet.com/chamber' start='1469-07-21T02:56:15Z'/> . [up to 48 more collections] . <removed with='balcony@house.capulet.com' start='1469-07-21T03:16:37Z'/> <set xmlns='http://jabber.org/protocol/rsm'> <last>1469-07-21T04:22:39Z</last> <count>1372</count> </set> </modified> </iq>
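The following sketch shows one way a client could read such a result page, assuming the stanza is available as a string. The name `parse_modified_page` and the tuple layout are hypothetical, and `SAMPLE_PAGE` is the example above with the elided collections left out.

```python
import xml.etree.ElementTree as ET

ARCHIVE = "{http://jabber.org/protocol/archive}"
RSM = "{http://jabber.org/protocol/rsm}"

SAMPLE_PAGE = """
<iq type='result' to='romeo@montague.net/orchard' from='montague.net' id='sync1'>
  <modified xmlns='http://jabber.org/protocol/archive'>
    <changed with='juliet@capulet.com/chamber' start='1469-07-21T02:56:15Z'/>
    <removed with='balcony@house.capulet.com' start='1469-07-21T03:16:37Z'/>
    <set xmlns='http://jabber.org/protocol/rsm'>
      <last>1469-07-21T04:22:39Z</last>
      <count>1372</count>
    </set>
  </modified>
</iq>
"""


def parse_modified_page(xml_text):
    """Return (changes, last) where changes is a list of (kind, with, start)."""
    iq = ET.fromstring(xml_text)
    modified = iq.find(ARCHIVE + "modified")
    changes = []
    last = None
    for child in modified:
        if child.tag == ARCHIVE + "changed":
            changes.append(("changed", child.get("with"), child.get("start")))
        elif child.tag == ARCHIVE + "removed":
            changes.append(("removed", child.get("with"), child.get("start")))
        elif child.tag == RSM + "set":
            # <last/> carries the time to send in <after/> on the next request.
            last = child.findtext(RSM + "last")
    return changes, last


if __name__ == "__main__":
    page_changes, page_last = parse_modified_page(SAMPLE_PAGE)
    print(page_changes, page_last)
```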
Note: The server should remember the 'with' and 'start' attributes and the time of removal of all deleted collections. If this 'state' cannot be maintained indefinitely, a removal will not be reflected in every local copy of the archive unless all of the user's clients replicate before the server discards its record of that removal.
Note: Along with its copy of the archive the client SHOULD store the most recent <last/> time that it received from the server. The next time it synchronizes with the server it SHOULD specify that time when requesting the first result set page (see above).
After receiving each result set page the client SHOULD delete from its local archive any collections that have been removed from the master archive. The client should also retrieve from the server the content of each collection that has been modified (see Retrieving a Collection) and add it to its local copy of the archive (deleting any older version of the same collection that it may already have).
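The per-page bookkeeping described above might look like the sketch below. It assumes the local archive is held in a plain dictionary and that `fetch_collection` is a hypothetical callable that performs the retrieval described in Retrieving a Collection; neither is defined by this document.

```python
def apply_page(local, changes, last, fetch_collection):
    """Apply one result set page to the local copy of the archive.

    local            -- dict with keys 'collections' (a dict keyed by
                        (with, start)) and 'last' (the stored <last/> time)
    changes          -- list of (kind, with, start), kind is 'changed' or 'removed'
    last             -- content of the <last/> element on this page, or None
    fetch_collection -- hypothetical callable(with, start) returning the
                        current content of a collection from the server
    """
    for kind, with_jid, start in changes:
        key = (with_jid, start)
        if kind == "removed":
            # Removed from the master archive, so drop it locally as well.
            local["collections"].pop(key, None)
        else:
            # Created or modified: overwrite any older local version.
            local["collections"][key] = fetch_collection(with_jid, start)
    if last is not None:
        # Remember where to resume on the next synchronization.
        local["last"] = last
    return local


if __name__ == "__main__":
    archive = {"collections": {("balcony@house.capulet.com", "1469-07-21T03:16:37Z"): "old"},
               "last": "1469-07-21T01:14:47Z"}
    page = [("changed", "juliet@capulet.com/chamber", "1469-07-21T02:56:15Z"),
            ("removed", "balcony@house.capulet.com", "1469-07-21T03:16:37Z")]
    apply_page(archive, page, "1469-07-21T04:22:39Z", lambda w, s: "content of %s" % w)
    print(archive)
```

Storing the <last/> value only after the page has been applied means an interrupted synchronization is simply repeated from the previous point.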
Note: The file format specified in this section is likely to be deprecated once a standards-based format has been published in a separate JEP.
So that clients can share archived messages, this document specifies a common format for storage on disk (similar to email formats like mbox and Maildir). The file format uses the same XML constructs as the protocol. Each file may contain messages exchanged with a single JID. Any number of items may be stored in an archive file.
<?xml version='1.0'?> <archive xmlns='http://jabber.org/protocol/archive' with='juliet@capulet.com'> <store start='1469-07-21T02:56:15Z' subject='She speaks!'> <from secs='0'><body>Art thou not Romeo, and a Montague?</body></from> <to secs='11'><body>Neither, fair saint, if either thee dislike.</body></to> <from secs='14'><body>How cam'st thou hither, tell me, and wherefore?</body></from> </store> </archive>
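A minimal reader for this file format could be sketched as follows. The function name `read_archive` and the returned dictionary layout are illustrative assumptions; the 'secs' values are returned as integers without further interpretation.

```python
import xml.etree.ElementTree as ET

NS = "{http://jabber.org/protocol/archive}"

SAMPLE_FILE = """<?xml version='1.0'?>
<archive xmlns='http://jabber.org/protocol/archive' with='juliet@capulet.com'>
  <store start='1469-07-21T02:56:15Z' subject='She speaks!'>
    <from secs='0'><body>Art thou not Romeo, and a Montague?</body></from>
    <to secs='11'><body>Neither, fair saint, if either thee dislike.</body></to>
    <from secs='14'><body>How cam'st thou hither, tell me, and wherefore?</body></from>
  </store>
</archive>
"""


def read_archive(xml_text):
    """Parse an archive file into a list of collections (plain dicts)."""
    root = ET.fromstring(xml_text)
    collections = []
    for store in root.findall(NS + "store"):
        messages = []
        for item in store:
            direction = item.tag[len(NS):] if item.tag.startswith(NS) else item.tag
            if direction in ("from", "to"):
                bodies = [body.text or "" for body in item.findall(NS + "body")]
                messages.append((direction, int(item.get("secs")), bodies))
        collections.append({
            # Fall back to the 'with' attribute of the enclosing <archive/>.
            "with": store.get("with") or root.get("with"),
            "start": store.get("start"),
            "subject": store.get("subject"),
            "messages": messages,
        })
    return collections


if __name__ == "__main__":
    for collection in read_archive(SAMPLE_FILE):
        print(collection["with"], collection["start"], len(collection["messages"]))
```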
When creating a new collection, it is RECOMMENDED that the client synchronize the collection start time that it sends to the server with server time. This is important since the user may subsequently retrieve the stored collection using client machines whose UTC clocks are not synchronized with the client machine that stored the collection (i.e. either or both of the clients' UTC clocks may be wrong). The client can achieve this synchronization with server time by using Entity Time [17] to estimate the difference between the server and client UTC clocks.
When retrieving collections, it is RECOMMENDED that the client adjust the start times of the collections it receives from the server to be synchronized with the clock of the client machine.
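One possible way to estimate and apply the clock difference is sketched below. The midpoint method (which assumes the request and the reply take about equally long) and all function names are assumptions of this sketch, not something the protocol prescribes; obtaining the server's time via Entity Time is left out.

```python
from datetime import datetime, timedelta, timezone


def estimate_offset(sent_at, received_at, server_time):
    """Estimate (server clock - client clock) as a timedelta.

    sent_at and received_at are client-clock readings taken just before the
    time request was sent and just after the reply arrived; server_time is
    the UTC time reported by the server. Assumes symmetric network delay.
    """
    midpoint = sent_at + (received_at - sent_at) / 2
    return server_time - midpoint


def to_server_clock(client_time, offset):
    """Convert a client-clock time (e.g. a new collection start) to server time."""
    return client_time + offset


def to_client_clock(server_time, offset):
    """Convert a start time received from the server to the local clock."""
    return server_time - offset


if __name__ == "__main__":
    utc = timezone.utc
    offset = estimate_offset(
        datetime(2006, 9, 8, 12, 0, 0, tzinfo=utc),
        datetime(2006, 9, 8, 12, 0, 2, tzinfo=utc),
        datetime(2006, 9, 8, 12, 0, 31, tzinfo=utc),
    )
    print(offset)
```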
When uploading messages using manual archiving, a client SHOULD NOT store one message at a time on the server since this increases both bandwidth consumption and the total number of transactions. It is instead RECOMMENDED that clients store messages only when the conversation thread appears to be terminated, e.g. when the user closes the chat window. If the user reopens the window and the thread continues then the client should append the new messages to the collection when the user closes the window again.
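The batching recommendation above could be implemented along these lines. `ThreadBuffer` and its `upload` callback are hypothetical names; the callback stands in for whatever code saves or appends messages to the collection on the server.

```python
class ThreadBuffer:
    """Collects the messages of one conversation until the chat window closes."""

    def __init__(self, upload):
        # upload: hypothetical callable that stores a list of messages in
        # (or appends them to) the collection on the server in one request.
        self._upload = upload
        self._pending = []

    def add(self, direction, secs, body):
        """Queue one message locally instead of uploading it immediately."""
        self._pending.append((direction, secs, body))

    def on_window_closed(self):
        """Upload everything queued since the last upload, if anything."""
        if self._pending:
            self._upload(list(self._pending))
            self._pending.clear()


if __name__ == "__main__":
    sent = []
    buffer = ThreadBuffer(sent.append)
    buffer.add("from", 0, "Art thou not Romeo, and a Montague?")
    buffer.add("to", 11, "Neither, fair saint, if either thee dislike.")
    buffer.on_window_closed()   # one upload carrying two messages
    buffer.add("from", 14, "How cam'st thou hither, tell me, and wherefore?")
    buffer.on_window_closed()   # the thread continued: one more upload
    print(len(sent))
```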
Server implementations SHOULD give system administrators the option to disable support for both automated and manual archiving, since archived conversations can consume significant storage space.
Since the subject of each collection will not be encrypted, the client MUST warn its human user (if any) before including 'subject' attributes on encrypted collections.
The client that originates a message MAY specify a 'false' value for the 'store' header (see Stanza Headers and Internet Metadata [18]). The recipient MUST NOT archive such a message or any of the information it contains.
If the sender plans to use 'store' headers it MUST use Service Discovery to determine whether or not the recipient supports them. Note: Since servers are not required to check the content of message stanzas for headers, if the recipient is using automatic archiving then it MUST indicate that it does not support 'store' headers.
If the recipient does not support 'store' headers, then the sender MUST confirm with its human user (if any) before sending such a message.
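The 'store' header rules above reduce to two small decisions, sketched here under simplifying assumptions: headers are represented as a plain dictionary, the result of the Service Discovery query is passed in as a boolean, and `confirm_with_user` is a hypothetical callback that asks the human user. None of these names come from the protocol.

```python
def recipient_may_archive(headers):
    """Recipient side: a message whose 'store' header is 'false' is never archived."""
    return headers.get("store") != "false"


def sender_may_send(recipient_supports_store, confirm_with_user):
    """Sender side: decide whether a message carrying store='false' may be sent.

    recipient_supports_store -- outcome of the Service Discovery check
    confirm_with_user        -- hypothetical callable returning True if the
                                user agrees to send the message anyway
    """
    if recipient_supports_store:
        return True
    # The recipient may ignore the header, so the user has to confirm first.
    return bool(confirm_with_user())


if __name__ == "__main__":
    print(recipient_may_archive({"store": "false"}))
    print(sender_may_send(False, lambda: False))
```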
No interaction with the Internet Assigned Numbers Authority (IANA) [19] is required as a result of this JEP.
The Jabber Registrar [20] shall include 'http://jabber.org/protocol/archive' in its registry of protocol namespaces (see <http://www.jabber.org/registrar/namespaces.html>):
The Jabber Registrar shall include the following features in its registry of service discovery features (see <http://www.jabber.org/registrar/disco-features.html>):
The Jabber Registrar shall include 'RSA-KEM-KDF2-SHA256' in its registry of cryptography scheme names with the following description:
The encrypting entity uses the RSA-KEM.Encrypt algorithm and the KDF2 key derivation function and the SHA256-based HMAC algorithm (see ISO 18033-2) along with the entity's public RSA key (i.e. the key whose identity is the 'master' attribute of the <store/> element) to generate a secret symmetric encryption key, K, and an RSA-encrypted version of the key, C (i.e. the 'key' attribute of the <store/> element).
The decrypting entity uses the RSA-KEM.Decrypt algorithm and the KDF2 key derivation function and the SHA256-based HMAC algorithm (see ISO 18033-2) along with the entity's private RSA key to decrypt C to K.
The Jabber Registrar shall include 'DEM1-SC2-KDF2-SHA256' in its registry of cryptography scheme names with the following description:
The encrypting entity uses the DEM1.Encrypt algorithm with the SC2.Encrypt symmetric encryption algorithm and the KDF2 key derivation function and the SHA256-based HMAC algorithm (see ISO 18033-2) to encrypt the data, M (i.e. the complete sequence of <from/>, <to/> and <note/> elements), with the secret symmetric encryption key, K, and a randomly generated public label, L (i.e. the 'label' attribute of the <crypt/> element).
Note that the encrypting entity MAY use the same key, K, for more than one collection. But it MUST use the label, L, with only one plain text, M.
The decrypting entity uses the DEM1.Decrypt algorithm with the SC2.Decrypt symmetric decryption algorithm and the KDF2 key derivation function and the SHA256-based HMAC algorithm (see ISO 18033-2) to decrypt the data with K and L.
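Both scheme descriptions above rely on the KDF2 key derivation function with SHA-256. The sketch below shows the construction as it is commonly described (hash the input followed by a 4-byte big-endian counter that starts at 1, then concatenate and truncate); it is an illustration only and should be checked against ISO 18033-2 before use. It does not implement RSA-KEM, DEM1 or SC2, and the function name is hypothetical.

```python
import hashlib


def kdf2_sha256(shared_secret: bytes, length: int) -> bytes:
    """Derive 'length' bytes from 'shared_secret' using a KDF2-style construction.

    Assumption: the counter is a 4-byte big-endian integer starting at 1 and
    no additional shared information is appended to the hash input.
    """
    if length < 0:
        raise ValueError("length must not be negative")
    output = b""
    counter = 1
    while len(output) < length:
        block_input = shared_secret + counter.to_bytes(4, "big")
        output += hashlib.sha256(block_input).digest()
        counter += 1
    return output[:length]


if __name__ == "__main__":
    # Example input only; a real secret would come from RSA-KEM.
    print(kdf2_sha256(b"example shared secret", 48).hex())
```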
<?xml version='1.0' encoding='UTF-8'?> <xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' targetNamespace='http://jabber.org/protocol/archive' xmlns='http://jabber.org/protocol/archive' elementFormDefault='qualified'> <xs:annotation> <xs:documentation> The allowable root element for the namespace defined herein are: - archive - list - otr - remove - retrieve - save - store </xs:documentation> </xs:annotation> <xs:element name='archive'> <xs:complexType> <xs:sequence> <xs:element ref='store' minOccurs='1' maxOccurs='unbounded'/> </xs:sequence> <xs:attribute name='with' type='xs:string' use='optional'/> </xs:complexType> </xs:element> <xs:element name='auto'> <xs:complexType> <xs:simpleContent> <xs:extension base='empty'> <xs:attribute name='dataalg' type='xs:string' use='optional'/> <xs:attribute name='key' type='xs:string' use='optional'/> <xs:attribute name='keyalg' type='xs:string' use='optional'/> <xs:attribute name='master' type='xs:string' use='optional'/> <xs:attribute name='save' type='xs:boolean' use='required'/> <xs:attribute name='secret' use='optional' type='xs:string'/> </xs:extension> </xs:simpleContent> </xs:complexType> </xs:element> <xs:element name='changed'> <xs:complexType> <xs:simpleContent> <xs:extension base='empty'> <xs:attribute name='start' type='xs:dateTime' use='required'/> <xs:attribute name='with' type='xs:string' use='required'/> </xs:extension> </xs:simpleContent> </xs:complexType> </xs:element> <xs:element name='crypt'> <xs:complexType> <xs:simpleContent> <xs:extension base='xs:string'> <xs:attribute name='label' type='xs:string' use='optional'/> </xs:extension> </xs:simpleContent> </xs:complexType> </xs:element> <xs:element name='default'> <xs:complexType> <xs:simpleContent> <xs:extension base='empty'> <xs:attribute name='otr' use='required'> <xs:simpleType> <xs:restriction base='xs:NCName'> <xs:enumeration value='allow'/> <xs:enumeration value='deny'/> <xs:enumeration value='require'/> <xs:enumeration value='try'/> 
</xs:restriction> </xs:simpleType> </xs:attribute> <xs:attribute name='save' use='required'> <xs:simpleType> <xs:restriction base='xs:NCName'> <xs:enumeration value='all'/> <xs:enumeration value='body'/> <xs:enumeration value='false'/> </xs:restriction> </xs:simpleType> </xs:attribute> <xs:attribute name='unset' use='optional' type='xs:boolean'/> </xs:extension> </xs:simpleContent> </xs:complexType> </xs:element> <xs:element name='item'> <xs:complexType> <xs:simpleContent> <xs:extension base='empty'> <xs:attribute name='jid' use='required' type='xs:string'/> <xs:attribute name='otr' use='required'> <xs:simpleType> <xs:restriction base='xs:NCName'> <xs:enumeration value='allow'/> <xs:enumeration value='deny'/> <xs:enumeration value='require'/> <xs:enumeration value='try'/> </xs:restriction> </xs:simpleType> </xs:attribute> <xs:attribute name='save' use='required'> <xs:simpleType> <xs:restriction base='xs:NCName'> <xs:enumeration value='all'/> <xs:enumeration value='body'/> <xs:enumeration value='false'/> </xs:restriction> </xs:simpleType> </xs:attribute> </xs:extension> </xs:simpleContent> </xs:complexType> </xs:element> <xs:element name='list'> <xs:complexType> <xs:sequence> <xs:element ref='store' minOccurs='0' maxOccurs='unbounded'/> </xs:sequence> <xs:attribute name='end' type='xs:dateTime' use='optional'/> <xs:attribute name='master' type='xs:string' use='optional'/> <xs:attribute name='start' type='xs:dateTime' use='optional'/> <xs:attribute name='with' type='xs:string' use='optional'/> </xs:complexType> </xs:element> <xs:element name='modified'> <xs:complexType> <xs:sequence> <xs:element ref='changed' minOccurs='0' maxOccurs='unbounded'/> <xs:element ref='removed' minOccurs='0' maxOccurs='unbounded'/> </xs:sequence> </xs:complexType> </xs:element> <xs:element name='pref'> <xs:complexType> <xs:sequence> <xs:element ref='default' minOccurs='0' maxOccurs='1'/> <xs:element ref='item' minOccurs='0' maxOccurs='unbounded'/> </xs:sequence> </xs:complexType> 
</xs:element> <xs:element name='remove'> <xs:complexType> <xs:simpleContent> <xs:extension base='empty'> <xs:attribute name='end' type='xs:dateTime' use='optional'/> <xs:attribute name='start' type='xs:dateTime' use='required'/> <xs:attribute name='with' type='xs:string' use='required'/> </xs:extension> </xs:simpleContent> </xs:complexType> </xs:element> <xs:element name='removed'> <xs:complexType> <xs:simpleContent> <xs:extension base='empty'> <xs:attribute name='start' type='xs:dateTime' use='required'/> <xs:attribute name='with' type='xs:string' use='required'/> </xs:extension> </xs:simpleContent> </xs:complexType> </xs:element> <xs:element name='retrieve'> <xs:complexType> <xs:simpleContent> <xs:extension base='empty'> <xs:attribute name='start' type='xs:dateTime' use='required'/> <xs:attribute name='with' type='xs:string' use='required'/> </xs:extension> </xs:simpleContent> </xs:complexType> </xs:element> <xs:element name='store'> <xs:complexType> <xs:choice maxOccurs='unbounded'> <xs:element ref='crypt'/> <xs:element name='from' type='messageType'/> <xs:element name='note' type='xs:string'/> <xs:element name='to' type='messageType'/> </xs:choice> <xs:attribute name='dataalg' type='xs:string' use='optional'/> <xs:attribute name='key' type='xs:string' use='optional'/> <xs:attribute name='keyalg' type='xs:string' use='optional'/> <xs:attribute name='master' type='xs:string' use='optional'/> <xs:attribute name='start' type='xs:dateTime' use='required'/> <xs:attribute name='subject' type='xs:string' use='optional'/> <xs:attribute name='with' type='xs:string' use='required'/> </xs:complexType> </xs:element> <xs:complexType name='messageType'> <xs:sequence> <xs:element name='body' type='xs:string' maxOccurs='unbounded'/> </xs:sequence> <xs:attribute name='secs' type='xs:nonNegativeInteger' use='required'/> </xs:complexType> <xs:simpleType name='empty'> <xs:restriction base='xs:string'> <xs:enumeration value=''/> </xs:restriction> </xs:simpleType> </xs:schema>
1. JEP-0030: Service Discovery <http://www.jabber.org/jeps/jep-0030.html>.
2. JEP-0155: Chat Session Negotiation <http://www.jabber.org/jeps/jep-0155.html>.
3. If a client (or user) acts in bad faith then its contacts cannot prevent it archiving conversations.
4. JEP-0116: Encrypted Sessions <http://www.jabber.org/jeps/jep-0116.html>.
5. JEP-0082: Jabber Date and Time Profiles <http://www.jabber.org/jeps/jep-0082.html>.
6. Stream compression typically does not mitigate bandwidth and storage issues since collections SHOULD be encrypted, and since clients running in constrained runtime environments typically cannot take advantage of stream compression (no binary data, only XML, may be transferred).
7. JEP-0045: Multi-User Chat <http://www.jabber.org/jeps/jep-0045.html>.
8. SHA-1 is broken (assuming the attacker has plenty of computing power) and many other standard hashes are not optimised for 32-bit processors (e.g. Whirlpool, SHA-384, SHA-512).
9. Future versions of this document MAY be modified to recommend other algorithms.
10. Mechanisms for the storage and retrieval of private keys are beyond the scope of this document.
11. RSA-KEM is the only required encapsulation scheme since it is NESSIE-recommended, its security is tightly proven (unlike RSA-OAEP or PKCS #1 v1.5), and it is very simple to implement.
12. Future versions of this document MAY be modified to recommend other algorithms.
13. JEP-0116: Encrypted Sessions <http://www.jabber.org/jeps/jep-0116.html>.
14. JEP-0059: Result Set Management <http://www.jabber.org/jeps/jep-0059.html>.
15. Clients that run in constrained environments may not be able to implement replication if they are prevented from accessing (sufficient) local storage.
16. Since collections should be stored in encrypted form on the server, server-side searching of the content of messages is beyond the scope of this protocol.
17. JEP-0090: Entity Time <http://www.jabber.org/jeps/jep-0090.html>.
18. JEP-0131: Stanza Headers and Internet Metadata <http://www.jabber.org/jeps/jep-0131.html>.
19. The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique parameter values for Internet protocols, such as port numbers and URI schemes. For further information, see <http://www.iana.org/>.
20. The Jabber Registrar maintains a list of reserved Jabber protocol namespaces as well as registries of parameters used in the context of protocols approved by the Jabber Software Foundation. For further information, see <http://www.jabber.org/registrar/>.
Added preferences, results set management and notes; reinstated encryption and replication; simplified auto-archiving and off-the-record (with JEP-0155); many minor changes. (ip)
Added unset value for save attribute and added service attribute on default element; added source attribute on record element; specified that services should (not must) support save mode for particular contacts. (jp/psa)
Integrated text from server-side archiving proposal; added partial support to collection retrieval; harmonized XML formats and namespaces; defined Jabber Registrar considerations and XML schema. (psa/jp/jk)
Added Replication and Searching section, partial attribute; minor improvements. (ip)
Added more examples to Removing Collections. (ip)
Complete rewrite. (ip)
Initial version. (jk)