OMEMO Encryption (XEP-0384), despite already being deployed in multiple clients, currently suffers from the limitation of only being able to encrypt the message body. The current strategy for a mid-term solution is to gather experience with stanza content encryption by implementing OpenPGP for XMPP (XEP-0373) and then later apply the gathered knowledge to OMEMO. However, end users are demanding working, end-to-end encrypted media sharing right now. For that reason client developers came up with a temporary workaround that utilizes HTTP File Upload (XEP-0363) and puts the resulting URL and a symmetric key in the body of an OMEMO message. This XEP describes the technical details of the workaround.
An entity wishing to share an end-to-end encrypted file first generates a 32-byte random key and a 12-byte random IV. After successfully requesting a slot for HTTP upload, the file can be encrypted with AES-256 in Galois/Counter Mode (GCM) on the fly while uploading it via HTTP. The authentication tag MUST be appended to the end of the file.
To share the file, the entity converts the HTTPS URL, the key, and the IV to an aesgcm:// URL. The IV and the key are converted to their hex representations of 24 and 64 characters respectively and concatenated for a total of 88 characters (44 bytes). The IV comes first, followed by the key. The resulting string is put in the anchor (fragment) part of the aesgcm URL.
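The conversion described above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `build_aesgcm_url` and the example host are hypothetical, and lowercase hex output is an assumption, since the text does not specify a case.

```python
import secrets
from urllib.parse import urlsplit


def build_aesgcm_url(https_url: str, iv: bytes, key: bytes) -> str:
    """Convert an HTTPS upload URL plus IV and key into an aesgcm:// URL."""
    if len(iv) != 12 or len(key) != 32:
        raise ValueError("expected a 12-byte IV and a 32-byte key")
    parts = urlsplit(https_url)
    # Non-HTTPS URLs must not be converted.
    if parts.scheme != "https":
        raise ValueError("only https:// URLs may be converted")
    if parts.fragment:
        raise ValueError("upload URL already carries a fragment")
    # IV first, then key: 24 + 64 = 88 hex characters.
    fragment = iv.hex() + key.hex()
    query = "?" + parts.query if parts.query else ""
    return "aesgcm://" + parts.netloc + parts.path + query + "#" + fragment


if __name__ == "__main__":
    iv = secrets.token_bytes(12)   # random 12-byte IV
    key = secrets.token_bytes(32)  # random 32-byte key
    print(build_aesgcm_url("https://upload.example.org/slot/photo.jpg", iv, key))
```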
Note: HTTP Upload has transport encryption as a MUST. Non-HTTPS URLs MUST NOT be converted to the aesgcm URL scheme.
The resulting aesgcm URL is encrypted as an OMEMO message and sent to the recipient(s).
The sending entity MAY also generate a thumbnail as a JPEG data URI and include that in the same message. The aesgcm:// URL and the data:image/jpeg, URI are separated by a new line character. The message SHOULD NOT include anything else. The JPEG thumbnail SHOULD be kept small (approximately 5 KiB) to not run into stanza size limitations. As a result, the thumbnail is considered to be only a very blurry, very rough representation of the image.
The parser on the receiving end should be very strict and only display as shared media those OMEMO messages that contain either a valid aesgcm URL alone or a valid aesgcm URL followed by a valid data URI, separated by a single new line character.
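A strict check of this kind could look like the sketch below. The function name `parse_shared_media` and both regular expressions are illustrative assumptions: the text does not define the exact grammar of a "valid" aesgcm URL or data URI, so this sketch only requires an 88-character hex fragment (either case) and the `data:image/jpeg,` prefix followed by non-whitespace characters.

```python
import re
from typing import Optional, Tuple

# Assumed shapes; the specification text does not give a formal grammar.
AESGCM_RE = re.compile(r"aesgcm://[^\s#]+#[0-9a-fA-F]{88}")
DATA_URI_RE = re.compile(r"data:image/jpeg,\S+")


def parse_shared_media(body: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (aesgcm_url, thumbnail_or_None), or None if the body is not shared media."""
    lines = body.split("\n")
    # Only one line (URL) or two lines (URL + thumbnail) are acceptable.
    if len(lines) not in (1, 2):
        return None
    if not AESGCM_RE.fullmatch(lines[0]):
        return None
    if len(lines) == 1:
        return lines[0], None
    if not DATA_URI_RE.fullmatch(lines[1]):
        return None
    return lines[0], lines[1]
```

Anything that returns `None` here would be rendered as an ordinary text message, not as a media attachment.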
Traditional media sharing with HTTP Upload uses Out-of-Band Data (XEP-0066) to repeat the URL from the body, thereby communicating that the URL is in fact meant as a media attachment as opposed to a clickable link. For the aesgcm URL scheme no such annotation is necessary, as aesgcm URLs are considered unique enough and are never supposed to stand alone in a message.
When requesting the HTTP Upload slot and attempting on-the-fly encryption, the requesting entity MUST take into account that the encrypted file is larger than the original file because of the appended authentication tag (GCM itself adds no padding to the ciphertext). Most crypto libraries should have a method to calculate the size of the resulting file.
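As a rough sketch of that calculation: GCM encrypts in counter mode, so the ciphertext has the same length as the plaintext and only the tag adds to the size. The 16-byte tag length below is an assumption (a common default for AES-GCM), since the text does not fix a tag length, and the name `encrypted_size` is hypothetical.

```python
GCM_TAG_LENGTH = 16  # assumption: 128-bit tag; not fixed by the text above


def encrypted_size(plain_size: int, tag_length: int = GCM_TAG_LENGTH) -> int:
    """Size in bytes to request for the upload slot: ciphertext plus appended tag."""
    if plain_size < 0:
        raise ValueError("file size cannot be negative")
    # GCM ciphertext length equals plaintext length; the tag is appended.
    return plain_size + tag_length
```

Where the crypto library in use exposes its own output-size method, that method is the safer source for this value.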
An aesgcm URL MUST never be linkified, and clients MUST NOT offer another direct way for users to open it in a browser, as this could leak the anchor with the encryption key to the server operator. This is also the reason the aesgcm URL scheme was chosen in the first place: to prevent users from accidentally opening an HTTP URL in the browser.
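One way a receiving client might handle this internally is to split the aesgcm URL into a plain HTTPS download URL and the key material before any network request is made, so that the anchor never leaves the client. The sketch below is illustrative only; the function name `split_aesgcm_url` is hypothetical.

```python
from typing import Tuple
from urllib.parse import urlsplit


def split_aesgcm_url(url: str) -> Tuple[str, bytes, bytes]:
    """Return (https_download_url, iv, key); the download URL carries no fragment."""
    parts = urlsplit(url)
    if parts.scheme != "aesgcm":
        raise ValueError("not an aesgcm URL")
    fragment = parts.fragment
    if len(fragment) != 88:
        raise ValueError("anchor must be 88 hex characters")
    raw = bytes.fromhex(fragment)  # raises ValueError on non-hex input
    if len(raw) != 44:
        raise ValueError("anchor must decode to 44 bytes")
    iv, key = raw[:12], raw[12:]  # IV comes first, followed by the key
    query = "?" + parts.query if parts.query else ""
    # The fragment is deliberately dropped from the URL used for the download.
    return "https://" + parts.netloc + parts.path + query, iv, key
```

The returned IV and key stay in memory for decryption, while only the fragment-free HTTPS URL is handed to the HTTP client.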
This XMPP Extension Protocol is copyright © 1999 – 2020 by the XMPP Standards Foundation (XSF).
Permission is hereby granted, free of charge, to any person obtaining a copy of this specification (the "Specification"), to make use of the Specification without restriction, including without limitation the rights to implement the Specification in a software program, deploy the Specification in a network service, and copy, modify, merge, publish, translate, distribute, sublicense, or sell copies of the Specification, and to permit persons to whom the Specification is furnished to do so, subject to the condition that the foregoing copyright notice and this permission notice shall be included in all copies or substantial portions of the Specification. Unless separate permission is granted, modified works that are redistributed shall not contain misleading information regarding the authors, title, number, or publisher of the Specification, and shall not claim endorsement of the modified works by the authors, any organization or project to which the authors belong, or the XMPP Standards Foundation.
NOTE WELL: This Specification is provided on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE.
In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall the XMPP Standards Foundation or any author of this Specification be liable for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising from, out of, or in connection with the Specification or the implementation, deployment, or other use of the Specification (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if the XMPP Standards Foundation or such author has been advised of the possibility of such damages.
This XMPP Extension Protocol has been contributed in full conformance with the XSF's Intellectual Property Rights Policy (a copy of which can be found at <https://xmpp.org/about/xsf/ipr-policy> or obtained by writing to XMPP Standards Foundation, P.O. Box 787, Parker, CO 80134 USA).
The Extensible Messaging and Presence Protocol (XMPP) is defined in the XMPP Core (RFC 6120) and XMPP IM (RFC 6121) specifications contributed by the XMPP Standards Foundation to the Internet Standards Process, which is managed by the Internet Engineering Task Force in accordance with RFC 2026. Any protocol defined in this document has been developed outside the Internet Standards Process and is to be understood as an extension to XMPP rather than as an evolution, development, or modification of XMPP itself.
The primary venue for discussion of XMPP Extension Protocols is the <firstname.lastname@example.org> discussion list.
Discussion on other xmpp.org discussion lists might also be appropriate; see <http://xmpp.org/about/discuss.shtml> for a complete list.
Errata can be sent to <email@example.com>.
The following requirements keywords as used in this document are to be interpreted as described in RFC 2119: "MUST", "SHALL", "REQUIRED"; "MUST NOT", "SHALL NOT"; "SHOULD", "RECOMMENDED"; "SHOULD NOT", "NOT RECOMMENDED"; "MAY", "OPTIONAL".
Note: Older versions of this specification might be available at http://xmpp.org/extensions/attic/