<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type='text/xsl' href='xep.xsl'?>
<xep xmlns="">
<header>
  <title>Data Policy</title>
  <abstract>This document specifies metadata on how an entity handles its data (encryption, data retention, etc).</abstract>
  
<legal>
<copyright>This XMPP Extension Protocol is copyright © 1999 – 2024 by the <link url="https://xmpp.org/">XMPP Standards Foundation</link> (XSF).</copyright>
<permissions>Permission is hereby granted, free of charge, to any person obtaining a copy of this specification (the "Specification"), to make use of the Specification without restriction, including without limitation the rights to implement the Specification in a software program, deploy the Specification in a network service, and copy, modify, merge, publish, translate, distribute, sublicense, or sell copies of the Specification, and to permit persons to whom the Specification is furnished to do so, subject to the condition that the foregoing copyright notice and this permission notice shall be included in all copies or substantial portions of the Specification. Unless separate permission is granted, modified works that are redistributed shall not contain misleading information regarding the authors, title, number, or publisher of the Specification, and shall not claim endorsement of the modified works by the authors, any organization or project to which the authors belong, or the XMPP Standards Foundation.</permissions>
<warranty>## NOTE WELL: This Specification is provided on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. ##</warranty>
<liability>In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall the XMPP Standards Foundation or any author of this Specification be liable for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising from, out of, or in connection with the Specification or the implementation, deployment, or other use of the Specification (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if the XMPP Standards Foundation or such author has been advised of the possibility of such damages.</liability>
<conformance>This XMPP Extension Protocol has been contributed in full conformance with the XSF's Intellectual Property Rights Policy (a copy of which can be found at &lt;<link url="https://xmpp.org/about/xsf/ipr-policy">https://xmpp.org/about/xsf/ipr-policy</link>&gt; or obtained by writing to XMPP Standards Foundation, P.O. Box 787, Parker, CO 80134 USA).</conformance>
</legal>
  <number>0504</number>
  <status>Experimental</status>
  <type>Standards Track</type>
  <sig>Standards</sig>
  <approver>Council</approver>
  <dependencies>
    <spec>XMPP Core</spec>
    <spec>XEP-0030</spec>
    <spec>XEP-0128</spec>
    <spec>XEP-0080</spec>
  </dependencies>
  <supersedes/>
  <supersededby/>
  <shortname>data-policy</shortname>
  
  <author>
    <firstname>Jérôme</firstname>
    <surname>Poisson</surname>
    <email>goffi@goffi.org</email>
    <jid>goffi@jabber.fr</jid>
    <uri>https://www.goffi.org</uri>
  </author>

    <revision>
    <version>0.1.0</version>
    <date>2025-07-09</date>
    <initials>XEP Editor: dg</initials>
      <remark><p>Accepted as Experimental by Council vote on 2025-07-08</p>
    </remark>
  </revision>
  <revision>
    <version>0.0.1</version>
    <date>2025-06-13</date>
    <initials>jp</initials>
    <remark><p>First draft.</p></remark>
  </revision>
</header>
<section1 topic="Introduction" anchor="intro">
  <p>It is important for users of a service to know how their data are handled: where the data are stored, whether and how encryption is used, which jurisdiction applies, etc.</p>
  <p>This document specifies fields to use with <span class="ref"><link url="https://xmpp.org/extensions/xep-0128.html">Service Discovery Extensions (XEP-0128)</link></span> <note>XEP-0128: Service Discovery Extensions &lt;<link url="https://xmpp.org/extensions/xep-0128.html">https://xmpp.org/extensions/xep-0128.html</link>&gt;.</note> to expose that information; these fields are usable with any kind of XMPP entity (servers, gateways, pubsub services, etc.). Services are expected to fill in these data accurately, and XMPP clients are expected to present them to end-users in an easy-to-understand way.</p>
</section1>
<section1 topic="Requirements" anchor="reqs">
  <p>The goals of this specification are:</p>
  <ul>
    <li>Expose enough data policy information so end-users can obtain detailed information on how their data are handled.</li>
    <li>Be usable with any kind of XMPP entity.</li>
    <li>Reuse existing specifications as much as possible to expose those data.</li>
    <li>Avoid duplicating information: if information is already easily available with another XMPP specification/disco feature, and no other important information is needed, do not duplicate it.</li>
  </ul>
</section1>
<section1 topic="Exposing Data Policy" anchor="data_policy">
  <p>Data policy is exposed via <span class="ref"><link url="https://xmpp.org/extensions/xep-0128.html">Service Discovery Extensions (XEP-0128)</link></span> <note>XEP-0128: Service Discovery Extensions &lt;<link url="https://xmpp.org/extensions/xep-0128.html">https://xmpp.org/extensions/xep-0128.html</link>&gt;.</note>: in response to a disco#info query sent to the bare JID of the entity, the implementation MUST return a data form using the 'FORM_TYPE' of "urn:xmpp:data-policy:0" as specified in <span class="ref"><link url="https://xmpp.org/extensions/xep-0128.html">Service Discovery Extensions (XEP-0128)</link></span> <note>XEP-0128: Service Discovery Extensions &lt;<link url="https://xmpp.org/extensions/xep-0128.html">https://xmpp.org/extensions/xep-0128.html</link>&gt;.</note>.</p>
  <p>This form can be used with any kind of XMPP entity. It is especially expected to be used with entities having one of the following identity categories: "collaboration", "conference", "pubsub", "server", and "store".</p>
  <p>If the service proxies data to another service (e.g., a gateway, or a storage service using another provider), the data form with a 'FORM_TYPE' of "urn:xmpp:data-policy:0" applies to the policy of the service/gateway itself. The policy of the legacy network is specified in a separate form whose 'FORM_TYPE' is "urn:xmpp:data-policy:identity:&lt;category&gt;:&lt;type&gt;:0", where <em>&lt;category&gt;</em> and <em>&lt;type&gt;</em> are the category and type of the corresponding disco identity, as found in &lt;<link url="https://xmpp.org/registrar/disco-categories.html">https://xmpp.org/registrar/disco-categories.html</link>&gt;. That means that a service exposing data policy MUST have at least one main form with the "urn:xmpp:data-policy:0" 'FORM_TYPE' for the service itself, and SHOULD have one identity-scoped form per legacy service. This way, users can see exactly how their data are used when using a gateway with a specific legacy network.</p>
  <p>The rest of this section describes the fields that MAY be used in this form.</p>
  <section2 topic="Auth Mechanism" anchor="auth_mechanism">
    <p>A field indicates how authentication data (login/password or equivalent) is used. This field MUST have a 'var' attribute set to "auth_data", MUST be of type "list-single", and MUST have one of the following values:</p>
    <ul>
      <li><strong>no_auth</strong>: the service does not require authentication.</li>
      <li><strong>plain</strong>: the service receives the authentication data in plain text, meaning that it can access them and potentially copy them.</li>
      <li><strong>hidden</strong>: the auth data are used but not seen by the service, because a technique to hide them (such as using a challenge to verify them) is used.</li>
      <li><strong>restricted</strong>: no full authentication is used, instead a temporary access is given via a mechanism such as a token or OAuth, and the access is restricted in scope (i.e., what it is possible to do with the account) and time.</li>
    </ul>
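    <p>As a non-normative illustration, a service that verifies credentials through a challenge, and therefore never sees them, might include a field such as the following in its data policy form. The chosen value is only an example.</p>
    <example caption="Auth data field (illustrative)"><![CDATA[
<field var='auth_data' type='list-single'>
  <value>hidden</value>
</field>
    ]]></example>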
  </section2>
  <section2 topic="Encryption" anchor="encryption">
    <p>While clients can obviously already determine if they can use end-to-end encryption with a service, they have no way to know whether a service acting as a proxy/gateway sends their data encrypted to third-party services. For instance, a client can send data end-to-end encrypted to a gateway using <span class="ref"><link url="https://xmpp.org/extensions/xep-0384.html">OMEMO Encryption (XEP-0384)</link></span> <note>XEP-0384: OMEMO Encryption &lt;<link url="https://xmpp.org/extensions/xep-0384.html">https://xmpp.org/extensions/xep-0384.html</link>&gt;.</note>, and then the gateway decrypts it and may forward it in a totally different way, even unencrypted. This could give a false sense of security to the end-user, as the exchange would appear as end-to-end encrypted in most clients (while in reality it is only end-to-end encrypted between the client and the gateway/proxy).</p>
    <p>To make the situation clearer, a data transmission field SHOULD be used, indicating how the data is transmitted from and to legacy services (this field doesn't specify how data is transmitted between the XMPP client and the service itself). The field MUST have a 'var' attribute set to "data_transmission" and a type of "list-single". The value MUST be one of the following:</p>
    <ul>
      <li><strong>plain</strong>: the data is sent without any encryption. It may be viewed by anybody with access to the network between the XMPP service and the destination.</li>
      <li><strong>encrypted</strong>: the data is encrypted, but not end-to-end (e.g., with something such as TLS). It may be viewed by the XMPP service, the legacy service administrators, and the destination.</li>
      <li><strong>e2e</strong>: the data is encrypted using an end-to-end algorithm, which SHOULD be specified with the encryption algorithm field (see below). It may be seen by the XMPP service and the destination.</li>
      <li><strong>gre</strong>: Gateway Relayed Encryption is used: the data is encrypted by the XMPP client, passed by the XMPP service, and decrypted by the destination. Only the destination can see the data. When used, XMPP clients MUST ensure that GRE is actually used by the gateway.</li>
    </ul>
    <p>When "encrypted" or "e2e" values are used in the previous field, the algorithm used SHOULD be specified with a field with a "var" attribute of "encryption_algorithm", with a type of "text-single", and with the human-readable name of the algorithm used (e.g., "TLSv1.3"). This is not necessary when "gre" is used, as this data is already known by the client in this case.</p>
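    <p>The two fields above could be combined as in the following non-normative sketch, which assumes a gateway that reaches the legacy service over TLS. The algorithm name is the one used as an example in the text; the combination itself is hypothetical.</p>
    <example caption="Data transmission and encryption algorithm fields (illustrative)"><![CDATA[
<field var='data_transmission' type='list-single'>
  <value>encrypted</value>
</field>
<field var='encryption_algorithm' type='text-single'>
  <value>TLSv1.3</value>
</field>
    ]]></example>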
  </section2>
  <section2 topic="Data Retention" anchor="retention">
    <p>The data retention field indicates how long data such as messages or files are stored on the service. The field MUST have a 'var' attribute set to "data_retention" and be of type "text-single", and its value MUST indicate the maximum time, in hours, for which data is stored. The following values have special meaning:</p>
    <ul>
      <li><strong>0</strong>: data is not stored and is only in transit through this service. Note that the data can still be stored by another service if this one is a proxy or a gateway (this can be checked with the identity-scoped form).</li>
      <li><strong>infinite</strong>: data is stored without automatic purge. Depending on the service, the user may or may not handle data deletion themselves (this is specified with the "data_deletion" field, see below).</li>
      <li><strong>unknown</strong>: data retention of this service cannot be determined. This can happen when the service is a gateway to another service, and data retention policy is not known.</li>
    </ul>
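    <p>For instance, a service that purges stored data after 30 days would advertise 30 × 24 = 720 hours. The following fragment is a non-normative sketch; the 30-day policy is an assumption made for the illustration.</p>
    <example caption="Data retention field for a 30-day policy (illustrative)"><![CDATA[
<field var='data_retention' type='text-single'>
  <value>720</value>
</field>
    ]]></example>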
  </section2>
  <section2 topic="Data Deletion" anchor="deletion">
    <p>The data deletion field indicates if a user can explicitly delete data on this service (via ad-hoc commands, pubsub semantics, or any easy-to-use method). The field MUST have a 'var' attribute set to "data_deletion", and be of type "boolean". It defaults to "false".</p>
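    <p>A service that lets users delete their data explicitly might advertise it as in the following non-normative fragment. The value is hypothetical; omitting the field is equivalent to "false".</p>
    <example caption="Data deletion field (illustrative)"><![CDATA[
<field var='data_deletion' type='boolean'>
  <value>true</value>
</field>
    ]]></example>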
  </section2>
  <section2 topic="Encryption at Rest" anchor="encryption_at_rest">
    <p>When data retention has a value other than "0", it is important to know whether the data is encrypted at rest. This is exposed with a field which MUST have a 'var' attribute set to "encryption_at_rest" and a type of "boolean"; it is set to "true" when data is encrypted at rest. It defaults to "false".</p>
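    <p>The relationship with data retention can be sketched as follows, assuming a hypothetical service that keeps data without automatic purge and encrypts it on disk. This fragment is non-normative.</p>
    <example caption="Encryption at rest alongside data retention (illustrative)"><![CDATA[
<field var='data_retention' type='text-single'>
  <value>infinite</value>
</field>
<field var='encryption_at_rest' type='boolean'>
  <value>true</value>
</field>
    ]]></example>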
  </section2>
  <section2 topic="Terms of Service" anchor="tos">
    <p>A link to Terms of Service (ToS) can be specified with a field which MUST have a 'var' attribute set to "tos" and a type of "text-single". The value MUST be a URI pointing to the ToS. The link can be an xmpp: URI (e.g., to a Pubsub item), an http(s) one, or use any other relevant scheme. As for other fields, if a proxy or a gateway is used, the ToS in the main form applies to the service itself, while the ToS of legacy services are given in the identity-scoped forms.</p>
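    <p>A non-normative sketch of such a field is shown below. The URL is a placeholder invented for the illustration.</p>
    <example caption="Terms of Service field (illustrative)"><![CDATA[
<field var='tos' type='text-single'>
  <value>https://example.org/tos</value>
</field>
    ]]></example>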
  </section2>
  <section2 topic="Location" anchor="location">
    <p>Physical location of the data is very important information for the user: it determines the jurisdiction that applies, and it also indicates risks (e.g., natural disasters, geopolitical considerations, war zones, risks of spying, etc.).</p>
    <p>There is already a specification to indicate the location of an XMPP entity: <span class="ref"><link url="https://xmpp.org/extensions/xep-0080.html">User Geolocation (XEP-0080)</link></span> <note>XEP-0080: User Geolocation &lt;<link url="https://xmpp.org/extensions/xep-0080.html">https://xmpp.org/extensions/xep-0080.html</link>&gt;.</note>, which should be used to indicate the location of the server via the PEP node "http://jabber.org/protocol/geoloc" as explained in that specification. However, data may be stored in several locations at once (e.g., with a cluster of servers), in which case multiple items should be used in the PEP node, one per data cluster. Item IDs may give a hint on the related data (e.g., cluster name), and XEP-0080's "description" field should be used to give details on this cluster and when/how data is stored there.</p>
    <p>The "region" field of XEP-0080 SHOULD be duly specified, as it is information of major importance in the context of data policy: the law may differ from one administrative region to another within a given country.</p>
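    <p>The following non-normative sketch shows what retrieving the items of such a node might return for a service with two data clusters. The JIDs, item IDs, locations, and descriptions are all hypothetical, and the retrieval itself is assumed to follow the usual pubsub item retrieval flow.</p>
    <example caption="Geolocation items for two data clusters (illustrative)"><![CDATA[
<iq type='result'
    from='example.org'
    to='juliet@capulet.lit/balcony'
    id='geoloc_1'>
  <pubsub xmlns='http://jabber.org/protocol/pubsub'>
    <items node='http://jabber.org/protocol/geoloc'>
      <!-- One item per data cluster; the item ID hints at the cluster -->
      <item id='cluster-primary'>
        <geoloc xmlns='http://jabber.org/protocol/geoloc' xml:lang='en'>
          <country>France</country>
          <countrycode>FR</countrycode>
          <region>Île-de-France</region>
          <description>Primary cluster: accounts and message archives.</description>
        </geoloc>
      </item>
      <item id='cluster-backup'>
        <geoloc xmlns='http://jabber.org/protocol/geoloc' xml:lang='en'>
          <country>Germany</country>
          <countrycode>DE</countrycode>
          <region>Hesse</region>
          <description>Secondary cluster: backups only.</description>
        </geoloc>
      </item>
    </items>
  </pubsub>
</iq>
    ]]></example>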
  </section2>
  <section2 topic="Data Export" anchor="data_export">
    <p>A field indicates whether users can export all their data from the service. This field MUST have a 'var' attribute set to "data_export" and a type of "boolean". The default value is "false".</p>
  </section2>
  <section2 topic="Access Policy" anchor="access_policy">
    <p>A field indicates who has access to user accounts or data. This field MUST have a 'var' attribute set to "access_policy", a type of "list-multi", and MUST include one or more of the following values:</p>
    <ul>
      <li><strong>admins</strong>: Administrators of the service can access user data for operational or security purposes (e.g., account management, system maintenance).</li>
      <li><strong>moderators</strong>: Moderators (e.g., for group chats or blog comments) can access user data within their moderation scope (e.g., content review, enforcement of community guidelines).</li>
      <li><strong>organization_member</strong>: Any member of the organization owning the service can access user data (e.g., employees, contractors under the organization's control).</li>
      <li><strong>government</strong>: Government or legal authorities can access user data under legal requirements (e.g., court orders, national security requests).</li>
      <li><strong>advertisers</strong>: Third-party advertisers or ad networks can access user data for targeted advertising or analytics.</li>
      <li><strong>partners</strong>: Business partners or affiliated services can access user data under contractual or collaborative agreements.</li>
      <li><strong>none</strong>: No entity (other than the user) can access user data. If used, this value MUST be the only one selected.</li>
    </ul>
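    <p>Since the field is of type "list-multi", each selected value goes in its own &lt;value/&gt; element. The following non-normative fragment assumes a hypothetical service where administrators and legal authorities can access user data.</p>
    <example caption="Access policy field with several values (illustrative)"><![CDATA[
<field var='access_policy' type='list-multi'>
  <value>admins</value>
  <value>government</value>
</field>
    ]]></example>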
  </section2>
  <section2 topic="Full Erasure" anchor="full_erasure">
    <p>A field indicates whether users can fully erase their account and associated data. This field MUST have a 'var' attribute set to "full_erasure" and a type of "boolean". The default value is "false".</p>
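    <p>As a non-normative sketch, a service offering both data export and full account erasure might advertise the two boolean fields together as follows. The values are hypothetical.</p>
    <example caption="Data export and full erasure fields (illustrative)"><![CDATA[
<field var='data_export' type='boolean'>
  <value>true</value>
</field>
<field var='full_erasure' type='boolean'>
  <value>true</value>
</field>
    ]]></example>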
  </section2>
  <section2 topic="Backup Frequency" anchor="backup_frequency">
    <p>A field indicates how often backups of user data are performed. This field MUST have a 'var' attribute set to "backup_frequency" and a type of "text-single". The value MUST specify the maximum interval, in hours, between backups. A value of "0" indicates that no backups are performed. For example, a value of "24" indicates that no more than 24 hours elapse between two backups (i.e., backups occur at least daily). This field is RECOMMENDED for services storing user data.</p>
  </section2>
  <section2 topic="Backup Retention" anchor="backup_retention">
    <p>A field indicates how long backups are retained. This field MUST have a 'var' attribute set to "backup_retention" and a type of "text-single". The value MUST specify the maximum duration in hours. The following special values can be used:</p>
    <ul>
      <li><strong>0</strong>: No backup is done.</li>
      <li><strong>infinite</strong>: Backups are retained indefinitely.</li>
      <li><strong>unknown</strong>: Backup retention policies are not publicly disclosed.</li>
    </ul>
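    <p>The backup fields can be illustrated with the following non-normative fragment, which assumes a hypothetical service performing daily backups (24 hours) and keeping each backup for 30 days (30 × 24 = 720 hours).</p>
    <example caption="Backup frequency and retention fields (illustrative)"><![CDATA[
<field var='backup_frequency' type='text-single'>
  <value>24</value>
</field>
<field var='backup_retention' type='text-single'>
  <value>720</value>
</field>
    ]]></example>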
  </section2>
  <section2 topic="Extra Info" anchor="extra">
    <p>Human-readable information can be added with a field which MUST have a 'var' attribute set to "extra_info" and a type of "text-multi". This is useful to add extra details, clarifications, or any kind of information that end-users may need to know. The service can use the "xml:lang" attribute of the disco#info request to determine the language of the description it returns.</p>
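    <p>Because the field is of type "text-multi", each line of text is carried in its own &lt;value/&gt; element. The wording in the following non-normative fragment is invented for the illustration.</p>
    <example caption="Extra info field with two lines of text (illustrative)"><![CDATA[
<field var='extra_info' type='text-multi'>
  <value>Message archives are purged automatically after 30 days.</value>
  <value>Uploaded files are kept until the user deletes them.</value>
</field>
    ]]></example>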
  </section2>
</section1>
<section1 topic="Summary" anchor="summary">
  <p>Below is a table summarizing all fields defined in this XEP for data policy discovery. All fields are optional. Fields with special constraints are noted in the "Comment" column.</p>
  <table caption="Data Policy Fields Summary">
    <tr>
      <th>Name</th>
      <th>Field (var)</th>
      <th>Type</th>
      <th>Meaning</th>
      <th>Comment</th>
    </tr>
    <tr>
      <td>Auth Mechanism</td>
      <td>auth_data</td>
      <td>list-single</td>
      <td>How authentication data is handled by the service</td>
      <td>Values: no_auth, plain, hidden, restricted</td>
    </tr>
    <tr>
      <td>Data Transmission</td>
      <td>data_transmission</td>
      <td>list-single</td>
      <td>How data is transmitted to/from legacy services</td>
      <td>Values: plain, encrypted, e2e, gre</td>
    </tr>
    <tr>
      <td>Encryption Algorithm</td>
      <td>encryption_algorithm</td>
      <td>text-single</td>
      <td>Name of the encryption algorithm used</td>
      <td>Recommended if data_transmission is "encrypted" or "e2e"</td>
    </tr>
    <tr>
      <td>Data Retention</td>
      <td>data_retention</td>
      <td>text-single</td>
      <td>Maximum time (in hours) data is stored</td>
      <td>Special values: 0, infinite, unknown</td>
    </tr>
    <tr>
      <td>Data Deletion</td>
      <td>data_deletion</td>
      <td>boolean</td>
      <td>Whether users can explicitly delete data</td>
      <td>Defaults to false</td>
    </tr>
    <tr>
      <td>Encryption at Rest</td>
      <td>encryption_at_rest</td>
      <td>boolean</td>
      <td>Whether data is encrypted when stored</td>
      <td>Defaults to false</td>
    </tr>
    <tr>
      <td>Terms of Service</td>
      <td>tos</td>
      <td>text-single</td>
      <td>URI to the service's Terms of Service</td>
      <td>May be xmpp:, http(s), or other URI schemes</td>
    </tr>
    <tr>
      <td>Data Export</td>
      <td>data_export</td>
      <td>boolean</td>
      <td>Whether users can export their data</td>
      <td>Defaults to false</td>
    </tr>
    <tr>
      <td>Access Policy</td>
      <td>access_policy</td>
      <td>list-multi</td>
      <td>Entities that can access user data</td>
      <td>Values: admins, moderators, organization_member, government, advertisers, partners, none</td>
    </tr>
    <tr>
      <td>Full Erasure</td>
      <td>full_erasure</td>
      <td>boolean</td>
      <td>Whether users can fully erase their account and data</td>
      <td>Defaults to false</td>
    </tr>
    <tr>
      <td>Backup Frequency</td>
      <td>backup_frequency</td>
      <td>text-single</td>
      <td>Maximum interval (in hours) between backups</td>
      <td>Value in hours; "0" indicates no backup</td>
    </tr>
    <tr>
      <td>Backup Retention</td>
      <td>backup_retention</td>
      <td>text-single</td>
      <td>Maximum time (in hours) backups are retained</td>
      <td>Special values: 0, infinite, unknown</td>
    </tr>
    <tr>
      <td>Extra Info</td>
      <td>extra_info</td>
      <td>text-multi</td>
      <td>Human-readable additional information</td>
      <td/>
    </tr>
  </table>
</section1>
<section1 topic="Example" anchor="example">
  <p>This example shows a gateway providing SMTP/IMAP bridging with two forms in its disco#info response: one for the main gateway service (which does not store data) and one for the SMTP identity (which clarifies that policies depend on the user's chosen external server).</p>

  <example caption="Entity queries gateway for data policy"><![CDATA[
<iq type='get'
  from='juliet@capulet.lit/balcony'
  to='gateway@example.org'
  id='disco_1'>
  <query xmlns='http://jabber.org/protocol/disco#info'/>
</iq>
  ]]></example>

  <example caption="Gateway responds with two data-policy forms"><![CDATA[
<iq type='result'
    from='gateway@example.org'
    to='juliet@capulet.lit/balcony'
    id='disco_1'>
  <query xmlns='http://jabber.org/protocol/disco#info'>
    <identity category='gateway' type='smtp'/>
    <feature var='http://jabber.org/protocol/disco#info'/>
    <x xmlns='jabber:x:data' type='result'>
      <!-- Main data-policy form for the gateway itself -->
      <field var='FORM_TYPE' type='hidden'>
        <value>urn:xmpp:data-policy:0</value>
      </field>
      <field var='auth_data' type='list-single'>
        <value>plain</value>
      </field>
      <field var='data_transmission' type='list-single'>
        <value>plain</value>
      </field>
      <field var='data_retention' type='text-single'>
        <value>0</value>
      </field>
      <field var='data_deletion' type='boolean'>
        <value>false</value>
      </field>
    </x>

    <x xmlns='jabber:x:data' type='result'>
      <!-- Identity-scoped form for SMTP gateway -->
      <field var='FORM_TYPE' type='hidden'>
        <value>urn:xmpp:data-policy:identity:gateway:smtp:0</value>
      </field>
      <field var='extra_info' type='text-multi'>
        <value>This gateway acts as a relay to external IMAP/SMTP servers. Data policies depend entirely on the external server chosen by the user. This gateway does not store or process user data.</value>
      </field>
    </x>
  </query>
</iq>
  ]]></example>
</section1>
<section1 topic="Implementation Notes" anchor="impl">
  <p>Client developers are encouraged to present data policy information in ways that are intuitive and accessible to all users, including those without technical expertise. While this specification defines detailed metadata fields, clients should prioritize visual indicators (e.g., security badges, warning icons) to summarize key privacy and security aspects at a glance. For example:</p>
  <ul>
    <li>Use color-coded badges to indicate encryption status (e.g., green for end-to-end encryption, red for unencrypted transmission).</li>
    <li>Display warning symbols for services with unclear data retention policies or third-party access.</li>
    <li>Provide tooltips or expandable sections to show technical details on demand.</li>
    <li>Include a security rating (e.g., a percentage score, a mark out of 10, or letter grades) to give users a quick overview of a service's overall security posture.</li>
  </ul>
  <p>This approach ensures users receive actionable insights without being overwhelmed by technical specifications, and helps them make informed decisions about which services to trust.</p>
</section1>
<section1 topic="Security Considerations" anchor="security">
  <p>The information exposed in this document is highly useful to end-users to understand what happens to their data. However, it can also provide information to potential attackers (e.g., server location, who can access data, etc.). Service administrators should keep this in mind and find the right balance between providing legitimate end-user information and avoiding disclosure of too many details usable by attackers.</p>
</section1>
<section1 topic="IANA Considerations" anchor="iana">
  <p>TODO</p>
</section1>
<section1 topic="XMPP Registrar Considerations" anchor="registrar">
  <p>TODO</p>
</section1>
<section1 topic="XML Schema" anchor="schema">
  <p>TODO</p>
</section1>
</xep>
