Jabber is meant to allow people everywhere in the world to communicate with each other. However, people converse in many different languages, not just English. Many humans in fact don't even understand English. Hence, Jabber should not be tied to a particular language, but rather allow usage of any language, be it English, Chinese, Inuit, or anything else.
One important step towards this goal is that Jabber is based upon Unicode, allowing for many different languages. But that alone is not enough. Jabber promotes a server-based system for many of its components and services, like the JUD, or transports. Many of these have to interact with users in some way. Currently, they do so in only one fixed language (usually English). Even if the server admin is willing to translate the messages, forms, etc. involved, there can only be one localization active for a given server/component.
Hence, Jabber must support a way for clients to inform the server about their preferred language. In addition, the server and other components have to understand and honor this information. Only this way can we ensure that Jabber is able to work in a multi-national, multi-lingual environment.
Some examples on how this information could and should be used, include
The basic idea behind this proposal was to use existing standards where possible, and to make it fully backward compatible. Furthermore it was a goal to allow clients to support it now, even without any server support, while at the same time permitting improved functionality once servers start to implement this spec.
To encode the locale on any given XML packet, we use the xml:lang attribute, as defined in the XML specification. This in turn uses values as specified in RFC 1766 to encode languages and regions. This way, you can even distinguish between British and Australian English.
An xml:lang tag can be put onto any XML element; for the purposes of this document, however, we will limit its usage to the four central Jabber elements: <stream/>, <message/>, <iq/> and <presence/>.
A client claiming to support this document has to initiate server connection slightly differently by putting an xml:lang attribute in the initial <stream:stream> element.
Servers not supporting this document will just ignore the additional attribute. Compliant server can be distinguished by the fact that their reply <stream:stream> element also contains an xml:lang attribute, indicating the main language of the server. A compliant client has to detect whether the server is compliant or not, and base its future behavior on this information.
If the client thus determines that the server is compliant, then it doesn't have to do anything beyond this point. All its outgoing messages will automatically be flagged by the server with an xml:lang attribute if necessary. Thus writing a minimal compliant client is trivial.
If it is determined that the server does not support this document, and the client still wants to offer locale support, it may start flagging all its outgoing message/iq/presence elements with the xml:lang attribute, to ensure that other components/clients which do conform to this document can handle the localization despite the local server not doing so.
Finally, if for whatever reasons the client wants to flag particular messages with a different locale (e.g. if the user is bilingual), it can do so at any time by putting an appropriate xml:lang element in the outgoing data. This will override the previously set default locale for this message only.
A compliant server must detect the xml:lang attribute in incoming <stream:stream> elements. The server then has to store this information for later use, i.e. it has to remember the default language for each active session.
Additionally, a compliant server must attach an xml:lang attribute to the reply <stream:stream> element sent in response to a newly initiated connection. This attribute should reflect the default language of that server, and is used to indicate to clients that the server implements this document.
The server should not only allow user clients to specify a default language this way, but also server-side components, like the JUD should be allowed to do this.
Whenever a message leave the server, it has to tag the message automatically with the xml:lang attribute of the corresponding seesion, if any was specified, unless the message is already tagged this way. In that case, the already existing xml:lang attribute takes precedence, thus allowing for greater flexibility.
If a client send a message to another local client which uses the same xml:lang value, then no change is applied. But if the recipient uses a different xml:lang, and if the message has no xml:lang attribute attached yet, the xml:lang of the server has to be attached before delievey of the message.
Jabber based services that wish to comply to this document have to make sure that all information they send to clients is tagged with an xml:lang attribute corresponding to the language used in the outgoing data, if appropriate, even if the component supports no other localizations. An example for this is a search form based on XEP-0004.
This way, a client could for example offer to translate the form since it now knows the language the form was written in. Previously it could just guess the language was English, which never was guaranteed.
To be able to tailor replies to the user's preferred language, the component has to know this information. This is simply inferred from any xml:lang attribute on incoming requests. If none is present, the default locale is assumed. If the client's default locale diverges from that of the component, it is the server's responsibility to tag the query with an appropriate xml:lang attribute (refer to the "Server support" section). If on the other hand the server is not compliant, then any interested client will manually tag its queries with an xml:lang attribute. Thus it is sufficient to check for this attribute.
A more sophisticated component supporting multiple localizations of its forms/messages could now honor the requested language and send this search form instead of the English one shown previously:
If the component doesn't have the requested localization available, it replies with the default localization (but of course with the matching xml:lang attribute tagged to it, and not the one of the request).
This document in other formats: XML PDF
This XMPP Extension Protocol is copyright © 1999 – 2020 by the XMPP Standards Foundation (XSF).
Permission is hereby granted, free of charge, to any person obtaining a copy of this specification (the "Specification"), to make use of the Specification without restriction, including without limitation the rights to implement the Specification in a software program, deploy the Specification in a network service, and copy, modify, merge, publish, translate, distribute, sublicense, or sell copies of the Specification, and to permit persons to whom the Specification is furnished to do so, subject to the condition that the foregoing copyright notice and this permission notice shall be included in all copies or substantial portions of the Specification. Unless separate permission is granted, modified works that are redistributed shall not contain misleading information regarding the authors, title, number, or publisher of the Specification, and shall not claim endorsement of the modified works by the authors, any organization or project to which the authors belong, or the XMPP Standards Foundation.
## NOTE WELL: This Specification is provided on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. ##
In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall the XMPP Standards Foundation or any author of this Specification be liable for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising from, out of, or in connection with the Specification or the implementation, deployment, or other use of the Specification (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if the XMPP Standards Foundation or such author has been advised of the possibility of such damages.
This XMPP Extension Protocol has been contributed in full conformance with the XSF's Intellectual Property Rights Policy (a copy of which can be found at <https://xmpp.org/about/xsf/ipr-policy> or obtained by writing to XMPP Standards Foundation, P.O. Box 787, Parker, CO 80134 USA).
The Extensible Messaging and Presence Protocol (XMPP) is defined in the XMPP Core (RFC 6120) and XMPP IM (RFC 6121) specifications contributed by the XMPP Standards Foundation to the Internet Standards Process, which is managed by the Internet Engineering Task Force in accordance with RFC 2026. Any protocol defined in this document has been developed outside the Internet Standards Process and is to be understood as an extension to XMPP rather than as an evolution, development, or modification of XMPP itself.
The primary venue for discussion of XMPP Extension Protocols is the <firstname.lastname@example.org> discussion list.
Discussion on other xmpp.org discussion lists might also be appropriate; see <http://xmpp.org/about/discuss.shtml> for a complete list.
Errata can be sent to <email@example.com>.
The following requirements keywords as used in this document are to be interpreted as described in RFC 2119: "MUST", "SHALL", "REQUIRED"; "MUST NOT", "SHALL NOT"; "SHOULD", "RECOMMENDED"; "SHOULD NOT", "NOT RECOMMENDED"; "MAY", "OPTIONAL".
Note: Older versions of this specification might be available at http://xmpp.org/extensions/attic/