DeepL regularly adds translation support for new languages and language variants. This article describes the process we follow when we release a new language or variant.

Language codes follow BCP 47

DeepL language codes follow BCP 47. A language code always includes a base language subtag (e.g. en, zh), and may include additional subtags for script, region, or variant where needed to distinguish variants. For example:
  • EN-US, PT-BR — region subtag to distinguish regional variants.
  • ZH-HANS, ZH-HANT — script subtag to distinguish writing systems.
BCP 47 is an expansive standard, and language codes can vary significantly in structure and length. As DeepL adds support for more languages and variants, new codes may use any combination of subtags permitted by the spec. For example, codes like sr-Cyrl-RS or sr-Latn-RS (Serbian in Cyrillic vs. Latin script, as used in Serbia) are valid BCP 47 codes — while DeepL does not support these today, your integration should be able to handle codes of this form if they are added in the future.
Do not hardcode assumptions about the format of language codes. For example, do not assume that language codes will always be exactly two letters, or that a hyphenated code will always be in the format xx-YY. Instead, always treat the language codes returned by the /languages endpoint as opaque identifiers. If you need to parse language codes, use a BCP 47-compliant library rather than writing custom parsing logic. The full spec includes subtags for script, region, variant, extensions, and private use, and partial implementations are a common source of bugs.
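As a minimal sketch of the opaque-identifier approach, the Python snippet below checks a user-supplied code against the codes returned by the /languages endpoint instead of validating its shape. The sample entries, the "language" field name, the helper names, and the case-insensitive matching are assumptions made for illustration; sr-Cyrl-RS appears only as a hypothetical future code.

```python
from typing import Dict, Iterable, Optional


def build_code_index(entries: Iterable[dict]) -> Dict[str, str]:
    """Map each case-folded code to the exact spelling the API returned.

    No assumption is made about length, hyphen count, or subtag order.
    """
    return {entry["language"].casefold(): entry["language"] for entry in entries}


def resolve_code(user_input: str, index: Dict[str, str]) -> Optional[str]:
    """Return the API's spelling of a supported code, or None if unknown."""
    return index.get(user_input.strip().casefold())


# Hypothetical sample shaped like a /languages response (assumed fields).
SAMPLE_LANGUAGES = [
    {"language": "EN-US", "name": "English (American)"},
    {"language": "ZH-HANS", "name": "Chinese (simplified)"},
    {"language": "sr-Cyrl-RS", "name": "Serbian (Cyrillic, Serbia)"},
]

index = build_code_index(SAMPLE_LANGUAGES)
print(resolve_code("zh-hans", index))
print(resolve_code("SR-CYRL-RS", index))
print(resolve_code("xx", index))
```

Because the lookup compares whole strings, a code with three or more subtags needs no special handling. In a real integration the index would be rebuilt from a fresh /languages response rather than from a hardcoded list.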

What happens when a new language is released

  • We will add the language code for the newly supported language or variant to the “Source languages” and “Target languages” lists on the Supported languages page in the API documentation. We’ll include a note on that page if the language or variant does not support both text and document translation.
  • If a newly added language or variant supports both text and document translation, we will add the language or variant to the /languages endpoint response. The variant code used depends on the characteristics of the variant:
    • In some cases, a variant is primarily used in a specific region, and so a region subtag is the best way to identify it (e.g. EN-US, PT-BR).
    • In other cases, a variant is used widely across multiple regions, and so a script subtag is more appropriate (e.g. ZH-HANS, ZH-HANT). The subtag structure will be selected by DeepL on a case-by-case basis following BCP 47 conventions.
  • In cases where a new language code with a variant duplicates the behavior of an existing language code without a variant (e.g. ZH-HANS was recently added as a language code for translating into simplified Chinese, alongside the existing code ZH):
    • In the /languages endpoint response, we will continue to return both language codes as two separate objects with the same value in the "name" field.
    • For backwards compatibility, we will continue to support the original language code (in this example, ZH) for text and document translation.
  • We will add the language code for the newly supported language or variant to our OpenAPI spec.
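Since duplicated codes share the same "name" value in the /languages response, a client could detect them by grouping entries on that field. The following Python sketch illustrates one way to do this; the sample entries, the "language" field name, and the function name are assumptions for illustration, not a DeepL-provided helper.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_codes_by_name(entries: Iterable[dict]) -> Dict[str, List[str]]:
    """Collect all language codes that share the same "name" value."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for entry in entries:
        groups[entry["name"]].append(entry["language"])
    return dict(groups)


# Hypothetical sample shaped like a /languages response (assumed fields).
SAMPLE_TARGETS = [
    {"language": "ZH", "name": "Chinese (simplified)"},
    {"language": "ZH-HANS", "name": "Chinese (simplified)"},
    {"language": "PT-BR", "name": "Portuguese (Brazilian)"},
]

# Keep only names that map to more than one code, i.e. duplicated codes.
aliases = {
    name: codes
    for name, codes in group_codes_by_name(SAMPLE_TARGETS).items()
    if len(codes) > 1
}
print(aliases)
```

A language picker could use such a grouping to show each language once while still accepting either code in translation requests.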
Note about the /languages endpoint: In the future, we plan to extend the language information returned by the API. This will allow us to specify whether a language supports both text and document translation, whether a language code is considered deprecated because it’s been duplicated by a variant language code, and so on. The additional metadata would also allow us, for example, to add languages like AR and ZH-HANT to the languages endpoint even before document translation is supported.