When language identification is enabled, DISCO identifies each document's primary language at ingest. To search by language, enter the language name inside primaryLanguage(), for example:
primaryLanguage(German)
If a document has too few characters or is in an unsupported language, it is identified as "undetermined."
To display the primary language in search results, create a custom view and add the Primary Language column from the Metadata section.
Supported languages
Databases created before January 16, 2025 support eight languages: English, German, French, Portuguese, Spanish, Chinese, Japanese, and Korean.
Databases created after January 15, 2025 support 82 languages: Afrikaans, Albanian, Arabic, Armenian, Azerbaijani, Basque, Belarusian, Bengali, Bihari, Bulgarian, Catalan, Cebuano, Cherokee, Chinese, Croatian, Czech, Danish, Dhivehi, Dutch, English, Estonian, Finnish, French, Galician, Ganda, Georgian, German, Greek, Gujarati, Haitian_Creole, Hebrew, Hindi, Hmong, Hungarian, Icelandic, Indonesian, Inuktitut, Irish, Italian, Japanese, Javanese, Kannada, Khmer, Kinyarwanda, Korean, Laothian, Latvian, Limbu, Lithuanian, Macedonian, Malay, Malayalam, Maltese, Marathi, Nepali, Norwegian, Oriya, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Scots_Gaelic, Serbian, Sinhalese, Slovak, Slovenian, Spanish, Swahili, Swedish, Syriac, Tagalog, Tamil, Telugu, Thai, Turkish, Ukrainian, Urdu, Vietnamese, Welsh, Yiddish.
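If you build search strings outside of DISCO, the date rule and language lists above can be captured in a small script. The following Python sketch is illustrative only and is not part of DISCO. The names EXPANDED_SUPPORT_START, LEGACY_LANGUAGES, EXPANDED_LANGUAGES, supported_languages, and primary_language_query are hypothetical helpers. It assumes you already know the database's creation date and that language names are entered exactly as spelled in the lists above.

```python
from datetime import date

# Cutoff taken from the text: databases created before this date
# support only the eight-language set.
EXPANDED_SUPPORT_START = date(2025, 1, 16)

LEGACY_LANGUAGES = frozenset(
    "English German French Portuguese Spanish Chinese Japanese Korean".split()
)

# The 82 names are copied from the list above, underscores included.
EXPANDED_LANGUAGES = frozenset("""
Afrikaans Albanian Arabic Armenian Azerbaijani Basque Belarusian Bengali
Bihari Bulgarian Catalan Cebuano Cherokee Chinese Croatian Czech Danish
Dhivehi Dutch English Estonian Finnish French Galician Ganda Georgian
German Greek Gujarati Haitian_Creole Hebrew Hindi Hmong Hungarian
Icelandic Indonesian Inuktitut Irish Italian Japanese Javanese Kannada
Khmer Kinyarwanda Korean Laothian Latvian Limbu Lithuanian Macedonian
Malay Malayalam Maltese Marathi Nepali Norwegian Oriya Persian Polish
Portuguese Punjabi Romanian Russian Scots_Gaelic Serbian Sinhalese Slovak
Slovenian Spanish Swahili Swedish Syriac Tagalog Tamil Telugu Thai
Turkish Ukrainian Urdu Vietnamese Welsh Yiddish
""".split())


def supported_languages(created: date) -> frozenset:
    """Return the language set that applies to a database created on `created`."""
    if created < EXPANDED_SUPPORT_START:
        return LEGACY_LANGUAGES
    return EXPANDED_LANGUAGES


def primary_language_query(language: str, created: date) -> str:
    """Build a primaryLanguage(...) search string, rejecting unsupported names."""
    if language not in supported_languages(created):
        raise ValueError(
            f"{language!r} is not a supported language for a database "
            f"created on {created.isoformat()}"
        )
    return f"primaryLanguage({language})"
```

For example, primary_language_query("German", date(2024, 6, 1)) returns the search string shown earlier, while asking for "Italian" with the same creation date raises a ValueError because Italian is only in the 82-language set.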