Sberbank’s unique neural network can create texts in 61 languages

The SberDevices team, part of the Sberbank ecosystem, has announced the launch of a multilingual version of the GPT-3 neural network. The model, called mGPT, can generate texts in 61 languages, including languages of the peoples of Russia and the CIS countries.

According to the press service, mGPT is the world’s first generative model to support this many languages. It comes in two versions: a basic version with 1.3 billion parameters, published openly in the SberDisk cloud storage, and an extended version with 13 billion parameters, which will soon be available on the ML Space machine learning platform from SberCloud.

The mGPT model can be used simply to generate text, or to solve various natural language processing tasks in any of the supported languages, either through additional training (fine-tuning) or as part of an ensemble of models.

For example, an automated system can be taught to answer questions, determine the sentiment of a text, or extract first names, surnames, and company names from it. The model can also serve as a component of speech technologies, for example to improve speech recognition quality or to generate scripts for dialogue systems.

The head of SberDevices Denis Filippov said:

“In 2020, we introduced the Russian-language version of the GPT-3 neural network, which is used in two virtual assistants of the Salyut family from Sber, Joy and Athena. We continued to develop our NLP technologies and introduced the mGPT model, which supports more than 60 languages, for many of which generative models simply did not exist before. This, among other things, will be our contribution to the preservation and development of the languages of the peoples of Russia: mGPT can generate texts, for example, in Tatar or Yakut.”

Full list of languages available in the mGPT model: Azerbaijani, English, Arabic, Armenian, Afrikaans, Basque, Bashkir, Belarusian, Bengali, Burmese, Bulgarian, Buryat, Hungarian, Vietnamese, Dutch, Greek, Georgian, Danish, Hebrew, Indonesian, Spanish, Italian, Yoruba, Kazakh, Kalmyk, Kyrgyz, Chinese, Korean, Latvian, Lithuanian, Malay, Malayalam, Marathi, Moldavian, Mongolian, German, Ossetian, Persian, Polish, Portuguese, Romanian, Russian, Swahili, Tajik, Thai, Tamil, Tatar, Telugu, Tuvan, Turkish, Turkmen, Uzbek, Ukrainian, Urdu, Finnish, French, Hindi, Chuvash, Swedish, Yakut, Japanese.

Source: ixbt
