The Yandex Cloud team has been developing Yandex SpeechKit, its speech synthesis and recognition service, for some time. Building on it, the team has created a new neural network model that can recognize more than 10 languages at once. This neural polyglot can be used to quickly build voice assistants and call center bots that communicate in different languages.
The neural network supports both widely spoken languages (English and French) and less common ones (Danish, Finnish, Turkish). It automatically recognizes streaming speech on any topic and can switch between languages. The neural network understands both short and long phrases, as well as names, addresses, dates and numbers. It is continuously learning and improving.
The new model is based on the Transformer architecture, which processes data in parallel, with each part handled independently. That is, speech in different languages is recognized separately. The model was trained on dozens of terabytes of data from professional datasets, as well as on data from Yandex services.
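To illustrate what parallel processing means in a Transformer, here is a minimal sketch of scaled dot-product self-attention, the core operation of that architecture. It is a toy example and not Yandex's implementation: the function names are hypothetical, the input "frames" are made-up feature vectors, and the learned query, key and value projections of a real model are replaced with the identity for brevity.

```python
import math


def softmax(row):
    """Turn a list of scores into weights that sum to 1."""
    peak = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(frames):
    """Toy scaled dot-product self-attention.

    Assumption: queries, keys and values are the input frames themselves
    (identity projections). A real Transformer learns these projections.
    Returns the mixed frames and the attention weights for each position.
    """
    dim = len(frames[0])
    scale = math.sqrt(dim)
    outputs, all_weights = [], []
    # Each iteration uses only the original input, never an earlier output,
    # so all positions could be computed at the same time.
    for query in frames:
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in frames]
        weights = softmax(scores)
        mixed = [sum(w * value[j] for w, value in zip(weights, frames))
                 for j in range(dim)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights
```

Calling `self_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])` returns one mixed vector per input frame. Because no position waits for the result of another, the loop can be replaced by a single batched matrix operation on a GPU, which is what makes training on very large datasets practical.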
The polyglot neural network is available to Yandex SpeechKit users and can be configured with the standard API tools.
Source: Trash Box