This neural network imitates a human voice from a 3-second sample. Microsoft introduced VALL-E

Microsoft has introduced VALL-E, a neural network that can simulate a person's voice from an audio sample just three seconds long. That is not its only feature: unlike alternative systems, VALL-E can also reproduce the speaker's emotion and tone, even when voicing text that the person never spoke.

The neural network was trained on 60,000 hours of English speech. Its current results are quite impressive (samples can be heard on GitHub), but the simulated voice sometimes still sounds synthetic.

Although VALL-E has not been released to the public, journalists are already worried about such a tool falling into the wrong hands, especially if it continues to improve. For example, attackers could use the technology to make realistic spam calls that imitate the voices of a person’s relatives and friends.

Source: Trash Box
