Searching for your last name in the oldest documents: Yandex has learned to decipher handwritten archives

Yandex has launched a new service, “Search in archives,” built on more than 2.5 million pages of historical archival documents whose text was transcribed by neural networks. The underlying technology is based on optical character recognition: it accounts for the peculiarities of handwriting, recognizes letters that are no longer in use, and understands the special structure of archival documents.

The neural networks were trained on hundreds of thousands of real handwritten documents from the 18th and 19th centuries, as well as on tens of millions of generated samples, with experts supervising the process.

“It can take a professional up to half an hour to decipher one page of archival handwritten text, and our service can do it in a few seconds,” says Elena Bubnova, head of Yandex Search. “In the future, the technology can also be used to solve other tasks in Yandex products.”

“Search in archives” was created to meet a practical need rather than to showcase the technology: the service will be useful to historians, sociologists, demographers, genealogists, and ordinary people looking for information about their families. It lets you quickly find documents containing a given keyword, whether that is a surname, the name of a city, or anything else.

At the moment, the site's catalog draws on the Main Archive of Moscow and the archives of the Orenburg and Novgorod regions. The database will be expanded in the future.

Source: Trash Box
