Meta has announced new open-source artificial intelligence models capable of recognizing more than 4,000 languages spoken around the world, a forty-fold increase over previously available technologies.
Meta calls these models Massively Multilingual Speech Models, or MMS, and says they can convert speech to text and text to speech in more than 1,100 languages, ten times more than earlier technology supported.
The American company says that one of its goals in developing these models is to protect languages from extinction. In a statement published on its official website, it said: “Many of the world’s languages are in danger of disappearing, and the limitations of current speech recognition and speech generation technology will only accelerate this trend.”
Meta has made its new models open source, so that researchers can access them and contribute to their development, in order to help preserve the world’s languages.
According to Meta, AI-based speech technologies can be used in many areas, such as messaging services, virtual reality, and augmented reality, where they would understand every user’s voice and serve users in their preferred language.
Speech recognition and text-to-speech models must be trained on thousands of hours of audio recordings with accompanying transcripts, which the algorithms need in order to understand and classify the input correctly.
Meta has had difficulties collecting audio recordings of languages that are not widely spoken, as the data currently available only cover the 100 most widely spoken languages around the world.
To overcome these difficulties, Meta turned to unconventional methods: it collected audio recordings of religious texts translated into various languages and paired those recordings with the corresponding translated texts. This gave Meta an average of 32 hours of data for each of the less common languages and raised the total number of languages covered by the models to more than 4,000.
Meta plans to support more languages in its AI-based language models in the future and to enable them to handle the different dialects of each language, which remains a challenge.