The model can both convert speech into written text and translate written text into speech.
Tech giant Meta has unveiled a new model called “SeamlessM4T” that can translate speech across nearly 100 languages. The name stands for “Massively Multilingual and Multimodal Machine Translation”. The model can both translate speech into written text and translate written text into speech. For speech-to-speech and text-to-speech translation, it recognizes nearly 100 input languages and can produce output in 35 languages.
Meta’s revolution in language translation
SeamlessM4T was released under the Creative Commons CC BY-NC 4.0 license, which lets researchers and developers build on it for non-commercial purposes. The company says it was inspired by the Babel Fish, the fictional translating creature from “The Hitchhiker’s Guide to the Galaxy”, and that it aims to break down the barriers between the world’s languages. It also points out that existing translation systems cover only a fraction of those languages.
One-step translation capability of the SeamlessM4T model
One of the most important features distinguishing SeamlessM4T from other major translation models is that it completes the translation process in a single step. Where other approaches chain separate systems together, for example one for speech recognition and another for translation, SeamlessM4T handles the whole task within a single model.
Another striking feature is the model’s ability to recognize when a speaker switches between languages, even within a single sentence, a phenomenon known as code-switching. Meta says the model can quickly recognize transitions between languages such as Hindi, Telugu and English.
While developing SeamlessM4T, Meta also built a system for detecting sensitive words. The goal is to identify expressions that may contain hate, violence or abuse during translation. Meta also says the model can detect gender bias: by flagging gender-related expressions, SeamlessM4T aims to avoid introducing gender inequality into translations.