Meta releases AI model for translating speech

Between dozens of languages.

Meta Platforms has released an AI model capable of translating and transcribing speech in dozens of languages, a potential building block for tools enabling real-time communication across language divides.

The company said in a blog post that its SeamlessM4T model could support translations between text and speech in nearly 100 languages, as well as full speech-to-speech translation for 35 languages, combining technology that was previously available only in separate models.

CEO Mark Zuckerberg has said he envisions such tools facilitating interactions between users from around the globe in the metaverse, the set of interconnected virtual worlds on which he is betting the company's future.

Meta is making the model available to the public for non-commercial use, the blog post said.

The world's biggest social media company has released a flurry of mostly free AI models this year, including a large language model called Llama that poses a serious challenge to proprietary models sold by OpenAI and Google.

Zuckerberg says an open AI ecosystem works to Meta's advantage, as the company has more to gain by effectively crowd-sourcing the creation of consumer-facing tools for its social platforms than by charging for access to the models.

Nonetheless, Meta faces the same legal questions as the rest of the industry around the training data ingested to create its models.

In July, comedian Sarah Silverman and two other authors filed copyright infringement lawsuits against both Meta and OpenAI, accusing the companies of using their books as training data without permission.

For the SeamlessM4T model, Meta researchers said in a research paper that they gathered audio training data from 4 million hours of "raw audio originating from a publicly available repository of crawled web data," without specifying which repository.

A Meta spokesperson did not respond to questions on the provenance of the audio data.

Text data came from datasets created last year that pulled content from Wikipedia and associated websites, the research paper said.
