
Here we go again? OpenAI’s CTO claims to not know what data Sora has been trained with

Mira Murati is not sure whether Sora was trained on social media data.

Pedro Domínguez


Every time a technology company launches a new artificial intelligence model, the first question that arises is: “Where does the training data come from?” AI models are trained on large datasets, which help the model learn to recognize patterns, make predictions, or understand language.


Quite a few AI models have been trained on data obtained illicitly, or at least dubiously, including OpenAI's popular ChatGPT. For that reason, it is surprising, to say the least, that the company's CTO, Mira Murati, is not clear about the source of the data used to train Sora, OpenAI's new video-generating AI.

In an interview with The Wall Street Journal published on March 13, Murati gave vague answers when asked where the training data for Sora, which generates videos from text instructions, came from. “We use publicly available data and licensed data,” Murati responded regarding how the company is training its upcoming model.

Joanna Stern, a journalist at the WSJ, then asked whether Sora had been trained on data from platforms like YouTube, Instagram, or Facebook, to which Murati replied: “I’m not sure about that,” adding: “You know, if they were available to the public – available to the public to use. But I’m not sure. I’m not sure about it.”

Before moving on to another topic, Stern mentioned OpenAI’s partnership with the stock image company Shutterstock and asked whether its data could be used to train Sora. “I’m not going to go into details about the data that was used. But they were public or licensed data,” Murati added. Murati later confirmed to the WSJ that Shutterstock data was indeed used to train Sora.

Pedro Domínguez


Publicist and audiovisual producer in love with social networks. I spend more time thinking about which videogames I will play than playing them.
