Sora: Meet ChatGPT-creator OpenAI’s text-to-video model

Sora can create videos using simple text prompts, and has been released to a select number of users for feedback.
A screengrab from one of the videos created by Sora, depicting a woman in Tokyo.

OpenAI, the company behind ChatGPT and the image generator DALL-E, has now unveiled a text-to-video model, Sora.

The model will allow users to create videos up to a minute long using simple prompts.

The Microsoft-backed company said the new platform is currently in the testing phase, but it released a few sample videos along with their accompanying input texts.

OpenAI has granted access “to a number of visual artists, designers, and filmmakers to gain feedback on how to advance the model to be most helpful for creative professionals”, it said.

Descriptive instructions

One video sample showed a woman walking down a street in Tokyo. Its text prompt included details of the city lights and signage. It also described the woman’s attire, the condition of the street, as well as the people in the background.

Another video showed giant woolly mammoths walking through snow. The prompt described the trees in the background, the snow-capped mountains in the distance and a low camera view, along with other instructions.

“Sora is able to generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background. The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world,” OpenAI said.

Weaknesses in Sora

The company also admitted to some weaknesses in the model.

It said the model may struggle with accurately simulating the physics of a complex scene, and may not understand specific instances of cause and effect. “For example, a person might take a bite out of a cookie, but afterward, the cookie may not have a bite mark.”

Safety first

OpenAI said it is working with domain experts in areas like misinformation, hateful content and bias to test the model.

It is also building “tools to help detect misleading content such as a detection classifier that can tell when a video was generated by Sora”.

The tool will also check input prompts and reject those requesting extreme violence, hateful imagery, celebrity likeness and other forms of inappropriate content.