Sora: OpenAI's Groundbreaking AI Video Generator Revealed - See the Stunning Results!

February 16, 2024

After ChatGPT and DALL-E, its conversational agent and AI image generator, OpenAI has unveiled Sora, a video generator "capable of creating realistic and creative scenes from text instructions," as the company describes it in a blog post.

In its final stages of testing, this new model appears particularly impressive in the examples shared by OpenAI.

Sora, OpenAI's text-to-video model, unveiled

Sora is an AI-based video generator. According to OpenAI, this new model can generate videos of up to one minute in length, "respecting visual quality and consistency with the user's prompt". In concrete terms, Sora will make it possible to generate videos in the same way as one generates an image with tools such as DALL-E or Midjourney. "Sora is capable of handling complex scenes with multiple characters, specific types of movement, and precise details of subject and background," explains OpenAI.
"The model understands not only what the user has requested in the prompt, but also how these elements exist in the physical world," OpenAI adds.
The developers point out that the model has a deep understanding of requests, enabling it to interpret them accurately and generate realistic, convincing characters capable of showing emotions. What's more, Sora is capable of producing multiple shots within a single generated sequence, retaining characters and visual style.

Impressive examples of Sora-generated videos

In its blog post and via its X account, OpenAI has shared examples of the video sequences Sora seems capable of generating, accompanied by the prompts used. And these are impressive:
This video answers the following query (originally in English): The beautiful snow-covered city of Tokyo is animated. The camera moves along the city's bustling street, following several people enjoying the beautiful snowy weather and shopping at nearby stalls. Gorgeous sakura petals flutter in the wind with the snowflakes.

This sequence was generated using the following query (originally in English): A movie trailer featuring the adventures of a 30-year-old spaceman wearing a motorcycle helmet knitted from red wool, blue sky, salt desert, cinematic style, shot on 35mm film, vivid colors.

Still undergoing testing, Sora can be perfected, warns OpenAI

Despite the many examples shared by OpenAI, the model is still in the testing phase and needs to be perfected, so we'll have to wait a little longer before we can try it out. While the footage unveiled has probably been hand-picked, OpenAI is also releasing a few videos showing the tool's weaknesses. "The current model (...) may struggle to accurately simulate the physics of a complex scene, and fail to understand specific cases of cause and effect."
Cloning entities is one of Sora's problems.
Spatial details can escape the model, which sometimes confuses left and right, creates "physically implausible" camera movements, makes elements (animals, people) appear spontaneously "in scenes containing many entities", or fails to render certain objects as rigid. "Simulating complex interactions between objects and several characters is a challenge for the model, which sometimes leads to humorous generations," comments OpenAI about a video of a grandmother's rather dubious birthday party.

What are the risks and security measures for Sora?

Sora is currently in the hands of OpenAI's red teamers, the team that assesses the generator's "critical areas", particularly legal, moral, and ethical risks. The red teamers include experts in misinformation, hateful content, and bias, "who are testing the model in an adversarial manner". The same rules that apply to DALL-E, the image generator, will be used for Sora: requests containing extreme violence, sexual content, hateful images, or celebrity likenesses will therefore be banned.
"We are developing tools to help detect misleading content, such as a detection classifier capable of determining when a video has been generated by Sora," OpenAI states.
Existing video generator models, such as Gen-2 or ADFA, have been the source of many deepfakes, and OpenAI and Microsoft have just warned about the malicious use of their chatbots. Against this backdrop, OpenAI will have to ensure that Sora is used responsibly. "We will mobilize policymakers, educators, and artists around the world to understand their concerns and identify positive use cases for this new technology," concludes OpenAI.