ChatGPT-creator OpenAI teased a new model capable of generating videos from text prompts, in what it claimed is an important milestone in the field of artificial general intelligence.

The company explained that the product, named Sora, can generate videos with multiple characters and detailed backgrounds, made possible because its model “understands not only what the user has asked for in the prompt, but also how those things exist in the physical world”.

It further pitched the system as having a deep understanding of language, which allows it to produce detailed imagery in line with real-world context.

Sora will be able to produce videos up to a minute long. OpenAI said the system was trained on a blend of videos and images of variable durations, resolutions and aspect ratios, allowing it to simulate real-life scenarios or render videos in different styles.

The text-to-video model builds on past research OpenAI conducted for its AI image generator DALL-E, and the company stated the two share the same technique, which involves “generating highly descriptive captions for the visual training data”.

OpenAI is also developing tools that can detect when videos are generated by Sora, identifying potentially misleading content. It plans to engage with policymakers and educators to prevent abuse of the technology. 

Sora is now available to a number of artists, filmmakers and designers, who are invited to provide feedback on the platform.