Odyssey, a startup founded by self-driving pioneers Oliver Cameron and Jeff Hawke, has developed an AI model that lets users interact with streaming video.
The model, available on the web in an “early demo,” generates and streams video frames every 40 milliseconds. Using basic controls, viewers can explore areas within a video, similar to a 3D-rendered video game.
“Given the current state of the world, an incoming action, and a history of states and actions, the model tries to predict the next state of the world,” Odyssey explains in a blog post. “Powering this is a new world model demonstrating capabilities like generating pixels that feel realistic, maintaining spatial consistency, learning actions from video, and producing coherent video streams for five minutes or more.”
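The prediction loop Odyssey describes can be sketched in a few lines. This is a minimal illustration, not Odyssey's implementation: the `model` callable, `get_user_action` helper, and loop structure are all assumptions made for clarity.

```python
FRAME_INTERVAL_S = 0.040  # one frame every 40 ms, as described above (25 fps)

def predict_next_state(model, state, action, history):
    """One step of the loop described above: given the current world state,
    an incoming user action, and the history of past states and actions,
    the model predicts the next state of the world (the next frame)."""
    return model(state, action, history)

def stream_world(model, initial_state, get_user_action, num_frames=100):
    """Repeatedly predict the next state to produce a stream of frames."""
    state, history, frames = initial_state, [], []
    for _ in range(num_frames):
        action = get_user_action()  # e.g. move forward, turn left
        next_state = predict_next_state(model, state, action, history)
        history.append((state, action))
        frames.append(next_state)
        state = next_state
    return frames
```

The key property of this formulation is that each frame is conditioned on the viewer's actions, which is what distinguishes a world model from an ordinary video generator.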
A number of startups and big tech companies are chasing world models, including DeepMind, influential AI researcher Fei-Fei Li’s World Labs, Microsoft, and Decart. They believe world models could one day be used to create interactive media, such as games and movies, and to run realistic simulations like training environments for robots.
But creatives have mixed feelings about the technology. A recent Wired investigation found that game studios such as Activision Blizzard, which has laid off scores of workers, are using AI to cut corners and offset attrition. And a 2024 study commissioned by the Animation Guild, a union representing Hollywood animators and cartoonists, estimated that more than 100,000 U.S. film, television, and animation jobs will be disrupted by AI in the coming months.
For its part, Odyssey is pledging to collaborate with creative professionals, not replace them.
“Interactive video […] opens the door to entirely new forms of entertainment, where stories can be generated and explored on demand, free from the constraints and costs of traditional production,” the company writes in its blog post. “Over time, we believe everything that is video today, from entertainment and ads to education, training, travel, and more, will evolve into interactive video, all powered by Odyssey.”
The Odyssey demo is a bit rough around the edges, which the company acknowledges in its post. The environments the model generates are blurry and distorted, and unstable in the sense that their layouts don’t always stay the same. Walk forward in one direction for a while or turn around, and the surroundings can suddenly look different.
But the company promises rapid improvements to the model, which can currently stream video at up to 30 frames per second from clusters of Nvidia H100 GPUs at a cost of $1–$2 per “user-hour.”
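Taking those figures at face value (an assumption; Odyssey quotes a range, and real utilization will vary), a back-of-the-envelope calculation shows what $1–$2 per user-hour implies per generated frame at a steady 30 fps:

```python
# Rough per-frame cost implied by the numbers quoted above.
# Assumes a continuous 30 fps stream for the full user-hour.
FPS = 30
FRAMES_PER_HOUR = FPS * 3600  # 108,000 frames per user-hour

def cost_per_frame(dollars_per_hour):
    """Dollars spent per generated frame at the assumed frame rate."""
    return dollars_per_hour / FRAMES_PER_HOUR

low = cost_per_frame(1.0)   # roughly a thousandth of a cent per frame
high = cost_per_frame(2.0)
```

Even at the high end, that is fractions of a cent per frame, though it still adds up to real GPU spend across many concurrent viewers.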
“Looking ahead, we’re researching richer world representations that capture dynamics far more faithfully, while increasing temporal stability and persistent state,” Odyssey writes in its post. “In parallel, we’re expanding the action space from motion to world interaction, learning open actions from large-scale video.”
Odyssey is taking a different approach than many AI labs in the world modeling space. It designed a 360-degree, backpack-mounted camera system to capture real-world landscapes, which Odyssey believes can serve as the basis for higher-quality models than those trained exclusively on publicly available data.
To date, Odyssey has raised $27 million from investors including EQT Ventures, GV, and Air Street Capital. Ed Catmull, a co-founder of Pixar and former president of Walt Disney Animation Studios, sits on the startup’s board of directors.
Last December, Odyssey said it was working on software that would let creators load scenes generated by its models into tools such as Unreal Engine, Blender, and Adobe After Effects, so they can be edited by hand.


