AI could replace actors and directors: Sora makes incredibly realistic videos from text descriptions

OpenAI, known worldwide for its ChatGPT chatbot, introduced a new generative artificial intelligence model, Sora, on February 15. It generates strikingly realistic videos from text descriptions, Forbes reports.

The emergence of such a tool has heightened concerns about the spread of deepfakes (synthetic images and video created with artificial intelligence) and raised the question of which professions Sora now puts at risk.

Experts explained how OpenAI managed to create such a model, where it can be used, and why the risks of using such a model may be exaggerated.

Why Sora is not a step, but a giant leap

About a month earlier, Google announced Lumiere, a neural network that can generate 5-second videos at a resolution of 512x512 pixels. Now OpenAI has done the seemingly impossible: it has created Sora, a generative model that produces realistic videos up to a minute long.

OpenAI trained the model on videos at their original resolution, such as Full HD (1920x1080), rather than on short clips with a resolution of 512x512, as was customary. As a result, Sora can create both vertical and horizontal videos instead of being limited to the square format typical of generated video.

OpenAI succeeded because it built the neural network on an approach similar to the one used for DALL-E 3 (the third generation of OpenAI's DALL-E neural network, a competitor of Midjourney and Stable Diffusion that generates images in different styles). The company first trains a separate model to write a short but accurate description of each video. Then, using GPT-4V (the ChatGPT capability that lets the neural network recognize images and take them into account when responding), it expands these into detailed descriptions, producing a large number of high-quality, varied video descriptions for training Sora.
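The two-stage description process outlined above can be pictured with a small sketch. It is only an illustration under stated assumptions, not OpenAI's actual pipeline: the names `build_training_pairs`, `toy_short_captioner` and `toy_expander` are hypothetical, and the two toy functions merely stand in for the captioning model and for GPT-4V.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TrainingPair:
    """One (video, description) example for training a text-to-video model."""
    video_id: str
    caption: str


def build_training_pairs(
    video_ids: List[str],
    short_captioner: Callable[[str], str],
    expander: Callable[[str], str],
) -> List[TrainingPair]:
    """Stage 1 writes a short caption per video; stage 2 expands it."""
    pairs = []
    for video_id in video_ids:
        short = short_captioner(video_id)   # short but accurate description
        detailed = expander(short)          # longer, more detailed description
        pairs.append(TrainingPair(video_id=video_id, caption=detailed))
    return pairs


# Hypothetical stand-ins: a real system would call trained models here.
def toy_short_captioner(video_id: str) -> str:
    return f"A clip labelled {video_id}."


def toy_expander(short_caption: str) -> str:
    return short_caption + " Details about camera, lighting and motion go here."


pairs = build_training_pairs(
    ["clip_001", "clip_002"], toy_short_captioner, toy_expander
)
for pair in pairs:
    print(pair.video_id, "->", pair.caption)
```

Passing the two stages in as functions keeps the sketch independent of any particular model: each stand-in could be swapped for a real captioner without changing the loop.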

In addition, the Sora architecture can not only generate individual video fragments but also combine them into a single whole. This opens up the possibility of creating long, coherent videos that were previously out of reach for AI generation. The result is realistic, high-quality videos up to one minute long.

No fear

With technology as powerful as Sora, there are of course risks of abuse. Even current image generators raise many questions about misuse, namely the generation of fake and prohibited content. This happened with Midjourney, for example, when people began to generate realistic images of Donald Trump and Pope Francis. With video, the problem reaches a completely new level, because now anyone can invent a news story and back it up with (fake) video proof. OpenAI understands this and is taking steps to protect against potential risks. According to information on the company’s website, it is developing tools for identifying fake and prohibited content.

Moreover, the AI research community is actively working on labeling generated content; perhaps every browser will soon have built-in generative-AI detectors. Education also plays an important role, because it is people who create the videos, not the AI/ML model itself. We need to learn to understand new technologies and use them, not be afraid of them.

Impact on professions

Without a doubt, Sora will have a significant impact on the video production industry. For example, it will be possible to quickly create short, high-quality advertising videos (up to a minute long). But it is important to understand that in the near future the neural network will not be able to completely replace professional video studios and creators: at this stage, Sora is not trained to create films, for example, or similar high-quality, long-form content.

As happened earlier with generated pictures and texts, there will now be more video content, but its average quality will decline. Those who learn to use Sora professionally, however, will remain in demand in the industry.

As for startups, Sora has clearly shown that generating realistic videos is more than possible. It will not be surprising if text-to-video projects start to appear, hoping to fill a profitable niche. But, as often happens, their success is unlikely to last. OpenAI can at any time announce features that were not previously available in Sora, and thus raise the bar for competitors again.

Impact on advertising

The history of AI breakthroughs shows that any new technology becomes publicly available as open source within a year and a half of its appearance: first, large market players will begin to actively use Sora (as was the case with DALL-E and ChatGPT), and then everyone else will.

Sora, for example, opens up vast marketing opportunities. The ability to create personalized, high-quality video content could radically change approaches to advertising and content marketing; generative advertising could take over the market entirely, including on YouTube.

Sora not only sets new standards for high-quality video production, but also changes how creators interact with their audience. OpenAI has demonstrated how far artificial intelligence can still develop. It is quite possible that other technological breakthroughs will soon appear on the horizon and surprise us just as much.

