5 Best Video AI Generators of 2024 Compared

For the last two years the spotlight has been on powerful text and image AI generators, notably ChatGPT, Midjourney and DALL·E 3. As we enter 2024, video AI generators are starting to produce credible and creative clips that are catching the public eye.

As with text and image AI, the quality bar leaps forward almost every month. The available platforms range from large, typically expensive, commercial offerings to scrappy open source solutions that can run on your gaming PC or even your mobile phone.

Let's walk through the current state of video AI generators, with side-by-side comparison videos generated from the same text prompts.

Sora

Hot off the presses (Feb 2024) is a research preview of OpenAI's first video AI generator. Sora produces stunning, television-quality video up to one minute long that matches simple prompts exceptionally well and stays smooth and consistent. It still hallucinates (e.g. a character bites a cheeseburger but leaves no bite marks), though far less than most models, and it maintains consistency over a far longer period. Despite some articles touting Sora as producing videos 'instantly', it's likely that this generator requires significant time and GPU horsepower, at a high cost, to produce each minute of video.

Sora is targeted at film and TV, and rumor has it that OpenAI does not plan to offer it to the general public. It's in preview with select studios and videographers, so unfortunately you and I cannot try it out. Google's Lumiere is a similar large model (or collection of models), but its 5-second clips, impressive on their own, do not have the consistency and run length of Sora.

We will use two example prompts across all 5 generators. Both are taken from Sora's demos, which will clearly be the high-water mark. Remember that Sora likely requires a small squadron of machines, plus significant time and expense, to generate each of its clips compared with even the higher-end commercial models.

First Prompt: 'Several giant wooly mammoths approach treading through a snowy meadow, their long wooly fur lightly blows in the wind as they walk, snow covered trees and dramatic snow capped mountains in the distance, mid afternoon light with wispy clouds and a sun high in the distance creates a warm glow, the low camera view is stunning capturing the large furry mammal with beautiful photography, depth of field.'

Second Prompt: 'A flock of paper airplanes flutters through a dense jungle, weaving around trees as if they were migrating birds'

The image of a teddy bear in the city comes from a Twisty video of the same scene that you can find here. It's one of the better examples of smooth motion in Twisty (AnimateDiff) and worth a look.

RunwayML

Runway was one of the first commercial video AI generators to hit the market. Its large-model generator is highly capable of detailed scenery, although its motion of characters and objects can be weak, often falling back on camera movement rather than in-scene action.

Unlike Sora, this platform is available to use today at a price within reach of video buffs, with performance fast enough to keep the creative flow going. They are on version 2, and their masking and other AI editing tools are substantial.

Here is Runway's take on the wooly mammoths and the flock of paper airplanes.

Twisty (AnimateDiff)

On the other end of the spectrum from the large commercial models, we have the mighty force that is open source AI. The Stable Diffusion image models from Stability, SD 1.5 and SDXL in particular, have made AI image generation available to anyone with a gaming PC or access to any number of semi-free creator community websites.

Likewise, open source video AI started to develop in mid-2023 and picked up momentum coming into 2024 with multiple open models. These models show the power of well-trained small models to generate compelling video for social media, music videos, advertising and entertaining shorts.

Twisty is a creator community that explores the boundaries of creativity provided by AnimateDiff, a motion model that uses Stable Diffusion to generate frames of video quickly. Twisty takes advantage of the high performance and low cost of this open model to provide easy-to-use apps on Android and iPhone, on Discord, and as a GPT in OpenAI's store.

Here is Twisty's take on the wooly mammoths and flock of paper airplanes prompts using Stable Diffusion 1.5.

Admittedly, the quality of the video is not on par with the large commercial models, but generation is faster and cheaper, which makes it practical to create many variations. In addition, open source development moves exceptionally fast, with new features often appearing in projects well before commercial models catch up. Here is the same prompt using SDXL, which will be out on Twisty in the near future.

Pika Labs

Pika Labs is another example of a small model for video AI, although, like Runway, it is a proprietary commercial generator.

This is Pika's take on the wooly mammoths and flock of paper airplanes prompts. As you can see, it relies on camera zoom more than on motion of the airplanes, while the mammoths have a more natural gait.

SVD (Stable Video Diffusion)

Stability has been working on their own video AI generator, SVD, which is open source, although commercial use requires a variable license fee. They are on version 1.1, and if you sign up for their beta program you can get an early look. The results from SVD so far tend to be on par with or perhaps a bit behind Pika Labs, and far behind RunwayML. We never underestimate Stability's ability to advance models quickly, so we are keeping our eye on SVD to see if it catches up to some of the commercial models this year. Hopefully they rethink their licensing along the way.

Here is SVD's take on the wooly mammoths and flock of paper airplanes. Like Pika Labs, their model tends to lean on camera motion and zooming (especially in the plane example) rather than object motion.

Summary

Video AI is advancing rapidly in early 2024, from the high-production, heavyweight Sora to the quick and creative open source AnimateDiff. If AI image and text generation had their breakout years in 2022 and 2023, we believe that 2024 is the year when AI-generated video blows our collective minds!

Come explore your creativity with Twisty to get your feet wet.