The Best AI Video Generators (and How They Compare to Each Other)

AI video generators are quickly improving and becoming more accessible: Google’s Veo 2 is now built into the Gemini app for anyone who pays for Google’s One AI Premium plan. Like OpenAI’s Sora, Runway, Adobe’s Firefly, and others, Veo 2 lets you create professional-looking videos from just a text prompt.

Now that Veo 2 is available to paying users, it seems like a good opportunity to test these different AI video generators against each other, compare their strengths and weaknesses, and evaluate where we are with AI video in general. We keep being told that these tools will change filmmaking, or at least fill the internet with AI garbage, but are they really useful in practice?

Microsoft seems to think so, having used AI in a recent ad. However, only parts of the clip were generated by artificial intelligence: the shots with quick cuts and limited movement, where hallucinations are less likely to occur or be noticed.

For the purposes of this guide, I’m going to look at Google Veo 2 and compare it to Sora, Runway, and Firefly. There are other video generators out there, but these are the four best known. Each of them costs money to access (starting at $20 per month), so you’ll need to sign up for at least a month to play around with them.

Bouncing balls

If you’re as old as I am, you’ll remember the incredible ad Sony made to promote its new 1080p Bravia TVs in 2005 (see above). With the cameras rolling, over 100,000 bouncy balls were dropped onto the steep streets of San Francisco, and it was a spectacular sight (the behind-the-scenes story is pretty funny, too).

This is a real challenge for AI, involving a lot of physics and movement. The prompt I used was: “Thousands of individual brightly colored balls bouncing down a steep, sunny San Francisco street in slow motion. The camera moves steadily down the street as the balls bounce along, past trees and parked cars.”

Google’s Veo 2 attempt isn’t bad. There’s some weird physics going on, but it looks natural enough and could work as a short clip if you don’t look too closely. The background elements are well rendered, and the instructions in my original prompt were followed fairly closely.

Sora doesn’t seem to understand the scene it’s supposed to be depicting. Sure, there are colored balls, but they move in a confusing mess and defy gravity. The pacing of the video is fine, even if the camera moves in the opposite direction to the one I requested, and the background portions of the video generally look fine.

Runway captures the atmosphere pretty closely when you compare it to the original Sony clip, but again there are a few problems: the balls are completely inconsistent, the movement isn’t what I asked for, and there appears to be an alien watching from the window in the top-right corner. The street looks pretty cool, though.

Firefly is probably the worst of the bunch here. Most of the balls are stationary, and those that do move aren’t rendered very well. The street looks OK, but there’s nothing special about it; it definitely has a retro video game vibe. As in the Sora video, the camera moves up the street when I really wanted it to move down.

Scene from Jurassic Park

If AI is going to replace real people making movies, then it must be able to create scenes as powerful as the “Welcome to Jurassic Park” scene in Spielberg’s 1993 film: the moment when Richard Attenborough as John Hammond shows off the dinosaurs to his visitors for the first time (see above).

I was curious to see what the AI would make of this scene. The prompt was: “At the top of a hill, two paleontologists are walking slowly through the grass. The camera pulls back for a wider shot, revealing a wide clearing and a lake below. Dinosaurs walk slowly through the lake and trees.”

The Google Veo 2 clip looks pretty good. The camera doesn’t really move the way I described, and the paleontologists don’t move around much (and they’re not on a hill, either), but the scenery looks good and the dinosaurs look OK. Overall it’s a fairly generic interpretation, but a worthy attempt.

Sora goes a bit crazy with this prompt. The camera movements are choppy and don’t follow my instructions, and the dinosaurs look like strange shape-shifting creatures. The best I can say about this clip is that all the elements I described are included, and the surrounding scenery is done quite well.

As for Runway, it was probably the closest to what I wanted in terms of camera movement and the overall feel of the scene. The lake and dinosaurs look realistic enough, but it’s by no means a perfect rendering (where does the red-shirted paleontologist go?).

This is another failed attempt from Firefly. I’m not sure it knows what paleontologists are, and the dinosaurs are tiny. The lake and surrounding forest are rendered to a high standard, even if there’s a noticeable AI sheen across the frame. The camera movements are handled well here, though.

The Living Daylights scene

And then there’s Bond and Kara’s memorable border-crossing scene in The Living Daylights, where they slide down a snowy mountain on a cello case (above). I don’t have to hire Timothy Dalton or Maryam d’Abo, learn how to use a camera, or travel to Austria, because AI can create the whole scene for me.

The prompt for this one was: “A man and a woman in winter clothes are sliding along a snowy road on a cello case. There is a barrier across the road, and when they reach it, both characters dive under it.”
Google Veo 2 handles this pretty well, all things considered: the scene looks mostly realistic and fun, and the object they’re riding does look a bit like a cello case. We do have to ignore the two people passing through the road barrier as if it weren’t there, but at least the barrier is present (which is more than some of the other AI models managed).

Back to Sora, and again, it’s not that bad. OK, it’s not exactly a cello case, and the two people aren’t facing forward the way you’d expect, but the snowy road and surrounding trees look good; it’s a scenic shot. Where’s my barrier, though, Sora? I wanted to see these people dive under it.

As for Runway, whatever videos it was trained on, they certainly weren’t videos of people riding down mountains on cello cases. The people merge into each other, elements in the frame change shape, and the whole thing looks strange. However, the snowy landscape and the snow-spray effects look good.

Who knows what Adobe Firefly was thinking here. The physics in this clip make absolutely no sense, the characters are inconsistent, and there’s no barrier to dive under. It’s actually unpleasant to watch. But we do have a snowy road, a cello case, and two people in the video.

No clear winner

I think the Veo 2 videos impressed me the most overall, although Runway comes closest to realism more often than not. Across the board, there are plenty of problems with physics, realism, and prompt interpretation. These are all clearly AI videos, with many strange artifacts and inconsistencies.

I didn’t expect these AI generators to come close to the quality of professional ads or movies: you simply can’t recreate scenes like these with just a text prompt and a few minutes of time and effort. I’m not trying to cheapen these tools, which are obviously very clever, but rather to point out some fundamental problems with AI video.

These balls don’t bounce. Credit: Adobe Firefly/Lifehacker

With more careful work and experience, I could probably get something that looks a lot better, and it’s obvious that these video generators will improve over time. Who knows what they’ll be able to produce in five or ten years? If you watch the showcase videos on these platforms, you’ll see that great results are possible.

However, I’m personally not sure these AI tools will ever fully replace traditional filmmaking, no matter how well they’re trained. To get something close to Sony’s bouncy-ball ad, you’d have to write a series of incredibly detailed prompts, and even then you might not get what you wanted. Would AI ever think to include a frog jumping out of a drainpipe? Sure, the results are quick and easy, but you’re handing most of the creative decisions over to the AI, and these videos still look computer-generated.

One of these people is about to disappear. Credit: Runway/Lifehacker

AI doesn’t actually know how a ball bounces, what a dinosaur looks like, or which way people should face as they slide down a snowy road on a cello case. It approximates, calculating from all the videos it has seen before, and those imperfections show up in video much more often than in images or text. You’ll notice that most AI videos, including the examples above, don’t have elements moving in and out of the frame, because the AI will likely forget what they look like once they’re out of view.

And I didn’t even have space here to cover copyright issues or the energy cost to the planet. No doubt we’ll see more AI-generated commercials and short films as time goes on and the technology improves, but it’s worth returning to the famous warning from Jurassic Park: we’re so preoccupied with whether we can that we don’t stop to think about whether we should.

Disclosure: Lifehacker’s parent company, Ziff Davis, filed a lawsuit against OpenAI in April, alleging that it violated Ziff Davis’ copyrights in the training and operation of its artificial intelligence systems.