I Tried Midjourney’s AI-Powered Video Generator and It Exceeded My Expectations

While AI image generators built into chatbots may have gotten most of the attention lately, Midjourney’s dedicated AI image engine has been quietly improving and evolving since its launch three years ago. It now includes a video model, too.

According to Midjourney, this is another step toward creating an AI tool that can simulate 3D worlds in real time. The V1 video model was released with that end goal in mind, although it will take some time to get there.

Midjourney’s AI video generator works a little differently from other generators. You start with an image, either one generated by the AI or one you upload yourself, and Midjourney creates a five-second animation from it. These short clips can then be extended four seconds at a time, up to four times in total.

As usual with Midjourney, creating content will cost you GPU time (Midjourney’s version of credits): a second of video costs about the same as generating an image, and Midjourney’s pricing plans start at $10 per month and go up from there.
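The length and cost rules above work out to some simple arithmetic. As a rough sketch based only on the figures in this article (a five-second base clip, four-second extensions up to four times, and one image’s worth of credits per second of video — actual Midjourney billing may differ):

```python
# Back-of-the-envelope model of Midjourney's video rules as described
# in this article; the exact billing details are an assumption here.

BASE_SECONDS = 5        # initial clip length
EXTENSION_SECONDS = 4   # each extension adds this much
MAX_EXTENSIONS = 4      # extensions allowed per clip


def clip_length(extensions: int) -> int:
    """Total clip length in seconds after a given number of extensions."""
    if not 0 <= extensions <= MAX_EXTENSIONS:
        raise ValueError(f"extensions must be between 0 and {MAX_EXTENSIONS}")
    return BASE_SECONDS + extensions * EXTENSION_SECONDS


def cost_in_images(seconds: int) -> int:
    """Approximate cost in image generations, assuming ~1 image per second."""
    return seconds


longest = clip_length(MAX_EXTENSIONS)
print(longest)                # 21 -> maximum clip is 21 seconds
print(cost_in_images(longest))  # 21 -> about 21 images' worth of credits
```

So a fully extended clip tops out at 21 seconds and costs roughly as much as generating 21 images, which is why the per-second pricing matters more here than it does for stills.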

Making Videos in Midjourney

To create a video in Midjourney, you first need to create an image via the web interface. Type a prompt into the box at the top, using the sliders on the right to set parameters such as the aspect ratio. Be as specific as possible in your prompt, then press Enter (or click the submit icon) to run it.

As usual, Midjourney presents several results for your prompt, along with options for refining them. Among these are now four animation options for creating videos. Your first decision is whether to choose Auto (Midjourney decides what motion to add) or Manual (you describe the motion you want).

The second choice is between Low Motion (limited movement) and High Motion (everything in the frame moves, with a higher chance of glitches). Once you’ve made your choices, you can edit the prompt again (if you chose Manual) and the video will be created. As with images, you’ll see several options presented.

Click on any of the generated videos and you’ll see the same four animation options, only now they’re designed to extend the video further, which you can do a total of four times. You can mix Auto and Manual sections, as well as Low Motion and High Motion sections, to create the clip you’re looking for.

You’ll find your video’s download options above the prompt on the right: you can download the raw video or a social media-optimized version (which combats some of the compression applied when videos are posted to those platforms). To start over, click the original prompt and make changes to it.

Midjourney is an impressive AI image generator, and its videos reach the same standard. I tried creating a sci-fi cityscape and a natural landscape animation, and the end results were mostly consistent and logical, while following the prompts closely. Some of the typical AI-generated video oddities are present, such as weird physics, but even at this early stage, the V1 model is polished and capable.

In these clips you can see both the benefits and limitations of the Midjourney approach: each four-second segment flows smoothly into the next, but working in four-second chunks leaves little room to develop a longer scene, and as the video progresses you tend to lose some of the detail and richness of the original image.


Comparison of Midjourney with Sora and Gemini

If you pay OpenAI $20 or more per month for ChatGPT, you also have access to Sora. Like Midjourney, Sora lets you start a video from an image (AI-generated or otherwise) or from a new prompt.

I had Sora animate the futuristic sci-fi city and landscape images I had created in Midjourney, with mixed results. The scenes seemed more interesting, but they had more weirdness, such as unnatural movements and glitchy backgrounds (especially in the landscape animation, which turned out really strange).

You can use Sora to create videos up to 20 seconds long, but it gives you less control over how the scene unfolds than Midjourney does: you simply enter your prompt and take what you get back. At least for casual projects, Midjourney feels like the more accessible tool, capable of more realistic results.

I also tried creating the same scenes in Google’s Veo 2 via the Flow online app. Flow lets you base videos on images and extend scenes while maintaining continuity, as in Midjourney (you don’t get the same features with Veo 2 in the Gemini app). Overall, I’d say this gave me the results closest to what I was looking for, though there were still some inconsistencies and oddities.

You can see that the flying car descends believably through the cityscape, and that clear instructions are followed accurately. As for the animation flying through the cartoonish landscape, the results from Google Flow and Veo 2 were the best of all, although again you gradually lose some of the richness and detail present in the original image.

If your AI movie-making ambitions are grander, Google’s tools may be the best fit, though again there’s a price: video generation and access to Flow will set you back $20 or more per month. You can also pay $250 per month for Google’s AI Ultra plan, which gives expanded access to the more advanced Veo 3 model, complete with audio (though Veo 3 can’t yet create video from a still image).

It’s not the largest sample size, but the quality of Midjourney’s clips is clear, and its approach to video creation is simple and intuitive. Google’s Veo 2 remains the best choice for overall quality, while Sora is still quite chaotic and unpredictable; you’ll have to spend much more time with the OpenAI model to get acceptable results.
