How to Tell If a Video You’re Watching Was Created Using Artificial Intelligence
This post is part of Lifehacker’s Artificial Intelligence Debunked series. We explore six different types of AI-generated media and highlight the common quirks, by-products, and hallmarks that will help you tell artificial content from human-made content.
AI companies are excited about video generators, touting the creative possibilities these new models offer and reveling in how impressive the end results can be. In my opinion, however, technology that lets anyone create realistic videos from a simple prompt is neither interesting nor promising. It is scary. Do you really want to live in a world where every video you see on the internet could have been created out of thin air using artificial intelligence? Whether you like it or not, this is where we are headed.
When you give bad actors the tools to manipulate videos to the point that many, if not most, people will believe they are real, you are adding gasoline to a fire that has been burning since the first person lied on the internet. Now more than ever, it is important to be vigilant about what we see online and critical of any video that purports to reflect reality, especially when that reality is intended to provoke us or influence our worldview.
AI videos aren’t all the same
There are actually two types of AI video worth paying attention to right now. The first is video generated entirely by AI models: whole sequences that use no real footage and never existed before they were created. Consider OpenAI’s Sora model, which can produce short but high-quality videos that could easily fool people into thinking they are real. Luckily for us, Sora is still in development and not yet available to the public, but there are other tools that people in the know can use to create these videos from scratch.
What’s more relevant at the moment, and more important in terms of short-term consequences, are AI-altered videos. Think of deepfakes: real videos in which AI superimposes one person’s face onto another’s, or alters a real face to match manipulated audio.
We’ll look at ways to detect both types of AI video content. As AI video generators improve and become more accessible, you may start to see these videos popping up online, just as AI images have exploded. Be careful.
How AI video generators work
Like other generative AI models, AI video generators are fed a huge amount of data in order to work. While AI image models are trained on individual images and learn to recognize patterns and relationships in static pictures, AI video generators are trained to look for relationships between multiple images and how those images change in sequence. After all, a video is simply a series of individual images played back fast enough to create the illusion of movement. If you want a program to create videos out of nowhere, it needs to be able not only to create the objects in those videos, but also to know how those objects should change from frame to frame.
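To make the “series of individual images” idea concrete, here is a minimal sketch in Python. It is not how any real video generator works internally; it only shows that a video can be treated as a list of frames, and that the change between consecutive frames can be measured. The `Frame` type and the function names are hypothetical, chosen for illustration.

```python
from typing import List

# Hypothetical stand-in for a video frame: a grayscale image stored as
# rows of pixel values from 0 (black) to 255 (white).
Frame = List[List[int]]


def mean_abs_difference(a: Frame, b: Frame) -> float:
    """Average per-pixel change between two frames of the same size."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += abs(pixel_a - pixel_b)
            count += 1
    return total / count


def frame_changes(frames: List[Frame]) -> List[float]:
    """How much each frame differs from the one before it."""
    return [mean_abs_difference(prev, cur) for prev, cur in zip(frames, frames[1:])]


# A toy three-frame "video": one bright pixel moving left to right
# across an image that is 1 pixel tall and 3 pixels wide.
video = [
    [[255, 0, 0]],
    [[0, 255, 0]],
    [[0, 0, 255]],
]

print(frame_changes(video))  # [170.0, 170.0]
```

The steady values reflect an object moving at a constant pace. A generator has to learn this kind of frame-to-frame consistency from its training data, which is part of why sudden, unexplained jumps between frames can be a warning sign.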
Deepfake programs are specially trained to work with faces and are designed to imitate the movements and emotions of the video they overlay. They often use a generative adversarial network (GAN), which pits two AI models against each other: one that generates AI content, and another that tries to determine whether that content is AI-generated. A model like Sora, on the other hand, is theoretically capable of generating video for almost any prompt. Sora is a so-called diffusion model: during training, “noise” (essentially static) is added to the training data until the original image disappears. The model then learns to reconstruct that data from the noise, which is what teaches it to create new content from scratch.
Fully AI-generated video is still in its infancy, and while deepfake technology is good, it isn’t great. Future versions of these technologies may not have the same limitations, but for now there are clues you can look for to determine whether the video you’re watching is real or manipulated.
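The noise-adding half of a diffusion model can be sketched in a few lines. This is a toy illustration under stated assumptions, not Sora’s actual code: it uses the common textbook update in which each step keeps a little less of the signal and mixes in a little Gaussian static, and the step size `beta`, the seed, and all function names are hypothetical. The learned half, where a neural network is trained to undo the noise, is not shown.

```python
import math
import random
from typing import List


def add_noise_step(signal: List[float], beta: float, rng: random.Random) -> List[float]:
    """One noising step: shrink the signal slightly and mix in Gaussian static."""
    keep = math.sqrt(1.0 - beta)  # how much of the current signal survives
    mix = math.sqrt(beta)         # how much fresh noise is added
    return [keep * value + mix * rng.gauss(0.0, 1.0) for value in signal]


def forward_diffusion(signal: List[float], steps: int,
                      beta: float = 0.05, seed: int = 0) -> List[List[float]]:
    """Return the signal after 0, 1, ..., `steps` noising steps."""
    rng = random.Random(seed)
    history = [list(signal)]
    for _ in range(steps):
        history.append(add_noise_step(history[-1], beta, rng))
    return history


def signal_retained(steps: int, beta: float = 0.05) -> float:
    """Fraction of the original signal's amplitude left after `steps` steps."""
    return math.sqrt(1.0 - beta) ** steps


pixels = [1.0, -1.0, 0.5]  # a toy "image" of three pixel values
history = forward_diffusion(pixels, steps=200)

print(history[0])            # the untouched original
print(history[-1])           # almost pure static
print(signal_retained(200))  # well under 1% of the original remains
```

After enough steps the original is effectively gone, which matches the description above. Training then consists of teaching a model to run this process in reverse, one step at a time.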
The faces don’t look quite right
The technology for superimposing one person’s face onto another’s is impressive, but it’s far from perfect. In many (if not most) cases, a deepfake will show obvious signs of being fake. It often looks like a mediocre Photoshop job: the face doesn’t blend with the rest of the person’s head, the lighting doesn’t match the scene, and the whole thing has an uncanny valley effect.
If you’re watching a video of a famous person saying or doing something controversial, take a close look at their face: is it possible that AI played a role? This video of “President Obama” saying ridiculous things shows some of the flaws. This deepfake was made six years ago, but it exhibits some of the noticeable visual flaws that this type of AI-altered video is known for.
Mouths don’t match speech
Likewise, another weakness of current deepfake technology is how hard it is to match the fake face’s mouth movements to the underlying speech, especially if the speech is also artificial.
Check out this deepfake of Anderson Cooper from last year. The fake face is more realistic than the one in the Obama video above, but the lip movements don’t match the speech given to AI Anderson.
Many of the deepfakes circulating on social media are so poorly made that they are obvious AI garbage if you know what you’re looking for. A lot of people don’t, so they see a video of a politician saying something they don’t like and assume it’s true, or they find it interesting enough that they don’t care whether it is.
Look for glitches and artifacts
Similar to AI image generators, AI video generators produce videos with strange glitches and artifacts. You may notice leaves on a tree flickering as the camera zooms in on them, or people walking in the background at a different frame rate than the rest of the video. Although the video below seems realistic at first glance, it is full of such errors, especially in the trees. (Also notice how the cars on the road to the left keep disappearing.)
But the worst of all? Deepfakes. These videos often look terrible, as if they have been uploaded and re-uploaded 1,000 times, losing all fidelity. This is done intentionally, in an attempt to disguise the flaws present in the video. Most deepfake videos would give themselves away in an instant if they were presented in 4K, as the high resolution would highlight all of the flaws described above. But when you lower the quality, it becomes easier to hide these flaws and thus easier to fool people into believing the video is genuine.
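A toy example shows why lowering quality hides flaws. The sketch below is an assumption-laden illustration, not a model of any real video codec: it represents one row of pixels with a single glitched value, then reduces resolution by averaging neighboring pixels. The function names and pixel values are hypothetical.

```python
from typing import List


def downsample(row: List[float], factor: int) -> List[float]:
    """Reduce resolution by averaging each group of `factor` neighboring pixels."""
    return [sum(row[i:i + factor]) / factor for i in range(0, len(row), factor)]


def peak_contrast(row: List[float]) -> float:
    """Largest jump in brightness between two neighboring pixels."""
    return max(abs(a - b) for a, b in zip(row, row[1:]))


# One row of an otherwise smooth image, with a single glitched pixel.
sharp_row = [100, 100, 100, 180, 100, 100, 100, 100]
blurry_row = downsample(sharp_row, 4)

print(peak_contrast(sharp_row))   # 80
print(blurry_row)                 # [120.0, 100.0]
print(peak_contrast(blurry_row))  # 20.0
```

At full resolution the glitch stands out as an 80-point jump in brightness. After averaging, it is smeared into a 20-point difference that is far easier to overlook, which is the same effect heavy compression and repeated re-uploading have on a deepfake’s telltale seams.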
The physics are off
A video camera captures the world as it is, at least as far as the camera’s lens and sensor allow. An AI video generator, on the other hand, creates a video based on what it has seen before, without any additional context. It doesn’t really know anything, so it fills in the blanks as best it can. This can result in some wonky physics in AI-generated video.
Sora, for example, created a video of a church on a cliff on the Amalfi Coast. At first glance, it looks quite convincing. However, if you focus on the ocean, you will see that the waves are actually moving away from the shore, the opposite of the direction they should be moving.
The generator also created a superficially convincing video of a person running on a treadmill. The big tell here is that the man is running “forward” while facing away from the treadmill, because the model does not understand exactly how treadmills are supposed to work. And if you look closely, you can see that the man’s gait is abnormal: the upper half of his body seems to pause from time to time while the lower half keeps moving. In the real world this would be impossible, but Sora doesn’t really understand how the physics of running are supposed to work.
In another video, “archaeologists” discover a plastic chair in the desert sands, pull it out, and shake off the dust. While this is a complex request for the model, and it does capture some realistic moments, the physics of the whole endeavor are far from right: the chair appears out of thin air, the person holding it carries it in a way no one ever could, and the chair floats away on its own, eventually transforming into something else entirely by the end of the clip.
Too many limbs
The AI models that create this video content don’t understand how many limbs you should have. They associate the movement of limbs between frames, but they don’t quite understand that the same limbs need to persist throughout the scene.
That’s why you’ll see arms, legs, and paws appear and disappear throughout a video. While this doesn’t always happen, you can see it in this Sora video: as the “camera” tracks the woman walking forward, a third arm sways in front of her, visible between her left arm and her left side. It’s subtle, but it’s the kind of thing AI video generators will do.
In this example, take a close look at the cat: towards the end of the clip, it suddenly grows a third paw, because the model doesn’t realize that this kind of thing doesn’t usually happen in the real world. Meanwhile, when the woman rolls over in bed, her “arm” seems to turn into the sheet.
Things just don’t make sense
Extra limbs don’t make much sense, but in AI videos the nonsense often goes beyond that. Again, these models don’t actually know anything: they’re just trying to fulfill a prompt based on the dataset they were trained on. They know, for example, that a town on the Amalfi Coast should have a lot of stone stairs, but they don’t seem to understand that those stairs must lead somewhere. In OpenAI’s demo video, many of these stairs are arranged haphazardly and serve no real purpose.
In the same video, look at the “people” in the crowd. At first glance, they may look like a bunch of tourists strolling through the town, but some of them disappear into thin air. Some look like they’re walking down stairs, only they aren’t using the stairs to nowhere either: they’re simply “walking downstairs” on level ground.
Look, it has long been important to take the things you see online with a grain of salt. You don’t need artificial intelligence to write misleading blog posts that distort the truth, or to manipulate a video clip so it tells the story you want. But AI video is different: it’s not just that the video itself is manipulated, it’s that none of it may ever have happened. It’s a shame that we have to be so cynical about the internet (and the world), but when a single prompt can create an entire video out of nothing, what other choice do we have?