How to Tell If What You’re Reading Was Written by Artificial Intelligence

This post is part of Lifehacker’s Artificial Intelligence Debunked series. We explore six different types of AI-generated media and highlight the common quirks, by-products, and telltale signs that will help you distinguish artificial content from human-made content.

From the moment ChatGPT introduced the world to generative artificial intelligence at the end of 2022, it became clear that you could no longer trust that what you read was written by a human. You can ask an AI program like ChatGPT to write something, anything, and it will do it in seconds. So how can you be sure that what you’re reading came from a human head and isn’t the product of an algorithm?

If the ongoing bursting of the AI bubble has shown us anything, it’s that most people generally hate AI, which means they probably don’t like the idea that what they’re reading was mindlessly spat out by a machine. However, some outlets have fully embraced AI’s ability to generate realistic text, for better or, often, for worse. Last year, CNET quietly began publishing AI-generated content alongside articles written by humans, and faced scorn and backlash from its own employees. Lifehacker’s former parent company, G/O Media, also published AI content on its sites, albeit openly, and experienced similar backlash, both for introducing the technology without employee input and because the content itself was simply terrible.

But not all AI-generated text announces itself so clearly. When used well, artificial intelligence programs can generate convincing text, though you can still find clues that reveal its non-human source.

How AI writing works

Generative AI is not some all-knowing digital consciousness that can answer your questions like a human would. In fact, it isn’t “intelligent” at all. Modern artificial intelligence tools are built on large language models (LLMs), deep learning algorithms trained on huge datasets, in this case datasets of text. That training informs everything they say in response to user prompts. When you ask ChatGPT to write something for you, the AI parses your prompt and determines what it thinks are the most important elements of your request. It then “predicts” the sequence of words most likely to answer your query, based on its understanding of how words relate to one another.
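
If you’re curious what that “prediction” looks like in practice, here’s a minimal Python sketch. It uses the small, openly downloadable GPT-2 model from Hugging Face’s transformers library as a stand-in, since ChatGPT’s own models aren’t something you can download, and asks it to score its top guesses for the next word:

```python
# A toy illustration of next-token prediction. GPT-2 is a distant,
# much smaller ancestor of ChatGPT's models, used here only because
# it's freely downloadable. Install with: pip install transformers torch
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # a score for every possible next token

# Look at the model's top five guesses for the word that comes next.
next_token_logits = logits[0, -1]
top = torch.topk(next_token_logits, k=5)
for token_id, score in zip(top.indices, top.values):
    print(repr(tokenizer.decode(int(token_id))), float(score))
```

The model isn’t looking anything up; it’s just ranking which token its training suggests should come next, which is all any LLM is doing, one word at a time.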

More powerful models can take in more information at once, and return longer, more natural-sounding results in kind. In addition, chatbots are typically programmed with specific instructions that apply to every prompt, which, if used strategically, can partially mask the usual tells of AI-generated text.

However, no matter how you coax an AI to respond, it is beholden to its training, and there will likely be signs that a given piece of text was generated by an LLM. Here’s what to look out for.

Keep track of frequently used words and phrases

Because chatbots are trained to look for connections between words, they tend to use certain words and phrases more often than a human. There’s no specific list of words and phrases that serve as red flags, but if you use a tool like ChatGPT often enough, you may start to notice them.

For example, ChatGPT uses the phrase “dig into” a lot, especially during transitions in an email (e.g., “Let’s dig into its meaning.”). The tool also likes to say an idea “underscores” an overall argument (e.g., “This experience underscores the importance of persistence…”) and that one thing is a “testament to” something else. (For this section, I generated three essays using ChatGPT, two with GPT-4o and one with GPT-4o mini, and a “testament” appeared in each of them.)

Likewise, you may see repeated use of words like “emerging,” “ruthless,” “innovative,” and other frequent offenders. In particular, when ChatGPT describes a collection of something, it often calls it a “mosaic” or a “tapestry” (e.g., “Madrid’s cultural landscape is a vibrant mosaic.”).

This Reddit thread from r/chatgpt highlights some of the words and phrases that come up frequently, though it’s worth noting the post is 10 months old, and OpenAI updates its models frequently, so some of them may be less relevant today. In my testing, some of the most-cited words in the Reddit thread didn’t appear in my test essays at all, while others certainly showed up quite often.
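
If you’d rather tally these tells than eyeball them, a few lines of Python will do it. This is just a quick sketch: the count_tells function and the phrase list are my own illustrative choices, drawn from the examples above, not any kind of official blocklist.

```python
import re
from collections import Counter

# Example tells drawn from this article; swap in whatever list you like.
SUSPECT_PHRASES = [
    "delve", "dig into", "underscores", "testament to",
    "tapestry", "mosaic", "vibrant", "innovative",
]

def count_tells(text: str) -> Counter:
    """Count case-insensitive, whole-word occurrences of each suspect phrase."""
    lowered = text.lower()
    return Counter({
        phrase: len(re.findall(r"\b" + re.escape(phrase) + r"\b", lowered))
        for phrase in SUSPECT_PHRASES
    })

sample = "Madrid's cultural landscape is a vibrant mosaic, a testament to its history."
for phrase, n in count_tells(sample).most_common():
    if n:
        print(f"{phrase}: {n}")
```

A high count isn’t proof of anything, as the next paragraph explains, but it can tell you when a closer read is warranted.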

Of course, all of these words are perfectly fine to use in your own writing. If a student writes “delve” in their essay, that’s not proof they created it with ChatGPT. If an employee writes in a report that something is a “testament to” something else, it doesn’t mean they’re outsourcing their work to AI. This is just one aspect of AI writing to keep in mind when analyzing a piece of text.

Consider the writing style

It’s impressive how quickly AI can generate an answer to a query, especially when you’re working with a particularly powerful LLM. And while some of that writing may seem natural enough, if you read carefully, you’ll start to notice quirks that most human writers wouldn’t produce.

Whether you use OpenAI’s GPT models or Google’s Gemini, AI has long had a bad habit of using flowery language, as if it were trained primarily on marketing copy. The AI will often oversell whatever it’s telling you: the city it writes about is “integral,” “vibrant,” and a “cornerstone” of the country it’s in; the analogy it uses “beautifully” underscores the overall argument; a negative consequence isn’t just bad, it’s “devastating.” None of these examples is damning on its own, but read enough AI-generated text and you’ll start to feel like you’re talking to a thesaurus.

This becomes even more obvious when a chatbot tries to adopt a casual tone. If a bot is impersonating a real person, for example, it will often come across as playful, over-the-top, and far too enthusiastic about whatever you say. To be fair, in my testing for this article, ChatGPT’s GPT-4o model didn’t seem to do this as much as it used to, preferring shorter responses to personal queries, but Meta AI’s chatbot still does, eagerly stepping into the role of friend and therapist whenever I brought up my made-up problems.

If you’re reading an essay or article that makes an argument, pay attention to how the “writer” structures their point of view. Someone who asks an AI tool to write an essay on a topic without giving it much guidance often ends up with a piece that doesn’t actually dig very deep into the argument. The AI will likely generate short paragraphs that offer surface-level points without particularly deepening the argument or advancing the narrative, masking those limitations with the aforementioned ten-dollar words and flowery language. Each paragraph may read as a summary of the argument rather than a contribution to the argument itself. Remember, the LLM doesn’t even know what it’s arguing; it simply strings together words it predicts belong together.

If you feel like you walked away from the piece without learning anything, AI may be to blame.

Fact-check and proofread

LLMs are black boxes. Their training is so complex that we can’t look inside and see exactly how they established their understanding of the relationships between words. What we do know is that every AI has the ability (and the tendency) to hallucinate. In other words, sometimes the AI just makes things up. Again, LLMs don’t actually know anything: they just predict word combinations based on their training. So while much of what they spit out will likely be rooted in truth, sometimes they predict incorrectly, and you can end up with some strange results on the other end. If you’re reading a piece of text and see a statement you know to be false presented as fact, especially without a source, be skeptical.

On the flip side, consider how much proofreading the piece seems to have needed. A total absence of typos and grammatical errors can also be a tell: these models may make things up, but they don’t make mechanical mistakes like spelling errors. Sure, it’s possible the author simply dotted every i and crossed every t, but if you already suspect the text was created by AI, that unnatural perfection might be another clue.

Try an AI text detector (but they can’t be trusted either)

AI detectors, like LLMs, are built on artificial intelligence models. But instead of training on large amounts of general text, these detectors are trained specifically on AI-generated text. In theory, that means they should be able to recognize AI-generated text when presented with a sample. In practice, that’s not always the case.
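
Under the hood, many detectors lean on statistical signals such as perplexity, a measure of how predictable a piece of text is to a language model, on the theory that AI-generated prose tends to be more predictable than human prose. Here’s a simplified, illustrative sketch of that one idea in Python, again using the freely downloadable GPT-2 model; real detectors combine signals like this with their own proprietary training:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def perplexity(text: str) -> float:
    """Lower perplexity = more predictable text, which some detectors
    treat as a (weak) hint that a language model wrote it."""
    input_ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # The model predicts each token from the ones before it;
        # the loss measures how surprised it was on average.
        loss = model(input_ids, labels=input_ids).loss
    return float(torch.exp(loss))

print(perplexity("Madrid's cultural landscape is a vibrant mosaic."))
```

The trouble, as you’re about to see, is that plenty of human writing is also highly predictable, which is exactly where these tools fall down.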

When I wrote about AI detectors last year, I warned against relying on them, because they aren’t as accurate as they claim. It’s hard to say how much they’ve improved since then: when I run one of my own stories through a tool like ZeroGPT, it says my article is 100% human-written. (Damn right.) If I feed it an essay generated by Gemini about the significance of Harry losing his parents in the Harry Potter series, the tool determines that 94.95% of the text was generated by artificial intelligence. (The only sentence it thought was written by a human: “This vested interest in the conflict sets Harry apart from other characters, giving him an unwavering purpose.” Of course.)

And yet the detector still fails the same test I ran on it in 2023: it believes 100% of Article 1, Section 2 of the U.S. Constitution was created by artificial intelligence. Someone tell Congress! I also had it analyze this short article from The New York Times, published on July 16, 2015, long before modern LLMs existed. Again, I was assured the piece was 100% AI-generated.

There are plenty of AI detectors on the market, and some are probably better than others. If you find one that reliably identifies text you know is human-written as human, and text you know is AI-generated as AI, go ahead and use it to check text you’re unsure about. But I still think the best method is to analyze the writing yourself. AI-generated text is becoming more realistic, but it still leaves plenty of clues that give it away, and often, you’ll know it when you see it.
