Journalists Accuse This AI Chatbot of Stealing Their Work

Google introduced AI Overviews in search results shortly after Google I/O in May, but it wasn’t the first to enter the AI search game. Google had already given Gemini the ability to search the Internet, and Meta and other competing AI companies have done the same with their own models. One of the biggest players in this space has been Perplexity, which bills itself as a “conversational search engine”: essentially another web-enabled chatbot, but with an even greater emphasis on summaries and current events. Unfortunately, Perplexity is now in trouble for breaking the rules and, like Google, for returning wrong answer after wrong answer.

On June 11, Forbes published an article accusing Perplexity of stealing its content by quickly rewriting original articles without attribution and passing them off as its own. The AI company even went so far as to adapt Forbes’ reporting into podcast form. Shortly after, Wired published an exposé on Perplexity, accusing it of “bullshitting” and of violating a widely accepted internet rule (more on that in a moment). We’re now learning a lot more about what kind of recent data AI can use going forward, and why AI makes so many mistakes when it tries to summarize current events.

Perplexity is accused of violating a long-standing internet rule

Bots are nothing new on the Internet. Before AI companies scraped websites for training material, search engines crawled websites to determine where to place them in search results. This led to the creation of a standard called the Robots Exclusion Protocol, which lets developers specify which parts of their site they do not want bots to access. Perplexity claims to follow this rule, but, spurred by the Forbes story and by developer Robb Knight’s allegations of rule violations, Wired conducted its own investigation. What it found did not flatter Perplexity.
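As a rough illustration of how the protocol works: a site publishes a plain-text robots.txt file, and a well-behaved crawler checks it before fetching a page. The sketch below uses Python’s standard urllib.robotparser module. The rules, the bot names ExampleBot and OtherBot, and the example.com URLs are made up for illustration and are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; a real site serves this file at /robots.txt.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a rule set a crawler can query."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# ExampleBot is shut out of the whole site; other bots only of /private/.
for agent, url in [
    ("ExampleBot", "https://example.com/articles/story"),
    ("OtherBot", "https://example.com/articles/story"),
    ("OtherBot", "https://example.com/private/draft"),
]:
    print(agent, url, may_fetch(parser, agent, url))
```

Note that the check happens on the crawler’s side. The protocol is voluntary, so nothing in robots.txt technically stops a bot that simply skips this step.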

“Wired provided Perplexity’s chatbot with the headlines of dozens of articles published on our site this year, as well as prompts about the subjects of Wired reporting,” the Wired article said. According to the investigation, the bot would then return responses “closely paraphrasing Wired stories,” complete with original Wired illustrations. Additionally, it summarized the stories “inaccurately and with minimal attribution.” Examples include the chatbot wrongly accusing a police officer of stealing bicycles and, in a test, responding to a request to summarize a one-sentence web page with a completely fictitious story about a young girl going on a fairytale adventure. Wired concluded that Perplexity’s summaries were the result of the AI flagrantly violating the Robots Exclusion Protocol, and that its inaccuracies likely resulted from an attempt to circumvent that rule.

According to Knight and Wired, when users ask Perplexity questions that require the bot to summarize an article protected by the Robots Exclusion Protocol, a specific IP address running a so-called automated web browser accesses the websites that bots are not supposed to scrape. The IP address could not be traced to Perplexity with complete certainty, but its frequent association with the service was suspicious.
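A check of this kind can be sketched roughly as follows: scan a web server’s access log for requests to paths that robots.txt disallows, and tally them by IP address and user agent. This is only an illustration, not the method Knight or Wired used. The log lines, the 203.0.113.x and 198.51.100.x addresses (ranges reserved for documentation), the paths, and the simplified log format are all invented.

```python
import re
from collections import Counter

# Hypothetical paths that robots.txt asks all bots to stay out of.
DISALLOWED_PREFIXES = ("/private/",)

# Simplified pattern for a combined-log-style line: IP, request path, user agent.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?:GET|HEAD) (?P<path>\S+) [^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"$'
)


def hits_on_disallowed(log_lines):
    """Count requests per (IP, user agent) that touched a disallowed path."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.match(line.strip())
        if match and match["path"].startswith(DISALLOWED_PREFIXES):
            counts[(match["ip"], match["agent"])] += 1
    return counts


# Invented sample log lines, for illustration only.
SAMPLE_LOG = [
    '203.0.113.7 - - [01/Jan/2024:10:00:00 +0000] "GET /private/draft HTTP/1.1" 200 512 "-" "Mozilla/5.0 (HeadlessBrowser)"',
    '203.0.113.7 - - [01/Jan/2024:10:00:05 +0000] "GET /private/notes HTTP/1.1" 200 734 "-" "Mozilla/5.0 (HeadlessBrowser)"',
    '198.51.100.4 - - [01/Jan/2024:10:01:00 +0000] "GET /articles/story HTTP/1.1" 200 2048 "-" "ExampleBot/1.0"',
]

for (ip, agent), count in hits_on_disallowed(SAMPLE_LOG).items():
    print(ip, agent, count)
```

A tally like this shows which address fetched the disallowed pages and how it identified itself, but not who operates it, which is why the attribution to Perplexity remained uncertain.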

In other cases, Wired recognized traces of its metadata in Perplexity’s responses, which could mean that the bot is not reading the articles themselves but is instead accessing traces of them left in URLs and search engines. Those traces aren’t protected by the Robots Exclusion Protocol, but they contain so little information that they are more likely to lead to AI hallucinations, hence the problem with misinformation in AI search results.

Both of these issues foreshadow a battle over the future of AI in search engines, from both an ethical and a technical perspective. Even as artists and other creators debate AI’s right to scrape older works, accessing works created just days ago exposes Perplexity to further legal risk.

Perplexity CEO Aravind Srinivas issued a statement to Wired saying that “Wired’s questions reflect a deep and fundamental misunderstanding of how Perplexity and the Internet work.” Meanwhile, Forbes reportedly sent Perplexity a letter this week threatening legal action for “willful infringement” of its copyrights.
