Reddit Blocked Bing Because of Artificial Intelligence, Not Because of the Google Deal

After Reddit results started disappearing from search engines not named Google last week, the company has finally come forward to explain why, essentially downplaying the search problem and saying it’s fed up with artificial intelligence companies giving free training her data.

“We forced Microsoft, Anthropic and Perplexity to act as if all content on the Internet was free to them,” Reddit CEO Steve Huffman told The Verge . “…which has put us in a position where we can block people who are unwilling to agree to how we want our data to be used or not used.”

Huffman accused Microsoft of training its AI on Reddit data collected through Bing, and of trading and selling that data through the Bing API. Reddit’s removal of searches appears to have been largely just a byproduct of blocking that process, although the company was also unhappy that search engine Bing used AI to summarize its posts without requiring users to click through.

But how can fighting AI hinder search? While they may seem like completely different technologies, both AI and search rely on “web crawlers” that crawl the Internet and collect data that can be stored, displayed, or used elsewhere. Such technology is necessary for search engines to work the way they do, but it can also be used to train AI. When websites update their files to block web crawlers, it breaks both.

Given that Huffman spent most of his time talking about artificial intelligence, it seems that the crux of the problem is that Reddit doesn’t want companies learning from their users’ data without having any say in the process. Huffman told The Verge that companies such as Microsoft, Anthropic and Perplexity have refused to negotiate.

“Without these agreements, we have no say or knowledge about how our data is displayed or what it is used for.” The CEO said it was a “real headache” to block these companies.

Note that this does not mean that Reddit is completely altruistic. Earlier this year, the company signed a $60 million-a-year licensing deal that allows Google to train its AI to work with users’ posts, which also explains why Reddit posts continue to show up unchecked in Google searches. Likewise, OpenAI can also learn from Reddit posts , and its future SearchGPT engine will be able to link to them, although the specific dollar amount underlying Reddit’s agreement with the creator of ChatGPT has not been disclosed.

Instead of being anti-AI, Reddit wants to be part of the decision-making process about what happens to its data. Yes, and also receive a salary.

The Verge reported that Huffman cited a recent comment by Microsoft AI CEO Mustafa Suleiman as an example of the type of behavior the company is seeking to combat. Speaking to CNBC’s Andrew Ross Sorkin at the Aspen Ideas Festival, the executive said content “that was already on the open web… was ‘free’, if you will.”

This is certainly a creative interpretation of copyright law, but it’s not unique to Microsoft. Despite Google’s deal with Reddit, last July Gizmodo spotted a change to Google’s privacy policy, which states that the company uses “publicly available information” to train its AI models, without acknowledging that Google doesn’t actually own everything. what is published on the Internet. .

While it’s unclear how Google defines “public,” a new agreement with Reddit may shed some light on the topic. At this point, AI training could go from being free for everyone to a point where those who can afford to make companies pay will get their due (assuming that profiting from selling user-generated content before AI was a thing). something is considered deserved). in your eyes). Along with Reddit, The Verge’s parent company Vox Media also has a deal with OpenAI , as does The Atlantic . For the rest of us, we will have to rely on regulation that is slow to respond to AIoutside the EU .

More…

Leave a Reply