Reddit CEO: Microsoft and other companies must pay to scrape data

WBOY
Release: 2024-08-01 15:17:23
Original
721 people have browsed it

According to news from this website on August 1, Reddit CEO Steve Hoffman recently stated that if companies such as Microsoft want to continue to crawl the website’s data, they must pay. Reddit has previously reached agreements with Google and OpenAI.

Reddit CEO:微软等公司必须付费才能抓取数据

Image via Pexels
Image via Pexels
Hoffman noted that without these agreements, Reddit has no control or visibility into how its data is used, forcing them to block companies that are unwilling to accept the terms on which their data is used. He singled out three companies, Microsoft, Anthropic and Perplexity, for refusing to negotiate and called blocking them "very troublesome."
In recent months, Reddit has been increasing its efforts to crack down on scrapers. In early July, Reddit updated its robots.txt file to block unauthorized web crawlers. It was later discovered that Reddit content only appeared in Google search results and was not visible on other search engines such as Bing.
Hoffman accused Microsoft of using Reddit data to train AI without authorization, summarizing Reddit content in Bing search results, and even selling this data to other search engines through the Bing API. He also responded to Microsoft AI chief Mustafa Suleiman’s previous remarks about Internet public data being “free software,” saying that companies such as Microsoft believe that all content on the Internet is free for them to use and that this is theirs. True position.
This site noticed that in response to the disappearance of Reddit search results from Bing, Microsoft search director Jody Ribas said on social media that Reddit blocked Bing’s crawlers and favored another search engine, affecting Bing and Bing-based Search engine competition. Microsoft spokesperson Caitlin Lawton also said the company respects websites’ wishes not to have their content used in generative AI models.
Hoffman used OpenAI’s SearchGPT as an example to emphasize the importance of paid agreements. Earlier this year, Reddit and OpenAI reached an agreement to allow SearchGPT to display Reddit content. Reddit spokesman Tim Rutschmidt said that none of the current content licensing agreements involve exclusive data use rights.
Reddit’s request for payment is similar to traditional media publishers who also hope to gain revenue from allowing content to be used for generative AI. Hoffman believes that the traditional value exchange of search engines has changed. Search, summary and training are integrating, and the model of simply relying on crawling content in exchange for traffic has become blurred.

The above is the detailed content of Reddit CEO: Microsoft and other companies must pay to scrape data. For more information, please follow other related articles on the PHP Chinese website!

Related labels:
source:ithome.com
Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
Popular Tutorials
More>
Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template