Photo by Andrew Neel on Pexels
Reddit is taking a firm stance against AI data harvesting by blocking the Internet Archive from indexing the majority of its content. The decision stems from Reddit’s belief that AI companies are exploiting the Wayback Machine to circumvent platform policies, particularly those related to privacy and content removal. While the Internet Archive will retain access to Reddit’s homepage, it will be restricted from indexing post details, comments, and user profiles.
This move follows Reddit’s established practice of limiting access to data scrapers. The platform has recently forged partnerships with AI giants like Google and OpenAI for authorized data usage. Furthermore, Reddit is currently engaged in legal action against Anthropic, accusing them of unauthorized data scraping. This latest action underscores the escalating friction between online platforms, digital archives, and AI developers over the ethical and legal parameters of data utilization. The Internet Archive has yet to issue a statement regarding the development.