Cloudflare’s new marketplace lets websites charge AI bots for scraping


Cloudflare announced plans on Monday to launch a marketplace in the next year where website owners can sell AI model providers access to scrape their site’s content. The marketplace is the final step of Cloudflare CEO Matthew Prince’s larger plan to give publishers greater control over how and when AI bots scrape their websites.

“If you don’t compensate creators one way or another, then they stop creating, and that’s the bit which has to get solved,” said Prince in an interview with TechCrunch.

As a means to get there, Cloudflare launched free observability tools for customers, called AI Audit, on Monday. Website owners will get a dashboard to view analytics on why, when, and how often AI models are crawling their sites for information. Cloudflare will also let customers block AI bots from their sites with the click of a button. Website owners can block all web scrapers using AI Audit, or let certain web scrapers through if they have deals or find their scraping beneficial.

A demo of AI Audit shared with TechCrunch showed how website owners can use the tool to see how AI models are scraping their sites. Cloudflare’s tool can identify where each scraper that visits your site comes from, and offers filtered views showing how many times scrapers from OpenAI, Meta, Amazon, and other AI model providers are visiting your site.

Demo of AI Audit. (Cloudflare)

Cloudflare is trying to address a problem looming over the AI industry: how will smaller publishers survive in the AI era if people go to ChatGPT instead of their website? Today, AI model providers scrape thousands of small websites for information that powers their LLMs. While some larger publishers have struck deals with OpenAI to license content, most websites get nothing, but their content is still fed into popular AI models on a daily basis. That could break the business models for many websites, reducing traffic they desperately need.

Earlier this summer, AI-powered search startup Perplexity was accused of scraping websites that had used the Robots Exclusion Protocol to indicate they did not want to be crawled. Shortly after, Cloudflare released a button to ensure customers could block all AI bots with one click.
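For readers unfamiliar with the Robots Exclusion Protocol, it is a plain-text file called robots.txt in which a site states which crawlers may visit which paths. The sketch below is only an illustration, not Cloudflare's tooling: the robots.txt contents, the example.com URL, and the `may_crawl` helper are hypothetical, and the user-agent names should be checked against each provider's own documentation. It uses Python's standard-library parser to show how a well-behaved crawler would read such a file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site: one AI crawler is barred
# from the whole site, while every other user agent may crawl it.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""


def may_crawl(user_agent: str, url: str) -> bool:
    """Return True if the robots.txt above permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("PerplexityBot", "GPTBot"):
        print(agent, may_crawl(agent, "https://example.com/article"))
```

The file only expresses a request. Nothing in the protocol forces a crawler to run a check like this one, which is why site owners have looked for blocking that is enforced at the network level.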

“That was out of frustration we were hearing, where people were feeling like their content was being stolen,” said Prince.

Some website owners told Business Insider that AI bots were scraping their websites so much, it felt like a DDoS attack was crippling their servers. Having your website scraped is not just upsetting; it can run up your cloud bill and degrade your service.

But what if you wanted to block Perplexity’s bots, but not OpenAI’s? Prince tells TechCrunch that Cloudflare’s customers are asking for tools that allow them to choose which AI models have access to their sites. Cloudflare’s new tools launching today will allow customers to block some AI crawlers, while letting others through.

Even large publishers that have struck licensing deals with OpenAI – such as TIME, Condé Nast, and The Atlantic – have relatively little insight into how much ChatGPT is scraping their websites, according to Prince. Many of them have to accept what OpenAI tells them, but the answer determines if the publishers are getting a good licensing deal or not.

But Cloudflare’s marketplace, launching sometime in the next year, aims to give small publishers the ability to strike deals with AI model providers as well.

“Let’s give all of you the ability to do what only Reddit, Quora, and the big publishers of the world have done previously,” said Prince. “What if we let you set, effectively, a price for accessing and taking your content to ingest into these systems.”

While it’s a bold idea, Cloudflare is not sharing a fully fleshed-out picture of what its marketplace will look like. Prince says websites could charge AI model providers based on the rates at which they’re scraping individual websites, but it’s unclear how much they will really pay. Further, he says websites could charge a monetary price to be scraped, or simply ask AI labs to give them credit. The details are fuzzy.

While AI companies may not initially be excited about paying for content they currently get for free, Cloudflare’s CEO says he thinks this is ultimately good for the AI ecosystem. Prince says the current landscape, where some AI companies never pay for content, is not sustainable.


