Perplexing Bots
The economic tension at the heart of Perplexity and Cloudflare's recent conflict
Perplexity and Cloudflare are two companies with very different views on the future of the internet. Recently, they came into conflict after Cloudflare published a blog post accusing Perplexity of some shady practices. The issue, the misidentification of web crawlers, seems purely technical on its surface. Yet it is a symptom of a much deeper issue: the economic structure of the internet.
Robots, bots, scrapers and web crawlers are all names for automated programs that scour websites for information, digging through search results and cataloguing content for training AI. When a bot requests a webpage, it announces itself through its User-Agent. This is a string of text, sent as an HTTP header, that tells the web server who sent the bot. This exchange of ID allows a publisher of content to be selective about who can scrape their site.
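To make the idea of being selective concrete, here is a minimal sketch of how a server might screen requests by User-Agent. The blocklist tokens and the function name are made up for illustration and do not reflect any real publisher's policy.

```python
# Hypothetical blocklist: these tokens are illustrative, not real crawler names.
BLOCKED_AGENT_TOKENS = ("examplebot", "samplecrawler")


def is_allowed(user_agent: str) -> bool:
    """Return False when the User-Agent string contains a blocked token."""
    ua = user_agent.lower()
    return not any(token in ua for token in BLOCKED_AGENT_TOKENS)


# A bot that names itself honestly is turned away...
print(is_allowed("Mozilla/5.0 (compatible; ExampleBot/1.0)"))
# ...while a string that looks like an ordinary browser passes.
print(is_allowed("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))
```

Because the User-Agent is self-reported, a check like this only works when the bot is honest about who it is.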
Before it scrapes anything, a well-behaved bot checks the server for a text file called robots.txt. This file tells a bot whether it may scrape the site and which parts are off limits. The robots.txt convention originated in the 1990s as an informal agreement between scrapers and site administrators. This arrangement was never codified into any regulation or law, so compliance is voluntary.
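As a rough illustration of that check, the sketch below uses Python's standard urllib.robotparser module to read a robots.txt and decide whether a given bot may fetch a URL. The file contents, the bot names and the example.com URLs are invented for the example.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt: one named bot is banned outright,
# everyone else is only kept out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def can_crawl(user_agent: str, url: str) -> bool:
    """Ask the parsed robots.txt rules whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


print(can_crawl("ExampleBot", "https://example.com/articles/1"))
print(can_crawl("OtherBot", "https://example.com/articles/1"))
print(can_crawl("OtherBot", "https://example.com/private/notes"))
```

Nothing in this exchange is enforced. The file only states the site's wishes, and the bot decides whether to run a check like this at all.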
In their blog post, Cloudflare claims to have identified problematic behavior from Perplexity’s bots. Cloudflare alleges that if one of Perplexity’s bots were disallowed by a website, an alternate bot would be used. The alternate bot would not identify itself and would ignore the robots.txt.
Perplexity composed a scathing response. The company claims that there is a clear distinction between the two bots: one is “just a bot,” presumably the one that self-identifies, while the other is something that “serves the needs of real people.” In a recent Hard Fork interview, Perplexity CEO Aravind Srinivas said of the second type of bot: “it’s literally like a user delegated an AI to open these [Chrome] tabs just like how a human would on Chrome.” This is agentic AI.
Agentic AI is the new hype. The vision is that one day we will have a myriad of bots crawling the internet on our behalf and scheduling Ubers, buying groceries, and writing our emails. This future of convenience is very appealing. Yet for publishers and content creators, the agentic world may have dramatic economic consequences.
A world with agentic AI could redefine the basic economic relationships that govern the internet. A recent Pew report shows that when given an AI summary on Google, people click 50% fewer links than people not shown the summary. Publishers have long relied on advertising revenue through direct user engagement. Your click-through to their site puts money in a publisher's pocket. Agentic AI promises to further distance the user from the content and threatens advertising income.
Matthew Prince, the CEO of Cloudflare, has announced one possible solution to this problem. Pay per Crawl seeks to create a marketplace for content. Cloudflare is a security and content delivery platform, founded in 2009, that serves 20% of the internet. Its systems position it to be a middleman for a new economic model for content creators. Pay per Crawl would allow the owner of a website to set a price for their content. Bots would have to agree to the publisher's price before they could scrape the site's data.
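The price agreement described above can be sketched as a simple handshake. This is a hypothetical illustration, not Cloudflare's actual interface: the class, function and parameter names are invented, and the sketch assumes a refused crawler receives the standard HTTP status 402 (Payment Required) along with the asking price.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CrawlResponse:
    status: int        # HTTP status code returned to the crawler
    body: str          # page content, empty when payment is required
    price_usd: float   # the publisher's asking price for this page


def handle_crawl(publisher_price_usd: float,
                 crawler_max_price_usd: Optional[float]) -> CrawlResponse:
    """Serve the page only if the crawler has agreed to pay the asking price.

    crawler_max_price_usd is None when the crawler offers no payment at all.
    """
    if (crawler_max_price_usd is not None
            and crawler_max_price_usd >= publisher_price_usd):
        return CrawlResponse(200, "<article content>", publisher_price_usd)
    # No offer, or an offer below the asking price: quote the price instead.
    return CrawlResponse(402, "", publisher_price_usd)


print(handle_crawl(0.01, 0.05).status)  # offer covers the price
print(handle_crawl(0.01, None).status)  # no offer made
```

The point of the design is that the publisher, not the crawler, sets the terms. The middleman only checks whether the offer meets the price.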
In the same Hard Fork interview, Srinivas accused Prince of trying to become the new gatekeeper of the internet. In a Pay per Crawl world, Cloudflare would benefit as a middleman. However, it would not be gatekeeping content so much as limiting the unfettered access that Big AI has enjoyed to date. Pay per Crawl threatens agentic dreams by adding another layer of cost to an already expensive undertaking. For instance, some estimates put the cost of training GPT-5 alone at between $1.7 and $2.5 billion.
Big publishers, like the New York Times, are protected by the relationships they can create with AI companies. A system like Pay per Crawl would empower smaller publishers and creators to profit from their work.
It’s a brave new world. We don’t yet know the full impact of AI. Any groundbreaking new technology, like the printing press, threatens many existing institutions. But this shaky foundation creates opportunities to dream of a better world and experiment with novel solutions.