Web infrastructure and security company Cloudflare on March 19 introduced a free tool to confuse AI web crawlers, keeping them from flooding sites with bot traffic or scraping information. AI Labyrinth uses generative AI to create fake pages that distract and identify the bots.

Users must be Cloudflare customers to use AI Labyrinth, but the tool is available starting in the free tier.

What AI crawlers do and how they hurt businesses

“Like any newer tool it [generative AI] has both wonderful and malicious uses,” wrote Cloudflare’s Senior Director of Product Reid Tatoris, Product Manager Harsh Saxena, and Senior Software Engineer Luis Miglietti in a blog post.

Large AI companies built their fortunes on scraping content from the internet to train their models. Plenty of website owners have reasons to keep such scrapers out, and multiple techniques exist for doing so.

As pointed out by Unite.AI, swarms of bots can slow down servers, increasing both hosting costs and page load times. If an AI generates content too close to existing website content, the duplicate can reduce the original site’s SEO rankings. Some content creators and organizations might want to block AI scrapers from their sites due to copyright concerns, such as those recently raised by creators after the news that Meta used a library of pirated content to train AI.

How AI Labyrinth fights back

Cloudflare’s AI Labyrinth has two main purposes: diverting AI crawlers and identifying bots. It works by embedding hidden links in a protected website. A bot that follows those links lands on pregenerated pages filled with AI-generated content. That content contains real scientific information – the Cloudflare personnel said they didn’t want to contribute to AI-generated misinformation – but it is unrelated to the real website the bot is crawling. All of the information on the decoy pages is publicly available, so the bots waste time crawling information they already know. The original website stays untouched while the AI companies waste resources, Cloudflare said.
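To make the hidden-link idea concrete, here is a minimal Python sketch of how a decoy link might be added to a page before it is served. It is an illustration under stated assumptions, not Cloudflare’s implementation: the `/labyrinth/` path prefix, the `inject_hidden_link` function, the link text, and the choice of `display:none` styling are all hypothetical.

```python
# Hypothetical path prefix for decoy pages; not a real Cloudflare path.
DECOY_PREFIX = "/labyrinth/"


def inject_hidden_link(html: str, decoy_id: str) -> str:
    """Return html with one invisible decoy link placed before </body>."""
    link = (
        f'<a href="{DECOY_PREFIX}{decoy_id}" rel="nofollow" '
        'style="display:none" aria-hidden="true" tabindex="-1">archive</a>'
    )
    # Find the last closing body tag, ignoring case.
    idx = html.lower().rfind("</body>")
    if idx == -1:
        # No body tag found: append the link at the end of the document.
        return html + link
    return html[:idx] + link + html[idx:]


page = "<html><body><p>Welcome</p></body></html>"
protected = inject_hidden_link(page, "a1")
```

A browser would not render the link, and assistive technology would skip it, but a crawler that parses raw HTML and follows every `href` would still find it.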

The second purpose, identifying bots, is possible because the labyrinth links are hidden from human visitors, so only bots will engage with them. Using that information, Cloudflare can monitor new bot patterns and signatures. It’s automation all the way down: Information about those bots feeds back into Cloudflare’s machine-learning system to analyze more crawlers.
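The detection side works like a honeypot: any client that requests a decoy URL is very likely automated. The Python sketch below shows that logic on a toy request log. The log format, the `/labyrinth/` prefix, and the `flag_suspected_bots` function are assumptions made for illustration; Cloudflare has not published its detection code, and its real system feeds far richer signals into machine-learning models.

```python
from collections import defaultdict

# Hypothetical path prefix for decoy pages; not a real Cloudflare path.
DECOY_PREFIX = "/labyrinth/"


def flag_suspected_bots(requests, min_hits=1):
    """Return the set of client IDs that requested decoy pages.

    requests is an iterable of (client_id, path) pairs.
    A client is flagged once it has at least min_hits decoy requests.
    """
    hits = defaultdict(int)
    for client_id, path in requests:
        if path.startswith(DECOY_PREFIX):
            hits[client_id] += 1
    return {client for client, count in hits.items() if count >= min_hits}


log = [
    ("visitor-a", "/"),
    ("crawler-b", "/labyrinth/x"),
    ("crawler-b", "/labyrinth/y"),
    ("visitor-c", "/about"),
]
suspects = flag_suspected_bots(log)
```

Raising `min_hits` trades sensitivity for fewer false positives, for example if a prefetching browser extension occasionally touches a hidden link.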

Humans won’t see the maze of AI slop

Cloudflare put several guardrails in place to make sure the cure isn’t as bad as the disease. Human visitors can’t see the content meant for the bots; therefore, AI Labyrinth doesn’t add more generic slop to the web. Search engines can’t index the AI-generated pages because Cloudflare added meta directives to that end, so the fake pages won’t affect SEO rankings.
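The article does not say which meta directives Cloudflare uses. The standard way to keep a page out of search results is a robots meta tag with `noindex`, so the Python sketch below assumes that approach. The `render_decoy_page` and `is_noindex` helpers are hypothetical names used only for this example.

```python
from html import escape
from html.parser import HTMLParser


def render_decoy_page(title: str, body_text: str) -> str:
    """Build a tiny decoy page that asks search engines not to index it."""
    return (
        "<!doctype html><html><head>"
        '<meta name="robots" content="noindex, nofollow">'
        f"<title>{escape(title)}</title></head>"
        f"<body><p>{escape(body_text)}</p></body></html>"
    )


class _RobotsMeta(HTMLParser):
    """Collect the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.directives.update(p.strip().lower() for p in content.split(","))


def is_noindex(html: str) -> bool:
    """Return True if the page carries a robots noindex directive."""
    parser = _RobotsMeta()
    parser.feed(html)
    return "noindex" in parser.directives


decoy = render_decoy_page("Cell biology notes", "Mitochondria produce ATP.")
```

Well-behaved search engine crawlers honor this tag, while a scraper that ignores it gains nothing except more time spent in the maze.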

Cloudflare users will see a toggle marked AI Labyrinth in their control menu. Go to Security | Bots or Security | Settings to turn it on.