Technical · 5 min read

Which AI Bots Should You Allow in robots.txt?

A breakdown of the 10 major AI crawlers and why blocking them hurts your AI visibility.

Your robots.txt file tells crawlers which parts of your website they may fetch, and well-behaved bots follow it. Many sites unknowingly block AI search bots, often through an old blanket rule, which makes their content invisible to AI-powered search engines.

The 10 Major AI Crawlers

| Bot | Company | Service |
|-----|---------|---------|
| GPTBot | OpenAI | ChatGPT (model training) |
| ChatGPT-User | OpenAI | ChatGPT browsing |
| ClaudeBot | Anthropic | Claude AI |
| Claude-Web | Anthropic | Claude web search |
| PerplexityBot | Perplexity | Perplexity AI |
| Google-Extended | Google | Gemini (training and grounding) |
| Amazonbot | Amazon | Alexa / Amazon search |
| Applebot-Extended | Apple | Apple Intelligence (model training) |
| cohere-ai | Cohere | Cohere models |
| Bytespider | ByteDance | TikTok / Doubao AI |

Why You Should Allow AI Bots

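If you block OpenAI's crawlers, for example, ChatGPT loses access to your content. OpenAI documents GPTBot as its model-training crawler and a separate agent, OAI-SearchBot, for ChatGPT search, so a site that blocks both is unlikely to surface when someone asks ChatGPT a question its content could answer.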

The same applies to every other AI bot. Each one feeds a particular AI product, and blocking it makes that product less likely to cite or recommend your website.

How to Check Your robots.txt

Look for rules like:

    User-agent: GPTBot
    Disallow: /

This blocks GPTBot entirely. To allow it, either remove the rule or change it to:

    User-agent: GPTBot
    Allow: /
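Reading rules by eye gets tedious once a file has several groups, so here is a minimal sketch of the same check in Python using the standard library's `urllib.robotparser`. The `AI_BOTS` list copies the tokens from the table above; the `blocked_bots` helper and the `EXAMPLE` file are hypothetical names made up for illustration. Python's parser is only an approximation of how each real crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens taken from the table of AI crawlers in this post.
AI_BOTS = [
    "GPTBot", "ChatGPT-User", "ClaudeBot", "Claude-Web", "PerplexityBot",
    "Google-Extended", "Amazonbot", "Applebot-Extended", "cohere-ai",
    "Bytespider",
]


def blocked_bots(robots_txt: str, path: str = "/") -> list[str]:
    """Return the bots from AI_BOTS that may not fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_BOTS if not parser.can_fetch(bot, path)]


# Hypothetical robots.txt: one explicit block plus a rule for everyone else.
EXAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

print(blocked_bots(EXAMPLE))             # ['GPTBot']
print(blocked_bots(EXAMPLE, "/admin/"))  # all ten bots in AI_BOTS
```

The second call shows how a wildcard group applies to every bot that has no group of its own, which is a common way sites block AI crawlers without naming them.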

Should You Block Any AI Bots?

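For most websites, the answer is no. Keep in mind that robots.txt governs crawling, not licensing: an Allow rule lets a bot fetch your public pages, and what happens next depends on the bot. Some tokens, such as GPTBot, Google-Extended, and Applebot-Extended, are documented by their vendors as covering model training, while others, such as ChatGPT-User and PerplexityBot, fetch pages to answer user queries.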

The only exception might be if you have content behind a paywall or have specific licensing concerns. Even then, you can allow access to certain paths while blocking others.
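As an illustration of path-level rules, the sketch below defines a hypothetical policy that keeps a `/premium/` directory off limits to GPTBot while leaving the rest of the site open, then checks it with Python's `urllib.robotparser`. The `/premium/` path, the sample URLs, and the `allowed` helper are assumptions made for the example, not paths from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: GPTBot may crawl everything except /premium/.
# The narrower Disallow comes first because Python's parser applies the
# first matching rule, whereas crawlers such as Googlebot use the longest
# match. With this ordering both interpretations agree.
RULES = """\
User-agent: GPTBot
Disallow: /premium/
Allow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def allowed(path: str, bot: str = "GPTBot") -> bool:
    """Report whether `bot` may fetch `path` under RULES."""
    return parser.can_fetch(bot, path)


print(allowed("/blog/ai-bots"))    # True
print(allowed("/premium/report"))  # False
```

Bots with no matching group and no wildcard group, such as PerplexityBot in this file, are allowed everywhere by default.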

Quick Check

Use our free AI Bot Checker tool to instantly see which AI bots can access your website and which ones you're blocking.
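For a rough do-it-yourself version of the same check, the sketch below fetches a site's live robots.txt with Python's standard library. It assumes the file sits at the conventional `/robots.txt` location; `robots_url` and `check_site` are hypothetical helper names, and this is not how the AI Bot Checker tool itself is implemented.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser


def robots_url(page_url: str) -> str:
    """Build the robots.txt URL for the site that hosts `page_url`."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def check_site(page_url: str, bots: list[str]) -> dict[str, bool]:
    """Fetch the live robots.txt and map each bot to True if it may fetch page_url."""
    parser = RobotFileParser(robots_url(page_url))
    parser.read()  # performs a real HTTP request
    return {bot: parser.can_fetch(bot, page_url) for bot in bots}


# Usage (needs network access, so it is left commented out):
# print(check_site("https://example.com/", ["GPTBot", "ClaudeBot", "PerplexityBot"]))
```

One caveat: if the server answers the robots.txt request with 401 or 403, `RobotFileParser.read()` treats the whole site as disallowed, so an all-False result can mean the fetch was refused rather than that the bots are blocked.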
