AI Crawler robots.txt Generator

Decide which AI crawlers can read your site - and generate the robots.txt rules to enforce it.

100% free · No sign-up · Runs in your browser

All AI crawlers allowed

GPTBot (OpenAI): Trains ChatGPT models
OAI-SearchBot (OpenAI): Powers ChatGPT search results
ChatGPT-User (OpenAI): Fetches a page when a user asks ChatGPT about it
ClaudeBot (Anthropic): Trains Claude models
anthropic-ai (Anthropic): Legacy Anthropic crawler token
Claude-Web (Anthropic): Fetches pages for Claude answers
Google-Extended (Google): Controls Gemini / Vertex AI training
Applebot-Extended (Apple): Controls Apple Intelligence training
CCBot (Common Crawl): Open dataset that many AI models train on
Bytespider (ByteDance): Trains ByteDance / TikTok AI
Amazonbot (Amazon): Powers Alexa and Amazon AI features
Meta-ExternalAgent (Meta): Trains Meta AI / Llama models

Output
# robots.txt - AI crawler controls
# Generated free with Outcited - https://outcited.co/free-tools/ai-robots-txt-generator
# Note: robots.txt is a voluntary standard. Reputable AI crawlers honor it; not every bot does.

# OpenAI - Trains ChatGPT models
User-agent: GPTBot
Allow: /

# OpenAI - Powers ChatGPT search results
User-agent: OAI-SearchBot
Allow: /

# OpenAI - Fetches a page when a user asks ChatGPT about it
User-agent: ChatGPT-User
Allow: /

# Anthropic - Trains Claude models
User-agent: ClaudeBot
Allow: /

# Anthropic - Legacy Anthropic crawler token
User-agent: anthropic-ai
Allow: /

# Anthropic - Fetches pages for Claude answers
User-agent: Claude-Web
Allow: /

# Google - Controls Gemini / Vertex AI training
User-agent: Google-Extended
Allow: /

# Apple - Controls Apple Intelligence training
User-agent: Applebot-Extended
Allow: /

# Common Crawl - Open dataset that many AI models train on
User-agent: CCBot
Allow: /

# ByteDance - Trains ByteDance / TikTok AI
User-agent: Bytespider
Allow: /

# Amazon - Powers Alexa and Amazon AI features
User-agent: Amazonbot
Allow: /

# Meta - Trains Meta AI / Llama models
User-agent: Meta-ExternalAgent
Allow: /

Paste this into the robots.txt at the root of your site (yourdomain.com/robots.txt). If you already have one, merge these User-agent blocks into it.

Most guides tell you to block AI. For GEO, the opposite is usually true.

There's a wave of advice telling site owners to slam the door on AI crawlers. That makes sense if you're protecting paid or proprietary content. But if your goal is to show up when buyers ask ChatGPT, Claude, or Gemini for recommendations, blocking those crawlers makes you invisible at exactly the moment that matters.

This generator gives you precise, per-crawler control. Allow the bots that help you get cited, block the ones you don't want training on your content, and export a clean robots.txt you can paste straight in. Each crawler is labelled with who runs it and what it does, so you're never guessing.

How to use it

  1. Start from the default (all allowed) or hit Block all, then fine-tune.
  2. Toggle any individual crawler between Allowed and Blocked.
  3. Optionally add your sitemap URL so crawlers find your pages faster.
  4. Copy the output into your robots.txt, or download the file.
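
The steps above can be sketched in a few lines of Python. This is an illustrative sketch, not the generator's actual code: `build_robots_txt`, the `choices` mapping, and the sitemap URL are hypothetical names chosen for the example.

```python
def build_robots_txt(choices, sitemap_url=None):
    """Render one User-agent block per crawler; True means allowed.

    Hypothetical helper for illustration only.
    """
    lines = ["# robots.txt - AI crawler controls", ""]
    for agent, allowed in choices.items():
        lines.append(f"User-agent: {agent}")
        lines.append("Allow: /" if allowed else "Disallow: /")
        lines.append("")  # blank line separates blocks
    if sitemap_url:
        lines.append(f"Sitemap: {sitemap_url}")
        lines.append("")
    return "\n".join(lines)


# Example choices (assumed): block one training crawler, allow two others.
choices = {"GPTBot": False, "OAI-SearchBot": True, "ChatGPT-User": True}
robots_txt = build_robots_txt(
    choices, sitemap_url="https://yourdomain.com/sitemap.xml"
)
print(robots_txt)
```

Each crawler gets its own block, so changing one toggle only changes one `Allow` or `Disallow` line.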

Frequently asked questions

Should I block or allow AI crawlers?

It depends on your goal. If you want to be discovered, cited, and recommended by AI assistants - which is the whole point of GEO - you generally want to ALLOW crawlers like GPTBot, ClaudeBot, and Google-Extended. Blocking is for content you specifically don't want used for training or retrieval. This tool defaults to allow for that reason.

What's the difference between training and retrieval bots?

Training crawlers and tokens (GPTBot, ClaudeBot, Google-Extended, CCBot) govern whether your pages are collected to train future models. Retrieval bots (ChatGPT-User, OAI-SearchBot, Claude-Web) fetch pages so an assistant can answer with current content: ChatGPT-User fetches a page live when a user asks about it, and OAI-SearchBot powers ChatGPT search results. Blocking retrieval bots can stop AI assistants from quoting your current page - usually the opposite of what you want.
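
As a rough illustration of that split, the sketch below opts out of two training crawlers while leaving two retrieval bots allowed, then checks the result with Python's standard `urllib.robotparser`. The policy and the page URL are assumptions made up for the example, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: opt out of training, stay reachable for retrieval.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

page = "https://yourdomain.com/pricing"  # placeholder URL
agents = ("GPTBot", "ClaudeBot", "OAI-SearchBot", "ChatGPT-User")
results = {agent: parser.can_fetch(agent, page) for agent in agents}

for agent, allowed in results.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```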

Does robots.txt actually stop AI crawlers?

robots.txt is a voluntary standard. The major, reputable AI crawlers honor it, but it is not a hard security boundary - it can't force a non-compliant bot to obey. For pages that must stay private, use authentication, not robots.txt.
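
One way to see why it is voluntary: the check runs inside the crawler's own code, not on your server. The sketch below contrasts a crawler that consults the rules with one that never looks at them. `ExampleBot`, `OtherBot`, and both function names are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule blocking a made-up crawler named ExampleBot.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /
"""


def compliant_crawler_may_fetch(user_agent, url, robots_txt):
    """A well-behaved crawler runs this check itself before requesting a page."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def non_compliant_crawler_may_fetch(user_agent, url, robots_txt):
    """A bot that ignores robots.txt never consults the rules at all."""
    return True


url = "https://yourdomain.com/private-report"  # placeholder URL
print(compliant_crawler_may_fetch("ExampleBot", url, ROBOTS_TXT))
print(non_compliant_crawler_may_fetch("ExampleBot", url, ROBOTS_TXT))
```

Nothing in the file itself can make the second function behave like the first, which is why private pages need authentication.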

Where do I put these rules?

In the robots.txt file at the root of your domain (yourdomain.com/robots.txt). If you already have one, merge the User-agent blocks from this tool into it rather than replacing the whole file.
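
A minimal sketch of that merge, assuming you want to append only the blocks for crawlers your file does not already mention and review any overlaps by hand. `declared_agents` and `merge_blocks` are hypothetical helpers, and the existing file shown is invented for the example.

```python
import re


def declared_agents(robots_txt):
    """Return the lowercase User-agent tokens already present in the file."""
    pattern = r"(?im)^[ \t]*user-agent[ \t]*:[ \t]*([^#\n]+)"
    return {m.group(1).strip().lower() for m in re.finditer(pattern, robots_txt)}


def merge_blocks(existing, generated_blocks):
    """Append generated blocks whose agent is not already declared.

    generated_blocks maps an agent token to its block text. Agents that
    already appear in the existing file are left alone and returned in a
    separate list so they can be reconciled manually.
    """
    present = declared_agents(existing)
    merged = existing.rstrip("\n")
    skipped = []
    for agent, block in generated_blocks.items():
        if agent.lower() in present:
            skipped.append(agent)
            continue
        merged += "\n\n" + block.strip("\n")
    return merged + "\n", skipped


# Invented existing file that already has a GPTBot rule.
existing = "User-agent: *\nDisallow: /admin/\n\nUser-agent: GPTBot\nDisallow: /\n"
generated = {
    "GPTBot": "User-agent: GPTBot\nAllow: /",
    "ClaudeBot": "User-agent: ClaudeBot\nAllow: /",
}
merged, skipped = merge_blocks(existing, generated)
print(merged)
print("Review manually:", skipped)
```

Skipping agents that already exist avoids leaving two conflicting blocks for the same crawler in one file.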
