Definition
What this term means
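Automated bots that discover and fetch web content for model training, real-time retrieval, or both. Most are operated by AI companies. Commonly cited examples include GPTBot (OpenAI), ClaudeBot (Anthropic), PerplexityBot (Perplexity), and CCBot (Common Crawl, a nonprofit whose open datasets are widely used for model training). Google-Extended (Google) is usually listed alongside them, but it is a robots.txt control token rather than a separate crawler: it tells Google whether content fetched by its existing crawlers may be used for its AI models. Each serves a different purpose and can be controlled individually by user-agent token in robots.txt, although compliance with those directives is voluntary on the crawler's side.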
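As a rough illustration of per-crawler control, the sketch below feeds a made-up robots.txt to Python's standard urllib.robotparser and asks which crawlers may fetch a page. The rules, the example.com URL, and the crawler_access helper are assumptions chosen for the example, not a recommended policy, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: allow GPTBot and ClaudeBot everywhere,
# block CCBot entirely, and keep /private/ off limits for everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: CCBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def crawler_access(robots_txt: str, url: str, agents: list[str]) -> dict[str, bool]:
    """Return, for each user-agent token, whether robots_txt permits fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "CCBot"]
access = crawler_access(ROBOTS_TXT, "https://example.com/blog/post", AGENTS)

for agent, allowed in access.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

PerplexityBot has no group of its own in this file, so it falls back to the wildcard group and is only kept out of /private/.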
Why it matters
The business impact
AI crawlers are the main route by which your content enters the AI ecosystem. If your website blocks a crawler, the platform it serves generally cannot fetch your pages directly, which makes it far less likely that your content will be indexed, retrieved, or cited there. Understanding which AI crawlers exist, what each one is used for, and how to allow or block them is essential for maintaining and growing AI visibility.
Used in context
How you might use this term
“A company configured its robots.txt to explicitly allow GPTBot, ClaudeBot, and PerplexityBot, then monitored its server logs to track crawl frequency. The logs showed that PerplexityBot was the most active crawler, visiting key pages daily, which was consistent with Perplexity citing the company's content more often than other AI platforms did.”
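The log monitoring described in this example could be approximated with a short script like the one below. It is a minimal sketch that assumes access logs in the common "combined" format, where the user agent is the last quoted field. The sample lines, IP addresses, and simplified user-agent strings are invented for illustration, and count_crawler_hits is a hypothetical helper name.

```python
from collections import Counter

# User-agent tokens to look for (matched case-insensitively).
AI_CRAWLERS = ("GPTBot", "ClaudeBot", "PerplexityBot", "CCBot")

# Invented sample lines in combined log format; real user-agent strings are longer.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Mar/2025:06:12:01 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "PerplexityBot/1.0"',
    '203.0.113.9 - - [10/Mar/2025:06:15:44 +0000] "GET /blog/post HTTP/1.1" 200 8841 "-" "GPTBot/1.0"',
    '203.0.113.5 - - [10/Mar/2025:07:02:13 +0000] "GET /blog/post HTTP/1.1" 200 8841 "-" "PerplexityBot/1.0"',
    '198.51.100.7 - - [10/Mar/2025:07:30:00 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
    '203.0.113.20 - - [10/Mar/2025:08:45:09 +0000] "GET /docs HTTP/1.1" 200 3300 "-" "ClaudeBot/1.0"',
]


def count_crawler_hits(log_lines):
    """Count requests per AI crawler, based on the user-agent field of each line."""
    counts = Counter()
    for line in log_lines:
        parts = line.rstrip().rsplit('"', 2)
        if len(parts) < 3:
            continue  # line does not end with a quoted user-agent field
        user_agent = parts[1].lower()
        for name in AI_CRAWLERS:
            if name.lower() in user_agent:
                counts[name] += 1
                break
    return counts


hits = count_crawler_hits(SAMPLE_LOG)
for name, total in hits.most_common():
    print(f"{name}: {total}")
```

User-agent strings can be spoofed, so counts like these are best treated as an indication of crawl activity rather than proof of which platform made each request.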