Glossary

Robots.txt

A plain text file at the root of your website that tells search engine and AI crawlers which pages they may access.

Definition

What this term means

A plain text file placed at the root of a website that tells web crawlers which pages and directories they may or may not access. Standardized as the Robots Exclusion Protocol (RFC 9309), robots.txt is the primary mechanism for managing how both traditional search engine crawlers and AI-specific crawlers (like GPTBot, Google-Extended, and ClaudeBot) interact with your website content. Note that compliance is voluntary: the file directs well-behaved crawlers rather than technically enforcing access.
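As a sketch of the syntax, rules are grouped under a User-agent line, with Disallow and Allow directives matched against URL paths (the paths here are illustrative):

```
# Applies to all crawlers
User-agent: *
# Keep crawlers out of this directory...
Disallow: /admin/
# ...except this subtree
Allow: /admin/public/
```

An empty `Disallow:` line (or a missing group) permits everything for that crawler.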

Why it matters

The business impact

Your robots.txt configuration directly determines whether AI crawlers can access and index your content. Blocking AI crawlers, whether intentionally or accidentally, means your content cannot be discovered, retrieved, or cited by the AI platforms that honor those rules. Conversely, strategically allowing access while blocking staging, duplicate, or low-value pages steers AI systems toward your best content.
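For instance, a configuration along these lines would welcome major AI crawlers while keeping a staging directory closed to everyone (the bot names are real user-agent tokens; the paths are hypothetical):

```
# Explicitly allow major AI crawlers site-wide
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: Google-Extended
Allow: /

# All other crawlers: skip staging and low-value duplicates
User-agent: *
Disallow: /staging/
Disallow: /print/
```

Per RFC 9309, a group may list several User-agent lines, and a crawler follows the most specific group that matches its name, falling back to the `*` group otherwise.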

Used in context

How you might use this term

An audit revealed that a company's robots.txt was inadvertently blocking GPTBot and ClaudeBot due to an overly broad disallow rule. After updating the configuration to explicitly allow major AI crawlers while keeping staging environments blocked, their AI visibility began improving within weeks as content was re-indexed.
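One way to catch this kind of misconfiguration early is to test your rules with Python's standard-library robots.txt parser. A minimal sketch, using hypothetical file contents and URLs:

```python
from urllib import robotparser

# Hypothetical robots.txt: AI crawlers allowed everywhere,
# everyone else kept out of /staging/
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /

User-agent: *
Disallow: /staging/
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# GPTBot matches its own group, which allows the whole site
print(rp.can_fetch("GPTBot", "https://example.com/blog/post"))    # True

# An unlisted bot falls back to the * group and is blocked from /staging/
print(rp.can_fetch("SomeBot", "https://example.com/staging/x"))   # False
```

For a live site, `RobotFileParser("https://example.com/robots.txt")` followed by `rp.read()` fetches and parses the deployed file, so the same checks can run against production in an automated audit.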
Ready to improve AI visibility?

Put This Knowledge Into Action

Understanding the language of AI visibility is the first step. See how your brand performs across AI systems with a free scan.