Definition
What this term means
An XML file that gives search engines and AI crawlers a structured list of the important URLs on a website, along with optional metadata about each page: when it was last modified, how frequently it changes, and its relative priority. A sitemap serves as a roadmap that helps crawlers discover, prioritise, and efficiently index your content.
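For illustration, a minimal sitemap in the standard sitemaps.org protocol format might look like the following; the example.com URL and the metadata values are placeholders. Only the <loc> element is required, while <lastmod>, <changefreq>, and <priority> are optional hints.

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/products/widget</loc>
    <lastmod>2024-05-01</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>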
Why it matters
The business impact
For AI visibility, sitemaps help ensure that AI crawlers discover all of your important pages, especially new content, recently updated pages, and deep-linked resources that might not be found through navigation alone. A well-maintained sitemap with accurate last-modified dates also reinforces freshness signals, which AI systems weigh when deciding which sources to prioritise in their responses.
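One way to keep those last-modified dates trustworthy is to generate the sitemap from the content itself rather than hand-editing dates. A minimal Python sketch, assuming pages are stored as HTML files under a hypothetical public/ directory and that example.com stands in for your domain:

from datetime import datetime, timezone
from pathlib import Path

SITE_ROOT = Path("public")            # hypothetical content directory
BASE_URL = "https://www.example.com"  # placeholder domain

def lastmod(path: Path) -> str:
    # Derive lastmod from the file's real modification time so the
    # sitemap reflects actual changes rather than a hard-coded date.
    ts = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
    return ts.strftime("%Y-%m-%d")

entries = []
for page in sorted(SITE_ROOT.rglob("*.html")):
    loc = f"{BASE_URL}/{page.relative_to(SITE_ROOT).as_posix()}"
    entries.append(
        f"  <url>\n    <loc>{loc}</loc>\n"
        f"    <lastmod>{lastmod(page)}</lastmod>\n  </url>"
    )

sitemap = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
    + "\n".join(entries)
    + "\n</urlset>\n"
)
(SITE_ROOT / "sitemap.xml").write_text(sitemap, encoding="utf-8")

Regenerating the sitemap on every publish keeps lastmod honest; dates that never change, or that all change at once, give crawlers reason to discount the signal.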
Used in context
How you might use this term
“A company with 500+ product pages noticed that AI crawlers were indexing only 120 of them. After implementing a comprehensive XML sitemap with accurate lastmod dates and submitting it through Google Search Console, the company achieved full indexation within two weeks, significantly expanding its AI-discoverable product catalogue.”
Related terms
Explore connected concepts
Freshness Signals
The collection of indicators that tell search engines and AI systems how recently content was created or updated. Freshness signals include Last-Modified headers, sitemap lastmod dates, visible 'last updated' dates on pages, recent internal and external references, and the frequency of content changes detected by crawlers. Together, these signals help AI systems determine whether content is current and reliable.
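To see the freshness signal a page sends at the HTTP level, you can inspect its Last-Modified response header. A quick sketch using Python's standard library; the URL is a placeholder, and not every server returns this header:

from urllib.request import Request, urlopen

# HEAD request: inspect response headers without downloading the body.
req = Request("https://www.example.com/products/widget", method="HEAD")
with urlopen(req) as resp:
    print(resp.headers.get("Last-Modified"))  # e.g. Wed, 01 May 2024 09:30:00 GMT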
Robots.txt
A plain text file placed at the root of a website that tells web crawlers which pages and directories they are allowed to access and which are off-limits. Robots.txt is the primary mechanism for controlling how both traditional search engine crawlers and AI-specific crawlers (such as GPTBot, Google-Extended, and ClaudeBot) interact with your website content.
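As a sketch, a robots.txt that welcomes the AI crawlers named above, blocks a hypothetical admin area for everyone, and advertises the sitemap location might look like this; the paths and domain are placeholders:

User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: *
Disallow: /admin/

Sitemap: https://www.example.com/sitemap.xml

The Sitemap directive at the end is also how many crawlers discover a sitemap without any explicit submission.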
Crawl Budget
The total number of pages that search engine and AI crawlers will fetch from your website within a given time period. Crawl budget is determined by a combination of your site's perceived authority, server performance, URL structure, and content freshness signals. Crawlers allocate their budget based on these factors, spending more time on sites they consider valuable and efficient to crawl.
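To get a rough view of how crawlers are actually spending that budget, you can count their fetches in your server access logs. A minimal Python sketch, assuming a hypothetical access.log in a common format where the user agent appears on each line:

from collections import Counter

# Crawler user-agent substrings to look for; extend as needed.
CRAWLERS = ("Googlebot", "GPTBot", "ClaudeBot")

hits = Counter()
with open("access.log", encoding="utf-8") as log:
    for line in log:
        for bot in CRAWLERS:
            if bot in line:
                hits[bot] += 1
                break

for bot, count in hits.most_common():
    print(f"{bot}: {count} fetches")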