$ man ai-crawlers
AI Crawlers
Specialized web crawlers and robots.txt tokens that AI companies use to discover and index content for their models and search features, including GPTBot, PerplexityBot, ClaudeBot, and Google-Extended.
by Shawn Tenam
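Each AI engine has its own crawler with its own quirks. GPTBot doesn't execute JavaScript, so if your site isn't server-side rendered, OpenAI can't see your content. OAI-SearchBot indexes pages so they can be surfaced and cited in ChatGPT search. PerplexityBot indexes pages for Perplexity's answers, which cite their sources by default. ClaudeBot is Anthropic's crawler; Claude's web search reportedly draws on the Brave Search index. Google-Extended is a robots.txt control token rather than a separate crawler: it governs whether content fetched by Google's existing crawlers can be used for Gemini, while AI Overviews follow the normal Googlebot rules. If you block any of these in robots.txt, that engine can't use your content. I explicitly allow all of them.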
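As a rough illustration of what "explicitly allow" means, the sketch below parses a robots.txt body with Python's standard `urllib.robotparser` and reports which crawler tokens may fetch a URL. The robots.txt text, the `allowed_agents` helper, and the example.com URL are assumptions made for the example, not the actual ShawnOS configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body; the tokens are the crawlers named above.
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: OAI-SearchBot
User-agent: PerplexityBot
User-agent: ClaudeBot
User-agent: Google-Extended
Allow: /

User-agent: *
Allow: /
"""

AI_AGENTS = ["GPTBot", "OAI-SearchBot", "PerplexityBot", "ClaudeBot", "Google-Extended"]


def allowed_agents(robots_txt: str, url: str, agents: list[str]) -> dict[str, bool]:
    """Map each user-agent token to whether robots_txt lets it fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    # example.com is a placeholder URL.
    for agent, ok in allowed_agents(ROBOTS_TXT, "https://example.com/", AI_AGENTS).items():
        print(f"{agent}: {'allowed' if ok else 'blocked'}")
```

Running the same helper against a file containing `User-agent: GPTBot` followed by `Disallow: /` would report GPTBot as blocked while leaving the other tokens allowed, which is the per-engine opt-out described above.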
All three ShawnOS sites have robots.txt files that explicitly allow these AI crawlers: GPTBot, ChatGPT-User, PerplexityBot, ClaudeBot, Applebot-Extended, Google-Extended, and OAI-SearchBot. The sites are server-side rendered with Next.js, so crawlers that don't run JavaScript can still read everything. No content depends on client-side JavaScript.
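One way to sanity-check the "no JavaScript-only content" claim is to look at the raw HTML a server returns and ask whether a key phrase appears outside `<script>` and `<style>` tags. The sketch below does that with the standard `html.parser` module. It only approximates what a non-JavaScript crawler reads (it is not how any specific bot parses pages), and `VisibleText`, `visible_without_js`, and both sample pages are hypothetical names and data made up for the example.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text outside <script> and <style> elements."""

    SKIPPED = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_without_js(raw_html: str, phrase: str) -> bool:
    """Return True if phrase appears in the markup without running any scripts."""
    parser = VisibleText()
    parser.feed(raw_html)
    parser.close()
    return phrase in " ".join(parser.chunks)


# Hypothetical server-rendered page: the heading is already in the HTML.
SSR_PAGE = '<html><body><div id="root"><h1>AI Crawlers</h1></div><script>hydrate()</script></body></html>'
# Hypothetical client-rendered page: the heading exists only inside a script.
CSR_PAGE = '<html><body><div id="root"></div><script>root.textContent = "AI Crawlers"</script></body></html>'

if __name__ == "__main__":
    print(visible_without_js(SSR_PAGE, "AI Crawlers"))
    print(visible_without_js(CSR_PAGE, "AI Crawlers"))
```

In practice the input would be the unrendered response body for a page, for example what `curl` returns, rather than the DOM a browser shows after hydration.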