# El Cafesito By Mexicanada — robots.txt
# Restaurant website: https://www.elcafesitomexicanada.ca

# ---------------------------------------------------------------
# Default rules for all crawlers
# ---------------------------------------------------------------
User-agent: *
Allow: /

# Legal pages (excluded from indexing)
Disallow: /privacy-policy.html
Disallow: /terms-of-use.html

# AI-only resource — should not be indexed as a regular search result
Disallow: /llms.txt

# Common non-content URL patterns
Disallow: /config/
Disallow: /search/
Disallow: /account/
Disallow: /api/
Disallow: /static/
Disallow: /*?*author=*
Disallow: /*?*tag=*
Disallow: /*?*month=*
Disallow: /*?*view=*
Disallow: /*?*format=*

# ---------------------------------------------------------------
# AI / LLM crawlers (training and answer engines)
# Allow site-wide crawling but keep legal + llms.txt out of indexes
# ---------------------------------------------------------------
User-agent: GPTBot
User-agent: ChatGPT-User
User-agent: OAI-SearchBot
User-agent: CCBot
User-agent: anthropic-ai
User-agent: Claude-Web
User-agent: ClaudeBot
User-agent: Google-Extended
User-agent: PerplexityBot
User-agent: cohere-ai
User-agent: FacebookBot
User-agent: Applebot-Extended
Allow: /
Disallow: /privacy-policy.html
Disallow: /terms-of-use.html
Disallow: /llms.txt

# ---------------------------------------------------------------
# Google Ads bots — must be allowed everywhere ads run
# ---------------------------------------------------------------
User-agent: AdsBot-Google
User-agent: AdsBot-Google-Mobile
User-agent: AdsBot-Google-Mobile-Apps
Allow: /

# ---------------------------------------------------------------
# Crawl delay for heavy bots
# ---------------------------------------------------------------
User-agent: Baiduspider
Crawl-delay: 10

# ---------------------------------------------------------------
# Sitemap
# ---------------------------------------------------------------
Sitemap: https://www.elcafesitomexicanada.ca/sitemap.xml