LLMs.txt Check

The LLMs.txt check validates whether your website has a properly configured llms.txt file that controls how AI training systems can use your content.

What this check validates

  • File exists: An llms.txt file is present at /llms.txt
  • Proper format: The file follows standard llms.txt syntax
  • Valid directives: Rules use correct User-agent and Disallow fields
  • Accessibility: The file is publicly accessible and not blocked by the server
  • Clear permissions: The file explicitly states which AI training uses are allowed or disallowed
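The checks above can be sketched as a small validation pass. This is a minimal illustration only; the function name and the exact rule set are assumptions, not this check's actual implementation:

```python
# Sketch of an llms.txt validation pass (illustrative; the rules
# enforced here are assumptions, not this check's exact logic).

VALID_FIELDS = {"user-agent", "allow", "disallow"}

def validate_llms_txt(status_code, body):
    """Return a list of problems found; an empty list means the file passes."""
    if status_code != 200:
        return ["file missing or not accessible at /llms.txt"]
    problems = []
    saw_agent = saw_rule = False
    for raw in body.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue
        field, sep, value = line.partition(":")
        if not sep:
            problems.append(f"syntax error: {line!r}")
            continue
        field = field.strip().lower()
        if field not in VALID_FIELDS:
            problems.append(f"invalid directive: {field!r}")
        elif field == "user-agent":
            saw_agent = True
        else:
            saw_rule = True
    if not saw_agent:
        problems.append("no User-agent directive")
    if not saw_rule:
        problems.append("no Allow/Disallow rules; permissions unclear")
    return problems
```

A file that returns HTTP 200 and contains at least one User-agent line plus one Allow or Disallow rule passes; anything else produces a finding for each issue.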

Why LLMs.txt matters

  • Content Control: Specify which content can be used for AI training
  • Copyright Protection: Protect proprietary or sensitive content
  • Licensing Compliance: Ensure AI systems respect your content usage terms
  • Future-Proofing: Prepare for evolving AI training regulations

What LLMs.txt looks like

Three basic llms.txt configurations:

# Allow all AI systems to train on content
User-agent: *
Allow: /

# Block all AI training
User-agent: *
Disallow: /

# Allow specific content only
User-agent: *
Allow: /blog/
Allow: /docs/
Disallow: /

Common configurations

# Allow training on public content only
User-agent: *
Allow: /blog/
Allow: /docs/
Allow: /help/
Disallow: /

# Block specific AI systems
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

# Allow with restrictions
User-agent: *
Allow: /
Disallow: /private/
Disallow: /admin/

Best practices

  • Be specific: Clearly define what content is available for training
  • Regular updates: Review and update permissions as your content evolves
  • Legal alignment: Ensure directives align with your terms of service
  • Documentation: Keep internal records of your AI training policies

Common issues

  • Missing file: No llms.txt file found at the root domain
  • Syntax errors: Incorrect formatting or invalid directives
  • Conflicting rules: Contradictory Allow/Disallow statements
  • Wrong location: File not placed at domain root (/llms.txt)
  • Unclear permissions: Vague or overly broad directives
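The conflicting-rules issue above can be caught with a simple lint that flags any prefix listed under both Allow and Disallow for the same user agent. This helper is hypothetical, shown only to make the issue concrete:

```python
def find_conflicts(rules):
    """rules: list of ("allow" | "disallow", prefix) pairs for one user agent.
    Return prefixes that appear with contradictory directives
    (a hypothetical lint, not this check's actual logic)."""
    seen = {}
    conflicts = []
    for kind, prefix in rules:
        prior = seen.setdefault(prefix, kind)
        if prior != kind and prefix not in conflicts:
            conflicts.append(prefix)
    return conflicts
```

For example, a file containing both `Allow: /blog/` and `Disallow: /blog/` would be flagged, while `Allow: /` alongside `Disallow: /private/` is a legitimate exception, not a conflict.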