Securing AI Systems: Hardening Your LLM Infrastructure
You've secured your web app, your APIs, your cloud infrastructure. Now you're running LLMs in production. The attack surface is different, and most teams are not thinking about it carefully enough.
The AI attack surface
A production LLM deployment has several distinct attack surfaces:
- The model itself — adversarial inputs, extraction attacks
- The system prompt — exfiltration, injection via user input
- The tool/function layer — privilege escalation via the agent's capabilities
- The retrieval layer — poisoned documents in RAG, indirect injection
- The output layer — generated malicious content, data exfiltration via responses
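To make the tool/function layer concrete: one way to limit privilege escalation is to check every tool call the model requests against an allowlist tied to the caller's role, before anything executes. The sketch below is a minimal illustration, not a full authorization system. The role names, tool names, `ToolCall`, `ToolNotPermitted`, and `authorize_tool_call` are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical role-to-tool mapping; roles and tool names are illustrative only.
ALLOWED_TOOLS = {
    "anonymous": {"search_docs"},
    "support_agent": {"search_docs", "lookup_order"},
}


class ToolNotPermitted(Exception):
    """Raised when the model requests a tool the caller's role may not use."""


@dataclass
class ToolCall:
    name: str
    arguments: dict


def authorize_tool_call(role: str, call: ToolCall) -> ToolCall:
    """Return the call unchanged if the role may use the tool, else raise."""
    allowed = ALLOWED_TOOLS.get(role, set())  # unknown roles get no tools
    if call.name not in allowed:
        raise ToolNotPermitted(f"role {role!r} may not call {call.name!r}")
    return call
```

The check keys on the authenticated caller's role rather than on anything the model says about itself, so injected text cannot grant the agent tools the user was never entitled to.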
Hardening the deployment
    # docker-compose for a hardened LLM service
    services:
      llm-api:
        image: your-llm-service:latest
        environment:
          # Never expose raw model access
          MODEL_ENDPOINT: "internal-only"
          # Rate limiting (limits the service itself must enforce)
          MAX_TOKENS_PER_MINUTE: "100000"
          MAX_REQUESTS_PER_USER: "100"
        networks:
          - llm-internal # isolated network
        read_only: true
        cap_drop:
          - ALL
        security_opt:
          - no-new-privileges:true

    networks:
      llm-internal:
        internal: true # no external connectivity
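Environment variables such as `MAX_REQUESTS_PER_USER` only carry configuration; the service still has to enforce them. A minimal sketch of per-user enforcement with a sliding window follows. The 60-second window is an assumption, since the config above does not say what period the limit covers, and `SlidingWindowLimiter` is a hypothetical name rather than part of any library.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most `max_requests` per user within `window_seconds`."""

    def __init__(self, max_requests: int = 100, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds  # assumed window, not from the config
        self._hits = defaultdict(deque)  # user_id -> timestamps of recent requests

    def allow(self, user_id: str, now: Optional[float] = None) -> bool:
        """Record the request and return True, or return False if over the limit."""
        now = time.monotonic() if now is None else now
        hits = self._hits[user_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

This version keeps state in process memory, so it only works for a single instance. A deployment with several replicas would need a shared store for the counters.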
System prompt security
Your system prompt is a secret — treat it like one:
- Never include credentials, internal hostnames, or sensitive business logic
- Test whether your prompt can be extracted via user queries
- Use a content security layer that detects and rejects prompt-extraction attempts
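    import re

    # Basic prompt injection filter. Pattern matching only catches known
    # phrasings, so treat it as one layer rather than a complete defense.
    injection_patterns = [
        r"ignore.{0,20}(previous|above|prior).{0,20}instructions",
        r"(reveal|show|print|output).{0,20}(system|original).{0,20}prompt",
        r"you are now",
        r"new persona",
    ]

    class SecurityException(Exception):
        """Raised when input matches a known injection pattern."""

    def sanitize_input(text: str) -> str:
        """Return text unchanged, or raise if it matches an injection pattern."""
        for pattern in injection_patterns:
            if re.search(pattern, text, re.IGNORECASE):
                raise SecurityException("Prompt injection attempt detected")
        return text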
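To test whether the prompt can be extracted, one option is to plant a random canary string in the system prompt and scan model responses for it. The sketch below shows only the bookkeeping. How you send probe queries to your model is left out, and `make_canary`, `build_system_prompt`, and `leaked_probes` are hypothetical helper names.

```python
import secrets
from typing import Dict, List


def make_canary() -> str:
    """Generate a random marker unlikely to appear in normal output."""
    return f"canary-{secrets.token_hex(8)}"


def build_system_prompt(base_prompt: str, canary: str) -> str:
    """Append the canary; it means nothing to the model and only makes leaks detectable."""
    return f"{base_prompt}\n[internal marker: {canary}]"


def leaked_probes(canary: str, responses: Dict[str, str]) -> List[str]:
    """Return the probe queries whose responses contain the canary verbatim."""
    return [probe for probe, reply in responses.items() if canary in reply]
```

A verbatim match only catches direct leaks. A model that paraphrases or translates its prompt would slip past this check, so it complements manual red-teaming rather than replacing it.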
Logging and monitoring for AI systems
Log everything at every layer:
- Input token counts and semantic categories (omit the full text when it contains PII)
- Tool calls made by the agent and their parameters
- Output classifications (did the model refuse? produce unusual output?)
- Latency anomalies (very long responses can indicate extraction attempts)
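The fields above might be captured in one structured record per request, roughly like this. It is a sketch under assumptions: the field names are hypothetical, and `contains_pii` stands in for whatever PII detection you already run upstream.

```python
import hashlib
import json
import time


def build_log_record(user_id, input_text, input_tokens, tool_calls,
                     refused, latency_ms, contains_pii):
    """Return one JSON log line; raw input is kept only when it has no PII."""
    record = {
        "ts": time.time(),
        "user_id": user_id,
        "input_tokens": input_tokens,
        # The hash lets you correlate repeated inputs without storing them.
        "input_sha256": hashlib.sha256(input_text.encode("utf-8")).hexdigest(),
        "tool_calls": tool_calls,  # e.g. [{"name": ..., "arguments": {...}}]
        "refused": refused,
        "latency_ms": latency_ms,
    }
    if not contains_pii:
        record["input_text"] = input_text
    return json.dumps(record, sort_keys=True)
```

Note that a hash is for correlation, not anonymization: short or predictable inputs can be guessed and re-hashed. Tool call arguments can also carry PII and may need the same treatment.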
The supply chain risk
Model weights from third parties are the new dependencies. Treat them like you would an npm package:
- Verify checksums before deployment
- Run models in isolated environments before production
- Monitor model behavior for drift (fine-tuned backdoors are real)
- Prefer providers that publish model cards and maintain audit trails
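Checksum verification is the easiest of these to automate. The sketch below hashes a weights file in chunks and compares it to a digest obtained from the provider. It assumes the provider publishes a SHA-256 digest through a channel you trust, and `sha256_of_file` and `verify_weights` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large weight files never load fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_weights(path: Path, expected_sha256: str) -> None:
    """Raise ValueError if the file's digest does not match the expected one."""
    actual = sha256_of_file(path)
    if actual != expected_sha256.strip().lower():
        raise ValueError(f"checksum mismatch for {path}: got {actual}")
```

A matching checksum shows the file is the one the provider published. It says nothing about whether the published weights themselves are trustworthy, which is why the isolation and monitoring steps still matter.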