Securing AI Systems: Hardening Your LLM Infrastructure

Amit Nepal
Security Engineer · Linux & Infrastructure · Offensive Security
Jun 1, 2026 · 1 min read
AI & Agents

You've secured your web app, your APIs, and your cloud infrastructure. Now you're running LLMs in production. The attack surface is different, and most teams are not thinking about it carefully enough.

The AI attack surface

A production LLM deployment has several distinct attack surfaces:

  1. The model itself — adversarial inputs, extraction attacks
  2. The system prompt — exfiltration, injection via user input
  3. The tool/function layer — privilege escalation via the agent's capabilities
  4. The retrieval layer — poisoned documents in RAG, indirect injection
  5. The output layer — generated malicious content, data exfiltration via responses
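
To make the tool/function layer (item 3) concrete, here is a minimal sketch of a deny-by-default allowlist that is checked before the agent's tool call is executed. The role names and tool names are hypothetical, chosen only for illustration; a real deployment would load them from its own authorization system.

```python
# Hypothetical roles and tool names, for illustration only.
ALLOWED_TOOLS = {
    "viewer": {"search_docs"},
    "operator": {"search_docs", "create_ticket"},
}


def is_tool_allowed(role: str, tool_name: str) -> bool:
    """Return True only if this role is explicitly allowed to call the tool."""
    # Deny by default: an unknown role gets an empty permission set.
    return tool_name in ALLOWED_TOOLS.get(role, set())


def dispatch_tool_call(role: str, tool_name: str, handlers: dict, **params):
    """Run a tool requested by the model, but only after the permission check."""
    if not is_tool_allowed(role, tool_name):
        raise PermissionError(f"role {role!r} may not call {tool_name!r}")
    return handlers[tool_name](**params)
```

The point of the design is that the permission check runs on the caller's identity, not on anything the model says about itself, so injected text cannot widen the set of tools it can reach.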

Hardening the deployment

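# docker-compose.yml for a hardened LLM service (illustrative values)
services:
  llm-api:
    image: your-llm-service:latest  # placeholder; pin a tag or digest in production
    environment:
      # Never expose raw model access
      MODEL_ENDPOINT: "internal-only"
      # Rate limiting (application-level settings; the service must enforce them)
      MAX_TOKENS_PER_MINUTE: "100000"
      MAX_REQUESTS_PER_USER: "100"
    networks:
      - llm-internal  # isolated network
    read_only: true  # read-only root filesystem
    cap_drop:
      - ALL
    security_opt:
      - no-new-privileges:true

# Define the network the service joins; internal means no external connectivity
networks:
  llm-internal:
    internal: true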
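
Docker does not enforce settings like MAX_REQUESTS_PER_USER; the service has to. As a rough sketch of what that enforcement could look like, here is a per-user sliding-window limiter. The class name and the 60-second window are assumptions (the config above does not say what period the limit covers), and a multi-instance deployment would need shared state such as a cache or database rather than process memory.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most max_requests per user within a rolling time window."""

    def __init__(self, max_requests: int, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # user_id -> timestamps of accepted requests

    def allow(self, user_id: str, now: Optional[float] = None) -> bool:
        # `now` is injectable so the limiter can be tested without sleeping.
        now = time.monotonic() if now is None else now
        hits = self._hits[user_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A request handler would call `limiter.allow(user_id)` before forwarding to the model and return an HTTP 429 when it comes back False.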

System prompt security

Your system prompt is a secret — treat it like one:

  • Never include credentials, internal hostnames, or sensitive business logic
  • Test whether your prompt can be extracted via user queries
  • Use a content security layer that detects and blocks prompt-extraction attempts
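
# Basic prompt injection filter. Pattern matching catches only crude
# attempts, so treat it as one layer, not a complete defense.
import re

injection_patterns = [
    r"ignore.{0,20}(previous|above|prior).{0,20}instructions",
    r"(reveal|show|print|output).{0,20}(system|original).{0,20}prompt",
    r"you are now",
    r"new persona",
]

class SecurityException(Exception):
    """Raised when input matches a known injection pattern."""

def sanitize_input(text: str) -> str:
    # Reject the input outright rather than trying to strip the match.
    for pattern in injection_patterns:
        if re.search(pattern, text, re.IGNORECASE):
            raise SecurityException("Prompt injection attempt detected")
    return text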
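
Input filtering covers only half of the problem. To test whether the prompt can actually be extracted, one option is a canary: a random marker embedded in the system prompt and checked for in every response. The sketch below assumes that approach; the function names and the marker format are made up for illustration, and it only catches verbatim leaks, not paraphrased or encoded ones.

```python
import secrets


def make_canary() -> str:
    # Random marker that should never appear in legitimate output.
    return f"CANARY-{secrets.token_hex(8)}"


def build_system_prompt(base_prompt: str, canary: str) -> str:
    # Append the marker so any verbatim dump of the prompt carries it.
    return f"{base_prompt}\n[internal marker: {canary}]"


def leaks_prompt(model_output: str, canary: str) -> bool:
    # Case-insensitive match, in case the model changes the casing.
    return canary.lower() in model_output.lower()
```

Run your extraction test queries against the deployed prompt and flag any response where `leaks_prompt` returns True. The same check can run in production as an output filter.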

Logging and monitoring for AI systems

Log everything at every layer:

  • Input token counts and semantic categories (without storing the full text when it may contain PII)
  • Tool calls made by the agent and their parameters
  • Output classifications (did the model refuse? produce unusual output?)
  • Latency anomalies (very long responses can indicate extraction attempts)
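
As a sketch of what one such log record might look like, the function below emits a JSON line that stores a keyed hash of the input instead of the text itself. The field names are assumptions, not a standard schema, and the character count stands in for a token count because tokenizers are model-specific. Tool-call parameters can carry PII too, so they may need the same treatment.

```python
import hashlib
import hmac
import json
import time
from typing import Any


def build_log_record(*, user_id: str, prompt_text: str, category: str,
                     tool_calls: list[dict[str, Any]], refused: bool,
                     latency_ms: float, hmac_key: bytes) -> str:
    """Return one JSON log line that never contains the raw prompt text."""
    # A keyed HMAC lets you correlate repeated inputs without storing them,
    # and resists guessing short inputs better than a plain hash.
    fingerprint = hmac.new(hmac_key, prompt_text.encode("utf-8"),
                           hashlib.sha256).hexdigest()
    record = {
        "ts": time.time(),
        "user_id": user_id,
        "input_hmac": fingerprint,
        "input_chars": len(prompt_text),  # stand-in for a token count
        "category": category,
        "tool_calls": tool_calls,
        "refused": refused,
        "latency_ms": latency_ms,
    }
    return json.dumps(record, sort_keys=True)
```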

The supply chain risk

Model weights from third parties are the new dependencies. Treat them like you would an npm package:

  • Verify checksums before deployment
  • Run models in isolated environments before production
  • Monitor model behavior for drift (fine-tuned backdoors are real)
  • Prefer providers with reproducible model cards and audit trails
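
The checksum step is easy to automate. Here is a minimal sketch that hashes a weights file in chunks and compares it to the digest the provider published. The function names are illustrative, and it assumes the expected SHA-256 digest reaches you over a channel you trust more than the download itself.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte weights never sit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_weights(path: Path, expected_sha256: str) -> bool:
    """Return True only if the file's SHA-256 matches the published digest."""
    expected = expected_sha256.strip().lower()
    return hmac.compare_digest(sha256_file(path), expected)
```

A deployment script would call `verify_weights` before loading the model and abort on False. A matching checksum shows only that the file is the one the provider published, not that the model is free of backdoors.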