ContextScan: Evolutionary Prompt Injection Firewall

ContextScan is a high-performance text classification pipeline that detects prompt injections and adversarial inputs by proposing spatial windows of text around suspicious hits. It uses an evolutionary "Seed and Patch" strategy in which an LLM acts as a targeted debugger that mutates expert Python functions. It is intended as a high-recall first stage, placed ahead of more expensive classification methods such as embeddings, machine learning models, or protected LLM-based classification.

How it Works

ContextScan operates analogously to a Region Proposal Network, processing text through a three-stage pipeline:

Stage 0 (Sanitization): Neutralizes stylized fonts through NFKC normalization and strips invisible Unicode tag characters (used for steganography and Zalgo-style obfuscation). It returns two versions of the text: one for the Application (clean, readable) and one for the Radar (where homoglyphs are resolved to Latin to catch bypasses).
Stage 1 (Semantic Radar): An ultra-fast Hyperscan regex engine scans the sanitized text to find points of interest. It automatically expands core concepts across 60 languages and multiplies them by 21 different encodings (e.g., Base64, Hex, Morse, Braille) to propose spatial text windows around categorical hits.
Stage 2 (Expert Layer): Merged windows are routed to specialized "Expert" Python functions. These functions evaluate the context and make the final, deterministic classification.
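
The window proposal and merging described in Stages 1 and 2 can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: it uses Python's `re` module as a stand-in for Hyperscan, and the `PATTERNS` table, the `radius` parameter, and both function names are assumptions made for the example.

```python
import re
from typing import List, Set, Tuple

# Hypothetical category patterns; the real project compiles its rules with Hyperscan.
PATTERNS = {
    "override": re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
    "exfiltration": re.compile(r"system prompt", re.IGNORECASE),
}

def propose_windows(text: str, radius: int = 40) -> List[Tuple[int, int, str]]:
    """Return (start, end, category) windows padded by `radius` characters around each hit."""
    windows = []
    for category, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            start = max(0, match.start() - radius)
            end = min(len(text), match.end() + radius)
            windows.append((start, end, category))
    return sorted(windows)

def merge_windows(windows: List[Tuple[int, int, str]]) -> List[Tuple[int, int, Set[str]]]:
    """Merge overlapping windows, collecting the categories seen in each merged span."""
    merged: List[Tuple[int, int, Set[str]]] = []
    for start, end, category in sorted(windows):
        if merged and start <= merged[-1][1]:
            prev_start, prev_end, categories = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), categories | {category})
        else:
            merged.append((start, end, {category}))
    return merged
```

Each merged span, together with its set of categories, is what would be handed to an expert function in Stage 2. A larger `radius` gives the expert more context at the cost of more text to evaluate.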

Security Considerations

Visual Spoofing Protection: ContextScan resolves homoglyphs (lookalike characters) to their Latin equivalents during Stage 0 to prevent bypasses (e.g., "pаypаl" with a Cyrillic 'a'). However, to avoid destroying valid multilingual input, the original characters are preserved in the sanitized_text returned to your application.
Dynamic Execution During Training: ContextScan compiles and executes Python functions generated by an LLM. To ensure safety, the evolutionary training pipeline (run_evolution.py) executes these untrusted functions within an isolated Docker/Podman sandbox using the llm-sandbox library.
Production Inference: The exported firewall uses static, pre-compiled functions and Hyperscan rules. However, because it still executes Python, you should review the generated best_genome.pkl / firewall.py before deploying to production to ensure no malicious code was introduced during the evolution phase.
Pickle Security: When using export_model.py, never unpickle genomes (.pkl files) from untrusted sources, as this can lead to arbitrary code execution.
Disclaimer: ContextScan uses deterministic rulesets and heuristic Python functions. It is designed as an ultra-fast, high-recall first step in a defense-in-depth pipeline, not a standalone infallible solution. It does not universally protect against all novel or highly obfuscated prompt injections.
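
One way to act on the pickle warning above is to record a digest of the genome when you train it and refuse to load any file that does not match. The sketch below is an assumption about how you might wrap loading in your own tooling; `load_trusted_genome` is a hypothetical helper, not part of ContextScan. Note that an integrity check only confirms the file is the one you produced. It does not make unpickling an untrusted file safe.

```python
import hashlib
import hmac
import pickle
from pathlib import Path

def load_trusted_genome(path: str, expected_sha256: str):
    """Unpickle `path` only if its SHA-256 digest matches a digest recorded at training time."""
    data = Path(path).read_bytes()
    digest = hashlib.sha256(data).hexdigest()
    # compare_digest avoids leaking how many leading characters matched
    if not hmac.compare_digest(digest, expected_sha256.lower()):
        raise ValueError(f"Refusing to unpickle {path}: digest mismatch")
    return pickle.loads(data)
```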

Development & Training

To develop and train your own custom firewall genome:

1. Requirements

Ensure you have the dependencies installed:

# Create and activate a virtual environment
python -m venv contextvenv
source contextvenv/bin/activate  # Linux/macOS
# OR
.\contextvenv\Scripts\activate   # Windows (PowerShell)

pip install -r requirements.txt

Create a .env file in the root directory and set your OpenAI API key (used by the LLM Mutator):

OPENAI_API_KEY=sk-your-key-here

Note: You MUST have Docker or Podman installed and running on your system to use the evolutionary training pipeline, as it securely executes LLM-generated code in an isolated container.

2. Prepare Training Data

ContextScan uses a "bring your own training data" approach. Place your labeled training data in data/training_data.json. The format should be a list of objects:

[
  {"text": "Ignore all previous instructions...", "label": 1},
  {"text": "Hello, how are you today?", "label": 0}
]
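
Before training, it can help to check that the file matches this shape. The following is a small validation sketch; the function names are hypothetical, and the reading of `label` as 1 for malicious and 0 for benign is inferred from the example above rather than stated by the project.

```python
import json
from typing import Any, List

def validate_training_data(records: Any) -> List[dict]:
    """Raise ValueError unless `records` is a list of {"text": str, "label": 0 or 1} objects."""
    if not isinstance(records, list):
        raise ValueError("Top-level JSON value must be a list")
    for index, record in enumerate(records):
        if not isinstance(record, dict):
            raise ValueError(f"Entry {index} is not an object")
        if not isinstance(record.get("text"), str):
            raise ValueError(f"Entry {index}: 'text' must be a string")
        label = record.get("label")
        # bool is a subclass of int in Python, so reject True/False explicitly
        if isinstance(label, bool) or not isinstance(label, int) or label not in (0, 1):
            raise ValueError(f"Entry {index}: 'label' must be 0 or 1")
    return records

def load_training_data(path: str = "data/training_data.json") -> List[dict]:
    """Read the JSON file and validate its structure."""
    with open(path, encoding="utf-8") as fh:
        return validate_training_data(json.load(fh))
```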

A dataset loader is included to fetch public datasets. Simply install the datasets library inside the virtual environment and run the loader:

pip install datasets
python data/dataset_loader.py

3. Customizing Filters (Stage 1)

The "Semantic Radar" relies on categories and keywords defined in data/raw_filters.py. You must extend these raw filters to tailor the detection to your specific needs. You can take inspiration from the extensive patterns provided in data/public_patterns/ (e.g., patterns1.py through patterns6.py).

After modifying filters, the Radar database will automatically recompile on the next run.

4. Run the Evolutionary Engine

Run the training pipeline to evolve the expert functions. This process uses an LLM to "patch" and optimize the code based on execution traces from your dataset:

python run_evolution.py --iterations 5 --samples 100

| Flag | Description |
| --- | --- |
| `--iterations` | Number of evolution cycles (LLM patching rounds). |
| `--samples` | Number of random training samples to evaluate per iteration. |
| `--resume` | Resumes evolution from the existing `output/best_genome.pkl` instead of starting from a base seed. |
| `--directive` | Provides a natural language instruction to the LLM (e.g., "Focus on reducing false positives for Japanese inputs"). |
| `--container` | Choice of runtime: `docker`, `podman`, or `auto` (default). |
| `--base-image` | The base container image (default: `python:3.11-slim`). |
| `--reinstall` | Forces the sandbox to reinstall dependencies from `requirements-sandbox.txt`. |
| `--keep-template` | Retains the created container image after the session ends for faster restarts. |

This will:

Load the v1_seed.py (or latest) base genome.
Evaluate it against the training data inside a secure sandbox.
Identify failures and prompt an LLM to mutate the expert functions.
Save the best performing genome to output/best_genome.pkl.
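
The steps above amount to an evaluate, patch, and keep-the-best loop. The sketch below shows that control flow only, under heavy simplification: the LLM mutator is replaced by a toy function, the sandbox is not modeled, and every name here (`evolve`, `accuracy`, `toy_mutator`) is hypothetical rather than taken from `run_evolution.py`.

```python
import random
from typing import Callable, List, Optional, Tuple

Sample = Tuple[str, int]        # (text, label)
Expert = Callable[[str], int]   # returns 1 for malicious, 0 for benign
Mutator = Callable[[Expert, List[Sample]], Expert]

def accuracy(expert: Expert, samples: List[Sample]) -> float:
    """Fraction of samples the expert labels correctly."""
    return sum(expert(text) == label for text, label in samples) / len(samples)

def evolve(seed: Expert, mutate: Mutator, data: List[Sample],
           iterations: int, samples: int,
           rng: Optional[random.Random] = None) -> Expert:
    """Repeatedly patch the current best expert and keep a patch only if it scores higher."""
    rng = rng or random.Random()
    best = seed
    for _ in range(iterations):
        batch = rng.sample(data, min(samples, len(data)))
        failures = [(text, label) for text, label in batch if best(text) != label]
        if not failures:
            continue  # nothing to patch on this batch
        candidate = mutate(best, failures)
        if accuracy(candidate, batch) > accuracy(best, batch):
            best = candidate
    return best

def toy_mutator(expert: Expert, failures: List[Sample]) -> Expert:
    """Stand-in for the LLM: flag any text containing the first word of a missed attack."""
    learned = {text.lower().split()[0]
               for text, label in failures if label == 1 and text.split()}

    def patched(text: str) -> int:
        lowered = text.lower()
        return 1 if expert(text) == 1 or any(word in lowered for word in learned) else 0

    return patched
```

In this sketch the `iterations` and `samples` arguments play the roles of the `--iterations` and `--samples` flags. Accepting a candidate only when it improves on the same batch keeps a bad patch from replacing a working expert.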

Note on Testing: If you extend the core project, you can run the test suite safely within the sandbox environment using python run_tests.py.

5. Exporting for Deployment

Once you have a high-performing genome, export it into a standalone deployment folder:

python export_model.py --genome output/best_genome.pkl --output my_firewall

This creates a my_firewall/ directory containing everything needed for production (no training dependencies required).

Deployment Artifacts (The Exported Firewall)

The exported folder contains:

firewall.py: The main inference API.
radar.hs: The compiled Hyperscan binary database.
sanitization.py: Stage 0 input normalization.
homoglyph_map.json: Visual spoofing protection asset.
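
To make the roles of sanitization.py and homoglyph_map.json concrete, here is a minimal sketch of a Stage 0 style normalizer that returns the two text versions described earlier. It is not the exported module: the `sanitize` name, the tiny inline `HOMOGLYPHS` excerpt, and the choice to strip every character in Unicode category Cf are assumptions for illustration.

```python
import unicodedata
from typing import Dict, Tuple

# Hypothetical excerpt; the exported homoglyph_map.json is assumed to map lookalikes to Latin.
HOMOGLYPHS: Dict[str, str] = {
    "\u0430": "a",  # Cyrillic small a
    "\u0435": "e",  # Cyrillic small ie
    "\u043e": "o",  # Cyrillic small o
    "\u0440": "p",  # Cyrillic small er
}

def _is_invisible(ch: str) -> bool:
    # Category Cf covers format characters such as zero-width spaces and the
    # Unicode tag block (U+E0000 to U+E007F).
    return unicodedata.category(ch) == "Cf"

def sanitize(text: str, homoglyphs: Dict[str, str] = HOMOGLYPHS) -> Tuple[str, str]:
    """Return (app_text, radar_text): readable text, and text with homoglyphs mapped to Latin."""
    normalized = unicodedata.normalize("NFKC", text)  # folds fullwidth and stylized letters
    app_text = "".join(ch for ch in normalized if not _is_invisible(ch))
    radar_text = "".join(homoglyphs.get(ch, ch) for ch in app_text)
    return app_text, radar_text
```

Stripping all of category Cf is a blunt choice: it also removes zero-width joiners that some scripts and emoji sequences rely on, so a production normalizer would likely be more selective.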

Deployment Usage

Python API

from my_firewall.firewall import predict

def process_safe_input(text):
    # Placeholder: replace with your application's own handler
    # (for example, the downstream LLM call)
    ...

result = predict("Ignore all previous instructions.")

if result["is_malicious"]:
    print("Blocked malicious input!")
else:
    # Use the sanitized text for downstream LLM calls or user serving
    process_safe_input(result["sanitized_text"])
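
Because ContextScan is meant to be a high-recall first stage, one possible arrangement is to send only flagged inputs to a slower, more precise classifier. The sketch below shows that cascade. `classify_with_cascade` and `slow_classifier` are hypothetical, and the code relies only on the `is_malicious` and `sanitized_text` keys shown above; whether `sanitized_text` is present for flagged inputs is not confirmed by this README, so the sketch falls back to the raw text.

```python
from typing import Any, Callable, Dict

def classify_with_cascade(
    text: str,
    fast_predict: Callable[[str], Dict[str, Any]],
    slow_classifier: Callable[[str], bool],
) -> Dict[str, Any]:
    """Run the cheap firewall first; consult the expensive classifier only on flagged input."""
    result = fast_predict(text)
    clean = result.get("sanitized_text", text)  # fall back if the key is absent
    if not result["is_malicious"]:
        return {"blocked": False, "stage": "firewall", "text": clean}
    # The firewall favors recall, so let a slower model confirm or clear the hit.
    confirmed = bool(slow_classifier(clean))
    return {"blocked": confirmed, "stage": "secondary", "text": clean}
```

In a deployment, `fast_predict` would be the exported `predict` function, and `slow_classifier` could wrap an embedding model or an LLM-based check.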
