88448844/overthewire-ai-agent

Autonomous CTF Solver

This project is an autonomous agent that solves Capture The Flag (CTF) challenges, specifically tailored for the OverTheWire Bandit wargame.

Architecture

The agent follows a simple Observe-Think-Act loop, with a validation check after each action:

  1. Observe: It captures the output from an SSH shell.
  2. Think: It uses a large language model (LLM) to decide the next command based on the current goal and the last output.
  3. Act: It executes the command on the remote machine.
  4. Validate: It checks the output for a flag pattern.
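The four steps above can be sketched as a small loop. This is a minimal illustration, not the project's actual code: `solve_level`, `find_flag`, `run_command`, and `choose_command` are hypothetical names, and the flag pattern assumes Bandit passwords are 32-character alphanumeric strings.

```python
import re
from typing import Callable, Optional

# Assumption: a Bandit password is a standalone 32-character alphanumeric token.
FLAG_PATTERN = re.compile(r"\b[A-Za-z0-9]{32}\b")


def find_flag(output: str) -> Optional[str]:
    """Validate: return the first flag-like token in the output, if any."""
    match = FLAG_PATTERN.search(output)
    return match.group(0) if match else None


def solve_level(
    goal: str,
    run_command: Callable[[str], str],          # hypothetical SSH wrapper
    choose_command: Callable[[str, str], str],  # hypothetical LLM wrapper
    max_steps: int = 10,
) -> Optional[str]:
    """Run the loop until a flag is found or the step budget is spent."""
    last_output = ""
    for _ in range(max_steps):
        command = choose_command(goal, last_output)  # Think
        last_output = run_command(command)           # Act, then Observe
        flag = find_flag(last_output)                # Validate
        if flag is not None:
            return flag
    return None


# Stand-ins for the LLM and the SSH shell, for demonstration only.
def fake_llm(goal: str, last_output: str) -> str:
    return "cat readme" if "readme" in last_output else "ls"


def fake_shell(command: str) -> str:
    outputs = {
        "ls": "readme\n",
        "cat readme": "The password is " + "a1B2" * 8 + "\n",
    }
    return outputs.get(command, "")


if __name__ == "__main__":
    print(solve_level("Read the password in the home directory", fake_shell, fake_llm))
```

Passing the shell and the model in as plain callables keeps the loop testable without a network connection or an API key. The `max_steps` budget is one way to stop the agent from looping forever on a level it cannot solve.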

How To Run

  1. Dependencies: Install the required Python packages from requirements.txt.

    pip install -r requirements.txt
  2. API Key: Set your Gemini API key as an environment variable. The agent will use placeholder logic if this is not set.

    # On Windows (Command Prompt; omit the quotes, since cmd stores them as part of the value)
    set GEMINI_API_KEY=YOUR_API_KEY
    
    # On Linux/macOS
    export GEMINI_API_KEY="YOUR_API_KEY"
  3. Run the Agent: Start the solver from the project's root directory (CTF).

    python -m ctf_solver.main
  4. Like the repo :)

The agent will log its progress and save any found flags to ctf_solver/state.json.
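As a rough sketch of how found flags might be persisted, the helper below merges one flag into a JSON file. The layout of the real `state.json` is not documented here, so the `{"flags": {level: flag}}` shape and the name `save_flag` are assumptions for illustration.

```python
import json
from pathlib import Path


def save_flag(state_path: Path, level: str, flag: str) -> dict:
    """Merge one flag into the JSON state file and return the new state.

    Assumed layout (not confirmed by the project): {"flags": {level: flag}}.
    """
    state = {}
    if state_path.exists():
        state = json.loads(state_path.read_text(encoding="utf-8"))
    state.setdefault("flags", {})[level] = flag

    # Write to a temporary file first, then swap it into place, so an
    # interrupted run does not leave a half-written state file behind.
    state_path.parent.mkdir(parents=True, exist_ok=True)
    tmp_path = state_path.with_suffix(".tmp")
    tmp_path.write_text(json.dumps(state, indent=2), encoding="utf-8")
    tmp_path.replace(state_path)
    return state
```

Calling `save_flag(Path("ctf_solver/state.json"), "bandit0", flag)` after each solved level would let a later run see which levels are already done.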

Configuration

The CTF levels, credentials, and goals are defined in ctf_solver/levels.yaml. You can extend this file to add more levels.
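The schema of `levels.yaml` is not shown in this README. As a hedged sketch, one level entry could be validated like this after the YAML has been parsed into a dictionary. The field names and the `Level` and `parse_level` helpers are hypothetical; the host, port, and `bandit0` credentials are the public Bandit entry point.

```python
from dataclasses import dataclass

# Public entry point of the Bandit wargame.
DEFAULT_HOST = "bandit.labs.overthewire.org"
DEFAULT_PORT = 2220


@dataclass(frozen=True)
class Level:
    """One level entry; the field names are assumptions, not the real schema."""
    name: str
    username: str
    password: str
    goal: str
    host: str = DEFAULT_HOST
    port: int = DEFAULT_PORT


def parse_level(entry: dict) -> Level:
    """Validate a parsed entry and fill in connection defaults."""
    required = ("name", "username", "password", "goal")
    missing = [key for key in required if key not in entry]
    if missing:
        raise ValueError(f"level entry is missing: {', '.join(missing)}")
    return Level(
        name=str(entry["name"]),
        username=str(entry["username"]),
        password=str(entry["password"]),
        goal=str(entry["goal"]),
        host=str(entry.get("host", DEFAULT_HOST)),
        port=int(entry.get("port", DEFAULT_PORT)),
    )


if __name__ == "__main__":
    level0 = parse_level({
        "name": "bandit0",
        "username": "bandit0",
        "password": "bandit0",
        "goal": "Read the password stored in the file 'readme' in the home directory.",
    })
    print(level0.host, level0.port)
```

Failing fast on a missing field means a typo in a newly added level surfaces before the agent opens an SSH connection.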

Safety

The agent operates under a strict command allow list to prevent destructive or unsafe operations. See policies.py for the defined rules.
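To illustrate the idea of a command allow list, here is a minimal sketch. It is not the content of `policies.py`: the set of allowed commands, the forbidden tokens, and the function name `is_allowed` are all assumptions chosen for the example.

```python
import shlex

# Hypothetical allow list of read-only commands; the real rules may differ.
ALLOWED_COMMANDS = frozenset(
    {"ls", "cat", "file", "grep", "strings", "sort", "uniq", "base64"}
)

# Shell metacharacters that could chain or redirect into other commands.
FORBIDDEN_TOKENS = (";", "&", "|", ">", "<", "`", "$(", "\n")


def is_allowed(command: str) -> bool:
    """Return True only for a single, simple command whose name is allowed."""
    if any(token in command for token in FORBIDDEN_TOKENS):
        return False
    try:
        parts = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar parse errors are rejected outright.
        return False
    return bool(parts) and parts[0] in ALLOWED_COMMANDS


if __name__ == "__main__":
    for candidate in ("cat readme", "rm -rf /", "cat readme; rm readme"):
        print(candidate, "->", is_allowed(candidate))
```

Rejecting every metacharacter is deliberately conservative. It also blocks harmless pipelines such as `sort data.txt | uniq -u`, so a real policy might instead parse the pipeline and check each stage against the allow list.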

