EyeWitness Developer Reference

Overview

EyeWitness is a cybersecurity reconnaissance tool designed for penetration testers and security professionals. It automates the process of taking screenshots of web applications, collecting server headers, identifying default credentials, and generating comprehensive HTML reports with visual evidence.

Primary Purpose: Authorized security assessments, red team operations, and network reconnaissance.

Architecture Overview

Platform Strategy

Python Implementation: Cross-platform support for Linux/Unix (Kali, Debian, CentOS, Rocky Linux), Windows, and macOS
Docker Support: Containerized deployment eliminating dependency management

Docker Architecture

The Docker implementation provides a fully isolated environment with all dependencies pre-installed:

Base Image: Python 3.11-slim-bookworm (Debian-based for stability)
Display Server: Xvfb (X Virtual Framebuffer) for headless screenshot capture
Browser: Chromium with ChromeDriver for Selenium automation
Isolation: Non-root user execution with proper permission handling
Volume Mounts: Input files and output directory mapping

Container Components:

Container Environment:
├── Python 3.11 Runtime
├── Chromium Browser
├── ChromeDriver (Selenium WebDriver)
├── Xvfb Display Server
├── EyeWitness Application
└── All Python Dependencies

Core Architecture Pattern

CLI Interface → Target Parser → Multi-threaded Capture → Database Storage → Report Generation

Project Structure

EyeWitness/
├── Python/                     # Primary Linux/Unix implementation
│   ├── EyeWitness.py           # Main entry point and CLI interface
│   ├── modules/                # Core functionality modules
│   │   ├── objects.py          # Data models (HTTPTableObject, UAObject)
│   │   ├── selenium_module.py  # Web automation and screenshot capture
│   │   ├── db_manager.py       # SQLite database operations
│   │   ├── reporting.py        # HTML report generation
│   │   ├── helpers.py          # Utility functions and XML parsing
│   │   ├── driver_manager.py   # WebDriver management and auto-download
│   │   └── platform_utils.py   # Cross-platform compatibility
├── setup/                      # Installation and dependencies
│   ├── setup.sh               # Linux/Unix installation script
│   ├── setup.ps1              # Windows PowerShell installation
│   └── requirements.txt       # Python dependencies
├── Dockerfile                  # Docker container definition
├── .dockerignore              # Docker build exclusions
├── DOCKER.md                  # Docker usage documentation
└── docker-compose.yml         # Docker Compose configuration (optional)

Core Modules and Functions

1. objects.py - Data Models

HTTPTableObject Class

Purpose: Core data structure representing a web target.

Key Attributes:

remote_system: Target URL
page_title: Extracted page title
headers: HTTP response headers
source_code: Page source HTML
error_state: Connection/load error status
screenshot_path: Path to captured screenshot

Notable Methods:

def CreateDataRow(self)           # Generate HTML table row for reports
def CreateReport(self, output_dir) # Create individual target report page
def PrintCVSRow(self)             # Export data as CSV row
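
The attribute list above can be pictured as a small record type. The following is only a sketch: `TargetRecord` and `csv_row` are hypothetical names, the defaults are assumptions, and the real HTTPTableObject class may be organized quite differently.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TargetRecord:
    """Hypothetical stand-in for HTTPTableObject; field names follow the list above."""
    remote_system: str
    page_title: Optional[str] = None
    headers: Dict[str, str] = field(default_factory=dict)
    source_code: Optional[str] = None
    error_state: Optional[str] = None
    screenshot_path: Optional[str] = None

    def csv_row(self) -> str:
        """Return a one-line CSV summary (illustrative column choice, not the real export)."""
        buffer = io.StringIO()
        csv.writer(buffer).writerow([
            self.remote_system,
            self.page_title or "",
            self.error_state or "",
            self.screenshot_path or "",
        ])
        return buffer.getvalue().strip()
```

Using the csv module rather than manual string joins keeps titles that contain commas or quotes from breaking the row.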

UAObject Class

Purpose: User-Agent testing variant that extends HTTPTableObject.

Key Features:

Tracks differences between baseline and UA-specific requests
Inherits all HTTPTableObject functionality
Used for user-agent enumeration testing

2. selenium_module.py - Web Automation Engine

create_driver(cli_options, user_agent=None)

Purpose: Initialize Chromium WebDriver with optimized headless configuration.

Parameters:

cli_options: Command-line arguments and settings
user_agent: Custom user-agent string (optional)

Returns: Configured Selenium Chrome WebDriver instance

Key Features:

Cross-platform Chromium/Chrome detection
Optimized headless operation with new headless mode
Custom user-agent configuration
SSL certificate error handling
Enhanced network error categorization
Memory and performance optimization
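
As a rough illustration of the headless configuration described above, the sketch below assembles a list of Chromium command-line switches. The function name `build_chrome_arguments` and the chosen window size are assumptions for illustration; the switches themselves are standard Chromium flags, but the exact set that create_driver passes is not documented here.

```python
from typing import List, Optional


def build_chrome_arguments(user_agent: Optional[str] = None,
                           width: int = 1920,
                           height: int = 1080) -> List[str]:
    """Build Chromium switches for headless screenshot capture (hypothetical helper)."""
    arguments = [
        "--headless=new",                         # new headless mode
        "--ignore-certificate-errors",            # tolerate self-signed or expired certificates
        "--disable-background-timer-throttling",  # keep timers running at full rate
        f"--window-size={width},{height}",        # fixed viewport for consistent screenshots
    ]
    if user_agent:
        # Override the default User-Agent header only when one is supplied.
        arguments.append(f"--user-agent={user_agent}")
    return arguments
```

Each string could then be handed to Selenium's `ChromeOptions.add_argument()` before the driver is created.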

capture_host(targets, selenium_driver, output_directory, cli_options)

Purpose: Core screenshot capture function.

Parameters:

targets: List of HTTPTableObject instances to process
selenium_driver: Configured WebDriver instance
output_directory: Directory for saving screenshots and data
cli_options: Configuration options

Workflow:

Navigate to target URL with timeout handling
Dismiss authentication prompts and alerts
Capture screenshot and save to file
Extract page source and headers
Handle SSL errors and connection issues
Store results in HTTPTableObject

3. db_manager.py - Data Persistence

Important Functions:

create_database_tables(db_conn, cli_options)

Purpose: Initialize SQLite database schema.

Tables Created:

http: Main targets table with HTTPTableObject data
ua: User-agent testing variants
opts: Scan configuration and options

db_store_screenshot(cli_options, http_object)

Purpose: Store HTTPTableObject in database with pickle serialization.

db_get_incomplete_targets(cli_options)

Purpose: Retrieve unfinished targets for resume functionality.

Returns: List of HTTPTableObject instances that need processing
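
To make the pickle-in-SQLite and resume ideas concrete, here is a minimal sketch. The single-table schema, the `complete` column, and the function names are assumptions made for illustration; the real db_manager.py signatures take cli_options and may store data differently.

```python
import pickle
import sqlite3


def create_tables(conn: sqlite3.Connection) -> None:
    """Create a simplified 'http' table (assumed schema, not the real one)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS http ("
        "id INTEGER PRIMARY KEY, "
        "object BLOB NOT NULL, "
        "complete INTEGER NOT NULL DEFAULT 0)"
    )


def store_target(conn: sqlite3.Connection, target, complete: bool = False) -> int:
    """Pickle a target object and insert it in one transaction."""
    with conn:  # commits on success, rolls back on exception
        cursor = conn.execute(
            "INSERT INTO http (object, complete) VALUES (?, ?)",
            (pickle.dumps(target), int(complete)),
        )
    return cursor.lastrowid


def get_incomplete_targets(conn: sqlite3.Connection) -> list:
    """Unpickle every row not yet marked complete, in insertion order."""
    rows = conn.execute("SELECT object FROM http WHERE complete = 0 ORDER BY id")
    return [pickle.loads(row[0]) for row in rows]
```

Because `pickle.loads` can execute arbitrary code, a database file should only be resumed if it comes from a trusted source.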

4. reporting.py - Report Generation

create_report(cli_options, report_objects)

Purpose: Generate main HTML report with categorized results.

Parameters:

cli_options: Configuration including output directory
report_objects: List of completed HTTPTableObject instances

Key Features:

25+ predefined service categories
Fuzzy matching for service grouping (70% similarity threshold)
Pagination for large result sets
Bootstrap-based responsive design
CSV export functionality
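
The 70% similarity grouping can be sketched as follows. This example deliberately uses `difflib.SequenceMatcher` from the standard library as a stand-in so that it runs without extra packages; the project itself lists rapidfuzz as a dependency, and its scoring function may rank titles differently. The function name `group_titles` and the first-member comparison strategy are assumptions.

```python
from difflib import SequenceMatcher
from typing import List


def group_titles(titles: List[str], threshold: float = 0.70) -> List[List[str]]:
    """Greedily cluster page titles whose similarity ratio meets the threshold."""
    groups: List[List[str]] = []
    for title in titles:
        for group in groups:
            # Compare against the first member, which acts as the group's representative.
            ratio = SequenceMatcher(None, title.lower(), group[0].lower()).ratio()
            if ratio >= threshold:
                group.append(title)
                break
        else:
            # No existing group was similar enough, so start a new one.
            groups.append([title])
    return groups
```

Comparing only against the first member keeps the cost low, at the price of results that depend on input order.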

create_table_string(objects, table_head)

Purpose: Generate HTML table for report sections.

Parameters:

objects: List of HTTPTableObject instances for table
table_head: HTML table header string

Returns: Complete HTML table string with navigation

5. helpers.py - Utility Functions

create_targets_from_file(file_name)

Purpose: Parse input files and create target lists.

Supported Formats:

Plain text files with URLs
Nmap XML output files
Nessus XML export files
Masscan XML results

Returns: List of target URLs extracted from input

parse_nmap_xml(nmap_file)

Purpose: Extract web services from Nmap scan results.

Logic:

Parse XML for host and port information
Identify HTTP/HTTPS services
Generate URLs with appropriate protocols
Handle custom ports and service detection
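
The logic above might look roughly like the sketch below, which reads Nmap XML from a string. The function name `urls_from_nmap_xml` is hypothetical, and the rules for choosing HTTPS (a service name starting with "https", or `tunnel="ssl"`) are assumptions about reasonable behavior rather than a description of parse_nmap_xml itself.

```python
import xml.etree.ElementTree as ET
from typing import List


def urls_from_nmap_xml(xml_text: str) -> List[str]:
    """Extract http/https URLs for open web ports from Nmap XML output."""
    urls: List[str] = []
    root = ET.fromstring(xml_text)
    for host in root.iter("host"):
        address = host.find("address")
        if address is None:
            continue
        addr = address.get("addr")
        for port in host.iter("port"):
            state = port.find("state")
            service = port.find("service")
            if state is None or state.get("state") != "open" or service is None:
                continue
            name = service.get("name", "")
            if "http" not in name:
                continue  # skip non-web services such as ssh
            secure = name.startswith("https") or service.get("tunnel") == "ssl"
            scheme = "https" if secure else "http"
            portid = port.get("portid")
            default_port = "443" if secure else "80"
            # Only append the port when it differs from the scheme's default.
            suffix = "" if portid == default_port else f":{portid}"
            urls.append(f"{scheme}://{addr}{suffix}")
    return urls
```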

default_creds_category(page_source, page_title)

Purpose: Identify applications with known default credentials.

Parameters:

page_source: HTML source code of target page
page_title: Extracted page title

Returns: Boolean indicating if default credentials detected

6. selenium_module.py - Enhanced Browser Management

find_chromedriver()

Purpose: Locate ChromeDriver executable in system paths.

Search Locations:

Standard system paths (/usr/bin, /usr/local/bin)
Snap package locations
System PATH environment
Common installation directories
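
A lookup along these lines could be sketched as below. The name `locate_chromedriver` and the specific candidate paths are assumptions for illustration; the real find_chromedriver() may search more locations or in a different order.

```python
import os
import shutil
from typing import Iterable, Optional

# Assumed fallback locations, checked only when PATH lookup fails.
CANDIDATE_PATHS = (
    "/usr/bin/chromedriver",
    "/usr/local/bin/chromedriver",
    "/snap/bin/chromium.chromedriver",
)


def locate_chromedriver(candidates: Iterable[str] = CANDIDATE_PATHS) -> Optional[str]:
    """Return the first usable chromedriver path, or None if none is found."""
    # Prefer whatever the PATH environment variable resolves to.
    found = shutil.which("chromedriver")
    if found:
        return found
    for path in candidates:
        # Accept only regular files that the current user can execute.
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None
```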

check_browsers_available()

Purpose: Verify Chromium/Chrome and ChromeDriver availability.

Returns: Dictionary with browser and driver status information

7. platform_utils.py - Cross-Platform Support

detect_platform()

Purpose: Determine operating system and architecture.

Returns: Dictionary with platform information
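
For orientation, a dictionary of this kind can be assembled entirely from the standard library. The key names and the function name `describe_platform` below are assumptions; the keys actually returned by detect_platform() are not listed in this reference.

```python
import platform
import sys
from typing import Dict, Union


def describe_platform() -> Dict[str, Union[str, bool]]:
    """Collect basic OS and architecture facts (hypothetical key names)."""
    return {
        "system": platform.system(),    # for example "Linux", "Windows", or "Darwin"
        "release": platform.release(),  # kernel or OS release string
        "machine": platform.machine(),  # for example "x86_64" or "arm64"
        "is_64bit": sys.maxsize > 2**32,
    }
```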

find_chromium_executable()

Purpose: Locate Chromium/Chrome installation across platforms.

Search Paths:

Windows: Program Files, Program Files (x86), AppData paths
Linux: /usr/bin, /usr/local/bin, snap installations
macOS: Applications folder and Homebrew paths

setup_virtual_display()

Purpose: Configure headless display for screenshot capture.

Returns: PyVirtualDisplay instance for Unix systems

Main Workflow (EyeWitness.py)

1. Command Line Interface

Entry Point: main() function

Key Arguments:

-f, --file: Input file with URLs
-x, --xml: Nmap/Nessus XML input
--single: Single URL mode
-d, --out: Output directory
--timeout: Page load timeout
--threads: Number of worker processes
--resume: Resume interrupted scan
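
The arguments listed above could be declared with argparse roughly as follows. This is a sketch, not the actual parser: the default values, the help strings, and the choice to have `--resume` accept a database path are all assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented options; defaults here are illustrative only."""
    parser = argparse.ArgumentParser(prog="EyeWitness.py")
    parser.add_argument("-f", "--file", help="input file with URLs")
    parser.add_argument("-x", "--xml", help="Nmap/Nessus XML input")
    parser.add_argument("--single", help="single URL to capture")
    parser.add_argument("-d", "--out", help="output directory")
    parser.add_argument("--timeout", type=int, default=7,
                        help="page load timeout in seconds (assumed default)")
    parser.add_argument("--threads", type=int, default=2,
                        help="number of worker processes (assumed default)")
    parser.add_argument("--resume", metavar="DB_PATH",
                        help="database of an interrupted scan (assumed to take a path)")
    return parser
```

For example, `build_parser().parse_args(["--single", "https://example.com"])` yields a namespace whose `single` attribute holds the URL and whose other inputs are None.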

2. Processing Pipeline

Phase 1: Input Processing

def create_targets(cli_options):
    # Parse input files (text, XML) to create target list
    # Return list of URLs for processing
    ...

Phase 2: Multi-threaded Capture

def capture_screenshots(targets, cli_options):
    # Create process pool for parallel execution
    # Each worker captures screenshots using Selenium
    # Store results in SQLite database for persistence
    ...

Phase 3: Report Generation

def generate_report(cli_options):
    # Load completed targets from database
    # Categorize and group similar services
    # Generate HTML reports with navigation
    ...

Data Flow

Input Sources (URLs/XML) 
    ↓
Target Parser (helpers.py)
    ↓
HTTPTableObject Creation (objects.py)
    ↓
Multi-threaded Screenshot Capture (selenium_module.py)
    ↓
Database Storage (db_manager.py)
    ↓
Report Generation (reporting.py)
    ↓
HTML Reports + Screenshots

Key Configuration Options

Database Configuration

File: {output_dir}/EyeWitness.db
Engine: SQLite with pickle serialization
Schema: http, ua, opts tables
Resume: Tracks incomplete targets for resumption

Selenium Configuration

Browser: Chromium/Chrome (required)
Mode: Headless with new headless mode (default)
Timeouts: Configurable page load and screenshot timeouts
User Agents: Custom UA string support
Performance: Memory optimization and background throttling disabled
Security: Certificate error handling and automation detection disabled

Report Configuration

Format: HTML with Bootstrap CSS
Pagination: 25 results per page (configurable)
Categories: 25+ predefined service types
Grouping: 70% similarity threshold for page title clustering
Export: CSV data export capability
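
Pagination at 25 results per page amounts to slicing the result list into fixed-size chunks. The helper below is a minimal sketch with a hypothetical name, `paginate`; how reporting.py actually splits and links pages is not shown in this reference.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def paginate(items: Sequence[T], per_page: int = 25) -> List[List[T]]:
    """Split items into consecutive pages of at most per_page entries."""
    if per_page < 1:
        raise ValueError("per_page must be at least 1")
    # Step through the sequence in strides of per_page; the last page may be shorter.
    return [list(items[i:i + per_page]) for i in range(0, len(items), per_page)]
```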

Extension Points for Developers

1. Adding New Input Formats

Location: helpers.py
Function: create_targets_from_file()
Process: Add new parsing logic for additional file formats

2. Custom Service Categories

Location: categories.txt and reporting.py
Process: Add new categories and update categorization logic

3. Additional Signature Detection

Location: signatures.txt and helpers.py
Function: default_creds_category()
Process: Add new application signatures and detection rules

4. Enhanced Screenshot Capture

Location: selenium_module.py
Function: capture_host()
Process: Modify screenshot logic, add new data extraction

5. Custom Report Templates

Location: reporting.py
Functions: create_report(), create_table_string()
Process: Modify HTML templates and styling

Dependencies and Requirements

Core Python Dependencies

rapidfuzz>=3.0.0      # Fast string matching for clustering
selenium>=4.29.0      # Modern web browser automation  
netaddr>=0.10.0       # Network address manipulation
pyvirtualdisplay>=3.0 # Virtual display support (Unix)

System Requirements

Chromium/Chrome Browser: Required for Selenium WebDriver
ChromeDriver: WebDriver executable (installed via package manager)
Xvfb: Virtual display server (Linux headless environments)
Python 3.7+: Minimum Python version requirement

Security Considerations

Legitimate Security Tool

Purpose: Authorized security assessments only
No Malicious Code: Clean, transparent implementation
Standard Practices: Follows security tool development patterns

Safe Usage Guidelines

Only use on authorized targets
Respect rate limiting and target resources
Follow responsible disclosure for findings
Maintain scan logs for audit purposes

Performance Optimization

Multi-threading Architecture

Process Pool: Configurable number of worker processes (set with --threads)
Shared Queue: Efficient work distribution
Database Per Worker: Parallel database writes
Progress Tracking: Real-time status updates

Resource Management

Memory Optimization: Efficient object lifecycle management
Disk Space: Screenshot compression and cleanup
Network: Configurable timeouts and retry logic
CPU: Balanced parallel processing

Troubleshooting Common Issues

Chromium/ChromeDriver Issues

Linux Solution: sudo apt install chromium-browser chromium-chromedriver (Ubuntu package names; on Debian and Kali the packages are typically chromium and chromium-driver)
Windows Solution: Run setup.ps1 script as Administrator
macOS Solution: brew install --cask google-chrome

Headless Display Issues (Linux)

Solution: Install Xvfb: sudo apt install xvfb
Verification: Check platform_utils.py virtual display setup
Alternative: Use optimized headless mode (no display server needed)

Permission Issues

Solution: Run setup scripts with appropriate privileges
Alternative: Manual dependency installation per requirements.txt

Database Corruption

Solution: Delete EyeWitness.db file to reset scan state
Prevention: Avoid forceful termination during database writes

This reference provides comprehensive information for developers to understand, modify, and extend EyeWitness functionality without requiring assistance from the original authors.

Getting Started for New Contributors

Development Environment Setup

Fork and Clone Repository

git clone https://github.com/yourusername/EyeWitness.git
cd EyeWitness
git checkout -b feature/your-feature-name

Development Dependencies

# Install in development mode
cd setup && sudo ./setup.sh  # or setup.ps1 on Windows

# Optional: Create virtual environment for development
python3 -m venv eyewitness-dev
source eyewitness-dev/bin/activate  # Linux/macOS
# eyewitness-dev\Scripts\activate.bat  # Windows

Testing Your Changes

cd Python

# Quick functionality test
python3 EyeWitness.py --single https://example.com

# Test with sample URLs
echo -e "https://example.com\nhttps://google.com" > test_urls.txt
python3 EyeWitness.py -f test_urls.txt

Code Contribution Guidelines

Code Style Standards

Follow PEP 8 for Python code formatting
Use meaningful variable and function names
Add docstrings for public functions and classes
Keep functions focused and modular

Testing Requirements

Test your changes against multiple target types
Verify functionality on different platforms if possible
Ensure backward compatibility with existing features
Test error handling and edge cases

Commit Message Format

feat: add new signature detection for Apache Tomcat
fix: resolve ChromeDriver path detection on Windows
docs: update installation instructions for Alpine Linux
refactor: optimize database connection handling

Common Development Tasks

Adding New Application Signatures

Add to signatures.txt

APPLICATION_NAME:identifying_string_in_source

Update helpers.py detection logic

def default_creds_category(page_source, page_title):
    # Add your detection logic
    if "your_app_identifier" in page_source.lower():
        return True
    # No signature matched
    return False
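
As a sketch of how the NAME:string format from signatures.txt could drive detection, the helpers below parse signature lines and look for the first match in a page's source. The function names, the case-insensitive matching, and the treatment of lines starting with `#` as comments are assumptions, not documented behavior.

```python
from typing import Iterable, List, Optional, Tuple


def parse_signatures(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Turn 'APPLICATION_NAME:identifying_string' lines into (name, marker) pairs."""
    signatures: List[Tuple[str, str]] = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and (assumed) comment lines
        # Split on the first colon only, so markers may themselves contain colons.
        name, separator, marker = line.partition(":")
        if separator and marker:
            signatures.append((name.strip(), marker.strip().lower()))
    return signatures


def match_signature(page_source: str,
                    signatures: Iterable[Tuple[str, str]]) -> Optional[str]:
    """Return the first application whose marker appears in the page source."""
    lowered = page_source.lower()
    for name, marker in signatures:
        if marker in lowered:
            return name
    return None
```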

Adding New Input File Formats

Extend create_targets_from_file() in helpers.py

def create_targets_from_file(file_name):
    # Detect file format
    if file_name.endswith('.yourformat'):
        return parse_your_format(file_name)
    # Otherwise fall through to the existing format handling

Implement format parser

def parse_your_format(file_path):
    targets = []
    # Your parsing logic here
    return targets

Modifying Report Generation

Update HTML templates in reporting.py
Modify CSS in bin/style.css
Test report rendering with various data sizes

Cross-Platform Compatibility

Use platform_utils.py for OS-specific code
Test path handling across Windows/Linux/macOS
Verify browser detection on different platforms

Architecture Decision Records

Why Chromium Over Other Browsers?

Consistency: Same rendering engine across platforms
Headless Support: Mature headless mode implementation
Performance: Optimized for automation workloads
Availability: Widely available through package managers

Why SQLite for Data Storage?

Simplicity: No external database dependencies
Resume Capability: Persistent state for large scans
Performance: Fast for read/write operations
Portability: Database file can be moved between systems

Why Multiprocessing Over Threading?

GIL Limitations: Python's Global Interpreter Lock
Stability: Process isolation prevents single failures
Resource Control: Better memory and CPU management
Scalability: Effective utilization of multiple CPU cores

Debugging and Troubleshooting

Common Development Issues

ChromeDriver Version Mismatches

# Check versions
chromium-browser --version
chromedriver --version

# Update ChromeDriver
# Linux: sudo apt update && sudo apt install chromium-chromedriver
# Windows: Run setup.ps1 again

Virtual Display Issues (Linux)

# Manual Xvfb testing
Xvfb :99 -screen 0 1920x1080x24 &
export DISPLAY=:99
python3 EyeWitness.py --single https://example.com

Database Lock Errors

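# Add to db_manager.py for debugging
import sqlite3
try:
    # Database operation (db_conn and the query are placeholders)
    db_conn.execute("SELECT 1")
except sqlite3.OperationalError as e:
    print(f"Database error: {e}")
    # Handle appropriately (for example, retry or re-raise)
    raise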
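
One common way to handle lock errors is to retry with a growing delay. The sketch below is a generic pattern rather than anything taken from db_manager.py; `with_retry`, its parameters, and the check for the word "locked" in the error message are assumptions.

```python
import sqlite3
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retry(operation: Callable[[], T],
               attempts: int = 5,
               base_delay: float = 0.1) -> T:
    """Run operation, retrying on 'database is locked' with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except sqlite3.OperationalError as exc:
            # Re-raise errors that are not lock related, or when retries are exhausted.
            if "locked" not in str(exc) or attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))
    raise RuntimeError("attempts must be at least 1")
```

A write would then be wrapped as `with_retry(lambda: conn.execute(...))`, where `conn` is an open sqlite3 connection.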

Debug Mode Usage

# Enable verbose logging
python3 EyeWitness.py -f urls.txt --debug

# Browser window visibility (for development)
python3 EyeWitness.py -f urls.txt --show-selenium

Performance Optimization Guidelines

Thread/Process Tuning

Default Formula: threads = min(cpu_count * 2, 20)
Memory Consideration: Each browser instance uses ~100-200MB
I/O Bound vs CPU Bound: Screenshot capture is I/O bound
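
The default formula and the memory figure above can be combined into a quick sizing check. The function names are hypothetical, and the 200 MB per browser used below is simply the upper end of the range quoted above, not a measured value.

```python
import os
from typing import Optional


def default_worker_count(cpu_count: Optional[int] = None, cap: int = 20) -> int:
    """Apply threads = min(cpu_count * 2, cap)."""
    # os.cpu_count() may return None, so fall back to a single CPU.
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return min(cpus * 2, cap)


def estimated_browser_memory_mb(workers: int, per_browser_mb: int = 200) -> int:
    """Rough upper bound on browser memory when every worker runs one browser."""
    return workers * per_browser_mb
```

On a 4-core machine this gives 8 workers and an estimate of about 1600 MB for the browsers alone.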

Memory Management

# Best practices for large scans
def process_batch(targets, batch_size=100):
    for i in range(0, len(targets), batch_size):
        batch = targets[i:i+batch_size]
        process_targets(batch)  # placeholder for the per-batch capture step
        # Drop the reference so the batch can be garbage collected
        del batch

Database Optimization

# Batch database operations
def batch_store_screenshots(db_conn, objects, batch_size=50):
    for i in range(0, len(objects), batch_size):
        batch = objects[i:i+batch_size]
        # Single transaction per batch: commits on success, rolls back on error
        with db_conn:
            for obj in batch:
                store_screenshot(db_conn, obj)  # placeholder for the per-object insert

Security Considerations for Developers

Input Validation

# Always validate and sanitize inputs
def validate_url(url):
    if not url.startswith(('http://', 'https://')):
        return False
    # Additional validation logic
    return True

Safe File Handling

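# Secure file operations
import os.path
def safe_file_path(user_path, base_dir):
    # Prevent directory traversal: resolve both paths before comparing
    base_dir = os.path.realpath(base_dir)
    safe_path = os.path.realpath(os.path.join(base_dir, user_path))
    # commonpath avoids the prefix bug where /data-evil passes startswith('/data')
    if os.path.commonpath([base_dir, safe_path]) != base_dir:
        raise ValueError("Invalid file path")
    return safe_path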

Error Information Disclosure

# Don't expose internal paths or sensitive info
import logging
logger = logging.getLogger(__name__)

def run_safely(operation):  # hypothetical wrapper name
    try:
        return operation()  # Operation that might fail
    except Exception as e:
        # Log detailed error internally
        logger.error(f"Internal error: {e}")
        # Return sanitized error to user
        return "Operation failed. Please check your input."

Release and Deployment

Version Numbering

Follow Semantic Versioning (MAJOR.MINOR.PATCH)
MAJOR: Breaking changes
MINOR: New features, backward compatible
PATCH: Bug fixes, backward compatible

Release Checklist

All tests pass on Windows, Linux, and macOS
Documentation updated for new features
CHANGELOG updated with changes
Version numbers updated in relevant files
Security review completed
Performance impact assessed

Community and Support

Getting Help

Check existing GitHub issues for similar problems
Review this developer reference and README.md
Test with minimal reproduction case
Provide platform, Python version, and error details

Contributing Back

Submit pull requests for bug fixes and features
Update documentation for changes
Add tests for new functionality
Follow the established code style

This comprehensive reference should enable developers to contribute effectively to EyeWitness without requiring guidance from original authors.
