- Development Setup
- Architecture Overview
- Code Structure
- Development Workflow
- Testing
- Security Guidelines
- Contributing
- Deployment
- Python 3.8+
- FFmpeg and ImageMagick installed
- PostgreSQL (required)
- Git
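The prerequisites above can be checked before setup with a short script. This is a minimal sketch, not part of the project: the executable names (`ffmpeg`, `magick` or `convert`, `git`) are assumptions about how the tools are installed, and PostgreSQL is not checked because it may run on another host.

```python
import shutil
import sys

# Executable names are assumptions: ImageMagick 7 ships `magick`,
# while ImageMagick 6 ships `convert`.
REQUIRED_TOOLS = {
    "FFmpeg": ("ffmpeg",),
    "ImageMagick": ("magick", "convert"),
    "Git": ("git",),
}


def missing_prerequisites(which=shutil.which, python_version=sys.version_info):
    """Return the names of prerequisites that could not be found."""
    missing = []
    if tuple(python_version[:2]) < (3, 8):
        missing.append("Python 3.8+")
    for name, candidates in REQUIRED_TOOLS.items():
        # A tool counts as present if any candidate executable is on PATH.
        if not any(which(exe) for exe in candidates):
            missing.append(name)
    return missing
```

Calling `missing_prerequisites()` with no arguments inspects the current interpreter and PATH; an empty list means everything was found.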
- Clone the repository:
    git clone https://github.com/ttlequals0/PixelProbe.git
    cd PixelProbe
- Create a virtual environment:
    python -m venv venv
    source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
    pip install -r requirements.txt
- Install system dependencies:
    On Ubuntu/Debian:
    sudo apt-get update
    sudo apt-get install -y ffmpeg imagemagick libmagic1
    On macOS:
    brew install ffmpeg imagemagick libmagic
- Set up environment variables:
    cp .env.example .env
    # Edit .env with your configuration
- Initialize the database:
    python -c "from app import create_tables; create_tables()"
- Run the development server:
    python app.py
  The application will be available at http://localhost:5000
- Build the Docker image:
    docker build -t pixelprobe:dev .
- Run with Docker Compose:
    docker-compose up -d

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Web Client    │────▶│    Flask API    │────▶│  PostgreSQL DB  │
└─────────────────┘     └─────────────────┘     └─────────────────┘
                                 │
                                 ▼
                        ┌─────────────────┐
                        │  Media Scanner  │
                        └─────────────────┘
                                 │
                          ┌──────┴──────┐
                          ▼             ▼
                     ┌─────────┐   ┌─────────┐
                     │ FFmpeg  │   │ImageMag │
                     └─────────┘   └─────────┘
- Presentation Layer (templates/, static/)
  - HTML templates with Bootstrap UI
  - JavaScript for dynamic interactions
  - Real-time progress updates
- API Layer (pixelprobe/api/)
  - RESTful endpoints
  - Request validation
  - Rate limiting
  - CSRF protection
- Business Logic Layer (pixelprobe/services/)
  - Scan orchestration
  - Statistics calculation
  - Export functionality
  - Maintenance operations
- Data Access Layer (pixelprobe/repositories/)
  - Database operations
  - Query optimization
  - Transaction management
- Core Scanner (media_checker.py)
  - File discovery
  - Corruption detection
  - Multi-tool validation
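The layering above can be illustrated with a small, framework-free sketch. The class and function names here (`InMemoryScanRepository`, `StatsService`, `stats_endpoint`, `ScanResult`) are hypothetical and stand in for the real repository, service, and route modules; the point is only the direction of the dependencies, with each layer talking to the one directly below it.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ScanResult:
    file_path: str
    is_corrupted: bool


class InMemoryScanRepository:
    """Data access layer: the only place that knows how results are stored."""

    def __init__(self) -> None:
        self._rows: Dict[str, ScanResult] = {}

    def save(self, result: ScanResult) -> None:
        self._rows[result.file_path] = result

    def all(self) -> List[ScanResult]:
        return list(self._rows.values())


class StatsService:
    """Business logic layer: derives statistics from repository data."""

    def __init__(self, repository: InMemoryScanRepository) -> None:
        self._repository = repository

    def summary(self) -> Dict[str, int]:
        results = self._repository.all()
        corrupted = sum(1 for r in results if r.is_corrupted)
        return {
            "total": len(results),
            "corrupted": corrupted,
            "healthy": len(results) - corrupted,
        }


def stats_endpoint(service: StatsService) -> Dict[str, object]:
    """API layer: shapes the service result into a response payload."""
    return {"status": "ok", "stats": service.summary()}
```

Because the service receives its repository as a constructor argument, a test can pass an in-memory repository like this one instead of a database-backed one.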
PixelProbe/
├── app.py                    # Application entry point
├── media_checker.py          # Core scanning engine
├── models.py                 # SQLAlchemy models
├── scheduler.py              # Scheduled scan management
├── version.py                # Version information
├── requirements.txt          # Python dependencies
├── Dockerfile                # Docker configuration
├── docker-compose.yml        # Docker Compose setup
│
├── pixelprobe/               # Main application package
│   ├── __init__.py
│   ├── api/                  # API endpoints
│   │   ├── scan_routes.py    # Scanning endpoints
│   │   ├── stats_routes.py   # Statistics endpoints
│   │   ├── admin_routes.py   # Admin endpoints
│   │   ├── export_routes.py  # Export endpoints
│   │   └── maintenance_routes.py
│   │
│   ├── services/             # Business logic
│   │   ├── scan_service.py
│   │   ├── stats_service.py
│   │   ├── export_service.py
│   │   └── maintenance_service.py
│   │
│   ├── repositories/         # Data access
│   │   ├── base_repository.py
│   │   ├── scan_repository.py
│   │   └── config_repository.py
│   │
│   └── utils/                # Utilities
│       ├── security.py       # Security utilities
│       ├── validators.py     # Input validation
│       ├── decorators.py     # Custom decorators
│       └── helpers.py        # Helper functions
│
├── templates/                # HTML templates
│   ├── index.html            # Main UI
│   └── api_docs.html         # API documentation
│
├── static/                   # Static assets
│   ├── css/
│   ├── js/
│   └── images/
│
├── tests/                    # Test suite
│   ├── unit/
│   └── integration/
│
├── docs/                     # Documentation
│   ├── api/                  # API docs
│   ├── developer/            # Developer guides
│   └── examples/             # Code examples
│
└── tools/                    # Utility scripts
    └── fix_*.py              # Database repair tools
- Follow PEP 8 for Python code
- Use type hints where appropriate
- Maximum line length: 100 characters
- Use meaningful variable names
- Create a feature branch:
    git checkout -b feature/your-feature-name
- Make your changes and commit:
    git add .
    git commit -m "feat: add new scanning feature"
- Push and create PR:
    git push origin feature/your-feature-name
Follow the Conventional Commits specification:
- feat: New feature
- fix: Bug fix
- docs: Documentation changes
- style: Code style changes
- refactor: Code refactoring
- test: Test additions/changes
- chore: Maintenance tasks
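A commit subject can be checked against the types listed above with a small script, for example from a local Git hook. This is a sketch under assumptions: the function name `is_conventional` is hypothetical, and the pattern covers only the subject line (type, optional scope, optional `!`, description), not the body or footers of the specification.

```python
import re

# The commit types listed in this guide.
ALLOWED_TYPES = ("feat", "fix", "docs", "style", "refactor", "test", "chore")

# type, optional "(scope)", optional "!" for breaking changes, then ": description"
COMMIT_PATTERN = re.compile(
    r"^(?P<type>" + "|".join(ALLOWED_TYPES) + r")"
    r"(\((?P<scope>[^()\s]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<description>\S.*)$"
)


def is_conventional(subject: str) -> bool:
    """Return True if the commit subject line matches the pattern above."""
    return COMMIT_PATTERN.match(subject) is not None
```

For instance, `is_conventional("feat: add new scanning feature")` accepts the commit message used in the workflow above, while a subject without a type prefix is rejected.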
- API Endpoint:
    # pixelprobe/api/your_routes.py
    from flask import Blueprint, request, jsonify
    from pixelprobe.utils.security import validate_json_input

    your_bp = Blueprint('your_feature', __name__, url_prefix='/api')

    @your_bp.route('/your-endpoint', methods=['POST'])
    @validate_json_input({
        'field': {'required': True, 'type': str}
    })
    def your_endpoint():
        """Your endpoint description"""
        data = request.get_json()
        # Implementation
        return jsonify({'result': 'success'})
- Register Blueprint:
    # app.py
    from pixelprobe.api.your_routes import your_bp
    app.register_blueprint(your_bp)
- Add Service Logic:
    # pixelprobe/services/your_service.py
    class YourService:
        def __init__(self):
            pass

        def process_data(self, data):
            # Business logic here; this placeholder returns the input unchanged
            result = data
            return result
When adding new database fields:
- Update the model:
    # models.py
    class YourModel(db.Model):
        new_field = db.Column(db.String(100))
- Add migration in app.py:
    def migrate_database():
        # ... existing code ...
        migrations = [
            ('new_field', "ALTER TABLE your_table ADD COLUMN new_field VARCHAR(100)")
        ]
# Run all tests
pytest
# Run with coverage
pytest --cov=pixelprobe
# Run specific test file
pytest tests/unit/test_scan_service.py
- Unit Test Example:
    # tests/unit/test_scan_service.py
    import pytest
    from pixelprobe.services.scan_service import ScanService

    def test_scan_file_validation():
        service = ScanService()
        # Test invalid path
        with pytest.raises(ValueError):
            service.scan_file("../../../etc/passwd")
        # Test valid path
        result = service.scan_file("/allowed/path/image.jpg")
        assert result is not None
- Integration Test Example:
    # tests/integration/test_api_endpoints.py
    def test_scan_endpoint(client):
        response = client.post('/api/scan-file', json={
            'file_path': '/test/image.jpg'
        })
        assert response.status_code == 200
        assert 'message' in response.json
Use the provided test database creation script:
python scripts/create_test_database.py
Always validate user input:
from flask import jsonify
from pixelprobe.utils.security import validate_file_path, validate_json_input
# Path validation (inside a view function, so the error response can be returned)
try:
    safe_path = validate_file_path(user_input)
except PathTraversalError:
    return jsonify({'error': 'Invalid path'}), 400
# JSON validation decorator
@validate_json_input({
    'field': {'required': True, 'type': str, 'max_length': 100}
})
Always use the safe wrapper:
from pixelprobe.utils.security import safe_subprocess_run
# Safe: arguments are passed as a list, so no shell interprets file_path
result = safe_subprocess_run(['ffmpeg', '-i', file_path])
# Never do this: with shell=True, a crafted file_path can inject shell commands
result = subprocess.run(f'ffmpeg -i {file_path}', shell=True) # DANGEROUS!
When implementing authentication:
- Use JWT tokens
- Implement refresh tokens
- Add role-based access control
- Secure sensitive endpoints
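To make the path validation guideline in this section more concrete, here is a minimal sketch of how such a check can work. It is not the project's actual `validate_file_path`: the two-argument signature, the `roots_from_env` helper, and this `PathTraversalError` class are hypothetical stand-ins, and the colon-separated format is assumed from the `ALLOWED_SCAN_PATHS` example in the deployment settings.

```python
import os


class PathTraversalError(ValueError):
    """Hypothetical stand-in: raised when a path escapes every allowed root."""


def roots_from_env(value: str) -> list:
    """Split a colon-separated list of directories, skipping empty entries."""
    return [part for part in value.split(":") if part]


def validate_file_path(user_path: str, allowed_roots: list) -> str:
    """Return the resolved path if it lies inside one of the allowed roots."""
    # realpath collapses ".." segments and follows symlinks, so the check
    # runs against the location that would actually be opened.
    resolved = os.path.realpath(user_path)
    for root in allowed_roots:
        root_resolved = os.path.realpath(root)
        try:
            # commonpath compares whole path components, so
            # "/media/photos-evil" does not count as inside "/media/photos".
            if os.path.commonpath([resolved, root_resolved]) == root_resolved:
                return resolved
        except ValueError:
            continue  # for example, paths on different drives on Windows
    raise PathTraversalError(f"Path is outside the allowed roots: {user_path!r}")
```

Comparing resolved paths rather than raw strings is the design choice that matters here: a plain prefix check would accept both `../` sequences and sibling directories whose names merely start with an allowed root.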
- Read the Code of Conduct
- Check existing issues and PRs
- Discuss major changes in an issue first
- Update documentation for new features
- Add tests for new functionality
- Ensure all tests pass
- Update CHANGELOG.md
- Request review from maintainers
- Code follows style guidelines
- Tests added/updated
- Documentation updated
- Security considerations addressed
- Performance impact considered
- Backward compatibility maintained
- Environment Variables:
    # .env.production
    DEBUG=False
    SECRET_KEY=your-strong-secret-key
    DATABASE_URL=postgresql://user:pass@host/db
    ALLOWED_SCAN_PATHS=/media/photos:/media/videos
    TZ=UTC
- Gunicorn Configuration:
    # gunicorn_config.py
    bind = "0.0.0.0:5000"
    workers = 4
    worker_class = "sync"
    worker_connections = 1000  # has no effect with the "sync" worker class
    max_requests = 1000
    max_requests_jitter = 50
    timeout = 120
- Run with Gunicorn:
    gunicorn -c gunicorn_config.py app:app
- Build production image:
    docker build -t pixelprobe:latest .
- Run container:
    docker run -d \
    --name pixelprobe \
    -p 5000:5000 \
    -v /media:/media:ro \
    -v pixelprobe_data:/app/data \
    -e SECRET_KEY=your-secret \
    pixelprobe:latest
- Health Checks:
  - Monitor the /health endpoint
  - Check scan queue status
  - Monitor disk space
- Logging:
  - Application logs: /app/logs/
  - Scan logs: Include timestamps and file paths
  - Error tracking: Log all exceptions
- Performance:
  - Monitor scan duration
  - Track memory usage
  - Database query performance
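The disk space and scan queue checks above could feed a single health payload. The following is a rough sketch only: `health_report`, its parameters, and the returned field names are hypothetical, and the guide does not specify what the real `/health` endpoint returns.

```python
import shutil
import time


def health_report(data_dir: str = ".", min_free_bytes: int = 1024 ** 3,
                  pending_scans: int = 0) -> dict:
    """Build a health payload from free disk space and the scan queue length."""
    usage = shutil.disk_usage(data_dir)  # named tuple: total, used, free (bytes)
    disk_ok = usage.free >= min_free_bytes
    return {
        "status": "healthy" if disk_ok else "degraded",
        "disk_free_bytes": usage.free,
        "pending_scans": pending_scans,
        "checked_at": time.time(),
    }
```

A monitoring job could call this periodically and alert when `status` is no longer `"healthy"`; the 1 GiB default threshold is an arbitrary placeholder.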
Regular backups of:
- PostgreSQL database
- Configuration files
- Scan results
- Error logs
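For the database part of the backup list, one option is a timestamped `pg_dump` archive, assuming the PostgreSQL database named in the prerequisites. The sketch below only builds the command; the function name, the file naming scheme, and the backup directory are hypothetical choices, not project conventions.

```python
from datetime import datetime, timezone
from pathlib import Path


def build_backup_command(database_url: str, backup_dir: str, now=None) -> list:
    """Return a pg_dump argument list that writes a timestamped archive."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    target = Path(backup_dir) / f"pixelprobe-{stamp}.dump"
    # --format=custom writes a compressed archive that pg_restore can read.
    return [
        "pg_dump",
        "--format=custom",
        f"--file={target}",
        f"--dbname={database_url}",
    ]
```

The list can be passed to `subprocess.run(cmd, check=True)`. Since a connection URL on the command line exposes its password to other local users, a password file such as `~/.pgpass` may be preferable in production.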
- Test updates in staging environment
- Backup database before updates
- Run database migrations
- Monitor for issues after deployment
- "No module named 'magic'"
  - Install: pip install python-magic
  - On Windows: also install python-magic-bin
- "ffmpeg not found"
  - Ensure FFmpeg is in PATH
  - Install with package manager
- Database connection issues
  - Check PostgreSQL service is running
  - Verify DATABASE_URL environment variable
- Memory issues with large scans
  - Increase worker memory limits
  - Use parallel scanning
  - Implement batch processing
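The batch processing suggestion above can be sketched with a small generator that hands out file paths in fixed-size chunks, so that only one chunk of results needs to be held in memory at a time. The helper name `batched` and the idea of feeding it discovered file paths are assumptions; the scanner's real interface may differ.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batched(paths: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most batch_size paths."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(paths)
    while True:
        # islice pulls only the next batch_size items from the source.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

Because the input is consumed lazily, `paths` can itself be a generator walking a large directory tree, and each batch can be scanned and committed to the database before the next one is read.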
Enable debug logging:
# .env
DEBUG=True
LOG_LEVEL=DEBUG
# Enable profiling
from werkzeug.middleware.profiler import ProfilerMiddleware
app.wsgi_app = ProfilerMiddleware(app.wsgi_app)