This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Metta AI is a reinforcement learning project focusing on the emergence of cooperation and alignment in multi-agent AI systems. It creates a model organism for complex multi-agent gridworld environments to study the impact of social dynamics (like kinship and mate selection) on learning and cooperative behaviors.
The codebase consists of:
- `metta/`: Core Python implementation for agents, maps, RL algorithms, and simulation
- `mettagrid/`: C++/Python grid environment implementation
- `mettascope/`: Visualization and replay tools
# Initial setup - installs uv, configures metta, and installs components
./install.sh
# After installation, you can use metta commands directly:
metta status # Check component status
metta configure --profile=softmax # Reconfigure for different profile
metta install aws wandb # Install specific components
# Run `metta -h` to see all available commands

@.cursor/commands.md
# Run all tests with coverage
metta test --cov=mettagrid --cov-report=term-missing
# Run linting and formatting on python files with Ruff
metta lint # optional --fix and --staged arguments
# Auto-fix Ruff errors with Claude (requires ANTHROPIC_API_KEY)
uv run ./devops/tools/auto_ruff_fix.py path/to/file
# Format shell scripts
./devops/tools/format_sh.sh

# Manual environment activation is not needed: just run scripts directly,
# and they work automatically through uv-powered shebangs.
# Clean debug cmake build artifacts. `metta install` also does this
metta clean

- Each agent has a policy with action and observation spaces
- Policies are stored in `PolicyStore` and managed by `MettaAgent`
- Agent architecture is designed to be adaptable to new game rules and environments
- Neural components can be mixed and matched via configuration
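Mixing components via configuration might look like the stdlib sketch below. The component names, `ComponentConfig`, and the registry are hypothetical illustrations, not the project's real classes; the actual wiring lives in `metta/` and `configs/agent/`.

```python
from dataclasses import dataclass, field

@dataclass
class ComponentConfig:
    name: str                                  # e.g. "cnn_encoder" (hypothetical)
    params: dict = field(default_factory=dict)

# Stand-in registry: real components would be neural network modules.
COMPONENT_REGISTRY = {
    "cnn_encoder": lambda p: f"CNNEncoder({p})",
    "lstm_core": lambda p: f"LSTMCore({p})",
}

def build_components(configs: list[ComponentConfig]) -> list[str]:
    """Instantiate each configured component by looking it up in the registry."""
    return [COMPONENT_REGISTRY[c.name](c.params) for c in configs]

agent_parts = build_components([
    ComponentConfig("cnn_encoder", {"channels": 32}),
    ComponentConfig("lstm_core", {"hidden": 256}),
])
```

Swapping architectures then becomes a config edit rather than a code change, which is the point of the registry pattern.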
- Gridworld environments with agents, resources, and interaction rules
- Procedural world generation with customizable configurations
- Various environment types with different dynamics and challenges
- Support for different kinship schemes and mate selection mechanisms
- Distributed reinforcement learning with multi-GPU support
- Integration with Weights & Biases for experiment tracking
- Scalable architecture for training large-scale multi-agent systems
- Support for curriculum learning and knowledge distillation
- Comprehensive suite of intelligence evaluations
- Navigation tasks, maze solving, in-context learning
- Cooperation and competition metrics
- Support for tracking and comparing multiple policies
The project uses OmegaConf for configuration, with config files organized in configs/:
- `agent/`: Agent architecture configurations
- `trainer/`: Training configurations
- `sim/`: Simulation configurations
- `hardware/`: Hardware-specific settings
- `user/`: User-specific configurations
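Layered configs override one another key-by-key when merged. A stdlib sketch approximating OmegaConf's merge semantics (the `trainer`/`hardware` values here are made up for illustration):

```python
def merge(base: dict, override: dict) -> dict:
    """Recursive dict merge: later configs win key-by-key, nested dicts merge."""
    out = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(out.get(key), dict):
            out[key] = merge(out[key], value)
        else:
            out[key] = value
    return out

trainer = {"optimizer": {"lr": 3e-4, "eps": 1e-8}, "batch_size": 4096}
hardware = {"batch_size": 1024}  # e.g. a laptop-specific override
merged = merge(trainer, hardware)
# batch_size comes from the hardware layer; optimizer settings are untouched.
```

In the real project, `OmegaConf.merge` performs this composition across the `configs/` directories.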
@.cursor/docs.md
- Tests should be independent and idempotent
- Tests should be focused on testing one thing
- Tests should cover edge cases and boundary conditions
- Tests are organized in the `tests/` directory, mirroring the project structure
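The conventions above might look like this in a hypothetical test module (the `wrap_position` helper is invented for illustration; real tests mirror the package layout, e.g. `tests/map/test_grid.py`):

```python
def wrap_position(x: int, width: int) -> int:
    """Toy function under test: wrap an x-coordinate on a torus grid."""
    return x % width

def test_wrap_inside_bounds():
    # Focused: one behavior per test. In-range positions are unchanged.
    assert wrap_position(3, 10) == 3

def test_wrap_edge_cases():
    # Boundary conditions get explicit coverage.
    assert wrap_position(10, 10) == 0
    assert wrap_position(-1, 10) == 9
```

Each test is independent and idempotent: no shared state, so the tests pass in any order and any number of times.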
- Use modern Python typing syntax (PEP 585: `list[str]` instead of `List[str]`)
- Use union type syntax for Python 3.10+ (`type | None` instead of `Optional[type]`)
- Follow selective type annotation guidelines:
- Always annotate: All function parameters
- Selectively annotate returns for:
- Public API functions/methods (not prefixed with _)
- Functions with complex logic or multiple branches
- Functions where the return type isn't obvious from the name
- Functions that might return None in some cases
- Skip return annotations for:
- Private methods internal to a class
- Functions enclosed within other functions
- Simple getters/setters with obvious returns
- Very short functions (1-3 lines) with obvious returns
- Variable annotations: Only when type inference fails or for empty collections
- Prefer dataclasses over TypedDict for complex data structures
- Use descriptive variable names that clearly indicate purpose
- Remove unnecessary comments that just restate what the code does
- Prefer properties over methods for computed attributes using the `@property` decorator
- Implement proper error handling with clear, actionable error messages
When reviewing code, focus on:
- Type Safety: Check for missing type annotations, especially return types
- API Consistency: Ensure similar functionality follows the same patterns
- Performance: Identify potential bottlenecks or inefficient patterns
- Maintainability: Look for code that will be difficult to modify or extend
- Documentation: Ensure complex logic is properly documented
- Testing: Verify that new functionality has appropriate test coverage
- Convert methods to properties where appropriate for better API consistency
- Use the `@property` decorator for computed attributes
- Ensure all environment properties follow consistent naming patterns
- Example: `action_names()` → `action_names` (property)
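The `action_names()` → `action_names` conversion, as a minimal sketch (`GridEnv` is a stand-in class, not the real environment):

```python
class GridEnv:
    def __init__(self) -> None:
        self._action_names = ["move", "rotate", "use"]

    # Before: a simple getter method, called as env.action_names()
    # def action_names(self):
    #     return self._action_names

    # After: a property, accessed as env.action_names (no parentheses)
    @property
    def action_names(self) -> list[str]:
        return self._action_names
```

Callers read the attribute directly, which keeps the API consistent with other computed environment attributes.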
- Validate policy types with runtime checking using `policy_as_metta_agent()`
- Use union types for policies: `Union[MettaAgent, DistributedMettaAgent]`
- Ensure proper type safety for policy handling throughout the system
- Add explicit `torch.device` type hints in trainer and simulation modules
- Be consistent about device placement and movement of tensors
When creating PRs (triggered by @claude open-pr):
The workflow automatically determines the appropriate base branch:
- From PR Comments: New branches are created from the current PR's branch
- From Issue Comments: New branches are created from the main branch
- Example: If you comment `@claude open-pr` in PR #657 (branch: `robb/0525-agent-type-changes`), Claude will create a new branch based on `robb/0525-agent-type-changes`, not `main`
- Use descriptive branch names with prefixes:
  - `feature/add-type-safety` - New functionality
  - `fix/missing-annotations` - Bug fixes
  - `refactor/method-to-property` - Code improvements
  - `docs/update-readme` - Documentation updates
- Include the issue number when applicable: `fix/657-type-safety-improvements`
- Follow conventional commit format: `feat:`, `fix:`, `refactor:`, `docs:`, `test:`
- Be specific about what was changed: `fix: add missing return type annotations to PolicyStore methods`
- Reference issues when applicable: `fix: resolve type safety issues (#657)`
- Title: Clear, concise description of the change
- Description: Must include:
- What: Summary of changes made
- Why: Rationale for the change
- Testing: How the changes were verified
- Breaking Changes: Any API changes that affect existing code
- Linking: Reference related issues with "Closes #123", "Fixes #123", or "Addresses #123"
- Analyze: Understand the request and examine current codebase patterns
- Plan: Create focused, incremental changes rather than large rewrites
- Implement: Make changes following established project patterns
- Test: Ensure all existing tests pass and add new tests if needed
- Document: Update docstrings and comments where necessary
- Review: Self-review the changes for consistency with project standards
Before creating a PR, ensure:
- All new public methods have return type annotations
- Code follows the established naming conventions
- No unnecessary comments that restate obvious code
- Properties are used instead of simple getter methods
- Proper error handling is implemented
- Tests pass locally
- Code is formatted according to project standards