This guide explains how to structure examples for the Codegen library. A well-structured example helps both humans and AI understand the code's purpose and how to use it effectively.
- Single Responsibility: Each example should demonstrate one clear use case
- Self-Contained: Examples should work independently with minimal setup
- Clear Structure: Follow a consistent file organization pattern
- Good Documentation: Include README.md with clear explanations and examples

Each example should use this directory layout:

    example-name/
    ├── README.md      # Documentation and usage examples
    ├── run.py         # Main implementation
    ├── guide.md       # (Optional) Additional technical details
    └── input_repo/    # (Optional) Sample code for transformation

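As a rough illustration of the layout above, the following sketch checks an example directory for the required and optional entries. The helper name `check_example_layout` is hypothetical and not part of Codegen; it only assumes that README.md and run.py are required while guide.md and input_repo/ are optional, as the layout describes.

```python
from pathlib import Path

# Required and optional entries, taken from the directory layout in this guide.
REQUIRED = ("README.md", "run.py")
OPTIONAL = ("guide.md", "input_repo")


def check_example_layout(example_dir: Path) -> dict[str, list[str]]:
    """Report missing required files and the optional entries that are present.

    Hypothetical helper for illustration; not part of the Codegen library.
    """
    missing = [name for name in REQUIRED if not (example_dir / name).is_file()]
    optional_present = [name for name in OPTIONAL if (example_dir / name).exists()]
    return {"missing": missing, "optional_present": optional_present}
```

An empty `missing` list means the example has the minimum files this guide asks for.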
Your run.py should follow this structure, demonstrated well in the generate_training_data example:

- Imports at the top

      import codegen
      from codegen import Codebase
      from codegen.sdk.core import Function
      # ... other imports

- Utility functions with clear docstrings

      def hop_through_imports(imp: Import) -> Symbol | ExternalModule:
          """Finds the root symbol for an import"""
          # Implementation...

- Main Codegen function with decorator

      @codegen.function("your-function-name")
      def run(codebase: Codebase):
          """Clear docstring explaining what the function does.

          Include:
          1. Purpose of the function
          2. Key steps or transformations
          3. Expected output
          """
          # Implementation...

- Entry point at bottom

      if __name__ == "__main__":
          # Initialize codebase
          # Run transformation
          # Save/display results

Prefer using public repositories for examples when possible. However, sometimes you need a specific code structure to demonstrate a concept clearly. Here's how to handle both cases:

    # Preferred: Use a well-known public repo that demonstrates the concept well
    codebase = Codebase.from_repo("fastapi/fastapi")

    # Alternative: Create a minimal example repo when you need specific code structure
    # 1. Create an input_repo/ directory in your example
    # 2. Add minimal code that clearly demonstrates the transformation
    codebase = Codebase("./input_repo")

For example:

    example-name/
    ├── README.md
    ├── run.py
    └── input_repo/    # Your minimal example code
        ├── app.py
        └── utils.py

Choose between these approaches based on:
- Can you find a public repo that clearly shows the concept?
- Is the transformation specific enough that a custom example would be clearer?
- Would a minimal example be more educational than a complex real-world one?
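
If a custom example turns out to be clearer, the input_repo/ directory can stay very small. The sketch below writes the two files from the tree above using only the standard library. The file contents and the helper name `write_input_repo` are hypothetical placeholders; in practice you would usually commit these files by hand.

```python
from pathlib import Path

# Hypothetical file contents: keep them as small as the transformation allows.
MINIMAL_FILES = {
    "utils.py": "def add(a, b):\n    return a + b\n",
    "app.py": "from utils import add\n\nprint(add(1, 2))\n",
}


def write_input_repo(example_dir: Path) -> Path:
    """Create example_dir/input_repo/ containing a couple of tiny Python files."""
    repo_dir = example_dir / "input_repo"
    repo_dir.mkdir(parents=True, exist_ok=True)
    for name, source in MINIMAL_FILES.items():
        (repo_dir / name).write_text(source, encoding="utf-8")
    return repo_dir
```

Two short files with one import between them are often enough to show a transformation without the noise of a real project.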

When writing the code, follow these practices:

- Function Decorator
  - Always use @codegen.function() with a descriptive name
  - Name should match the example's purpose
- Utility Functions
  - Break down complex logic into smaller, focused functions
  - Each utility should demonstrate one clear concept
  - Include type hints and docstrings
- Main Function
  - Name it run() for consistency
  - Include a comprehensive docstring explaining the transformation
  - Return meaningful data that can be used programmatically
- Entry Point
  - Include a __name__ == "__main__" block
  - Show both initialization and execution
  - Add progress messages for better UX
- Error Handling
  - Include appropriate error handling for common cases
  - Provide clear error messages

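
The entry-point and error-handling points can be sketched without Codegen itself. In the sketch below, `load_sources`, `run`, and `main` are hypothetical stand-ins: the transformation just counts `def` lines in plain text, and nothing here calls the Codegen API. It is meant only to show progress messages, clear error messages, and a return value that other code can use.

```python
import sys
from pathlib import Path


def load_sources(repo_dir: Path) -> dict[str, str]:
    """Read every .py file under repo_dir, failing early with a clear message."""
    if not repo_dir.is_dir():
        raise FileNotFoundError(
            f"Input repo not found at '{repo_dir}'. Create it or pass another path."
        )
    sources = {
        path.relative_to(repo_dir).as_posix(): path.read_text(encoding="utf-8")
        for path in sorted(repo_dir.rglob("*.py"))
    }
    if not sources:
        raise ValueError(f"No Python files found in '{repo_dir}'.")
    return sources


def run(sources: dict[str, str]) -> dict[str, int]:
    """Stand-in transformation: count lines that start with 'def ' in each file.

    Returns a mapping of file name to count so callers can use it programmatically.
    """
    return {
        name: sum(line.lstrip().startswith("def ") for line in text.splitlines())
        for name, text in sources.items()
    }


def main(repo_dir: str = "./input_repo") -> int:
    """Initialize, run, and report. Returns a process exit code."""
    print(f"Loading sources from {repo_dir}...")
    try:
        sources = load_sources(Path(repo_dir))
    except (FileNotFoundError, ValueError) as exc:
        print(f"Error: {exc}", file=sys.stderr)
        return 1
    print(f"Running transformation on {len(sources)} file(s)...")
    results = run(sources)
    for name, count in results.items():
        print(f"{name}: {count} function definition(s)")
    return 0
```

In a real run.py, `main()` would be called from the `__name__ == "__main__"` block, for example as `sys.exit(main())`, so a failure is visible to the shell as well as to the reader.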
The generate_training_data example demonstrates these principles well:

    # Focused utility function
    def get_function_context(function) -> dict:
        """Get the implementation, dependencies, and usages of a function."""
        # Clear, focused implementation...

    # Main transformation with decorator
    @codegen.function("generate-training-data")
    def run(codebase: Codebase):
        """Generate training data using a node2vec-like approach...

        This codemod:
        1. Finds all functions...
        2. For each function...
        3. Outputs structured JSON...
        """
        # Clear implementation with good structure...

    # Clean entry point
    if __name__ == "__main__":
        print("Initializing codebase...")
        codebase = Codebase.from_repo("fastapi/fastapi")
        run(codebase)
        # ... rest of execution

Every example should include:

- README.md
  - Clear explanation of purpose
  - Explanation of key syntax and program behavior
  - Code examples showing the transformation (before/after)
  - If using input_repo/, an explanation of its structure and contents
  - Output format (if applicable)
  - Setup and running instructions
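
One possible way to produce the before/after snippet for the README is a unified diff from the standard library's `difflib`. This is a sketch, not a required format: the helper name `before_after_diff` and the sample snippets are hypothetical.

```python
import difflib


def before_after_diff(before: str, after: str, filename: str = "app.py") -> str:
    """Return a unified diff of one file's contents before and after a transformation."""
    diff = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"before/{filename}",
        tofile=f"after/{filename}",
    )
    return "".join(diff)


# Hypothetical before/after snippets, for illustration only.
before = "def greet(name):\n    print('Hello ' + name)\n"
after = "def greet(name: str) -> None:\n    print(f'Hello {name}')\n"
print(before_after_diff(before, after))
```

A diff keeps the README focused on what changed; separate "before" and "after" listings work just as well when the files are short.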
Before submitting:
- Test in a fresh environment
- Verify all dependencies are listed
- Ensure the example runs with minimal setup
- Check that documentation is clear and accurate
Remember: Your example might be used by both humans and AI to understand Codegen's capabilities. Clear structure and documentation help everyone use your code effectively.