This notebook walks through the two tutorial policy generators: `--scripted` and `--trainable`.
- Run from the repo root with your virtual environment activated.
- If `cogames` is not found, activate `.venv` and retry.
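If you prefer to check this from Python (for example, inside a notebook cell), `shutil.which` from the standard library reports whether the `cogames` executable is visible from the current environment. This is only an optional convenience, not part of the tutorial workflow:

```python
# Optional sanity check: is the `cogames` CLI visible from this environment?
import shutil

cli_path = shutil.which("cogames")
print(cli_path if cli_path else "cogames not found -- activate .venv and retry")
```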
Optional: confirm the CLI is available:

```bash
cogames --help
```

The scripted template is a rule-based policy you can edit by hand. It runs immediately with `cogames play` and does not require training.
```bash
cogames tutorial make-policy --scripted -o my_scripted_policy.py
```

Expected output (example):
```text
Scripted policy template copied to: /path/to/your/project/my_scripted_policy.py
Play with: cogames play -m arena -p class=my_scripted_policy.StarterPolicy
```
Note: Replace /path/to/your/project/ with your local repo path.
Common pitfalls:
- These commands overwrite existing files; use `-o` to choose a new filename.
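Before running it, here is a toy illustration of what "rule-based" means in this context: a plain Python class whose action method maps the current observation to an action using ordinary if/else logic. The class name, method signature, and observation keys below are hypothetical and chosen only for illustration; the actual interface is whatever the generated `my_scripted_policy.py` defines.

```python
# Hypothetical illustration only -- not the contents of the generated template.
# It shows the general idea of a scripted policy: hard-coded rules, no training.

class ToyScriptedPolicy:
    """Chooses an action from hand-written rules instead of a learned model."""

    def act(self, observation: dict) -> str:
        # Hypothetical observation keys; the real template documents its own.
        if observation.get("energy", 1.0) < 0.2:
            return "rest"       # low energy -> recover first
        if observation.get("enemy_visible", False):
            return "retreat"    # simple avoidance rule
        return "explore"        # default behavior


if __name__ == "__main__":
    policy = ToyScriptedPolicy()
    print(policy.act({"energy": 0.1}))          # -> rest
    print(policy.act({"enemy_visible": True}))  # -> retreat
```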
Run the scripted policy (no training required):
```bash
cogames play -m arena -p class=my_scripted_policy.StarterPolicy
```

Expected terminal output (example):
```text
Playing arena
Max Steps: 1000, Render: gui
Initializing Mettascope...
Episode Complete!
Steps: <N>
Total Rewards: [<value>]
Final Reward Sum: <value>
```
The trainable template defines a neural policy. You edit the model or logic, then train it with `cogames tutorial train`, and run it using the saved weights.
```bash
cogames tutorial make-policy --trainable -o my_trainable_policy.py
```

Expected output (example):
```text
Trainable policy template copied to: /path/to/your/project/my_trainable_policy.py
Train with: cogames tutorial train -m arena -p class=my_trainable_policy.MyTrainablePolicy --steps 2000
```
Note: Replace /path/to/your/project/ with your local repo path.
Common pitfalls:
- These commands overwrite existing files; use `-o` to choose a new filename.
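"Neural policy" here means the action choice comes from a network whose weights are learned during training and then saved to a `.pt` checkpoint. The sketch below is a hypothetical illustration of that shape only, written in PyTorch (an assumption suggested by the `.pt` extension); the real architecture and interface live in the generated `my_trainable_policy.py`.

```python
# Hypothetical sketch of a trainable (neural) policy -- not the generated template.
# A small network maps an observation vector to one score per action; training
# adjusts the weights, which are then saved to a .pt checkpoint.
import torch
import torch.nn as nn


class ToyTrainablePolicy(nn.Module):
    def __init__(self, obs_dim: int = 16, num_actions: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64),
            nn.ReLU(),
            nn.Linear(64, num_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)  # one logit per action


if __name__ == "__main__":
    policy = ToyTrainablePolicy()
    logits = policy(torch.randn(1, 16))
    print("chosen action index:", int(logits.argmax(dim=-1)))
```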
Train and run the trainable policy:
```bash
cogames tutorial train -m arena -p class=my_trainable_policy.MyTrainablePolicy --steps 2000
cogames play -m arena -p class=my_trainable_policy.MyTrainablePolicy,data=./train_dir/<run_id>/model_000001.pt
```

Note: Add `--steps` for quick tutorial runs; the default is very large.
Expected terminal output from training (example):

```text
Training on mission: arena
...progress logs...
Training complete. Checkpoints saved to: ./train_dir
Final checkpoint: ./train_dir/<run_id>/model_000001.pt
```
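The `<run_id>` directory name is generated at training time, so look it up before playing. The snippet below is a small convenience helper (not part of the cogames CLI); it only assumes checkpoints are written under `./train_dir/` as shown in the output above:

```python
# Print the most recently written checkpoint under ./train_dir (assumed layout).
from pathlib import Path

checkpoints = sorted(Path("./train_dir").rglob("*.pt"), key=lambda p: p.stat().st_mtime)
if checkpoints:
    print("Latest checkpoint:", checkpoints[-1])
else:
    print("No checkpoints found yet -- run `cogames tutorial train` first.")
```

Pass the printed path as the `data=` value in the play command above.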
Expected terminal output from playing (example):
```text
Playing arena
Max Steps: 1000, Render: gui
Initializing Mettascope...
Episode Complete!
Steps: <N>
Total Rewards: [<value>]
Final Reward Sum: <value>
```
You can edit the generated policy files to define your own behavior. For scripted policies, change the rules directly and run immediately. For trainable policies, modify the model or training logic, then retrain and play with the new checkpoint.
- Scripted = rule-based, runs immediately without training.
- Trainable = neural policy, train with `cogames tutorial train`.
- Scripted: run the `cogames play ...` command printed by the CLI.
- Trainable: run the `cogames tutorial train ...` command printed by the CLI.