
Benchmarking imitation

The src/imitation/scripts/config/tuned_hps directory provides the tuned hyperparameter configs for benchmarking imitation. For v0.4.0, these correspond to the hyperparameters used in the paper "imitation: Clean Imitation Learning Implementations".

Configuration files can be loaded either from the CLI or from the Python API.

Single benchmark

To run a single benchmark from the command line:

python -m imitation.scripts.<train_script> <algo> with <algo>_<env>

train_script can be either (1) train_imitation, with algo set to bc or dagger, or (2) train_adversarial, with algo set to gail or airl. env can be one of seals_ant, seals_half_cheetah, seals_hopper, seals_swimmer, or seals_walker. Hyperparameters for other environments have not been tuned yet; you may still get reasonable performance by reusing hyperparameters tuned for a similar environment, or you can tune them yourself with the tuning script (see Tuning Hyperparameters below).
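
For example, to train GAIL on seals_half_cheetah with its tuned hyperparameters (an illustrative instantiation of the template above; any supported algo/env pair works the same way):

python -m imitation.scripts.train_adversarial gail with gail_seals_half_cheetah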

To view the results:

python -m imitation.scripts.analyze analyze_imitation with \
    source_dir_str="output/sacred" table_verbosity=0  \
    csv_output_path=results.csv \
    run_name="<name>"

To run a single benchmark from Python, pass the named config to the corresponding Sacred experiment:

...
from imitation.scripts.<train_script> import <train_ex>
<train_ex>.run(command_name="<algo>", named_configs=["<algo>_<env>"])
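
For instance, instantiating the template for GAIL on seals_half_cheetah (this assumes the Sacred experiment object exported by train_adversarial is named train_adversarial_ex, following the <train_ex> placeholder; check the script for the exact name):

from imitation.scripts.train_adversarial import train_adversarial_ex

# Train GAIL with the hyperparameters tuned for seals_half_cheetah.
run = train_adversarial_ex.run(
    command_name="gail", named_configs=["gail_seals_half_cheetah"]
)
print(run.result)  # Sacred stores the command's return value on the Run object.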

Entire benchmark suite

Running locally

To generate the commands to run the entire benchmarking suite with multiple random seeds:

python experiments/commands.py \
  --name=<name> \
  --cfg_pattern "benchmarking/example_*.json" \
  --seeds 0 1 2 \
  --output_dir=output

To run those commands in parallel:

python experiments/commands.py \
  --name=<name> \
  --cfg_pattern "benchmarking/example_*.json" \
  --seeds 0 1 2 \
  --output_dir=output | parallel -j 8

(On macOS you may need to install GNU parallel first, e.g. with brew install parallel.)

Running on Hofvarpnir

To generate the commands for the Hofvarpnir cluster:

python experiments/commands.py \
  --name=<name> \
  --cfg_pattern "benchmarking/example_*.json" \
  --seeds 0 1 2 \
  --output_dir=/data/output \
  --remote

To run those commands, pipe them into bash:

python experiments/commands.py \
  --name=<name> \
  --cfg_pattern "benchmarking/example_*.json" \
  --seeds 0 1 2 \
  --output_dir=/data/output \
  --remote | bash

Results

To produce a table with all the results:

python -m imitation.scripts.analyze analyze_imitation with \
    source_dir_str="output/sacred" table_verbosity=0  \
    csv_output_path=results.csv \
    run_name="<name>"

To compute a p-value testing whether the differences from the results reported in the paper are statistically significant:

python -m imitation.scripts.compare_to_baseline results.csv

Tuning Hyperparameters

The hyperparameters of any algorithm in imitation can be tuned using src/imitation/scripts/tuning.py. The benchmarking hyperparameter configs above were generated by this script, using the search spaces defined in scripts/config/tuning.py.

The tuning script proceeds in two phases:

  1. Tune the hyperparameters using the search space provided.
  2. Re-evaluate the best config found in the first phase (the one with the maximum mean return) on a separate set of seeds, and report the mean and standard deviation of those trials.
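
Conceptually, the second phase boils down to something like the following sketch (not the actual tuning.py code; the trials structure and evaluate function are hypothetical stand-ins):

import statistics

def reevaluate_best(trials, evaluate, eval_seeds):
    # Phase 1 output: each trial pairs a hyperparameter config with the
    # returns observed while tuning; pick the config with the highest mean return.
    best = max(trials, key=lambda t: statistics.mean(t["returns"]))
    # Phase 2: re-run that config on a separate set of seeds.
    returns = [evaluate(best["config"], seed=s) for s in eval_seeds]
    # Report the mean and standard deviation of these re-evaluation trials.
    return statistics.mean(returns), statistics.stdev(returns)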

To use it with the default search space:

python -m imitation.scripts.tuning with <algo> 'parallel_run_config.base_named_configs=["<env>"]'

In this command:

  • <algo> provides the default search space and settings for the specific algorithm, as defined in scripts/config/tuning.py.
  • <env> sets the environment to tune the algorithm in. Environment named configs are defined in the algo-specific scripts/config/train_[adversarial|imitation|preference_comparisons|rl].py files. For the already tuned environments, use the <algo>_<env> named configs here, as in the example below.
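
For example, to tune GAIL starting from the config already tuned for seals_half_cheetah (illustrative; substitute any supported algorithm and environment):

python -m imitation.scripts.tuning with gail 'parallel_run_config.base_named_configs=["gail_seals_half_cheetah"]'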

See the documentation in scripts/tuning.py and scripts/parallel.py for the many other command-line arguments that control tuning behavior.