4. Experiments 🧪
John Yang edited this page Jun 27, 2023 · 1 revision
The easiest way to recreate the experiments is to clone this repository and run the corresponding scripts as listed below. If you'd like to build from source, make sure to have:

- The `experiments/` and `scripts/` folders downloaded
- The `intercode` package built from source or installed from PyPI
- The `pip` dependencies listed in `environment.yml` installed
- A `keys.cfg` file in the root directory of the repository: copy/paste and fill out the following template (not all keys are necessary if you are only interested in running a subset of the models):
```
OPENAI_API_KEY: '<OpenAI Key Here>'
HF_TOKEN: '<HuggingFace Token Here>'
HF_API_URL: '<HuggingFace Endpoint URL>'
PALM_API_KEY: '<PaLM Key Here>'
```
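If you want to sanity-check your `keys.cfg` before launching any experiments, a minimal parser along these lines can read the template shown above into a dictionary. This is an illustrative sketch only, not the loader that `intercode` itself uses:

```python
def parse_keys_cfg(path):
    """Parse a keys.cfg file of `KEY: 'value'` lines into a dict.

    Illustrative helper: assumes one key per line in the format
    shown in the template above, with single- or double-quoted values.
    """
    keys = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Skip blank lines and comments
            if not line or line.startswith("#"):
                continue
            name, _, value = line.partition(":")
            keys[name.strip()] = value.strip().strip("'\"")
    return keys
```

Checking that `parse_keys_cfg("keys.cfg")` contains the keys for the models you plan to run is a quick way to catch a malformed config before an experiment fails mid-run.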
The following table lists each runnable experiment, the script that invokes it, and its implementation file (each of which accepts a set of flags).
| Experiment (Prompt Strategy) | Script | File |
|---|---|---|
| Try Again \* | `./scripts/expr_multi_turn.sh`<br>`./scripts/expr_n_turn_others.sh` | `./experiments/eval_n_turn.py`<br>`./experiments/eval_n_turn_others.py` |
| Plan & Solve [1] \*\* | `./scripts/expr_plan_solve.sh` | `./experiments/eval_plan_solve.py` |
| ReAct [2] \*\* | `./scripts/expr_react.sh` | `./experiments/eval_react.py` |
- \* The `eval_n_turn` file is written to handle running Try Again experiments for the GPT family, while `eval_n_turn_others` is for running the PaLM and open-source models mentioned in the paper.
- \*\* At the moment, these experiments have only been test-run with the GPT-3.5 model.
The output `.json` files containing the reward and interaction history for the task instances of each experiment discussed in the main paper can be found in the `./data/results/` folder.
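To inspect those output files, a small helper like the following can compute an aggregate score. The schema it assumes (a JSON object whose entries each carry a numeric `reward` field) is a guess based on the description above, so check the actual files in `./data/results/` and adjust accordingly:

```python
import json
from pathlib import Path


def mean_reward(results_path):
    """Return the mean reward across task instances in a results file.

    Assumed schema (verify against the real files): a JSON object
    mapping task-instance ids to records that include a numeric
    "reward" field.
    """
    data = json.loads(Path(results_path).read_text())
    rewards = [entry["reward"] for entry in data.values()]
    return sum(rewards) / len(rewards) if rewards else 0.0
```

Running this over each file in `./data/results/` gives a quick per-experiment summary without re-running any models.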