
[release/0.2.1] Change the prompt for the finetuning #722

Merged
xyao-nv merged 3 commits into release/0.2.1 from pulkitg/release/0.2.1/change_trained_prompt on May 29, 2026


@pulkitg01 (Collaborator) commented May 26, 2026

Summary

Align static apple prompt

Detailed description

  • Reason: align the simulated static apple workflow with the same prompt used for the real setup.
  • Changed the GR00T LeRobot training config prompt to "move the apple to the plate".
  • Updated the closed-loop evaluation config to use the same prompt.
  • Updated the static apple policy-training docs snippet so users finetune with the matching prompt.
  • Impact: training and evaluation now use consistent language conditioning across sim and real workflows.
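The consistency described above can be checked mechanically. The following is a minimal sketch, assuming both configs expose the prompt as a single-line `language_instruction` key (the key name appears in the Greptile summary; the surrounding YAML layout, the sample strings, and the helper names are hypothetical). It uses a regular expression rather than a YAML parser only to stay dependency-free.

```python
import re

# Hypothetical stand-ins for the two config files. Only the
# `language_instruction` key and the prompt text come from this PR.
TRAINING_CONFIG = 'language_instruction: "move the apple to the plate"\n'
EVAL_CONFIG = 'language_instruction: "move the apple to the plate"\n'

_PROMPT_RE = re.compile(r"^\s*language_instruction:\s*(.+?)\s*$", re.MULTILINE)


def read_language_instruction(yaml_text: str) -> str:
    """Return the language_instruction value with one pair of quotes removed."""
    match = _PROMPT_RE.search(yaml_text)
    if match is None:
        raise KeyError("language_instruction not found")
    value = match.group(1)
    # Strip a single pair of matching quotes, if present.
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
        value = value[1:-1]
    return value


def prompts_match(training_yaml: str, eval_yaml: str) -> bool:
    """True when both configs carry exactly the same prompt string."""
    return read_language_instruction(training_yaml) == read_language_instruction(eval_yaml)
```

In a real checkout the two strings would be read from the training and closed-loop config files listed in the review below. A proper YAML parser would be the safer choice if the key is nested or the value spans several lines.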

@pulkitg01 pulkitg01 self-assigned this May 26, 2026
@pulkitg01 pulkitg01 changed the title from "Change the prompt for the finetuning" to "[release/0.2.1] Change the prompt for the finetuning" May 26, 2026

@isaaclab-review-bot isaaclab-review-bot Bot left a comment


Review Summary

This PR simplifies the language instruction prompt from a detailed description to a more concise task statement across three files: two configuration files and one documentation page.

Changes Reviewed

| File | Change |
| --- | --- |
| docs/pages/example_workflows/static_apple/step_3_policy_training.rst | Updated documentation prompt |
| isaaclab_arena_gr00t/lerobot/config/g1_static_apple_config.yaml | Updated training config prompt |
| isaaclab_arena_gr00t/policy/config/g1_static_apple_gr00t_closedloop_config.yaml | Updated inference config prompt |

Observations

Consistency: The prompt change is applied consistently across all relevant files (documentation, LeRobot training config, and closed-loop policy config).

Commit hygiene: Good to see the follow-up commit removing the trailing period for cleaner prompt formatting.

Clarification Request

The prompt is changing from:

"Pick up the apple from the shelf and place it onto the plate on the same shelf next to it."

to:

"move the apple to the plate"

Question: Could you confirm this matches the prompt used during the finetuning of checkpoint-20000? Language-conditioned policies can be sensitive to prompt wording, so ensuring the inference prompt matches the training prompt is important for optimal performance.
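One way to act on this concern is a strict guard that refuses to run evaluation when the prompts differ. This is a sketch only: the function name is hypothetical, and nothing in this thread says the repository performs such a check. The comparison is deliberately exact, with no case folding or punctuation stripping, because the follow-up commit treated even a trailing period as worth changing.

```python
OLD_PROMPT = (
    "Pick up the apple from the shelf and place it onto the plate "
    "on the same shelf next to it."
)
NEW_PROMPT = "move the apple to the plate"


def assert_same_prompt(training_prompt: str, inference_prompt: str) -> None:
    """Raise ValueError unless the two prompts are character-for-character identical."""
    if training_prompt != inference_prompt:
        raise ValueError(
            f"prompt mismatch: trained on {training_prompt!r}, "
            f"evaluating with {inference_prompt!r}"
        )


# Passes silently: the prompts are identical.
assert_same_prompt(NEW_PROMPT, NEW_PROMPT)
```

Calling `assert_same_prompt(NEW_PROMPT, NEW_PROMPT + ".")` raises, since the trailing period makes the strings differ. Whether the policy's behavior actually changes with such a small difference is not established in this thread.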


This is an automated review. Please address any questions before merging.


Update (4ab975d): Reviewed incremental changes from 23c7a78 to 4ab975d.

Additional Changes in This Push

This update contains significant documentation improvements across the static apple workflow and related teleoperation guides:

Documentation restructuring (step_1_environment_setup.rst):

  • Condensed the verbose code walkthrough into a cleaner component table format
  • Moved implementation details to a concise "Environment Composition" section
  • Removed inline code snippets in favor of pointing to the source file

CloudXR network requirements (multiple files):

  • Added consistent .. important:: admonition at the top of teleoperation steps to highlight network requirements upfront
  • Removed scattered inline network setup notes that were redundant with the new consolidated requirement

Training workflow updates (step_3_policy_training.rst):

  • Changed the recommended GPU config from 8x L40s to a single RTX 6000 Ada, with the batch size adjusted accordingly (12 instead of 96)
  • Added dropdown for pre-generated dataset download from HuggingFace
  • Added explicit host path guidance for DATASET_DIR and MODELS_DIR outside Docker
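The batch-size numbers in the list above are consistent with keeping the per-GPU batch constant, although the review does not say so. A sketch of that arithmetic, assuming 96 was a global batch split evenly across the 8 GPUs (an assumption, not something this PR states; the helper name is hypothetical):

```python
def per_device_batch_size(global_batch_size: int, num_gpus: int) -> int:
    """Split a global batch evenly across GPUs, rejecting uneven splits."""
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    if global_batch_size % num_gpus != 0:
        raise ValueError("global batch size must divide evenly across GPUs")
    return global_batch_size // num_gpus


# Assumed reading of the old setup: 96 samples spread over 8 GPUs.
old_per_gpu = per_device_batch_size(96, 8)
# New setup from the review: batch size 12 on a single GPU.
new_per_gpu = per_device_batch_size(12, 1)
```

Under this reading both setups process the same number of samples per GPU per step, while the global batch shrinks by a factor of 8, so other hyperparameters may need revisiting.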

Evaluation workflow updates (step_4_evaluation.rst):

  • Added dropdown for pre-trained model download from HuggingFace
  • Added new "Parallel Environments Evaluation" section for multi-env testing
  • Added note about Kit permission errors and stale cache issues
  • Improved troubleshooting guidance for action shape mismatches

Minor fixes:

  • Updated GR00T repo links to point to specific commit 4b1dca9d88d2a0b9ea5a65aa61c82ff89f5c4f0e
  • Fixed typo "GR00T-flavoured" → generic "Arena"
  • Added headset view reset instructions for Meta Quest and PICO 4 Ultra

✅ All changes look good — documentation improvements are well-organized and the pre-recorded dataset/checkpoint downloads will make the tutorial more accessible.



@pulkitg01 pulkitg01 force-pushed the pulkitg/release/0.2.1/change_trained_prompt branch from 23c7a78 to 7080605 Compare May 26, 2026 22:14
@shanereetz

LGTM 🚀

@pulkitg01 pulkitg01 marked this pull request as ready for review May 29, 2026 21:32
@greptile-apps
Contributor

greptile-apps Bot commented May 29, 2026

Greptile Summary

This PR aligns the static apple workflow's language_instruction with the real-setup prompt by replacing the verbose sentence with the shorter "move the apple to the plate" across training config, evaluation config, and docs.

  • Training config (g1_static_apple_config.yaml) and evaluation config (g1_static_apple_gr00t_closedloop_config.yaml) both receive the updated prompt, ensuring consistent language conditioning between sim and real.
  • Documentation snippet (step_3_policy_training.rst) updates the displayed config, but line 110 retains stale prose that still characterises the old verbose instruction and will mislead readers.

Confidence Score: 4/5

Safe to merge after fixing the stale explanatory prose in the docs that still describes the old verbose instruction.

The two YAML config changes are correct and consistent. The docs file updates the displayed config snippet correctly, but line 110 still reads "describing the same-shelf placement (no walking, no second table)" — a parenthetical that was accurate for the old prompt but actively misleads readers now that the instruction is simply "move the apple to the plate".

docs/pages/example_workflows/static_apple/step_3_policy_training.rst — line 110 prose needs updating to reflect the new short prompt.

Important Files Changed

| Filename | Overview |
| --- | --- |
| docs/pages/example_workflows/static_apple/step_3_policy_training.rst | Updates the doc snippet's language_instruction to the new short prompt, but line 110 retains stale prose that still describes the old verbose instruction. |
| isaaclab_arena_gr00t/lerobot/config/g1_static_apple_config.yaml | Replaces the verbose language_instruction with the shorter "move the apple to the plate" to align training config with the real setup. |
| isaaclab_arena_gr00t/policy/config/g1_static_apple_gr00t_closedloop_config.yaml | Updates language_instruction in the closed-loop evaluation config to match the new training prompt, ensuring consistent language conditioning. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["language_instruction\n'move the apple to the plate'"] --> B["g1_static_apple_config.yaml\n(LeRobot training config)"]
    A --> C["g1_static_apple_gr00t_closedloop_config.yaml\n(closed-loop eval config)"]
    A --> D["step_3_policy_training.rst\n(docs snippet)"]
    B --> E["GR00T N1.7 fine-tuning"]
    C --> F["Closed-loop policy evaluation"]
    D --> G["User docs\n⚠️ line 110 prose still describes old prompt"]

Comments Outside Diff (1)

  1. docs/pages/example_workflows/static_apple/step_3_policy_training.rst, line 110 (link)

    P1 Stale prose describing the old prompt

    Line 110 still reads "describing the same-shelf placement (no walking, no second table)", which was written to characterise the previous verbose instruction. Now that language_instruction is "move the apple to the plate", that parenthetical no longer reflects what the instruction says and will mislead readers comparing the static-apple config against the loco-manipulation one.
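Leftovers like this can be caught with a simple phrase scan over the docs. The sketch below is illustrative: the function name, the sample text, and the phrase list are hypothetical, and the excerpt is not the real contents of the .rst file.

```python
def find_stale_lines(doc_text: str, stale_phrases: list[str]) -> list[tuple[int, str]]:
    """Return (1-based line number, phrase) for each stale phrase found."""
    hits = []
    for line_number, line in enumerate(doc_text.splitlines(), start=1):
        lowered = line.lower()
        for phrase in stale_phrases:
            # Case-insensitive substring match.
            if phrase.lower() in lowered:
                hits.append((line_number, phrase))
    return hits


# Illustrative excerpt only, not the actual documentation file.
SAMPLE_DOC = (
    "language_instruction: move the apple to the plate\n"
    "describing the same-shelf placement (no walking, no second table)\n"
)
# Phrases that characterise the old verbose prompt.
STALE_PHRASES = ["same-shelf placement", "Pick up the apple from the shelf"]

stale_hits = find_stale_lines(SAMPLE_DOC, STALE_PHRASES)
```

Here `stale_hits` contains a single entry for line 2 of the sample, matching the "same-shelf placement" phrase. A substring scan only finds wording that was anticipated, so it complements a manual read of the surrounding prose and does not replace it.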


Reviews (2): Last reviewed commit: "Merge branch 'release/0.2.1' into pulkit..."

@xyao-nv xyao-nv enabled auto-merge (squash) May 29, 2026 21:46
@xyao-nv xyao-nv merged commit a1ad172 into release/0.2.1 May 29, 2026
5 of 6 checks passed
pulkitg01 added a commit that referenced this pull request May 29, 2026
…ing (#743)

cp #722  from release/0.2.1

Align static apple prompt

- Reason: align the simulated static apple workflow with the same prompt
used for the real setup.
- Changed the GR00T LeRobot training config prompt to `move the apple to
the plate`.
- Updated the closed-loop evaluation config to use the same prompt.
- Updated the static apple policy-training docs snippet so users
finetune with the matching prompt.
- Impact: training and evaluation now use consistent language
conditioning across sim and real workflows.

---------

## Summary
Short description of the change (max 50 chars)

## Detailed description
- What was the reason for the change?
- What has been changed?
- What is the impact of this change?

Co-authored-by: Xinjie Yao <xyao@nvidia.com>
3 participants