[release/0.2.1] Change the prompt for the finetuning #722
Review Summary
This PR simplifies the language instruction prompt from a detailed description to a more concise task statement across two configuration files and the documentation (three files in total).
Changes Reviewed
| File | Change |
|---|---|
| docs/pages/example_workflows/static_apple/step_3_policy_training.rst | Updated documentation prompt |
| isaaclab_arena_gr00t/lerobot/config/g1_static_apple_config.yaml | Updated training config prompt |
| isaaclab_arena_gr00t/policy/config/g1_static_apple_gr00t_closedloop_config.yaml | Updated inference config prompt |
Observations
✅ Consistency: The prompt change is applied consistently across all relevant files (documentation, LeRobot training config, and closed-loop policy config).
✅ Commit hygiene: Good to see the follow-up commit removing the trailing period for cleaner prompt formatting.
Clarification Request
The prompt is changing from:
"Pick up the apple from the shelf and place it onto the plate on the same shelf next to it."
to:
"move the apple to the plate"
Question: Could you confirm this matches the prompt used during the finetuning of checkpoint-20000? Language-conditioned policies can be sensitive to prompt wording, so ensuring the inference prompt matches the training prompt is important for optimal performance.
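The consistency this question asks about can be checked mechanically. The following is a minimal sketch, not the project's tooling: it assumes the prompt sits on a single line under a `language_instruction` key, and the helper names `extract_prompt` and `prompts_match` as well as the inline YAML strings are hypothetical stand-ins for the real config files.

```python
import re

# Assumption: the prompt is stored as `language_instruction: <text>` on one line.
# A regex keeps the sketch free of third-party YAML dependencies.
_PROMPT_RE = re.compile(r"^\s*language_instruction\s*:\s*(.+?)\s*$", re.MULTILINE)


def extract_prompt(yaml_text: str) -> str:
    """Return the prompt string found in the given YAML text (hypothetical helper)."""
    match = _PROMPT_RE.search(yaml_text)
    if match is None:
        raise ValueError("no language_instruction key found")
    value = match.group(1)
    # Strip one pair of matching surrounding quotes, if present.
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
        value = value[1:-1]
    return value


def prompts_match(*yaml_texts: str) -> bool:
    """Return True when every config carries exactly the same prompt string."""
    prompts = {extract_prompt(text) for text in yaml_texts}
    return len(prompts) == 1


# Illustrative stand-ins for the training and inference configs.
training_cfg = 'language_instruction: "move the apple to the plate"\n'
inference_cfg = "language_instruction: move the apple to the plate\n"
# A trailing period is enough to make the strings differ.
stale_cfg = 'language_instruction: "move the apple to the plate."\n'

print(prompts_match(training_cfg, inference_cfg))
print(prompts_match(training_cfg, stale_cfg))
```

The comparison is deliberately exact (no case folding or punctuation stripping), since the concern raised here is that even small wording differences may matter to a language-conditioned policy.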
This is an automated review. Please address any questions before merging.
Update (4ab975d): Reviewed incremental changes from 23c7a78 to 4ab975d.
Additional Changes in This Push
This update contains significant documentation improvements across the static apple workflow and related teleoperation guides:
Documentation restructuring (step_1_environment_setup.rst):
- Condensed the verbose code walkthrough into a cleaner component table format
- Moved implementation details to a concise "Environment Composition" section
- Removed inline code snippets in favor of pointing to the source file
CloudXR network requirements (multiple files):
- Added a consistent `.. important::` admonition at the top of teleoperation steps to highlight network requirements upfront
- Removed scattered inline network setup notes that were redundant with the new consolidated requirement
Training workflow updates (step_3_policy_training.rst):
- Changed recommended GPU config from 8x L40s to single RTX 6000 Ada with adjusted batch size (12 vs 96)
- Added dropdown for pre-generated dataset download from HuggingFace
- Added explicit host path guidance for `DATASET_DIR` and `MODELS_DIR` outside Docker
Evaluation workflow updates (step_4_evaluation.rst):
- Added dropdown for pre-trained model download from HuggingFace
- Added new "Parallel Environments Evaluation" section for multi-env testing
- Added note about Kit permission errors and stale cache issues
- Improved troubleshooting guidance for action shape mismatches
Minor fixes:
- Updated GR00T repo links to point to specific commit `4b1dca9d88d2a0b9ea5a65aa61c82ff89f5c4f0e`
- Fixed typo "GR00T-flavoured" → generic "Arena"
- Added headset view reset instructions for Meta Quest and PICO 4 Ultra
✅ All changes look good — documentation improvements are well-organized and the pre-recorded dataset/checkpoint downloads will make the tutorial more accessible.
23c7a78 to 7080605
LGTM 🚀
Greptile Summary
This PR aligns the static apple workflow's
Confidence Score: 4/5
Safe to merge after fixing the stale explanatory prose in the docs that still describes the old verbose instruction.
The two YAML config changes are correct and consistent. The docs file updates the displayed config snippet correctly, but line 110 still reads "describing the same-shelf placement (no walking, no second table)", a parenthetical that was accurate for the old prompt but actively misleads readers now that the instruction is simply "move the apple to the plate".
docs/pages/example_workflows/static_apple/step_3_policy_training.rst: line 110 prose needs updating to reflect the new short prompt.
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["language_instruction\n'move the apple to the plate'"] --> B["g1_static_apple_config.yaml\n(LeRobot training config)"]
A --> C["g1_static_apple_gr00t_closedloop_config.yaml\n(closed-loop eval config)"]
A --> D["step_3_policy_training.rst\n(docs snippet)"]
B --> E["GR00T N1.7 fine-tuning"]
C --> F["Closed-loop policy evaluation"]
D --> G["User docs\n⚠️ line 110 prose still describes old prompt"]
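The stale-prose issue flagged for line 110 is the kind of leftover a simple text scan can catch. Below is a rough sketch under stated assumptions: the two phrases are taken from the quoted parenthetical, while the helper name `find_stale_lines` and the sample document text are made up for illustration and are not the actual contents of `step_3_policy_training.rst`.

```python
from typing import List, Tuple

# Phrases quoted in the review as leftovers from the old, verbose prompt.
STALE_PHRASES = ("same-shelf placement", "no walking, no second table")


def find_stale_lines(doc_text: str, phrases=STALE_PHRASES) -> List[Tuple[int, str]]:
    """Return (1-based line number, stripped line) pairs mentioning a stale phrase."""
    hits = []
    for number, line in enumerate(doc_text.splitlines(), start=1):
        lowered = line.lower()
        if any(phrase in lowered for phrase in phrases):
            hits.append((number, line.strip()))
    return hits


# Hypothetical excerpt: only the second line still describes the old prompt.
sample_doc = (
    "Fine-tune the policy with the config below.\n"
    "The prompt describes the same-shelf placement (no walking, no second table).\n"
    "language_instruction: move the apple to the plate\n"
)

hits = find_stale_lines(sample_doc)
for number, line in hits:
    print(f"line {number}: {line}")
```

A scan like this only finds phrases that are already known to be stale, so it complements a manual read of the surrounding prose rather than replacing it.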
…ing (#743)
cp #722 from release/0.2.1

Align static apple prompt

- Reason: align the simulated static apple workflow with the same prompt used for the real setup.
- Changed the GR00T LeRobot training config prompt to `move the apple to the plate`.
- Updated the closed-loop evaluation config to use the same prompt.
- Updated the static apple policy-training docs snippet so users finetune with the matching prompt.
- Impact: training and evaluation now use consistent language conditioning across sim and real workflows.

---------

## Summary
Short description of the change (max 50 chars)

## Detailed description
- What was the reason for the change?
- What has been changed?
- What is the impact of this change?

Co-authored-by: Xinjie Yao <xyao@nvidia.com>
Summary
Align static apple prompt
Detailed description
move the apple to the plate.