Stop importing object_library from calling the Lightwheel API#741
Stop importing object_library from calling the Lightwheel API#741cvolkcvolk wants to merge 5 commits into
Conversation
There was a problem hiding this comment.
IsaacLab Review Bot Summary
PR #741: Defer Lightwheel SDK calls in asset classes to first-access
Overview
This PR introduces a LightwheelLazyPath descriptor to defer Lightwheel SDK calls from module import time to first attribute access. This is a well-motivated fix for a production issue where transient SDK timeouts during module reload would leave asset classes unregistered.
What I Reviewed
- New file:
isaaclab_arena/assets/lightwheel_lazy.py(117 lines) - Modified file:
isaaclab_arena/assets/object_library.py(6 asset classes updated)
Findings
✅ Strengths
-
Clean descriptor pattern: The
LightwheelLazyPathdescriptor correctly implements__get__for lazy evaluation and caching. -
Good error handling: SDK failures are wrapped with contextual information (asset identifier, registry type), making debugging easier than the previous bare SDK exceptions.
-
Minimal behavioral change: Existing users see no change — paths still resolve via the SDK, just at first-access instead of import-time.
-
Well-documented: Excellent docstrings explaining the problem, solution, and usage patterns.
-
Correct use of both SDK signatures: Properly supports both
file_name=andregistry_name=variants as used by the different asset classes.
📝 Minor Observations (Not Blocking)
-
Thread safety: The lazy initialization in
__get__has a benign race condition — concurrent first-accesses could both call the SDK before_cached_pathis set. This is acceptable here since:- Both calls would return identical results
- Worst case is duplicate network calls, not incorrect state
- Adding thread locks would increase complexity for minimal benefit in this context
If thread safety ever becomes a concern (e.g., heavily parallel asset loading), a simple
threading.Lockcould be added. -
Type annotation: The return type of
__get__could be explicitly annotated asstrfor clarity, though this is stylistic.
Verdict
Looks good to merge. The implementation is clean, well-documented, and correctly addresses the stated problem. The architectural decision to use a descriptor is appropriate and Pythonic.
🤖 Automated review by IsaacLab Review Bot
Update (f2c94bc): Reviewed incremental changes between 35f0a18..f2c94bc.
Additional Changes Since Last Review
This update significantly expands the PR scope beyond the original Lightwheel lazy loading:
1. New Lightwheel Utilities (Related to PR Title)
lightwheel_utils.py: Addsacquire_lightwheel_asset()with retry logic and timeout handling for SDK calls. Good defensive programming for network-based asset fetching.background_library.py: UpdatedRobocasaKitchento use the newacquire_lightwheel_asset()wrapper.
2. Composite Task System (Major Addition)
composite_task_base.py(~400 lines): New base class for unordered composite tasks where subtask completion order does not matter.sequential_task_base.py: Refactored to extendCompositeTaskBase, now enforces ordering constraints by overriding the success function.- Renames internal state variable
_subtask_success_state→_subtask_ever_succeededfor clarity. - Adds support for
desired_subtask_success_statewithNoneentries as "do not care" wildcards.
3. Relation Solver Refactoring
relation_solver_interface.py(~166 lines): Extracts relation-solving logic fromArenaEnvBuilderinto a standalone module for reusability.arena_env_builder.py: Significantly simplified by delegating to the new interface.
4. Documentation Updates
- Multiple
.rstfiles updated for XR/CloudXR teleoperation workflows. - New images for second viewport monitoring.
5. Test Coverage
- New tests for
CompositeTaskBase,SequentialTaskBase,LightwheelLazyPath, andrelation_solver_interface. - Tests renamed from "pooled_placer" to "pooled_object_placer" for consistency.
Observations
-
Scope creep: The PR title says "Defer Lightwheel SDK calls" but now includes a substantial task system refactor and documentation updates. Consider splitting into separate PRs for cleaner history.
-
The Lightwheel changes are solid: The
acquire_lightwheel_asset()wrapper with retry/timeout logic is a good addition that complements the lazy descriptor. -
CompositeTaskBase looks well-designed: Clean separation between ordered (sequential) and unordered composite task semantics.
Verdict
Changes look good technically. The code is well-tested and the refactoring improves modularity. The main concern is scope — this PR now contains three distinct features. If the team is okay with the combined scope, it is ready to merge.
🤖 Automated review by IsaacLab Review Bot
Update (47c6d74): Reviewed incremental changes between f2c94bc..47c6d74.
Changes in This Update
This commit simplifies the LightwheelLazyPath descriptor API and adds explicit file_type parameters:
1. API Simplification in lightwheel_lazy.py
The descriptor constructor is now a simple **kwargs passthrough:
- Before: Explicit
registry_type,file_name,registry_name,file_typeparameters with validation - After:
__init__(self, **acquire_by_registry_kwargs)— just stores and forwards kwargs directly
This is a nice simplification that:
- Removes coupling between the descriptor and SDK parameter names
- Makes adding new SDK parameters easier (no descriptor changes needed)
- Trades compile-time argument checking for runtime SDK validation
Trade-off note: The previous version had an assert ensuring exactly one of file_name or registry_name was provided. This validation is now deferred to the SDK call. Both approaches are valid.
2. Explicit file_type="USD" in object_library.py
All LightwheelLazyPath usages now explicitly specify file_type="USD":
Microwave,CoffeeMachine,Broccoli,SweetPotato,Jug,BeerBottle
Previously file_type="USD" was the default in the descriptor; now that defaults are removed, it must be explicit. This is cleaner — no hidden defaults.
3. Import Ordering Fix
Sorted imports in object_library.py — LightwheelLazyPath import moved to alphabetical position.
Verdict
Good incremental improvement. The kwargs-passthrough pattern is cleaner and more maintainable. Explicit file_type parameters improve readability. No concerns with this update.
🤖 Automated review by IsaacLab Review Bot
Six asset classes in ``object_library`` (Microwave, CoffeeMachine, Broccoli, SweetPotato, Jug, BeerBottle) resolved their USD paths via ``object_loader.acquire_by_registry(...)`` in the class body. Class bodies are evaluated at module import time, so importing or reloading ``object_library`` made blocking HTTP calls to the Lightwheel API. The SDK has a 10-second read timeout. If any one call timed out (or hit any other transient error), the class body raised, the module import or reload failed halfway, and every asset class defined later in the same file was left undefined and unregistered. We hit this in a long multi-job eval sweep: a transient Lightwheel stall during ``importlib.reload(object_library)`` broke ``RubiksCubeHot3DRobolab``'s registration alongside the actual culprit, and subsequent jobs crashed with a misleading "component rubiks_cube_hot3d_robolab not found" error. Introduce ``LightwheelLazyPath``: a class-attribute descriptor that defers the SDK call from class-body evaluation to first attribute access. Resolved paths are cached per descriptor instance, so the SDK is called at most once per asset per process. Failures are re-raised at use-time with the asset's identifier, so a future broken asset surfaces where it's being used rather than corrupting unrelated registrations elsewhere. After this change, ``object_library`` imports without network access, all ``@register_asset`` decorators run, and the registry is fully populated regardless of Lightwheel API health. Existing users of these six assets see no behavioral change — first access still resolves the path, just slightly later in the lifecycle. Signed-off-by: Clemens Volk <cvolk@nvidia.com>
PR #731 added a shared ``acquire_lightwheel_asset`` helper that wraps ``object_loader.acquire_by_registry`` calls with a longer scoped timeout (60 s vs. the SDK default 10 s) and a small retry budget for transient timeout failures. After the lazy descriptor lands, the moment Lightwheel USD paths are actually fetched moves from class-body evaluation to first attribute access, but the call itself still benefits from the resilience work. Route ``LightwheelLazyPath.__get__`` through the helper so the two fixes compose: zero network at import, retry-on-timeout on actual use. The descriptor still wraps the resolved call in a ``RuntimeError`` that names the failing asset, so caller-facing error messages remain specific even when retries have been exhausted. Signed-off-by: Clemens Volk <cvolk@nvidia.com>
35f0a18 to
4152e55
Compare
Move the query-kwargs assembly and retry-helper description string from ``__get__`` into ``__init__`` — they only depend on constructor arguments and don't need to be recomputed on every attribute access. Drop the ``try / except → RuntimeError`` wrapper: the SDK already includes the registry parameters in its own error message, and ``acquire_lightwheel_asset`` already logs the asset description in its retry-attempt lines, so the wrap added no information that wasn't already in the failure-path logs. ``__get__`` is now just a cache check, the lazy SDK import, and the retry-helper call. Signed-off-by: Clemens Volk <cvolk@nvidia.com>
Drop the module-level docstring (history, before/after example, rationale) and reduce the class docstring to a one-liner. The PR description carries the bug-discovery write-up and the rationale; the code itself is small enough that the new one-line summary is sufficient. Signed-off-by: Clemens Volk <cvolk@nvidia.com>
Drop the named ``registry_type``/``file_name``/``registry_name``/ ``file_type`` parameters on ``LightwheelLazyPath`` along with the assert that enforced exactly one of file_name/registry_name. The descriptor now accepts ``**acquire_by_registry_kwargs`` and forwards them verbatim to the SDK; mis-use surfaces from the SDK at first access rather than from a parallel guard. Restore the explicit ``file_type="USD"`` argument at the six call sites in ``object_library.py`` so the descriptor no longer needs a default to keep them short. Signed-off-by: Clemens Volk <cvolk@nvidia.com>
Greptile SummaryThis PR removes six unconditional Lightwheel HTTPS calls that were triggered on
Confidence Score: 4/5Safe to merge for the primary goal of eliminating import-time API calls; the main instantiation path works correctly. The core logic is sound: the descriptor fires on first isaaclab_arena/assets/lightwheel_lazy.py — the descriptor's Important Files Changed
Sequence DiagramsequenceDiagram
participant C as Caller
participant M as Microwave.__init__
participant L as LibraryObject.__init__
participant D as LightwheelLazyPath.__get__
participant SDK as lightwheel_sdk
participant O as Object.__init__
C->>M: Microwave()
M->>L: super().__init__(...)
L->>D: self.usd_path (descriptor access)
alt _cached_path is set
D-->>L: return cached string
else first access
D->>SDK: acquire_lightwheel_asset(...)
SDK-->>D: (file_path, _, _)
D->>D: "self._cached_path = file_path"
D-->>L: return file_path
end
L->>O: "super().__init__(usd_path=file_path, ...)"
O->>O: "self.usd_path = file_path (instance attr)"
O-->>C: "instance ready (instance.usd_path = string)"
Reviews (1): Last reviewed commit: "Make LightwheelLazyPath a pure kwargs pa..." | Re-trigger Greptile |
| def __get__(self, instance, owner): | ||
| if self._cached_path is not None: |
There was a problem hiding this comment.
Missing
instance is None guard in __get__
Without a check for instance is None, accessing usd_path directly on the class (Microwave.usd_path) triggers a live Lightwheel API call. This breaks the lazy-loading contract for any code that introspects the class attribute — e.g., registry tooling, serialization, or a simple cls.usd_path is not None guard — all of which will now make network requests. The standard descriptor convention is to return self when instance is None.
| def __init__(self, **acquire_by_registry_kwargs): | ||
| self._kwargs = acquire_by_registry_kwargs | ||
| self._cached_path: str | None = None |
There was a problem hiding this comment.
None used as cache sentinel is fragile
_cached_path is initialized to None and the guard is if self._cached_path is not None. If the SDK ever returns None for a valid (but empty/missing) path without raising, the value would be cached as None but the guard would miss it on every subsequent access, causing repeated API calls. lightwheel_utils.py already uses a _MISSING = object() sentinel for exactly this pattern — applying the same approach here avoids the ambiguity between "not yet resolved" and a genuinely None result.
| def __init__(self, **acquire_by_registry_kwargs): | |
| self._kwargs = acquire_by_registry_kwargs | |
| self._cached_path: str | None = None | |
| _UNSET = object() # sentinel distinct from None | |
| def __init__(self, **acquire_by_registry_kwargs): | |
| self._kwargs = acquire_by_registry_kwargs | |
| self._cached_path: str | object = LightwheelLazyPath._UNSET |
| def __get__(self, instance, owner): | ||
| if self._cached_path is not None: | ||
| return self._cached_path |
There was a problem hiding this comment.
Cache guard must also be updated to use the sentinel
This is the companion change to replacing None with _UNSET as the sentinel (see the __init__ comment above).
| def __get__(self, instance, owner): | |
| if self._cached_path is not None: | |
| return self._cached_path | |
| def __get__(self, instance, owner): | |
| if instance is None: | |
| return self | |
| if self._cached_path is not LightwheelLazyPath._UNSET: | |
| return self._cached_path |
Summary
Importing
isaaclab_arena.assets.object_library— anyone instantiating an Arena asset, Lightwheel or not — made six unconditional HTTPS calls to the Lightwheel API at module-import time.This PR introduces
LightwheelLazyPath: a class-attribute descriptor that defers the SDK call to first access.