Skip to content

[EXTRA][FEAT] StructuralErrorContext for structural-context-aware error reporting#586

Open
tqchen wants to merge 1 commit into
apache:mainfrom
tqchen:tvm-ffi-structured-error-chain-for-recursive-ir-visits
Open

[EXTRA][FEAT] StructuralErrorContext for structural-context-aware error reporting#586
tqchen wants to merge 1 commit into
apache:mainfrom
tqchen:tvm-ffi-structured-error-chain-for-recursive-ir-visits

Conversation

@tqchen
Copy link
Copy Markdown
Member

@tqchen tqchen commented May 13, 2026

Summary

When code throws while recursively visiting an Object tree, the resulting error message lacks position context — the user sees what went wrong but not WHERE in the tree. This change adds a typed payload to attach the visit chain to Error and resolve it to a structured access path.

API

  • StructuralErrorContext — typed payload in tvm/ffi/extra/, holding a reverse_visit_pattern (innermost-first breadcrumb trail) and an optional previous_error_context (preserved when wrapping a pre-existing payload in extra_context).

  • TVM_FFI_STRUCTURAL_VISIT_BEGIN() / TVM_FFI_STRUCTURAL_VISIT_END(node) macros instrument visit dispatch. On throw, they populate the StructuralErrorContext attached to the in-flight Error.

  • StructuralErrorContext::FindAccessPaths(root, ctx, allow_prefix_match=false) walks root via reflection and returns Array<reflection::AccessPath> for matched positions. Strict full-match default; opt-in allow_prefix_match=true for best-effort partial reporting.

  • StructuralErrorContext::TryGetFromError(err) inline helper fetches the payload from an Error.

ABI

No ABI changes. No modifications to c_api.h or error.h (existing Error::extra_context slot is reused; direct refcount-aware assignment via existing ObjectUnsafe patterns).

Test plan

  • Layer A — macro-builds-chain: `MacroBuildsChain`, `MacroPreExistingPayloadWrap`
  • Layer B — `FindAccessPaths` correctness: `BasicMatch`, `AccessKindCoverage`, `SparsePatternAnchors`, `PartialChain`, `EdgeCases`
  • `StructuralErrorContext.TryGetFromError` helper
  • All 363 existing C++ tests pass + 8 new tests = 371 total (2 pre-existing disabled, unrelated)
  • All 2301 Python tests pass
  • Pre-commit clean (27 hooks)

…or reporting

When code throws while recursively visiting an Object tree, the
resulting error message lacks position context — the user sees what
went wrong but not WHERE in the tree. This change adds a typed payload
to attach the visit chain to Error and resolve it to a structured
access path.

API
---
- `StructuralErrorContext` typed payload in `tvm/ffi/extra/`, holding a
  `reverse_visit_pattern` (innermost-first breadcrumb trail) and an
  optional `previous_error_context` (preserved when wrapping a
  pre-existing payload in `extra_context`).

- `TVM_FFI_STRUCTURAL_VISIT_BEGIN()` / `TVM_FFI_STRUCTURAL_VISIT_END(node)`
  macros instrument the visit dispatch. On throw, they populate the
  StructuralErrorContext attached to the in-flight Error.

- `StructuralErrorContext::FindAccessPaths(root, ctx, allow_prefix_match)`
  walks `root` via reflection and returns
  `Array<reflection::AccessPath>` for matched positions. Strict full-match
  by default; opt-in `allow_prefix_match=true` for best-effort partial
  reporting.

- `StructuralErrorContext::TryGetFromError(err)` inline helper fetches
  the payload from an Error.

Tests cover macro-builds-chain and FindAccessPaths-resolves-paths
separately, including sparse-pattern anchoring, CSE multi-match, partial
chains, and the `allow_prefix_match` toggle.
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces StructuralErrorContext, a mechanism to capture and report the chain of objects visited during recursive structural visits when an error occurs. It includes C++ and Cython additions to store this context in Error::extra_context and a finder to resolve these breadcrumbs into structured AccessPath instances. Feedback was provided regarding the efficiency of Array::push_back during exception unwinding, the safety of casting away const for metadata pointers, and potential type-compatibility issues when casting specialized arrays to Array<Any>.

if (ec) {
Optional<StructuralErrorContext> sec = ec->as<StructuralErrorContext>();
if (sec) {
sec.value()->reverse_visit_pattern.push_back(node);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Mutating an Array member directly via push_back can be inefficient if the underlying ArrayObj is shared, as it will trigger a copy-on-write (COW) operation at every level of the recursive visit. While acceptable for error reporting where performance is less critical, consider using a std::vector internally during the accumulation phase if tree depths are expected to be very large.

s.kind = reflection::AccessKind::kAttr;
// Any has TypeTraits<void*> (kTVMFFIOpaquePtr) but not const void*.
// Cast away const for storage; restore on retrieval.
s.key = Any(const_cast<void*>(static_cast<const void*>(fi)));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Casting away const from TVMFFIFieldInfo* to store it in Any is slightly risky if Any's destructor or other operations assume ownership or mutability. Since TVMFFIFieldInfo is typically static metadata, this is likely safe here, but it's a pattern to use with caution. Ensure that Any is used only for opaque pointer storage in this context.

switch (type_index) {
case TypeIndex::kTVMFFIArray:
this->VisitSequence(
details::AnyUnsafe::MoveFromAnyAfterCheck<Array<Any>>(std::move(value)));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using MoveFromAnyAfterCheck<Array<Any>> on an Any that might contain a specialized Array<T> (like Array<ObjectRef>) assumes that the FFI Array implementation is type-erased or that Array<Any> is compatible with all Array<T> at the ABI level. If Array<T> is strict about its template parameter, this check might fail for specialized arrays.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant