feat: add DFS-based ffi.ReprPrint for unified object repr by junrushao · Pull Request #454 · apache/tvm-ffi

junrushao · 2026-02-16T04:57:37Z

Summary

Single C++ ffi.ReprPrint function produces human-readable repr for any TVM FFI value
DFS with 3-state tracking (NotVisited/InProgress/Done):
- DAGs: memoized repr returned in full on every re-encounter
- Cycles: detected via InProgress state, shown as ...
Addresses hidden by default; set TVM_FFI_REPR_WITH_ADDR=1 to show
Per-field Repr(false) InfoTrait to exclude fields from repr output
Built-in repr for String, Bytes, Tensor, Shape, Array, List, Map
All Python __repr__ methods delegate to this function
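
The three-state traversal described above can be illustrated with a small pure-Python sketch. This is not the C++ implementation from the PR: the `repr_print` helper, the `State` enum, and the restriction to built-in lists, tuples, and dicts are assumptions made only to show how InProgress marks a cycle and how Done allows a shared node to be printed in full again.

```python
from enum import Enum


class State(Enum):
    NOT_VISITED = 0
    IN_PROGRESS = 1
    DONE = 2


def repr_print(root):
    """Toy DFS repr over Python lists, tuples, and dicts (hypothetical helper)."""
    state = {}  # id(obj) -> State; a missing key means NOT_VISITED
    memo = {}   # id(obj) -> finished repr string

    def visit(obj):
        if not isinstance(obj, (list, tuple, dict)):
            return repr(obj)  # leaves have no children to track
        key = id(obj)
        current = state.get(key, State.NOT_VISITED)
        if current is State.IN_PROGRESS:
            return "..."  # back edge: obj is still on the DFS stack
        if current is State.DONE:
            return memo[key]  # shared node: reuse the full memoized repr
        state[key] = State.IN_PROGRESS
        if isinstance(obj, dict):
            body = ", ".join(f"{visit(k)}: {visit(v)}" for k, v in obj.items())
            text = "{" + body + "}"
        else:
            body = ", ".join(visit(item) for item in obj)
            text = f"[{body}]" if isinstance(obj, list) else f"({body})"
        state[key] = State.DONE
        memo[key] = text
        return text

    return visit(root)


cyclic = [12]
cyclic.append(cyclic)
print(repr_print(cyclic))            # [12, ...]

shared = (1, 4)
print(repr_print((shared, shared)))  # ((1, 4), (1, 4))
```

A node that is reached again while still InProgress must be an ancestor on the current DFS path, which is what distinguishes a true cycle from a shared (DAG) reference.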

Format Examples

42                                    # int
"hello"                               # String
(1, 2, 3)                             # Array
[1, 2, 3]                             # List
{"key": "value"}                      # Map
testing.MyObj(x=1, y="hi")            # User object
...                                   # Cycle marker
float32[3, 4]@cpu:0@0x1234            # Tensor

Test plan

55 Python tests covering primitives, containers, user objects, DAGs, cycles, and TVM_FFI_REPR_WITH_ADDR
All pre-commit hooks pass (ruff, ty, clang-format, markdownlint, etc.)
Container tests (test_container.py) pass with updated Array format

gemini-code-assist · 2026-02-16T04:58:01Z

Summary of Changes

This pull request significantly enhances the string representation (repr) of FFI objects by introducing a robust, centralized C++ system. The new ffi.ReprPrint mechanism leverages a Breadth-First Search (BFS) algorithm to accurately represent complex object graphs, including those with cycles and shared references, preventing infinite recursion and providing concise output for duplicates. This change streamlines how objects are displayed, offering both built-in type-specific formatting and a flexible reflection-based approach for user-defined types, while also integrating seamlessly with Python's __repr__ methods.

Highlights

Unified Object Representation: Introduced a new centralized, reflection-based ffi.ReprPrint system in C++ for generating string representations of FFI objects.
Cycle and DAG Handling: The ffi.ReprPrint system uses a BFS traversal to correctly handle object graphs with cycles and duplicate references, ensuring consistent and informative repr output.
Custom and Generic Formatters: Provided specialized formatting for common built-in types (String, Tensor, Shape, Array, List, Map) and a generic reflection-based mechanism for user-defined types, including type_key@0xADDR(field=value, ...).
Field Selection for Repr: Added a refl::repr_fields tag struct to allow explicit control over which fields appear in the repr output for user-defined types.
Python Integration: Python's __repr__ methods for Object, Array, List, and Map now delegate to the new C++ ffi.ReprPrint functionality, ensuring consistent behavior across language boundaries.
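
The C++ registration syntax for field selection is not shown in this thread. As a point of comparison only, the Python standard library expresses per-field exclusion from repr with `dataclasses.field(repr=False)`. The `Point` class below is a hypothetical example and does not use tvm_ffi.

```python
from dataclasses import dataclass, field


@dataclass
class Point:
    x: int
    y: int
    # Excluded from the generated __repr__, similar in spirit to a
    # per-field "do not show in repr" marker.
    cache: dict = field(default_factory=dict, repr=False)


p = Point(1, 2)
p.cache["norm"] = 5
print(repr(p))  # Point(x=1, y=2)
```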

Changelog

CMakeLists.txt
- Updated build configuration to include the new repr_print.cc source file.
include/tvm/ffi/reflection/registry.h
- Added new type attributes kRepr and kReprFields for custom representation and field selection.
- Introduced a repr_fields tag struct to control which fields are included in an object's representation.
python/tvm_ffi/_ffi_api.py
- Exposed the new ReprPrint function from the FFI API for Python usage.
python/tvm_ffi/container.py
- Modified __repr__ methods for Array, List, and Map to delegate to the new centralized ffi.ReprPrint system.
python/tvm_ffi/cython/object.pxi
- Updated the __object_repr__ function to dynamically load and utilize the ffi.ReprPrint global function.
python/tvm_ffi/dataclasses/_utils.py
- Removed the Python-side method_repr generation logic, as object representation is now handled by the C++ ffi.ReprPrint.
python/tvm_ffi/dataclasses/c_class.py
- Removed the repr parameter and related logic from the c_class decorator, centralizing repr generation in C++.
python/tvm_ffi/dataclasses/field.py
- Removed the repr parameter from the Field class and field function, streamlining field definition.
src/ffi/extra/repr_print.cc
- Added a new C++ file implementing a BFS-based ReprPrinter for FFI objects, including cycle detection and custom formatters for built-in types.
- Registered ffi.ReprPrint as a global FFI function.
src/ffi/testing/testing.cc
- Configured TestCxxClassDerived to use refl::repr_fields to specify which fields are included in its representation for testing purposes.
tests/python/test_container.py
- Updated an assertion in test_repr to match the new string quoting format for map keys.
tests/python/test_dataclasses_c_class.py
- Removed tests related to Python-generated __repr__ methods, as this functionality is now handled in C++.
tests/python/test_repr.py
- Added a new test file containing comprehensive unit tests for the ffi.ReprPrint functionality across various data types, user objects, and object graph structures.

Activity

A new feature for unified object representation (ffi.ReprPrint) has been introduced.
17 new tests were added in tests/python/test_repr.py to cover primitives, strings, containers, tensors, shapes, user objects, repr_fields, duplicate references, Python __repr__ integration, and empty containers.
An existing test in test_container.py was updated to reflect changes in string quoting format.
The author confirmed that the full test suite passes, including 401 Python tests and 276 C++ tests.

gemini-code-assist

Code Review

This pull request introduces a centralized, reflection-based representation system (ffi.ReprPrint) for FFI objects, which is a great improvement for debuggability and consistency. The implementation correctly handles cycles and duplicate references in object graphs using a BFS traversal. The changes to delegate __repr__ in Python containers to this new system are clean. The new tests are comprehensive. I've found a few areas for improvement regarding code duplication, efficiency, and error handling, which are detailed in the specific comments. Overall, this is a solid feature addition.

python/tvm_ffi/cython/object.pxi

src/ffi/extra/repr_print.cc

tests/python/test_repr.py

tqchen

It would be great to discuss the text format a bit, mainly our behavior for common references. Is there a precedent we can refer to?

src/ffi/extra/repr_print.cc

tqchen · 2026-02-16T12:56:16Z

Looks like a great improvement. It would be good to discuss how we are thinking about repr printing and its behavior relative to Python's repr. The repr does not need to be round-trippable (serialization is perhaps a better choice for that). So the question is whether we want duplicated value printing (perhaps as an option), and whether it should be the default.

The default behavior of Python at the moment is to simply expand. Expansion could also make sense for immutable data structures, say when a shape value gets referenced in multiple places because of the way we copy the data structure.

x = (1, 2, 3)
y = (1, 4)

print([y, y])
# prints: [(1, 4), (1, 4)]

# cycle case
x = [12]
x.append(x)
print(x)
# prints: [12, [...]]

This being said, there can be value in cases where we might want duplicated value printing. Perhaps we can do it under a flag.
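
For user-defined classes, the Python standard library offers one precedent for this behavior: `reprlib.recursive_repr` expands shared references in full and substitutes a fill value only for a true cycle. The `Node` class below is a hypothetical example written for illustration; it is not part of tvm_ffi.

```python
from reprlib import recursive_repr


class Node:
    def __init__(self, name):
        self.name = name
        self.children = []

    # Returns the fill value if __repr__ is re-entered for the same
    # object while an outer call for it is still running.
    @recursive_repr(fillvalue="...")
    def __repr__(self):
        return f"Node({self.name!r}, {self.children!r})"


shared = Node("leaf")
root = Node("root")
root.children = [shared, shared, root]  # shared twice, plus a cycle to root
print(repr(root))
# Node('root', [Node('leaf', []), Node('leaf', []), ...])
```

The shared leaf is printed twice in full, while the back reference to `root` collapses to the fill value.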

tqchen · 2026-02-16T12:57:24Z

src/ffi/extra/repr_print.cc

+  return String(FormatBytes(obj->data, obj->size));
+}
+
+String ReprTensor(const TensorObj* obj, const Function& fn_repr) {


Although this one is concise, personally I think it is good to be explicit here:

Tensor(shape=(1, 2), dtype="float32", device="cuda:0")

junrushao · 2026-02-17

I think it's a matter of personal taste, but please also consider the output length and readability for users. In that case:

float32[10, 20]@gpu:0

seems an overall win.

tqchen · 2026-02-17

I agree it is a close call. My original thinking is to ideally align with existing formats rather than inventing new syntax that users need to learn. The output would indeed be longer, but being explicit means users do not have to learn what float32[10, 20] means.

Just to note some of the nits I had when reading the new syntax:

When looking at it, I would have questions like "does it map to an on-stack raw array?" (it does not; it maps to a tensor).

Tensor(shape=(1, 2), dtype="float32", device="cuda:0") avoids that confusion at the cost of being slightly longer.

Another minor nit is that the @gpu:0 syntax conflicts a bit with @{addr}, although it is really a nit.

junrushao · 2026-02-18

I updated the syntax. @0x{addr} shows up only when TVM_FFI_REPR_WITH_ADDR is set to 1, so there is no point of confusion.
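
To make the two candidate Tensor formats concrete, here is a minimal Python sketch that produces both strings. The helper names `tensor_repr_compact` and `tensor_repr_explicit` are hypothetical, and the exact way the C++ code reads TVM_FFI_REPR_WITH_ADDR is an assumption; only the output shapes are taken from the examples in this thread.

```python
import os


def tensor_repr_compact(dtype, shape, device, addr=None):
    """Compact style from the thread, e.g. float32[10, 20]@gpu:0."""
    dims = ", ".join(str(d) for d in shape)
    text = f"{dtype}[{dims}]@{device}"
    # Assumption: the address suffix appears only when the variable is "1".
    if addr is not None and os.environ.get("TVM_FFI_REPR_WITH_ADDR") == "1":
        text += f"@0x{addr:x}"
    return text


def tensor_repr_explicit(dtype, shape, device):
    """Explicit keyword style, e.g. Tensor(shape=(1, 2), dtype="float32", ...)."""
    dims = ", ".join(str(d) for d in shape)
    return f'Tensor(shape=({dims}), dtype="{dtype}", device="{device}")'


print(tensor_repr_compact("float32", (10, 20), "gpu:0"))
print(tensor_repr_explicit("float32", (1, 2), "cuda:0"))
```

The compact form is shorter, while the explicit form names each component, which is the trade-off weighed in the discussion above.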

junrushao · 2026-02-17T09:51:01Z

Updated the text format and duplication handling

junrushao · 2026-02-18T08:02:20Z

Behavior from the latest commit:

import tvm_ffi as ffi
from tvm_ffi._ffi_api import ReprPrint

x = ffi.List([12])
x.append(x)
print(ReprPrint(x))
# gives: [12, ...]

y = ffi.Array((1,4))
y = ffi.Array((y, y))
print(ReprPrint(y))
# gives: ((1, 4), (1, 4))

@tqchen LMK if it looks good to you

- Single C++ ffi.ReprPrint function handles all types
- DFS with 3-state tracking (NotVisited/InProgress/Done):
  - DAGs: memoized repr returned in full on re-encounter
  - Cycles: detected via InProgress state, shown as ...
- Addresses hidden by default; set TVM_FFI_REPR_WITH_ADDR=1 to show
- Per-field Repr(false) to exclude fields from repr output
- Built-in repr for String, Bytes, Tensor, Shape, Array, List, Map
- All Python __repr__ methods delegate to this function

junrushao force-pushed the 2026-02-14/dataclass-repr branch from cb25c51 to 493d366 Compare February 16, 2026 04:58

junrushao marked this pull request as draft February 16, 2026 04:58

gemini-code-assist bot reviewed Feb 16, 2026

junrushao force-pushed the 2026-02-14/dataclass-repr branch 3 times, most recently from 4f8962e to 02b2f36 Compare February 16, 2026 05:52

junrushao marked this pull request as ready for review February 16, 2026 07:26

junrushao force-pushed the 2026-02-14/dataclass-repr branch 2 times, most recently from 3d0562f to 8d10187 Compare February 16, 2026 08:09

junrushao mentioned this pull request Feb 16, 2026

How far are we from @dataclass #356

Open

15 tasks

tqchen reviewed Feb 16, 2026

tqchen reviewed Feb 16, 2026

junrushao force-pushed the 2026-02-14/dataclass-repr branch from 8d10187 to 3d1fe18 Compare February 17, 2026 09:10

junrushao changed the title ~~feat: add BFS-based ffi.ReprPrint for unified object repr~~ feat: add DFS-based ffi.ReprPrint for unified object repr Feb 17, 2026

junrushao force-pushed the 2026-02-14/dataclass-repr branch from 3d1fe18 to 9df6b09 Compare February 17, 2026 09:12

junrushao force-pushed the 2026-02-14/dataclass-repr branch from 9df6b09 to a709442 Compare February 18, 2026 07:54

junrushao force-pushed the 2026-02-14/dataclass-repr branch from a709442 to 997cbf1 Compare February 18, 2026 08:07

junrushao force-pushed the 2026-02-14/dataclass-repr branch from 997cbf1 to cf535ab Compare February 18, 2026 08:22

tqchen approved these changes Feb 18, 2026

tqchen merged commit b648c5d into apache:main Feb 18, 2026
8 checks passed
