Skip to content

Speed up PlutusData/CBORSerializable decode ~5.4x and encode ~4.3x (no behavior change)#492

Open
theeldermillenial wants to merge 9 commits into
Python-Cardano:mainfrom
theeldermillenial:perf/plutusdata-decode
Open

Speed up PlutusData/CBORSerializable decode ~5.4x and encode ~4.3x (no behavior change)#492
theeldermillenial wants to merge 9 commits into
Python-Cardano:mainfrom
theeldermillenial:perf/plutusdata-decode

Conversation

@theeldermillenial
Copy link
Copy Markdown
Contributor

@theeldermillenial theeldermillenial commented Jun 5, 2026

Summary

A set of backend-independent optimizations to pycardano's (de)serialization hot paths, primarily benefiting PlutusData- and set-heavy workloads such as on-chain data indexing and transaction building. No new dependencies and no change to the CBOR library. Output is byte-for-byte identical (one documented exception noted below). All tests pass on Python 3.9–3.13.

Motivation

Profiling showed the cost of typed PlutusData decode and real-transaction (de)serialization is dominated by repeated, data-independent work in pycardano itself — not the CBOR backend: per-node get_type_hints() / signature introspection, per-value type dispatch, redundant typeguard return-checks, re-encoding set elements for de-duplication, and defensive deepcopys on every multi-asset encode.

Changes

Decode

  • Memoize get_type_hints(cls) and from_primitive type_args introspection per class.
  • Cache PlutusData.__post_init__ field-type validation per class (per-instance byte-length check preserved).
  • Resolve each field's decode dispatch once into a memoized decode plan, plus per-class array/map field plans, instead of re-running the issubclass/__origin__/isinstance chain on every value.
  • OrderedSet keys de-duplication by the element's native hash (CBOR-bytes fallback for unhashable elements), avoiding a dumps() per element — previously ~78% of real-transaction decode time.
  • Guard the speculative Byron-address cbor2.loads probe behind a byte-prefix check.

Encode

  • Route the recursive to_primitive descent through an un-annotated worker, and drop the redundant typeguard return-check on to_validated_primitive, so the large Primitive Union isn't re-validated at every node.
  • _dfs: scalar-leaf fast path and direct IndefiniteList.data iteration (avoids the slow Sequence.__iter__).
  • Cache dataclasses.fields() per class in to_shallow_primitive.
  • Skip deepcopy().normalize() in Asset/MultiAsset.to_shallow_primitive when there are no zero/empty entries.

All caches are keyed by class, not data (so the wins hold across millions of unique objects) and use WeakKeyDictionary so dynamically-created classes are still garbage-collected.

Benchmarks

Pure-Python CBOR backend, best-of-5 (reproduce with the included benchmarks/plutus_bench.py):

Operation Before After Speedup
Typed PlutusData decode (200 fields) 4942 µs ~900 µs ~5.4×
Typed PlutusData encode (200 fields) 13531 µs ~2285 µs ~5.9×
Set-heavy transaction decode 886 µs 224 µs ~4×
MultiAsset.to_shallow_primitive 130 µs 33 µs ~3.9×

Correctness

  • Decode: identical behavior — same DeserializeException cases, Union fallback order, list/dict/Optional handling, IndefiniteList preservation, object_hook metadata, and one-time f.type resolution.
  • Encode: byte-for-byte identical output (verified on transaction / plutus / metadata round-trips).
  • One documented behavior nuance: for hashable OrderedSet elements, de-duplication is now by Python __eq__/__hash__ rather than CBOR-byte equality. These coincide for pycardano's set element types (transaction inputs, key hashes, witnesses); unhashable elements retain the original CBOR-byte semantics. Covered by new tests.
  • Full suite (unit + doctests) passes; flake8, mypy, black, isort clean. New code paths are covered by added tests; genuinely-unreachable defensive branches are marked # pragma: no cover.

🤖 Generated with Claude Code

theeldermillenial and others added 4 commits June 4, 2026 15:19
Measures decode/encode/to_json for typed PlutusData and untyped
RawPlutusData across synthetic complexity sweeps, to locate the
chain-indexing bottleneck. Run: python benchmarks/plutus_bench.py [iters]
(set CBOR_C_EXTENSION=1 to compare a fast backend).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Typed PlutusData/dataclass decode recomputed get_type_hints(cls) once per
decoded node and getfullargspec(t.from_primitive) once per typed field on
every node. Both depend only on the class, not the data, yet dominated typed
decode (~422 get_type_hints + ~421 getfullargspec calls per decode of a
200-element datum — together ~70% of decode time).

Memoize both in module-level WeakKeyDictionary caches (so dynamically created
classes can still be garbage collected). Generic aliases, which are not always
weakly referenceable, are computed without caching.

Result: ~3.6x faster typed PlutusData decode (200-inner datum 4.94ms -> 1.36ms,
cbor2pure), backend-independent. All 568 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
PlutusData.__post_init__ ran on every decoded instance, re-checking each
field's declared type against the allowed set (a class-invariant check) and
recomputing fields() each time. Cache the validated fields tuple per class in
a WeakKeyDictionary (safe for dynamically created classes); cached instances
run only the per-instance byte-length check. First instance preserves the
original interleaved type/length validation exactly, and a class with an
invalid field type is never cached so it keeps raising identically.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Decode: _restore_typed_primitive re-derived a field's decode strategy
(issubclass / __origin__ / isinstance / try-except chains) on every value
even though it depends only on the field type. Resolve it once into a
memoized "decode plan" callable per type, and build per-class array/map field
plans, all cached in WeakKeyDictionaries (with safe fallbacks for unhashable
or non-weakreferenceable types). Behavior is identical: same DeserializeException
cases, Union fallback order, list/dict/Optional handling, IndefiniteList
preservation, object_hook metadata, and the one-time f.type resolution.

Encode: the recursive to_primitive descent re-validated the large Primitive
Union return type via typeguard at every node. Route base-implementation
recursion through an un-annotated _to_primitive worker (public to_primitive
keeps its annotation and top-level check; overrides still dispatch
polymorphically). Output is byte-for-byte identical.

Result (typed PlutusData, cbor2pure, backend-independent): ~1.5x faster decode
and ~4.3x faster encode on top of the type-hint caching. All 568 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented Jun 5, 2026

Codecov Report

❌ Patch coverage is 97.73756% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.83%. Comparing base (46382fd) to head (4bcaa5e).

Files with missing lines Patch % Lines
pycardano/address.py 80.00% 0 Missing and 2 partials ⚠️
pycardano/transaction.py 75.00% 0 Missing and 2 partials ⚠️
pycardano/serialization.py 99.47% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #492      +/-   ##
==========================================
+ Coverage   90.62%   90.83%   +0.20%     
==========================================
  Files          34       34              
  Lines        5154     5289     +135     
  Branches      781      802      +21     
==========================================
+ Hits         4671     4804     +133     
  Misses        304      304              
- Partials      179      181       +2     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

theeldermillenial and others added 5 commits June 5, 2026 09:23
…tive

OrderedSet.append/remove re-encoded each element via dumps() twice (once for
the membership check, once for the dict key); compute the CBOR de-dup key once.
This dominated decode of set-heavy transactions (dumps was ~78% of decode
cumtime on real fixtures).

to_validated_primitive carried a `-> Primitive` return annotation, so the
@TypeChecked class decorator re-validated the result against the 26-member
Primitive Union even though to_primitive (which it calls) already return-checks
it once. Drop the annotation (mirrors the existing _to_primitive worker).

Result: set-heavy tx decode 886 -> 384 us (2.3x); tx encode ~1.4x. Byte-identical
output, all 568 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…present

to_shallow_primitive did deepcopy(self).normalize() on every encode solely to
avoid mutating self while stripping zero/empty entries. Scan first and skip the
deepcopy when there is nothing to strip (the common case).

Result: MultiAsset.to_shallow_primitive 130 -> 33 us (~3.9x) for a typical
token-transfer multi-asset. Byte-identical output, all 568 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- OrderedSet (#4): key de-duplication by the element's native hash, falling back
  to CBOR bytes only for unhashable elements (namespaced so it can't collide).
  This avoids a dumps() per element entirely for hashable set members.
  BEHAVIOR NOTE: de-dup for hashable elements is now by Python __eq__/__hash__
  rather than CBOR-byte equality. These coincide for pycardano's set element
  types (TransactionInput, key hashes, witnesses); unhashable elements keep the
  original CBOR-byte semantics. Added tests for the unhashable/mixed paths.
- _dfs encode recursion (#5): scalar-leaf fast path + iterate IndefiniteList.data
  directly (avoids the slow collections.abc.Sequence.__iter__).
- Cache dataclasses.fields() per class in to_shallow_primitive (Python-Cardano#6).

Result (cbor2pure, backend-independent): set-heavy tx decode 395 -> 224 us
(1.76x), typed PlutusData encode 3257 -> 2285 us (1.43x), datum_hash 3959 ->
3027 us (1.31x). All 569 tests pass; byte-identical output (except the
documented OrderedSet de-dup-key semantics).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Address.from_primitive ran a speculative cbor2.loads() on every address to
detect a Byron tag-24 wrapper. A Byron address is a 2-element CBOR array whose
first element is tag 24, i.e. bytes starting b"\x82\xd8\x18"; no Shelley header
byte is 0x82. Only run the probe when the prefix matches, skipping it on the
common Shelley path. Byte-identical behavior.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Cover the reachable new branches with targeted tests:
- OrderedSet de-dup unhashable/mixed fallback and remove() (test_serialization)
- direct _restore_typed_primitive entry + ByteString passthrough/wrap
- Asset/MultiAsset.to_shallow_primitive deepcopy/normalize path (zero values)
- PlutusData.__post_init__ cached fast-path byte-length validation
- Byron address decode from raw CBOR bytes + invalid-CBOR fallback

Mark genuinely-unreachable defensive branches with `# pragma: no cover`
(impossible generic-alias arities; the is-CBORSerializable-AND-PRIMITIVE_TYPE
case that cannot co-occur; non-hashable / non-weakreferenceable type fallbacks;
non-init map fields). All diff lines are now covered or excludable.

574 tests pass; flake8/mypy/black/isort clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant