Speed up PlutusData/CBORSerializable decode ~5.4x and encode ~4.3x (no behavior change) #492
Open
theeldermillenial wants to merge 9 commits into
Measures decode/encode/to_json for typed PlutusData and untyped RawPlutusData across synthetic complexity sweeps, to locate the chain-indexing bottleneck. Run: python benchmarks/plutus_bench.py [iters] (set CBOR_C_EXTENSION=1 to compare a fast backend). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Typed PlutusData/dataclass decode recomputed get_type_hints(cls) once per decoded node and getfullargspec(t.from_primitive) once per typed field on every node. Both depend only on the class, not the data, yet dominated typed decode (~422 get_type_hints + ~421 getfullargspec calls per decode of a 200-element datum — together ~70% of decode time). Memoize both in module-level WeakKeyDictionary caches (so dynamically created classes can still be garbage collected). Generic aliases, which are not always weakly referenceable, are computed without caching. Result: ~3.6x faster typed PlutusData decode (200-inner datum 4.94ms -> 1.36ms, cbor2pure), backend-independent. All 568 tests pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
PlutusData.__post_init__ ran on every decoded instance, re-checking each field's declared type against the allowed set (a class-invariant check) and recomputing fields() each time. Cache the validated fields tuple per class in a WeakKeyDictionary (safe for dynamically created classes); cached instances run only the per-instance byte-length check. First instance preserves the original interleaved type/length validation exactly, and a class with an invalid field type is never cached so it keeps raising identically. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Decode: _restore_typed_primitive re-derived a field's decode strategy (issubclass / __origin__ / isinstance / try-except chains) on every value even though it depends only on the field type. Resolve it once into a memoized "decode plan" callable per type, and build per-class array/map field plans, all cached in WeakKeyDictionaries (with safe fallbacks for unhashable or non-weakreferenceable types). Behavior is identical: same DeserializeException cases, Union fallback order, list/dict/Optional handling, IndefiniteList preservation, object_hook metadata, and the one-time f.type resolution.

Encode: the recursive to_primitive descent re-validated the large Primitive Union return type via typeguard at every node. Route base-implementation recursion through an un-annotated _to_primitive worker (public to_primitive keeps its annotation and top-level check; overrides still dispatch polymorphically). Output is byte-for-byte identical.

Result (typed PlutusData, cbor2pure, backend-independent): ~1.5x faster decode and ~4.3x faster encode on top of the type-hint caching. All 568 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
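The "decode plan" idea can be illustrated with a small sketch: the type is inspected once, and the result is a closure that is reused for every value. This is an assumption-laden toy, not the PR's implementation. It handles only parametrized `List`, `Union`, and plain classes, uses a plain dict instead of a `WeakKeyDictionary`, and all names are hypothetical.

```python
from typing import Any, Callable, Dict, List, Union, get_args, get_origin

# Hypothetical cache: type annotation -> decode callable.
_PLAN_CACHE: Dict[Any, Callable[[Any], Any]] = {}


def build_plan(tp: Any) -> Callable[[Any], Any]:
    """Inspect the type once and return a closure that decodes values of it."""
    origin = get_origin(tp)
    if origin is list:
        (item_tp,) = get_args(tp)
        item_plan = get_plan(item_tp)
        return lambda value: [item_plan(v) for v in value]
    if origin is Union:
        # Members are tried in declaration order, preserving fallback order.
        plans = [get_plan(arg) for arg in get_args(tp)]

        def decode_union(value: Any) -> Any:
            for plan in plans:
                try:
                    return plan(value)
                except (TypeError, ValueError):
                    continue
            raise ValueError(f"no Union member accepts {value!r}")

        return decode_union
    if isinstance(tp, type):

        def decode_scalar(value: Any) -> Any:
            if not isinstance(value, tp):
                raise TypeError(f"expected {tp.__name__}, got {value!r}")
            return value

        return decode_scalar
    raise TypeError(f"unsupported annotation: {tp!r}")


def get_plan(tp: Any) -> Callable[[Any], Any]:
    """Return the memoized plan for tp, building it on first use."""
    try:
        return _PLAN_CACHE[tp]
    except KeyError:
        plan = _PLAN_CACHE[tp] = build_plan(tp)
        return plan


plan = get_plan(List[Union[int, bytes]])
decoded = plan([1, b"a", 2])
```

After the first call, decoding a value costs only the closure calls; the `get_origin`/`get_args`/`isinstance(tp, type)` dispatch is not repeated.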
Codecov Report
❌ Patch coverage is …
@@ Coverage Diff @@
## main #492 +/- ##
==========================================
+ Coverage 90.62% 90.83% +0.20%
==========================================
Files 34 34
Lines 5154 5289 +135
Branches 781 802 +21
==========================================
+ Hits 4671 4804 +133
Misses 304 304
- Partials 179 181 +2
…tive OrderedSet.append/remove re-encoded each element via dumps() twice (once for the membership check, once for the dict key); compute the CBOR de-dup key once. This dominated decode of set-heavy transactions (dumps was ~78% of decode cumtime on real fixtures). to_validated_primitive carried a `-> Primitive` return annotation, so the @TypeChecked class decorator re-validated the result against the 26-member Primitive Union even though to_primitive (which it calls) already return-checks it once. Drop the annotation (mirrors the existing _to_primitive worker). Result: set-heavy tx decode 886 -> 384 us (2.3x); tx encode ~1.4x. Byte-identical output, all 568 tests pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…present to_shallow_primitive did deepcopy(self).normalize() on every encode solely to avoid mutating self while stripping zero/empty entries. Scan first and skip the deepcopy when there is nothing to strip (the common case). Result: MultiAsset.to_shallow_primitive 130 -> 33 us (~3.9x) for a typical token-transfer multi-asset. Byte-identical output, all 568 tests pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
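The scan-before-copy pattern from this commit might look like the sketch below. It uses a plain nested dict (policy -> asset name -> quantity) as a stand-in for `MultiAsset`, and `strip_zeros` is a hypothetical helper, not a pycardano function.

```python
from copy import deepcopy


def strip_zeros(assets: dict) -> dict:
    """Return the mapping without zero quantities or empty policies.

    The input is never mutated. The deepcopy is paid only when there is
    actually something to strip.
    """
    needs_strip = any(
        not names or any(qty == 0 for qty in names.values())
        for names in assets.values()
    )
    if not needs_strip:
        return assets  # common case: nothing to strip, no copy needed

    cleaned = deepcopy(assets)
    for policy in list(cleaned):
        names = cleaned[policy]
        for name in [n for n, qty in names.items() if qty == 0]:
            del names[name]
        if not names:
            del cleaned[policy]
    return cleaned


clean = {"policy_a": {"token": 5}}
dirty = {"policy_a": {"token": 0, "other": 2}, "policy_b": {}}

same_object = strip_zeros(clean)
stripped = strip_zeros(dirty)
```

The cheap read-only scan replaces an unconditional `deepcopy`, which is where the saving comes from when most inputs are already normalized.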
- OrderedSet (#4): key de-duplication by the element's native hash, falling back to CBOR bytes only for unhashable elements (namespaced so it can't collide). This avoids a dumps() per element entirely for hashable set members. BEHAVIOR NOTE: de-dup for hashable elements is now by Python __eq__/__hash__ rather than CBOR-byte equality. These coincide for pycardano's set element types (TransactionInput, key hashes, witnesses); unhashable elements keep the original CBOR-byte semantics. Added tests for the unhashable/mixed paths.
- _dfs encode recursion (#5): scalar-leaf fast path + iterate IndefiniteList.data directly (avoids the slow collections.abc.Sequence.__iter__).
- Cache dataclasses.fields() per class in to_shallow_primitive (Python-Cardano#6).

Result (cbor2pure, backend-independent): set-heavy tx decode 395 -> 224 us (1.76x), typed PlutusData encode 3257 -> 2285 us (1.43x), datum_hash 3959 -> 3027 us (1.31x). All 569 tests pass; byte-identical output (except the documented OrderedSet de-dup-key semantics).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
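A minimal sketch of the namespaced de-duplication key described for OrderedSet, with loud assumptions: `TinyOrderedSet` and `dedup_key` are hypothetical names, and `json.dumps` stands in for the CBOR `dumps()` call so the example has no third-party dependency.

```python
import json


def dedup_key(element):
    """Compute the de-dup key once per element.

    Hashable elements are keyed by themselves (Python __eq__/__hash__).
    Unhashable elements fall back to serialized bytes. The leading tag
    namespaces the two kinds of key so they cannot collide.
    """
    try:
        hash(element)
    except TypeError:
        # Stand-in for CBOR bytes; json is used here purely for illustration.
        return ("bytes", json.dumps(element, sort_keys=True).encode())
    return ("hash", element)


class TinyOrderedSet:
    """Insertion-ordered set built on dict ordering (illustrative only)."""

    def __init__(self):
        self._items = {}

    def append(self, element):
        key = dedup_key(element)  # one key computation, reused below
        if key not in self._items:
            self._items[key] = element

    def remove(self, element):
        del self._items[dedup_key(element)]

    def __iter__(self):
        return iter(self._items.values())


s = TinyOrderedSet()
for item in [1, 1, [1, 2], [1, 2], (1, 2)]:
    s.append(item)
contents = list(s)
```

Note the semantic shift the commit message calls out: on the hashable path, two elements are duplicates when they compare equal in Python, not when their serialized bytes match.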
Address.from_primitive ran a speculative cbor2.loads() on every address to detect a Byron tag-24 wrapper. A Byron address is a 2-element CBOR array whose first element is tag 24, i.e. bytes starting b"\x82\xd8\x18"; no Shelley header byte is 0x82. Only run the probe when the prefix matches, skipping it on the common Shelley path. Byte-identical behavior. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
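The prefix gate can be sketched as below. The byte values follow from CBOR framing as stated in the commit message: `0x82` is a 2-element array header, and `0xd8 0x18` is tag 24. The function names and the `probe` callable are hypothetical stand-ins for the real decode path.

```python
BYRON_PREFIX = b"\x82\xd8\x18"  # array(2) header, then tag(24)


def looks_like_byron(raw: bytes) -> bool:
    """Cheap byte-prefix test that gates the expensive CBOR probe."""
    return raw.startswith(BYRON_PREFIX)


def classify_address(raw: bytes, probe):
    """Run the (hypothetical) probe only when the prefix matches."""
    if looks_like_byron(raw):
        return ("byron", probe(raw))
    return ("shelley", raw)


probe_calls = []


def counting_probe(raw: bytes) -> bytes:
    # Stand-in for a speculative CBOR decode; records that it was invoked.
    probe_calls.append(raw)
    return raw


shelley_like = b"\x01" + bytes(56)               # header byte 0x01, dummy payload
byron_like = BYRON_PREFIX + b"\x58\x01\x00\x00"  # dummy bytes after the prefix

kind_shelley, _ = classify_address(shelley_like, counting_probe)
kind_byron, _ = classify_address(byron_like, counting_probe)
```

On the Shelley path the probe is never called, which is the whole saving: a three-byte comparison replaces a full speculative decode.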
Cover the reachable new branches with targeted tests:
- OrderedSet de-dup unhashable/mixed fallback and remove() (test_serialization)
- direct _restore_typed_primitive entry + ByteString passthrough/wrap
- Asset/MultiAsset.to_shallow_primitive deepcopy/normalize path (zero values)
- PlutusData.__post_init__ cached fast-path byte-length validation
- Byron address decode from raw CBOR bytes + invalid-CBOR fallback

Mark genuinely-unreachable defensive branches with `# pragma: no cover` (impossible generic-alias arities; the is-CBORSerializable-AND-PRIMITIVE_TYPE case that cannot co-occur; non-hashable / non-weakreferenceable type fallbacks; non-init map fields). All diff lines are now covered or excludable. 574 tests pass; flake8/mypy/black/isort clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary

A set of backend-independent optimizations to pycardano's (de)serialization hot paths, primarily benefiting `PlutusData`- and set-heavy workloads such as on-chain data indexing and transaction building. No new dependencies and no change to the CBOR library. Output is byte-for-byte identical (one documented exception noted below). All tests pass on Python 3.9–3.13.

Motivation

Profiling showed that the cost of typed `PlutusData` decode and real-transaction (de)serialization is dominated by repeated, data-independent work in pycardano itself, not by the CBOR backend: per-node `get_type_hints()` / signature introspection, per-value type dispatch, redundant typeguard return-checks, re-encoding set elements for de-duplication, and defensive `deepcopy`s on every multi-asset encode.

Changes

Decode

- Cache `get_type_hints(cls)` and `from_primitive` `type_args` introspection per class.
- Cache `PlutusData.__post_init__` field-type validation per class (per-instance byte-length check preserved).
- Resolve each field type's decode strategy once into a memoized decode plan, instead of re-running the `issubclass`/`__origin__`/`isinstance` chain on every value.
- `OrderedSet` keys de-duplication by the element's native hash (CBOR-bytes fallback for unhashable elements), avoiding a `dumps()` per element, which was previously ~78% of real-transaction decode time.
- Gate the Byron-address `cbor2.loads` probe behind a byte-prefix check.

Encode

- Route the recursive `to_primitive` descent through an un-annotated worker, and drop the redundant typeguard return-check on `to_validated_primitive`, so the large `Primitive` `Union` isn't re-validated at every node.
- `_dfs`: scalar-leaf fast path and direct `IndefiniteList.data` iteration (avoids the slow `Sequence.__iter__`).
- Cache `dataclasses.fields()` per class in `to_shallow_primitive`.
- Skip `deepcopy().normalize()` in `Asset`/`MultiAsset.to_shallow_primitive` when there are no zero/empty entries.

All caches are keyed by class, not data (so the wins hold across millions of unique objects), and use `WeakKeyDictionary` so dynamically created classes are still garbage-collected.

Benchmarks

Pure-Python CBOR backend, best-of-5 (reproduce with the included `benchmarks/plutus_bench.py`):

- `PlutusData` decode (200 fields)
- `PlutusData` encode (200 fields)
- `MultiAsset.to_shallow_primitive`

Correctness

- Behavior is preserved: same `DeserializeException` cases, `Union` fallback order, list/dict/`Optional` handling, `IndefiniteList` preservation, `object_hook` metadata, and one-time `f.type` resolution.
- The documented exception: for hashable `OrderedSet` elements, de-duplication is now by Python `__eq__`/`__hash__` rather than CBOR-byte equality. These coincide for pycardano's set element types (transaction inputs, key hashes, witnesses); unhashable elements retain the original CBOR-byte semantics. Covered by new tests.
- `flake8`, `mypy`, `black`, `isort` clean. New code paths are covered by added tests; genuinely-unreachable defensive branches are marked `# pragma: no cover`.

🤖 Generated with Claude Code