Skip to content

Remove Generic-based Flat derivation machinery#7662

Open
Unisay wants to merge 2 commits intomasterfrom
yura/issue-7542-remove-dead-flat-code
Open

Remove Generic-based Flat derivation machinery#7662
Unisay wants to merge 2 commits intomasterfrom
yura/issue-7542-remove-dead-flat-code

Conversation

@Unisay
Copy link
Contributor

@Unisay Unisay commented Mar 13, 2026

Summary

Closes #7542.

The internal flat library contained ~440 lines of Generic-based Flat class derivation machinery (GFlat*, GEncode*, GSize* classes, NumConstructors type family) that had a known bug: deserializing enum types with 512+ constructors consumed infinite memory. While not a security risk (all on-chain UPLC code uses only manual Flat instances), the buggy dead code was distracting. This PR removes the entire Generic derivation machinery.

Changes

  • Add manual Flat instances to all types that previously relied on Generic defaults: Maybe, Either, Complex, [], [Char], NonEmpty, tuples (2-7), Tree, Filler, PostAligned, PreAligned, SrcSpan, SrcSpans, and all PIR types (Recursivity, Strictness, Datatype, Binding, Program, Term). Manual encoding is bit-identical to the Generic encoding.

  • Add missing size methods to 9 partial Flat instances in FlatInstances.hs (Name, Version, VarDecl, TyVarDecl, Program, NamedDeBruijn) and Value.hs (K, Quantity, Value) that previously relied on the Generic default for size.

  • Gut Class.hs: remove ~440 lines of Generic machinery, leaving only the abstract Flat class with encode, decode, size methods and getSize.

  • Remove ENUM_LARGE test infrastructure: E256/E258 types, their Flat/Arbitrary instances (~530 lines), and the -DENUM_LARGE cpp-option.

  • Delete Core.hs and Tutorial.hs: Core.hs contained GHC inspection tests verifying that Generic-derived code optimized well, and Tutorial.hs taught Flat usage via deriving (Generic, Flat) examples — both are meaningless without Generic derivation, and the original tutorial remains available in the upstream flat package on Hackage.

  • Fix transitive import breakage: files that got Generic from the PlutusCore.Flat re-export now import GHC.Generics directly. Remove stale hiding (to, from) from imports that no longer re-export Generics.

  • Add QuickCheck round-trip property test for PIR Flat instances: generates 200 random well-typed PIR terms via genTypeAndTerm_, encodes with flat, decodes with unflat, and verifies byte-level identity of re-encoded output.

@Unisay Unisay self-assigned this Mar 13, 2026
@Unisay Unisay force-pushed the yura/issue-7542-remove-dead-flat-code branch 5 times, most recently from 6825737 to 4f79329 Compare March 13, 2026 14:27
@Unisay Unisay marked this pull request as ready for review March 13, 2026 14:27
@Unisay Unisay force-pushed the yura/issue-7542-remove-dead-flat-code branch 2 times, most recently from 409fb17 to 5b0a29e Compare March 13, 2026 15:10
@Unisay Unisay requested a review from a team March 13, 2026 15:14
@Unisay Unisay force-pushed the yura/issue-7542-remove-dead-flat-code branch from 5b0a29e to a845510 Compare March 13, 2026 15:17
@Unisay Unisay enabled auto-merge (squash) March 16, 2026 08:45
Unisay added 2 commits March 17, 2026 12:25
The internal flat library contained Generic-based Flat class derivation
via GHC.Generics that had a bug: deserializing large enum types (512+
constructors) consumed infinite memory. While not a security risk (UPLC
on-chain code uses only manual Flat instances), the buggy dead code was
distracting. This commit removes the entire Generic derivation machinery.

Changes:

1. Add manual Flat instances to all types that previously relied on
   Generic defaults: Maybe, Either, Complex, [], NonEmpty, tuples (2-7),
   Tree, Filler, PostAligned, PreAligned, SrcSpan, SrcSpans, and all PIR
   types (Recursivity, Strictness, Datatype, Binding, Program, Term).
   Manual encoding is bit-identical to the Generic encoding.

2. Add missing `size` methods to partial Flat instances in
   FlatInstances.hs (Name, Version, VarDecl, TyVarDecl, Program,
   NamedDeBruijn) and Value.hs (K, Quantity, Value) that previously
   relied on the Generic default for `size`.

3. Gut Class.hs: remove ~440 lines of GFlat*/GEncode*/GSize* classes,
   NumConstructors type family, all Generic default method implementations,
   and associated imports/pragmas/re-exports.

4. Remove ENUM_LARGE test infrastructure (E256/E258 types, Flat/Arbitrary
   instances, ~530 lines) and delete Core.hs inspection tests.

5. Fix transitive import breakage: files that got Generic from the
   PlutusCore.Flat re-export now import GHC.Generics directly.
   Remove `hiding (to)` from imports that no longer re-export Generics.

6. Strip Generic-specific doctests from Tutorial.hs.

All existing tests pass (5281 across 4 suites). Manual instances produce
bit-identical encoding verified by round-trip tests and golden files.
Mutation testing on the hand-written Flat instances from #7542 found
9 coverage gaps (none on the on-chain path). Add roundtrip tests for
Tree, Set, PreAligned, PIR Program, SrcSpan, SrcSpans, and
NamedDeBruijn. Remove the dead OVERLAPPING Flat [Char] instance from
Extra.hs — the module was never imported by the umbrella module, so
GHC always resolved Flat String to the generic [a] instance.
@Unisay Unisay force-pushed the yura/issue-7542-remove-dead-flat-code branch from a845510 to 6c8826f Compare March 17, 2026 11:41
@Unisay
Copy link
Contributor Author

Unisay commented Mar 17, 2026

Mutation testing on Flat instances

Ran mutation testing on all 32 hand-written Flat instances from this PR to check test coverage. For each instance, I replaced encode or size with error "MUTATION: <Type>" and ran the relevant test suite. If a test fails, the mutation is "killed" (covered). If tests pass, it "survived" (gap).

Initial run: 22/32 killed, 10 survived

The first pass found 10 uncovered instances. None of them were on the on-chain serialization path, but some were worth covering anyway:

  • Tree a, Set a — container types with no roundtrip property tests (only doctests)
  • PreAligned a — internal alignment type, only PostAligned was exercised
  • PIR Programtest_flatRoundTrip tested Terms but never wrapped them in Programs
  • Version size, TPLC Program size, NamedDeBruijn sizesize methods for buffer pre-allocation, not exercised by the PIR test suite
  • SrcSpan, SrcSpans — debug annotation types, PIR tests used () annotations
  • [Char] — turned out to be dead code (see below)

Added tests, re-ran: 31/32 killed

I added roundtrip tests for every survivor and re-ran. All 9 are now killed. The sole remaining survivor is [Char] — which is genuinely dead code: PlutusCore.Flat.Instances never imports PlutusCore.Flat.Instances.Extra, so the OVERLAPPING [Char] instance is compiled but never linked into any consuming module. GHC resolves Flat String to the generic [a] instance from Base.hs everywhere.

Tests added in this follow-up:

  • flat-test: Data.Tree.Tree and Set QC roundtrips, PreAligned unit roundtrip, explicit String unit trips
  • plutus-ir-test: PIR Program Flat roundtrip property (50 random programs)
  • plutus-core-test: SrcSpan, SrcSpans, NamedDeBruijn unit roundtrips
Full results table (click to expand)
ID Type Method Test Suite Initial After
M01 Maybe a encode flat-test KILLED
M02 Either a b encode flat-test KILLED
M03 Complex a encode flat-test KILLED
M04 [a] encode flat-test KILLED
M05 NonEmpty a encode flat-test KILLED
M06 (a,b) encode flat-test KILLED
M07 (a,b,c) encode flat-test KILLED
M08 (a,b,c,d) encode flat-test KILLED
M09 (a,b,c,d,e) encode flat-test KILLED
M10 (a,b,c,d,e,f) encode flat-test KILLED
M11 (a,b,c,d,e,f,g) encode flat-test KILLED
M12 [Char] encode flat-test SURVIVED SURVIVED (dead code)
M13 PostAligned a encode flat-test KILLED
M14 PreAligned a encode flat-test SURVIVED KILLED
M15 Tree a encode flat-test SURVIVED KILLED
M16 Set a encode flat-test SURVIVED KILLED
M17 Recursivity encode plutus-ir-test KILLED
M18 Strictness encode plutus-ir-test KILLED
M19 Datatype encode plutus-ir-test KILLED
M20 Binding encode plutus-ir-test KILLED
M21 PIR Term encode plutus-ir-test KILLED
M22 PIR Program encode plutus-ir-test SURVIVED KILLED
M23 Name size plutus-ir-test KILLED
M24 Version size plutus-ir-test SURVIVED KILLED
M25 VarDecl size plutus-ir-test KILLED
M26 TyVarDecl size plutus-ir-test KILLED
M27 TPLC Program size plutus-core-test SURVIVED KILLED
M28 NamedDeBruijn size plutus-core-test SURVIVED KILLED
M29 K encode plutus-core-test KILLED
M30 Quantity encode plutus-core-test KILLED
M31 SrcSpan encode plutus-core-test SURVIVED KILLED
M32 SrcSpans encode plutus-core-test SURVIVED KILLED

Dead code removed: [Char] instance

While investigating the [Char] survivor, I found that PlutusCore.Flat.Instances.Extra — which defines an OVERLAPPING Flat [Char] instance — was never imported by PlutusCore.Flat.Instances (the umbrella module). GHC resolved Flat String to the generic [a] instance everywhere. The module was dead code predating this PR. Removed it in this follow-up.

Copy link
Member

@zliu41 zliu41 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe the flat bug reported in #7542 is difficult to fix. It likely just requires a few lines of change - much smaller than what this PR does!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Dead flat code has bugs in it

2 participants