autobahn: add data_prune_after to bound data.State memory (CON-256) #3375
wen-coding wants to merge 1 commit into main
data.State.runPruning is a background goroutine that drops in-memory blocks/QCs/AppProposals older than a configurable duration, but the config knob (data.Config.PruneAfter) was never wired up: giga_router constructed data.NewState with only Committee set, so the pruner never spawned. data.State.PruneBefore (the giga_router-driven path based on cosmos-sdk RetainHeight) is also a no-op when the chain is configured with pruning="nothing" (Sei's localnode default, common in test setups), so in-memory data.State grew with the chain under sustained load and eventually OOM-killed nodes.

Plumb DataPruneAfter through:
`AutobahnFileConfig.data_prune_after` (json) → `GigaRouterConfig.DataPruneAfter` → `data.Config.PruneAfter` → `data.State.runPruning`.

Production default (gen-autobahn-config): 30m, which gives operators plenty of recent history for /block, /tx, /trace_*, etc. while bounding memory under load.

Localnode/test override (step4_config_override.sh): 1m, which keeps data.State small under sustained-throughput tests where cosmos pruning is "nothing".
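To make the plumbing concrete, here is a minimal sketch of how a `data_prune_after` JSON field could be turned into a duration that gates the pruner. It is not the code from this PR: `FileConfig`, `StateConfig`, `ParseStateConfig` and `PrunerEnabled` are hypothetical stand-ins for `AutobahnFileConfig`, `data.Config` and the wiring in giga_router, and it assumes the field is a Go duration string such as "30m" (the real config may encode durations differently).

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// FileConfig is a hypothetical stand-in for AutobahnFileConfig and carries
// only the field discussed in this PR.
type FileConfig struct {
	// Assumption: the value is a Go duration string such as "30m" or "1m".
	DataPruneAfter string `json:"data_prune_after,omitempty"`
}

// StateConfig is a hypothetical stand-in for data.Config.
type StateConfig struct {
	PruneAfter time.Duration // zero means time-based pruning is disabled
}

// ParseStateConfig decodes the JSON file config and converts the optional
// data_prune_after field into a StateConfig.
func ParseStateConfig(raw []byte) (StateConfig, error) {
	var fc FileConfig
	if err := json.Unmarshal(raw, &fc); err != nil {
		return StateConfig{}, fmt.Errorf("decode config: %w", err)
	}
	if fc.DataPruneAfter == "" {
		// Field omitted: the operator opted out of time-based pruning.
		return StateConfig{}, nil
	}
	d, err := time.ParseDuration(fc.DataPruneAfter)
	if err != nil {
		return StateConfig{}, fmt.Errorf("data_prune_after: %w", err)
	}
	if d <= 0 {
		return StateConfig{}, fmt.Errorf("data_prune_after must be positive, got %s", d)
	}
	return StateConfig{PruneAfter: d}, nil
}

// PrunerEnabled reports whether a background pruner should be started.
func (c StateConfig) PrunerEnabled() bool {
	return c.PruneAfter > 0
}

func main() {
	inputs := []string{
		`{"data_prune_after":"30m"}`, // production default described above
		`{"data_prune_after":"1m"}`,  // localnode/test override described above
		`{}`,                         // field dropped: pruner stays off
	}
	for _, raw := range inputs {
		cfg, err := ParseStateConfig([]byte(raw))
		fmt.Println(raw, cfg.PruneAfter, cfg.PrunerEnabled(), err)
	}
}
```

Treating an omitted field as "disabled" rather than silently applying a default keeps the knob opt-out in this sketch; the default would then live in the generated config, not in the parser.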
Codecov Report: coverage diff, main vs. #3375

| | main | #3375 | +/- |
|---|---|---|---|
| Coverage | 59.17% | 59.17% | -0.01% |
| Files | 2097 | 2097 | |
| Lines | 172641 | 172648 | +7 |
| Hits | 102163 | 102167 | +4 |
| Misses | 61615 | 61617 | +2 |
| Partials | 8863 | 8864 | +1 |
FYI, pruneAfter in data was used in sei-v3, but in sei-chain it is the application that is responsible for pruning, via the retainHeight field in ResponseCommit. I don't currently know whether we want to move ownership of pruning from the application to consensus. IMO it would make sense, given that the application should be concerned solely with the latest state at all times. However, the sei-chain app may make some assumptions about which blocks are available (I can imagine that it does, but I haven't looked into that yet).
Sorry, I'm confused. I'm not planning to change pruning ownership in this PR (although we can discuss whether that should be done; I'm generally of the opinion that this is consensus cleanup and should probably be controlled via consensus). I just want to set a shorter prune period in tests, so that on less powerful machines (my Mac) we can still run long throughput tests without the validators getting OOM-killed. The 30m default in gen-autobahn-config is a defensive cap (still opt-out: operators can drop the field), not a replacement for app-driven pruning.
Currently pruning is driven by retainHeight, computed in sei-chain/sei-cosmos/baseapp/abci.go (line 680 at 52d368f).
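For readers following the thread, the difference between the two pruning paths can be illustrated with a toy model. This is a sketch under stated assumptions, not the data.State implementation: `Store`, `PruneOlderThan`, `dropWhile` and `Simulate` are hypothetical names, only the name `PruneBefore` comes from the PR, and a retain height of 0 is assumed to mean "retain everything", which matches how the description characterises pruning="nothing".

```go
package main

import (
	"fmt"
	"time"
)

type block struct {
	height   int64
	received time.Time
}

// Store is a hypothetical stand-in for the in-memory part of data.State.
type Store struct {
	blocks []block // ordered by height, oldest first
}

// Add appends a block received at time now.
func (s *Store) Add(height int64, now time.Time) {
	s.blocks = append(s.blocks, block{height: height, received: now})
}

// Len returns the number of blocks currently held in memory.
func (s *Store) Len() int { return len(s.blocks) }

// PruneBefore models the application-driven path: drop blocks below
// retainHeight. A retain height of 0 is assumed to mean "keep everything".
func (s *Store) PruneBefore(retainHeight int64) int {
	if retainHeight <= 0 {
		return 0
	}
	return s.dropWhile(func(b block) bool { return b.height < retainHeight })
}

// PruneOlderThan models the time-driven path: drop blocks received more than
// pruneAfter before now. A non-positive pruneAfter disables it.
func (s *Store) PruneOlderThan(now time.Time, pruneAfter time.Duration) int {
	if pruneAfter <= 0 {
		return 0
	}
	cutoff := now.Add(-pruneAfter)
	return s.dropWhile(func(b block) bool { return b.received.Before(cutoff) })
}

// dropWhile removes the leading blocks for which old returns true and
// reports how many were removed.
func (s *Store) dropWhile(old func(block) bool) int {
	n := 0
	for n < len(s.blocks) && old(s.blocks[n]) {
		n++
	}
	// Copy the tail so the dropped prefix can be garbage collected.
	s.blocks = append([]block(nil), s.blocks[n:]...)
	return n
}

// Simulate adds one block per second for the given number of seconds,
// running both pruning paths after each block, and returns how many blocks
// remain in memory at the end.
func Simulate(seconds int, retainHeight int64, pruneAfterSec int) int {
	s := &Store{}
	start := time.Unix(0, 0)
	for i := 1; i <= seconds; i++ {
		now := start.Add(time.Duration(i) * time.Second)
		s.Add(int64(i), now)
		s.PruneBefore(retainHeight)
		s.PruneOlderThan(now, time.Duration(pruneAfterSec)*time.Second)
	}
	return s.Len()
}

func main() {
	fmt.Println("retainHeight=0, no time pruning:", Simulate(600, 0, 0))
	fmt.Println("retainHeight=0, prune after 60s:", Simulate(600, 0, 60))
}
```

In this model, with retain height stuck at 0 the store holds every block it has ever seen, while a 60-second window keeps it bounded regardless of what the application reports.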
Things done