Commit dde3881

feat(tools/talis): vendor talis deployment tool + Fibre experiment runner
Brings the celestia-app talis multi-cloud deploy tool into ev-node, plus the wiring needed to deploy a working Fibre DA aggregator end-to-end on top of it.

Verified via a fresh AWS run from this branch state: talis up → genesis → deploy → setup-fibre → start-fibre → fibre-bootstrap-evnode → loadgen reaches 24.57 MB/s at 99.7% ok over 60 s (3 validators on c6in.4xlarge + c6in.2xlarge bridge + c6in.8xlarge ev-node + c6in.2xlarge load-gen, all in us-east-1).

What this adds:

• tools/talis/: vendored from celestia-app's feat/fibre-payments. Provisions AWS / DO / GCP boxes for one or more validators + bridge + ev-node + load-gen, deploys binaries + init scripts, drives the Fibre setup-fibre + start-fibre flow, and ships an evnode-bootstrap step that scp's the bridge JWT and Fibre payment keyring onto each ev-node before its init script starts the daemon.
• tools/celestia-node-fiber/cmd/evnode-fibre/: long-lived aggregator runner that wires block.NewFiberDAClient on top of the celestia-node-fiber adapter. Compiled on each `talis genesis` and shipped to evnode-* hosts.
• tools/talis/cmd/evnode-txsim/: small Go load-gen that pumps the runner's HTTP /tx ingress for a fixed duration; deployed to the load-gen boxes and prints a single TXSIM: line on completion.
• tools/talis/Makefile: cross-compiles celestia-appd, the fibre server + load tool, the bridge/light celestia binary, and both runner binaries to linux/amd64 for talis genesis -b.

setup-fibre fixes uncovered during the verified run:

• The bash script for set-host now retries until the validator's host appears in `query valaddr providers`. The previous one-shot call relied on `--yes` returning the txhash before block inclusion; if the chain wasn't ready (validator just started, mempool full, parallel set-host calls on other validators contending) the tx silently bounced and the validator never registered. The Fibre client cached the partial set on startup, and uploads cascaded to "host not found" → "voting power: collected 0".
• The talis CLI side polls `query valaddr providers` after the per-validator scripts finish and refuses to return until all validators are registered. It waits up to a 5-minute deadline, then errors so the operator can re-run instead of pressing on with a broken Fibre setup.

What this changes in ev-node itself (load-bearing for the runner):

• pkg/config/config.go: new ApplyFiberDefaults() profile that flips the DA client to Fibre-friendly defaults (adaptive batching, 1 s DA.BlockTime, 50-deep pending-cache window) so a runner can opt in with one call.
• block/public.go: SetMaxBlobSize override so the runner can lift Celestia's 5 MiB default to Fibre's 120 MiB headroom; new NewFiberDAClient + Fibre type re-exports for the adapter.
• block/internal/da/{fiber_client.go,fiber/types.go,fibremock/}: Fibre adapter wired through the DA client interface, with a matching mock used by the testing tree.
• block/internal/submitting/da_submitter.go: per-stream upload workers, splitByBlobSize chunking, parallel signing pool, and an oversized-blob safety net that advances the cache instead of looping forever (which OOM'd the daemon under sustained Fibre stalls).
• core/sequencer/sequencing.go: ErrQueueFull sentinel for sequencer backpressure; pkg/sequencers/solo/sequencer.go honours a configurable mempool cap and returns ErrQueueFull on overflow so the reaper can back off instead of feeding an unbounded queue while the executor is paused on the pending-cache cap.

External dependency (documented in tools/talis/fibre.md):

• Sibling clone of celestia-app on a branch with feat/fibre-payments + sysrex/fibre_url_fix cherry-picked. Without the URL-parse fix the Fibre client rejects every host:port registration.

Tested:

- go test ./block/... ./pkg/... ./types/...: all green
- go build ./... + Makefile build-bins: all 6 binaries cross-compile cleanly to linux/amd64
- End-to-end AWS run from this branch state: TXSIM 24.57 MB/s, 99.7% ok on a 60 s sustained run
1 parent da7df02 commit dde3881

98 files changed

Lines changed: 20503 additions & 1091 deletions

.claude/scheduled_tasks.lock

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+{"sessionId":"e50bfb66-350f-42ce-a67f-1053bf478384","pid":32397,"procStart":"Sun Apr 26 13:39:16 2026","acquiredAt":1777338483302}

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -1,5 +1,7 @@
 coverage.txt
 *.out
+go.work
+go.work.sum
 proto/pb
 proto/tendermint
 types/pb/tendermint

apps/evm/cmd/run.go

Lines changed: 4 additions & 1 deletion
@@ -131,7 +131,10 @@ var RunCmd = &cobra.Command{
 			}()
 		}

-		return rollcmd.StartNode(logger, cmd, executor, sequencer, nodeKey, datastore, nodeConfig, genesis, node.NodeOptions{})
+		// fibre-experiment: StartNode now takes a FiberClient as the
+		// trailing arg. The EVM app doesn't wire Fiber — nil is fine
+		// as long as DA.Fiber.Enabled stays false in its config.
+		return rollcmd.StartNode(logger, cmd, executor, sequencer, nodeKey, datastore, nodeConfig, genesis, node.NodeOptions{}, nil)
 	},
 }

apps/grpc/cmd/run.go

Lines changed: 4 additions & 2 deletions
@@ -86,8 +86,10 @@ The execution client must implement the Evolve execution gRPC interface.`,
 			return err
 		}

-		// Start the node
-		return rollcmd.StartNode(logger, cmd, executor, sequencer, nodeKey, datastore, nodeConfig, genesis, node.NodeOptions{})
+		// Start the node. fibre-experiment: StartNode now takes a
+		// FiberClient as the trailing arg. The grpc app doesn't wire
+		// Fiber — nil is fine as long as DA.Fiber.Enabled stays false.
+		return rollcmd.StartNode(logger, cmd, executor, sequencer, nodeKey, datastore, nodeConfig, genesis, node.NodeOptions{}, nil)
 	},
 }

apps/testapp/cmd/run.go

Lines changed: 6 additions & 1 deletion
@@ -97,7 +97,12 @@ var RunCmd = &cobra.Command{
 			return err
 		}

-		return cmd.StartNode(logger, command, executor, sequencer, nodeKey, datastore, nodeConfig, genesis, node.NodeOptions{})
+		// fibre-experiment: StartNode now takes a FiberClient as the
+		// trailing arg. testapp doesn't wire Fiber here (the Fiber-
+		// aware runner lives in tools/talis/cmd/evnode-fibre); nil is
+		// fine as long as DA.Fiber.Enabled stays false in testapp's
+		// config.
+		return cmd.StartNode(logger, command, executor, sequencer, nodeKey, datastore, nodeConfig, genesis, node.NodeOptions{}, nil)
 	},
 }

block/internal/cache/manager.go

Lines changed: 0 additions & 14 deletions
@@ -76,10 +76,8 @@ type PendingManager interface {
 	GetPendingData(ctx context.Context) ([]*types.SignedData, [][]byte, error)
 	SetLastSubmittedHeaderHeight(ctx context.Context, height uint64)
 	GetLastSubmittedHeaderHeight() uint64
-	ResetInFlightHeaderRange(start, end uint64)
 	SetLastSubmittedDataHeight(ctx context.Context, height uint64)
 	GetLastSubmittedDataHeight() uint64
-	ResetInFlightDataRange(start, end uint64)
 	NumPendingHeaders() uint64
 	NumPendingData() uint64
 }
@@ -312,14 +310,6 @@ func (m *implementation) SetLastSubmittedHeaderHeight(ctx context.Context, heigh
 	m.pendingHeaders.SetLastSubmittedHeaderHeight(ctx, height)
 }

-func (m *implementation) ResetInFlightHeaderHeight() {
-	m.pendingHeaders.ResetInFlightHeaderRange(0, 0)
-}
-
-func (m *implementation) ResetInFlightHeaderRange(start, end uint64) {
-	m.pendingHeaders.ResetInFlightHeaderRange(start, end)
-}
-
 func (m *implementation) GetLastSubmittedDataHeight() uint64 {
 	return m.pendingData.GetLastSubmittedDataHeight()
 }
@@ -328,10 +318,6 @@ func (m *implementation) SetLastSubmittedDataHeight(ctx context.Context, height
 	m.pendingData.SetLastSubmittedDataHeight(ctx, height)
 }

-func (m *implementation) ResetInFlightDataRange(start, end uint64) {
-	m.pendingData.ResetInFlightDataRange(start, end)
-}
-
 func (m *implementation) NumPendingHeaders() uint64 {
 	return m.pendingHeaders.NumPendingHeaders()
 }

block/internal/cache/manager_test.go

Lines changed: 4 additions & 6 deletions
@@ -221,12 +221,6 @@ func TestPendingHeadersAndData_Flow(t *testing.T) {
 	// update last submitted heights and re-check
 	cm.SetLastSubmittedHeaderHeight(ctx, 1)
 	cm.SetLastSubmittedDataHeight(ctx, 2)
-	cm.ResetInFlightHeaderRange(1, 3)
-	cm.ResetInFlightDataRange(2, 3)
-
-	// numPending views (before getPending claims items)
-	assert.Equal(t, uint64(2), cm.NumPendingHeaders())
-	assert.Equal(t, uint64(1), cm.NumPendingData())

 	headers, _, err = cm.GetPendingHeaders(ctx)
 	require.NoError(t, err)
@@ -237,6 +231,10 @@ func TestPendingHeadersAndData_Flow(t *testing.T) {
 	require.NoError(t, err)
 	require.Len(t, signedData, 1)
 	assert.Equal(t, uint64(3), signedData[0].Height())
+
+	// numPending views
+	assert.Equal(t, uint64(2), cm.NumPendingHeaders())
+	assert.Equal(t, uint64(1), cm.NumPendingData())
 }

 func TestManager_TxOperations(t *testing.T) {
