Improve handling of large messages #4129

@tillrohrmann

Description

At the moment, a distributed Restate cluster can stall when trying to replicate large Store messages. Large Store messages can arise when services create large journal entries (e.g. when setting large state values) or when multiple messages are batched into a single Store message. Right now there is a hard upper limit of 32 MB on Store messages. If the system tries to replicate a message or batch of messages larger than 32 MB, the SequencerAppender retries the append indefinitely, even though it can never succeed because of the size limit. When this happens, the logs eventually show entries like the following:

2026-01-05T09:27:25.050405Z  WARN restate_bifrost::providers::replicated_loglet::sequencer::appender: Append wave failed, retrying with a new wave after 6.483829236s. Status is [N1(ERROR(7)), N2(ERROR(7)), N3(COMMITTED)] wave=7 loglet_id=0_1 first_offset=53 to_offset=56 length=4 otel.name="replicated_loglet::sequencer::appender: run"
2026-01-05T09:27:31.695533Z  WARN restate_bifrost::providers::replicated_loglet::sequencer::appender: Append wave failed, retrying with a new wave after 5.451028884s. Status is [N1(ERROR(8)), N2(ERROR(8)), N3(COMMITTED)] wave=8 loglet_id=0_1 first_offset=53 to_offset=56 length=4 otel.name="replicated_loglet::sequencer::appender: run"
2026-01-05T09:27:37.327455Z  WARN restate_bifrost::providers::replicated_loglet::sequencer::appender: Append wave failed, retrying with a new wave after 5.934103456s. Status is [N1(ERROR(9)), N2(ERROR(9)), N3(COMMITTED)] wave=9 loglet_id=0_1 first_offset=53 to_offset=56 length=4 otel.name="replicated_loglet::sequencer::appender: run"
2026-01-05T09:27:43.457925Z  WARN restate_bifrost::providers::replicated_loglet::sequencer::appender: Append wave failed, retrying with a new wave after 5.069349886s. Status is [N1(ERROR(10)), N2(ERROR(10)), N3(COMMITTED)] wave=10 loglet_id=0_1 first_offset=53 to_offset=56 length=4 otel.name="replicated_loglet::sequencer::appender: run"
2026-01-05T09:27:48.702249Z  WARN restate_bifrost::providers::replicated_loglet::sequencer::appender: Append wave failed, retrying with a new wave after 5.765945394s. Status is [N1(ERROR(11)), N2(ERROR(11)), N3(COMMITTED)] wave=11 loglet_id=0_1 first_offset=53 to_offset=56 length=4 otel.name="replicated_loglet::sequencer::appender: run"
2026-01-05T09:27:54.613599Z  WARN restate_bifrost::providers::replicated_loglet::sequencer::appender: Append wave failed, retrying with a new wave after 5.442828005s. Status is [N1(ERROR(12)), N2(ERROR(12)), N3(COMMITTED)] wave=12 loglet_id=0_1 first_offset=53 to_offset=56 length=4 otel.name="replicated_loglet::sequencer::appender: run"

Note that replication to a co-located log server still succeeds because it uses an in-memory communication channel that imposes no size restrictions.

A SequencerAppender in this state prevents any further messages from being appended to the log and thereby eventually brings the whole system down.

A helper service to reproduce the problem can be found here: https://github.com/tillrohrmann/large-state-service. Note that the problem does not occur with single-node clusters because the in-memory connection does not impose any size limits.
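One possible direction is to make the appender size-aware before handing a batch to the transport: split an oversized batch into several Store messages that each fit under the limit, and fail fast (instead of retrying forever) on a single record that can never fit. The sketch below is purely illustrative; the function and constant names are assumptions and do not come from the Restate codebase.

```rust
/// Hard upper limit on a Store message (the 32 MB limit described above).
const MAX_STORE_MSG_BYTES: usize = 32 * 1024 * 1024;

/// Greedily split a batch of records (given by their encoded sizes) into
/// chunks whose total size stays at or below `limit`. A single record that
/// is itself larger than `limit` can never be sent over the network, so it
/// is surfaced as `Err(index)` to let the caller fail fast rather than
/// retry indefinitely.
fn split_batch(sizes: &[usize], limit: usize) -> Result<Vec<Vec<usize>>, usize> {
    let mut chunks: Vec<Vec<usize>> = Vec::new();
    let mut current: Vec<usize> = Vec::new();
    let mut current_size = 0usize;

    for (idx, &size) in sizes.iter().enumerate() {
        if size > limit {
            // This record cannot fit into any Store message: hard error.
            return Err(idx);
        }
        if current_size + size > limit && !current.is_empty() {
            // Close the current chunk and start a new one.
            chunks.push(std::mem::take(&mut current));
            current_size = 0;
        }
        current.push(size);
        current_size += size;
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    Ok(chunks)
}

fn main() {
    // Two records of 20 MB each exceed a 32 MB limit together,
    // so they are split into two separate Store messages.
    let twenty_mb = 20 * 1024 * 1024;
    let chunks = split_batch(&[twenty_mb, twenty_mb], MAX_STORE_MSG_BYTES).unwrap();
    println!("chunks: {}", chunks.len());
}
```

This only addresses the batching case; a single journal entry above the limit would still need either a raised transport limit or chunked transfer, which the fail-fast error at least makes visible instead of silently looping.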
