Add RPC provider failover for block ingestor #6430

Open
dimitrovmaksim wants to merge 4 commits into graphprotocol:master from dimitrovmaksim:rpc-provider-failover
@dimitrovmaksim
Resolves #6213

When the block ingestor's polling fails and the current RPC provider is unreachable, the ingestor now automatically switches to a healthy alternative provider. If no alternative providers are configured, the current one is retried indefinitely, as in the existing logic.

How it works

  1. On do_poll() failure, the current provider is probed via eth_blockNumber first — do_poll() can fail for non-RPC reasons (DB errors, chain reorgs), so switching would not help in those cases.
  2. If the current provider is unreachable, all other validated providers are probed in parallel. The first to respond is selected.
  3. If all providers are unreachable, the ingestor stays on the current provider and re-probes on the next failure.
  4. There is no automatic return to the original provider — the ingestor stays on whatever provider it switched to until that one fails.
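
The selection steps above can be sketched roughly as follows. This is an illustration only, not the code in this PR: it uses blocking calls and std threads instead of async tasks, and `Probe`, `Fixed`, and `pick_provider` are hypothetical names standing in for the real adapter trait and resolver.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical, synchronous stand-in for the async `is_reachable()` check.
pub trait Probe: Sync {
    fn is_reachable(&self) -> bool;
}

/// Hypothetical provider with a fixed health answer, for demonstration only.
pub struct Fixed(pub bool);

impl Probe for Fixed {
    fn is_reachable(&self) -> bool {
        self.0
    }
}

/// Returns the index of the provider to use after a failed poll.
pub fn pick_provider<P: Probe>(providers: &[P], current: usize) -> usize {
    // Step 1: if the current provider still answers, the failure was not
    // RPC-related (e.g. a DB error), so switching would not help.
    if providers[current].is_reachable() {
        return current;
    }

    // Step 2: probe every other provider in parallel; each healthy one
    // reports its index on the channel.
    let (tx, rx) = mpsc::channel();
    thread::scope(|s| {
        for (idx, p) in providers.iter().enumerate() {
            if idx == current {
                continue;
            }
            let tx = tx.clone();
            s.spawn(move || {
                if p.is_reachable() {
                    let _ = tx.send(idx);
                }
            });
        }
        // Drop the original sender so `recv` returns an error once every
        // probe thread has finished without reporting success.
        drop(tx);

        // Step 3: the first healthy response wins; if none arrives, stay on
        // the current provider and re-probe on the next failure.
        rx.recv().unwrap_or(current)
    })
}
```

With providers `[Fixed(false), Fixed(false), Fixed(true)]` and `current = 0`, the sketch selects index 2. Because nothing ever moves the selection back (step 4), a later call with `current = 2` keeps index 2 for as long as that provider stays reachable.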

Other changes

  • latest_block_ptr retries are now capped at ENV_VARS.request_retries instead of being unbounded, so failures surface to the failover logic.
  • retry_strategy uses saturating_sub to avoid underflow when limit is 0.
  • Added is_reachable() to the EthereumAdapter trait for lightweight provider health checks.
  • Added all_cheapest() to EthereumNetworkAdapters to expose all validated providers.
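
For context on the saturating_sub change, here is a minimal sketch of the underflow it guards against. It assumes the strategy derives a retry count by subtracting one from an attempt limit; the actual expression inside retry_strategy is not shown in this description, and `retries_after_first_attempt` is a hypothetical helper.

```rust
/// Hypothetical helper: retries left after the initial attempt, given a
/// total attempt limit. The real `retry_strategy` may compute this differently.
pub fn retries_after_first_attempt(limit: usize) -> usize {
    // A plain `limit - 1` panics in debug builds and wraps to `usize::MAX`
    // in release builds when `limit` is 0. `saturating_sub` clamps at 0.
    limit.saturating_sub(1)
}

/// The same subtraction with the failure made explicit: `None` on underflow.
pub fn retries_checked(limit: usize) -> Option<usize> {
    limit.checked_sub(1)
}
```

A limit of 0 therefore yields 0 retries rather than a panic or an effectively unbounded count.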

Use saturating_sub to avoid underflow when limit is set to 0.
When do_poll() fails, the ingestor now probes the current provider
before switching. If the current provider is unreachable, all
alternatives are probed in parallel and the first healthy one is
selected. This avoids unnecessary switches on non-RPC failures
(e.g. DB errors, chain reorgs).

Also limits latest_block_ptr retries to ENV_VARS.request_retries
so failures surface to the failover logic instead of retrying
indefinitely.
@fordN requested a review from isum March 10, 2026 15:49
Simplify resolve_provider_idx into resolve_provider by returning
a reference to the Arc<A> instead of a usize index, eliminating
the separate indexing step at the call site.
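
The difference between the two shapes can be sketched as below. The function names, the `healthy` predicate, and the slice-based signatures are assumptions for illustration; the real resolve_provider signature is not shown in this conversation.

```rust
use std::sync::Arc;

/// Index-returning shape (hypothetical): the caller must index back into
/// the slice to obtain the adapter.
pub fn first_healthy_idx<A>(
    providers: &[Arc<A>],
    healthy: impl Fn(&A) -> bool,
) -> Option<usize> {
    for (idx, p) in providers.iter().enumerate() {
        // `&**p` goes from `&Arc<A>` to `&A`.
        if healthy(&**p) {
            return Some(idx);
        }
    }
    None
}

/// Reference-returning shape (hypothetical): the caller receives the
/// `Arc<A>` directly and can clone it if ownership is needed.
pub fn first_healthy<A>(
    providers: &[Arc<A>],
    healthy: impl Fn(&A) -> bool,
) -> Option<&Arc<A>> {
    for p in providers {
        if healthy(&**p) {
            return Some(p);
        }
    }
    None
}
```

Returning the reference removes the chance of indexing a different slice than the one that was searched.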
Development

Successfully merging this pull request may close these issues.

Block ingestor retries failing provider indefinitely