
Releases: ar-io/ar-io-node

Release 77

24 Apr 22:39


This is a recommended release focused on cross-gateway GraphQL fan-out, ClickHouse query-path hardening, and composite query resilience. Key highlights include GatewaysGqlQueryable, a new adapter that fans GraphQL queries out to configured upstream ar-io-node gateways and merges the results, letting a node compose its local index with broader upstream coverage, and a parallelized composite ClickHouse/SQLite GraphQL path protected by a SQLite circuit breaker that surfaces PARTIAL_RESULT warnings via extensions.warnings instead of silent partials. ClickHouse gets several query-path improvements: dropping FINAL in favor of LIMIT 1 BY dedupe to re-enable projection planning, a new owner_address bloom filter with projection skipping on tag filters, a tag_names / tag_values fix for owner_projection, a configurable query timeout (default 3s), and a max_rows_to_read guardrail that fails noisy full scans fast. The release also adds per-job status tracking to the Parquet export admin API, bundles an Observer update to ddd3a9c with reference-gateway chunk-header offset validation and continuous-observer reliability hardening, and includes a set of ClickHouse auto-import reliability fixes.

Added

  • Fan-Out GraphQL Over Upstream Gateways (GatewaysGqlQueryable): A new GqlQueryable adapter fans GraphQL queries out to configured upstream ar-io-node gateways and merges the results, letting a node act as a thin fan-out proxy or compose its local index with upstream sources for broader coverage. Single-record queries use first-non-null resolution; connection queries k-way merge by the ar-io-node cursor tuple and dedupe by id. Per-endpoint circuit breakers isolate slow or failing upstreams. Configured via GATEWAYS_GQL_URLS; disabled by default.
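
The connection-query merge can be sketched as follows; the edge and cursor shapes here are simplified assumptions, not the actual GatewaysGqlQueryable types:

```typescript
interface Edge {
  id: string;
  height: number;
  blockTransactionIndex: number;
}

// Compare by the cursor tuple, newest first (height desc, then index desc).
function byCursorDesc(a: Edge, b: Edge): number {
  if (a.height !== b.height) return b.height - a.height;
  return b.blockTransactionIndex - a.blockTransactionIndex;
}

// K-way merge of per-gateway result lists, deduplicating by id and
// truncating to the requested page size.
function mergeConnections(lists: Edge[][], pageSize: number): Edge[] {
  const seen = new Set<string>();
  const merged: Edge[] = [];
  for (const edge of lists.flat().sort(byCursorDesc)) {
    if (seen.has(edge.id)) continue;
    seen.add(edge.id);
    merged.push(edge);
    if (merged.length >= pageSize) break;
  }
  return merged;
}
```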

  • Configurable ClickHouse GraphQL Query Timeout: The ClickHouse GQL backend now applies a configurable timeout both server-side (as max_execution_time, so ClickHouse aborts runaway queries and frees resources) and client-side (as the HTTP request_timeout, with a 2s grace window so the server-side timeout error surfaces before the client aborts). Default 3s.
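
A minimal sketch of the two-sided timeout, with assumed helper names (the real wiring passes these values through the ClickHouse client configuration):

```typescript
// Client-side grace window so the server-side max_execution_time error
// surfaces before the HTTP request is aborted.
const GRACE_MS = 2000;

function queryTimeouts(timeoutSeconds: number) {
  return {
    // Server side: ClickHouse aborts the query and frees resources.
    clickhouseSettings: { max_execution_time: timeoutSeconds },
    // Client side: HTTP request_timeout, padded by the grace window.
    requestTimeoutMs: timeoutSeconds * 1000 + GRACE_MS,
  };
}
```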

  • max_rows_to_read Guardrail on ClickHouse GraphQL Queries: Every GraphQL query against the ClickHouse transactions table now appends SETTINGS max_rows_to_read = N. Queries that would scan more than the configured threshold throw Code: 158: Limit for rows ... exceeded instead of silently scanning the whole table — catches projection-shadowing bugs and planner regressions where a skip index is bypassed. Default 10M rows (~20% of current table size); tunable via CLICKHOUSE_GQL_MAX_ROWS_TO_READ.
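
An illustrative sketch of appending the guardrail to a generated query (the function name is an assumption, not the actual query builder):

```typescript
// Append a row-scan ceiling to a generated ClickHouse query. Queries that
// would exceed it fail fast with Code: 158 instead of scanning the table.
function withRowGuardrail(sql: string, maxRows: number): string {
  return `${sql} SETTINGS max_rows_to_read = ${maxRows}`;
}
```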

  • Per-Job Status Tracking for Parquet Export API: POST /ar-io/admin/export-parquet now returns a jobId, and the exporter keeps a bounded per-job history (32 entries) so concurrent callers can each poll their own record at GET /ar-io/admin/export-parquet/status/:jobId. The legacy singleton status endpoint is retained for back-compat and still reflects the most-recent update. scripts/parquet-export prefers the per-job endpoint when a jobId is returned and falls back to the singleton-with-drift-detection path for older gateways.
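
The bounded history can be sketched as a small insertion-ordered map (a simplification of the real per-job state):

```typescript
const MAX_JOBS = 32; // bounded per-job history size

class JobStatusStore {
  private jobs = new Map<string, { status: string }>();

  set(jobId: string, status: string): void {
    // Re-insert so updated jobs move to the back of the eviction order.
    this.jobs.delete(jobId);
    this.jobs.set(jobId, { status });
    // Map iteration order is insertion order, so the first key is oldest.
    while (this.jobs.size > MAX_JOBS) {
      const oldest = this.jobs.keys().next().value as string;
      this.jobs.delete(oldest);
    }
  }

  get(jobId: string): { status: string } | undefined {
    return this.jobs.get(jobId);
  }
}
```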

Changed

  • Observer Update to ddd3a9c: Bundles two upstream PRs on top of the previous 21098d2 pin.

    • Reference-gateway chunk-header offset validation: The observer now HEADs the reference gateway's /chunk/{offset}/data and anchors the advertised x-arweave-chunk-* headers (tx id, boundaries, data root) to the chain via /tx/{id}/offset and /tx/{id}, replacing the block-and-tx binary search as the default offset-validation path. Typical cost drops from ~20–30 node lookups per offset to one HEAD plus two O(1) lookups per unique tx, with a per-tx LRU cache for repeated offsets. Any header/chain mismatch or missing header falls back to the legacy chain search, so older gateways keep working. New metric observer_chunk_metadata_anchor_total{result} (hit / cache_hit / metadata_missing / mismatch / error / fallback) tracks the rollout. Gateways that return an HTTP error on the new probe are no longer blacklisted from the shared pool; only transport failures now trigger blacklisting.
    • Continuous observer reliability hardening: The per-gateway schedule map is replaced with a flat list of ScheduledObservation events so duplicates, restart catch-up, and overdue retries are deterministic (legacy state auto-migrates on load). An explicit submission deadline (windowEnd + submissionBufferMs) now bounds the epoch — once exceeded, the scheduler clears pending work, marks the epoch expired, and stops issuing observations instead of spinning on stale state. Finalization is gated on both the window being complete and the pending queue being empty, and only flips reportSubmitted on a successful submit so transient submit failures retry. Unsubmitted prior epochs are discarded on epoch transition rather than force-finalized into the wrong epoch.
    • Report telemetry: Reports now record each gateway's release field from /ar-io/info, a yarn summarize script prints pass/fail counts grouped by release, and offset rendering now shows <failures>/<observed> (<pct>) so the denominator reflects the sampled subset.
  • ClickHouse GraphQL query no longer uses FINAL: The composite ClickHouse backend previously issued FROM transactions AS t FINAL to deduplicate unmerged ReplacingMergeTree versions at read time. FINAL prevented owner_projection from being selected and forced a PrimaryKeyExpand that widened the skip-index-pruned granule set by ~4×. It is replaced with a LIMIT 1 BY height, block_transaction_index, is_data_item, id clause that dedupes in-engine as a post-sort filter without disabling projection planning or PREWHERE push-down. Safe because Arweave transaction data is immutable: all versions of a given primary key are byte-identical by construction.
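
A simplified before/after of the query shape (column list abbreviated; the real queries are generated by the ClickHouse GQL backend):

```typescript
// Old shape: FINAL dedupes at read time but disables projection planning.
const before = `
  SELECT id FROM transactions AS t FINAL
  WHERE owner_address = {owner:String}
`;

// New shape: LIMIT 1 BY dedupes as a post-sort filter, leaving projection
// selection and PREWHERE push-down intact.
const after = `
  SELECT id FROM transactions AS t
  WHERE owner_address = {owner:String}
  LIMIT 1 BY height, block_transaction_index, is_data_item, id
`;
```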

  • Composite ClickHouse GraphQL Parallelized With SQLite Circuit Breaker: The CompositeClickHouseDatabase now runs its ClickHouse and SQLite legs concurrently instead of serially, and wraps the SQLite leg in an opossum circuit breaker. ClickHouse errors (timeout, max_rows_to_read) still propagate to the caller, while SQLite failures degrade the response to ClickHouse-only results with a PARTIAL_RESULT warning attached via GraphQL extensions.warnings — ending silent partials for tip-of-chain rows and for the single-record transaction(id) lookup, which previously returned a bare null when SQLite was unavailable. The ClickHouse max-height boundary-optimization cache is now read non-blocking from the request path, with a background refresh keeping it warm. Fan-out preserves warnings end-to-end: RemoteGqlQueryable pulls upstream extensions.warnings off each response, GatewaysGqlQueryable merges them across sources, and synthesizes UPSTREAM_UNAVAILABLE / UPSTREAM_CIRCUIT_OPEN warnings for partially-failed aggregates that were previously logged-and-dropped. New env vars under CLICKHOUSE_SQLITE_CIRCUIT_BREAKER_* (defaults: timeout 5000ms, error threshold 80%, reset timeout 60000ms, rolling window 30000ms).
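
A minimal sketch of the degrade path, assuming simplified result shapes. This is not the actual opossum wiring, which additionally tracks error rates and open/half-open/closed breaker state; it only illustrates "ClickHouse errors propagate, SQLite failures degrade to a warning":

```typescript
interface GqlResult {
  edges: string[];
  warnings: string[];
}

async function compositeQuery(
  clickhouseLeg: () => Promise<string[]>,
  sqliteLeg: () => Promise<string[]>,
): Promise<GqlResult> {
  // Run both legs concurrently; only the SQLite leg is allowed to fail.
  const [chEdges, sqliteOutcome] = await Promise.all([
    clickhouseLeg(), // ClickHouse errors still propagate to the caller
    sqliteLeg().then(
      (edges) => ({ ok: true as const, edges }),
      () => ({ ok: false as const, edges: [] as string[] }),
    ),
  ]);
  // A SQLite failure degrades to ClickHouse-only results plus a warning
  // surfaced via GraphQL extensions.warnings.
  const warnings = sqliteOutcome.ok ? [] : ['PARTIAL_RESULT'];
  return { edges: [...chEdges, ...sqliteOutcome.edges], warnings };
}
```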

  • ClickHouse owner_address Bloom + Skip Projection on Tag Filters: ClickHouse projections cannot carry inline skip indexes, so owner+tag GraphQL queries that routed through owner_projection scanned every granule within the owner range. An owner_address bloom filter is now defined on the main transactions table, and the per-query optimize_use_projections = 0 guard is extended to tag filters. Owner-only queries still benefit from owner_projection's sort order; owner+tag queries now fall back to the main table where id_bloom / tag_names_bloom / tag_values_bloom / owner_address_bloom can prune granules across all three dimensions. Existing deployments get the index registered via an idempotent ALTER TABLE ... ADD INDEX IF NOT EXISTS on the next clickhouse-import cycle; a manual MATERIALIZE INDEX owner_address_bloom is required to populate the index on existing parts.

  • Parquet Export Defaults to Include L1 Transactions and Tags: ParquetExporter.export() defaults now align with the scripts/parquet-export CLI wrapper and the auto-verify harness, both of which already included L1 by default. Callers that want L2-only output must now pass skipL1Transactions / skipL1Tags explicitly.

Fixed

  • ClickHouse owner_projection now usable for tag-filtered owner queries: The projection was previously defined with SELECT *, which in ClickHouse excludes MATERIALIZED columns — so tag_names and tag_values were absent from the projection and the optimizer rejected it for any query with predicates on those columns (which includes all tag-filtered GraphQL queries). The projection body is now SELECT *, tag_names, tag_values, so the optimizer picks owner_projection for owner-scoped queries and reads orders of magnitude fewer granules. Existing deployments need a one-time manual migration (DROP PROJECTION / ADD PROJECTION / MATERIALIZE PROJECTION) — see the inline comment in src/database/clickhouse/schema.sql. Fresh deployments get the corrected projection from the CREATE TABLE body with no operator action required.

  • GraphQL Block.timestamp Non-Nullable Field Error: Addresses a "Cannot return null for non-nullable field Block.timestamp" error that could surface when resolving blocks with incomplete data.

  • GraphQL Data Item Signature Fetch Falls Back to NOT_FOUND: The data-item path in resolveTxSignature returned the fetcher result directly, so an undefined from SignatureFetcher.getDataItemSignature (e.g., missing attributes or a stream failure reading from the parent bundle) would trigger a "Cannot return null for non-nullable field" error on the String! signature field. The data-item path now mirrors the transaction path and falls back to `NOT_FOUND`.


Release 76

17 Apr 19:32


[Release 76] - 2026-04-17

This is a recommended release focused on response signing, ClickHouse data lifecycle management, and query-path efficiency. Key highlights include RFC 9421 HTTP Message Signatures for cryptographically verifiable gateway responses, tag-based TTL rules for ClickHouse-exported data so operators can expire indexed rows by tag or uploader, and a major ClickHouse schema consolidation into a single partitioned transactions table with bloom filter skip indexes and native projections. It also adds per-host APEX_ARNS_NAME mapping, Parquet export partition progress in the status API, and a clickhouse-import --flat-dir mode. GraphQL gets two performance improvements — skipping SQLite for heights already covered by ClickHouse and skipping the owner.key fetch when only owner.address is selected — plus a fix for duplicate transaction results from ClickHouse and correctly-populated indexedAt / blockPreviousBlock fields.

Added

  • RFC 9421 HTTP Message Signatures for Gateway Responses: The gateway can now sign responses using RFC 9421 HTTP Message Signatures, allowing clients to cryptographically verify response integrity and origin. Controlled by HTTPSIG_ENABLED (default: false) with an Ed25519 key auto-generated at HTTPSIG_KEY_FILE (default: data/keys/httpsig.pem). HTTPSIG_BIND_REQUEST (default: true) binds each response to the triggering request via @method;req and @path;req. An attestation linking the key to the operator wallet is uploaded to Arweave on startup when HTTPSIG_UPLOAD_ATTESTATION=true and OBSERVER_WALLET is set. HTTPSIG response metadata is documented in OpenAPI for /ar-io/info and data endpoints.

  • Tag-Based TTL Rules for ClickHouse-Exported Data: Operators can now expire rows in the ClickHouse transactions table by tag content or uploader owner address. Rules are declared in config/clickhouse-ttl-rules.yaml (copy from the committed .example.yaml template) and loaded at the top of every clickhouse-auto-import cycle into four source tables, with exact-match lookups going through refreshing COMPLEX_KEY_HASHED dictionaries and prefix matches falling back to scanned tables. Native TTL enforcement deletes rows when expires_at elapses. Supports a top-level default_ttl_seconds fallback and per-rule never_expire: true exemptions (precedence: exempt > shortest TTL match > default > NULL). v1 applies only to rows imported after rules are loaded; no backfill. The loader fails open on missing/malformed rules to avoid blocking imports.
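
The stated precedence (exempt > shortest TTL match > default > NULL) can be sketched as follows; the rule and field names are assumptions, not the actual loader types:

```typescript
interface TtlMatch {
  ttlSeconds: number | null;
  neverExpire?: boolean;
}

// Returns seconds until expiry, or null for "never expires".
function effectiveTtl(
  matches: TtlMatch[],
  defaultTtlSeconds?: number,
): number | null {
  // 1. An exempt rule wins outright.
  if (matches.some((m) => m.neverExpire)) return null;
  // 2. Among matching rules, the shortest TTL wins.
  const ttls = matches
    .map((m) => m.ttlSeconds)
    .filter((t): t is number => t !== null);
  if (ttls.length > 0) return Math.min(...ttls);
  // 3. Fall back to default_ttl_seconds, else NULL (no expiry).
  return defaultTtlSeconds ?? null;
}
```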

  • Per-Host APEX_ARNS_NAME Mapping: APEX_ARNS_NAME now accepts a comma-separated list of values positionally mapped to ARNS_ROOT_HOST entries (e.g., APEX_ARNS_NAME=turbo,ar-io with ARNS_ROOT_HOST=arweave.dev,g8way.io). A single value still applies to all hosts.
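
A sketch of the positional mapping, assuming simple comma-separated parsing (the helper name is hypothetical):

```typescript
function apexNameForHost(
  apexNames: string, // APEX_ARNS_NAME value
  rootHosts: string, // ARNS_ROOT_HOST value
  host: string,
): string | undefined {
  const names = apexNames.split(',').map((s) => s.trim());
  const hosts = rootHosts.split(',').map((s) => s.trim());
  // A single value still applies to every root host.
  if (names.length === 1) return names[0];
  const i = hosts.indexOf(host);
  return i >= 0 ? names[i] : undefined;
}
```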

  • Parquet Export Partition Progress in Status API: The admin status endpoint and the parquet-export CLI poll loop now surface the current partition range and completed/total partition counts while a Parquet export is in progress.

  • clickhouse-import --flat-dir mode: scripts/clickhouse-import accepts a flat directory of Parquet files named <table>-minHeight:<min>-maxHeight:<max>-rowCount:<n>.parquet (blocks / transactions / tags all in the same directory), as an alternative to the default <table>/data/height=<min>-<max>/*.parquet Hive layout.

Changed

  • ClickHouse schema consolidation: The ClickHouse GQL backend now uses a single transactions table with partitioning by height, bloom filter skip indexes on id and tags, and native projections for owner and recipient queries, replacing the previous four-table design (transactions, id_transactions, owner_transactions, target_transactions). Column codecs (Delta + ZSTD) and LowCardinality on content_type / signature_type reduce storage. The GQL query layer uses hasAny for multi-value tag filters and tuple-comparison cursor pagination. Requires ClickHouse 24.8 or later and a one-time full re-import from Parquet — see docs/parquet-and-clickhouse-usage.md.
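
Tuple-comparison cursor pagination can be sketched as follows; the ClickHouse predicate compares tuples lexicographically, so one comparison replaces the usual `(h < x) OR (h = x AND i < y)` expansion. The clause builder is a simplified assumption:

```typescript
// Sketch of the SQL predicate a descending cursor walk would generate.
function afterClause(height: number, blockTxIndex: number): string {
  return `(height, block_transaction_index) < (${height}, ${blockTxIndex})`;
}

// The same lexicographic ordering, expressed directly in TypeScript.
function tupleLess(a: [number, number], b: [number, number]): boolean {
  return a[0] !== b[0] ? a[0] < b[0] : a[1] < b[1];
}
```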

  • Skip SQLite for Heights Covered by ClickHouse in GraphQL: Opt-in optimization in CompositeClickHouseDatabase raises the SQLite fallback's minHeight to (clickhouseMax - buffer + 1) and skips the SQLite call entirely when the adjusted range is empty. Controlled by CLICKHOUSE_SQLITE_MIN_HEIGHT_ENABLED (default: false), with a configurable safety buffer (default: 10 heights) and a cached ClickHouse max-height lookup (default TTL: 60s). Degrades to prior behavior on lookup failure.
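
The range adjustment can be sketched as follows (names and shapes are assumptions):

```typescript
const BUFFER = 10; // safety buffer below the ClickHouse max height

// Returns the narrowed SQLite range, or undefined when ClickHouse already
// covers the whole requested range and the SQLite call can be skipped.
function sqliteRange(
  requestedMin: number,
  requestedMax: number,
  clickhouseMax: number,
): { minHeight: number; maxHeight: number } | undefined {
  const minHeight = Math.max(requestedMin, clickhouseMax - BUFFER + 1);
  if (minHeight > requestedMax) return undefined;
  return { minHeight, maxHeight: requestedMax };
}
```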

  • GraphQL owner.key Fetch Skipped When Only owner.address Requested: Splitting the Transaction.owner resolver into field-level Owner.address and Owner.key resolvers lets GraphQL skip the per-row owner key fetch unless key is explicitly selected. Memoization on the Owner parent still fetches the key only once when multiple aliased selections or overlapping fragments request it in the same query.

  • Default ClickHouse Image Bumped to 26.3: The default ClickHouse container image used by clickhouse-auto-import is now 26.3.

  • Observer Image Updated: OBSERVER_IMAGE_TAG bumped to include epoch source fixes.

Fixed

  • ClickHouse GQL indexedAt and blockPreviousBlock fields: These fields were previously always returned as undefined because the base SELECT omitted the corresponding columns. They are now populated.

  • Duplicate GraphQL Transaction Results from ClickHouse: The transactions table uses ReplacingMergeTree(inserted_at), which only deduplicates during background merges. Queries now use FINAL so GraphQL returns a single edge per id instead of every un-merged version.

  • Tag Headers on Manifest-Resolved Responses: When a manifest path resolves to an inner data item, X-Arweave-Tag-* headers are now populated from the resolved inner item rather than the manifest transaction.

  • Turbo Fallback Narrowed to Module-Not-Found: The optional Turbo upload path now falls back only on module-not-found (instead of swallowing unrelated errors) and requires an explicit trigger header before signing.

Image SHAs

Release 75

08 Apr 20:06


This is a recommended release focused on on-demand data item resolution and response header enrichment. Key highlights include tag and verification response headers that expose transaction tags and cryptographic metadata directly in HTTP responses, on-demand data item metadata resolution that resolves unindexed data items by parsing ANS-104 bundle binaries on the fly, HyperBEAM as a root TX offset source for efficient bundle navigation without full downloads, and GraphQL on-demand transaction resolution for querying unindexed data items. It also adds configurable chunk GET retry behavior to reduce worst-case retrieval times and Prometheus metrics for root TX semaphore observability.

Added

  • Tag and Verification Response Headers: Data responses on /raw/:id and /:id now include X-Arweave-Tag-* headers with transaction/data item tags, plus verification headers (X-Arweave-Signature, X-Arweave-Owner, X-Arweave-Owner-Address, X-Arweave-Target, X-Arweave-Anchor, X-Arweave-Signature-Type). Enabled by default (ARWEAVE_TAG_RESPONSE_HEADERS_ENABLED=true). Uses a fast local-only resolution path (LMDB txStore -> LRU cache -> GQL DB) with background indexing for uncached items. Includes a configurable byte budget (ARWEAVE_TAG_RESPONSE_HEADERS_MAX_BYTES, default 8KB) and tag count cap (ARWEAVE_TAG_RESPONSE_HEADERS_MAX, default 100). For L2 data item signatures and owner keys, WRITE_ANS104_DATA_ITEM_DB_SIGNATURES=true is also required.
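
The byte budget and tag cap might look roughly like this (helper name is an assumption; the defaults match the values above):

```typescript
interface Tag { name: string; value: string }

// Keep tags until either the tag-count cap or the byte budget is hit.
function tagsWithinBudget(
  tags: Tag[],
  maxBytes = 8 * 1024, // ARWEAVE_TAG_RESPONSE_HEADERS_MAX_BYTES
  maxTags = 100,       // ARWEAVE_TAG_RESPONSE_HEADERS_MAX
): Tag[] {
  const kept: Tag[] = [];
  let bytes = 0;
  for (const tag of tags.slice(0, maxTags)) {
    const size = Buffer.byteLength(tag.name) + Buffer.byteLength(tag.value);
    if (bytes + size > maxBytes) break;
    bytes += size;
    kept.push(tag);
  }
  return kept;
}
```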

  • On-Demand Data Item Metadata Resolution: Data items not yet indexed locally are resolved on-demand by discovering the root bundle, parsing the binary header, and extracting signature/owner/tags. Results are cached in an LRU cache and persisted to the database for future requests. Background resolution is capped at 1 concurrent operation (configurable via TX_METADATA_RESOLVE_CONCURRENCY) with fail-fast semantics.

  • HyperBEAM Root TX Offset Source (PE-9043): HyperBEAM can now be used as a root transaction offset source for on-demand data item resolution. Uses offset-guided recursive bundle index navigation to extract complete data item metadata without downloading full bundles. Controlled by HYPERBEAM_ROOT_TX_ENABLED and HYPERBEAM_ENDPOINT (default: arweave.net).

  • Configurable Chunk GET Retry Behavior (PE-9042): Arweave chunk retrieval retry count and geometry timeout are now configurable, reducing worst-case chunk retrieval time from ~115s to ~15s. New env vars: ARWEAVE_CHUNK_RETRY_COUNT (default: 5), ARWEAVE_TX_GEOMETRY_TIMEOUT_MS (default: 5000), ARWEAVE_TX_GEOMETRY_TIMEOUT_RETRIES (default: 2).

  • GraphQL On-Demand Transaction Resolution: The transaction(id) GraphQL query can now resolve unindexed data items on-demand by extracting metadata from ANS-104 bundle binaries. Enabled by default (GRAPHQL_ON_DEMAND_RESOLUTION_ENABLED=true). Includes a configurable timeout (GRAPHQL_ON_DEMAND_RESOLUTION_TIMEOUT_MS, default 5s) and concurrency limit (GRAPHQL_ON_DEMAND_RESOLUTION_MAX_CONCURRENT, default 1). Only applies to single-ID lookups; the plural transactions(...) query is unaffected.

  • SignatureType in GraphQL: The signatureType field is now surfaced in GraphQL transaction responses for data items.

  • Root TX Semaphore Prometheus Metrics: New Prometheus metrics for root TX resolution semaphore observability, including acquire/release/timeout counters and queue depth gauge.

Changed

  • Root TX Lookup Order: ROOT_TX_LOOKUP_ORDER reordered to prefer GraphQL over HyperBEAM and CDB for faster local resolution.

  • HyperBEAM Request Timeout: Default HyperBEAM request timeout lowered to 500ms.

Docker Images

Service Image
core ghcr.io/ar-io/ar-io-core:6e023ad1dbdfac67fdac1e62449bedfef1bb7fe4
envoy ghcr.io/ar-io/ar-io-envoy:6934e519fb98a46da4c17bdfa51d66225428b7c0
clickhouse-auto-import ghcr.io/ar-io/ar-io-clickhouse-auto-import:dec86f85bb9585658e424f393083bf6d69a7c5e1
observer ghcr.io/ar-io/ar-io-observer:9356a3d5cc2ed9ac406a62c3a01450ae80ddc6c3
litestream ghcr.io/ar-io/ar-io-litestream:be121fc0ae24a9eb7cdb2b92d01f047039b5f5e8
redis redis:7
clickhouse clickhouse/clickhouse-server:25.4
otel-collector otel/opentelemetry-collector-contrib:0.119.0

Release 74

01 Apr 19:28


This is a recommended release focused on cache performance, multi-domain ArNS support, and content moderation correctness. Key highlights include background caching for range request cache misses to improve video/media streaming performance, multiple ArNS root hosts for serving ArNS names across multiple domains from a single gateway, contiguous data cache hit/miss Prometheus metrics for improved observability, and configurable cache control for blocked responses. It also corrects HTTP 451 handling for blocked content, simplifies the parquet export pipeline, and adds ClickHouse and block verification to auto-verify.

Added

  • Background Caching for Range Request Cache Misses: When a range request (e.g., byte-range for video seeking) misses the local cache, the gateway now optionally fetches and caches the full item in the background so subsequent requests (range or full) are served locally. Controlled by BACKGROUND_CACHE_RANGE_MAX_SIZE (default: 0 / disabled) and BACKGROUND_CACHE_RANGE_CONCURRENCY. Includes deduplication, capacity-based drop semantics, and Prometheus metrics for monitoring cache activity.

  • Multiple ArNS Root Hosts: Operators can now serve ArNS names across multiple domains from a single gateway instance by providing a comma-separated list in ARNS_ROOT_HOST (e.g., ARNS_ROOT_HOST=arweave.dev,g8way.io). The first host is used as the "primary" for gateway identity headers. ArNS resolution, apex content, and sandbox redirects work per-matched host with longest-suffix matching (#621)
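
Longest-suffix matching can be sketched as follows (the helper name is an assumption):

```typescript
// Pick the configured root host that matches the request host, preferring
// the longest (most specific) suffix.
function matchRootHost(
  rootHosts: string[],
  requestHost: string,
): string | undefined {
  let best: string | undefined;
  for (const root of rootHosts) {
    const isMatch = requestHost === root || requestHost.endsWith(`.${root}`);
    if (isMatch && (best === undefined || root.length > best.length)) {
      best = root;
    }
  }
  return best;
}
```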

  • ClickHouse Verification in Auto-Verify: Auto-verify is a data validation tool that checks consistency of indexed blockchain data (blocks, transactions, data items) across multiple backends (SQLite, Parquet, ClickHouse). This adds an optional ClickHouse source for verification in addition to the existing SQLite and Parquet sources.

  • Block Verification in Auto-Verify: Auto-verify now validates block data alongside transactions and data items.

  • Bundle Data Prefetch in Auto-Verify: During auto-verify runs, raw bundle bytes are now fetched from the local gateway while it is still running, then parsed after shutdown. Previously the bundle-parser source had to fetch from arweave.net after the gateway was stopped, which was significantly slower.

  • Contiguous Data Cache Hit/Miss Metrics: New contiguous_data_cache_hits_total and contiguous_data_cache_misses_total Prometheus counters in ReadThroughDataCache, labeled by request_type (range vs full). Enables operators to monitor cache performance per request type.

  • Accurate Cache Miss vs Not-Found Metrics: Cache miss counter now fires only after a successful upstream fetch (data exists but wasn't cached locally). A new contiguous_data_not_found_total counter tracks requests where data is unavailable in any source, preventing not-found requests from inflating the miss count and skewing the cache hit rate.

  • Configurable Cache Control for Blocked (451) Responses: New CACHE_BLOCKED_MAX_AGE env var (default: 30 days, matching stable data TTL) controls the Cache-Control max-age sent with 451 responses. Previously, blocked responses used the short not-found TTL, causing CDNs and proxies to re-request blocked content too frequently.

  • Parent Bundle ID in Missing Data Item Errors: Auto-verify compareItems now includes the parent bundle ID in missing_in_source discrepancy messages for data items, making it easier to identify which bundle a missing item belongs to when debugging bundle-parser failures.

Changed

  • Parquet Export Pipeline Simplification: Eliminated DuckDB intermediate tables from the export pipeline. All core export logic moved from the bash script into src/workers/parquet-exporter.ts, with the CLI script becoming a thin wrapper around the admin API. The CLI now uses --api-host/--api-port instead of --core-db/--bundles-db.

  • Removed Legacy Auto-Verify CLI Options: Cleaned up deprecated verification flags.

Fixed

  • ClickHouse ETL Height Range: Fixed off-by-one errors in height range calculations in clickhouse-auto-import.

  • ClickHouse ETL Exit Code Capture: Fixed $? capturing the exit code of a variable assignment instead of the curl command.

  • HTTP 451 for Blocked Content: Corrects r73's blocked-content status code from 452 (non-standard) to 451 ("Unavailable For Legal Reasons"), the IANA-registered standard for content blocked due to legal or policy reasons.

  • Trusted Gateway ArNS 451 Handling: The TrustedGatewayArNSResolver now accepts HTTP 451 responses from trusted gateways instead of treating them as errors. When a trusted gateway indicates a name is blocked, the local gateway respects that moderation signal and returns 451 to the client rather than falling through to the on-demand resolver.

  • Serving Cached Data with Undefined or Zero Content-Length: The read path of ReadThroughDataCache now skips cache entries with a missing or zero dataSize, preventing responses without a Content-Length header. The write path already rejected zero-size entries; this closes the corresponding read-side gap.

  • Parquet Export and ClickHouse Import Robustness: Parquet-export script now uses curl -o with temp files instead of head/tail parsing to handle multiline JSON API responses correctly. Auto-verify's importToClickHouse switches to execFileSync with an args array, preventing shell injection in ClickHouse import invocations.

  • Parquet Export Verify-Count Non-Zero Exit: parquet-export --verify-count now exits non-zero when row counts don't match, making it useful in CI and automation pipelines. Also validates curl availability at startup alongside python3.

  • High-Severity Dependency Vulnerabilities: Resolved known vulnerabilities in transitive production dependencies via yarn resolutions: path-to-regexp (ReDoS, via express/express-openapi-validator), h3 (request smuggling), picomatch (glob injection), preact (VNode injection), socket.io-parser (unbounded binary attachments), undici (multiple HTTP smuggling/memory issues), and bumped fast-xml-parser from 5.3.6 to 5.5.9.

  • JSON Data Files Missing from Production Build: offset-block-mapping.json was excluded from dist/ because the build copy step only matched .graphql, .sql, and .lua files. This caused a startup warning and fallback to slower full-range block searches in containers. .json files are now included in the copy step.

  • Auto-Verify Prefetch Timing and Empty ClickHouse URL: Removed the last_fully_indexed_at filter from the bundle prefetch query — this flag is set asynchronously by BundleRepairWorker, causing prefetch to find 0 bundles even when indexing was complete. Also handles AUTO_VERIFY_CLICKHOUSE_URL="" (empty string) explicitly to prevent a crash when the variable is set but empty in .env.

Docker Images

Service Image
core ghcr.io/ar-io/ar-io-core:9ea1a4cd12e220ea9790c1a457a0133a3dfd5960
envoy ghcr.io/ar-io/ar-io-envoy:6934e519fb98a46da4c17bdfa51d66225428b7c0
clickhouse-auto-import ghcr.io/ar-io/ar-io-clickhouse-auto-import:fc32edf92518d28cd3a5bbd759ad92d97b453322
observer ghcr.io/ar-io/ar-io-observer:9356a3d5cc2ed9ac406a62c3a01450ae80ddc6c3
litestream ghcr.io/ar-io/ar-io-litestream:be121fc0ae24a9eb7cdb2b92d01f047039b5f5e8
redis redis:7
clickhouse clickhouse/clickhouse-server:25.4
otel-collector otel/opentelemetry-collector-contrib:0.119.0
ao-cu ghcr.io/permaweb/ao-cu:08436a88233f0247f3eb35979dd55163fd51a153

Release 73

18 Mar 20:46


Added

  • Unified Cache-Control Headers: Move default Cache-Control from Envoy's catch-all route into Express middleware, eliminating duplicate headers and making data handler cache durations operator-configurable via DEFAULT_CACHE_CONTROL_MAX_AGE_SECONDS, STABLE_CACHE_CONTROL_MAX_AGE_DAYS, and related env vars (PE-9002)

  • Cache-Only Client IPs/CIDRs: New CACHE_ONLY_CLIENT_IPS_AND_CIDRS env var to short-circuit data retrieval requests with a 404 if the data is not already cached locally, useful for protecting upstream bandwidth from specific high-volume clients

  • Client Disconnect Prometheus Metric: New client_disconnect_total counter metric tracks when clients abort requests before the response completes (PE-9000)

  • P2P Contiguous Data Retrieval Improvements: Major overhaul of peer data retrieval to reduce tail latency and improve cache efficiency (PE-9007)

    • Hedged requests: Fires a second request to the next candidate peer after a configurable delay (PEER_HEDGE_DELAY_MS) if no response yet; first success cancels all others, capped at PEER_MAX_HEDGED_REQUESTS
    • Per-peer concurrency limiter: Fail-fast counter that skips saturated peers instead of queuing, configurable via PEER_MAX_CONCURRENT_OUTBOUND
    • Consistent hash ring: Routes each data ID to the same small set of "home" peers for cache locality, with weighted fallback for remaining slots; configured via PEER_HASH_RING_VIRTUAL_NODES and PEER_HASH_RING_HOME_SET_SIZE
    • Decoupled candidate pool: PEER_CANDIDATE_COUNT replaces the old min(peerCount, 3) logic, giving hedging a deeper bench to draw from
    • Chunk peer selection via hash ring: Chunk requests are also routed through the hash ring by absolute offset for improved peer cache utilization
  • Request Trace IDs: Every HTTP request gets a unique requestId in Winston log entries via AsyncLocalStorage, independent of OTEL. Reads or generates X-Request-Id headers and echoes them in responses for end-to-end request correlation (PE-8977)
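
The hedged-request pattern described above can be sketched as follows. This is a simplification: the real implementation cancels losing requests and caps the hedge count via PEER_MAX_HEDGED_REQUESTS, while Promise.race here merely stands in for first-success resolution:

```typescript
function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Start the primary immediately and stagger one hedge per delay interval;
// the first attempt to settle wins.
async function hedgedFetch<T>(
  candidates: Array<() => Promise<T>>,
  hedgeDelayMs: number, // analogous to PEER_HEDGE_DELAY_MS
): Promise<T> {
  const attempts = candidates.map(async (fetch, i) => {
    await sleep(i * hedgeDelayMs);
    return fetch();
  });
  return Promise.race(attempts);
}
```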

Changed

  • HTTP 452 for Blocked Content: Blocked content now returns HTTP 452 with a descriptive message identifying the blocked ID and the node's content policy, instead of a generic 404 Not Found

Fixed

  • HTTP 499 Only for Actual Client Disconnects: Internal data retrieval timeouts (e.g., upstream gateway timeouts) were being misidentified as client disconnects. Now checks req.signal.aborted to confirm the client actually disconnected before returning 499

  • /tx_anchor Route Shadowing: Move /tx_anchor route before /tx in Envoy config to prevent prefix-match shadowing that caused /tx_anchor requests to be handled by the /tx route

  • Security Audit Vulnerabilities: Bump simple-git to fix critical RCE via blockUnsafeOperationsPlugin bypass; add multer resolution to fix high severity DoS vulnerabilities in express-openapi-validator

Docker Images

Image Tag
ghcr.io/ar-io/ar-io-core 92defe82acc1e7d2337bdacde1f65300503768ae
ghcr.io/ar-io/ar-io-envoy bedcb761098a2729c49bcfb3f7546151f5a6b632
ghcr.io/ar-io/ar-io-clickhouse-auto-import 4512361f3d6bdc0d8a44dd83eb796fd88804a384
ghcr.io/ar-io/ar-io-litestream be121fc0ae24a9eb7cdb2b92d01f047039b5f5e8
ghcr.io/ar-io/ar-io-observer 9356a3d5cc2ed9ac406a62c3a01450ae80ddc6c3

Release 72

11 Mar 18:05


This is a recommended release focused on data retrieval reliability and caching intelligence. Key highlights include a negative data cache that reduces upstream load for consistently missing data, direct byte offset hints to help gateways locate data when internal lookup mechanisms fall short, untrusted data caching with stochastic re-verification, and significant stream reliability improvements that eliminate false timeouts on large transfers. It also adds gateway loop prevention via per-gateway via-chain detection.

Added

  • Negative Data Cache: Two-phase cache that tracks data IDs consistently missing across configurable thresholds and short-circuits future requests with 404 responses, reducing upstream load during outages and for permanently unavailable data

    • Includes exponential backoff with fast re-promotion, health gating to prevent false positives during upstream outages, and TTL-based miss tracker eviction
    • Controlled via NEGATIVE_CACHE_ENABLED (default: true), NEGATIVE_CACHE_MAX_SIZE, NEGATIVE_CACHE_TTL_MS, NEGATIVE_CACHE_MISS_THRESHOLD_MS, and NEGATIVE_CACHE_MISS_COUNT_THRESHOLD
  • Direct Byte Offset Hints for Data Item Retrieval: Clients can supply X-AR-IO-Root-Transaction-Id, X-AR-IO-Root-Path, X-AR-IO-Root-Data-Offset, and X-AR-IO-Root-Data-Size headers to bypass server-side bundle lookups and resolve data items via direct byte offsets

    • Includes fetch-with-hint CLI tool for resolving hints via GraphQL
  • DATA_CACHED Webhook Event: Emits a webhook when data is cached for the first time, enabling external content moderation sidecars (e.g., phishing scanners)

    • Opt-in via WEBHOOK_EMIT_DATA_CACHED_EVENTS=true (default: false)
  • Untrusted Data Caching with Stochastic Re-verification: Caches all upstream data optimistically instead of only when a hash exists locally, with configurable background re-verification rates to ensure integrity

    • Controlled via UNTRUSTED_CACHE_RETRY_RATE (default: 0.1) and TRUSTED_CACHE_RETRY_RATE (default: 0.0)
    • Evicts data on hash mismatch to maintain integrity
  • 12-Hour Cache-Control Tier: New middle tier for data that is unstable but from a trusted source (e.g., trusted bundlers), providing a three-tier system: stable (30d, immutable) > unstable trusted (12h) > unstable (2h)

  • Chunk Broadcast Improvements: All 5 tip nodes (tip-1 through tip-5) are now included in default preferred chunk POST nodes, with shuffled ordering and a minimum success requirement

    • Controlled via CHUNK_POST_MIN_PREFERRED_SUCCESS_COUNT (default: 2)
  • OTEL Resource Attributes Passthrough: Operators can set custom OpenTelemetry resource attributes via the standard OTEL_RESOURCE_ATTRIBUTES environment variable, with env var values overriding auto-detected attributes

  • Gateway Loop Prevention: Per-gateway via-chain detection skips individual gateways already visited in the request path, with hop count validation against MAX_DATA_HOPS (3) as defense-in-depth. Client IP, forwarded IPs, and via header are now included as OTEL span attributes for observability.

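The two-phase promotion logic behind the negative data cache can be sketched as follows. This is an illustrative model, not the gateway's actual implementation: the class name, method names, and threshold wiring are hypothetical, and exponential backoff, fast re-promotion, health gating, and TTL eviction are omitted for brevity.

```typescript
interface MissRecord {
  count: number;
  firstMissAt: number; // epoch ms of the first recorded miss
}

class NegativeCacheSketch {
  private misses = new Map<string, MissRecord>(); // phase 1: miss tracker
  private negative = new Set<string>(); // phase 2: negative cache

  constructor(
    private missCountThreshold: number, // cf. NEGATIVE_CACHE_MISS_COUNT_THRESHOLD
    private missDurationMs: number, // cf. NEGATIVE_CACHE_MISS_THRESHOLD_MS
    private now: () => number = Date.now, // injectable clock for testing
  ) {}

  // True when the ID should be short-circuited with a 404.
  isNegativelyCached(id: string): boolean {
    return this.negative.has(id);
  }

  // Record an upstream miss; promote to the negative cache once the ID has
  // missed at least missCountThreshold times over at least missDurationMs.
  recordMiss(id: string): void {
    const rec = this.misses.get(id) ?? { count: 0, firstMissAt: this.now() };
    rec.count += 1;
    this.misses.set(id, rec);
    if (
      rec.count >= this.missCountThreshold &&
      this.now() - rec.firstMissAt >= this.missDurationMs
    ) {
      this.negative.add(id);
      this.misses.delete(id);
    }
  }

  // A successful fetch clears both phases.
  recordHit(id: string): void {
    this.misses.delete(id);
    this.negative.delete(id);
  }
}
```

Requiring both thresholds prevents a brief upstream outage (many misses in a short window) from permanently marking data as missing.
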
Changed

  • Default CDB64_REMOTE_RETRIEVAL_ORDER changed to 'chunks' only, removing gateways from the default order since range requests aren't effectively cached on gateways

Fixed

  • Stream Reliability Improvements: Replaced wall-clock stream timeouts with backpressure-aware stall-based timeouts (30s no-data threshold), preventing false kills and truncated responses on large or slow transfers

    • Extracted pipeStreamToResponse helper for consistent stream pipe and error handling across routes
  • Fixed Axios CanceledError not being normalized to AbortError, causing incorrect upstream disconnection handling

  • Fixed streams not being destroyed on unexpected HTTP status codes from peers, preventing socket leaks

  • Added 206 Partial Content acceptance for ranged peer requests

  • Fixed upstream stream not being destroyed on premature client disconnect

  • Fixed detectLoopInViaChain to lowercase via entries for proper case-insensitive matching

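The case-insensitive via-chain matching fixed above can be sketched like this. The parsing is simplified (real Via entries carry protocol/version tokens and optional comments), and everything except the detectLoopInViaChain name is illustrative rather than the gateway's actual code:

```typescript
// Extract the "received-by" host from each comma-separated Via entry,
// e.g. "1.1 Gateway.Example.com" -> "gateway.example.com".
function parseViaHosts(viaHeader: string): string[] {
  return viaHeader
    .split(',')
    .map((entry) => entry.trim().split(/\s+/)[1] ?? '')
    .filter((host) => host.length > 0)
    .map((host) => host.toLowerCase()); // lowercase for case-insensitive match
}

// True when this gateway's own identity already appears in the via chain,
// meaning the request has looped back and must not be forwarded again.
function detectLoopInViaChain(viaHeader: string, selfHost: string): boolean {
  return parseViaHosts(viaHeader).includes(selfHost.toLowerCase());
}
```
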
Docker Images

  • ghcr.io/ar-io/ar-io-envoy:17a2cbdb71e1d1eba1a3c4e29aff96d69feb3246
  • ghcr.io/ar-io/ar-io-core:fb4017499c42a60d81bf5d0624a26b84841cd005
  • ghcr.io/ar-io/ar-io-clickhouse-auto-import:4512361f3d6bdc0d8a44dd83eb796fd88804a384
  • ghcr.io/ar-io/ar-io-observer:9356a3d5cc2ed9ac406a62c3a01450ae80ddc6c3
  • ghcr.io/ar-io/ar-io-litestream:be121fc0ae24a9eb7cdb2b92d01f047039b5f5e8

Release 71

26 Feb 18:41

This is a recommended release that adds per-gateway trust configuration for TRUSTED_GATEWAYS_URLS, enabling operators to mark individual gateways as trusted or untrusted for finer-grained data verification control. It also includes peer URL tracking in chunk broadcast responses for improved debuggability, and fixes for upstream gateway content-length validation to prevent serving bogus responses from gateways that return 200 instead of 404.

Added

  • Per-Gateway Trust Flag for TRUSTED_GATEWAYS_URLS: Extended the TRUSTED_GATEWAYS_URLS configuration format to support per-gateway trust levels

    • Untrusted gateways only cache data when the hash matches a known value, providing defense-in-depth against serving incorrect data
    • Default configuration now uses turbo-gateway.com (trusted) with arweave.net as an untrusted fallback
  • Peer URL in Chunk Broadcast Responses: Chunk broadcast responses now include the peer URL for better debuggability when troubleshooting chunk propagation issues

Changed

  • Default TRUSTED_GATEWAYS_URLS now uses turbo-gateway.com as the primary trusted gateway with arweave.net as an untrusted fallback

Fixed

  • Upstream Gateway Content-Length Validation: Added validation of content-length in GatewaysDataSource to reject responses with missing or zero content-length, preventing upstream gateways from serving bogus HTML landing pages when they return 200 instead of 404.

  • Zero-Byte Data Item Handling: Removed size-0 rejection from data handlers to allow zero-byte data items to be served correctly.

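The content-length guard described in the fix above amounts to rejecting upstream responses whose Content-Length header is missing, non-numeric, or zero, since a misbehaving gateway may return a 200 HTML landing page instead of a 404. A minimal sketch, with an illustrative function name rather than the actual GatewaysDataSource code:

```typescript
// Reject responses with a missing, non-numeric, or zero Content-Length so
// bogus landing pages are never cached or served as data.
function hasValidContentLength(
  headers: Record<string, string | undefined>,
): boolean {
  const raw = headers['content-length'];
  if (raw === undefined) return false; // missing header: reject
  const length = Number(raw);
  return Number.isInteger(length) && length > 0; // zero or garbage: reject
}
```
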
Docker Images

  • ghcr.io/ar-io/ar-io-envoy:17a2cbdb71e1d1eba1a3c4e29aff96d69feb3246
  • ghcr.io/ar-io/ar-io-core:dbdf97db26627c1fd38fd765eebe8db513a66dff
  • ghcr.io/ar-io/ar-io-clickhouse-auto-import:4512361f3d6bdc0d8a44dd83eb796fd88804a384
  • ghcr.io/ar-io/ar-io-observer:9356a3d5cc2ed9ac406a62c3a01450ae80ddc6c3
  • ghcr.io/ar-io/ar-io-litestream:be121fc0ae24a9eb7cdb2b92d01f047039b5f5e8

Release 70

24 Feb 23:02

This is a recommended release that introduces CDB64 download tooling for fetching remote partitioned indexes with resume support, chunk request concurrency limiting and first-data timeouts for improved data retrieval reliability, and expanded CDB64 root TX index coverage with new AO and without-content-type index sources. It also includes critical fixes for event listener leaks, stream data loss, and request cancellation under high concurrency, along with defense-in-depth loop protection enhancements and a new auto-verify indexing tool for cross-source bundle data validation.

Added

  • CDB64 Download Tool: New CLI tool (tools/download-cdb64) for fetching remote partitioned CDB64 indexes with production-grade reliability

    • Downloads partition files from manifest sources (HTTP URLs, Arweave TX IDs, byte-range specifications, local files)
    • HTTP Range request resume for interrupted downloads — partial .tmp files are preserved and downloads resume from where they left off
    • Per-partition retry support with configurable retry count (--retries/-r, default: 5)
    • Concurrent downloads with configurable parallelism (--concurrency/-c, default: 3)
    • SHA-256 verification of downloaded partitions against manifest checksums
    • Generates updated manifest with local file locations on completion
  • Streaming Partitioned CDB64 Writer: New low-memory CDB64 generation mode for large indexes

    • Reduces peak memory from O(total_records) to O(largest_partition) via two-phase scatter/build approach
    • Phase 1 writes records to per-partition temp files; Phase 2 builds CDB files sequentially
    • New --low-memory flag added to all CDB64 CLI tools (requires --partitioned)
    • Progress callbacks show partition-level build status
  • AO and Without-Content-Type CDB64 Index Sources: Expanded default CDB64 root TX index coverage

    • New AO data items index (~1.6B records) covering AO-tagged data items up to block height 1,820,000
    • New without-content-type index (~1.2B records) covering data items lacking a Content-Type tag up to block height 1,820,000
    • Default CDB64_ROOT_TX_INDEX_SOURCES now includes all three indexes
  • Chunk Request Concurrency Limiting and First-Data Timeout: New controls for chunk-based data retrieval under load

    • CHUNK_REQUEST_CONCURRENCY (default: 50): Limits concurrent chunk fetch requests to prevent overwhelming backends
    • CHUNK_FIRST_DATA_TIMEOUT_MS (default: 10000): Timeout for receiving the first chunk of data; if exceeded, request falls through to alternative data sources. Set to 0 to disable.
  • Root Bundle Gateway Fallback Path: Added /<id> fallback path for root bundle gateway requests when /raw/<id> fails

    • Enables bundle retrieval from HyperBEAM endpoints that may not support the /raw/<id> path
    • Separate rootBundleGatewaysDataSource instance configured with fallback enabled
  • ArNS Resolver Host Override: New TRUSTED_ARNS_RESOLVER_HOST_HEADER environment variable decouples connection target from Host header in ArNS resolver, with __NAME__ placeholder substitution for dynamic values

  • Defense-in-Depth Loop Protection Improvements (PE-8947): Additional safeguards against request forwarding loops between gateways

    • Hop count validation added to GatewaysDataSource (MAX_DATA_HOPS = 3)
    • Origin/IP blocking applied to peer forwarding path via FilteredContiguousDataSource
    • Startup warning when TRUSTED_GATEWAYS_URLS contains this gateway's own ARNS_ROOT_HOST, indicating a self-forwarding loop
  • Auto-Verify Indexing Tool: New tool (tools/auto-verify) for cross-source bundle data comparison and indexing consistency validation

    • Indexes configurable block ranges, then compares data items across SQLite, Parquet, GraphQL, and independently-parsed raw bundles
    • Verifies field consistency: offset, size, ownerOffset, ownerSize, signatureOffset, signatureSize, rootParentOffset
    • Bundle indexing timeout (5 minutes) for graceful handling of slow indexing
    • Cache preserved by default for faster iteration across runs
    • Detailed discrepancy output shows data item ID, field name, and per-source values
  • Git Worktree Helper: New development tool (tools/wt) for parallel development using git worktrees

    • Creates worktrees under wt/<branch> with symlinked .env and CLAUDE.local.md
    • Commands: add, rm, ls with --existing flag for checking out existing branches
    • Automatically runs yarn install with a clean data/ directory per worktree

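The SHA-256 verification step in the CDB64 download tool boils down to hashing each downloaded partition and comparing it to the manifest checksum. A self-contained sketch with illustrative names (not the tools/download-cdb64 code):

```typescript
import { createHash } from 'node:crypto';

// True when the partition bytes hash to the expected hex checksum from the
// manifest; comparison is done on lowercase hex to tolerate case differences.
function partitionMatchesChecksum(bytes: Buffer, expectedHex: string): boolean {
  const actual = createHash('sha256').update(bytes).digest('hex');
  return actual === expectedHex.toLowerCase();
}
```
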
Changed

  • Default ArNS gateway changed from ar-io.net to turbo-gateway.com
  • Default Arweave gateway in test suites changed from arweave.net to turbo-gateway.com
  • Renamed AO CDB directory from cdb64-root-tx-index-ao to cdb64-root-tx-index-ao-to-height-1820000 for naming consistency
  • Removed unused Cucumber dependency

Fixed

  • Stream Data Loss Prevention: Fixed range stream being put into flowing mode prematurely, causing data to be emitted and lost before the consumer could attach
  • HTTP Request Cancellation: Threaded AbortController through chunk timeout to properly cancel in-flight HTTP requests instead of only rejecting the promise
  • Timer Leaks: Fixed timeout timer leaks in chunk request implementation by hoisting timeout variable for proper cleanup in finally blocks
  • Event Listener Leaks: Replaced AbortSignal.any() with anySignal() from the any-signal package and added proper ClearableSignal.clear() calls in finally blocks across composite ArNS resolver and chunk request paths to prevent listener accumulation under high concurrency
  • CDB64 Config Order: Fixed CDB64 root TX index source search order to preserve configuration order instead of sorting alphabetically
  • Data Item Tag Handling: Content-Type and Content-Encoding tag processing now uses first match instead of last match, aligning with legacy gateway behavior
  • Docker Build: Multi-stage Dockerfile now copies resources/ directory into runtime stage so the default CDB64 manifest is accessible inside containers
  • Cache Directory Cleanup: .gitkeep files properly recreated after cleaning cache directories
  • CDB64 Test Reliability: Conditional test skipping when native CDB64 module is unavailable; direct module probing prevents async rejection leaks in CI
  • Dependency Security: Updated axios (DoS via __proto__), tar (symlink/overwrite CVEs), qs (arrayLimit bypass DoS), fast-xml-parser (RangeError DoS), and other transitive dependencies with known vulnerabilities

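The Data Item Tag Handling fix above (first match wins for duplicate Content-Type or Content-Encoding tags) can be sketched as a simple first-occurrence lookup. Types and names here are illustrative:

```typescript
interface Tag {
  name: string;
  value: string;
}

// Return the value of the FIRST tag whose name matches (case-insensitive),
// or undefined when absent — later duplicates are ignored, matching legacy
// gateway behavior.
function firstTagValue(tags: Tag[], name: string): string | undefined {
  const lower = name.toLowerCase();
  return tags.find((t) => t.name.toLowerCase() === lower)?.value;
}
```
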
Release 69

11 Feb 22:18

This is a recommended release that introduces DNS-based multi-peer discovery via Envoy's Endpoint Discovery Service, enabling automatic Arweave peer detection with health-checked routing and consensus-based failover. It also adds multi-layered HyperBEAM request loop prevention using header, via-chain, and User-Agent detection to block infinite forwarding loops between gateways. Additionally, this release includes comprehensive CDB64 documentation covering operator guides, format specifications, and tooling reference.

Added

  • DNS-Based Multi-Peer Discovery with Envoy EDS: Automatic Arweave peer discovery and health-checked routing via Envoy's Endpoint Discovery Service

    • Resolves DNS records (e.g., peers.arweave.xyz) to discover Arweave peers automatically
    • Health-checks peers and classifies them as "full" (complete blockchain data) or "partial" (incomplete) based on sync status
    • Routes requests to fully-synced peers first with automatic failover to partial peers
    • Consensus-based reference height calculation prevents routing to stale or outlier peers
    • New Prometheus metrics for peer discovery, classification, and health check monitoring
    • Configurable via ARWEAVE_PEER_DNS_RECORDS, ARWEAVE_PEER_DNS_PORT, ARWEAVE_PEER_HEALTH_CHECK_INTERVAL_MS, and related environment variables
    • Enabled by default with ENABLE_ARWEAVE_PEER_EDS=true; falls back to static TRUSTED_NODE_HOST when disabled
    • EDS files validated on startup with corrupt files automatically removed and re-seeded
  • HyperBEAM Request Loop Prevention: Multi-layered detection of compute-origin requests to prevent infinite forwarding loops between gateways. Local data sources (cache, S3, database) always continue to serve data normally.

    • Header-based detection: Requests with configured headers (default: ao-peer-port) are identified as compute-origin and blocked from remote forwarding. Configurable via SKIP_FORWARDING_HEADERS.
    • Via-chain loop detection: New X-AR-IO-Via header tracks the chain of gateway identities across hops. When a gateway detects its own identity in the via chain, it stops forwarding to prevent loops. Gracefully degrades when ARNS_ROOT_HOST is not configured.
    • User-Agent detection: Requests with missing or empty User-Agent headers skip remote forwarding by default, catching HTTP clients like Erlang's gun (used by HyperBEAM) that don't send a User-Agent. Configurable via SKIP_FORWARDING_EMPTY_USER_AGENT (default: true). Additionally, SKIP_FORWARDING_USER_AGENTS allows specifying User-Agent substrings for case-insensitive matching (e.g., HyperBEAM).
  • CDB64 Documentation: Comprehensive documentation for the CDB64 root transaction index feature

    • Operator guide (docs/cdb64-guide.md) covering configuration, monitoring, custom index sources, and troubleshooting
    • Format specification (docs/cdb64-format.md) detailing the CDB64 binary format, key encoding, and location types
    • Tools reference (docs/cdb64-tools.md) documenting all 6 CDB64 CLI tools with usage examples
    • Overview page (docs/cdb64.md) linking all CDB64 documentation
    • Documentation index (docs/INDEX.md) providing a navigable overview of all gateway documentation
    • CDB64 section added to README for quick orientation
    • Glossary entries for CDB64-related terms
    • Fixed default values for CDB64_CACHE_SIZE and CDB64_ROOT_TX_INDEX_SOURCES in docs/envs.md
    • Build script (tools/build-cdb64-napi) for compiling the native CDB64 N-API module from source

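The User-Agent layer of the loop-prevention logic above can be sketched as a small decision function: skip remote forwarding when the User-Agent is missing or empty (catching clients like Erlang's gun), or when it contains a configured substring, case-insensitively. The function and parameter names are illustrative:

```typescript
// Decide whether a request should skip remote forwarding based on its
// User-Agent, per SKIP_FORWARDING_EMPTY_USER_AGENT and
// SKIP_FORWARDING_USER_AGENTS semantics described above.
function shouldSkipForwarding(
  userAgent: string | undefined,
  skipOnEmpty: boolean, // cf. SKIP_FORWARDING_EMPTY_USER_AGENT (default: true)
  skipSubstrings: string[], // cf. SKIP_FORWARDING_USER_AGENTS
): boolean {
  const ua = (userAgent ?? '').trim();
  if (ua === '') return skipOnEmpty; // missing or empty User-Agent
  const lower = ua.toLowerCase();
  return skipSubstrings.some((s) => lower.includes(s.toLowerCase()));
}
```
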
Changed

  • Updated observer to increase default chunk observation sample rate to 20%

Docker Images

  • ar-io-envoy: 86c53dcbf0bb1533c5d32d44f2db11ab9cfa2629
  • ar-io-core: 56c7e1c0fe14d8033ddd9fdd57344933b1f1baaa
  • ar-io-clickhouse-auto-import: 4512361f3d6bdc0d8a44dd83eb796fd88804a384
  • ar-io-litestream: be121fc0ae24a9eb7cdb2b92d01f047039b5f5e8

Release 68

04 Feb 18:09

Release 68 - 2026-02-04

This is an optional release that introduces A/B testing infrastructure for data sources via the new SamplingContiguousDataSource, enabling operators to safely evaluate alternative retrieval strategies with controlled traffic exposure and built-in metrics. It also includes CDB64 location type renames for improved clarity in manifest schemas, and data retrieval tooling enhancements with content validation options for easier gateway comparison testing.

Added

  • SamplingContiguousDataSource for A/B Testing: New data source wrapper that
    probabilistically routes requests through an experimental source (PE-8900)

    • Enables safe A/B testing of new retrieval strategies with controlled traffic
      exposure
    • Two sampling strategies: random (per-request) or deterministic
      (consistent per ID using SHA-256 hash)
    • Configurable sampling rate (0-1) with validation to reject invalid values
    • New Prometheus metrics: sampling_decision_total, sampling_request_total,
      sampling_latency_ms
    • OpenTelemetry span instrumentation for sampled requests
  • Data Retrieval Tool Enhancements: New options in tools/test-data-retrieval
    for content validation

    • --bytes <n>: Fetch first N bytes using HTTP Range headers for content
      comparison
    • --max-size <bytes>: Two-phase retrieval (HEAD then GET) for content under
      size threshold
    • RPS (requests per second) statistics in console and JSON output
    • SHA-256 hash computation for content comparison between gateways

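The deterministic sampling strategy described above can be sketched by hashing the request ID with SHA-256 and mapping the digest into [0, 1), so the same ID always lands on the same side of the sampling rate. Names are illustrative, not the SamplingContiguousDataSource internals:

```typescript
import { createHash } from 'node:crypto';

// Map an ID to a stable fraction in [0, 1) using the first 6 bytes (48 bits)
// of its SHA-256 digest; 48 bits fit exactly in a JS number without
// precision loss.
function idToFraction(id: string): number {
  const digest = createHash('sha256').update(id).digest();
  return digest.readUIntBE(0, 6) / 2 ** 48;
}

// Deterministic decision: route to the experimental source when the ID's
// fraction falls below the configured sampling rate (0-1).
function sampleDeterministic(id: string, rate: number): boolean {
  if (rate < 0 || rate > 1) throw new Error('sampling rate must be in [0, 1]');
  return idToFraction(id) < rate;
}
```

Per-request random sampling would instead compare `Math.random()` to the rate; the deterministic variant trades that independence for consistent routing of any given ID across retries.
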
Changed

  • CDB64 Arweave Location Type Renames: Renamed location types in CDB64
    manifests for clarity
    • arweave-tx → arweave-id (field: txId → id)
    • arweave-bundle-item → arweave-byte-range (fields: txId → rootTxId,
      offset → dataOffsetInRootTx; added optional dataItemId, removed size
      from location)
    • Includes script to add dataItemId fields by reading ANS-104 headers
    • Updated bundled manifest with new format

Docker Images

  • ghcr.io/ar-io/ar-io-envoy: 4755fa0a2deb
  • ghcr.io/ar-io/ar-io-core: 86e9adedb44f
  • ghcr.io/ar-io/ar-io-clickhouse-auto-import: 4512361f3d6b
  • ghcr.io/ar-io/ar-io-litestream: be121fc0ae24