Skip to content

Add CLAUDE.md / AGENTS.md for AI coding assistant context #5407

@crtahlin

Description

@crtahlin

Why

AI coding assistants (Claude Code, Cursor, GitHub Copilot, Codex) auto-load project context files when opening a repository — CLAUDE.md, AGENTS.md, .cursorrules, etc. Without these files, AI tools start with zero project context: they don't know about the package structure, Kademlia addressing, chunk types, protocol patterns, how the node bootstraps, or how to run tests.

Adding these files means:

  • Faster onboarding — AI understands the architecture before generating code
  • Fewer mistakes — knows about error wrapping conventions, separate test packages, protobuf protocols, the CODING.md rules
  • Consistent output — follows existing conventions (copyright headers, American English, golangci-lint rules)

Proposed CLAUDE.md

Full file content (click to expand)
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

Bee is the reference implementation of an Ethereum Swarm node, written in Go. It implements decentralized storage and communication protocols — content-addressed chunk storage, Kademlia-based routing, postage stamp accounting, push/pull syncing, PSS messaging, feeds, and storage incentives (redistribution game).

**Module**: `github.com/ethersphere/bee/v2`
**Go version**: 1.25
**License**: BSD-3-Clause

## Build & Development Commands

```bash
make build        # Compile all packages (CGO_ENABLED=0)
make binary       # Build dist/bee binary with version ldflags
make test         # Run unit tests (-failfast)
make test-race    # Run tests with race detector
make test-integration  # Run integration tests (requires -tags=integration)
make lint         # Run golangci-lint (installs v2.5.0 if missing)
make format       # Format with gofumpt + gci
make protobuf     # Regenerate protobuf files
make clean        # Remove dist/ and go clean
make docker-build # Build Docker image
```

**CI-specific targets:**
- `make test-ci` — skips tests with names ending in `FLAKY`
- `make test-ci-race` — same, with race detector
- `make test-ci-flaky` — runs only `FLAKY` tests

**Beekeeper (integration testing):**
- `make beekeeper` — install beekeeper tool
- `make beelocal` — start local cluster
- `make deploylocal` — deploy bee cluster
- `make testlocal` — run integration checks (connectivity, manifest, pingpong, pss, pushsync, retrieval, settlements, soc)

## Architecture

### Entry Point & CLI

The binary is built from `cmd/bee/`. CLI uses **Cobra** for commands and **Viper** for configuration:

- `bee start` — start a full or light node (`cmd/bee/cmd/start.go`)
- `bee init` — initialize data directory (`cmd/bee/cmd/init.go`)
- `bee deploy` — deploy smart contracts (`cmd/bee/cmd/deploy.go`)
- `bee db` — database management commands (`cmd/bee/cmd/db.go`)
- `bee split` — chunk splitting utility (`cmd/bee/cmd/split.go`)
- `bee version` — print version (`cmd/bee/cmd/version.go`)
- `bee start --dev-mode` — start in development mode with in-memory storage (`cmd/bee/cmd/start_dev.go`)

Configuration is defined as constants in `cmd/bee/cmd/cmd.go` (option names like `optionNameDataDir`, `optionNameAPIAddr`, etc.). Viper reads from CLI flags, environment variables (`BEE_` prefix), and YAML config file.

### Node Bootstrap

`pkg/node/node.go` is the main orchestrator. It wires all subsystems together via dependency injection — no global mutable state. The `Bee` struct holds references to every service and provides `Shutdown()` for clean teardown.

### HTTP API

- **Router**: `gorilla/mux` in `pkg/api/router.go`
- **Handler pattern**: `jsonhttp.MethodHandler` maps HTTP methods to handler functions
- **Three route groups** mounted in `Mount()`:
  - `mountTechnicalDebug()``/node`, `/addresses`, `/health`, `/readiness`, `/metrics`, `/loggers`, pprof
  - `mountBusinessDebug()` — topology, accounting, settlements, stamps management
  - `mountAPI()``/bytes`, `/chunks`, `/bzz`, `/feeds`, `/soc`, `/stamps`, `/tags`, `/pins`, `/pss`, `/grantee`
- **Route gating**: `checkRouteAvailability` blocks endpoints during sync; `checkSwapAvailability`, `checkChequebookAvailability`, `checkStorageIncentivesAvailability` for feature-specific gates
- **OpenAPI spec**: `openapi/Swarm.yaml` (v7.4.1) and `openapi/SwarmCommon.yaml`
- All endpoints available both at root (`/bytes`) and versioned (`/v1/bytes`)

### P2P Networking

- **Transport**: `libp2p` (via `pkg/p2p/libp2p/`)
- **Protocols use protobuf** — each protocol package has a `pb/` subdirectory with `.proto` files
- **Key protocols**:
  - `pushsync` — push chunks to their neighborhood
  - `pullsync` — pull chunks from peers during syncing
  - `retrieval` — retrieve chunks by address
  - `pingpong` — liveness checking
  - `hive` — peer discovery and address broadcasting
  - `pricing` — price announcements between peers
  - `gsoc` — General Single Owner Chunks

### Storage

- **Chunk types**: Content-Addressed Chunks (`pkg/cac/`) and Single Owner Chunks (`pkg/soc/`)
- **Core abstractions**: `pkg/storage/` defines `Putter`, `Getter`, `Hasser`, `Deleter` interfaces
- **Storer**: `pkg/storer/` manages the local store (reserve, cache, upload store, pinning)
- **Sharky**: `pkg/sharky/` — blob storage engine for chunk data (fixed-size slots)
- **BMT**: `pkg/bmt/` — Binary Merkle Tree hasher for chunk integrity
- **State store**: `pkg/statestore/` — LevelDB-backed key-value store for node state
- **Shed**: `pkg/shed/` — LevelDB abstraction layer with typed indexes

### Postage Stamps

- `pkg/postage/` — core stamp types, batch store, batch service
- `pkg/postage/batchstore/` — persists stamp batches
- `pkg/postage/listener/` — listens to on-chain stamp events
- `pkg/postage/postagecontract/` — interacts with the postage stamp smart contract
- Stamps have a **batch ID**, **depth** (capacity), and **amount** (value per chunk)

### Storage Incentives

- `pkg/storageincentives/` — the redistribution game agent
- Nodes in the correct neighborhood participate in storage incentive rounds
- Involves sampling the reserve, submitting proofs, and claiming rewards

### Other Key Packages

| Package | Purpose |
|---------|---------|
| `pkg/swarm/` | Core types: `Address` (32-byte hash), `Chunk`, constants (`HashSize`, `ChunkSize`, `MaxPO`) |
| `pkg/topology/` | Kademlia routing table and peer management |
| `pkg/feeds/` | Mutable resources via Single Owner Chunks |
| `pkg/pss/` | Postal Service over Swarm (trojan chunk messaging) |
| `pkg/accesscontrol/` | Access Control Trie (ACT) for encrypted content sharing |
| `pkg/redundancy/` | Erasure coding (Reed-Solomon) |
| `pkg/replicas/` | Dispersed Replicas for chunk redundancy |
| `pkg/manifest/` | Trie-based directory structures (Mantaray) |
| `pkg/settlement/` | Payment channels (SWAP chequebook + pseudosettle) |
| `pkg/resolver/` | ENS and other name resolution |
| `pkg/crypto/` | Signing, key management |
| `pkg/keystore/` | Encrypted key storage (file-based and in-memory) |
| `pkg/accounting/` | Per-peer bandwidth accounting |
| `pkg/salud/` | Network health scoring |
| `pkg/log/` | Structured logging with V-levels |
| `pkg/metrics/` | Prometheus metrics |
| `pkg/tracing/` | OpenTracing / Jaeger |

## Coding Conventions

See `CODING.md` and `CODINGSTYLE.md` for full details. Key rules:

### General

- Follow [Effective Go](https://golang.org/doc/effective_go.html) and [Go Code Review Comments](https://github.com/golang/go/wiki/CodeReviewComments)
- **American English** spelling (marshaling, not marshalling; canceled, not cancelled)
- **Copyright header required** on every Go file:
  ```go
  // Copyright <year> The Swarm Authors. All rights reserved.
  // Use of this source code is governed by a BSD-style
  // license that can be found in the LICENSE file.
  ```
- **Conventional commits** — follow [conventionalcommits.org](https://www.conventionalcommits.org/en/v1.0.0/), imperative mood, max 72 chars

### Error Handling

- Propagate errors up; don't log and return the same error
- Annotate with `fmt.Errorf("context: %w", err)` — produces stack-trace-like messages
- Avoid "failed to" prefixes: use `"new store: %w"` not `"failed to create new store: %w"`
- Sentinel errors: `var ErrFoo = errors.New("package: description")`

### Concurrency

- Every goroutine must have a defined termination path
- Channels have an owning goroutine; prefer directional channel types
- Channel size: 0 (unbuffered) or 1 — anything else needs justification

### Code Style

- Avoid `init()` functions
- Start enums at `iota + 1` (zero value should mean "unset")
- Use `time.Duration` and `time.Time` (not raw ints for time)
- Verify interface compliance: `var _ Interface = (*Impl)(nil)`
- Avoid embedding types in public structs
- Exit only in `main()`, never from library code

### Logging

- Four levels: `Error`, `Warning`, `Info`, `Debug` (with V-levels)
- `Error`/`Warning` — for node operators, no internal implementation details
- `Debug` — for developers, include technical details
- Keys: `lower_snake_case`, human-readable, specific (`peer_address` not `address`)
- Loggers are runtime-configurable via `/loggers` API endpoint

## Testing

- **Separate test packages** preferred (e.g., `package api_test`) to test exported behavior
- Use `export_test.go` to expose internals only to tests
- **Run tests in parallel** where possible (`t.Parallel()`)
- **Compact test names** — use godoc for scenario description, not the test name
- Avoid "fail" in test names (conflicts with test runner output)
- Use `t.Fatal` / `t.FailNow` instead of `panic` in tests
- **Flaky tests**: name them with `FLAKY` suffix so CI can separate them
- **Integration tests**: use `-tags=integration` build tag
- **Goroutine leak detection**: `go.uber.org/goleak` is available
- **Mocking**: dependency injection via interfaces; no mocking frameworks

## Linting

**golangci-lint v2** with `.golangci.yml` config. Notable enabled linters:
- `goheader` — enforces copyright header
- `paralleltest` — detects missing `t.Parallel()`
- `misspell` — catches British spellings
- `errorlint` — ensures proper error wrapping
- `gochecknoinits` — flags `init()` functions
- `prealloc` — suggests pre-allocating slices
- `goconst` — detects repeated string literals (disabled in tests)
- `forbidigo` — restricts `fmt.Print` usage (allowed in `cmd/bee/cmd/`)

## Key Domain Concepts

- **Address** — 32-byte hash (`pkg/swarm/`). Used for both chunk addresses and node overlay addresses. Proximity is measured by XOR distance.
- **Chunk** — 4KB data unit (4096 bytes + 8-byte span). The fundamental storage unit.
- **CAC (Content Address Chunk)** — chunk whose address is determined by its content hash (BMT root)
- **SOC (Single Owner Chunk)** — chunk with a deterministic address derived from owner + ID, signed by the owner
- **Neighborhood** — set of addresses sharing a common prefix (proximity order). Determines which chunks a node is responsible for.
- **Proximity Order (PO)** — number of leading bits two addresses share (0-31, with extended PO up to 36)
- **Kademlia** — routing overlay based on XOR distance. Peers organized in bins by proximity order.
- **Postage Stamp** — proof of payment attached to chunks. Batch has depth (log2 capacity) and amount (per-chunk value).
- **PLUR** — smallest unit of BZZ token (1 BZZ = 10^16 PLUR)
- **Reserve** — chunks a node is responsible for storing based on its neighborhood
- **Push sync** — protocol for forwarding newly uploaded chunks to their neighborhood
- **Pull sync** — protocol for syncing chunks between peers in overlapping neighborhoods
- **Redistribution** — storage incentive game where nodes prove they store chunks and earn rewards

## Directory Structure

```
cmd/
  bee/
    cmd/          # Cobra CLI commands (start, init, deploy, db, version, split)
    main.go       # Entry point
openapi/
  Swarm.yaml      # OpenAPI 3.0 spec (v7.4.1)
  SwarmCommon.yaml # Shared schemas and parameters
packaging/        # System packaging (Homebrew, Debian, etc.)
pkg/
  api/            # HTTP API handlers and router (gorilla/mux)
  node/           # Node bootstrap and dependency wiring
  swarm/          # Core types (Address, Chunk, constants)
  p2p/            # P2P transport layer (libp2p)
  topology/       # Kademlia routing table
  storage/        # Storage interfaces (Putter, Getter, etc.)
  storer/         # Local store implementation (reserve, cache, upload, pinning)
  sharky/         # Blob storage engine
  postage/        # Postage stamp system (batches, contracts, events)
  storageincentives/ # Redistribution game agent
  pushsync/       # Push sync protocol
  pullsync/       # Pull sync protocol
  retrieval/      # Chunk retrieval protocol
  feeds/          # Mutable feed references
  pss/            # Postal Service over Swarm
  soc/            # Single Owner Chunks
  cac/            # Content Address Chunks
  bmt/            # Binary Merkle Tree hasher
  accesscontrol/  # ACT encryption
  redundancy/     # Erasure coding
  settlement/     # Payment (SWAP + pseudosettle)
  accounting/     # Per-peer bandwidth accounting
  crypto/         # Signing and key management
  keystore/       # Encrypted key storage
  log/            # Structured logging
  config/         # Chain and network configuration
  ...             # ~60 packages total
```

## Common Pitfalls

- **Chunk size is exactly 4096 bytes** (data) + 8-byte span prefix. Don't confuse `ChunkSize` (4096) with `ChunkWithSpanSize` (4104).
- **Addresses are XOR-distance based** — "closer" means more shared prefix bits, not numerically smaller.
- **`MaxPO` is 31** (5 bits of proximity order), but `ExtendedPO` goes to 36 for finer neighborhood granularity.
- **Don't log and return the same error** — pick one. Errors propagate up, logging happens at the handler/entry point.
- **Separate test packages** — tests go in `package foo_test`, not `package foo`. Use `export_test.go` for internal access.
- **Every goroutine needs a shutdown path** — typically via context cancellation or a quit channel.
- **Stamps can be unusable** — expired, depleted, or from an unsynced batch. Always check usability.
- **Full node vs light node** — some features (reserve, storage incentives) are only available on full nodes.
- **PLUR vs BZZ** — 1 BZZ = 10^16 PLUR. Smart contracts work in PLUR.

AGENTS.md

We recommend also adding an AGENTS.md file with identical content. This filename is recognized by other AI coding tools (OpenAI Codex, Cursor, etc.), ensuring broad coverage regardless of which assistant a contributor uses. Maintaining two files with the same content is low-overhead since the content rarely changes.

Reference

Part of the effort tracked in ethersphere/DevRel#825 — adding AI context files to core Swarm repositories (bee, bee-js, swarm-cli).


Generated with AI

Metadata

Metadata

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions