Commit 8494c15

docs: update README.md to reflect new project direction
1 parent a303a08 commit 8494c15

1 file changed

Lines changed: 195 additions & 44 deletions

README.md

@@ -1,49 +1,200 @@
1-
# ⏳︎ tskv — Time-Series Key-Value Store
1+
# ⏳︎ tskv — Time-Window **T**ime-**S**eries **K**ey-**V**alue Cache
2 2

3-
**TL;DR:** Single-node, crash-safe time-series KV store with a non-blocking TCP server (Linux **epoll**) and LSM-style storage: **Write-Ahead Log (WAL) → memtable → immutable SSTables**, plus a background compaction worker. Built with **C++23 modules**; no third-party libraries.
3+
**TL;DR:** `tskv` is a single-node, in-memory **time-window cache** for time-series data, with:
4+
5+
- A bounded sliding retention window (e.g., "last N minutes")
6+
- A simple binary protocol over a non-blocking TCP server (Linux **epoll**)
7+
- Time-partitioned in-memory segments for efficient expiration
8+
- Modern **C++23** with modules; no third-party libraries
9+
10+
Stage-0 (`v1.0`) focuses on a small, understandable core: a hot time window with clear semantics and basic observability, leaving WAL, multi-threading, and advanced features for future versions.
4 11

5 12
---
6 13

7 14
## ◎ Goals
8-
- Demonstrate disciplined systems design in modern C++.
9-
- Show clear **durability** boundaries (WAL append / optional sync) and **read-after-write** visibility.
10-
- Keep **backpressure** and buffers **bounded** for predictable latency.
11-
- Favor **correctness + measurable performance** over feature breadth.
12-
13-
## ⛶ Architecture
14-
- **Write path:** append to **WAL** → (optional `fdatasync`) → apply to **memtable** → periodic **flush** to **SSTable** (immutable, sorted).
15-
- **Read path:** **memtable** first → then newest-to-oldest **SSTables**; per-file **Bloom filter** to skip negatives; index to jump to the right block.
16-
- **Compaction:** merge overlapping SSTables, keep newest versions, drop obsolete ones; install via **manifest** with durable rename.
17-
18-
## ⚑ Roadmap (high-level)
19-
- [x] v0.1 — Bootstrap: README, CLI, PR template
20-
- [x] v0.2 — Non-blocking TCP + epoll echo; clean shutdown
21-
- [ ] v0.3 — Framing: header + length; PING/PONG
22-
- [ ] v0.4 — Connection state: RX/TX rings; backpressure cap
23-
- [ ] v0.5 — Engine queues: SPSC/MPSC; dispatcher
24-
- [ ] v0.6 — WAL v1: append+CRC; sync policy flag
25-
- [ ] v0.7 — Recovery: replay WAL; torn-tail safe
26-
- [ ] v0.8 — Memtable v0: std::map; PUT/GET end-to-end
27-
- [ ] v0.9 — SSTable v1: writer/reader; mmap; footer
28-
- [ ] v0.10 — Manifest: live tables; durable rename
29-
- [ ] v0.11 — Wire-through: GET/PUT via SST path
30-
- [ ] v0.12 — Bloom filters: per-SST; bits/key tuning
31-
- [ ] v0.13 — Memtable v1: skiplist + iterator
32-
- [ ] v0.14 — SCAN RPC: streaming RESP; writev batches
33-
- [ ] v0.15 — Concurrency: N I/O, M engine; fairness
34-
- [ ] v0.16 — Metrics: counters + p50/p95/p99 endpoint
35-
- [ ] v0.17 — Compaction v1: merge + manifest install
36-
- [ ] v0.18 — Chaos tests: disk-full; kill-9 loops
37-
- [ ] v0.19 — Perf pass: micro/macro benches; notes
38-
- [ ] v1.0 — Polish: docs, demo.sh, ASan/UBSan; release
39-
40-
## ∷ C++ Module Layout
41-
- `tskv.common.*` — logging, metrics, ring buffers, fs helpers
42-
- `tskv.net.*` — socket (non-blocking), reactor (**epoll**, edge-triggered), connection, rpc
43-
- `tskv.kv.*` — engine, wal, memtable, sstable, manifest, compaction, filters
44-
45-
## ∑ Metrics (planned)
46-
- **net:** connections_open, rx_bytes_total, tx_bytes_total, backpressure_events_total
47-
- **rpc:** put_total, get_total, scan_total, errors_total
48-
- **wal/sstable:** appends_total, fsync_total, files_total, bloom_negative_total
49-
- **latency:** p50/p95/p99 for GET & PUT
15+
16+
- Provide a compact, readable example of a **time-window time-series cache**:
17+
- Bounded memory via a fixed retention window
18+
- Explicit "visible data" contract: recent data only
19+
- Demonstrate disciplined systems design in **modern C++**:
20+
- C++23 modules
21+
- Non-blocking I/O with **epoll**
22+
- Favor **correctness + clear invariants** over feature breadth:
23+
- Simple, explicit write and read paths
24+
- Straightforward retention / expiration logic
25+
- Keep latency and resource usage **predictable**:
26+
- No unbounded growth from infinite history
27+
- Easy-to-reason-about hot path
28+
29+
---
30+
31+
## ⛶ Architecture (Stage-0)
32+
33+
### Data model
34+
35+
- Keys are time-series identifiers (e.g. `cpu.user`, `service=api,host=foo`).
36+
- Each write is a tuple `(series_id, timestamp, value)`.
37+
- The store maintains only data within a **sliding time window**:
38+
- `timestamp >= now - WINDOW`
39+
- Older data is considered expired and is eventually dropped.
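The window rule above can be sketched as a small predicate. This is only an illustration: the `Point` field names, the millisecond unit, and the `in_window` helper are assumptions, not part of the README. The README does not say how timestamps ahead of `now` are treated, so the sketch checks only the lower bound.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical point type mirroring the (series_id, timestamp, value) tuple.
struct Point {
    std::string series_id;   // e.g. "cpu.user"
    std::int64_t timestamp;  // assumed unit: milliseconds since the Unix epoch
    double value;
};

// True when a timestamp satisfies `timestamp >= now - window`.
// Only the lower bound is checked; future timestamps pass.
inline bool in_window(std::int64_t timestamp, std::int64_t now, std::int64_t window) {
    assert(window >= 0);  // a negative window has no meaning in this sketch
    return timestamp >= now - window;
}
```

With `now = 1500` and `window = 500`, a point at timestamp `1000` is still visible, while one at `999` counts as expired.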
40+
41+
### In-memory layout
42+
43+
- Data is stored in **time-partitioned segments**:
44+
- Each segment covers a fixed time slice (e.g. 1–10 seconds).
45+
- Segments are organized in time order (oldest → newest).
46+
- New writes go to the current "tail" segment.
47+
- A periodic retention pass drops the oldest segments whose time range is fully outside the configured window.
48+
49+
This keeps memory bounded by the volume of data written within `WINDOW` (plus any expired segments awaiting the next retention pass), not by total insert volume.
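One possible shape for this layout is a deque of fixed-span segments with a retention pass that pops from the front. The names `Segment`, `SegmentStore`, `put`, and `expire` are hypothetical, and rejecting writes older than the tail slice is a simplification made for this sketch; the README does not specify how late writes are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>
#include <utility>
#include <vector>

struct Point {
    std::string series_id;
    std::int64_t timestamp;  // assumed: non-negative integer ticks (e.g. ms)
    double value;
};

// One fixed time slice covering [start, start + span).
struct Segment {
    std::int64_t start;
    std::vector<Point> points;
};

class SegmentStore {
public:
    SegmentStore(std::int64_t window, std::int64_t span) : window_(window), span_(span) {
        assert(window > 0 && span > 0);
    }

    // Append to the tail segment, opening a new one when the timestamp
    // falls into a later slice. Returns false for writes older than the tail.
    bool put(Point p) {
        assert(p.timestamp >= 0);
        const std::int64_t start = (p.timestamp / span_) * span_;  // align to slice
        if (segments_.empty() || start > segments_.back().start) {
            segments_.push_back(Segment{start, {}});
        } else if (start < segments_.back().start) {
            return false;  // late write: rejected in this sketch only
        }
        segments_.back().points.push_back(std::move(p));
        return true;
    }

    // Retention pass: drop the oldest segments whose whole range lies
    // before now - window. Returns the number of segments dropped.
    std::size_t expire(std::int64_t now) {
        std::size_t dropped = 0;
        while (!segments_.empty() && segments_.front().start + span_ <= now - window_) {
            segments_.pop_front();
            ++dropped;
        }
        return dropped;
    }

    std::size_t segment_count() const { return segments_.size(); }

private:
    std::int64_t window_;
    std::int64_t span_;
    std::deque<Segment> segments_;  // ordered oldest -> newest
};
```

Dropping whole segments makes expiration proportional to the number of expired slices rather than the number of expired points, which is the motivation for time partitioning.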
50+
51+
### Network path
52+
53+
- Single-node server using **non-blocking TCP** and **epoll**.
54+
- Simple length-prefixed framing for requests and responses.
55+
- Initial RPCs:
56+
- `PING` / `PONG` for connectivity checks.
57+
- `PUT_TS(series_id, timestamp, value)` to append a point.
58+
- `GET_TS_LATEST(series_id)` to fetch the latest point in-window.
59+
- `RANGE(series_id, from_ts, to_ts)` to read points for a series over a time range (clipped to the window).
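Length-prefixed framing could be sketched as follows. The README does not define the wire format, so the 4-byte big-endian length header and the helper names `encode_frame` and `try_decode_frame` are assumptions made for illustration. A real server would also want to reject oversized lengths before buffering them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

constexpr std::size_t kHeaderSize = 4;  // assumed: 32-bit big-endian payload length

// Prefix the payload with its length.
std::vector<std::uint8_t> encode_frame(const std::string& payload) {
    assert(payload.size() <= 0xFFFFFFFFu);  // must fit in the 32-bit header
    const auto n = static_cast<std::uint32_t>(payload.size());
    std::vector<std::uint8_t> out{
        static_cast<std::uint8_t>(n >> 24), static_cast<std::uint8_t>(n >> 16),
        static_cast<std::uint8_t>(n >> 8), static_cast<std::uint8_t>(n)};
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}

// Pop one complete frame from the front of `buf`.
// Returns std::nullopt (and leaves `buf` untouched) if more bytes are needed,
// which is the normal case for partial reads on a non-blocking socket.
std::optional<std::string> try_decode_frame(std::vector<std::uint8_t>& buf) {
    if (buf.size() < kHeaderSize) return std::nullopt;
    const std::uint32_t n = (std::uint32_t{buf[0]} << 24) | (std::uint32_t{buf[1]} << 16) |
                            (std::uint32_t{buf[2]} << 8) | std::uint32_t{buf[3]};
    if (buf.size() - kHeaderSize < n) return std::nullopt;
    const auto body = buf.begin() + static_cast<std::ptrdiff_t>(kHeaderSize);
    const auto end = body + static_cast<std::ptrdiff_t>(n);
    std::string payload(body, end);
    buf.erase(buf.begin(), end);
    return payload;
}
```

The decoder works on an accumulating receive buffer, so the caller can append whatever bytes a read returned and call `try_decode_frame` in a loop until it yields nothing.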
60+
61+
---
62+
63+
## ⚑ Roadmap
64+
65+
### Implemented
66+
67+
- [x] **v0.1 — Bootstrap**
68+
- README, minimal CLI stub, basic project layout
69+
- PR template and basic coding conventions
70+
71+
- [x] **v0.2 — Non-blocking TCP**
72+
- Non-blocking server with **epoll**
73+
- Basic echo handler for manual testing
74+
- Clean shutdown path
75+
76+
### Planned to v1.0 (Stage-0)
77+
78+
- [ ] **v0.3 — Framing + basic RPCs**
79+
- Length-prefixed request/response framing
80+
- `PING` / `PONG` and error responses
81+
- Skeleton handlers for time-series commands
82+
83+
- [ ] **v0.4 — In-memory window store v0**
84+
- Global `WINDOW` config (e.g. last N minutes)
85+
- Single in-memory container for `(series_id, timestamp, value)`
86+
- Simple, periodic cleanup of expired entries
87+
- `PUT_TS` + `GET_TS_LATEST` end-to-end
88+
89+
- [ ] **v0.5 — Time-partitioned segments**
90+
- Replace the single container with fixed-duration segments
91+
- Append writes to the current segment
92+
- Drop whole segments when they fall completely out of window
93+
- Basic `RANGE(series_id, from_ts, to_ts)` over segments
94+
95+
- [ ] **v0.6 — Window-aware introspection**
96+
- `WINDOW_INFO` RPC:
97+
- Window size, number of segments
98+
- Approximate memory usage
99+
- Debug dump of segments and series counts
100+
101+
- [ ] **v0.7 — Metrics**
102+
- Simple counters and gauges:
103+
- `ts_put_total`, `ts_get_latest_total`, `ts_range_total`, `ts_errors_total`
104+
- `window_segments`, `window_series_approx`, `window_points_approx`
105+
- Text or simple binary metrics endpoint/command
106+
107+
- [ ] **v0.8 — Indexing pass (per-segment)**
108+
- Optional per-segment index:
109+
- `series_id -> offsets`
110+
- Speed up `GET_TS_LATEST` and `RANGE` without scanning all entries
111+
- Microbenchmarks for lookup vs. scan
112+
113+
- [ ] **v0.9 — Reliability + perf pass**
114+
- Basic property tests for window semantics:
115+
- Writes with timestamps < `now - WINDOW` are never visible
116+
- Writes with timestamps in the window remain visible
117+
- Simple load generator for PUT/GET/RANGE
118+
- First round of notes on throughput/latency
119+
120+
- [ ] **v1.0 — Stage-0 release**
121+
- Minimal but complete time-window cache:
122+
- Protocol, window store, segments, indexing, basic metrics
123+
- Documentation:
124+
- Architecture overview
125+
- Wire protocol reference
126+
- Example usage script (`demo.sh`)
127+
- Sanitizers in CI (ASan/UBSan) and a small test suite
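As an illustration of the `series_id -> offsets` index planned for v0.8, a segment could keep a hash map from series id to positions in its point vector. The `IndexedSegment` type and its methods are hypothetical and only sketch the idea under that assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

struct Point {
    std::string series_id;
    std::int64_t timestamp;
    double value;
};

// Hypothetical segment that records, per series, the offsets of its points.
class IndexedSegment {
public:
    void append(Point p) {
        index_[p.series_id].push_back(points_.size());  // offset of the new point
        points_.push_back(std::move(p));
    }

    // Latest point (by timestamp) for one series, visiting only that
    // series' offsets instead of scanning every entry in the segment.
    std::optional<Point> latest(const std::string& series_id) const {
        const auto it = index_.find(series_id);
        if (it == index_.end()) return std::nullopt;
        const Point* best = nullptr;
        for (const std::size_t off : it->second) {
            if (best == nullptr || points_[off].timestamp >= best->timestamp) {
                best = &points_[off];
            }
        }
        assert(best != nullptr);  // an indexed series always has at least one offset
        return *best;
    }

private:
    std::vector<Point> points_;
    std::unordered_map<std::string, std::vector<std::size_t>> index_;
};
```

Storing offsets rather than pointers keeps the index valid when the point vector reallocates on growth.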
128+
129+
---
130+
131+
## ∷ C++ Module Layout (Stage-0)
132+
133+
- `tskv.common.*`
134+
- Logging, basic metrics types, time helpers
135+
- Small ring buffers / utility containers
136+
- `tskv.net.*`
137+
- Socket wrapper (non-blocking)
138+
- Reactor (**epoll**, edge-triggered)
139+
- Connection and RPC framing
140+
- `tskv.window.*`
141+
- In-memory segments
142+
- Window management (retention, expiration)
143+
- Query execution (latest / range)
144+
- Simple per-segment indexing
145+
146+
---
147+
148+
## ∑ Metrics (Stage-0, planned)
149+
150+
- **net:**
151+
- `net_connections_open`
152+
- `net_rx_bytes_total`
153+
- `net_tx_bytes_total`
154+
155+
- **rpc:**
156+
- `rpc_ping_total`
157+
- `ts_put_total`
158+
- `ts_get_latest_total`
159+
- `ts_range_total`
160+
- `rpc_errors_total`
161+
162+
- **window:**
163+
- `window_segments`
164+
- `window_points_approx`
165+
- `window_series_approx`
166+
167+
- **latency (sample-based, coarse):**
168+
- p50 / p95 / p99 for:
169+
- `PUT_TS`
170+
- `GET_TS_LATEST`
171+
- `RANGE`
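A coarse, sample-based percentile can be computed with the nearest-rank method over a batch of recorded latencies. The README does not say which method is planned, so both the method and the `percentile` helper are assumptions for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it. Takes the samples by value
// so the caller's buffer is not reordered.
double percentile(std::vector<double> samples, double p) {
    assert(!samples.empty() && p > 0.0 && p <= 100.0);
    std::sort(samples.begin(), samples.end());
    const auto rank = static_cast<std::size_t>(
        std::ceil(p * static_cast<double>(samples.size()) / 100.0));
    return samples[std::max<std::size_t>(rank, 1) - 1];  // rank is 1-based
}
```

Sorting a copy on each call is fine for small sample buffers read occasionally; a hot path would more likely keep a fixed-size reservoir or histogram.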
172+
173+
---
174+
175+
## After v1.0 (ideas, not committed)
176+
177+
These are potential extensions beyond the Stage-0 scope:
178+
179+
- **Durability for the hot window**
180+
- WAL segments per time-partitioned segment
181+
- Startup replay to reconstruct the last N minutes after a crash
182+
183+
- **Multi-threading**
184+
- Split I/O and engine into separate threads
185+
- Shard data across multiple engine threads by series id
186+
187+
- **Richer time-series semantics**
188+
- Per-series TTL overrides
189+
- Server-side aggregates:
190+
- Windowed `SUM` / `AVG` / `MIN` / `MAX` RPCs
191+
192+
- **Advanced observability**
193+
- More detailed metrics (per-series or per-connection)
194+
- Debug RPCs to inspect segment contents, hot keys, etc.
195+
196+
- **Replication / HA experiments**
197+
- Simple follower replication for the time window
198+
- Eventually-consistent read replicas
199+
200+
Stage-0 (`v1.0`) stays deliberately small: a single-node, in-memory time-window cache with a clear contract and straightforward implementation. Everything else can grow out of that foundation.
