Commit db0354c

DavidLiedle and claude committed
Add Database Internals book with GitHub Pages deployment
Complete educational book covering database internals:
- 15 chapters covering storage engines, indexing, transactions, query processing, recovery, and distributed databases
- Comprehensive glossary and further reading appendices
- GitHub Actions workflow for mdBook deployment to GitHub Pages
- Custom CSS styling for improved readability

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 321af53 commit db0354c

24 files changed

Lines changed: 9991 additions & 0 deletions

.github/workflows/deploy.yml

Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
+name: Deploy Book to GitHub Pages
+
+on:
+  push:
+    branches:
+      - main
+  pull_request:
+    branches:
+      - main
+  workflow_dispatch:
+
+permissions:
+  contents: read
+  pages: write
+  id-token: write
+
+concurrency:
+  group: "pages"
+  cancel-in-progress: false
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+
+      - name: Setup mdBook
+        uses: peaceiris/actions-mdbook@v2
+        with:
+          mdbook-version: 'latest'
+
+      - name: Build book
+        run: mdbook build
+
+      - name: Setup Pages
+        uses: actions/configure-pages@v4
+
+      - name: Upload artifact
+        uses: actions/upload-pages-artifact@v3
+        with:
+          path: './book'
+
+  deploy:
+    if: github.ref == 'refs/heads/main'
+    environment:
+      name: github-pages
+      url: ${{ steps.deployment.outputs.page_url }}
+    runs-on: ubuntu-latest
+    needs: build
+    steps:
+      - name: Deploy to GitHub Pages
+        id: deployment
+        uses: actions/deploy-pages@v4
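
The workflow above has three triggers and one `if:` gate on the `deploy` job, so it is not obvious at a glance which jobs run for which event. The Python sketch below is a simplified model of that logic, not part of the commit. The function name `jobs_to_run` and its parameters are hypothetical. It assumes that a pull request run sees a `refs/pull/<number>/merge` ref and that the `branches` filter on `pull_request` matches the base branch, which is the documented GitHub Actions behaviour.

```python
MAIN_REF = "refs/heads/main"


def jobs_to_run(event: str, ref: str, pr_base: str = "main") -> list[str]:
    """Model which jobs of deploy.yml run for a given event and git ref.

    Hypothetical helper for illustration only; it mirrors the `on:` filters
    and the `if: github.ref == 'refs/heads/main'` gate on the deploy job.
    """
    if event == "push":
        # `push` is filtered to the main branch.
        triggered = ref == MAIN_REF
    elif event == "pull_request":
        # `pull_request` is filtered on the PR's base branch.
        triggered = pr_base == "main"
    elif event == "workflow_dispatch":
        # Manual runs have no branch filter.
        triggered = True
    else:
        triggered = False

    if not triggered:
        return []

    jobs = ["build"]
    # deploy needs build and is additionally gated on the ref being main.
    if ref == MAIN_REF:
        jobs.append("deploy")
    return jobs


if __name__ == "__main__":
    print(jobs_to_run("push", MAIN_REF))
    print(jobs_to_run("pull_request", "refs/pull/7/merge"))
    print(jobs_to_run("workflow_dispatch", "refs/heads/draft"))
```

Under this model, pull requests against `main` build the book and upload the artifact as a check but never publish it, and a manual run from a non-main branch also stops after `build`.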

.gitignore

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+# mdBook build output
+book/
+
+# OS files
+.DS_Store
+Thumbs.db
+
+# Editor files
+*.swp
+*.swo
+*~
+.idea/
+.vscode/

README.md

Lines changed: 121 additions & 0 deletions
@@ -0,0 +1,121 @@
+# Database Internals: Where Your Data Actually Lives
+
+**A CloudStreet Educational Book**
+
+*Written by Opus 4.5*
+
+---
+
+[![Deploy Book](https://github.com/cloudstreet-dev/Database-Internals/actions/workflows/deploy.yml/badge.svg)](https://github.com/cloudstreet-dev/Database-Internals/actions/workflows/deploy.yml)
+
+## Read Online
+
+**[Read the book online](https://cloudstreet-dev.github.io/Database-Internals/)** - Hosted on GitHub Pages
+
+---
+
+## About This Book
+
+Ever wondered what happens when you hit COMMIT? Why does that one query take 30 seconds while another returns instantly? What's actually going on when your database "recovers" after a crash?
+
+This book takes you on a journey into the heart of database systems: the storage engines, B-trees, write-ahead logs, and MVCC implementations that power everything from your local SQLite database to planet-scale distributed systems. We'll explore how databases transform your SQL queries into disk operations, manage concurrent access from thousands of users, and guarantee your data survives power failures and hardware crashes.
+
+Whether you're a developer trying to understand why your queries are slow, an engineer designing data-intensive systems, or simply curious about one of the most sophisticated pieces of software ever created, this book will give you the mental models to understand what's really happening beneath the abstraction layers.
+
+## Who This Book Is For
+
+- **Backend developers** who want to write better queries and design better schemas
+- **Software engineers** building systems that interact heavily with databases
+- **System architects** making decisions about data storage and retrieval
+- **The curious** who want to understand the engineering marvels hiding behind `SELECT * FROM users`
+
+## What You'll Learn
+
+- How data is physically organized on disk and in memory
+- The data structures that make queries fast (and when they don't)
+- How databases handle multiple users reading and writing simultaneously
+- What guarantees ACID actually provides and how they're implemented
+- Why write-ahead logging is essential for crash recovery
+- How query optimizers decide the best way to execute your SQL
+- The trade-offs between different storage engine architectures
+- How distributed databases maintain consistency across machines
+
+## Table of Contents
+
+### Part I: Foundations
+1. [Introduction: The Journey of a Query](src/01-introduction.md)
+2. [Storage Engines and File Formats](src/02-storage-engines.md)
+3. [Disk I/O and Page Management](src/03-disk-io.md)
+
+### Part II: Data Structures
+4. [Indexing Structures: B-Trees and Beyond](src/04-indexing-structures.md)
+5. [LSM Trees and Write-Optimized Structures](src/05-lsm-trees.md)
+6. [Hash Indexes and Specialized Structures](src/06-hash-indexes.md)
+
+### Part III: Transactions and Concurrency
+7. [Write-Ahead Logging (WAL)](src/07-write-ahead-logging.md)
+8. [MVCC and Transaction Isolation](src/08-mvcc-isolation.md)
+9. [Locking and Concurrency Control](src/09-locking-concurrency.md)
+
+### Part IV: Query Processing
+10. [Query Parsing and Planning](src/10-query-parsing.md)
+11. [Query Optimization](src/11-query-optimization.md)
+12. [Buffer Pools and Caching](src/12-buffer-pools.md)
+
+### Part V: Reliability and Scale
+13. [Recovery and Crash Safety](src/13-recovery.md)
+14. [Column Stores vs Row Stores](src/14-column-vs-row.md)
+15. [Distributed Databases and Replication](src/15-distributed-databases.md)
+
+### Appendices
+- [Appendix A: Glossary of Terms](src/appendix-a-glossary.md)
+- [Appendix B: Further Reading](src/appendix-b-reading.md)
+
+## How to Read This Book
+
+This book is designed to be read sequentially, as later chapters build on concepts introduced earlier. However, if you're already familiar with certain topics, feel free to skip ahead:
+
+- **New to databases?** Start from Chapter 1 and work through sequentially.
+- **Know the basics?** Skip to Part II for the data structure deep-dives.
+- **Here for concurrency?** Part III covers transactions, locking, and MVCC.
+- **Query performance issues?** Part IV on query processing will be most relevant.
+- **Scaling up?** Part V covers distributed systems and different storage architectures.
+
+## Building Locally
+
+This book is built using [mdBook](https://rust-lang.github.io/mdBook/). To build locally:
+
+```bash
+# Install mdBook
+cargo install mdbook
+
+# Build the book
+mdbook build
+
+# Serve locally with hot-reload
+mdbook serve --open
+```
+
+## Conventions Used
+
+Throughout this book, we use several conventions:
+
+- `Code blocks` indicate SQL, pseudocode, or data structure representations
+- **Bold terms** indicate important concepts being introduced
+- *Italics* are used for emphasis and technical terms
+- ASCII diagrams illustrate data structures and system architectures
+- PostgreSQL is used as the primary reference implementation, with notes on how other databases differ
+
+## About the Author
+
+This book was written by **Opus 4.5**, Anthropic's AI assistant, as part of the CloudStreet educational series. The content synthesizes knowledge from database research papers, system documentation, and practical engineering experience into an accessible guide for working developers.
+
+## License
+
+This work is part of the CloudStreet Educational Series.
+
+---
+
+*"The database is the most important software component in most applications, yet it remains a black box to most developers. Let's open that box."*
+
+-- Opus 4.5

book.toml

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+[book]
+title = "Database Internals: Where Your Data Actually Lives"
+authors = ["Opus 4.5"]
+description = "A deep dive into how databases actually work under the hood"
+language = "en"
+src = "src"
+
+[build]
+build-dir = "book"
+
+[output.html]
+default-theme = "light"
+preferred-dark-theme = "navy"
+git-repository-url = "https://github.com/cloudstreet-dev/Database-Internals"
+edit-url-template = "https://github.com/cloudstreet-dev/Database-Internals/edit/main/{path}"
+site-url = "/Database-Internals/"
+additional-css = ["custom.css"]
+
+[output.html.fold]
+enable = true
+level = 1
+
+[output.html.search]
+enable = true
+limit-results = 30
+teaser-word-count = 30
+use-hierarchical = true

custom.css

Lines changed: 98 additions & 0 deletions
@@ -0,0 +1,98 @@
+/* Custom styles for Database Internals book */
+
+/* Improve code block readability */
+pre {
+    border-radius: 6px;
+    padding: 1em;
+}
+
+/* Style for ASCII diagrams */
+pre code {
+    font-family: 'SF Mono', 'Monaco', 'Inconsolata', 'Fira Mono', 'Droid Sans Mono', 'Source Code Pro', monospace;
+    font-size: 0.85em;
+    line-height: 1.4;
+}
+
+/* Blockquote styling for chapter epigraphs */
+blockquote {
+    border-left: 4px solid #4a90d9;
+    padding-left: 1em;
+    margin-left: 0;
+    font-style: italic;
+    color: #555;
+}
+
+/* Table styling */
+table {
+    width: 100%;
+    border-collapse: collapse;
+    margin: 1em 0;
+}
+
+th, td {
+    border: 1px solid #ddd;
+    padding: 0.5em 1em;
+    text-align: left;
+}
+
+th {
+    background-color: #f5f5f5;
+    font-weight: 600;
+}
+
+tr:nth-child(even) {
+    background-color: #fafafa;
+}
+
+/* Dark theme adjustments */
+.navy blockquote,
+.coal blockquote,
+.ayu blockquote {
+    color: #aaa;
+    border-left-color: #6fa3d9;
+}
+
+.navy th,
+.coal th,
+.ayu th {
+    background-color: #2a2a2a;
+}
+
+.navy tr:nth-child(even),
+.coal tr:nth-child(even),
+.ayu tr:nth-child(even) {
+    background-color: #1a1a1a;
+}
+
+/* Improve heading hierarchy */
+h1 {
+    border-bottom: 2px solid #4a90d9;
+    padding-bottom: 0.3em;
+}
+
+h2 {
+    border-bottom: 1px solid #ddd;
+    padding-bottom: 0.2em;
+}
+
+/* Definition list styling (for glossary) */
+dt {
+    font-weight: 600;
+    margin-top: 1em;
+}
+
+dd {
+    margin-left: 1.5em;
+    margin-bottom: 0.5em;
+}
+
+/* Make navigation clearer */
+.chapter li.chapter-item {
+    margin-top: 0.5em;
+}
+
+.chapter li.part-title {
+    margin-top: 1.5em;
+    font-weight: 700;
+    color: #4a90d9;
+}
