Gen #34

79 changes: 45 additions & 34 deletions agents/bgl_migration_strategy.md

Large diffs are not rendered by default.

208 changes: 208 additions & 0 deletions docs/user-guide/generators.md
@@ -20,8 +20,14 @@
- [Generators](#generators)
- [path_graph](#path_graph)
- [grid_graph](#grid_graph)
- [complete_graph](#complete_graph)
- [erdos_renyi_graph](#erdos_renyi_graph)
- [erdos_renyi_gnm](#erdos_renyi_gnm)
- [barabasi_albert_graph](#barabasi_albert_graph)
- [watts_strogatz](#watts_strogatz)
- [rmat](#rmat)
- [plod](#plod)
- [ssca](#ssca)
- [Example: Building and Querying a Generated Graph](#example)

---
@@ -42,8 +48,14 @@ All generators are header-only and require no external dependencies.
// Or include individually:
#include <graph/generators/path.hpp>
#include <graph/generators/grid.hpp>
#include <graph/generators/complete.hpp>
#include <graph/generators/erdos_renyi.hpp>
#include <graph/generators/gnm.hpp>
#include <graph/generators/barabasi_albert.hpp>
#include <graph/generators/watts_strogatz.hpp>
#include <graph/generators/rmat.hpp>
#include <graph/generators/plod.hpp>
#include <graph/generators/ssca.hpp>
```

---
@@ -95,6 +107,36 @@ auto edges = graph::generators::grid_graph(3u, 4u);

---

### `complete_graph`

Generates a complete graph K(n): every ordered pair `(u, v)` with `u ≠ v`.

```cpp
template <class VId = uint32_t>
auto complete_graph(VId n, uint64_t seed = 42,
weight_dist wdist = weight_dist::uniform)
-> std::vector<copyable_edge_t<VId, double>>;
```

| Parameter | Description |
|-----------|-------------|
| `n` | Number of vertices |
| `seed` | Random seed for reproducible edge weights |
| `wdist` | Edge-weight distribution: `weight_dist::uniform` (U[1,100], default), `weight_dist::exponential` (Exp(0.1)+1), or `weight_dist::constant_one` (1.0) |

**Returns:** `n * (n-1)` directed edges — the fully-connected graph — sorted by
source id, then target id.

> **Warning:** the edge count grows as O(n²); generating K(n) for large `n` is
> memory-intensive (e.g. `n = 10'000` yields ~100M edges).

```cpp
auto edges = graph::generators::complete_graph(100u);
// 100 * 99 = 9'900 directed edges
```
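
To make the O(n²) warning concrete, here is a minimal sketch that estimates the size of K(n) before generating it. `edge_sketch` is a hypothetical stand-in for the library's edge record (its real layout may differ), and both helper functions are illustrative names that are not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for an edge record: source id, target id, weight.
// The library's actual copyable_edge_t layout is an assumption here.
struct edge_sketch {
  std::uint32_t source;
  std::uint32_t target;
  double        weight;
};

// Number of directed edges in K(n): every ordered pair (u, v) with u != v.
constexpr std::uint64_t complete_edge_count(std::uint64_t n) {
  return n == 0 ? 0 : n * (n - 1);
}

// Lower bound on the bytes needed to hold the whole edge list contiguously.
constexpr std::uint64_t complete_edge_bytes(std::uint64_t n) {
  return complete_edge_count(n) * sizeof(edge_sketch);
}
```

On a platform where `edge_sketch` occupies 16 bytes, `complete_edge_bytes(10'000)` comes to roughly 1.6 GB, so checking the estimate before calling the generator can be worthwhile.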

---

### `erdos_renyi_graph`

Generates a random graph using the Erdős–Rényi G(n, p) model.
@@ -121,6 +163,38 @@ auto edges = graph::generators::erdos_renyi_graph(100u, 0.05);

---

### `erdos_renyi_gnm`

Generates a random graph using the Erdős–Rényi G(n, m) model, the
fixed-edge-count companion to `erdos_renyi_graph`. Exactly `m` distinct edges are
selected uniformly at random from the `n * (n-1)` ordered pairs.

```cpp
template <class VId = uint32_t>
auto erdos_renyi_gnm(VId n, size_t m, uint64_t seed = 42,
weight_dist wdist = weight_dist::uniform)
-> std::vector<copyable_edge_t<VId, double>>;
```

| Parameter | Description |
|-----------|-------------|
| `n` | Number of vertices |
| `m` | Number of edges to generate (clamped to `n * (n-1)` if larger) |
| `seed` | Random seed for reproducibility |
| `wdist` | Edge-weight distribution: `weight_dist::uniform` (U[1,100], default), `weight_dist::exponential` (Exp(0.1)+1), or `weight_dist::constant_one` (1.0) |

**Returns:** Exactly `m` distinct directed edges (`u ≠ v`), or `n * (n-1)` if `m`
was clamped, sorted by source id.
Use this model when a precise edge count is required (e.g. controlling graph
density for benchmarks); use `erdos_renyi_graph` (G(n, p)) when each edge should exist
independently with a fixed probability.

```cpp
auto edges = graph::generators::erdos_renyi_gnm(100u, 500u);
// exactly 500 distinct directed edges
```
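
One way to select exactly `m` distinct ordered pairs is to sample `m` distinct indices from `[0, n * (n-1))` and decode each index into a pair `(u, v)` with `u ≠ v`. The sketch below does this with Floyd's sampling algorithm. It is an illustration under that assumption, not the library's implementation; `gnm_pairs_sketch` is a hypothetical name and edge weights are omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <random>
#include <set>
#include <utility>
#include <vector>

// Hypothetical helper: returns min(m, n*(n-1)) distinct ordered pairs (u, v),
// u != v, sorted by source id and then target id.
std::vector<std::pair<std::uint32_t, std::uint32_t>>
gnm_pairs_sketch(std::uint32_t n, std::uint64_t m, std::uint64_t seed = 42) {
  std::vector<std::pair<std::uint32_t, std::uint32_t>> out;
  if (n < 2) return out;  // no ordered pair with u != v exists
  const std::uint64_t total = std::uint64_t{n} * (n - 1);
  m = std::min(m, total);  // clamp, mirroring the documented behaviour

  std::mt19937_64 rng(seed);
  std::set<std::uint64_t> chosen;  // ordered, so decoding yields sorted pairs
  // Floyd's algorithm: draws exactly m distinct values from [0, total).
  for (std::uint64_t j = total - m; j < total; ++j) {
    std::uniform_int_distribution<std::uint64_t> pick(0, j);
    const std::uint64_t t = pick(rng);
    if (!chosen.insert(t).second) chosen.insert(j);
  }

  out.reserve(chosen.size());
  for (std::uint64_t k : chosen) {
    // Index k encodes source u and the r-th target once u itself is skipped.
    const auto u = static_cast<std::uint32_t>(k / (n - 1));
    const auto r = static_cast<std::uint32_t>(k % (n - 1));
    out.emplace_back(u, r < u ? r : r + 1);
  }
  return out;
}
```

Sampling indices rather than pairs avoids rejection loops when `m` is close to `n * (n-1)`, at the cost of O(m log m) set operations.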

---

### `barabasi_albert_graph`

Generates a scale-free graph using the Barabási–Albert preferential attachment model.
@@ -147,6 +221,140 @@ auto edges = graph::generators::barabasi_albert_graph(1000u, 3u);

---

### `watts_strogatz`

Generates a small-world graph using the Watts–Strogatz model: a ring lattice
where each vertex connects to its `k` nearest neighbours, with each forward
lattice edge rewired to a random target with probability `beta`.

```cpp
template <class VId = uint32_t>
auto watts_strogatz(VId n, VId k, double beta, uint64_t seed = 42,
weight_dist wdist = weight_dist::uniform)
-> std::vector<copyable_edge_t<VId, double>>;
```

| Parameter | Description |
|-----------|-------------|
| `n` | Number of vertices (must be > `k`) |
| `k` | Each vertex connects to its `k` nearest ring neighbours (rounded down to even) |
| `beta` | Rewiring probability in [0, 1]: `0` = pure ring lattice, `1` ≈ random graph |
| `seed` | Random seed for reproducibility |
| `wdist` | Edge-weight distribution: `weight_dist::uniform` (default), `weight_dist::exponential`, or `weight_dist::constant_one` |

**Returns:** Bidirectional edges (each undirected pair emitted both ways), sorted
by source id. Intermediate `beta` (~0.01–0.1) produces the characteristic
small-world regime: high clustering with short average path length.

```cpp
auto edges = graph::generators::watts_strogatz(100u, 6u, 0.1);
// ring lattice of degree 6; each forward edge rewired with probability 0.1
```
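
The two phases described above (build the ring lattice, then rewire forward edges) can be sketched as follows. This is a simplified illustration, not the library's code: `watts_strogatz_sketch` and `pair_sketch` are hypothetical names, weights are omitted, and the retry policy for picking a free rewiring target is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <set>
#include <utility>
#include <vector>

using pair_sketch = std::pair<std::uint32_t, std::uint32_t>;

// Hypothetical sketch: returns both directions of every undirected edge,
// sorted by (source, target).
std::vector<pair_sketch> watts_strogatz_sketch(std::uint32_t n, std::uint32_t k,
                                               double beta, std::uint64_t seed = 42) {
  assert(n > k);
  const std::uint32_t half = k / 2;  // forward neighbours per vertex (k rounded down to even)
  std::mt19937_64 rng(seed);
  std::bernoulli_distribution rewire(beta);
  std::uniform_int_distribution<std::uint32_t> any_vertex(0, n - 1);

  std::set<pair_sketch> edges;  // ordered set keeps pairs sorted and unique
  auto add = [&edges](std::uint32_t u, std::uint32_t v) {
    edges.emplace(u, v);
    edges.emplace(v, u);
  };
  auto drop = [&edges](std::uint32_t u, std::uint32_t v) {
    edges.erase(pair_sketch{u, v});
    edges.erase(pair_sketch{v, u});
  };

  // Phase 1: ring lattice, each vertex linked to its `half` successors.
  for (std::uint32_t u = 0; u < n; ++u)
    for (std::uint32_t j = 1; j <= half; ++j) add(u, (u + j) % n);

  // Phase 2: rewire each forward lattice edge with probability beta.
  for (std::uint32_t u = 0; u < n; ++u) {
    for (std::uint32_t j = 1; j <= half; ++j) {
      if (!rewire(rng)) continue;
      // Bounded retries: the lattice edge is kept if no free target turns up.
      for (std::uint32_t attempt = 0; attempt < n; ++attempt) {
        const std::uint32_t w = any_vertex(rng);
        if (w == u || edges.count(pair_sketch{u, w}) != 0) continue;
        drop(u, (u + j) % n);
        add(u, w);
        break;
      }
    }
  }
  return std::vector<pair_sketch>(edges.begin(), edges.end());
}
```

Because every rewire swaps one undirected edge for another, the sketch always returns `2 * n * (k / 2)` directed edges regardless of `beta`.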

---

### `rmat`

Generates a directed graph using the R-MAT (Recursive MATrix) model, which
produces the power-law / community structure used by the Graph500 benchmark.
Each edge is placed by recursively descending into one of four adjacency-matrix
quadrants with probabilities `(a, b, c, d)`.

```cpp
template <class VId = uint32_t>
auto rmat(uint32_t scale, size_t m,
double a = 0.57, double b = 0.19, double c = 0.19, double d = 0.05,
uint64_t seed = 42, weight_dist wdist = weight_dist::uniform)
-> std::vector<copyable_edge_t<VId, double>>;
```

| Parameter | Description |
|-----------|-------------|
| `scale` | Graph has `2^scale` vertices |
| `m` | Number of directed edges to attempt to place |
| `a, b, c, d` | Quadrant probabilities (should sum to ~1; normalised internally) |
| `seed` | Random seed for reproducibility |
| `wdist` | Edge-weight distribution (see above) |

**Returns:** Up to `m` distinct directed edges (self-loops and duplicates
removed), sorted by source id. The default `(0.57, 0.19, 0.19, 0.05)` are the
standard Graph500 parameters.

```cpp
auto edges = graph::generators::rmat<uint32_t>(16, 1u << 18);
// 65'536 vertices, up to 262'144 edges (fewer once duplicates and
// self-loops are removed), skewed degree distribution
```
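
The recursive quadrant descent for a single edge can be sketched as below: at each of the `scale` levels, one quadrant is chosen and contributes one bit to the row (source) and one bit to the column (target). `rmat_edge_sketch` is a hypothetical name used for illustration; the library's own implementation, including how it removes duplicates and self-loops, may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <utility>

// Hypothetical sketch: place one (source, target) pair in a 2^scale x 2^scale
// adjacency matrix by descending `scale` levels of quadrants.
std::pair<std::uint64_t, std::uint64_t>
rmat_edge_sketch(std::uint32_t scale, double a, double b, double c, double d,
                 std::mt19937_64& rng) {
  const double sum = a + b + c + d;  // scale draws so the inputs need not sum to 1
  std::uniform_real_distribution<double> unit(0.0, 1.0);
  std::uint64_t row = 0;
  std::uint64_t col = 0;
  for (std::uint32_t level = 0; level < scale; ++level) {
    const double r = unit(rng) * sum;
    row <<= 1;
    col <<= 1;
    if (r < a) {
      // top-left quadrant: both bits stay 0
    } else if (r < a + b) {
      col |= 1;  // top-right quadrant
    } else if (r < a + b + c) {
      row |= 1;  // bottom-left quadrant
    } else {
      row |= 1;  // bottom-right quadrant
      col |= 1;
    }
  }
  return {row, col};
}
```

With the default parameters the top-left quadrant is picked most often, which concentrates edges on low vertex ids and produces the skewed degree distribution.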

---

### `plod`

Generates a directed graph with a power-law out-degree distribution
(Palmer–Steffan PLOD model). Each vertex is assigned a target out-degree drawn
from a power law, then edges are placed to random targets.

```cpp
template <class VId = uint32_t>
auto plod(VId n, double alpha = 2.5, double beta = 10.0,
uint64_t seed = 42, weight_dist wdist = weight_dist::uniform)
-> std::vector<copyable_edge_t<VId, double>>;
```

| Parameter | Description |
|-----------|-------------|
| `n` | Number of vertices |
| `alpha` | Power-law exponent (larger ⇒ steeper degree decay) |
| `beta` | Degree scaling factor (larger ⇒ denser graph) |
| `seed` | Random seed for reproducibility |
| `wdist` | Edge-weight distribution (see above) |

**Returns:** Directed edges (no self-loops or duplicates), sorted by source id.

> **Note:** For most scale-free use cases [`barabasi_albert_graph`](#barabasi_albert_graph)
> is a better choice; `plod` is provided for BGL parity.

```cpp
auto edges = graph::generators::plod(1000u, 2.5, 10.0);
// power-law out-degree distribution
```

---

### `ssca`

Generates an SSCA#2 (HPCS Scalable Synthetic Compact Applications #2) benchmark
graph: randomly-sized cliques connected by sparse inter-clique edges whose
probability decays with the inter-clique id distance.

```cpp
template <class VId = uint32_t>
auto ssca(VId n, VId max_clique_size = 8, double prob_inter_clique = 0.2,
int max_parallel_edges = 2, uint64_t seed = 42,
weight_dist wdist = weight_dist::uniform)
-> std::vector<copyable_edge_t<VId, double>>;
```

| Parameter | Description |
|-----------|-------------|
| `n` | Number of vertices |
| `max_clique_size` | Maximum clique size (sizes drawn uniformly from [1, this]) |
| `prob_inter_clique` | Probability a vertex emits an inter-clique edge |
| `max_parallel_edges` | Maximum parallel edges per intra-clique pair |
| `seed` | Random seed for reproducibility |
| `wdist` | Edge-weight distribution (see above) |

**Returns:** Directed edges sorted by source id. Vertices are partitioned into
consecutive cliques; every ordered pair within a clique is connected (with up to
`max_parallel_edges` parallel edges — a defining SSCA#2 trait), plus sparse
inter-clique links. Self-loops are skipped.

```cpp
auto edges = graph::generators::ssca(1000u, 8u, 0.2);
// clustered graph: dense cliques + sparse inter-clique edges
```
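
The partitioning of vertices into consecutive cliques can be sketched as follows, assuming sizes are drawn uniformly from `[1, max_clique_size]` and the last clique is truncated to fit `n`. `clique_starts_sketch` is a hypothetical helper for illustration only; the truncation rule is an assumption, and intra-clique and inter-clique edge generation are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical sketch: returns clique boundaries. Clique i covers the
// vertex ids in [starts[i], starts[i + 1]).
std::vector<std::uint32_t> clique_starts_sketch(std::uint32_t n,
                                                std::uint32_t max_clique_size,
                                                std::uint64_t seed = 42) {
  assert(max_clique_size >= 1);
  std::mt19937_64 rng(seed);
  std::uniform_int_distribution<std::uint32_t> size_dist(1, max_clique_size);

  std::vector<std::uint32_t> starts;
  starts.push_back(0);
  while (starts.back() < n) {
    const std::uint32_t remaining = n - starts.back();
    // Assumption: the final clique shrinks so that exactly n vertices are used.
    const std::uint32_t size = std::min(size_dist(rng), remaining);
    starts.push_back(starts.back() + size);
  }
  return starts;
}
```

A generator built on these boundaries would then connect every ordered pair inside each range and add the sparse inter-clique links separately.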

---

## Example

```cpp