Commit 9dd963a

Merge pull request #5 from SingleRust/dev-leiden-optimization
Dev leiden optimization
2 parents 7a30018 + 95bf891 commit 9dd963a

17 files changed

Lines changed: 528 additions & 772 deletions

Cargo.lock

Lines changed: 10 additions & 41 deletions

Cargo.toml

Lines changed: 4 additions & 4 deletions
@@ -1,6 +1,6 @@
 [package]
 name = "single-clustering"
-version = "0.6.0"
+version = "0.6.1"
 edition = "2024"
 authors = ["Ian F. Diks"]
 homepage = "https://singlerust.com"
@@ -12,13 +12,13 @@ description = "A high-performance network clustering library implementing commun

 [dependencies]
 anyhow = "1.0.98"
-kiddo = {version = "5.0.3" }
+kiddo = {version = "5.2.2" }
 nalgebra-sparse = "0.10.0"
 ndarray = {version = "0.16.1" , features = ["rayon"]}
 num-traits = "0.2.19"
-petgraph = { version = "0.8.1", features = ["rayon"] }
+petgraph = { version = "0.8.2", features = ["rayon"] }
 rayon = "1.10.0"
-single-utilities = "0.6.0"
+single-utilities = "0.8.5"
 rand = "0.9.0"
 rand_chacha = {version = "0.9.0"}
 hnsw_rs = {version = "0.3.2", features = ["simdeez_f"]}

README.md

Lines changed: 51 additions & 32 deletions
@@ -1,38 +1,50 @@
 # single-clustering

-A Rust library for community detection and graph clustering algorithms.
+⚠️ **Development Status**: This library is currently under heavy development and should **not be considered production ready**. APIs may change significantly between versions.
+
+A Rust library for community detection and graph clustering algorithms with a focus on performance and flexibility.

 ## Features

-- **Network Analysis**: Efficient graph representation and manipulation for clustering tasks
-- **Community Detection**: Implementation of state-of-the-art algorithms
-  - Louvain method for community detection
-  - Leiden algorithm (enhanced version of Louvain)
-- **Flexible Grouping**: Abstract trait system for creating and managing node clusters
-- **Performance**: Parallel computation support via Rayon
-- **K-NN Graph Creation**: Build networks from high-dimensional data points
+- **Efficient Network Representation**: CSR (Compressed Sparse Row) format for optimal memory usage and performance
+- **Community Detection Algorithms**:
+  - **Leiden Algorithm**: State-of-the-art method with guaranteed well-connected communities
+  - **Louvain Method**: Classic modularity optimization (work-in-progress)
+- **Quality Functions**: Multiple partition quality metrics
+  - Modularity optimization
+  - Reichardt-Bornholdt (RB) configuration model with tunable resolution
+- **Flexible Architecture**: Generic trait-based design supporting different network types and grouping strategies
+- **Performance Optimized**: Caching strategies and efficient data structures for large networks

 ## Usage

 ```rust
-use single_clustering::network::Network;
-use single_clustering::network::grouping::VectorGrouping;
-use single_clustering::community_search::leiden::Leiden;
-
-// Create a network from your data
-let network = Network::new_from_graph(graph);
-
-// Initialize clustering (each node in its own cluster)
-let mut clustering = VectorGrouping::create_isolated(network.nodes());
-
-// Run Leiden algorithm (resolution parameter, randomness parameter, optional seed)
-let mut leiden = Leiden::new(1.0, 0.01, Some(42));
-leiden.iterate(&network, &mut clustering);
-
-// Access clustering results
-for node in 0..network.nodes() {
-    println!("Node {} belongs to cluster {}", node, clustering.get_group(node));
+use single_clustering::network::CSRNetwork;
+use single_clustering::community_search::leiden::{LeidenOptimizer, LeidenConfig};
+use single_clustering::community_search::leiden::partition::ModularityPartition;
+
+// Create a CSR network from your data
+let network = CSRNetwork::new(edges, weights, node_count);
+
+// Configure the Leiden algorithm
+let config = LeidenConfig {
+    max_iterations: 100,
+    tolerance: 1e-6,
+    seed: Some(42),
+    ..Default::default()
+};
+
+// Initialize the optimizer
+let mut optimizer = LeidenOptimizer::new(config);
+
+// Find communities using modularity optimization
+let partition: ModularityPartition<f64, _> = optimizer.find_partition(network)?;
+
+// Access results
+for node in 0..partition.node_count() {
+    println!("Node {} is in community {}", node, partition.membership(node));
 }
+println!("Modularity: {:.4}", partition.quality());
 ```

 ## Installation
@@ -41,16 +53,23 @@ Add this to your `Cargo.toml`:

 ```toml
 [dependencies]
-single-clustering = "0.1.0"
+single-clustering = "0.6.0"
 ```

-## Performance Considerations
+## Current Status
+
+- **Leiden Algorithm**: Core implementation with modularity and RB quality functions
+- **CSR Network Representation**: Efficient storage for large graphs
+- **Quality Functions**: Modularity and Reichardt-Bornholdt implementations
+- 🚧 **Louvain Algorithm**: Basic implementation (work-in-progress)
+- 🚧 **Documentation**: API documentation and examples (ongoing)
+- **Benchmarks**: Performance testing suite (planned)
+- **Python Bindings**: PyO3 integration (planned)
+
+## Contributing

-The library offers multiple implementations optimized for different scenarios:
-- `StandardLocalMoving`: Basic implementation of the moving algorithm
-- `FastLocalMoving`: Optimized version with better memory usage
-- Parallel implementations of various operations for large networks
+This project is in active development. Contributions, bug reports, and feature requests are welcome!

 ## License

-This crate is licensed under the MIT License.
+This crate is licensed under the BSD 3-Clause License.
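
The new README names a CSR (Compressed Sparse Row) representation and a Reichardt-Bornholdt quality function with tunable resolution, but the diff does not show either implementation. The sketch below illustrates the general idea only. `CsrGraph`, `from_edges`, `neighbors`, and `rb_quality` are hypothetical names invented for this example and are not the crate's `CSRNetwork` API. It assumes an undirected graph without self-loops and a `membership` slice with one entry per node.

```rust
/// Hypothetical CSR adjacency structure (NOT the crate's `CSRNetwork`).
#[derive(Debug)]
pub struct CsrGraph {
    /// `offsets[v]..offsets[v + 1]` is the slice of `targets`/`weights` for node `v`.
    pub offsets: Vec<usize>,
    pub targets: Vec<usize>,
    pub weights: Vec<f64>,
}

impl CsrGraph {
    /// Builds an undirected graph from `(source, target, weight)` triples.
    /// Assumes no self-loops; each edge is stored once per endpoint.
    pub fn from_edges(node_count: usize, edges: &[(usize, usize, f64)]) -> Self {
        // Pass 1: count the degree of every node, shifted by one slot.
        let mut offsets = vec![0usize; node_count + 1];
        for &(u, v, _) in edges {
            offsets[u + 1] += 1;
            offsets[v + 1] += 1;
        }
        // Prefix sum turns the degree counts into start offsets.
        for i in 0..node_count {
            offsets[i + 1] += offsets[i];
        }
        // Pass 2: write each edge into the next free slot of both endpoints.
        let mut cursor = offsets.clone();
        let mut targets = vec![0usize; offsets[node_count]];
        let mut weights = vec![0.0; offsets[node_count]];
        for &(u, v, w) in edges {
            targets[cursor[u]] = v;
            weights[cursor[u]] = w;
            cursor[u] += 1;
            targets[cursor[v]] = u;
            weights[cursor[v]] = w;
            cursor[v] += 1;
        }
        Self { offsets, targets, weights }
    }

    /// Iterates over `(neighbor, weight)` pairs of `node`.
    pub fn neighbors(&self, node: usize) -> impl Iterator<Item = (usize, f64)> + '_ {
        let range = self.offsets[node]..self.offsets[node + 1];
        self.targets[range.clone()]
            .iter()
            .copied()
            .zip(self.weights[range].iter().copied())
    }

    /// RB configuration-model quality:
    /// Q = sum_c [ w_in(c) / m - resolution * (K_c / 2m)^2 ],
    /// where m is the total edge weight, w_in(c) the weight inside community c,
    /// and K_c the summed node strength of c. `resolution = 1.0` gives modularity.
    pub fn rb_quality(&self, membership: &[usize], resolution: f64) -> f64 {
        // Every edge is stored twice, so the weight sum equals 2m.
        let two_m: f64 = self.weights.iter().sum();
        if two_m == 0.0 {
            return 0.0;
        }
        let community_count = membership.iter().copied().max().map_or(0, |c| c + 1);
        let mut strength = vec![0.0; community_count];
        let mut internal = 0.0; // equals 2 * (total weight inside communities)
        for node in 0..membership.len() {
            for (neighbor, w) in self.neighbors(node) {
                strength[membership[node]] += w;
                if membership[node] == membership[neighbor] {
                    internal += w;
                }
            }
        }
        let expected: f64 = strength.iter().map(|k| (k / two_m).powi(2)).sum();
        internal / two_m - resolution * expected
    }
}
```

Raising `resolution` above 1.0 penalizes large communities more strongly, so the optimum shifts toward more and smaller communities. Values below 1.0 have the opposite effect.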

src/community_search/leiden/mod.rs

Lines changed: 27 additions & 6 deletions
@@ -1,39 +1,60 @@
-use std::collections::{HashSet, VecDeque};
+//! Leiden algorithm implementation for community detection in networks.
+//!
+//! The Leiden algorithm is an improvement over the Louvain algorithm that guarantees
+//! well-connected communities. It uses a refinement phase to ensure high-quality
+//! partitions by preventing poorly connected communities. IMPORTANT: This code is currently work-in-progress and neither production ready nor optimized to the fullest!

-use rand::{Rng, SeedableRng, seq::SliceRandom};
-use rand_chacha::ChaCha8Rng;
-use single_utilities::traits::FloatOpsTS;
-
-use crate::network::{Network, grouping::NetworkGrouping};
 pub mod partition;
 mod optimizer;
 pub use optimizer::LeidenOptimizer;

+/// Strategy for selecting communities to consider during optimization.
 #[derive(Debug, Clone, Copy, PartialEq)]
 pub enum ConsiderComms {
+    /// Consider all non-empty communities.
     AllComms = 1,
+    /// Consider communities of neighboring nodes only.
     AllNeighComms = 2,
+    /// Randomly select one community.
     RandComm = 3,
+    /// Randomly select a neighbor's community.
     RandNeighComm = 4,
 }

+/// Optimization routine for node movement during community detection.
 #[derive(Debug, Clone, Copy, PartialEq)]
 pub enum OptimiseRoutine {
+    /// Move individual nodes to different communities.
     MoveNodes = 10,
+    /// Merge entire nodes/communities together.
     MergeNodes = 11,
 }

+/// Configuration parameters for the Leiden algorithm.
+///
+/// Controls the behavior of community detection including iteration limits,
+/// convergence criteria, randomization, and optimization strategies.
 #[derive(Debug, Clone)]
 pub struct LeidenConfig {
+    /// Maximum number of iterations before termination.
     pub max_iterations: usize,
+    /// Convergence tolerance for quality improvement.
     pub tolerance: f64,
+    /// Random seed for reproducible results. None for random seed.
     pub seed: Option<u64>,
+    /// Maximum allowed community size. None for unlimited.
     pub max_community_size: Option<usize>,
+    /// Whether to perform partition refinement for better connectivity.
     pub refine_partition: bool,
+    /// Whether to consider moving nodes to empty communities.
     pub consider_empty_community: bool,
+    /// Strategy for selecting communities during main optimization.
     pub consider_comms: ConsiderComms,
+    /// Strategy for selecting communities during refinement.
     pub refine_consider_comms: ConsiderComms,
+    /// Optimization routine for main phase.
     pub optimise_routine: OptimiseRoutine,
+    /// Optimization routine for refinement phase.
     pub refine_routine: OptimiseRoutine,
 }
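
To make the documented fields concrete, here is a minimal sketch of building a `LeidenConfig` with struct update syntax, as the README usage example does. The enum and struct definitions are local mirrors of the ones in the diff so that the snippet compiles on its own. The `Default` values and the `reproducible_config` helper are assumptions made for illustration, since the crate's actual `Default` implementation is not part of this diff.

```rust
// Local mirrors of the types shown in the diff, so the sketch compiles
// standalone. Real code would import them from the crate instead.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ConsiderComms {
    AllComms = 1,
    AllNeighComms = 2,
    RandComm = 3,
    RandNeighComm = 4,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum OptimiseRoutine {
    MoveNodes = 10,
    MergeNodes = 11,
}

#[derive(Debug, Clone)]
pub struct LeidenConfig {
    pub max_iterations: usize,
    pub tolerance: f64,
    pub seed: Option<u64>,
    pub max_community_size: Option<usize>,
    pub refine_partition: bool,
    pub consider_empty_community: bool,
    pub consider_comms: ConsiderComms,
    pub refine_consider_comms: ConsiderComms,
    pub optimise_routine: OptimiseRoutine,
    pub refine_routine: OptimiseRoutine,
}

// ASSUMPTION: these default values are illustrative only. The crate's own
// `Default` implementation is not shown in the diff and may differ.
impl Default for LeidenConfig {
    fn default() -> Self {
        Self {
            max_iterations: 10,
            tolerance: 1e-6,
            seed: None,
            max_community_size: None,
            refine_partition: true,
            consider_empty_community: true,
            consider_comms: ConsiderComms::AllNeighComms,
            refine_consider_comms: ConsiderComms::AllNeighComms,
            optimise_routine: OptimiseRoutine::MoveNodes,
            refine_routine: OptimiseRoutine::MergeNodes,
        }
    }
}

/// Hypothetical helper: fixes the seed for reproducible runs and optionally
/// caps the community size, leaving every other field at its default.
pub fn reproducible_config(seed: u64, max_community_size: Option<usize>) -> LeidenConfig {
    LeidenConfig {
        seed: Some(seed),
        max_community_size,
        ..Default::default()
    }
}
```

Because both enums carry explicit discriminants, a cast such as `ConsiderComms::AllNeighComms as i32` yields the integer written in the definition (here `2`).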
