Commit 7b37248

Copilot and DedeHai committed
Add fork statistics visualizer tool with graphs and detailed reports
Co-authored-by: DedeHai <6280424+DedeHai@users.noreply.github.com>
1 parent 325f408 commit 7b37248

4 files changed

Lines changed: 724 additions & 0 deletions

tools/README_fork_stats.md

Lines changed: 14 additions & 0 deletions
@@ -101,6 +101,20 @@ Detailed machine-readable output including:
- Full statistical breakdown
- Intermediate results are automatically saved to `tempresults.json` every 10 forks to prevent data loss on interruption

### Visualization

For advanced visualization and analysis of the JSON results, use the companion visualizer tool:

```bash
# Generate visualizations from collected data
python3 tools/fork_stats_visualizer.py results.json --save-plots

# Text-only statistics (no graphs)
python3 tools/fork_stats_visualizer.py results.json --no-graphs
```

See [README_fork_stats_visualizer.md](README_fork_stats_visualizer.md) for complete documentation.

## Performance Considerations

### Execution Speed
Lines changed: 240 additions & 0 deletions
@@ -0,0 +1,240 @@
# Fork Statistics Visualizer

A Python script that loads JSON data generated by `fork_stats.py` and displays detailed statistics both as formatted text lists and visual graphs.

## Features

- **Text-based Statistics**: Formatted tables with visual bars showing percentages
- **Interactive Graphs**: Pie charts, bar charts, histograms, and combined dashboards
- **Top Forks Lists**: Shows top forks by unique branches, owner commits, and activity
- **Export Capabilities**: Save all visualizations as high-quality PNG images
- **Works Without Graphics**: Can run in text-only mode without matplotlib

## Installation

### Basic Installation (Text-only)
The visualizer works without additional dependencies for text-based output:
```bash
python3 tools/fork_stats_visualizer.py results.json --no-graphs
```

### Full Installation (With Graphics)
For graphical visualizations, install matplotlib:
```bash
pip install -r tools/fork_stats_visualizer_requirements.txt
```

## Usage

### Basic Usage
Display text statistics and interactive graphs:
```bash
python3 tools/fork_stats_visualizer.py results.json
```

### Text-Only Mode
Skip graphs and only show text statistics:
```bash
python3 tools/fork_stats_visualizer.py results.json --no-graphs
```

### Save Plots to Files
Save all plots as PNG files instead of displaying them:
```bash
python3 tools/fork_stats_visualizer.py results.json --save-plots
```

### Custom Output Directory
Specify where to save the plots:
```bash
python3 tools/fork_stats_visualizer.py results.json --save-plots --output-dir ./my_plots
```

### Show Top N Forks
Display the top 30 forks instead of the default 20:
```bash
python3 tools/fork_stats_visualizer.py results.json --top-n 30
```

## Output

### Text Statistics

The script displays:

1. **Repository Information**
   - Repository name, total forks, stars, watchers
   - Number of forks analyzed
   - Analysis timestamp

2. **Fork Age Distribution**
   - Breakdown by age categories (≤1 month, ≤3 months, ≤6 months, ≤1 year, ≤2 years, >5 years)
   - Count and percentage for each category
   - Visual bars showing proportions

3. **Fork Activity Analysis**
   - Forks with unique branches
   - Forks with recent main branch
   - Forks that contributed PRs
   - Active forks without PR contributions
   - Visual bars for each metric

4. **Owner Commit Analysis**
   - Number of forks with owner commits
   - Total commits by all fork owners
   - Average commits per fork

5. **Top Forks Lists**
   - Top N forks by unique branches
   - Top N forks by owner commits
   - Active forks without PR contributions

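The owner-commit figures in item 4 come down to a count, a sum, and a mean. The following is a minimal sketch, not the script's actual code: it assumes each fork record is a dict with a hypothetical `owner_commits` key (the real JSON key names written by `fork_stats.py` are not shown in this document), and it assumes the average is taken over forks that have at least one owner commit.

```python
def summarize_owner_commits(forks):
    """Return (forks_with_commits, total_commits, average_per_fork).

    `forks` is a list of dicts. The `owner_commits` key is a hypothetical
    name chosen for illustration only.
    """
    counts = [fork.get("owner_commits", 0) for fork in forks]
    with_commits = [count for count in counts if count > 0]
    total = sum(with_commits)
    # Assumption: the mean is over forks that have owner commits,
    # not over every analyzed fork.
    average = total / len(with_commits) if with_commits else 0.0
    return len(with_commits), total, average
```

If the tool averages over all analyzed forks instead, only the divisor changes.
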
### Visual Graphs

When matplotlib is available, the script generates:

1. **Age Distribution Pie Chart**
   - Shows the distribution of fork ages
   - Color-coded by recency (green for recent, red for old)

2. **Activity Metrics Bar Chart**
   - Compares different activity metrics side-by-side
   - Shows percentages for each metric

3. **Owner Commits Distribution Histogram**
   - Shows the distribution of owner commits across forks
   - Includes mean and max statistics

4. **Combined Dashboard**
   - All-in-one view with multiple charts
   - Summary statistics panel
   - Perfect for presentations and reports

## Example Output

### Text Output
```
================================================================================
REPOSITORY INFORMATION
================================================================================

Repository: wled/WLED
Total Forks: 1,243
Stars: 15,500
Watchers: 326

Analyzed Forks: 100

================================================================================
FORK AGE DISTRIBUTION
================================================================================

Age Category                   Count Percentage
------------------------------------------------------------
Last updated ≤ 1 month             8       8.0% ████
Last updated ≤ 3 months           12      12.0% ██████
Last updated ≤ 6 months           15      15.0% ███████
Last updated ≤ 1 year             23      23.0% ███████████
Last updated ≤ 2 years            25      25.0% ████████████
Last updated > 5 years            17      17.0% ████████
```

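The bars in the sample above are consistent with one block character per two percentage points, rounded down (8.0% gives 4 blocks, 15.0% gives 7). The sketch below reproduces that scaling. It is inferred from the sample output, not taken from the script. The function names and column widths are illustrative assumptions.

```python
def percentage_bar(percentage, width=50, char="█"):
    """Render a bar whose length scales with `percentage` (0-100).

    With width=50, each block stands for two percentage points and
    partial blocks are dropped (floor).
    """
    clamped = max(0.0, min(100.0, percentage))
    return char * int(clamped * width / 100)


def format_row(label, count, total):
    """Format one table row; the column widths are arbitrary choices."""
    pct = 100.0 * count / total if total else 0.0
    return f"{label:<30} {count:>5} {pct:>9.1f}% {percentage_bar(pct)}"
```

For example, `format_row("Last updated ≤ 1 month", 8, 100)` ends with `8.0% ████`.
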
### Saved Files
When using `--save-plots`, the following files are created:
- `age_distribution.png` - Pie chart of fork ages
- `activity_metrics.png` - Bar chart of activity metrics
- `owner_commits_distribution.png` - Histogram of owner commits
- `dashboard.png` - Combined dashboard view

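In an automated run it can help to confirm that all four files were actually written. The helper below is a hypothetical addition, not part of the visualizer. It only checks for the file names listed above, and it assumes the default output directory `./fork_plots`.

```python
from pathlib import Path

# File names as listed in the "Saved Files" section.
EXPECTED_PLOTS = (
    "age_distribution.png",
    "activity_metrics.png",
    "owner_commits_distribution.png",
    "dashboard.png",
)


def missing_plots(output_dir="./fork_plots"):
    """Return the expected plot file names that are not present in `output_dir`."""
    directory = Path(output_dir)
    return [name for name in EXPECTED_PLOTS if not (directory / name).is_file()]
```

An empty list means every expected plot exists.
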
## Integration with fork_stats.py

### Complete Workflow

1. **Collect Statistics**
   ```bash
   python3 tools/fork_stats.py --output results.json --max-forks 100
   ```

2. **Visualize Results**
   ```bash
   python3 tools/fork_stats_visualizer.py results.json --save-plots
   ```

3. **View Dashboard**
   Open `fork_plots/dashboard.png` for a complete overview

### Automated Analysis

Create a bash script to run both tools:
```bash
#!/bin/bash
# analyze_forks.sh

# Set your GitHub token
export GITHUB_TOKEN="your_token_here"

# Collect statistics
python3 tools/fork_stats.py \
    --repo wled/WLED \
    --max-forks 200 \
    --fast \
    --output fork_analysis.json

# Generate visualizations
python3 tools/fork_stats_visualizer.py \
    fork_analysis.json \
    --save-plots \
    --output-dir ./fork_analysis_plots \
    --top-n 25

echo "Analysis complete! Check fork_analysis_plots/ for visualizations."
```

## Troubleshooting

### "matplotlib not installed" Warning
This is normal if you haven't installed matplotlib. The script will still work in text-only mode. To enable graphs:
```bash
pip install -r tools/fork_stats_visualizer_requirements.txt
```

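Falling back to text-only mode when matplotlib is missing is a common optional-dependency pattern. The sketch below shows one way such a guard could look. The script's actual import logic is not shown in this README, and the function names here are illustrative assumptions.

```python
import importlib.util


def matplotlib_available():
    """Return True if matplotlib can be found in this environment."""
    return importlib.util.find_spec("matplotlib") is not None


def choose_mode(no_graphs=False):
    """Return 'text' when graphs are disabled or matplotlib is missing, else 'graphs'."""
    if no_graphs or not matplotlib_available():
        return "text"
    return "graphs"
```

Checking with `find_spec` avoids importing matplotlib until a plot is actually needed.
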
### "No detailed fork data available"
The JSON file doesn't contain individual fork details. This happens when:
- Using demo mode in `fork_stats.py`
- The analysis was interrupted before collecting fork data
- The JSON file is incomplete

The visualizer will still show aggregate statistics from the statistics section.

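To see in advance which case a results file falls into, the loaded JSON can be inspected before running the visualizer. This is a rough sketch under stated assumptions: the key names `forks` and `statistics` are hypothetical stand-ins for the per-fork list and the aggregate section, since the actual schema is not documented here.

```python
def describe_results(results):
    """Classify loaded JSON results as 'detailed', 'aggregate-only', or 'empty'.

    `results` is the dict obtained from json.load(). The keys "forks" and
    "statistics" are assumed names, not confirmed by the tool's documentation.
    """
    forks = results.get("forks")
    if isinstance(forks, list) and forks:
        return "detailed"
    if results.get("statistics"):
        return "aggregate-only"
    return "empty"
```

An `aggregate-only` result corresponds to the situation described above, where only summary statistics can be shown.
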
### Plots Not Saving
Ensure the output directory is writable:
```bash
mkdir -p ./fork_plots
chmod 755 ./fork_plots
python3 tools/fork_stats_visualizer.py results.json --save-plots --output-dir ./fork_plots
```

## Command-Line Options

| Option | Description | Default |
|--------|-------------|---------|
| `json_file` | Path to JSON file with fork statistics | Required |
| `--save-plots` | Save plots to files instead of displaying | False |
| `--output-dir DIR` | Directory to save plots | `./fork_plots` |
| `--top-n N` | Number of top forks to display | 20 |
| `--no-graphs` | Skip graph generation, text only | False |

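The table maps directly onto an `argparse` definition. The following is a sketch of a parser with the same options and defaults. It is a reconstruction from the table, not a copy of the script's parser, and the help strings and the `build_parser` name are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the options table above."""
    parser = argparse.ArgumentParser(
        description="Visualize fork statistics from a JSON file"
    )
    parser.add_argument("json_file", help="Path to JSON file with fork statistics")
    parser.add_argument("--save-plots", action="store_true",
                        help="Save plots to files instead of displaying")
    parser.add_argument("--output-dir", metavar="DIR", default="./fork_plots",
                        help="Directory to save plots")
    parser.add_argument("--top-n", metavar="N", type=int, default=20,
                        help="Number of top forks to display")
    parser.add_argument("--no-graphs", action="store_true",
                        help="Skip graph generation, text only")
    return parser
```

Calling `build_parser().parse_args(["results.json"])` yields the defaults from the table: `save_plots=False`, `output_dir="./fork_plots"`, `top_n=20`, `no_graphs=False`.
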
## Dependencies

### Required
- Python 3.7+
- json (built-in)
- argparse (built-in)

### Optional (for graphs)
- matplotlib >= 3.5.0

Install optional dependencies:
```bash
pip install -r tools/fork_stats_visualizer_requirements.txt
```
