This file provides guidance to AI coding agents (Claude Code, Codex, Gemini CLI, and similar) when working with code in this repository. The content is intentionally agent-agnostic; only the filename follows the CLAUDE.md convention.
sgraph is a Python library for working with hierarchic graph structures, typically used for representing software architectures. It provides data formats, structures, and algorithms for analyzing and manipulating software dependency graphs.
The library is maintained by Softagram and is used for building information models about analyzed software. See also sgraph-mcp-server for AI agent integration.
The canonical repository is softagram/sgraph. Contributors are expected to work from personal forks using the standard fork-and-PR model.
A typical working clone has two remotes:
| Remote | Purpose | Example URL |
|---|---|---|
| `origin` | Your personal fork (push target for feature branches) | `git@github.com:<you>/sgraph.git` |
| `upstream` | The canonical Softagram repository (source of truth for `main`) | `git@github.com:softagram/sgraph.git` |
Important for coding agents: when checking whether a branch's commits are already merged, always compare against upstream/main, not just origin/main. A contributor's fork may lag behind upstream, so commits that appear "unmerged" relative to origin/main can in fact already be live in the canonical repo.
```bash
# Always fetch both before reasoning about merge state
git fetch origin main
git fetch upstream main
# Are this branch's commits in the canonical main?
git rev-list --left-right --count upstream/main...HEAD
# → "N 0" means all HEAD commits are already in upstream/main (safe to switch/delete branch)
```

New contributors: fork `softagram/sgraph` on GitHub, clone your fork as `origin`, and add `softagram/sgraph` as `upstream`. Open PRs from feature branches on your fork against `softagram/sgraph:main`.
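The merge-state check above can also be scripted. The following is a minimal sketch, not part of this repository: the function names are hypothetical, and it assumes only the two-number output of `git rev-list --left-right --count`. The parsing is kept separate from the `git` call so it can be exercised without a clone.

```python
import subprocess


def parse_left_right_count(output: str) -> tuple[int, int]:
    """Parse the two counts printed by `git rev-list --left-right --count A...B`."""
    left, right = output.split()
    return int(left), int(right)


def is_fully_merged(output: str) -> bool:
    """True when the right-hand side (the branch) has no commits missing from the left."""
    _, ahead = parse_left_right_count(output)
    return ahead == 0


def branch_merged_into(base: str = "upstream/main", branch: str = "HEAD") -> bool:
    """Run the git command in the current directory and interpret its output.

    Requires a working clone in which `base` has already been fetched.
    """
    result = subprocess.run(
        ["git", "rev-list", "--left-right", "--count", f"{base}...{branch}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return is_fully_merged(result.stdout)
```

Note that `git rev-list` compares commit ancestry, so a branch that was squash-merged upstream will still report commits on the right-hand side.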
```bash
# Install dependencies
pip install -r requirements.txt
# Install dev dependencies (includes yapf, flake8)
pip install -r requirements-dev.txt
# Install package in development mode
pip install -e .
```

```bash
# Run all tests
pytest
# Run specific test file
pytest tests/sgraph_test.py
# Run specific test
pytest tests/sgraph_test.py::test_deepcopy
```

```bash
# Run flake8 (max line length: 100)
flake8 src/
# Format code with yapf (config in .style.yapf)
yapf -r -i src/
```

```bash
# Build distribution packages
python3 setup.py sdist bdist_wheel
```

The project uses an automated release script to streamline the release process:
```bash
# Auto-bump patch version (1.1.1 -> 1.1.2)
python scripts/release.py --bump patch
# Auto-bump minor version (1.1.1 -> 1.2.0)
python scripts/release.py --bump minor
# Auto-bump major version (1.1.1 -> 2.0.0)
python scripts/release.py --bump major
# Release specific version
python scripts/release.py --version 1.2.0
# Dry run to preview changes
python scripts/release.py --bump patch --dry-run
```

The script automates the full release workflow: version bumping, branch creation, PR creation (requires the `gh` CLI), tagging, building, PyPI upload (requires `twine`), and GitHub release creation with auto-generated release notes from merged PRs. See `scripts/README.md` for setup instructions.
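The three `--bump` modes follow the version transitions shown in the comments above (1.1.1 becomes 1.1.2, 1.2.0, or 2.0.0). As an illustration only, the rule can be sketched as below; `bump_version` is a hypothetical helper and is not the code in `scripts/release.py`, whose implementation may differ.

```python
def bump_version(version: str, part: str) -> str:
    """Return `version` with the given part incremented and lower parts reset to 0.

    Assumes a plain MAJOR.MINOR.PATCH string with no pre-release suffix.
    """
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")
```

When in doubt about what a real release will do, prefer the script's own `--dry-run` flag over reasoning from this sketch.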
The sgraph data model consists of three primary classes:
- **SGraph** (`sgraph.py`): The top-level model container
  - Contains a root `SElement`
  - Manages model-level attributes via `modelAttrs` and `metaAttrs`
  - Provides serialization to XML, deps, and other formats
  - Entry point for parsing models from files
- **SElement** (`selement.py`): Hierarchical graph nodes
  - Represents elements in a tree structure (like directories/files)
  - Each element has a name, parent, children (stored in both list and dict)
  - Contains `incoming` and `outgoing` lists of associations (edges)
  - Stores attributes as key-value pairs in `attrs` dict
  - Uses `__slots__` for memory efficiency
- **SElementAssociation** (`selementassociation.py`): Edges between elements
  - Represents directed relationships between elements
  - Has `fromElement`, `toElement`, and `deptype` (dependency type)
  - Stores additional attributes in `attrs` dict
  - Must call `initElems()` to register with connected elements
  - Use `create_unique_element_association()` to avoid duplicates
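To make the shape of this data model concrete, here is a toy sketch in plain Python. It is deliberately not sgraph's code: `Node`, `Edge`, `register`, and `create_unique_edge` are hypothetical stand-ins that only mirror the ideas listed above (slots, children kept in both a list and a dict, edges that must be registered on both endpoints, and duplicate avoidance).

```python
class Node:
    """Toy stand-in for a hierarchical element (not sgraph's SElement)."""

    __slots__ = ("name", "parent", "children", "child_by_name",
                 "incoming", "outgoing", "attrs")

    def __init__(self, parent, name):
        self.name = name
        self.parent = parent
        self.children = []       # ordered list of children
        self.child_by_name = {}  # the same children, indexed by name
        self.incoming = []
        self.outgoing = []
        self.attrs = {}
        if parent is not None:
            parent.children.append(self)
            parent.child_by_name[name] = self


class Edge:
    """Toy stand-in for a directed association (not sgraph's SElementAssociation)."""

    __slots__ = ("from_node", "to_node", "deptype", "attrs")

    def __init__(self, from_node, to_node, deptype):
        self.from_node = from_node
        self.to_node = to_node
        self.deptype = deptype
        self.attrs = {}

    def register(self):
        """Attach the edge to both endpoints; before this, the nodes do not see it."""
        self.from_node.outgoing.append(self)
        self.to_node.incoming.append(self)


def create_unique_edge(from_node, to_node, deptype):
    """Reuse an edge with the same endpoints and type, or register a new one."""
    for edge in from_node.outgoing:
        if edge.to_node is to_node and edge.deptype == deptype:
            return edge
    edge = Edge(from_node, to_node, deptype)
    edge.register()
    return edge


root = Node(None, "")
a = Node(root, "a.c")
b = Node(root, "b.h")

unregistered = Edge(a, b, "include")  # never registered, so invisible to a and b
first = create_unique_edge(a, b, "include")
second = create_unique_edge(a, b, "include")  # returns the edge created above
```

The unregistered edge illustrates why the registration step matters: constructing an edge object alone leaves the `incoming` and `outgoing` lists of its endpoints unchanged.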

Two API classes build on this data model:

- **ModelApi** (`modelapi.py`): Query and traverse models
  - `getElementByPath()`: Find elements by path
  - `getElementsByName()`: Find all elements by name
  - `getCalledFunctions()`: Get function call relationships
  - Various filtering and traversal utilities
- **MetricsApi** (`metricsapi.py`): Extract metrics from models
  - `get_total_loc_metrics()`: Lines of code metrics
  - `get_total_tech_debt_metrics()`: Tech debt analysis

Supporting modules and subpackages:

- `algorithms/`: Graph algorithms (metrics, filtering, generalization, analysis)
- `converters/`: Format converters (XML, deps, GraphML, JSON, PlantUML, DOT, CytoscapeJS, SBOM)
- `compare/`: Model comparison and diff functionality (rename detection, similarity analysis)
- `loader/`: Model loading utilities
- `cli/`: Command-line interface utilities
- `attributes/`: Attribute query and management
- `cypher.py`: Cypher query language support via sPyCy (optional dependency: `spycy-aneeshdurg`)

The library supports multiple serialization formats:

- **XML Format**: High-performance format for large models (10M+ elements)
  - Uses integer IDs for element references
  - Minimalist syntax: `<e n="name">` for elements, `<r r="id" t="type">` for relationships
  - Model version 2.1
- **Deps Format**: Line-based text format for simple scripting
  - Format: `/path/to/from:/path/to/to:dependency_type`
  - Easy to read and script, but not recommended for very large models
- **Other formats**: GraphML, JSON, PlantUML, DOT, CytoscapeJS
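
Because the deps format is line-based, a single dependency line is easy to handle in a script. The sketch below is an assumption-laden illustration, not sgraph's converter: `DepLine`, `parse_dep_line`, and `format_dep_line` are hypothetical names, only the `from:to:type` line shape described above is covered, and paths are assumed to contain no `:` characters.

```python
from typing import NamedTuple


class DepLine(NamedTuple):
    """One dependency line: source path, target path, dependency type."""
    source: str
    target: str
    deptype: str


def parse_dep_line(line: str) -> DepLine:
    """Split one `from:to:type` line; assumes paths contain no ':' characters."""
    parts = line.strip().split(":")
    if len(parts) != 3:
        raise ValueError(f"expected 'from:to:type', got {line!r}")
    return DepLine(*parts)


def format_dep_line(dep: DepLine) -> str:
    """Render a DepLine back into the `from:to:type` text form."""
    return f"{dep.source}:{dep.target}:{dep.deptype}"


dep = parse_dep_line("/nginx/src/core/nginx.c:/nginx/src/core/nginx.h:include\n")
```

For real models, prefer the library's own serialization (`to_deps()` and the parsing entry points on `SGraph`) over hand-rolled parsing.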

Patterns to keep in mind when working with models:

- Path-based element access: Elements are identified by hierarchical paths (e.g., `/nginx/src/core/nginx.c`)
- Lazy element creation: `createOrGetElementFromPath()` creates elements on demand
- Element merging: Duplicate elements under the same parent are merged
- Association initialization: Associations must call `initElems()` to register with their connected elements
- Type-based filtering: Elements and associations support type attributes for categorization

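
The path-based access and lazy creation patterns can be pictured with nested dictionaries. This is a simplified sketch under stated assumptions: `create_or_get` is a hypothetical helper that imitates the idea behind `createOrGetElementFromPath()` on plain dicts, and it says nothing about how sgraph implements it.

```python
def create_or_get(tree: dict, path: str) -> dict:
    """Walk `path` from the root, creating missing nodes; return the final node."""
    node = tree
    for name in path.strip("/").split("/"):
        if not name:
            continue  # tolerate empty segments such as "//" or an empty path
        node = node.setdefault(name, {})
    return node


root: dict = {}
first = create_or_get(root, "/nginx/src/core/nginx.c")
again = create_or_get(root, "/nginx/src/core/nginx.c")  # same node, nothing new created
sibling = create_or_get(root, "/nginx/src/core/nginx.h")  # reuses /nginx/src/core
```

Requesting the same path twice yields the same node, which is the dictionary analogue of duplicate elements under one parent being merged.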
```python
from sgraph import SGraph, SElement, SElementAssociation

# Create empty model
model = SGraph(SElement(None, ''))

# Create elements from paths
e1 = model.createOrGetElementFromPath('/path/to/file.x')
e2 = model.createOrGetElementFromPath('/path/to/file.y')

# Create association
ea = SElementAssociation(e1, e2, 'use')
ea.initElems()  # Must call to register association

# Serialize
model.to_xml('output.xml')
model.to_deps('output.txt')
```

```python
from sgraph import SGraph, ModelApi

# Load model
model = SGraph.parse_xml_or_zipped_xml('model.xml')

# Query with ModelApi
api = ModelApi(model=model)
element = api.getElementByPath('/some/path')
elements = api.getElementsByName('nginx.c')
```

```python
from sgraph import SGraph
from sgraph.cypher import cypher_query

# Load model, then run a Cypher query against it
model = SGraph.parse_xml_or_zipped_xml('model.xml')
results = cypher_query(model, '''
MATCH (a:file)-[:imports]->(b)
RETURN a.name, b.name
''')
```

CLI: `python -m sgraph.cypher model.xml.zip [query]` supports an interactive REPL and 11 output formats (table, csv, tsv, json, jsonl, xml, deps, dot, plantuml, graphml, cytoscape). See `docs/cypher.md` for full documentation.
```python
from sgraph.compare.modelcompare import ModelCompare
from sgraph.compare.compareutils import SLIDING_WINDOW_ATTRS

mc = ModelCompare()

# Basic comparison from files
compare_model = mc.compare('model_a.xml', 'model_b.xml')

# Exclude noisy time-windowed metrics (author counts, commit counts, etc.)
compare_model = mc.compare('model_a.xml', 'model_b.xml', exclude_attrs=SLIDING_WINDOW_ATTRS)

# Compare in-memory models (model1 and model2 are SGraph instances loaded elsewhere)
compare_model = mc.compareModels(model1, model2, exclude_attrs={'commit_count_30', 'author_list_7'})

# Inspect results
mc.printCompareInfos(compare_model)
```

The `exclude_attrs` parameter accepts a set of attribute names to ignore during comparison. Use `SLIDING_WINDOW_ATTRS` as a preset to suppress time-windowed metric noise (author/commit/bug counts at various time windows).
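
The effect of excluding attributes can be shown on two plain attribute dictionaries. The following is only a conceptual sketch: `changed_attrs` is a hypothetical function, not `ModelCompare`'s algorithm, and the attribute values are made up for illustration (the names `commit_count_30` and `author_list_7` are taken from the example above).

```python
from typing import Iterable


def changed_attrs(old: dict, new: dict, exclude_attrs: Iterable[str] = ()) -> dict:
    """Map each differing attribute name to its (old, new) pair, skipping excluded names."""
    names = (old.keys() | new.keys()) - set(exclude_attrs)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(names)
        if old.get(name) != new.get(name)
    }


before = {"loc": 120, "commit_count_30": 4}
after = {"loc": 120, "commit_count_30": 9, "author_list_7": "a;b"}

noisy = changed_attrs(before, after)
quiet = changed_attrs(before, after, exclude_attrs={"commit_count_30", "author_list_7"})
```

Here `noisy` reports both time-windowed attributes as changed, while `quiet` is empty because the only differences were in excluded attributes.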

Key locations in the repository:

- Source code: `src/sgraph/`
- Tests: `tests/`
- Automation scripts: `scripts/` (includes `release.py`)
- Package metadata: `setup.cfg`, `setup.py`
- Documentation: `README.md`, `releasing.md`, `CLAUDE.md`
- Graph conventions: `docs/graph-conventions.md`
- Cypher query docs: `docs/cypher.md`