Summary
The `shortestPath()` function in AGE appears to use the same VLE (variable-length edge) expansion internally, inheriting the O(n^k) cycle-free expansion issue from #2349. A shortest path query should use BFS (breadth-first search) which terminates as soon as the target is found, but the current implementation expands all paths up to the maximum depth before selecting the shortest.
Reproduction
SELECT * FROM cypher('benchmark', $$
MATCH p = shortestPath((a:Node {bench_id: 1})-[*..10]->(b:Node {bench_id: 50}))
RETURN length(p)
$$) AS (len agtype);
Performance Data
Benchmark W07 (shortest path):
| Approach |
10K scale |
100K scale |
| Cypher shortestPath() |
~23ms |
~23ms |
| BFS CTE with UNION + early termination |
0.09ms |
0.33ms |
The SDK workaround implements BFS as a recursive CTE:
WITH RECURSIVE bfs AS (
SELECT e.end_id AS node_id, 1 AS depth
FROM benchmark."EDGE" e
WHERE e.start_id = (SELECT id FROM benchmark."Node" WHERE _bench_id = 1)
UNION
SELECT e.end_id, b.depth + 1
FROM bfs b
JOIN benchmark."EDGE" e ON e.start_id = b.node_id
WHERE b.depth < 10
AND NOT EXISTS (SELECT 1 FROM bfs WHERE node_id = (SELECT id FROM benchmark."Node" WHERE _bench_id = 50))
)
SELECT depth FROM bfs
WHERE node_id = (SELECT id FROM benchmark."Node" WHERE _bench_id = 50)
LIMIT 1;
Suggested Fix
Implement `shortestPath()` using a proper BFS algorithm in the executor rather than delegating to VLE. BFS guarantees finding the shortest path and terminates at the first match, providing O(V+E) complexity instead of O(n^k).
This is related to #2349 (VLE cycle detection) — fixing VLE would also improve shortestPath, but a dedicated BFS implementation would be even more efficient for the shortest-path use case.
Environment
- PostgreSQL 18.2
- Apache AGE 1.7.0 (built from source, branch `release/PG18/1.7.0`)
- 32-core, 64GB RAM, Linux 6.17
Summary
The `shortestPath()` function in AGE appears to use the same VLE (variable-length edge) expansion internally, inheriting the O(n^k) cycle-free expansion issue from #2349. A shortest path query should use BFS (breadth-first search) which terminates as soon as the target is found, but the current implementation expands all paths up to the maximum depth before selecting the shortest.
Reproduction
Performance Data
Benchmark W07 (shortest path):
The SDK workaround implements BFS as a recursive CTE:
Suggested Fix
Implement `shortestPath()` using a proper BFS algorithm in the executor rather than delegating to VLE. BFS guarantees finding the shortest path and terminates at the first match, providing O(V+E) complexity instead of O(n^k).
This is related to #2349 (VLE cycle detection) — fixing VLE would also improve shortestPath, but a dedicated BFS implementation would be even more efficient for the shortest-path use case.
Environment