Skip to content

shortestPath() uses VLE internally — inherits O(n^k) expansion, no BFS optimization #2350

@gregfelice

Description

@gregfelice

Summary

The `shortestPath()` function in AGE appears to use the same VLE (variable-length edge) expansion internally, inheriting the O(n^k) cycle-free expansion issue from #2349. A shortest path query should use BFS (breadth-first search) which terminates as soon as the target is found, but the current implementation expands all paths up to the maximum depth before selecting the shortest.

Reproduction

SELECT * FROM cypher('benchmark', $$
  MATCH p = shortestPath((a:Node {bench_id: 1})-[*..10]->(b:Node {bench_id: 50}))
  RETURN length(p)
$$) AS (len agtype);

Performance Data

Benchmark W07 (shortest path):

Approach 10K scale 100K scale
Cypher shortestPath() ~23ms ~23ms
BFS CTE with UNION + early termination 0.09ms 0.33ms

The SDK workaround implements BFS as a recursive CTE:

WITH RECURSIVE bfs AS (
  SELECT e.end_id AS node_id, 1 AS depth
  FROM benchmark."EDGE" e
  WHERE e.start_id = (SELECT id FROM benchmark."Node" WHERE _bench_id = 1)
  UNION
  SELECT e.end_id, b.depth + 1
  FROM bfs b
  JOIN benchmark."EDGE" e ON e.start_id = b.node_id
  WHERE b.depth < 10
    AND NOT EXISTS (SELECT 1 FROM bfs WHERE node_id = (SELECT id FROM benchmark."Node" WHERE _bench_id = 50))
)
SELECT depth FROM bfs
WHERE node_id = (SELECT id FROM benchmark."Node" WHERE _bench_id = 50)
LIMIT 1;

Suggested Fix

Implement `shortestPath()` using a proper BFS algorithm in the executor rather than delegating to VLE. BFS guarantees finding the shortest path and terminates at the first match, providing O(V+E) complexity instead of O(n^k).

This is related to #2349 (VLE cycle detection) — fixing VLE would also improve shortestPath, but a dedicated BFS implementation would be even more efficient for the shortest-path use case.

Environment

  • PostgreSQL 18.2
  • Apache AGE 1.7.0 (built from source, branch `release/PG18/1.7.0`)
  • 32-core, 64GB RAM, Linux 6.17

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions