Better Schema Registry by rbs333 · Pull Request #573 · redis/redis-vl-python

rbs333 · 2026-04-03T13:42:27Z

This PR updates SQLQuery execution to use a factory-based executor path and documents the new sql_redis_options behavior.

We chose the factory pattern so that sql-redis retains ownership of executor and schema-registry construction, rather than re-implementing that logic inside RedisVL. That keeps the boundary clean: RedisVL can pass configuration through, while sql-redis remains the source of truth for how schema registries are created, initialized, and evolved. This avoids duplicating registry logic in RedisVL and makes it easier for other applications to integrate with sql-redis without each client having to recreate that behavior themselves.

NOTE: This PR depends on a new release of sql-redis.

Note

Medium Risk
Changes SQL execution internals to use sql-redis factory executors with per-option caching and lifecycle invalidation, which could affect query behavior and resource usage. Also bumps sql-redis to >=0.4.0, so compatibility relies on the new upstream release.

Overview
Updates SQLQuery execution to use sql-redis’s create_executor/create_async_executor factory APIs and introduces per-index caching of executors keyed by sql_redis_options (defaulting to schema_cache_strategy="lazy"). Cached executors are now invalidated on index lifecycle events (connect/set_client, create/delete/clear, disconnect) to avoid stale schema state.

Extends SQLQuery to accept sql_redis_options passthrough and documents new TEXT operator semantics for sql-redis >= 0.4.0 (LIKE, fuzzy(), fulltext()), updating docs/notebook examples and integration tests accordingly. Dependency constraints are updated to require sql-redis>=0.4.0 (and lockfile refreshed).
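
The caching and invalidation behavior described above can be illustrated with a minimal sketch. This is not the RedisVL implementation: `ExecutorCache` and `fake_factory` are hypothetical names, and the factory only stands in for a call shaped like `create_executor(redis_client, **options)`.

```python
import json
from typing import Any, Callable, Dict


class ExecutorCache:
    """Per-index cache of executors keyed by their options (illustrative only)."""

    def __init__(self, factory: Callable[..., Any]) -> None:
        # `factory` stands in for a call like create_executor(client, **options).
        self._factory = factory
        self._executors: Dict[str, Any] = {}

    def get(self, client: Any, options: Dict[str, Any]) -> Any:
        # sort_keys makes the key independent of dict insertion order.
        key = json.dumps(options, sort_keys=True)
        if key not in self._executors:
            self._executors[key] = self._factory(client, **options)
        return self._executors[key]

    def invalidate(self) -> None:
        # Intended to run on lifecycle events such as connect, create,
        # delete, clear, and disconnect, so stale schema state is dropped.
        self._executors.clear()


def fake_factory(client: Any, **options: Any) -> Dict[str, Any]:
    # Hypothetical stand-in that records what it was built with.
    return {"client": client, "options": options}


cache = ExecutorCache(fake_factory)
lazy = {"schema_cache_strategy": "lazy"}

first = cache.get("client", lazy)
second = cache.get("client", dict(lazy))  # same options, so the executor is reused
eager = cache.get("client", {"schema_cache_strategy": "eager"})

cache.invalidate()
rebuilt = cache.get("client", lazy)  # a new executor is built after invalidation
```

Keying on the options means that queries with different `sql_redis_options` never share an executor, while repeated queries with the same options avoid rebuilding one.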

Reviewed by Cursor Bugbot for commit 439d6de. Bugbot is set up for automated code reviews on this repo.

jit-ci · 2026-04-03T13:45:26Z

🛡️ Jit Security Scan Results

✅ No security findings were detected in this PR

Security scan by Jit

Copilot

Pull request overview

This PR updates RedisVL’s SQLQuery execution path to rely on sql-redis factory functions for executor/schema-registry construction, adds per-query sql_redis_options passthrough (notably schema cache strategy), and introduces executor caching + cache invalidation hooks across index lifecycle operations.

Changes:

Add sql_redis_options to SQLQuery and forward options into sql-redis executor creation (defaulting to "lazy" schema caching).
Cache sql-redis executors inside SearchIndex / AsyncSearchIndex, keyed by normalized options, and invalidate the cache on connect/disconnect/create/delete/clear.
Add integration tests and documentation updates describing schema cache strategy behavior and the new option passthrough.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| `redisvl/query/sql.py` | Adds `sql_redis_options` to `SQLQuery` and switches `redis_query_string()` to a factory-created executor path. |
| `redisvl/index/index.py` | Adds executor caching for SQL execution, plus cache invalidation on lifecycle operations for sync/async indexes. |
| `tests/integration/test_sql_redis_hash.py` | Adds integration coverage for default vs eager schema caching and cache invalidation (sync). |
| `tests/integration/test_sql_redis_json.py` | Adds integration coverage for default vs eager schema caching and cache invalidation (async). |
| `docs/user_guide/12_sql_to_redis_queries.ipynb` | Documents `sql_redis_options` usage in the SQL user guide notebook. |
| `docs/concepts/queries.md` | Documents `sql_redis_options` and `schema_cache_strategy` semantics in the concepts guide. |
| `docs/api/query.rst` | Adds API docs note describing `sql_redis_options` and common cache strategies. |

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Reviewed by Cursor Bugbot for commit 0f4a5c6.

Copilot

Pull request overview

Copilot reviewed 8 out of 9 changed files in this pull request and generated 4 comments.

Copilot · 2026-04-06T17:29:06Z

redisvl/query/sql.py

+        sql_redis_options = {
+            "schema_cache_strategy": "lazy",
+            **self.sql_redis_options,
+        }
+        executor = create_executor(redis_client, **sql_redis_options)

        # Substitute non-bytes params in SQL before translation
        sql = self._substitute_params(self.sql, self.params)

-        translated = translator.translate(sql)
+        translated = executor._translator.translate(sql)
        return translated.to_command_string()


SQLQuery.redis_query_string() reaches into executor._translator (a private sql-redis attribute) to perform translation. This couples RedisVL to sql-redis internals and is likely to break on upstream refactors; prefer calling a public translate()/to_command_string() API on the executor (or add one upstream) instead of accessing a private field.
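
Separately from the private-attribute concern, the option merge in the diff above relies on dict unpacking order: keys that appear later win, so caller-supplied options override the `"lazy"` default. A minimal sketch of that behavior follows; `DEFAULT_SQL_REDIS_OPTIONS` and `resolve_sql_redis_options` are hypothetical names used only for illustration.

```python
from typing import Any, Dict

# Hypothetical constant mirroring the default shown in the diff.
DEFAULT_SQL_REDIS_OPTIONS: Dict[str, Any] = {"schema_cache_strategy": "lazy"}


def resolve_sql_redis_options(user_options: Dict[str, Any]) -> Dict[str, Any]:
    """Return the defaults overlaid with caller-supplied options."""
    # Later entries win in a dict display, so user options take precedence.
    return {**DEFAULT_SQL_REDIS_OPTIONS, **user_options}


default_case = resolve_sql_redis_options({})
override_case = resolve_sql_redis_options({"schema_cache_strategy": "eager"})

print(default_case)
print(override_case)
```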

Copilot · 2026-04-06T17:29:06Z

redisvl/query/sql.py

+            Use ``=`` for exact phrase matching, ``LIKE`` for wildcard
+            matching, ``fuzzy()`` for typo-tolerant matching, and
+            ``fulltext()`` for tokenized search.


The docstring is slightly inconsistent about the = operator: above it says = does “exact phrase or exact-term matching”, but the later Note says to use = for “exact phrase matching” only. Consider updating the Note to include exact-term matching as well so users aren’t misled.

Suggested change

-            Use ``=`` for exact phrase matching, ``LIKE`` for wildcard
-            matching, ``fuzzy()`` for typo-tolerant matching, and
-            ``fulltext()`` for tokenized search.
+            Use ``=`` for exact phrase or exact-term matching, ``LIKE``
+            for wildcard matching, ``fuzzy()`` for typo-tolerant
+            matching, and ``fulltext()`` for tokenized search.

Copilot · 2026-04-06T17:29:07Z

redisvl/index/index.py

+def _sql_executor_cache_key(sql_redis_options: Dict[str, Any]) -> str:
+    """Build a stable cache key for sql-redis executor reuse."""
+    return json.dumps(sql_redis_options, sort_keys=True, default=repr)


_sql_executor_cache_key() uses json.dumps(..., default=repr) to serialize options. repr() can include non-deterministic data (e.g., memory addresses) and can change across runs, which can cause unbounded cache growth or poor cache hit rates if non-JSON-serializable values are ever passed. Consider validating that sql_redis_options only contains JSON-serializable primitives (or explicitly normalizing known option types) and raising a clear error otherwise.
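
The concern can be reproduced with a small sketch. `CustomOption` is a hypothetical class with no custom `__repr__`, so Python falls back to the default representation, which in CPython embeds the object's memory address. Two logically identical option sets then serialize to different keys.

```python
import json


class CustomOption:
    """Hypothetical option value that does not define __repr__."""

    def __init__(self, name: str) -> None:
        self.name = name


def key_with_repr(options: dict) -> str:
    # Mirrors the original approach: fall back to repr() for unknown types.
    return json.dumps(options, sort_keys=True, default=repr)


# Two separate instances carrying the same logical value.
a = CustomOption("x")
b = CustomOption("x")

key_a = key_with_repr({"opt": a})
key_b = key_with_repr({"opt": b})

# The keys differ because each default repr includes a distinct address,
# so a cache keyed this way would never hit for such values.
print(key_a == key_b)
```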

Suggested change

-def _sql_executor_cache_key(sql_redis_options: Dict[str, Any]) -> str:
-    """Build a stable cache key for sql-redis executor reuse."""
-    return json.dumps(sql_redis_options, sort_keys=True, default=repr)
+def _normalize_sql_redis_option_value(value: Any) -> Any:
+    """Normalize sql-redis options into deterministic JSON-serializable values."""
+    if value is None or isinstance(value, (str, int, float, bool)):
+        return value
+    if isinstance(value, dict):
+        normalized: Dict[str, Any] = {}
+        for key, item in value.items():
+            if not isinstance(key, str):
+                raise TypeError(
+                    "sql_redis_options must use string keys for cache key generation; "
+                    f"got key of type {type(key).__name__}"
+                )
+            normalized[key] = _normalize_sql_redis_option_value(item)
+        return normalized
+    if isinstance(value, (list, tuple)):
+        return [_normalize_sql_redis_option_value(item) for item in value]
+    raise TypeError(
+        "sql_redis_options must contain only JSON-serializable primitive values "
+        f"(nested dict/list/tuple structures are allowed); got value of type "
+        f"{type(value).__name__}"
+    )
+
+
+def _sql_executor_cache_key(sql_redis_options: Dict[str, Any]) -> str:
+    """Build a stable cache key for sql-redis executor reuse."""
+    normalized_options = _normalize_sql_redis_option_value(sql_redis_options)
+    return json.dumps(normalized_options, sort_keys=True)

Copilot · 2026-04-06T17:29:07Z

tests/integration/test_sql_redis_json.py

+            assert len(async_sql_index._sql_executors) == 1
+            executor = next(iter(async_sql_index._sql_executors.values()))
+            assert async_sql_index.name in executor._schema_registry._schemas
+            assert other_index.name not in executor._schema_registry._schemas


These async tests assert on private sql-redis internals (executor._schema_registry._schemas and _sql_executors). This is likely to break when sql-redis refactors its implementation even if behavior is unchanged. Prefer asserting through a public sql-redis API or another stable observable (e.g., whether additional schema lookups occur) if available.
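
One way to test caching through an observable rather than private state is sketched below. `LazySchemaRegistry` is a hypothetical stand-in, not the sql-redis registry; the point is only that the test counts calls to an injected loader instead of inspecting `_schemas`.

```python
from typing import Callable, Dict
from unittest.mock import Mock


class LazySchemaRegistry:
    """Hypothetical registry that loads a schema the first time it is requested."""

    def __init__(self, load_schema: Callable[[str], dict]) -> None:
        self._load_schema = load_schema
        self._schemas: Dict[str, dict] = {}

    def get(self, index_name: str) -> dict:
        if index_name not in self._schemas:
            self._schemas[index_name] = self._load_schema(index_name)
        return self._schemas[index_name]


# The injected loader is the seam the test observes; _schemas is never touched.
loader = Mock(side_effect=lambda name: {"index": name})
registry = LazySchemaRegistry(loader)

registry.get("products")
registry.get("products")  # served from the cache, so no second load

# Names the loader was actually asked for, in call order.
loaded_names = [call.args[0] for call in loader.call_args_list]
```

A test written this way keeps passing if the registry changes how it stores schemas, as long as the lookup behavior stays the same.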

default to better version

0836c68

rbs333 requested review from Copilot and nkanu17 April 3, 2026 13:42

Copilot started reviewing on behalf of rbs333 April 3, 2026 13:43

Copilot AI reviewed Apr 3, 2026

update for sql-redis 0.4.0

0f4a5c6

rbs333 marked this pull request as ready for review April 6, 2026 17:24

Copilot AI review requested due to automatic review settings April 6, 2026 17:24

Copilot started reviewing on behalf of rbs333 April 6, 2026 17:24

cursor bot reviewed Apr 6, 2026

rbs333 and others added 2 commits April 6, 2026 13:25

Update redisvl/index/index.py

0361a63

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update redisvl/index/index.py

439d6de

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot AI reviewed Apr 6, 2026

rbs333 wants to merge 4 commits into main from feat/RAAE-1542/schema_lazy_load
