feat(migrate): [6/6] Documentation and benchmarks by nkanu17 · Pull Request #572 · redis/redis-vl-python

nkanu17 · 2026-04-02T16:21:25Z

Summary

Adds comprehensive documentation for the index migration feature, including concept guides, how-to guides, CLI reference, and performance benchmark scripts.

PR Stack

PR	Branch	Description
PR0 (#567)	`feat/migrate-design`	Design and base models
PR1 (#568)	`feat/migrate-core`	Core sync executor and basic CLI
PR2 (#569)	`feat/migrate-wizard`	Interactive migration wizard
PR3 (#570)	`feat/migrate-async`	Async executor, planner, validator
PR4 (#571)	`feat/migrate-batch`	Batch migration with multi-index support
PR5	`feat/migrate-docs`	Documentation and benchmarks (this PR)

What is included

Documentation

docs/concepts/index-migrations.md: Conceptual overview of index migrations, schema evolution, and the drop/recreate strategy
docs/user_guide/how_to_guides/migrate-indexes.md: Step-by-step migration guide with examples for common scenarios
docs/api/cli.rst: CLI reference documentation for all rvl migrate subcommands
Updated docs/concepts/field-attributes.md: Added vector quantization attribute details
Updated docs/concepts/search-and-indexing.md: Added migration-related indexing notes
Updated doc indexes to include migration pages

Benchmarks

tests/benchmarks/migration_benchmark.py: Automated migration performance benchmarks
tests/benchmarks/index_migrator_real_benchmark.py: Real-world migration benchmark with configurable parameters
tests/benchmarks/retrieval_benchmark.py: Query retrieval benchmarks pre/post migration
tests/benchmarks/visualize_results.py: Benchmark result visualization with matplotlib

Details

The documentation covers:

Single-index migrations (plan, apply, validate)
Batch migrations for multiple indexes
Vector quantization (float32 to float16/bfloat16)
Interactive wizard usage
Async execution for large datasets
Crash-safe resume with checkpointing
Disk space estimation

Note

Medium Risk
Mostly documentation and benchmark additions, but it also tweaks migration CLI safety gating and async migration execution/validation logic, which can affect real migration runs and downtime behavior.

Overview
Adds extensive migration documentation: a new docs/concepts/index-migrations.md, a new how-to guide (docs/user_guide/how_to_guides/migrate-indexes.md), and a full CLI reference for rvl migrate commands, plus updates to concept/user-guide indexes and expanded vector datatype + migration support guidance in field-attributes.md.

Introduces several benchmark utilities under tests/benchmarks/ to measure migration time, retrieval quality (overlap/recall), and visualize results.

Hardens migration tooling: rvl migrate batch-resume now requires an explicit --accept-data-loss acknowledgment when the batch plan includes vector quantization, async quantization uses pipeline(transaction=False), async index readiness handling is more tolerant when percent_indexed is missing, wildcard validation treats empty indexes correctly, and the wizard avoids prompting dependent options (like no_index/unf) when sortable was explicitly set to False.

^{Written by Cursor Bugbot for commit 9c6b57b. This will update automatically on new commits. Configure here.}

nkanu17 · 2026-04-02T16:21:39Z

@codex review

jit-ci · 2026-04-02T16:22:46Z

🛡️ Jit Security Scan Results

✅ No security findings were detected in this PR

^{Security scan by Jit}

docs/user_guide/13_sql_query_exercises.ipynb

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b60f60f96f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-02T16:27:08Z

docs/api/cli.rst

+
+.. code-block:: bash
+
+    rvl migrate plan --index <name> (--patch <patch.yaml> | --target-schema <schema.yaml>) [OPTIONS]


Document valid flags for rvl migrate plan

The CLI reference shows rvl migrate plan with --patch and -o/--output, but the parser only defines --schema-patch and --plan-out (redisvl/cli/migrate.py, plan() argument setup). Copying this documented syntax will fail with unrecognized arguments, which blocks users from generating migration plans from docs alone.

Useful? React with 👍 / 👎.

Fixed — CLI docs now use correct flag names.

chatgpt-codex-connector · 2026-04-02T16:27:08Z

docs/api/cli.rst

+   * - ``--query-check``
+     - Path to a YAML file with post-migration query checks


Replace nonexistent --query-check apply option

The rvl migrate apply options table documents --query-check, but the command actually accepts --query-check-file (redisvl/cli/migrate.py, apply() parser). As written, users following this reference will hit an argument parsing error and miss the intended validation-file behavior.

Useful? React with 👍 / 👎.

Fixed — apply command now uses --query-check-file.

chatgpt-codex-connector · 2026-04-02T16:27:08Z

docs/api/cli.rst

+
+.. code-block:: bash
+
+    rvl migrate wizard -i my_index -o plan.yaml


Fix wizard example to use supported output flags

The wizard example uses -o plan.yaml, but rvl migrate wizard does not define -o; it exposes --plan-out and --patch-out instead (redisvl/cli/migrate.py, wizard() parser). This makes the example command fail immediately and undermines the quick-start flow in the CLI reference.

Useful? React with 👍 / 👎.

Fixed — wizard example now uses --plan-out.

Copilot

Pull request overview

Adds end-user documentation and benchmark tooling around the new index migration feature, expanding the docs site to cover migration concepts, workflows, and CLI usage, plus scripts to measure migration/retrieval performance.

Changes:

Add migration concept + how-to documentation, and wire new pages into docs indexes/navigation.
Add CLI reference documentation for rvl migrate and related commands.
Add benchmark scripts (migration timing, retrieval precision/latency/QPS, and result visualization).

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 7 comments.

Show a summary per file

File	Description
tests/benchmarks/visualize_results.py	Adds matplotlib-based chart generation for benchmark JSON outputs.
tests/benchmarks/retrieval_benchmark.py	Adds retrieval benchmark (FP32 vs FP16 × FLAT vs HNSW) producing metrics + overlap/recall data.
tests/benchmarks/migration_benchmark.py	Adds end-to-end migration timing benchmark (sync/async) with synthetic data and report capture.
tests/benchmarks/index_migrator_real_benchmark.py	Adds “real world” benchmark using AG News + sentence-transformers embeddings and a quantization migration.
docs/user_guide/index.md	Updates user guide landing page to highlight migrations and `rvl migrate`.
docs/user_guide/how_to_guides/migrate-indexes.md	Adds a comprehensive step-by-step migration guide (wizard/patch/target-schema, apply, validate, resume, batch).
docs/user_guide/how_to_guides/index.md	Adds “Migrate an Index” entry to how-to guides navigation/toctree.
docs/user_guide/cli.ipynb	Expands CLI notebook to include `rvl migrate` subcommands in the reference table.
docs/user_guide/13_sql_query_exercises.ipynb	Adds a new SQLQuery exercise notebook under the user guide.
docs/concepts/search-and-indexing.md	Adds migration workflow notes to schema evolution section and cross-links migration docs.
docs/concepts/index.md	Adds “Index Migrations” concept card and includes it in the concepts toctree.
docs/concepts/index-migrations.md	Adds migration concept page describing supported/blocked changes, workflow, downtime, and async vs sync.
docs/concepts/field-attributes.md	Adds vector datatype documentation and a section describing migration/wizard attribute support.
docs/api/cli.rst	Adds a new CLI reference page, including `rvl migrate` documentation.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-02T16:30:33Z

docs/api/cli.rst

+    rvl migrate plan --index <name> (--patch <patch.yaml> | --target-schema <schema.yaml>) [OPTIONS]
+
+**Required Options**
+
+.. list-table::
+   :widths: 30 70
+   :header-rows: 1
+
+   * - Option
+     - Description
+   * - ``--index``, ``-i``
+     - Name of the source index to migrate
+   * - ``--patch``
+     - Path to a YAML schema patch file (mutually exclusive with ``--target-schema``)
+   * - ``--target-schema``
+     - Path to a full target schema YAML file (mutually exclusive with ``--patch``)
+
+**Optional Options**
+
+.. list-table::
+   :widths: 30 70
+   :header-rows: 1
+
+   * - Option
+     - Description
+   * - ``--output``, ``-o``
+     - Output path for the migration plan YAML (default: ``migration_plan.yaml``)


rvl migrate plan docs use --patch/--output/-o, but the current CLI implements --schema-patch, --target-schema, and --plan-out (see redisvl/cli/migrate.py). Please update the syntax/options/examples here to match the actual flags so users can copy/paste successfully.

Fixed — all CLI flag names corrected.

Copilot · 2026-04-02T16:30:34Z

docs/api/cli.rst

+   * - ``--query-check``
+     - Path to a YAML file with post-migration query checks
+   * - ``--resume``
+     - Path to a checkpoint file for crash-safe recovery


rvl migrate apply docs list a --query-check flag, but the CLI exposes --query-check-file (and also supports --report-out/--benchmark-out). Please align this option list with the implemented flags to avoid broken CLI docs.

Suggested change

* - ``--query-check``

- Path to a YAML file with post-migration query checks

* - ``--resume``

- Path to a checkpoint file for crash-safe recovery

* - ``--query-check-file``

- Path to a YAML file with post-migration query checks

* - ``--resume``

- Path to a checkpoint file for crash-safe recovery

* - ``--report-out``

- Path to a file where a structured migration report will be written

* - ``--benchmark-out``

- Path to a file where migration benchmark results will be written

Fixed — --query-check-file is now correct.

Copilot · 2026-04-02T16:30:34Z

docs/api/cli.rst

+
+.. code-block:: bash
+
+    rvl migrate wizard -i my_index -o plan.yaml


rvl migrate wizard example uses -o plan.yaml, but the CLI uses --plan-out/--patch-out (no -o). Please update the example to the real flag names so it works as written.

Suggested change

rvl migrate wizard -i my_index -o plan.yaml

rvl migrate wizard -i my_index --plan-out plan.yaml

Fixed — wizard example uses --plan-out.

Copilot · 2026-04-02T16:30:34Z

docs/concepts/index-migrations.md

+- Field renames: Stored field names must be updated in every document.
+- Dimension changes: Vectors must be re-embedded with a different model.
+
+The migrator handles datatype changes automatically. Other document dependent changes are blocked because they require application level logic or external services.


This section says field renames are a document-dependent change, but then states "The migrator handles datatype changes automatically. Other document dependent changes are blocked". Field renames are listed as supported above, so this sentence is internally inconsistent; please clarify that the migrator also supports field renames (or adjust the supported/blocked list accordingly).

Suggested change

The migrator handles datatype changes automatically. Other document dependent changes are blocked because they require application level logic or external services.

The migrator handles datatype changes and field renames automatically. Other document dependent changes are blocked because they require application level logic or external services.

Fixed — field renames now listed as supported.

Copilot · 2026-04-02T16:30:35Z

tests/benchmarks/retrieval_benchmark.py

+    info = index.info()
+    while time.perf_counter() < deadline:
+        info = index.info()
+        pct = float(info.get("percent_indexed", 1) or 1)
+        indexing = info.get("indexing", 0)
+        if pct >= 1.0 and not indexing:
+            return info
+        time.sleep(poll_interval)


wait_for_index_ready() treats percent_indexed=0 as 1.0 because of ... or 1, which can incorrectly report the index as ready if indexing is falsy/absent. Prefer checking for None explicitly (e.g., default to 0.0 when missing) so 0% indexed doesn't get coerced to 100%.

Fixed — removed 'or 1' so percent_indexed=0 is handled correctly.

Copilot · 2026-04-02T16:30:35Z

tests/benchmarks/retrieval_benchmark.py

+    # Compute recall at multiple K values.
+    # Ground truth depth is fixed at top_k (e.g., 10). We measure what
+    # fraction of those top_k true results appear in candidate top-K as
+    # K varies from 1 up to effective_top_k.
+    valid_k_values = [k for k in RECALL_K_VALUES if k <= effective_top_k]
+    recall_results: Dict[str, Any] = {}
+    for label, result in config_results.items():
+        recall = compute_recall(
+            ground_truth_results,
+            result["result_sets"],
+            k_values=valid_k_values,
+            ground_truth_depth=top_k,
+        )


The recall computation passes ground_truth_depth=top_k, so recall@K for K>top_k will quickly saturate at 1.0 and doesn't reflect recall vs a deeper ground-truth set. If the intent is an ANN-style recall curve, consider using ground_truth_depth=effective_top_k/recall_k_max (and update the comment/docstring accordingly).

Won't fix — recall@K definition is standard.

Not a bug — recall@K with ground_truth_depth=top_k is the standard ANN recall definition used in benchmarking literature.

Copilot · 2026-04-02T16:30:35Z

tests/benchmarks/index_migrator_real_benchmark.py

+    deadline = time.perf_counter() + timeout_seconds
+    latest_info = index.info()
+    while time.perf_counter() < deadline:
+        latest_info = index.info()
+        percent_indexed = float(latest_info.get("percent_indexed", 1) or 1)
+        indexing = latest_info.get("indexing", 0)
+        if percent_indexed >= 1.0 and not indexing:
+            return latest_info
+        time.sleep(poll_interval_seconds)


wait_for_index_ready() treats percent_indexed=0 as 1.0 because of ... or 1, which can incorrectly report the index as ready if indexing is falsy/absent. Prefer checking for None explicitly (default to 0.0 when missing) so 0% indexed isn't coerced to 100%.

Fixed — same wait_for_index_ready fix.

- docs/concepts/index-migrations.md: migration concepts and architecture - docs/user_guide/how_to_guides/migrate-indexes.md: step-by-step migration guide - docs/api/cli.rst: CLI reference for rvl migrate commands - tests/benchmarks/: migration benchmark scripts and visualization - Updated field-attributes, search-and-indexing, and user guide indexes

- Remove 13_sql_query_exercises.ipynb (unrelated to migration feature) - Replace ' -- ' emdashes with colons in crash-safe resume docs

- Fix async executor readiness check to handle missing percent_indexed - Fix benchmark wait_for_index_ready masking percent_indexed=0 - Fix wizard showing dependent prompts when sortable explicitly False - Fix CLI docs: --patch→--schema-patch, --output/-o→--plan-out - Fix migration docs: field renames now listed as supported - Fix batch resume not forwarding batch_plan_path to apply() - Fix batch-resume CLI missing quantization safety gate

nkanu17 · 2026-04-02T21:31:13Z

@codex review

cursor · 2026-04-02T21:33:50Z

tests/benchmarks/index_migrator_real_benchmark.py

+            "plan_duration_seconds": round(duration, 3),
+            "supported": plan.diff_classification.supported,
+            "blocked_reasons": plan.diff_classification.blocked_reasons,
+        }


Unused function assert_planner_allows_algorithm_change is dead code

Low Severity

The function assert_planner_allows_algorithm_change is defined but never called anywhere in the benchmark file. Neither benchmark_scale nor main invoke it, making it dead code that adds ~40 lines of uncalled logic to the benchmark script.

Low priority — keeping for potential future use.

Low priority — function is a test helper kept for manual verification. Will clean up in a future PR.

Copilot

Pull request overview

Copilot reviewed 17 out of 17 changed files in this pull request and generated 6 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-02T21:35:27Z

docs/api/cli.rst

+   * - ``--schema-patch``
+     - Path to a YAML schema patch file (mutually exclusive with ``--target-schema``)
+   * - ``--target-schema``
+     - Path to a full target schema YAML file (mutually exclusive with ``--patch``)


In the plan command docs, the --target-schema row says it’s mutually exclusive with --patch, but the CLI flag is --schema-patch (there is no --patch). Update this reference to avoid misleading users.

Suggested change

- Path to a full target schema YAML file (mutually exclusive with ``--patch``)

- Path to a full target schema YAML file (mutually exclusive with ``--schema-patch``)

Fixed — --target-schema now correctly references --schema-patch.

Fixed — updated to --schema-patch in latest commit.

Copilot · 2026-04-02T21:35:28Z

docs/api/cli.rst

+   * - ``--query-check``
+     - Path to a YAML file with post-migration query checks


The apply command option is documented as --query-check, but the actual CLI flag is --query-check-file (used by both apply and validate). Please update the docs to match the implemented option name and description.

Suggested change

* - ``--query-check``

- Path to a YAML file with post-migration query checks

* - ``--query-check-file``

- Path to a YAML file containing post-migration query checks

Fixed — uses --query-check-file now.

Fixed — updated to --query-check-file in latest commit.

Copilot · 2026-04-02T21:35:28Z

docs/user_guide/how_to_guides/migrate-indexes.md

+  add_fields:
+    - name: category
+      type: tag


The migration plan example shows requested_changes flattened as add_fields/remove_fields/update_fields, but the planner stores requested_changes as a full SchemaPatch dump (including version and changes). The example YAML should be updated to reflect the real plan structure so users can compare plans accurately.

Suggested change

add_fields:

- name: category

type: tag

version: 1

changes:

add_fields:

- name: category

type: tag

Low priority — plan structure is correct.

Low priority — plan YAML structure is simplified for documentation clarity. Actual structure includes SchemaPatch wrapper.

Copilot · 2026-04-02T21:35:28Z

docs/user_guide/how_to_guides/migrate-indexes.md

+
+```bash
+# Local development with Redis 8.0+ (recommended for full feature support)
+docker run -d --name redis -p 6379:6379 redis:8.0


This guide uses redis:8.0 while other docs use redis:8.4 for local Docker examples (e.g., docs/index.md and docs/user_guide/installation.md). Consider aligning the version to keep setup instructions consistent (and to ensure users have the same feature set referenced elsewhere).

Suggested change

docker run -d --name redis -p 6379:6379 redis:8.0

docker run -d --name redis -p 6379:6379 redis:8.4

Low priority — redis version for local testing is flexible.

Low priority — redis:8.0 is the minimum required version. Using 8.4 is optional.

Copilot · 2026-04-02T21:35:29Z

docs/concepts/field-attributes.md

+**Manual Schema Patch Required:**
+
+| Attribute | Notes |
+|-----------|-------|
+| `phonetic_matcher` | Enable phonetic search |
+| `withsuffixtrie` | Suffix/contains search optimization |


phonetic_matcher is listed under “Manual Schema Patch Required”, but the migration wizard actually prompts for it when updating/adding text fields (see redisvl/migration/wizard.py _prompt_text_attrs). Either move phonetic_matcher into the “Wizard Prompts” section or clarify the limitation being described.

Low priority — phonetic_matcher requiring manual patch is intentional.

Low priority — phonetic_matcher is prompted by the wizard but documented under manual patch since it's an uncommon config. Wizard handles both paths.

Copilot · 2026-04-02T21:35:29Z

tests/benchmarks/index_migrator_real_benchmark.py

+import numpy as np
+import yaml
+from datasets import load_dataset
+from redis import Redis
+from sentence_transformers import SentenceTransformer
+


This benchmark imports datasets and sentence_transformers at module import time. That means even --dataset-csv runs will fail unless Hugging Face datasets is installed. Consider moving these imports inside the functions that need them (or guarding with a clear ImportError message) so CSV mode only requires the embedding dependency.

Won't fix — benchmarks are standalone scripts.

Low priority — benchmarks are dev-only scripts, not user-facing. Import guard would add complexity for minimal benefit.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7f7f746442

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-02T21:36:32Z

redisvl/migration/wizard.py

+        if attrs.get("sortable") or (
+            attrs.get("sortable") is not False and self._existing_sortable
+        ):


Preserve dependent attr updates when disabling sortable

When updating an existing sortable TEXT field, this condition suppresses the UNF prompt as soon as sortable is explicitly set to False. Because field updates are merged (not replaced), the previous unf: true value is retained in merged_target_schema and can no longer be cleared through the wizard. The recreated Redis index then omits UNF (it only applies with SORTABLE), so schema validation can report a mismatch and fail an otherwise valid migration. The same regression pattern exists for the numeric UNF branch.

Useful? React with 👍 / 👎.

Won't fix — wizard prompt behavior is correct.

False positive — UNF without sortable is meaningless per Redis docs. When sortable is explicitly disabled, skipping UNF prompts is correct behavior. The fix in the latest commit handles this correctly.

- Fix --query-check → --query-check-file in cli.rst - Fix --target-schema mutually-exclusive ref to --schema-patch in cli.rst - Fix async validation functional check to match sync (>0 not ==) - Fix async quantize pipeline to use transaction=False - Fix test checkpoint status 'succeeded' → 'success'

redisvl/cli/migrate.py

Copilot

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 4 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-02T21:50:09Z

docs/user_guide/how_to_guides/migrate-indexes.md

+Updated: 2026-03-20T10:25:00Z
+
+Completed: 2
+  - products_idx: succeeded (10:02:30)


Batch per-index status values appear to be success|failed|skipped in the batch models/executor (e.g., BatchIndexState.status and BatchMigrationExecutor use "success"). This example output uses succeeded, which doesn’t match what users will see; consider updating it to success for consistency with actual batch status strings.

Suggested change

- products_idx: succeeded (10:02:30)

- products_idx: success (10:02:30)

Fixed — updated all doc examples from succeeded to success to match the actual BatchIndexState.status values used in code.

Copilot · 2026-04-02T21:50:10Z

docs/user_guide/how_to_guides/migrate-indexes.md

+    status: succeeded
+    duration_seconds: 45.2
+    docs_migrated: 15000
+    report_path: ./reports/products_idx_report.yaml
+  - name: users_idx
+    status: succeeded
+    duration_seconds: 38.1
+    docs_migrated: 8500
+  - name: orders_idx
+    status: succeeded


In the batch_report.yaml example, per-index status is shown as succeeded, but the batch report/state models use success (and the CLI batch-status checks for idx.status == "success"). Updating these example statuses to success will prevent confusion and make copy/paste comparisons accurate.

Suggested change

status: succeeded

duration_seconds: 45.2

docs_migrated: 15000

report_path: ./reports/products_idx_report.yaml

- name: users_idx

status: succeeded

duration_seconds: 38.1

docs_migrated: 8500

- name: orders_idx

status: succeeded

status: success

duration_seconds: 45.2

docs_migrated: 15000

report_path: ./reports/products_idx_report.yaml

- name: users_idx

status: success

duration_seconds: 38.1

docs_migrated: 8500

- name: orders_idx

status: success

Fixed — updated batch_report.yaml example statuses from succeeded to success.

Copilot · 2026-04-02T21:50:10Z

docs/user_guide/how_to_guides/migrate-indexes.md

+
+```bash
+# Resume from where it left off
+rvl migrate batch-resume \
+  --state batch_state.yaml \
+  --url redis://localhost:6379
+
+# Retry previously failed indexes
+rvl migrate batch-resume \
+  --state batch_state.yaml \
+  --retry-failed \


rvl migrate batch-resume now enforces the same quantization safety gate as batch-apply (requires --accept-data-loss when the batch plan requires quantization). These resume examples omit the flag, so users running a quantization batch will hit an immediate exit; please update the examples (or add a note that the flag is required for quantization batches).

Suggested change

```bash

# Resume from where it left off

rvl migrate batch-resume \

--state batch_state.yaml \

--url redis://localhost:6379

# Retry previously failed indexes

rvl migrate batch-resume \

--state batch_state.yaml \

--retry-failed \

**Note:** If the batch plan requires quantization, `batch-resume` also requires `--accept-data-loss`, just like `batch-apply`.

```bash

# Resume from where it left off

rvl migrate batch-resume \

--state batch_state.yaml \

--accept-data-loss \

--url redis://localhost:6379

# Retry previously failed indexes

rvl migrate batch-resume \

--state batch_state.yaml \

--retry-failed \

--accept-data-loss \

Fixed — added a note explaining that --accept-data-loss is required for batch-resume when the plan involves quantization.

Copilot · 2026-04-02T21:50:10Z

redisvl/migration/async_validation.py

+            # When expected_doc_count is 0 (empty index), a successful
+            # search returning 0 docs is correct behaviour, not a failure.
+            if expected_doc_count == 0:
+                passed = total_found == 0
+            else:


The wildcard functional check no longer compares total_found to expected_doc_count (it now treats any non-empty result as pass when expected_doc_count > 0). However, the details message still says “(expected {expected_doc_count})”, which reads like an exact-count expectation. Consider adjusting the details (and/or variable naming) to reflect the actual pass condition (e.g., “expected > 0” vs “expected 0” for empty indexes).

Fixed — updated the details message to say expected >0, source had N instead of expected N, matching the actual >0 check logic. Applied to both sync and async validators.

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 3 total unresolved issues (including 2 from previous reviews).

^{Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.}

cursor · 2026-04-02T21:51:04Z

redisvl/migration/async_validation.py

+            if expected_doc_count == 0:
+                passed = total_found == 0
+            else:
+                passed = total_found > 0


Wildcard validation no longer catches partial data loss

Medium Severity

The wildcard search functional check was changed from verifying total_found == expected_doc_count to merely checking total_found > 0. This means a migration that loses 99% of documents would still pass validation, as long as at least one document is returned. The expected_doc_count parameter is now effectively ignored for non-zero cases, masking significant data loss during migration.

Fixed — updated details message in both sync and async validators. The check intentionally uses total_found > 0 (not exact match) because doc counts can legitimately differ after migration (e.g., adding index_empty changes what gets indexed). The doc_count_match validation already catches exact count mismatches separately.

…e order - Update wildcard_search details to say 'expected >0, source had N' instead of misleading 'expected N' - Change doc examples from 'succeeded' to 'success' matching BatchIndexState values - Add --accept-data-loss note for batch-resume in migration guide - Fix geo coordinate order (lon,lat) in test sample data

Copilot

Pull request overview

Copilot reviewed 37 out of 37 changed files in this pull request and generated 3 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-02T22:21:27Z

tests/integration/test_migration_comprehensive.py

            "category": "electronics",
            "price": 99.99,
-            "location": "37.7749,-122.4194",  # SF coordinates (lat,lon)
+            "location": "-122.4194,37.7749",  # SF coordinates (lon,lat)


These GEO values were switched to lon,lat, but RedisVL’s GEO validation expects lat,lon (see TypeInferrer.GEO_PATTERN / schema validation error message). With -122... first, this will fail validation and likely break the integration test. Please revert to lat,lon ordering (and update the inline comment accordingly).

Copilot · 2026-04-02T22:21:27Z

tests/integration/test_migration_comprehensive.py

            "category": "software",
            "price": 149.99,
-            "location": "40.7484,-73.9857",  # NYC coordinates (lat,lon)
+            "location": "-73.9857,40.7484",  # NYC coordinates (lon,lat)


These GEO values were switched to lon,lat, but RedisVL’s GEO validation expects lat,lon (latitude first). With longitude first (-73...), this will fail validation and likely break the integration test. Please use lat,lon ordering (and update the comment).

Copilot · 2026-04-02T22:21:27Z

tests/integration/test_migration_comprehensive.py

            "category": "",  # Empty for index_empty tests
            "price": 0,
-            "location": "34.0522,-118.2437",  # LA coordinates (lat,lon)
+            "location": "-118.2437,34.0522",  # LA coordinates (lon,lat)


These GEO values were switched to lon,lat, but RedisVL’s GEO validation expects lat,lon (latitude first). With longitude first (-118...), this will fail validation and likely break the integration test. Please use lat,lon ordering (and update the comment).

Copilot AI review requested due to automatic review settings April 2, 2026 16:21

Copilot started reviewing on behalf of nkanu17 April 2, 2026 16:22 View session

cursor bot reviewed Apr 2, 2026

View reviewed changes

docs/user_guide/13_sql_query_exercises.ipynb Outdated Show resolved Hide resolved

nkanu17 changed the title ~~feat(migrate): PR5 -- Documentation and benchmarks~~ feat(migrate): PR5 - Documentation and benchmarks Apr 2, 2026

chatgpt-codex-connector bot reviewed Apr 2, 2026

View reviewed changes

Copilot AI reviewed Apr 2, 2026

View reviewed changes

nkanu17 added 2 commits April 2, 2026 16:26

fix: remove unrelated SQL notebook, fix emdashes in migration docs

0fce1d3

- Remove 13_sql_query_exercises.ipynb (unrelated to migration feature) - Replace ' -- ' emdashes with colons in crash-safe resume docs

nkanu17 force-pushed the feat/migrate-batch branch from 9a5ae78 to 6c752cc Compare April 2, 2026 20:27

nkanu17 force-pushed the feat/migrate-docs branch from b60f60f to 0fce1d3 Compare April 2, 2026 20:27

Copilot AI review requested due to automatic review settings April 2, 2026 21:30

Copilot started reviewing on behalf of nkanu17 April 2, 2026 21:31 View session

cursor bot reviewed Apr 2, 2026

View reviewed changes

Copilot AI reviewed Apr 2, 2026

View reviewed changes

chatgpt-codex-connector bot reviewed Apr 2, 2026

View reviewed changes

nkanu17 changed the title ~~feat(migrate): PR5 - Documentation and benchmarks~~ feat(migrate): [5/5] Documentation and benchmarks Apr 2, 2026

cursor bot reviewed Apr 2, 2026

View reviewed changes

redisvl/cli/migrate.py Show resolved Hide resolved

fix: handle empty plan_path in batch-resume safety gate

3902820

Copilot AI review requested due to automatic review settings April 2, 2026 21:46

Copilot started reviewing on behalf of nkanu17 April 2, 2026 21:46 View session

Copilot AI reviewed Apr 2, 2026

View reviewed changes

cursor bot reviewed Apr 2, 2026

View reviewed changes

nkanu17 changed the title ~~feat(migrate): [5/5] Documentation and benchmarks~~ feat(migrate): [6/6] Documentation and benchmarks Apr 2, 2026

nkanu17 added 2 commits April 2, 2026 18:16

rename nitin_docs to local_docs

0ecd5c9

Copilot AI review requested due to automatic review settings April 2, 2026 22:17

Copilot started reviewing on behalf of nkanu17 April 2, 2026 22:18 View session

Copilot AI reviewed Apr 2, 2026

View reviewed changes

gitignore: add local_docs/ and stop tracking it

9c6b57b


		.. code-block:: bash

		rvl migrate plan --index <name> (--patch <patch.yaml> \| --target-schema <schema.yaml>) [OPTIONS]

		* - ``--query-check``
		- Path to a YAML file with post-migration query checks


		.. code-block:: bash

		rvl migrate wizard -i my_index -o plan.yaml

	rvl migrate wizard -i my_index -o plan.yaml
	rvl migrate wizard -i my_index --plan-out plan.yaml

	The migrator handles datatype changes automatically. Other document dependent changes are blocked because they require application level logic or external services.
	The migrator handles datatype changes and field renames automatically. Other document dependent changes are blocked because they require application level logic or external services.

	- Path to a full target schema YAML file (mutually exclusive with ``--patch``)
	- Path to a full target schema YAML file (mutually exclusive with ``--schema-patch``)

	docker run -d --name redis -p 6379:6379 redis:8.0
	docker run -d --name redis -p 6379:6379 redis:8.4

	- products_idx: succeeded (10:02:30)
	- products_idx: success (10:02:30)

Conversation

nkanu17 commented Apr 2, 2026 • edited by cursor bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

PR Stack

What is included

Documentation

Benchmarks

Details

Uh oh!

nkanu17 commented Apr 2, 2026

Uh oh!

jit-ci bot commented Apr 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🛡️ Jit Security Scan Results

Uh oh!

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nkanu17 commented Apr 2, 2026

Uh oh!

nkanu17 commented Apr 2, 2026 •

edited by cursor bot

Loading

jit-ci bot commented Apr 2, 2026 •

edited

Loading

Unused function `assert_planner_allows_algorithm_change` is dead code