feat(migrate): [6/6] Documentation and benchmarks#572
feat(migrate): [6/6] Documentation and benchmarks#572nkanu17 wants to merge 8 commits intofeat/migrate-batchfrom
Conversation
|
@codex review |
🛡️ Jit Security Scan Results✅ No security findings were detected in this PR
Security scan by Jit
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b60f60f96f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
docs/api/cli.rst
Outdated
|
|
||
| .. code-block:: bash | ||
|
|
||
| rvl migrate plan --index <name> (--patch <patch.yaml> | --target-schema <schema.yaml>) [OPTIONS] |
There was a problem hiding this comment.
Document valid flags for
rvl migrate plan
The CLI reference shows rvl migrate plan with --patch and -o/--output, but the parser only defines --schema-patch and --plan-out (redisvl/cli/migrate.py, plan() argument setup). Copying this documented syntax will fail with unrecognized arguments, which blocks users from generating migration plans from docs alone.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed — CLI docs now use correct flag names.
docs/api/cli.rst
Outdated
| * - ``--query-check`` | ||
| - Path to a YAML file with post-migration query checks |
There was a problem hiding this comment.
Replace nonexistent
--query-check apply option
The rvl migrate apply options table documents --query-check, but the command actually accepts --query-check-file (redisvl/cli/migrate.py, apply() parser). As written, users following this reference will hit an argument parsing error and miss the intended validation-file behavior.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed — apply command now uses --query-check-file.
docs/api/cli.rst
Outdated
|
|
||
| .. code-block:: bash | ||
|
|
||
| rvl migrate wizard -i my_index -o plan.yaml |
There was a problem hiding this comment.
Fix wizard example to use supported output flags
The wizard example uses -o plan.yaml, but rvl migrate wizard does not define -o; it exposes --plan-out and --patch-out instead (redisvl/cli/migrate.py, wizard() parser). This makes the example command fail immediately and undermines the quick-start flow in the CLI reference.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed — wizard example now uses --plan-out.
There was a problem hiding this comment.
Pull request overview
Adds end-user documentation and benchmark tooling around the new index migration feature, expanding the docs site to cover migration concepts, workflows, and CLI usage, plus scripts to measure migration/retrieval performance.
Changes:
- Add migration concept + how-to documentation, and wire new pages into docs indexes/navigation.
- Add CLI reference documentation for
rvl migrateand related commands. - Add benchmark scripts (migration timing, retrieval precision/latency/QPS, and result visualization).
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/benchmarks/visualize_results.py | Adds matplotlib-based chart generation for benchmark JSON outputs. |
| tests/benchmarks/retrieval_benchmark.py | Adds retrieval benchmark (FP32 vs FP16 × FLAT vs HNSW) producing metrics + overlap/recall data. |
| tests/benchmarks/migration_benchmark.py | Adds end-to-end migration timing benchmark (sync/async) with synthetic data and report capture. |
| tests/benchmarks/index_migrator_real_benchmark.py | Adds “real world” benchmark using AG News + sentence-transformers embeddings and a quantization migration. |
| docs/user_guide/index.md | Updates user guide landing page to highlight migrations and rvl migrate. |
| docs/user_guide/how_to_guides/migrate-indexes.md | Adds a comprehensive step-by-step migration guide (wizard/patch/target-schema, apply, validate, resume, batch). |
| docs/user_guide/how_to_guides/index.md | Adds “Migrate an Index” entry to how-to guides navigation/toctree. |
| docs/user_guide/cli.ipynb | Expands CLI notebook to include rvl migrate subcommands in the reference table. |
| docs/user_guide/13_sql_query_exercises.ipynb | Adds a new SQLQuery exercise notebook under the user guide. |
| docs/concepts/search-and-indexing.md | Adds migration workflow notes to schema evolution section and cross-links migration docs. |
| docs/concepts/index.md | Adds “Index Migrations” concept card and includes it in the concepts toctree. |
| docs/concepts/index-migrations.md | Adds migration concept page describing supported/blocked changes, workflow, downtime, and async vs sync. |
| docs/concepts/field-attributes.md | Adds vector datatype documentation and a section describing migration/wizard attribute support. |
| docs/api/cli.rst | Adds a new CLI reference page, including rvl migrate documentation. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
docs/api/cli.rst
Outdated
| rvl migrate plan --index <name> (--patch <patch.yaml> | --target-schema <schema.yaml>) [OPTIONS] | ||
|
|
||
| **Required Options** | ||
|
|
||
| .. list-table:: | ||
| :widths: 30 70 | ||
| :header-rows: 1 | ||
|
|
||
| * - Option | ||
| - Description | ||
| * - ``--index``, ``-i`` | ||
| - Name of the source index to migrate | ||
| * - ``--patch`` | ||
| - Path to a YAML schema patch file (mutually exclusive with ``--target-schema``) | ||
| * - ``--target-schema`` | ||
| - Path to a full target schema YAML file (mutually exclusive with ``--patch``) | ||
|
|
||
| **Optional Options** | ||
|
|
||
| .. list-table:: | ||
| :widths: 30 70 | ||
| :header-rows: 1 | ||
|
|
||
| * - Option | ||
| - Description | ||
| * - ``--output``, ``-o`` | ||
| - Output path for the migration plan YAML (default: ``migration_plan.yaml``) |
There was a problem hiding this comment.
rvl migrate plan docs use --patch/--output/-o, but the current CLI implements --schema-patch, --target-schema, and --plan-out (see redisvl/cli/migrate.py). Please update the syntax/options/examples here to match the actual flags so users can copy/paste successfully.
There was a problem hiding this comment.
Fixed — all CLI flag names corrected.
docs/api/cli.rst
Outdated
| * - ``--query-check`` | ||
| - Path to a YAML file with post-migration query checks | ||
| * - ``--resume`` | ||
| - Path to a checkpoint file for crash-safe recovery |
There was a problem hiding this comment.
rvl migrate apply docs list a --query-check flag, but the CLI exposes --query-check-file (and also supports --report-out/--benchmark-out). Please align this option list with the implemented flags to avoid broken CLI docs.
| * - ``--query-check`` | |
| - Path to a YAML file with post-migration query checks | |
| * - ``--resume`` | |
| - Path to a checkpoint file for crash-safe recovery | |
| * - ``--query-check-file`` | |
| - Path to a YAML file with post-migration query checks | |
| * - ``--resume`` | |
| - Path to a checkpoint file for crash-safe recovery | |
| * - ``--report-out`` | |
| - Path to a file where a structured migration report will be written | |
| * - ``--benchmark-out`` | |
| - Path to a file where migration benchmark results will be written |
There was a problem hiding this comment.
Fixed — --query-check-file is now correct.
docs/api/cli.rst
Outdated
|
|
||
| .. code-block:: bash | ||
|
|
||
| rvl migrate wizard -i my_index -o plan.yaml |
There was a problem hiding this comment.
rvl migrate wizard example uses -o plan.yaml, but the CLI uses --plan-out/--patch-out (no -o). Please update the example to the real flag names so it works as written.
| rvl migrate wizard -i my_index -o plan.yaml | |
| rvl migrate wizard -i my_index --plan-out plan.yaml |
There was a problem hiding this comment.
Fixed — wizard example uses --plan-out.
docs/concepts/index-migrations.md
Outdated
| - Field renames: Stored field names must be updated in every document. | ||
| - Dimension changes: Vectors must be re-embedded with a different model. | ||
|
|
||
| The migrator handles datatype changes automatically. Other document dependent changes are blocked because they require application level logic or external services. |
There was a problem hiding this comment.
This section says field renames are a document-dependent change, but then states "The migrator handles datatype changes automatically. Other document dependent changes are blocked". Field renames are listed as supported above, so this sentence is internally inconsistent; please clarify that the migrator also supports field renames (or adjust the supported/blocked list accordingly).
| The migrator handles datatype changes automatically. Other document dependent changes are blocked because they require application level logic or external services. | |
| The migrator handles datatype changes and field renames automatically. Other document dependent changes are blocked because they require application level logic or external services. |
There was a problem hiding this comment.
Fixed — field renames now listed as supported.
| info = index.info() | ||
| while time.perf_counter() < deadline: | ||
| info = index.info() | ||
| pct = float(info.get("percent_indexed", 1) or 1) | ||
| indexing = info.get("indexing", 0) | ||
| if pct >= 1.0 and not indexing: | ||
| return info | ||
| time.sleep(poll_interval) |
There was a problem hiding this comment.
wait_for_index_ready() treats percent_indexed=0 as 1.0 because of ... or 1, which can incorrectly report the index as ready if indexing is falsy/absent. Prefer checking for None explicitly (e.g., default to 0.0 when missing) so 0% indexed doesn't get coerced to 100%.
There was a problem hiding this comment.
Fixed — removed 'or 1' so percent_indexed=0 is handled correctly.
| # Compute recall at multiple K values. | ||
| # Ground truth depth is fixed at top_k (e.g., 10). We measure what | ||
| # fraction of those top_k true results appear in candidate top-K as | ||
| # K varies from 1 up to effective_top_k. | ||
| valid_k_values = [k for k in RECALL_K_VALUES if k <= effective_top_k] | ||
| recall_results: Dict[str, Any] = {} | ||
| for label, result in config_results.items(): | ||
| recall = compute_recall( | ||
| ground_truth_results, | ||
| result["result_sets"], | ||
| k_values=valid_k_values, | ||
| ground_truth_depth=top_k, | ||
| ) |
There was a problem hiding this comment.
The recall computation passes ground_truth_depth=top_k, so recall@K for K>top_k will quickly saturate at 1.0 and doesn't reflect recall vs a deeper ground-truth set. If the intent is an ANN-style recall curve, consider using ground_truth_depth=effective_top_k/recall_k_max (and update the comment/docstring accordingly).
There was a problem hiding this comment.
Won't fix — recall@K definition is standard.
There was a problem hiding this comment.
Not a bug — recall@K with ground_truth_depth=top_k is the standard ANN recall definition used in benchmarking literature.
| deadline = time.perf_counter() + timeout_seconds | ||
| latest_info = index.info() | ||
| while time.perf_counter() < deadline: | ||
| latest_info = index.info() | ||
| percent_indexed = float(latest_info.get("percent_indexed", 1) or 1) | ||
| indexing = latest_info.get("indexing", 0) | ||
| if percent_indexed >= 1.0 and not indexing: | ||
| return latest_info | ||
| time.sleep(poll_interval_seconds) |
There was a problem hiding this comment.
wait_for_index_ready() treats percent_indexed=0 as 1.0 because of ... or 1, which can incorrectly report the index as ready if indexing is falsy/absent. Prefer checking for None explicitly (default to 0.0 when missing) so 0% indexed isn't coerced to 100%.
There was a problem hiding this comment.
Fixed — same wait_for_index_ready fix.
- docs/concepts/index-migrations.md: migration concepts and architecture - docs/user_guide/how_to_guides/migrate-indexes.md: step-by-step migration guide - docs/api/cli.rst: CLI reference for rvl migrate commands - tests/benchmarks/: migration benchmark scripts and visualization - Updated field-attributes, search-and-indexing, and user guide indexes
- Remove 13_sql_query_exercises.ipynb (unrelated to migration feature) - Replace ' -- ' emdashes with colons in crash-safe resume docs
9a5ae78 to
6c752cc
Compare
b60f60f to
0fce1d3
Compare
- Fix async executor readiness check to handle missing percent_indexed - Fix benchmark wait_for_index_ready masking percent_indexed=0 - Fix wizard showing dependent prompts when sortable explicitly False - Fix CLI docs: --patch→--schema-patch, --output/-o→--plan-out - Fix migration docs: field renames now listed as supported - Fix batch resume not forwarding batch_plan_path to apply() - Fix batch-resume CLI missing quantization safety gate
|
@codex review |
| "plan_duration_seconds": round(duration, 3), | ||
| "supported": plan.diff_classification.supported, | ||
| "blocked_reasons": plan.diff_classification.blocked_reasons, | ||
| } |
There was a problem hiding this comment.
Unused function assert_planner_allows_algorithm_change is dead code
Low Severity
The function assert_planner_allows_algorithm_change is defined but never called anywhere in the benchmark file. Neither benchmark_scale nor main invoke it, making it dead code that adds ~40 lines of uncalled logic to the benchmark script.
There was a problem hiding this comment.
Low priority — keeping for potential future use.
There was a problem hiding this comment.
Low priority — function is a test helper kept for manual verification. Will clean up in a future PR.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 17 out of 17 changed files in this pull request and generated 6 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
docs/api/cli.rst
Outdated
| * - ``--schema-patch`` | ||
| - Path to a YAML schema patch file (mutually exclusive with ``--target-schema``) | ||
| * - ``--target-schema`` | ||
| - Path to a full target schema YAML file (mutually exclusive with ``--patch``) |
There was a problem hiding this comment.
In the plan command docs, the --target-schema row says it’s mutually exclusive with --patch, but the CLI flag is --schema-patch (there is no --patch). Update this reference to avoid misleading users.
| - Path to a full target schema YAML file (mutually exclusive with ``--patch``) | |
| - Path to a full target schema YAML file (mutually exclusive with ``--schema-patch``) |
There was a problem hiding this comment.
Fixed — --target-schema now correctly references --schema-patch.
There was a problem hiding this comment.
Fixed — updated to --schema-patch in latest commit.
docs/api/cli.rst
Outdated
| * - ``--query-check`` | ||
| - Path to a YAML file with post-migration query checks |
There was a problem hiding this comment.
The apply command option is documented as --query-check, but the actual CLI flag is --query-check-file (used by both apply and validate). Please update the docs to match the implemented option name and description.
| * - ``--query-check`` | |
| - Path to a YAML file with post-migration query checks | |
| * - ``--query-check-file`` | |
| - Path to a YAML file containing post-migration query checks |
There was a problem hiding this comment.
Fixed — uses --query-check-file now.
There was a problem hiding this comment.
Fixed — updated to --query-check-file in latest commit.
| add_fields: | ||
| - name: category | ||
| type: tag |
There was a problem hiding this comment.
The migration plan example shows requested_changes flattened as add_fields/remove_fields/update_fields, but the planner stores requested_changes as a full SchemaPatch dump (including version and changes). The example YAML should be updated to reflect the real plan structure so users can compare plans accurately.
| add_fields: | |
| - name: category | |
| type: tag | |
| version: 1 | |
| changes: | |
| add_fields: | |
| - name: category | |
| type: tag |
There was a problem hiding this comment.
Low priority — plan structure is correct.
There was a problem hiding this comment.
Low priority — plan YAML structure is simplified for documentation clarity. Actual structure includes SchemaPatch wrapper.
|
|
||
| ```bash | ||
| # Local development with Redis 8.0+ (recommended for full feature support) | ||
| docker run -d --name redis -p 6379:6379 redis:8.0 |
There was a problem hiding this comment.
This guide uses redis:8.0 while other docs use redis:8.4 for local Docker examples (e.g., docs/index.md and docs/user_guide/installation.md). Consider aligning the version to keep setup instructions consistent (and to ensure users have the same feature set referenced elsewhere).
| docker run -d --name redis -p 6379:6379 redis:8.0 | |
| docker run -d --name redis -p 6379:6379 redis:8.4 |
There was a problem hiding this comment.
Low priority — redis version for local testing is flexible.
There was a problem hiding this comment.
Low priority — redis:8.0 is the minimum required version. Using 8.4 is optional.
| **Manual Schema Patch Required:** | ||
|
|
||
| | Attribute | Notes | | ||
| |-----------|-------| | ||
| | `phonetic_matcher` | Enable phonetic search | | ||
| | `withsuffixtrie` | Suffix/contains search optimization | |
There was a problem hiding this comment.
phonetic_matcher is listed under “Manual Schema Patch Required”, but the migration wizard actually prompts for it when updating/adding text fields (see redisvl/migration/wizard.py _prompt_text_attrs). Either move phonetic_matcher into the “Wizard Prompts” section or clarify the limitation being described.
There was a problem hiding this comment.
Low priority — phonetic_matcher requiring manual patch is intentional.
There was a problem hiding this comment.
Low priority — phonetic_matcher is prompted by the wizard but documented under manual patch since it's an uncommon config. Wizard handles both paths.
| import numpy as np | ||
| import yaml | ||
| from datasets import load_dataset | ||
| from redis import Redis | ||
| from sentence_transformers import SentenceTransformer | ||
|
|
There was a problem hiding this comment.
This benchmark imports datasets and sentence_transformers at module import time. That means even --dataset-csv runs will fail unless Hugging Face datasets is installed. Consider moving these imports inside the functions that need them (or guarding with a clear ImportError message) so CSV mode only requires the embedding dependency.
There was a problem hiding this comment.
Won't fix — benchmarks are standalone scripts.
There was a problem hiding this comment.
Low priority — benchmarks are dev-only scripts, not user-facing. Import guard would add complexity for minimal benefit.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7f7f746442
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if attrs.get("sortable") or ( | ||
| attrs.get("sortable") is not False and self._existing_sortable | ||
| ): |
There was a problem hiding this comment.
Preserve dependent attr updates when disabling sortable
When updating an existing sortable TEXT field, this condition suppresses the UNF prompt as soon as sortable is explicitly set to False. Because field updates are merged (not replaced), the previous unf: true value is retained in merged_target_schema and can no longer be cleared through the wizard. The recreated Redis index then omits UNF (it only applies with SORTABLE), so schema validation can report a mismatch and fail an otherwise valid migration. The same regression pattern exists for the numeric UNF branch.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Won't fix — wizard prompt behavior is correct.
There was a problem hiding this comment.
False positive — UNF without sortable is meaningless per Redis docs. When sortable is explicitly disabled, skipping UNF prompts is correct behavior. The fix in the latest commit handles this correctly.
- Fix --query-check → --query-check-file in cli.rst - Fix --target-schema mutually-exclusive ref to --schema-patch in cli.rst - Fix async validation functional check to match sync (>0 not ==) - Fix async quantize pipeline to use transaction=False - Fix test checkpoint status 'succeeded' → 'success'
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 19 out of 19 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| Updated: 2026-03-20T10:25:00Z | ||
|
|
||
| Completed: 2 | ||
| - products_idx: succeeded (10:02:30) |
There was a problem hiding this comment.
Batch per-index status values appear to be success|failed|skipped in the batch models/executor (e.g., BatchIndexState.status and BatchMigrationExecutor use "success"). This example output uses succeeded, which doesn’t match what users will see; consider updating it to success for consistency with actual batch status strings.
| - products_idx: succeeded (10:02:30) | |
| - products_idx: success (10:02:30) |
There was a problem hiding this comment.
Fixed — updated all doc examples from succeeded to success to match the actual BatchIndexState.status values used in code.
| status: succeeded | ||
| duration_seconds: 45.2 | ||
| docs_migrated: 15000 | ||
| report_path: ./reports/products_idx_report.yaml | ||
| - name: users_idx | ||
| status: succeeded | ||
| duration_seconds: 38.1 | ||
| docs_migrated: 8500 | ||
| - name: orders_idx | ||
| status: succeeded |
There was a problem hiding this comment.
In the batch_report.yaml example, per-index status is shown as succeeded, but the batch report/state models use success (and the CLI batch-status checks for idx.status == "success"). Updating these example statuses to success will prevent confusion and make copy/paste comparisons accurate.
| status: succeeded | |
| duration_seconds: 45.2 | |
| docs_migrated: 15000 | |
| report_path: ./reports/products_idx_report.yaml | |
| - name: users_idx | |
| status: succeeded | |
| duration_seconds: 38.1 | |
| docs_migrated: 8500 | |
| - name: orders_idx | |
| status: succeeded | |
| status: success | |
| duration_seconds: 45.2 | |
| docs_migrated: 15000 | |
| report_path: ./reports/products_idx_report.yaml | |
| - name: users_idx | |
| status: success | |
| duration_seconds: 38.1 | |
| docs_migrated: 8500 | |
| - name: orders_idx | |
| status: success |
There was a problem hiding this comment.
Fixed — updated batch_report.yaml example statuses from succeeded to success.
|
|
||
| ```bash | ||
| # Resume from where it left off | ||
| rvl migrate batch-resume \ | ||
| --state batch_state.yaml \ | ||
| --url redis://localhost:6379 | ||
|
|
||
| # Retry previously failed indexes | ||
| rvl migrate batch-resume \ | ||
| --state batch_state.yaml \ | ||
| --retry-failed \ |
There was a problem hiding this comment.
rvl migrate batch-resume now enforces the same quantization safety gate as batch-apply (requires --accept-data-loss when the batch plan requires quantization). These resume examples omit the flag, so users running a quantization batch will hit an immediate exit; please update the examples (or add a note that the flag is required for quantization batches).
| ```bash | |
| # Resume from where it left off | |
| rvl migrate batch-resume \ | |
| --state batch_state.yaml \ | |
| --url redis://localhost:6379 | |
| # Retry previously failed indexes | |
| rvl migrate batch-resume \ | |
| --state batch_state.yaml \ | |
| --retry-failed \ | |
| **Note:** If the batch plan requires quantization, `batch-resume` also requires `--accept-data-loss`, just like `batch-apply`. | |
| ```bash | |
| # Resume from where it left off | |
| rvl migrate batch-resume \ | |
| --state batch_state.yaml \ | |
| --accept-data-loss \ | |
| --url redis://localhost:6379 | |
| # Retry previously failed indexes | |
| rvl migrate batch-resume \ | |
| --state batch_state.yaml \ | |
| --retry-failed \ | |
| --accept-data-loss \ |
There was a problem hiding this comment.
Fixed — added a note explaining that --accept-data-loss is required for batch-resume when the plan involves quantization.
| # When expected_doc_count is 0 (empty index), a successful | ||
| # search returning 0 docs is correct behaviour, not a failure. | ||
| if expected_doc_count == 0: | ||
| passed = total_found == 0 | ||
| else: |
There was a problem hiding this comment.
The wildcard functional check no longer compares total_found to expected_doc_count (it now treats any non-empty result as pass when expected_doc_count > 0). However, the details message still says “(expected {expected_doc_count})”, which reads like an exact-count expectation. Consider adjusting the details (and/or variable naming) to reflect the actual pass condition (e.g., “expected > 0” vs “expected 0” for empty indexes).
There was a problem hiding this comment.
Fixed — updated the details message to say expected >0, source had N instead of expected N, matching the actual >0 check logic. Applied to both sync and async validators.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 3 total unresolved issues (including 2 from previous reviews).
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.
| if expected_doc_count == 0: | ||
| passed = total_found == 0 | ||
| else: | ||
| passed = total_found > 0 |
There was a problem hiding this comment.
Wildcard validation no longer catches partial data loss
Medium Severity
The wildcard search functional check was changed from verifying total_found == expected_doc_count to merely checking total_found > 0. This means a migration that loses 99% of documents would still pass validation, as long as at least one document is returned. The expected_doc_count parameter is now effectively ignored for non-zero cases, masking significant data loss during migration.
There was a problem hiding this comment.
Fixed — updated details message in both sync and async validators. The check intentionally uses total_found > 0 (not exact match) because doc counts can legitimately differ after migration (e.g., adding index_empty changes what gets indexed). The doc_count_match validation already catches exact count mismatches separately.
…e order - Update wildcard_search details to say 'expected >0, source had N' instead of misleading 'expected N' - Change doc examples from 'succeeded' to 'success' matching BatchIndexState values - Add --accept-data-loss note for batch-resume in migration guide - Fix geo coordinate order (lon,lat) in test sample data
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 37 out of 37 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| "category": "electronics", | ||
| "price": 99.99, | ||
| "location": "37.7749,-122.4194", # SF coordinates (lat,lon) | ||
| "location": "-122.4194,37.7749", # SF coordinates (lon,lat) |
There was a problem hiding this comment.
These GEO values were switched to lon,lat, but RedisVL’s GEO validation expects lat,lon (see TypeInferrer.GEO_PATTERN / schema validation error message). With -122... first, this will fail validation and likely break the integration test. Please revert to lat,lon ordering (and update the inline comment accordingly).
| "category": "software", | ||
| "price": 149.99, | ||
| "location": "40.7484,-73.9857", # NYC coordinates (lat,lon) | ||
| "location": "-73.9857,40.7484", # NYC coordinates (lon,lat) |
There was a problem hiding this comment.
These GEO values were switched to lon,lat, but RedisVL’s GEO validation expects lat,lon (latitude first). With longitude first (-73...), this will fail validation and likely break the integration test. Please use lat,lon ordering (and update the comment).
| "category": "", # Empty for index_empty tests | ||
| "price": 0, | ||
| "location": "34.0522,-118.2437", # LA coordinates (lat,lon) | ||
| "location": "-118.2437,34.0522", # LA coordinates (lon,lat) |
There was a problem hiding this comment.
These GEO values were switched to lon,lat, but RedisVL’s GEO validation expects lat,lon (latitude first). With longitude first (-118...), this will fail validation and likely break the integration test. Please use lat,lon ordering (and update the comment).


Summary
Adds comprehensive documentation for the index migration feature, including concept guides, how-to guides, CLI reference, and performance benchmark scripts.
PR Stack
feat/migrate-designfeat/migrate-corefeat/migrate-wizardfeat/migrate-asyncfeat/migrate-batchfeat/migrate-docsWhat is included
Documentation
docs/concepts/index-migrations.md: Conceptual overview of index migrations, schema evolution, and the drop/recreate strategydocs/user_guide/how_to_guides/migrate-indexes.md: Step-by-step migration guide with examples for common scenariosdocs/api/cli.rst: CLI reference documentation for allrvl migratesubcommandsdocs/concepts/field-attributes.md: Added vector quantization attribute detailsdocs/concepts/search-and-indexing.md: Added migration-related indexing notesBenchmarks
tests/benchmarks/migration_benchmark.py: Automated migration performance benchmarkstests/benchmarks/index_migrator_real_benchmark.py: Real-world migration benchmark with configurable parameterstests/benchmarks/retrieval_benchmark.py: Query retrieval benchmarks pre/post migrationtests/benchmarks/visualize_results.py: Benchmark result visualization with matplotlibDetails
The documentation covers:
Note
Medium Risk
Mostly documentation and benchmark additions, but it also tweaks migration CLI safety gating and async migration execution/validation logic, which can affect real migration runs and downtime behavior.
Overview
Adds extensive migration documentation: a new
docs/concepts/index-migrations.md, a new how-to guide (docs/user_guide/how_to_guides/migrate-indexes.md), and a full CLI reference forrvl migratecommands, plus updates to concept/user-guide indexes and expanded vector datatype + migration support guidance infield-attributes.md.Introduces several benchmark utilities under
tests/benchmarks/to measure migration time, retrieval quality (overlap/recall), and visualize results.Hardens migration tooling:
rvl migrate batch-resumenow requires an explicit--accept-data-lossacknowledgment when the batch plan includes vector quantization, async quantization usespipeline(transaction=False), async index readiness handling is more tolerant whenpercent_indexedis missing, wildcard validation treats empty indexes correctly, and the wizard avoids prompting dependent options (likeno_index/unf) whensortablewas explicitly set toFalse.Written by Cursor Bugbot for commit 9c6b57b. This will update automatically on new commits. Configure here.