feat(migrate): [5/7] Interactive migration wizard #564
nkanu17 wants to merge 7 commits into feat/migrate-batch
🛡️ Jit Security Scan Results
✅ No security findings were detected in this PR
Security scan by Jit
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b94e57ae6f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
@codex review
Pull request overview
Adds an interactive CLI “migration wizard” for building SchemaPatch changes step-by-step (add/update/remove/rename fields, index name, prefix) and generating a MigrationPlan via the existing migration planner, plus extensive unit tests for vector algorithm/datatype/compression input handling.
Changes:
- Introduces `redisvl.migration.wizard.MigrationWizard` to interactively stage schema changes and output a plan/patch/merged target schema.
- Adds a large unit test suite validating wizard behavior for vector algorithm changes, datatype constraints, distance metrics, and adversarial inputs.
- Exports `MigrationWizard` from `redisvl.migration`.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `redisvl/migration/wizard.py` | New interactive wizard for constructing schema patches and plans with vector-aware prompts. |
| `tests/unit/test_migration_wizard.py` | New unit tests covering interactive input flows, especially vector configuration edge cases. |
| `redisvl/migration/__init__.py` | Re-exports `MigrationWizard` from the migration package. |
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
@codex review
- Filter staged additions from update/rename candidate lists
- Cancel staged-add on remove instead of appending to remove_fields
- Add _filter_staged_adds helper method
- Reflect update_fields in working schema (_apply_staged_changes)
- Prevent no-op vector algorithm update when user enters current value
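The first item in the commit message above can be pictured as a small filter over the working schema. This is a minimal sketch, assuming the schema is a plain dict with a `fields` list of dicts keyed by `name` (which is how the diff hunks in this thread index it). `filter_staged_adds` is a hypothetical free function written for illustration, not the PR's `_filter_staged_adds` method.

```python
from copy import deepcopy
from typing import Any, Dict, Set


def filter_staged_adds(
    schema: Dict[str, Any], staged_add_names: Set[str]
) -> Dict[str, Any]:
    """Return a copy of `schema` without fields that are only staged for addition."""
    filtered = deepcopy(schema)  # never mutate the caller's working schema
    filtered["fields"] = [
        f for f in filtered.get("fields", []) if f["name"] not in staged_add_names
    ]
    return filtered


# Hypothetical working schema: "tags" exists only as a staged addition.
working = {
    "fields": [
        {"name": "title", "type": "text"},
        {"name": "tags", "type": "tag"},
    ]
}
candidates = filter_staged_adds(working, {"tags"})
```

Only fields that already exist in the source index remain in `candidates`, so a staged addition cannot be offered as an update or rename target.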
e052930 to 634cfa1
b94e57a to 541db34
@codex review
634cfa1 to cecfd9d
- Remove queued updates/renames when a field is deleted
- Apply index-level changes (name, prefix) in staged preview schema
- Merge duplicate updates for the same field instead of appending
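The "merge duplicate updates" item above could look roughly like the following. This is a sketch under assumptions: `FieldUpdate` is a hypothetical stand-in exposing the `name`, `type`, and `attrs` attributes that the diff hunks in this thread read, and `stage_update` is an illustrative helper name, not a function from the PR.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class FieldUpdate:  # hypothetical stand-in for the PR's update model
    name: str
    type: Optional[str] = None
    attrs: Dict[str, Any] = field(default_factory=dict)


def stage_update(updates: List[FieldUpdate], new: FieldUpdate) -> None:
    """Merge `new` into an existing update for the same field, else append it."""
    for existing in updates:
        if existing.name == new.name:
            existing.attrs.update(new.attrs)  # later answers win per attribute
            if new.type:
                existing.type = new.type
            return
    updates.append(new)


staged: List[FieldUpdate] = []
stage_update(staged, FieldUpdate("embedding", attrs={"algorithm": "hnsw"}))
stage_update(staged, FieldUpdate("embedding", attrs={"m": 32}))
```

After both calls `staged` holds a single entry for `embedding` carrying both attributes, so the patch never contains two competing updates for one field.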
541db34 to 7eea293
@codex review
9d0329b to 0f42737
- Prevent rename target from colliding with staged field additions
- Show 'y or n' when allow_blank=False, 'y, n, or skip' when allow_blank=True
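The prompt-hint change in the second item can be sketched as follows. Only the `allow_blank` flag and the two hint strings come from the commit message. The function names, and the assumption that a blank answer or the word `skip` means "skip", are illustrative.

```python
from typing import Optional


def yes_no_hint(allow_blank: bool) -> str:
    """Hint text shown to the user, matching the commit message wording."""
    return "y, n, or skip" if allow_blank else "y or n"


def parse_yes_no(raw: str, allow_blank: bool) -> Optional[bool]:
    """Return True/False for y/n, None for a permitted skip; raise ValueError otherwise."""
    answer = raw.strip().lower()
    if answer in ("y", "yes"):
        return True
    if answer in ("n", "no"):
        return False
    # Assumption: an empty answer or "skip" counts as skipping the question.
    if allow_blank and answer in ("", "skip"):
        return None
    raise ValueError(f"Please answer {yes_no_hint(allow_blank)}.")
```

Keeping the hint and the parser keyed on the same flag means the wizard never advertises an answer it would then reject.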
d01542e to f8679ab
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f8679abf14
    changes.update_fields = [
        u for u in changes.update_fields if u.name != field_name
    ]
    changes.rename_fields = [
        r for r in changes.rename_fields if r.old_name != field_name
Drop stale rename/update entries when removing renamed fields
When a user removes a field after it has been renamed, this branch only removes queued renames by old_name and queued updates by exact current name. If the removed field is the rename target (for example title -> headline, then remove headline), the rename operation is left in the patch, so the executor will still rewrite every document field even though the field is ultimately removed from the target schema; in some rename+update orderings this also leaves an update pointing to a now-missing field and can make plan creation fail. The remove path should clear dependent renames/updates for both sides of staged rename chains.
    if compression:
        attrs["compression"] = compression

        # Prompt for REDUCE if LeanVec compression is selected
        if compression.startswith("LeanVec"):
Clear stale `reduce` when changing SVS compression type
Switching an existing SVS field from LeanVec compression to a non-LeanVec option only sets compression and never clears an existing reduce value. Because merge_patch overlays attrs on top of the source field, the old reduce persists, and schema validation then rejects combinations like compression=LVQ8 with reduce set. This causes an interactive wizard flow to generate a patch that fails during plan creation for valid user intent (changing compression type).
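The failure mode described above, and one possible way around it, can be shown with a toy overlay merge. Everything here is an assumption for illustration: `overlay` is not redisvl's `merge_patch`, and treating a `None` value as "delete this key" is an assumed convention that this thread does not confirm for the real planner.

```python
from typing import Any, Dict, Optional


def overlay(source: Dict[str, Any], patch: Dict[str, Any]) -> Dict[str, Any]:
    """Toy overlay merge: patch wins; a None value deletes the key (assumed convention)."""
    merged = {**source, **patch}
    return {k: v for k, v in merged.items() if v is not None}


def compression_attrs(compression: str, reduce: Optional[int] = None) -> Dict[str, Any]:
    """Build patch attrs, always emitting `reduce` so a stale value cannot survive."""
    return {
        "compression": compression,
        # Only LeanVec variants keep a reduce value; everything else clears it.
        "reduce": reduce if compression.startswith("LeanVec") else None,
    }


source = {"algorithm": "svs-vamana", "compression": "LeanVec4x8", "reduce": 64}
naive = overlay(source, {"compression": "LVQ8"})    # stale reduce survives
fixed = overlay(source, compression_attrs("LVQ8"))  # reduce is cleared
```

In `naive` the old `reduce` rides along with `LVQ8`, which is the combination the comment says validation rejects. In `fixed` the patch states the intent explicitly.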
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
    # Apply removes
    removed_names = set(changes.remove_fields)
    working["fields"] = [
        f for f in working["fields"] if f["name"] not in removed_names
    ]

    # Apply renames
    rename_map = {r.old_name: r.new_name for r in changes.rename_fields}
    for field in working["fields"]:
        if field["name"] in rename_map:
            field["name"] = rename_map[field["name"]]
In the working-schema preview, staged removes are applied before staged renames. This can make the wizard’s subsequent prompts/preview diverge from how MigrationPlanner.merge_patch applies changes (renames are applied first there), especially when a user renames a field and then removes it (or vice versa). Consider applying renames before removes here, or normalizing remove_fields through the rename map so the interactive state matches the eventual merged schema.
Suggested change:
    - # Apply removes
    - removed_names = set(changes.remove_fields)
    - working["fields"] = [
    -     f for f in working["fields"] if f["name"] not in removed_names
    - ]
    - # Apply renames
    - rename_map = {r.old_name: r.new_name for r in changes.rename_fields}
    - for field in working["fields"]:
    -     if field["name"] in rename_map:
    -         field["name"] = rename_map[field["name"]]
    + # Apply renames first so working schema matches MigrationPlanner.merge_patch
    + rename_map = {r.old_name: r.new_name for r in changes.rename_fields}
    + for field in working["fields"]:
    +     if field["name"] in rename_map:
    +         field["name"] = rename_map[field["name"]]
    + # Apply removes after renames
    + removed_names = set(changes.remove_fields)
    + working["fields"] = [
    +     f for f in working["fields"] if f["name"] not in removed_names
    + ]
    # Apply updates (reflect attribute changes in working schema)
    update_map = {u.name: u for u in changes.update_fields}
    for field in working["fields"]:
        if field["name"] in update_map:
            upd = update_map[field["name"]]
            if upd.attrs:
                field.setdefault("attrs", {}).update(upd.attrs)
            if upd.type:
                field["type"] = upd.type
Staged updates are applied to the working schema by matching update_map keys to the current field name. If a user stages an update and later stages a rename (or loads a patch with both), the rename will change the field’s name but the update entry will still be keyed by the old name, so the working schema stops reflecting the staged update. This can mislead users and affect later selections. Suggest resolving updates through the rename_map (or rewriting update_fields entries when a rename is added) so updates continue to apply after renames.
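One way to realise the "resolve updates through the rename map" option is to translate each update's name before building the lookup. This is a minimal sketch, assuming plain dicts for fields and updates and a flat `old -> new` rename map with no chains. `apply_updates` is a hypothetical helper, not code from the PR.

```python
from typing import Any, Dict, List


def apply_updates(
    fields: List[Dict[str, Any]],
    updates: List[Dict[str, Any]],
    rename_map: Dict[str, str],
) -> List[Dict[str, Any]]:
    """Apply updates to fields that already carry their post-rename names."""
    # Key each update by the name the field has *after* renames are applied.
    update_map = {rename_map.get(u["name"], u["name"]): u for u in updates}
    for f in fields:
        upd = update_map.get(f["name"])
        if upd is None:
            continue
        if upd.get("attrs"):
            f.setdefault("attrs", {}).update(upd["attrs"])
        if upd.get("type"):
            f["type"] = upd["type"]
    return fields


# The update was staged against "title" before the field was renamed to "headline".
preview = apply_updates(
    fields=[{"name": "headline", "type": "text"}],
    updates=[{"name": "title", "attrs": {"sortable": True}}],
    rename_map={"title": "headline"},
)
```

The alternative mentioned in the comment, rewriting `update_fields` entries at the moment a rename is staged, keeps the lookup trivial but moves the bookkeeping into the rename path.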
    # Also remove any queued updates or renames for this field
    changes.update_fields = [
        u for u in changes.update_fields if u.name != field_name
    ]
    changes.rename_fields = [
        r for r in changes.rename_fields if r.old_name != field_name
When removing a field, queued updates and renames are only cleared by matching update.name == field_name and rename.old_name == field_name. If the field was previously renamed, users will select the new name, leaving behind updates targeting the old name and/or renames that still reference the removed field; this can later cause create_plan_from_patch/merge_patch to raise (update refers to a non-existent field). Consider clearing updates/renames by resolving both old/new names via the current rename map (and also handling rename.new_name matches).
Suggested change:
    - # Also remove any queued updates or renames for this field
    - changes.update_fields = [
    -     u for u in changes.update_fields if u.name != field_name
    - ]
    - changes.rename_fields = [
    -     r for r in changes.rename_fields if r.old_name != field_name
    + # Also remove any queued updates or renames for this field,
    + # including any aliases introduced via staged renames.
    + related_names = {field_name}
    + updated = True
    + # Resolve all names connected by the current rename map
    + while updated:
    +     updated = False
    +     for r in changes.rename_fields:
    +         if r.old_name in related_names or r.new_name in related_names:
    +             before_size = len(related_names)
    +             related_names.update({r.old_name, r.new_name})
    +             if len(related_names) != before_size:
    +                 updated = True
    + # Drop any updates targeting any alias of this field
    + changes.update_fields = [
    +     u for u in changes.update_fields if u.name not in related_names
    + ]
    + # Drop any renames that reference any alias of this field
    + changes.rename_fields = [
    +     r
    +     for r in changes.rename_fields
    +     if r.old_name not in related_names
    +     and r.new_name not in related_names
            "a field with that name is already staged for addition."
        )
    else:
        changes.rename_fields.append(field_rename)
Field renames are appended without checking for an existing rename for the same old_name (or for chained/conflicting renames). Staging multiple renames for the same source field will cause merge_patch to fail (second rename can’t find old_name anymore). Consider de-duplicating/replacing existing entries (or preventing multiple renames per original field) when adding a new FieldRename.
Suggested change:
    - changes.rename_fields.append(field_rename)
    + # Replace any existing rename for the same source field
    + replaced = False
    + for i, existing_rename in enumerate(changes.rename_fields):
    +     if existing_rename.old_name == field_rename.old_name:
    +         changes.rename_fields[i] = field_rename
    +         replaced = True
    +         break
    + if not replaced:
    +     changes.rename_fields.append(field_rename)
    elif action == "3":
        field_name = self._prompt_remove_field(working_schema)
        if field_name:
            # If removing a staged-add, cancel the add instead of
            # appending to remove_fields
            staged_add_names = {f["name"] for f in changes.add_fields}
            if field_name in staged_add_names:
                changes.add_fields = [
                    f for f in changes.add_fields if f["name"] != field_name
                ]
                print(f"Cancelled staged addition of '{field_name}'.")
            else:
                changes.remove_fields.append(field_name)
                # Also remove any queued updates or renames for this field
                changes.update_fields = [
                    u for u in changes.update_fields if u.name != field_name
                ]
                changes.rename_fields = [
                    r for r in changes.rename_fields if r.old_name != field_name
                ]
    elif action == "4":
        # Filter out staged additions from rename candidates
        staged_add_names = {f["name"] for f in changes.add_fields}
        rename_schema = self._filter_staged_adds(
            working_schema, staged_add_names
        )
        field_rename = self._prompt_rename_field(rename_schema)
        if field_rename:
            # Check rename target doesn't collide with staged additions
            if field_rename.new_name in staged_add_names:
                print(
                    f"Cannot rename to '{field_rename.new_name}': "
                    "a field with that name is already staged for addition."
                )
            else:
                changes.rename_fields.append(field_rename)
The wizard has complex interactions between rename/update/remove ordering (e.g., update a field, then rename it; or rename then remove). The current logic is sensitive to those sequences and can leave stale update/rename entries. Adding unit tests that cover these multi-step flows would help prevent regressions and catch merge_patch failures early.
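A test for one such multi-step flow might be shaped like the sketch below. It does not exercise the real `MigrationWizard`, whose prompt-driving API is not shown in this thread. Instead it uses a hypothetical `Staged` container and `stage_remove` helper to show the kind of assertion that would catch stale entries. It handles a single rename per field only, and it assumes the remove should be recorded under the field's original name.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Staged:  # hypothetical stand-in for the wizard's staged changes
    remove_fields: List[str] = field(default_factory=list)
    rename_fields: Dict[str, str] = field(default_factory=dict)  # old -> new
    update_fields: Dict[str, dict] = field(default_factory=dict)  # name -> attrs


def stage_remove(staged: Staged, current_name: str) -> None:
    """Remove a field by its current (possibly renamed) name and drop dependants."""
    # Map the displayed name back to the name the source index knows.
    original = next(
        (old for old, new in staged.rename_fields.items() if new == current_name),
        current_name,
    )
    staged.rename_fields.pop(original, None)
    for alias in (original, current_name):
        staged.update_fields.pop(alias, None)
    staged.remove_fields.append(original)


def test_rename_then_remove_leaves_no_stale_entries() -> None:
    staged = Staged(
        rename_fields={"title": "headline"},
        update_fields={"title": {"sortable": True}},
    )
    stage_remove(staged, "headline")
    assert staged.rename_fields == {}
    assert staged.update_fields == {}
    assert staged.remove_fields == ["title"]


def test_remove_keeps_unrelated_entries() -> None:
    staged = Staged(
        rename_fields={"body": "content"},
        update_fields={"tags": {"separator": "|"}},
    )
    stage_remove(staged, "tags")
    assert staged.rename_fields == {"body": "content"}
    assert staged.update_fields == {}
    assert staged.remove_fields == ["tags"]
```

The same pattern extends to update-then-rename and rename-then-update sequences by asserting on the staged state after each step.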
…ep field operations and algorithm-aware validation
Interactive CLI wizard for building migration plans step by step. Supports adding, removing, renaming, and updating fields with algorithm-aware validation including SVS-VAMANA datatype enforcement, HNSW parameter tuning, and compression/LeanVec configuration. Includes wizard unit tests.
0f42737 to 7a9f38f
f8679ab to 4086c0c
@codex review
Closing in favor of restructured PR stack (Option A ordering).
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4086c0c0e6
    changes.rename_fields = [
        r for r in changes.rename_fields if r.old_name != field_name
    ]
Drop stale renames when removing a renamed field
When a field is removed, the cleanup only drops pending renames where old_name matches the removed name. If a user renames a -> b and then removes b, that rename remains queued, so execution can still perform document-wide field renames even though the target field is removed from the final schema. For hash indexes this means unnecessary full-keyspace writes and leaves unindexed orphaned data under b; the remove path should also clear renames whose new_name is being removed.
    # Apply removes
    removed_names = set(changes.remove_fields)
    working["fields"] = [
        f for f in working["fields"] if f["name"] not in removed_names
    ]
Apply staged renames before staged removes
The staged-schema preview applies removals before renames, so a queued remove_fields entry that references a renamed target (e.g. remove b after a -> b) is not reflected in the working schema. This lets the wizard continue offering that field for updates/renames, and users can build patches that later fail in planning with Cannot update field ... does not exist once planner merge applies rename+remove in its own order. The staged transformation order should match planner semantics.
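The divergence is easy to reproduce on bare field names. This is a toy sketch using plain lists and dicts rather than the wizard's schema objects. The two function names are illustrative, and the `a -> b` then remove `b` scenario is the one from the comment.

```python
from typing import Dict, List, Set


def removes_then_renames(
    names: List[str], renames: Dict[str, str], removes: Set[str]
) -> List[str]:
    """Order used by the staged preview as described in the comment."""
    kept = [n for n in names if n not in removes]
    return [renames.get(n, n) for n in kept]


def renames_then_removes(
    names: List[str], renames: Dict[str, str], removes: Set[str]
) -> List[str]:
    """Order the comment attributes to the planner merge."""
    renamed = [renames.get(n, n) for n in names]
    return [n for n in renamed if n not in removes]


source_fields = ["a", "c"]
staged_renames = {"a": "b"}
staged_removes = {"b"}

preview = removes_then_renames(source_fields, staged_renames, staged_removes)
planner_like = renames_then_removes(source_fields, staged_renames, staged_removes)
```

`preview` still contains `b` because nothing named `b` existed when the removes ran, while `planner_like` does not, so the wizard keeps offering a field that the final schema will not have.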
Cursor Bugbot has reviewed your changes and found 2 potential issues.
    ]
    changes.rename_fields = [
        r for r in changes.rename_fields if r.old_name != field_name
    ]
Removal cleanup misses pre-rename operations, causing planner crash
High Severity
When removing a field that was previously renamed (e.g., "a" renamed to "b", then "b" removed), the cleanup filters changes.update_fields by u.name != field_name and changes.rename_fields by r.old_name != field_name. Since field_name is the post-rename name ("b"), but existing updates reference the pre-rename name ("a") and the rename has old_name="a", neither is cleaned up. The stale FieldUpdate(name="a") and FieldRename(old_name="a", new_name="b") remain in the patch. When the planner's merge_patch processes this, it renames "a"→"b", removes "b", then tries to resolve the update for "a" through the rename map to "b" — which no longer exists — raising a ValueError.
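The sequence described above can be replayed with a toy merge that follows the same three steps (rename, remove, resolve updates through the rename map). `toy_merge` is written for illustration only and is not redisvl's `merge_patch`. The error message is likewise made up.

```python
from typing import Dict, List, Optional, Set


def toy_merge(
    field_names: List[str],
    renames: Dict[str, str],
    removes: Set[str],
    updates: List[str],
) -> List[str]:
    """Apply renames, then removes, then check every update target still exists."""
    names = [renames.get(n, n) for n in field_names]
    names = [n for n in names if n not in removes]
    for target in updates:
        resolved = renames.get(target, target)
        if resolved not in names:
            raise ValueError(
                f"Cannot update field '{target}': '{resolved}' does not exist"
            )
    return names


# Stale patch left behind by the wizard: rename a -> b, remove b, update a.
error: Optional[str] = None
try:
    toy_merge(["a"], renames={"a": "b"}, removes={"b"}, updates=["a"])
except ValueError as exc:
    error = str(exc)
```

Dropping the stale update and rename at removal time, as the review comments suggest, leaves a patch that passes the same check.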
    if upd.attrs:
        field.setdefault("attrs", {}).update(upd.attrs)
    if upd.type:
        field["type"] = upd.type
Working schema ignores updates referencing pre-rename field names
Medium Severity
In _apply_staged_changes, after renames are applied (changing field names in the working copy), the update resolution builds update_map keyed by u.name — which may still reference the pre-rename name. Since the field in working["fields"] now has its post-rename name, the lookup field["name"] in update_map fails to match. This means staged attribute updates are silently dropped from the working schema whenever the same field was also renamed, causing the wizard to display an inaccurate preview of the schema state.


Summary
Adds an interactive, terminal-based migration wizard that guides users through building a migration plan step by step. The wizard introspects the live index, presents the current schema, and walks through field operations (add, remove, rename, modify) with algorithm-aware validation.
Usage
What is included
- `redisvl/migration/wizard.py`: `MigrationWizard` class.
- `redisvl/migration/__init__.py`: Exports for `MigrationWizard`.
- `tests/unit/test_migration_wizard.py`: Tests for wizard field operations, validation, and edge cases.
Details
The wizard uses a menu-driven interface with numbered choices:
PR Stack
- feat/migrate-core
- feat/migrate-executor
- feat/migrate-async
- feat/migrate-batch
- feat/migrate-wizard
- feat/migrate-cli-docs
- feat/migrate-benchmarks