feat(migrate): [4/7] Batch migration with multi-index support#563
feat(migrate): [4/7] Batch migration with multi-index support#563nkanu17 wants to merge 8 commits intofeat/migrate-asyncfrom
Conversation
🛡️ Jit Security Scan Results✅ No security findings were detected in this PR
Security scan by Jit
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e05293097f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review |
There was a problem hiding this comment.
Pull request overview
Adds batch migration capabilities to redisvl.migration, enabling users to plan and execute migrations across multiple RediSearch indexes using a shared schema patch, with ordered execution, checkpoint/resume, and batch-level reporting.
Changes:
- Introduces
BatchMigrationPlannerfor multi-index selection (explicit list, file, glob pattern) and per-index applicability detection. - Introduces
BatchMigrationExecutorfor sequential batch execution with checkpoint state tracking, resume, failure policies, per-index reports, and batch summary reporting. - Adds unit + integration test coverage for planning, execution, checkpointing/resume, failure policies, and progress callbacks.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
redisvl/migration/batch_planner.py |
New batch planner to select indexes and generate a BatchPlan from a shared schema patch. |
redisvl/migration/batch_executor.py |
New batch executor to apply batch plans with checkpointing, resume, and reporting. |
redisvl/migration/__init__.py |
Exports batch planner/executor and batch models from the migration package. |
tests/unit/test_batch_migration.py |
Unit tests for batch planning and execution behavior using mocks. |
tests/integration/test_batch_migration_integration.py |
End-to-end batch migration integration tests against real Redis. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e05293097f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e05293097f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
- Fix status mismatch: executor writes 'success' to match BatchState.success_count - Pass rename_operations to get_vector_datatype_changes - Validate failure_policy early (reject unknown values) - Make update_fields applicability rename-aware - Fix progress position during resume (correct offset) - Fix fail-fast: leave remaining in state for checkpoint resume - Atomic checkpoint writes (write to .tmp then rename) - Sanitize index_name in report filenames (path traversal) - Add assert guard for fnmatch pattern type
Remove unused Path, MagicMock, and patch imports.
0087dcf to
8642ec7
Compare
e052930 to
634cfa1
Compare
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 634cfa110f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
8642ec7 to
7a1ef9a
Compare
- Fix status mismatch: executor writes 'success' to match BatchState.success_count - Pass rename_operations to get_vector_datatype_changes - Validate failure_policy early (reject unknown values) - Make update_fields applicability rename-aware - Fix progress position during resume (correct offset) - Fix fail-fast: leave remaining in state for checkpoint resume - Atomic checkpoint writes (write to .tmp then rename) - Sanitize index_name in report filenames (path traversal) - Add assert guard for fnmatch pattern type
Remove unused Path, MagicMock, and patch imports.
634cfa1 to
cecfd9d
Compare
- Add rename target collision validation in batch applicability check - Propagate infrastructure errors (ConnectionError, TimeoutError) instead of silently marking as not applicable
|
@codex review |
1 similar comment
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cecfd9d979
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
b96771b to
ef8fca8
Compare
- Fix status mismatch: executor writes 'success' to match BatchState.success_count - Pass rename_operations to get_vector_datatype_changes - Validate failure_policy early (reject unknown values) - Make update_fields applicability rename-aware - Fix progress position during resume (correct offset) - Fix fail-fast: leave remaining in state for checkpoint resume - Atomic checkpoint writes (write to .tmp then rename) - Sanitize index_name in report filenames (path traversal) - Add assert guard for fnmatch pattern type
Remove unused Path, MagicMock, and patch imports.
- Add rename target collision validation in batch applicability check - Propagate infrastructure errors (ConnectionError, TimeoutError) instead of silently marking as not applicable
- Account for renames before rejecting field additions in batch applicability check - Fields being renamed away free their name for new additions
9d0329b to
0f42737
Compare
- Update plan_path when loading existing state if caller provides a new path - Handles cases where original path was empty or pointed to deleted temp dir
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0f42737c9f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| except ( | ||
| ConnectionError, | ||
| OSError, | ||
| TimeoutError, | ||
| redis.exceptions.ConnectionError, |
There was a problem hiding this comment.
Propagate Redis timeouts instead of marking indexes skipped
The infrastructure-error escape hatch in _check_index_applicability omits redis.exceptions.TimeoutError, so Redis timeouts fall into the broad except Exception path and are reported as applicable=False with a skip reason. In a transient outage, this silently converts operational failures into “not applicable” indexes, producing an incomplete batch plan instead of failing fast and retrying later.
Useful? React with 👍 / 👎.
| return self.apply( | ||
| batch_plan, | ||
| state_path=state_path, | ||
| report_dir=report_dir, |
There was a problem hiding this comment.
Persist overridden batch plan path during resume
resume() loads plan_path = batch_plan_path or state.plan_path, but the subsequent apply() call does not pass that resolved plan_path. When callers provide batch_plan_path to recover from a stale checkpoint path, the state file is never updated with the override; if execution stops again before finishing, the next resume without an explicit path can fail by reusing the old invalid plan_path.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 7 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # --- NEW: validate failure_policy early --- | ||
| """Create a batch migration plan for multiple indexes. | ||
|
|
There was a problem hiding this comment.
The function docstring is not the first statement in create_batch_plan() because it is preceded by a comment, so Python won’t treat it as a real docstring (and the string literal becomes a no-op expression). Move the docstring to be the first statement in the function body (place the comment below it) to restore introspection/help() behavior and avoid lint warnings.
| # Pattern matching -- pattern is guaranteed non-None at this point | ||
| assert pattern is not None, "pattern must be set when reaching fnmatch" | ||
| all_indexes = list_indexes(redis_client=redis_client) | ||
| matched = [idx for idx in all_indexes if fnmatch.fnmatch(idx, pattern)] |
There was a problem hiding this comment.
Pattern matching uses fnmatch.fnmatch(), which is OS-dependent (case-insensitive on Windows). Redis index names are typically case-sensitive, so this can produce inconsistent selection across platforms. Prefer fnmatch.fnmatchcase() (or explicitly normalize) to make matching deterministic.
| matched = [idx for idx in all_indexes if fnmatch.fnmatch(idx, pattern)] | |
| matched = [idx for idx in all_indexes if fnmatch.fnmatchcase(idx, pattern)] |
| state = self._load_state(state_path) | ||
| plan_path = batch_plan_path or state.plan_path | ||
| if not plan_path or not plan_path.strip(): | ||
| raise ValueError( | ||
| "No batch plan path available. Provide batch_plan_path explicitly, " | ||
| "or ensure the checkpoint state contains a valid plan_path." | ||
| ) | ||
| batch_plan = self._load_batch_plan(plan_path) | ||
|
|
||
| # Optionally retry failed indexes | ||
| if retry_failed: | ||
| failed_names = [ | ||
| idx.name for idx in state.completed if idx.status == "failed" | ||
| ] | ||
| state.remaining = failed_names + state.remaining | ||
| state.completed = [idx for idx in state.completed if idx.status != "failed"] | ||
| # Write updated state back to file so apply() picks up the changes | ||
| self._write_state(state, state_path) | ||
|
|
||
| # Re-run apply with the updated state | ||
| return self.apply( | ||
| batch_plan, | ||
| state_path=state_path, | ||
| report_dir=report_dir, | ||
| redis_url=redis_url, | ||
| redis_client=redis_client, | ||
| progress_callback=progress_callback, | ||
| ) |
There was a problem hiding this comment.
resume() resolves plan_path but doesn’t pass it through to apply() as batch_plan_path. If the checkpoint state has an empty/stale plan_path, resume will work once but the state file will remain unchanged, causing subsequent resume calls without an explicit batch_plan_path to fail again. Pass batch_plan_path=plan_path into apply() (and/or persist it to state) so the checkpoint becomes self-contained.
| # Check failure policy | ||
| if ( | ||
| index_state.status == "failed" | ||
| and batch_plan.failure_policy == "fail_fast" | ||
| ): | ||
| # Leave remaining indexes in state.remaining so that |
There was a problem hiding this comment.
BatchMigrationExecutor.apply() relies on string equality checks for failure_policy (only 'fail_fast' has special handling). If a BatchPlan is loaded/edited and contains a typo/unknown policy, behavior silently degrades to continue-on-error. Validate batch_plan.failure_policy against the supported set at the start of apply()/resume() and raise a ValueError on invalid values (mirroring BatchMigrationPlanner).
| failed_names = [ | ||
| idx.name for idx in state.completed if idx.status == "failed" | ||
| ] | ||
| state.remaining = failed_names + state.remaining |
There was a problem hiding this comment.
When retry_failed=True, state.remaining is rebuilt via failed_names + state.remaining, which can introduce duplicates if an index is already present in remaining (or if the checkpoint contains duplicates). That would re-run migrations unexpectedly and can make progress reporting confusing. Consider de-duplicating while preserving order (e.g., stable unique) when rebuilding remaining.
| state.remaining = failed_names + state.remaining | |
| # Prepend failed names to remaining, de-duplicating while preserving order | |
| combined = failed_names + state.remaining | |
| state.remaining = list(dict.fromkeys(combined)) |
| if idx_state.status in ("succeeded", "success"): | ||
| succeeded += 1 | ||
| elif idx_state.status == "failed": | ||
| failed += 1 | ||
| else: | ||
| skipped += 1 | ||
|
|
There was a problem hiding this comment.
Status handling is inconsistent: BatchIndexState/BatchIndexReport document 'success' but _build_batch_report also treats 'succeeded' as success. This makes counting/normalization error-prone (and unknown statuses fall into the 'skipped' bucket). Consider normalizing status values when loading/writing state (or use a constrained enum/Literal) so reports and state files use a single canonical vocabulary.
| "completed": [ | ||
| { | ||
| "name": index_names[0], | ||
| "status": "succeeded", |
There was a problem hiding this comment.
The checkpoint state written in this integration test uses status='succeeded', but BatchMigrationExecutor writes 'success' (and BatchIndexState docs list 'success'). Using a different status value makes the test state less representative and can mask issues in code that relies on canonical statuses (e.g., BatchState.success_count). Update the test checkpoint to use 'success' unless you are explicitly testing backward-compat normalization.
| "status": "succeeded", | |
| "status": "success", |
- Pass existing snapshot to create_plan_from_patch to avoid double Redis round-trip - Use _get_client() instead of _redis_client for lazy async client initialization - Remap datatype_changes keys to post-rename field names before quantization - Only resume from completed checkpoint when source index is actually gone
…ion, ordered execution, and state tracking Batch planner and executor for migrating multiple indexes in a single operation. Supports glob-based index selection, ordered execution with per-index state tracking, checkpoint/resume semantics, and batch-level reporting with fail-fast or continue-on-error policies. Includes batch unit and integration tests.
- Fix status mismatch: executor writes 'success' to match BatchState.success_count - Pass rename_operations to get_vector_datatype_changes - Validate failure_policy early (reject unknown values) - Make update_fields applicability rename-aware - Fix progress position during resume (correct offset) - Fix fail-fast: leave remaining in state for checkpoint resume - Atomic checkpoint writes (write to .tmp then rename) - Sanitize index_name in report filenames (path traversal) - Add assert guard for fnmatch pattern type
Remove unused Path, MagicMock, and patch imports.
- Add rename target collision validation in batch applicability check - Propagate infrastructure errors (ConnectionError, TimeoutError) instead of silently marking as not applicable
- Account for renames before rejecting field additions in batch applicability check - Fields being renamed away free their name for new additions
- Update plan_path when loading existing state if caller provides a new path - Handles cases where original path was empty or pointed to deleted temp dir
ef8fca8 to
28fb826
Compare
0f42737 to
7a9f38f
Compare
|
@codex review |
|
Closing in favor of restructured PR stack (Option A ordering). |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7a9f38fde1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| except ( | ||
| ConnectionError, | ||
| OSError, | ||
| TimeoutError, | ||
| redis.exceptions.ConnectionError, | ||
| ) as e: |
There was a problem hiding this comment.
Propagate Redis timeout errors during applicability checks
_check_index_applicability intends to re-raise infrastructure failures, but this handler misses redis.exceptions.TimeoutError; in redis-py, that timeout type is distinct from Python's built-in TimeoutError, so it falls through to the generic except Exception and gets converted into applicable=False. A transient Redis timeout will therefore silently mark an index as skipped instead of failing planning, producing incomplete batch plans without surfacing the real connectivity failure.
Useful? React with 👍 / 👎.
| return self.apply( | ||
| batch_plan, | ||
| state_path=state_path, | ||
| report_dir=report_dir, |
There was a problem hiding this comment.
Persist provided plan path when resuming
When resume() is called with an explicit batch_plan_path, the method loads that plan but then calls apply() without forwarding the same path. As a result, _init_or_load_state() keeps the old/empty state.plan_path, so if execution is interrupted again, a later resume() without another explicit path can fail with "No batch plan path available," undermining crash-safe multi-step resume behavior.
Useful? React with 👍 / 👎.
Summary
Adds batch migration support for migrating multiple indexes in a single operation. Includes glob-based index selection, ordered execution, failure policies, and checkpoint-based state tracking for crash-safe resume.
Usage
What is included
redisvl/migration/batch_planner.py):BatchMigrationPlannerthat generates aBatchPlanfrom glob patterns, comma-separated lists, or index files. Validates each index against the schema patch and marks non-applicable ones as skipped.redisvl/migration/batch_executor.py):BatchMigrationExecutorthat runs individual migrations sequentially with per-index error isolation, checkpoint state persistence, and configurable failure policies (fail_fastorcontinue_on_error).BatchPlan,BatchState,BatchReport, and related Pydantic models for serialization.redisvl/migration/__init__.py): Exports for batch classes.tests/unit/test_batch_migration.py): Tests covering batch planning, execution, resume, and failure handling.Details
fail_fastaborts on the first failure;continue_on_errorskips failed indexes and continues.batch_state.yamlafter each index completes, enabling resume from the exact point of interruption.batch-resume --retry-failedretries previously failed indexes.--report-dirdirectory.PR Stack
feat/migrate-corefeat/migrate-executorfeat/migrate-asyncfeat/migrate-batchfeat/migrate-wizardfeat/migrate-cli-docsfeat/migrate-benchmarks