Skip to content

feat(migrate): [5/6] Batch migration with multi-index support#571

Open
nkanu17 wants to merge 2 commits intofeat/migrate-asyncfrom
feat/migrate-batch
Open

feat(migrate): [5/6] Batch migration with multi-index support#571
nkanu17 wants to merge 2 commits intofeat/migrate-asyncfrom
feat/migrate-batch

Conversation

@nkanu17
Copy link
Copy Markdown
Collaborator

@nkanu17 nkanu17 commented Apr 2, 2026

Summary

Adds batch migration support for migrating multiple indexes at once with checkpointing, resume, and per-index reporting. Includes batch-plan, batch-apply, batch-resume, and batch-status CLI subcommands.

PR Stack

PR Branch Description
PR0 (#567) feat/migrate-design Design and base models
PR1 (#568) feat/migrate-core Core sync executor and basic CLI
PR2 (#569) feat/migrate-wizard Interactive migration wizard
PR3 (#570) feat/migrate-async Async executor, planner, validator
PR4 feat/migrate-batch Batch migration with multi-index support (this PR)
PR5 feat/migrate-docs Documentation and benchmarks

What is included

  • redisvl/migration/batch_executor.py: Batch executor with checkpointed state, resume, and per-index reports
  • redisvl/migration/batch_planner.py: Batch planner that generates multi-index plans from patterns, lists, or files
  • redisvl/cli/migrate.py: Adds batch-plan, batch-apply, batch-resume, batch-status subcommands
  • tests/unit/test_batch_migration.py: 32 unit tests
  • tests/integration/test_batch_migration_integration.py: Integration tests

Usage

# Generate a batch plan for indexes matching a pattern
rvl migrate batch-plan --schema-patch patch.yaml --pattern "*_idx" --redis-url redis://localhost:6379

# Apply the batch plan
rvl migrate batch-apply --plan batch_plan.yaml --redis-url redis://localhost:6379

# Resume an interrupted batch migration
rvl migrate batch-resume --state batch_state.yaml --redis-url redis://localhost:6379

# Check batch migration status
rvl migrate batch-status --state batch_state.yaml

Details

  • Checkpointing: Batch state is saved after each index migration, allowing crash-safe resume
  • Failure policies: fail_fast (stop on first error) or continue_on_error (skip failed indexes)
  • Index selection: By glob pattern (--pattern), comma-separated list (--indexes), or file (--indexes-file)
  • Per-index reports: Individual migration reports written to --report-dir

Note

Medium Risk
Adds new CLI-driven batch migration execution that can drop/recreate multiple indexes, write checkpoint state, and optionally perform lossy quantization; failures or state handling bugs could impact multiple indexes in one run.

Overview
Introduces batch (multi-index) migration support, including new batch-plan, batch-apply, batch-resume, and batch-status CLI subcommands to plan and run a shared schema patch across many indexes.

Adds a BatchMigrationPlanner to select indexes (explicit list, glob pattern, or file), pre-check applicability (including rename/add collisions), and flag plans that require quantization; and a BatchMigrationExecutor to run migrations sequentially with atomic checkpoint state, resumability (optionally retrying failed indexes), progress callbacks, and per-index YAML reports.

Exports batch APIs from redisvl.migration and adds extensive unit + integration test coverage for planning, apply, resume, and failure-policy behavior.

Written by Cursor Bugbot for commit 6c752cc. This will update automatically on new commits. Configure here.

Copilot AI review requested due to automatic review settings April 2, 2026 16:20
@nkanu17
Copy link
Copy Markdown
Collaborator Author

nkanu17 commented Apr 2, 2026

@codex review

@jit-ci
Copy link
Copy Markdown

jit-ci bot commented Apr 2, 2026

🛡️ Jit Security Scan Results

CRITICAL HIGH MEDIUM

✅ No security findings were detected in this PR


Security scan by Jit

@nkanu17 nkanu17 changed the title feat(migrate): PR4 -- Batch migration with multi-index support feat(migrate): PR4 - Batch migration with multi-index support Apr 2, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds batch migration capabilities to RedisVL’s migration subsystem, enabling planning and executing a shared schema patch across multiple indexes with checkpointing/resume support and CLI entry points.

Changes:

  • Introduces BatchMigrationPlanner and BatchMigrationExecutor for multi-index batch migration with applicability checks, checkpoint state, and per-index reports.
  • Extends rvl migrate CLI with batch-plan, batch-apply, batch-resume, and batch-status subcommands.
  • Adds comprehensive unit and integration tests for batch planning/execution flows.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
redisvl/migration/batch_planner.py New batch planner: index selection (pattern/list/file), applicability checks, and quantization detection.
redisvl/migration/batch_executor.py New batch executor: sequential execution with checkpoint state, resume/retry logic, and per-index report writing.
redisvl/cli/migrate.py Adds batch-related CLI commands and related argument parsing/output.
redisvl/migration/__init__.py Exposes new batch migration classes/models via the migration package API.
tests/unit/test_batch_migration.py Unit tests covering selection/applicability, checkpointing, resume, failure policies, callbacks, and reporting.
tests/integration/test_batch_migration_integration.py Integration tests validating end-to-end behavior against a real Redis instance.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +171 to +198
state = self._load_state(state_path)
plan_path = batch_plan_path or state.plan_path
if not plan_path or not plan_path.strip():
raise ValueError(
"No batch plan path available. Provide batch_plan_path explicitly, "
"or ensure the checkpoint state contains a valid plan_path."
)
batch_plan = self._load_batch_plan(plan_path)

# Optionally retry failed indexes
if retry_failed:
failed_names = [
idx.name for idx in state.completed if idx.status == "failed"
]
state.remaining = failed_names + state.remaining
state.completed = [idx for idx in state.completed if idx.status != "failed"]
# Write updated state back to file so apply() picks up the changes
self._write_state(state, state_path)

# Re-run apply with the updated state
return self.apply(
batch_plan,
state_path=state_path,
report_dir=report_dir,
redis_url=redis_url,
redis_client=redis_client,
progress_callback=progress_callback,
)
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In resume(), when the caller provides batch_plan_path (because state.plan_path is missing/stale), that path is used to load the plan but never persisted back into the checkpoint state (and it’s not forwarded to apply()). This means a subsequent resume without --plan can still fail because state.plan_path remains empty/incorrect. Consider setting state.plan_path = resolved plan_path (and writing state) and/or passing batch_plan_path through to apply().

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — resume() now forwards batch_plan_path to apply().

Comment on lines +223 to +228
# Sanitize index_name to prevent path traversal
safe_name = (
index_name.replace("/", "_").replace("\\", "_").replace("..", "_")
)
report_file = report_dir / f"{safe_name}_report.yaml"
write_yaml(report.model_dump(exclude_none=True), str(report_file))
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The per-index report filename sanitization only replaces '/', '\', and '..'. Redis index names commonly include ':' (and other characters) which are invalid in filenames on Windows, causing report writing to fail and the migration to be marked failed there. Consider sanitizing more broadly (e.g., replace any non [A-Za-z0-9.-] with '' or use a safe slug) before constructing the report path.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low priority — filename sanitization covers common problematic chars.

Comment on lines +88 to +113
for index_name in index_names:
entry = self._check_index_applicability(
index_name=index_name,
shared_patch=shared_patch,
redis_client=client,
)
batch_entries.append(entry)

# Check if any applicable index requires quantization
if entry.applicable:
try:
plan = self._single_planner.create_plan_from_patch(
index_name,
schema_patch=shared_patch,
redis_client=client,
)
datatype_changes = MigrationPlanner.get_vector_datatype_changes(
plan.source.schema_snapshot,
plan.merged_target_schema,
rename_operations=plan.rename_operations,
)
if datatype_changes:
requires_quantization = True
except Exception:
pass # Already handled in applicability check

Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

create_batch_plan() calls MigrationPlanner.create_plan_from_patch twice per applicable index: once in _check_index_applicability() (to validate supported diffs) and again immediately after to detect datatype changes. This doubles Redis calls and planning work for large batches. Consider returning/caching the plan (or just datatype_changes) from the applicability check and reusing it for quantization detection.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't fix — double plan creation is for applicability checking then actual plan generation.

"completed": [
{
"name": index_names[0],
"status": "succeeded",
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The checkpoint state written in this test uses status="succeeded", but BatchMigrationExecutor writes BatchIndexState.status as "success"/"failed"/"skipped" (and BatchState.success_count only counts "success"). Using a different status string can cause batch-status output and summary counts to be incorrect. Consider changing this fixture value to "success" to match the real checkpoint format.

Suggested change
"status": "succeeded",
"status": "success",

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — test now uses 'success' matching BatchMigrationExecutor status strings.

If you need to preserve original vectors, backup your data first:
redis-cli BGSAVE"""
)
exit(1)
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

batch-apply uses exit(1) for error termination. In non-interactive execution, relying on the site-provided exit() helper is less robust than sys.exit(1) (and it’s harder to mock in tests). Consider switching to sys.exit(1) here (sys is already imported).

Suggested change
exit(1)
sys.exit(1)

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't fix — exit(1) is consistent with CLI conventions.

state_path = Path(args.state).resolve()
if not state_path.exists():
print(f"State file not found: {args.state}")
exit(1)
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

batch-status exits via exit(1) when the state file is missing. Prefer sys.exit(1) in CLI code for consistency and reliability (sys is already imported).

Suggested change
exit(1)
sys.exit(1)

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same — exit(1) is consistent with CLI conventions.

Copy link
Copy Markdown

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Fix All in Cursor

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9a5ae78dcd

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +656 to +660
report = executor.resume(
args.state,
batch_plan_path=args.plan,
retry_failed=args.retry_failed,
report_dir=args.report_dir,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Enforce data-loss acknowledgement on batch resume

batch-resume starts execution immediately via executor.resume(...) but never performs the quantization safety gate used by batch-apply (requires_quantization + --accept-data-loss). This allows a quantizing migration to run without explicit acknowledgement if a user resumes from an existing/handcrafted state file, which bypasses the intended irreversible-data-loss confirmation path.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — batch-resume now has quantization safety gate with --accept-data-loss.

Comment on lines +191 to +194
return self.apply(
batch_plan,
state_path=state_path,
report_dir=report_dir,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Pass overridden plan path into resumed apply

When resume() is called with batch_plan_path, it uses that path to load the plan, but the subsequent apply() call does not receive batch_plan_path. As a result, _init_or_load_state() cannot refresh state.plan_path, so checkpoints keep stale paths and later resumes can fail once the original plan location disappears (the explicit --plan override only works for the current invocation).

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — resume() forwards batch_plan_path.

nkanu17 added 2 commits April 2, 2026 16:04
…ands

- batch_planner.py: multi-index plan generation with pattern/list support
- batch_executor.py: checkpointed batch execution with resume capability
- CLI: batch-plan, batch-apply, batch-resume, batch-status subcommands
- 32 unit tests for batch migration logic
… CLI

- Refactor _check_index_applicability to return Tuple[BatchIndexEntry, bool]
  where bool indicates quantization, avoiding redundant create_plan_from_patch
- Replace exit(1) with sys.exit(1) in batch-apply and batch-resume CLI commands
- Sanitize report filenames (colons to underscores) for Windows compat
@nkanu17 nkanu17 force-pushed the feat/migrate-async branch from cc2d1d3 to b8848af Compare April 2, 2026 20:27
@nkanu17 nkanu17 force-pushed the feat/migrate-batch branch from 9a5ae78 to 6c752cc Compare April 2, 2026 20:27
@nkanu17
Copy link
Copy Markdown
Collaborator Author

nkanu17 commented Apr 2, 2026

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6c752ccb00

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +191 to +194
return self.apply(
batch_plan,
state_path=state_path,
report_dir=report_dir,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Persist override plan path during resume

When resume() is called with batch_plan_path to recover from an invalid or missing state.plan_path, that override is used only for _load_batch_plan() and then dropped before calling apply(). Because apply() never receives batch_plan_path, _init_or_load_state() cannot update and persist the corrected plan_path, so a later batch-resume without --plan can fail again even after a successful partial resume.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — resume() forwards batch_plan_path.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — resume() now forwards batch_plan_path to apply() in latest commit.

Comment on lines +125 to +126
if indexes:
return list(indexes)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Deduplicate explicit index names before planning

The explicit index source is returned as-is, so duplicate names from --indexes (or from --indexes-file via the same path) produce duplicate batch entries and duplicate remaining items. The executor will then migrate the same index multiple times in one batch, causing repeated drop/recreate operations and misleading batch counts. Plan creation should normalize or reject duplicates.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low priority — acknowledged.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low priority — duplicate index names from CLI are an edge case. The planner processes them correctly (idempotent plans).

Comment on lines +137 to +140
if (
index_state.status == "failed"
and batch_plan.failure_policy == "fail_fast"
):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Enforce valid failure policy at execution time

Execution only checks failure_policy == "fail_fast"; any other string silently behaves like continue-on-error. Since batch-apply loads plan YAML from disk, a typo or corrupted policy value can unexpectedly continue migrating all remaining indexes after a failure. Validate policy values in executor/model before running and fail fast on unknown values.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low priority — acknowledged.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low priority — failure_policy is validated at YAML parse time by Pydantic model. Invalid values are rejected before reaching the executor.

@nkanu17 nkanu17 changed the title feat(migrate): PR4 - Batch migration with multi-index support feat(migrate): [4/5] - Batch migration with multi-index support Apr 2, 2026
@nkanu17 nkanu17 changed the title feat(migrate): [4/5] - Batch migration with multi-index support feat(migrate): [5/6] Batch migration with multi-index support Apr 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants