<fix>[xinfini]: fail blacklist to prevent split-brain on VM start#3888
Open
ZStack-Robot wants to merge 1 commit into5.5.1from
Open
<fix>[xinfini]: fail blacklist to prevent split-brain on VM start#3888ZStack-Robot wants to merge 1 commit into5.5.1from
ZStack-Robot wants to merge 1 commit into5.5.1from
Conversation
XInfiniStorageController.blacklist was a no-op (// todo, comp.success()). When ExternalPrimaryStorageKvmFactory's beforeStartVmOnKvm fell back to blacklist after a deactivate failure, the no-op silently allowed the VM to start while a stale client on another host still held write access, risking a split-brain on the volume. 1. Why? xinfini does not implement volume path isolation yet. The blacklist no-op masked the missing capability and let the caller proceed as if the stale client had been fenced. 2. How? Throw OperationFailureException from blacklist with a clear message. Because beforeStartVmOnKvm invokes blacklist with NopeCompletion (which ignores fail()), only an exception can abort the VM start. This makes the unsupported case explicit and conservative until xinfini supports path isolation. 3. Side effects? VM start is aborted when an old active client cannot be deactivated on xinfini. This is the safe behavior; previously such VMs would start with split-brain risk. # Summary of changes (by module): - plugin/xinfini: throw OperationFailureException in blacklist instead of silently returning success. Related: ZSTAC-84963 Change-Id: Ib0092f39a7b23cf7b06442ac9616cb2ce39240b7
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml) Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
Walkthrough
Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Review rate limit: 4/5 reviews remaining, refill in 12 minutes. Comment |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
ZSTAC-84963
Root Cause
XInfiniStorageController.blacklist是// todo空操作,直接comp.success()。ExternalPrimaryStorageKvmFactory.beforeStartVmOnKvm在目标 host 启动 VM 前会查询存储侧活跃 client,若发现旧 host 仍持有 volume,会先nodeSvc.deactivate(...),失败时 fallback 到nodeSvc.blacklist(...)兜底。xinfini 的 blacklist 假装成功,导致旧 host 仍可写入 volume,新 host 同时启动,发生脑裂。注意:
beforeStartVmOnKvm调用 blacklist 时使用NopeCompletion(忽略comp.fail()),所以仅靠comp.fail()不能阻断 VM 启动,必须抛异常。Changes
plugin/xinfiniXInfiniStorageController.blacklist改为抛OperationFailureException,明确拒绝并阻断 VM 启动Test
Behavior
Related: ZSTAC-84963
sync from gitlab !9768