-
Notifications
You must be signed in to change notification settings - Fork 0
<fix>[allocator]: reject designated cross-cluster migration when strategy is Soft #3190
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…tegy is Soft When user explicitly designates a target host for VM migration, ResourceBindingAllocatorFlow should reject cross-cluster migration regardless of whether strategy is Hard or Soft. Before: Soft strategy would relax the constraint and allow cross-cluster After: Designated cross-cluster migration fails even with Soft strategy The Soft strategy now only relaxes constraints during system auto-allocation when no hosts are available in the bound cluster. Resolves: ZSTAC-75428 Change-Id: I6b76656d7069736a6c707262657465696c717668
概览本次更改在资源分配流程中添加了针对用户指定主机的显式验证逻辑,当指定主机不在绑定资源中时立即失败返回,同时新增一套全面的VM迁移集成测试用例来验证资源绑定场景下的迁移行为。 变更清单
代码评审工作量评估🎯 3 (中等) | ⏱️ ~20 分钟 庆祝诗
Important Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional. ❌ Failed checks (1 error, 1 warning)
✅ Passed checks (1 passed)
✨ Finishing touches
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml)
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
compute/src/main/java/org/zstack/compute/allocator/ResourceBindingAllocatorFlow.javatest/src/test/groovy/org/zstack/test/integration/kvm/vm/migrate/ResourceBindingMigrateVmCase.groovy
🧰 Additional context used
📓 Path-based instructions (2)
**/*.*
⚙️ CodeRabbit configuration file
**/*.*: - 代码里不应当有有中文,包括报错、注释等都应当使用正确的、无拼写错误的英文来写
Files:
compute/src/main/java/org/zstack/compute/allocator/ResourceBindingAllocatorFlow.javatest/src/test/groovy/org/zstack/test/integration/kvm/vm/migrate/ResourceBindingMigrateVmCase.groovy
**/*.java
⚙️ CodeRabbit configuration file
**/*.java: ## 1. API 设计要求
- API 命名:
- API 名称必须唯一,不能重复。
- API 消息类需要继承
APIMessage;其返回类必须继承APIReply或APIEvent,并在注释中用@RestResponse进行标注。- API 消息上必须添加注解
@RestRequest,并满足如下规范:
path:
- 针对资源使用复数形式。
- 当 path 中引用消息类变量时,使用
{variableName}格式。- HTTP 方法对应:
- 查询操作 →
HttpMethod.GET- 更新操作 →
HttpMethod.PUT- 创建操作 →
HttpMethod.POST- 删除操作 →
HttpMethod.DELETE- API 类需要实现
__example__方法以便生成 API 文档,并确保生成对应的 Groovy API Template 与 API Markdown 文件。
2. 命名与格式规范
类名:
- 使用 UpperCamelCase 风格。
- 特殊情况:
- VO/AO/EO 类型类除外。
- 抽象类采用
Abstract或Base前缀/后缀。- 异常类应以
Exception结尾。- 测试类需要以
Test或Case结尾。方法名、参数名、成员变量和局部变量:
- 使用 lowerCamelCase 风格。
常量命名:
- 全部大写,使用下划线分隔单词。
- 要求表达清楚,避免使用含糊或不准确的名称。
包名:
- 统一使用小写,使用点分隔符,每个部分应是一个具有自然语义的英文单词(参考 Spring 框架的结构)。
命名细节:
- 避免在父子类或同一代码块中出现相同名字的成员或局部变量,防止混淆。
- 命名缩写:
- 不允许使用不必要的缩写,如:
AbsSchedulerJob、condi、Fu等。应使用完整单词提升可读性。
3. 编写自解释代码
意图表达:
- 避免使用布尔型参数造成含义不明确。例如:
- 对于
stopAgent(boolean ignoreError),建议拆分为不同函数(如stopAgentIgnoreError()),或使用枚举表达操作类型。- 命名应尽量用完整的单词组合表达意图,并在名称中体现数据类型或用途(例如在常量与变量名称中,将类型词放在末尾)。
- 避免使用魔法值(Magic Value):
直接使用未经定义的数值或字符串(如 if (status == 5))应替换为枚举或常量。
示例:
// 错误示例:魔法值
if (user.getStatus() == 5) { ... }
// 正确示例:常量或枚举
public static final int STATUS_ACTIVE = 5;
if (user.getStatus() == STATUS_ACTIVE) { ... }
// 或使用枚举
enum UserStatus { ACTIVE, INACTIVE }
注释:
- 代码应尽量做到自解释,对少于两行的说明可以直接写在代码中。
- 对于较长的注释,需要仔细校对并随代码更新,确保内容正确。
- 接口方法不应有多余的修饰符(例如
public),且必须配有有效的 Javadoc 注释。
4. 流程控制和结构优化
if...else 的使用:
- 应尽量减少 if...else 结构的使用,建议:
- 限制嵌套层级最多为两层,且内层不应再出现
else分支。- 尽早返回(Early Return),将条件判断中的处理逻辑提前结束或抽成独立方法。
- 使用 Java Stream 或 Lambda 表达式代替冗长的循环与条件判断。
条件判断:
- if 条件表达不宜过长或过于复杂,必要时可以将条件抽成 boolean 变量描述。
代码块长度:
...
Files:
compute/src/main/java/org/zstack/compute/allocator/ResourceBindingAllocatorFlow.java
🔍 Remote MCP AtlassianMCP
Perfect! Now I have comprehensive information. Let me compile the findings:
Summary of Relevant Context Found
Jira Issue Background
Issue: ZSTAC-75428 (P1 - Critical for 5.5.6 release)
- Title: Cross-cluster migration allowed with Soft strategy when it shouldn't be (in Chinese: "【5.5.6必须解】【集群绑定】scene 为 all strategy 为 soft,集群资源充足的情况下,zstack-cli可跨集群迁移,与预期不符")
- Status: Open
- Assignee: 马林 (Lin Ma) - who also authored this PR
Problem Description (from issue):
- Configuration:
vm.ha.across.clusters = false,resourceBinding.strategy = Soft,resourceBinding.Scene = All - Bug: UI properly rejects cross-cluster migration, but CLI (via MigrateVm) incorrectly allows it
- Expected behavior: Both UI and CLI should reject cross-cluster migration regardless of Soft strategy when user explicitly designates a target host
Key Issue Comments:
- Lin Ma's analysis (Jan 12, 2026): The fix targets row 4 of a truth table showing that when user specifies a cross-cluster target host with
haAcross=falseandstrategy=Soft, the system should fail immediately instead of allowing Soft strategy to relax the constraint. - The original GitLab MR #9009 is linked as the upstream source of this fix.
Resource Binding Concepts
From Confluence page "集群绑定" (Cluster Binding):
Configuration Parameters:
-
vm.ha.across.clusters: Controls cross-cluster HA at VM or cluster levelfalse: Enable cluster binding (prevent cross-cluster operations)true: Allow cross-cluster operations
-
resourceBinding.Scene: Defines binding scenarios (cluster-level granularity)- Default:
All
- Default:
-
resourceBinding.strategy: Controls allocation behavior when resources are constrained- Hard strategy: If no available hosts in the bound cluster, reject the operation (fail immediately)
- Soft strategy: If no available hosts in the bound cluster, allow relaxation to access hosts outside the bound cluster
Interaction Logic:
- Cluster binding is enabled by setting
resourceBinding.Scene = All+vm.ha.across.clusters = false - The Soft strategy's relaxation (cross-cluster access) should only apply during system auto-allocation when no hosts are available in the bound cluster
- Critical fix: Soft strategy should NOT relax constraints when a user explicitly designates a target host across cluster boundaries
Relationship to PR Changes
The PR implements exactly what the Jira issue identifies as the fix:
- Main file change (
ResourceBindingAllocatorFlow.java): Adds explicit validation that rejects designated cross-cluster targets immediately, even with Soft strategy - Test file (
ResourceBindingMigrateVmCase.groovy): Test case 4 specifically validates this fix - confirming that cross-cluster migration withhaAcross=falseandstrategy=Softfails when user designates a target host
Key distinction the fix enforces:
- User-designated target host: Must obey cluster binding regardless of strategy
- System auto-allocation: Soft strategy can relax to find resources outside bound cluster
🔇 Additional comments (8)
compute/src/main/java/org/zstack/compute/allocator/ResourceBindingAllocatorFlow.java (3)
14-14: LGTM!新增的
HostAllocatorConstant导入是新逻辑所必需的,用于访问LocationSelector.host常量。
148-155: 逻辑清晰,实现正确注释清楚地区分了两种场景:
- 用户显式指定目标主机时 → 无论策略如何都严格执行绑定约束
- 系统自动分配时 → Soft 策略可以放宽约束
这与 PR 目标和 ZSTAC-75428 的预期行为完全一致:Soft 策略的放宽机制仅在系统自动分配且绑定集群内无可用主机时生效。
132-146: 代码安全,无需修改。
spec.getExtraData()在HostAllocatorSpec中被初始化为new HashMap<>()(第 24 行),getExtraData()方法直接返回该字段,永远不会返回null。整个代码库中,包括DesignatedHostAllocatorFlow.java和AbstractHostAllocatorStrategyFactory.java等文件,都直接调用spec.getExtraData().get()和.put()而无需空值检查。这与 ZStack 项目的既有约定一致(返回空集合而非null)。test/src/test/groovy/org/zstack/test/integration/kvm/vm/migrate/ResourceBindingMigrateVmCase.groovy (5)
22-47: 文档完整性良好!测试类的 Javadoc 清晰地记录了 12 种测试场景的真值表,涵盖了用户指定目标主机和系统自动分配两种情况。这对于理解测试覆盖范围和 ZSTAC-75428 修复的预期行为非常有帮助。
65-175: 环境配置合理!测试环境设置得当:
- 两个集群共享 Ceph 存储,满足跨集群迁移的存储要求
- VM 初始位于 cluster1 的 host1,便于测试各种迁移场景
229-255: 测试辅助方法设计合理
resetVmToHost1Cluster1方法正确地临时启用跨集群迁移来重置 VM 位置,确保每个测试用例从一致的初始状态开始。这是良好的测试隔离实践。
592-631: Case 12 正确验证了 Soft 策略的放宽行为此测试用例准确验证了核心修复的边界条件:当用户未指定目标主机且集群资源不足时,Soft 策略允许跨集群迁移。这与 Case 4(用户指定目标主机时即使 Soft 策略也应拒绝跨集群迁移)形成对比,完整覆盖了 ZSTAC-75428 的修复逻辑。
177-202: 测试结构清晰,执行顺序合理测试方法按照真值表的顺序执行所有 12 个测试用例,先执行用户指定目标主机的场景(Cases 1-8),再执行系统自动分配的场景(Cases 9-12)。每个测试前的
resetVmToHost1Cluster1()和enableAllHosts()调用确保了测试隔离。
| assert result.error.details.contains("designated host") || | ||
| result.error.details.contains("bound resource") || | ||
| result.error.details.contains("binded resource") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# 搜索生产代码中的 "binded" 拼写
echo "=== 搜索生产代码中的 'binded' ==="
rg -i "binded" --type java --type groovy -g '!**/test/**' | head -20
echo ""
echo "=== 搜索生产代码中 'bound resource' 相关的错误消息 ==="
rg "bound resource|binding" --type java --type groovy -g '!**/test/**' -A 2 -B 2 | head -40
echo ""
echo "=== 查看测试文件的完整上下文 (lines 370-385) ==="
fd "ResourceBindingMigrateVmCase.groovy" --exec sed -n '370,385p' {}Repository: MatheMatrix/zstack
Length of output: 5142
🏁 Script executed:
#!/bin/bash
# 确认需要修复的所有文件
echo "=== ResourceBindingAllocatorFlow.java 中的错误消息 ==="
rg -n "binded resource" compute/src/main/java/org/zstack/compute/allocator/ResourceBindingAllocatorFlow.java
echo ""
echo "=== 所有包含 'binded' 的生产代码文件 ==="
rg -l "binded" --type java --type groovy -g '!**/test/**' | grep -v expon | head -20Repository: MatheMatrix/zstack
Length of output: 398
修正生产代码中的英语拼写错误:"binded" 应为 "bound"
根据编码规范,代码应使用正确的英文。"bind" 的过去分词是 "bound",不是 "binded"。此拼写错误存在于生产代码中:
compute/src/main/java/org/zstack/compute/allocator/ResourceBindingAllocatorFlow.java第 154 行的错误消息conf/i18n/messages_en_US.properties和conf/i18n/messages_zh_CN.properties中的国际化配置
测试代码在第 378-380 行检查的是生产代码生成的这个错误消息。建议同时修正生产代码和测试代码,使其与修正后的错误消息保持一致。
When user explicitly designates a target host for VM migration,
ResourceBindingAllocatorFlow should reject cross-cluster migration
regardless of whether strategy is Hard or Soft.
Before: Soft strategy would relax the constraint and allow cross-cluster
After: Designated cross-cluster migration fails even with Soft strategy
The Soft strategy now only relaxes constraints during system auto-allocation
when no hosts are available in the bound cluster.
Resolves: ZSTAC-75428
Change-Id: I6b76656d7069736a6c707262657465696c717668
sync from gitlab !9009