[ZSTAC-86445] Backport ZSTAC-85281 to 5.4.10#4380
Conversation
Change-Id: If07dce918c4d7d9f62445184ff5984e30b83ce64 (cherry picked from commit 8939994)
Trigger TF port sync from TF controller ping success instead of MN startup. Keep the original one day interval and port cleanup algorithm, while scoping sync to the current controller. Resolves: ZSTAC-85281 Change-Id: I84a88a34135cd47fc79b3019d2f0fc2953f06ac9 (cherry picked from commit e44d41a)
|
Warning Review limit reachedYou’ve reached a temporary PR review limit under our Fair Usage Limits Policy. Next review available in: 42 minutes Enable usage-based reviews in Billing to review now. Otherwise, wait until the next included review is available. How can I continue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based reviews. How do review limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window. Please refer docs for additional details. Review details⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
Warning
|
| Layer / File(s) | Summary |
|---|---|
TfZstackPortSync 重构为按需触发 plugin/.../network/TfZstackPortSync.java |
移除 ManagementNodeReadyExtensionPoint 实现;新增 triggerSyncIfDue(sdnControllerUuid),通过 lastSyncTime/runningSyncs 实现 1 天节流与并发去重;SyncPort 从 PeriodicTask 改为 Task<Void>,getPortToDelete 改为基于 sdnControllerUuid 查询且返回空集合替代 null。 |
TfPortClient 与 TfPortService 按 UUID 重载 plugin/.../neutronClient/TfPortClient.java, plugin/.../network/TfPortService.java |
TfPortClient 新增接受 sdnControllerUuid 的构造器,校验 vendorType 后调用 init;TfPortService 新增 getTfPortsDetail(uuid) 与 deleteTfPort(uuid, portUuid) 重载。 |
SugonSdnController Ping 消息处理 plugin/.../controller/SugonSdnController.java |
handleMessage 新增 SdnControllerPingMsg 分支,handMessage 通过 client.getDomain() 验证控制器默认域,成功时调用 triggerSyncIfDue,失败或异常时返回错误回复;注入 TfZstackPortSync 字段。 |
集成测试:Ping 触发同步验证 test/.../sdnController/SugonSdnControllerCase.groovy |
新增 testTfPortSyncTriggeredByPingSuccess,注册 TF_CREATE_VMI 模拟回调统计触发次数,通过 sendTfPing 发送 Ping 消息,断言同步仅触发一次且不重复触发。 |
Sequence Diagram(s)
sequenceDiagram
participant Client
participant SugonSdnController
participant TfZstackPortSync
participant TfPortService
Client->>SugonSdnController: SdnControllerPingMsg
SugonSdnController->>SugonSdnController: client.getDomain() 验证默认域
SugonSdnController->>TfZstackPortSync: triggerSyncIfDue(sdnControllerUuid)
TfZstackPortSync->>TfZstackPortSync: 检查 lastSyncTime & runningSyncs
TfZstackPortSync->>TfPortService: getTfPortsDetail(sdnControllerUuid)
TfPortService-->>TfZstackPortSync: VMI 列表
TfZstackPortSync->>TfPortService: deleteTfPort(sdnControllerUuid, portUuid)
TfZstackPortSync->>TfZstackPortSync: finally 清理 runningSyncs
SugonSdnController-->>Client: SdnControllerPingReply(success)
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~25 minutes
Poem
🐇 Ping 一声,兔子醒来跑!
不再守钟按时同步跳,
按 UUID 节流又去重,
域存在才触发,安全稳又好。
测试断言次数一,绝不多扰。
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | 标题准确描述了将 ZSTAC-85281 回迁到 5.4.10 的主要变更。 |
| Description check | ✅ Passed | 描述与回迁、cherry-pick 和目标分支信息一致,和实际变更相关。 |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
✨ Finishing Touches
🧪 Generate unit tests (beta)
- Create PR with unit tests
- Commit unit tests in branch
sync/shixin.ruan/shixin-ZSTAC-86445
Comment @coderabbitai help to get the list of available commands.
Remove DHCP message handlers and generated error code reference unavailable on 5.4.10. Change-Id: I1a15fb1eeeab60ca9c6fa239904bda89839b64e5
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (2)
plugin/sugonSdnController/src/main/java/org/zstack/sugonSdnController/network/TfPortService.java (1)
123-126: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win把新增重载里的 UUID 变量名改成 lowerCamelCase。
portUUid和tfPortUUid的大小写不符合新增 Java 代码的命名要求,建议改为portUuid/tfPortUuid。建议调整
- public TfPortResponse deleteTfPort(String sdnControllerUuid, String portUUid) { - String tfPortUUid = StringDSL.transToTfUuid(portUUid); + public TfPortResponse deleteTfPort(String sdnControllerUuid, String portUuid) { + String tfPortUuid = StringDSL.transToTfUuid(portUuid); TfPortClient tfPortClient = new TfPortClient(sdnControllerUuid); - return tfPortClient.deletePort(tfPortUUid); + return tfPortClient.deletePort(tfPortUuid); }As per path instructions, “方法名、参数名、成员变量和局部变量:使用 lowerCamelCase 风格。”
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugin/sugonSdnController/src/main/java/org/zstack/sugonSdnController/network/TfPortService.java` around lines 123 - 126, In TfPortService.deleteTfPort, rename the UUID parameters and locals to lowerCamelCase to match the Java naming convention: change the method argument and the translated UUID variable from the current mixed-case forms to portUuid and tfPortUuid, and update the deletePort call accordingly. Keep the change confined to deleteTfPort and any directly related references so the overload remains consistent with the rest of the class.Source: Path instructions
plugin/sugonSdnController/src/main/java/org/zstack/sugonSdnController/controller/SugonSdnController.java (1)
65-71: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win将
handMessage重命名为表达明确的处理方法。
handMessage像是handleMessage的拼写遗漏,建议改为handlePingMessage,避免后续维护时混淆。建议重命名
- handMessage((SdnControllerPingMsg) msg); + handlePingMessage((SdnControllerPingMsg) msg); @@ - void handMessage(SdnControllerPingMsg msg) { + void handlePingMessage(SdnControllerPingMsg msg) {As per path instructions,命名应尽量用完整的单词组合表达意图。
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugin/sugonSdnController/src/main/java/org/zstack/sugonSdnController/controller/SugonSdnController.java` around lines 65 - 71, `handMessage` in `SugonSdnController` looks like a typo and should be renamed to a clearer, intent-revealing method name. Update the call site in the message dispatch path and rename the method itself to `handlePingMessage` so the `SdnControllerPingMsg` handling is explicit and consistent with the rest of the controller’s message-handling flow.Source: Path instructions
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@plugin/sugonSdnController/src/main/java/org/zstack/sugonSdnController/controller/SugonSdnController.java`:
- Around line 63-69: The new SdnController message entrypoint in
SugonSdnController.handleMessage needs the same MessageSafe protection as other
ZStack handlers so unexpected exceptions are converted into standard error
replies. Add `@MessageSafe` to the handleMessage(SdnControllerMessage) method,
keeping the existing SdnControllerPingMsg dispatch and
bus.dealWithUnknownMessage fallback unchanged.
In
`@plugin/sugonSdnController/src/main/java/org/zstack/sugonSdnController/network/TfZstackPortSync.java`:
- Around line 42-64: `TfZstackPortSync.triggerSyncIfDue` is recording
`lastSyncTime` before `SyncPort` actually succeeds, and the current
read-check-add sequence around `lastSyncTime`/`runningSyncs` is not atomic.
Update the throttling so the sync timestamp is written only after a successful
`SyncPort.call` path, and make failures in `getPortToDelete`/`call` leave the
entry eligible for retry instead of being treated as a completed sync. Also
guard the `lastSyncTime.get(...)` plus `runningSyncs.add(...)` decision with a
single atomic section (or equivalent concurrent primitive) so concurrent
triggers cannot bypass `runningSyncs` and submit duplicate syncs.
In
`@test/src/test/groovy/org/zstack/test/integration/network/sdnController/SugonSdnControllerCase.groovy`:
- Around line 86-93: 第二次 Ping 仍可能落在第一次异步同步未结束的窗口里,导致测试只验证了 runningSyncs
去重而不是一天节流逻辑。请在 SugonSdnControllerCase 的 sendTfPing / retryInSecs
这段测试里,把等待第一次同步完成的逻辑放到第二次 ping 之前,或增加一个能确认 SyncPort 已结束的确定性等待点,再发起第二次
sendTfPing;这样才能确保后续断言真正覆盖节流行为。
---
Nitpick comments:
In
`@plugin/sugonSdnController/src/main/java/org/zstack/sugonSdnController/controller/SugonSdnController.java`:
- Around line 65-71: `handMessage` in `SugonSdnController` looks like a typo and
should be renamed to a clearer, intent-revealing method name. Update the call
site in the message dispatch path and rename the method itself to
`handlePingMessage` so the `SdnControllerPingMsg` handling is explicit and
consistent with the rest of the controller’s message-handling flow.
In
`@plugin/sugonSdnController/src/main/java/org/zstack/sugonSdnController/network/TfPortService.java`:
- Around line 123-126: In TfPortService.deleteTfPort, rename the UUID parameters
and locals to lowerCamelCase to match the Java naming convention: change the
method argument and the translated UUID variable from the current mixed-case
forms to portUuid and tfPortUuid, and update the deletePort call accordingly.
Keep the change confined to deleteTfPort and any directly related references so
the overload remains consistent with the rest of the class.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 9140d990-ee78-427e-856b-6faca6c106b6
⛔ Files ignored due to path filters (1)
conf/springConfigXml/sugonSdnController.xmlis excluded by!**/*.xml
📒 Files selected for processing (5)
plugin/sugonSdnController/src/main/java/org/zstack/sugonSdnController/controller/SugonSdnController.javaplugin/sugonSdnController/src/main/java/org/zstack/sugonSdnController/controller/neutronClient/TfPortClient.javaplugin/sugonSdnController/src/main/java/org/zstack/sugonSdnController/network/TfPortService.javaplugin/sugonSdnController/src/main/java/org/zstack/sugonSdnController/network/TfZstackPortSync.javatest/src/test/groovy/org/zstack/test/integration/network/sdnController/SugonSdnControllerCase.groovy
| public void triggerSyncIfDue(String sdnControllerUuid) { | ||
| if (sdnControllerUuid == null) { | ||
| logger.warn("Port_Sync_Task: skip sync because sdn controller uuid is null."); | ||
| return; | ||
| } | ||
| trackerThread = thdf.submitPeriodicTask(new SyncPort()); | ||
| } | ||
|
|
||
| private class SyncPort implements PeriodicTask { | ||
| long now = System.currentTimeMillis(); | ||
| Long lastSync = lastSyncTime.get(sdnControllerUuid); | ||
| if (lastSync != null && now - lastSync < SYNC_INTERVAL_MILLIS) { | ||
| return; | ||
| } | ||
| if (!runningSyncs.add(sdnControllerUuid)) { | ||
| return; | ||
| } | ||
|
|
||
| @Override | ||
| public TimeUnit getTimeUnit() { | ||
| return TimeUnit.DAYS; | ||
| lastSyncTime.put(sdnControllerUuid, now); | ||
| try { | ||
| thdf.submit(new SyncPort(sdnControllerUuid)); | ||
| } catch (RuntimeException e) { | ||
| lastSyncTime.remove(sdnControllerUuid); | ||
| runningSyncs.remove(sdnControllerUuid); | ||
| throw e; | ||
| } |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
修正同步节流状态的竞态和失败标记。
当前 lastSyncTime 在任务成功前写入;而 getPortToDelete/call 会吞掉异常,导致一次失败同步也会被 Line 50 节流一天。另外,Line 49 的读取和 Line 57 的写入不是原子操作,并发 Ping 下可能绕过 runningSyncs 重复提交。
建议调整
- public void triggerSyncIfDue(String sdnControllerUuid) {
+ public synchronized void triggerSyncIfDue(String sdnControllerUuid) {
if (sdnControllerUuid == null) {
logger.warn("Port_Sync_Task: skip sync because sdn controller uuid is null.");
return;
}
@@
if (!runningSyncs.add(sdnControllerUuid)) {
return;
}
- lastSyncTime.put(sdnControllerUuid, now);
try {
thdf.submit(new SyncPort(sdnControllerUuid));
} catch (RuntimeException e) {
- lastSyncTime.remove(sdnControllerUuid);
runningSyncs.remove(sdnControllerUuid);
throw e;
}
@@
} catch (Exception e) {
logger.error(String.format("Port_Sync_Task: Fetch tf VirtualMachineInterface failed: %s.", e));
- return new HashSet<>();
+ throw new RuntimeException("Fetch tf VirtualMachineInterface failed", e);
}
@@
public Void call() {
logger.info("Port_Sync_Task: begin.");
+ boolean syncSucceeded = false;
try {
HashSet<String> portsToDelete = getPortToDelete();
int maxDeleteCount = MAX_DELETE_COUNT;
for (String portUuid: portsToDelete) {
@@
break;
}
}
+ syncSucceeded = true;
} catch (Exception e) {
logger.error(String.format("Port_Sync_Task failed: %s.", e));
} finally {
+ if (syncSucceeded) {
+ lastSyncTime.put(sdnControllerUuid, System.currentTimeMillis());
+ }
runningSyncs.remove(sdnControllerUuid);
}
return null;
}Also applies to: 104-139
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@plugin/sugonSdnController/src/main/java/org/zstack/sugonSdnController/network/TfZstackPortSync.java`
around lines 42 - 64, `TfZstackPortSync.triggerSyncIfDue` is recording
`lastSyncTime` before `SyncPort` actually succeeds, and the current
read-check-add sequence around `lastSyncTime`/`runningSyncs` is not atomic.
Update the throttling so the sync timestamp is written only after a successful
`SyncPort.call` path, and make failures in `getPortToDelete`/`call` leave the
entry eligible for retry instead of being treated as a completed sync. Also
guard the `lastSyncTime.get(...)` plus `runningSyncs.add(...)` decision with a
single atomic section (or equivalent concurrent primitive) so concurrent
triggers cannot bypass `runningSyncs` and submit duplicate syncs.
Add MessageSafe to the SDN message entrypoint. Move the second ping after the first async sync window in the focused test. Resolves: ZSTAC-86445 Change-Id: I4d7748a0de3b5d6a052a47966d2265249b2779e2
|
Comment on Comment from shixin.ruan: 已处理。 Defect class: MessageSafe / message error contract. Root cause: 这次 backport 在 Fix: commit Verification:
Remaining risk: 完整 case 在本地会因测试框架初始化阶段 JMX |
Clone Jira: ZSTAC-86445
Original Jira: ZSTAC-85281
Source fix MR: zstack!9929
Cherry-picked commit(s): 8939994, e44d41a
Target branch: 5.4.10
Validation: cherry-pick completed on 5.4.10 and git diff --check passed. Full regression was not run locally.
sync from gitlab !10340