Skip to content

<fix>[core]: select callback ip by target#4423

Closed
MatheMatrix wants to merge 2 commits into
5.5.28from
sync/shixin.ruan/shixin-ZSTAC-86567
Closed

<fix>[core]: select callback ip by target#4423
MatheMatrix wants to merge 2 commits into
5.5.28from
sync/shixin.ruan/shixin-ZSTAC-86567

Conversation

@MatheMatrix

Copy link
Copy Markdown
Owner

Root cause:
Dual-stack MN keeps management.server.ip as the primary address and stores the secondary family in management.server.ip4/ip6. Some callback paths always used management.server.ip, so IPv6 resources could receive an IPv4 callback address, or IPv4 resources could receive an IPv6 callback address.

Fix:

  • Select REST callback URL by request target IP version.
  • Select callback checker IP by remote resource address family for KVM, Ceph, SFTP BS, and ZBS MDS.
  • Keep explicit callback headers unchanged.

Verification:

  • mvn -pl core,plugin/kvm,plugin/ceph,plugin/sftpBackupStorage,plugin/zbs -am -DskipTests compile
  • cd test && mvn test -Djacoco.skip=true -Dtest=ManagementNetworkIpv6Case

sync from gitlab !10385

Select the management callback address by remote target IP version in
dual-stack MN deployments. Keep explicit REST callback headers
unchanged and keep non-IP targets on the existing default callback URL.

Resolves: ZSTAC-86567

Change-Id: Id474bdc9002f25dfb1cf6334f0b18ffad3d2a526
@coderabbitai

coderabbitai Bot commented Jul 2, 2026

Copy link
Copy Markdown

Warning

Review limit reached

You’ve reached a temporary PR review limit under our Fair Usage Limits Policy.

Your recent review volume is higher than typical usage, so adaptive limits are currently applied.

Next review available in: 37 minutes

Enable usage-based reviews in Billing to review now. Otherwise, wait until the next included review is available.
You're only billed for reviews past your plan's rate limits ($0.25/file).

How can I continue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based reviews.

How do review limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window.

Please refer docs for additional details.

Review details
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 24ebf46e-d095-4c28-9984-a5e72881e2e4

📥 Commits

Reviewing files that changed from the base of the PR and between d22b42d and c58b24e.

📒 Files selected for processing (8)
  • core/src/main/java/org/zstack/core/Platform.java
  • core/src/main/java/org/zstack/core/rest/RESTFacadeImpl.java
  • plugin/ceph/src/main/java/org/zstack/storage/ceph/backup/CephBackupStorageMonBase.java
  • plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageMonBase.java
  • plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java
  • plugin/sftpBackupStorage/src/main/java/org/zstack/storage/backup/sftp/SftpBackupStorage.java
  • plugin/zbs/src/main/java/org/zstack/storage/zbs/ZbsPrimaryStorageMdsBase.java
  • test/src/test/groovy/org/zstack/test/integration/core/ManagementNetworkIpv6Case.groovy

Warning

.coderabbit.yaml has a parsing error

The CodeRabbit configuration file in this repository has a parsing error and default settings were used instead. Please fix the error(s) in the configuration file. You can initialize chat with CodeRabbit to get help with the configuration file.

💥 Parsing errors (1)
Could not fetch remote config from http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml: TimeoutError: The operation was aborted due to timeout
⚙️ Configuration instructions
  • Please see the configuration documentation for more information.
  • You can also validate your configuration using the online YAML validator.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch sync/shixin.ruan/shixin-ZSTAC-86567

Comment @coderabbitai help to get the list of available commands.

@ZStack-Robot

Copy link
Copy Markdown
Collaborator

Comment from shixin.ruan:

Code Review

# 严重程度 分类 文件 问题描述 修复建议
1 🟡 Major 性能 / callback 正确性 core/src/main/java/org/zstack/core/rest/RESTFacadeImpl.java:445, core/src/main/java/org/zstack/core/Platform.java:1396 asyncJson() 是所有 agent async REST 调用的通用热路径。这个 MR 让每个 IP literal URL 都执行 selectCallbackUrl() -> Platform.getManagementServerIpForRemote() -> getRouteSourceIp() -> /sbin/ip route get ...,也就是每次 KVM/PS/BS/ZBS/agent 调用都 fork 一个外部 shell。当前仓库里有约 60 个 asyncJson* 调用点,KVM ping、host 操作、存储 ping/任务都会走这里。高并发下这会把 worker thread 卡在同步 shell 上,带来明显性能和抖动风险。更糟的是,ip route get 返回的是 OS 选出的 source address,不一定是 management.server.ip/ip4/ip6 中用户配置、ZStack 对外声明的 MN 地址;多 IPv6 地址、临时地址、HA/VIP/额外持久地址场景下,callback URL 可能变成一个未被 ZStack 配置语义管理的地址。 通用 RESTFacadeImpl.asyncJson() 不要跑 /sbin/ip route get。这里应该按目标 IP version 从已配置的 MN 地址集合选择:目标 IPv6 选 management.server.ip 中的 IPv6 或 management.server.ip6,目标 IPv4 选 management.server.ip 中的 IPv4 或 management.server.ip4。如果确实需要 route source,只能放在明确低频、显式语义的路径,或者做 bounded cache,并且返回值必须限制在 ZStack 已配置/认可的 MN 地址集合内。新增测试需要覆盖:selectCallbackUrl() 不触发 shell;route source 为非配置地址时不会被作为 callback 地址。

结论: CONDITIONAL ⚠️

回归风险: 中

修复建议: 先修上面 Major 问题后再过。当前测试只验证了纯选择函数和 callback URL 格式,没有覆盖通用 async REST 热路径的外部 shell 调用成本,也没有覆盖 route source 返回非配置 MN 地址的场景。

Resolve review comment on MR !10385. Generic REST callback selection
now uses configured MN addresses by target IP version and does not
invoke ip route get on async REST hot path. Non-configured route source
candidates are ignored.

Resolves: ZSTAC-86567

Change-Id: Ia9a34bbe87c21067643e3bbfb410e8b501d25eb3
@MatheMatrix

Copy link
Copy Markdown
Owner Author

Comment from shixin.ruan:

Review comment fixed.

Defect type: performance and callback address correctness.

Root cause:
Generic async REST callback selection called Platform.getManagementServerIpForRemote(), which previously executed ip route get for IP-literal remote agent addresses. That put a shell command on the async REST hot path and could select an OS route source address that is not configured as management.server.ip/ip4/ip6.

Fix:

  • Platform.getManagementServerIpForRemote() now selects only from configured MN addresses by target IP version.
  • Platform.selectManagementServerIpForRemote() ignores route source candidates unless they are already configured MN addresses.
  • Existing explicit route-source usage remains limited to the VR-specific path.

Verification:

  • mvn -pl core -am -DskipTests compile: PASS
  • cd test && mvn test -Djacoco.skip=true -Dtest=ManagementNetworkIpv6Case: PASS, Tests run: 2, Failures: 0, Errors: 0, Skipped: 0

Commit: c58b24e [core]: avoid route shell in callback

@MatheMatrix MatheMatrix closed this Jul 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants