Description
Search before asking
- I had searched in the issues and found no similar issues.
Version
Doris 4.0.3, 4.0.4-slim
What's Wrong?
In Doris cloud mode, when an FE cluster is deployed with Docker host networking and each FE node uses a
different http_port, Observer nodes fail to join the cluster.
Error log:
WARN [Env.getFeNodeTypeAndNameFromHelpers():1520] failed to get fe node type from helper node: HostInfo{host='127.0.0.1', port=9010}.
java.net.ConnectException: Connection refused
WARN [Env.getClusterIdAndRole():1342] current node HostInfo{host='127.0.0.1', port=9011} is not added to the group. please add it first.
Root cause:
In the Env.getFeNodeTypeAndNameFromHelpers() method:
    String url = "http://" + NetUtils.getHostPortInAccessibleFormat(
            helperNode.getHost(),  // "127.0.0.1" (correct)
            Config.http_port       // Uses current node's http_port, NOT helper's!
    ) + "/role?host=...";
The code uses Config.http_port (current node's port) instead of the helper node's http_port. This assumes
all FE nodes use the same http_port, which fails when:
- Using Docker host network mode (containers share host network, must use different ports)
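To make the mismatch concrete, here is a minimal standalone sketch. It is not the Doris code: the class RoleUrlSketch and the method buildRoleUrl are hypothetical stand-ins for the URL construction quoted above, and the ports are the ones from this report.

```java
public class RoleUrlSketch {
    // Hypothetical stand-in for the URL construction; not the actual Doris method.
    static String buildRoleUrl(String host, int httpPort) {
        return "http://" + host + ":" + httpPort + "/role";
    }

    public static void main(String[] args) {
        String helperHost = "127.0.0.1";
        int localHttpPort = 8031;   // FE-2's own http_port
        int helperHttpPort = 8030;  // FE-1's http_port

        // Behaviour described in this report: helper host paired with the local port.
        System.out.println("current:  " + buildRoleUrl(helperHost, localHttpPort));
        // Behaviour the report expects: helper host paired with the helper's port.
        System.out.println("expected: " + buildRoleUrl(helperHost, helperHttpPort));
    }
}
```

With identical http_port values on every FE the two URLs coincide, which is presumably why the problem only shows up once the ports differ.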
What You Expected?
- FE-2 (Observer) should connect to FE-1's HTTP endpoint at port 8030 and join the cluster as an Observer node.
- The code should use the helper node's http_port when constructing the HTTP URL, not the current node's
Config.http_port.
Current behavior (wrong):
FE-2 tries: http://127.0.0.1:8031/role (FE-2's own http_port, no service)
Should be: http://127.0.0.1:8030/role (FE-1's http_port, correct)
Suggested fix:
- Store http_port in Meta Service when registering FE nodes
- Or provide a way to specify the helper node's http_port in the --helper parameter (e.g., --helper host:http_port:edit_log_port)
How to Reproduce?
- Deploy Doris cloud mode with Meta Service + FoundationDB
- Configure FE-1 (Master) with http_port=8030, edit_log_port=9010
- Configure FE-2 (Observer) with http_port=8031, edit_log_port=9011
- Start FE-1 successfully
- Start FE-2 with --helper 127.0.0.1:9010
- FE-2 fails to join with "Connection refused" error
Configuration example:
┌──────┬──────────┬───────────────┬───────────┐
│ Node │ Role │ edit_log_port │ http_port │
├──────┼──────────┼───────────────┼───────────┤
│ FE-1 │ Master │ 9010 │ 8030 │
├──────┼──────────┼───────────────┼───────────┤
│ FE-2 │ Observer │ 9011 │ 8031 │
└──────┴──────────┴───────────────┴───────────┘
Anything Else?
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct