Search before reporting
Read release policy
User environment
- Observed version: Pulsar 2.10.1 (unsupported)
- Likely affected: all Pulsar versions, including 3.x and 4.x
Issue Description
When a topic is fenced, some partition consumers may silently stop receiving messages, while others continue working normally.
The problem occurs only under specific timing and failure conditions, which makes it hard to detect and debug:
- The broker sometimes fails to deliver the close command to certain consumers (e.g. due to transient network issues, broker overload, or connection state races).
- Consumers rely entirely on broker-initiated notifications and do not actively verify the fencing state after the subscription is established.
Once this happens, the affected partition consumer enters a permanent silent failure state.
Symptoms
- ✅ TCP connection remains alive
- ✅ No exceptions or error logs
- ❌ Some partitions stop receiving messages indefinitely
Meanwhile, other partitions of the same consumer continue to work normally, which further hides the issue.
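Because the stall produces no exception and no log line, one way to surface it on the client side is a per-partition idle check. The following is a minimal sketch, not a fix and not part of the Pulsar client: the class `PartitionIdleWatchdog`, its methods, and the idle threshold are all hypothetical, and the application is assumed to call `onMessage` from its own receive path. An idle partition can also simply have no traffic, so a result here is only a hint to cross-check against broker-side stats.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper (not a Pulsar API): tracks the last receive time per partition. */
final class PartitionIdleWatchdog {
    private final Map<String, Long> lastReceiveMillis = new ConcurrentHashMap<>();

    /** Call once per partition when the consumer subscribes. */
    void register(String partition, long nowMillis) {
        lastReceiveMillis.put(partition, nowMillis);
    }

    /** Call from the application's receive path for every message. */
    void onMessage(String partition, long nowMillis) {
        lastReceiveMillis.put(partition, nowMillis);
    }

    /**
     * Returns partitions that have been idle longer than the threshold while at
     * least one other partition is still receiving. If every partition is idle,
     * the likelier explanation is "no traffic", so nothing is reported.
     */
    List<String> suspects(long nowMillis, long idleThresholdMillis) {
        List<String> idle = new ArrayList<>();
        boolean anyActive = false;
        for (Map.Entry<String, Long> e : lastReceiveMillis.entrySet()) {
            if (nowMillis - e.getValue() > idleThresholdMillis) {
                idle.add(e.getKey());
            } else {
                anyActive = true;
            }
        }
        if (!anyActive) {
            return new ArrayList<>();
        }
        Collections.sort(idle); // deterministic order for logging
        return idle;
    }
}
```

Passing the clock value in explicitly keeps the check easy to test; in an application it would typically be `System.currentTimeMillis()`, with `suspects` called from a periodic task.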
Error messages
Reproducing the issue
- Create a partitioned topic and subscribe a consumer.
- Trigger topic fencing via topic unload.
- Publish messages to the topic.
- Observe that some partitions stop receiving messages while others continue normally, with no errors or logs.
This issue is not 100% reproducible.
Additional information
The affected topic is partitioned and has one producer and one consumer.
- partition-2 stats post unload
- partition-2 internal-stats post unload
Around 13:00, the partition-level stats for the topic showed a discrepancy in data flow: partition-2 had active producers with a non-zero production rate, but the consumption rate for that partition was 0.
At the topic level, however, the overall consumption rate remained non-zero. The topic therefore appears "healthy" in aggregate while partition-2 is silently stalled.
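The check described above (production rate above zero, consumption rate at zero, aggregate still non-zero) can be sketched in a few lines. This is an illustrative helper only: `StalledPartitionCheck` and `Rates` are hypothetical names, and the caller is assumed to copy `msgRateIn` and `msgRateOut` for each partition from the per-partition stats output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper (not a Pulsar API): finds partitions that are produced to but not consumed from. */
final class StalledPartitionCheck {

    /** Per-partition rates in msg/s, copied by the caller from the partition stats. */
    static final class Rates {
        final double msgRateIn;
        final double msgRateOut;

        Rates(double msgRateIn, double msgRateOut) {
            this.msgRateIn = msgRateIn;
            this.msgRateOut = msgRateOut;
        }
    }

    /** Partitions with incoming traffic but a consumption rate of zero, in name order. */
    static List<String> findStalled(Map<String, Rates> perPartition) {
        List<String> stalled = new ArrayList<>();
        for (Map.Entry<String, Rates> e : new TreeMap<>(perPartition).entrySet()) {
            Rates r = e.getValue();
            if (r.msgRateIn > 0.0 && r.msgRateOut == 0.0) {
                stalled.add(e.getKey());
            }
        }
        return stalled;
    }

    /** Topic-level consumption rate: the sum that can mask a single stalled partition. */
    static double aggregateRateOut(Map<String, Rates> perPartition) {
        double sum = 0.0;
        for (Rates r : perPartition.values()) {
            sum += r.msgRateOut;
        }
        return sum;
    }
}
```

Comparing `findStalled` against `aggregateRateOut` makes the masking effect explicit: the aggregate can stay positive while one partition is reported as stalled, which is why monitoring only the topic-level rate misses this failure.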
Are you willing to submit a PR?