Commit 6eda1eb

committed
improve doc
1 parent d72dc04 commit 6eda1eb

1 file changed

Lines changed: 33 additions & 109 deletions

File tree

pip/pip-447.md

@@ -2,7 +2,7 @@
22

33
# Abstract
44

5-
This PIP proposes a mechanism to add customizable Prometheus labels to Apache Pulsar topic-level metrics. Administrators will define a set of allowed custom metric label keys at the cluster or broker level. Users can then assign values to these predefined keys for specific topics. These key-value pairs will be exposed as Prometheus labels, enabling more granular metric filtering and precise alerting, while managing metric cardinality through centralized key governance.
5+
This PIP proposes a mechanism to selectively expose existing Pulsar Topic Properties as Prometheus metric labels. Instead of introducing a new mechanism for custom labels, administrators will define a set of allowed property keys at the broker level. If a topic has a property matching one of these allowed keys, its value will be automatically sanitized and injected as a Prometheus label. This enables granular metric filtering and alerting using the metadata users are already maintaining, while managing cardinality through centralized configuration.
66

77
# Motivation
88

@@ -18,73 +18,43 @@ Operational Overhead: Increased effort in managing and maintaining Prometheus al
1818

1919
Users require a native Pulsar mechanism to inject queryable, custom metadata directly into topic metrics to improve alerting precision, simplify dashboarding, and enhance overall observability.
2020

21-
# Goals
22-
23-
The proposed solution allows administrators to define a list of permissible custom metric label keys (e.g., sla_tier, app_owner) in the broker configuration.
24-
25-
Users can then use pulsar-admin commands or the REST API to set string values for these allowed keys on specific topics (e.g., sla_tier=gold).
21+
Users often already attach this metadata to topics using Topic Properties (e.g., sla_tier=gold, owner=team_a). However, these properties are currently invisible to the monitoring layer.
2622

27-
These key-value pairs will be stored as part of the topic's metadata, leveraging Pulsar's topic-level policy system.
28-
29-
When topic metrics are generated for Prometheus, these custom key-value pairs will be added as labels to the existing set of standard labels.
23+
# Goals
3024

3125
## In Scope
3226

3327
The primary goals of this proposal are to:
3428

35-
Allow administrators to define a configurable set of allowed custom metric label keys at the broker or cluster level.
36-
37-
Enable users (or automated systems) to assign string values to these predefined keys for individual topics.
29+
* Allow administrators to define a configurable list of allowed topic property keys in the broker configuration.
3830

39-
Expose these user-defined key-value pairs as additional Prometheus labels on all topic-level metrics for the respective topics, provided exposeTopicLevelMetricsInPrometheus is true.
31+
* Update the Prometheus metrics generation logic to retrieve values from the existing Topic Properties map and inject them as labels if they match the allowed list.
4032

41-
Provide robust control over Prometheus metric cardinality by restricting the set of custom metric label keys and providing limits on their usage.
42-
43-
Integrate this feature with Pulsar's existing topic-level policy framework, utilizing system topics for policy propagation.
44-
45-
## Out of Scope
46-
47-
This PIP does not propose changes to topic-level properties. Topic-level properties are distinct from topic-level custom metrics labels.
48-
49-
It does not aim to replace existing standard metric labels (cluster, namespace, topic).
50-
51-
Support for complex data types as label values; only string values will be supported.
33+
* Provide robust control over Prometheus metric cardinality by strictly enforcing the allow-list.
5234

5335
# High Level Design
5436

55-
API and Data Structure Definition: Finalize the internal data structures for storing custom metric labels within topic policies and the public contracts for pulsar-admin commands and REST APIs.
56-
57-
Broker Configuration Implementation: Add the new configuration parameters to broker.conf and implement the logic for brokers to read and use these settings.
37+
Configuration Implementation: Add parameters to broker.conf to define the allow-list (e.g., allowedTopicPropertiesForMetrics).
5838

59-
Admin Client and REST API Implementation: Develop the new pulsar-admin topics subcommands and their corresponding REST API endpoints in the broker, including validation logic against allowedCustomMetricLabelKeys.
39+
Metrics Generation Modification: Update the PrometheusMetricsServlet (or equivalent exporter) to:
6040

61-
Broker Policy Handling Logic:
41+
* Read the topic's existing properties.
6242

63-
Extend the topic-level policy framework to handle customMetricLabels.
43+
* Filter keys against the configured allow-list.
6444

65-
Ensure changes are published to and consumed from the `__change_events` system topic.
66-
67-
Update broker policy caches accordingly.
68-
69-
Metrics Generation Modification: Update the Prometheus metrics servlet (or equivalent OTel exporter logic in the future) to retrieve custom metric labels from the policy cache and add them to the outgoing topic metrics.
45+
* Append them to the outgoing Prometheus metric lines.
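
The last step above, appending labels to an outgoing metric line, could look roughly like the following. This is a minimal sketch and not the broker's actual exporter code: `MetricLineWriter`, `escapeLabelValue`, and `formatSample` are hypothetical names, and the value escaping follows the Prometheus text exposition format (backslash, double quote, and newline are escaped), which is an assumption about what "sanitized" means for values in this proposal.

```java
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, for illustration only; not part of Pulsar.
class MetricLineWriter {

    // Escape a label value per the Prometheus text format:
    // backslash, double quote, and line feed must be escaped.
    static String escapeLabelValue(String value) {
        return value
                .replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n");
    }

    // Render one sample line: name{k1="v1",k2="v2"} value
    static String formatSample(String metricName, Map<String, String> labels, double value) {
        StringJoiner joiner = new StringJoiner(",", "{", "}");
        labels.forEach((k, v) -> joiner.add(k + "=\"" + escapeLabelValue(v) + "\""));
        return metricName + joiner + " " + value;
    }
}
```

The standard labels and the property-derived labels would be passed in the same map, so the exporter emits them in one label set per sample.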
7046

7147
# Detailed Design
7248

7349
## Design & Implementation Details
7450

75-
Topic Policy Storage: Custom metric labels will be stored as a Map<String, String> within the topic's policy data structure.
76-
77-
Policy Propagation and Caching: Brokers will use the existing system topic mechanism (__change_events) for propagating custom metric label policy changes and updating their in-memory policy caches.
51+
### Topic Property Storage:
52+
We utilize the existing Map<String, String> properties in TopicPolicies. No new storage is needed.
7853

7954
### Metrics Generation Logic:
8055
The Prometheus metrics generation component in the broker (e.g., PrometheusMetricsServlet) will be modified.
8156

82-
If `exposeCustomTopicMetricLabelsEnabled` is true, for each topic, it will retrieve the customMetricLabels map from its policy cache.
83-
84-
Each key-value pair from this map will be added as an additional label to all Prometheus metrics emitted for that topic. The standard labels (cluster, namespace, topic) will remain.
85-
86-
Validation: Brokers will enforce maxCustomMetricLabelValueLength during the set-custom-metric-labels operation. Keys will be validated against allowedCustomMetricLabelKeys.
87-
57+
If `exposeCustomTopicMetricLabelsEnabled` is true, then for each topic the broker iterates over the allowedTopicPropertiesForMetrics list. For each allowed key, it checks whether the topic has a corresponding property value. If found, the sanitized key and the original value are added as a label to the metric family.
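
The iteration described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the proposed implementation: `TopicPropertyLabels`, `sanitizeLabelName`, and `selectLabels` are hypothetical names, and the sanitization rule (replace any character outside `[a-zA-Z0-9_]` with `_`, and prefix `_` if the name would start with a digit) is one plausible way to satisfy the Prometheus label-name grammar; the PIP does not specify the exact rule.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only; not part of Pulsar.
class TopicPropertyLabels {

    // Prometheus label names must match [a-zA-Z_][a-zA-Z0-9_]*.
    // Assumed rule: replace invalid characters with '_' and guard a leading digit.
    static String sanitizeLabelName(String key) {
        String cleaned = key.replaceAll("[^a-zA-Z0-9_]", "_");
        if (cleaned.isEmpty() || Character.isDigit(cleaned.charAt(0))) {
            cleaned = "_" + cleaned;
        }
        return cleaned;
    }

    // Iterate over the allow-list (not over the topic's properties), so the
    // number of extra labels per topic is bounded by the configured key count.
    static Map<String, String> selectLabels(List<String> allowedKeys,
                                            Map<String, String> topicProperties) {
        Map<String, String> labels = new LinkedHashMap<>();
        for (String key : allowedKeys) {
            String value = topicProperties.get(key);
            if (value != null) {
                labels.put(sanitizeLabelName(key), value);
            }
        }
        return labels;
    }
}
```

With an allow-list of `sla_tier` and topic properties `sla_tier=gold, owner=team_a`, only `sla_tier` would be selected; `owner` is dropped because it is not allow-listed.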
8858

8959
## Public-facing Changes
9060

@@ -98,54 +68,14 @@ Description: Enables or disables the custom topic metric labels feature.
9868

9969
Default: false
10070

101-
allowedCustomMetricLabelKeys=<key1>,<key2>,...
71+
allowedTopicPropertiesForMetrics=<key1>,<key2>,...
10272

103-
Description: A comma-separated list of strings defining the custom metric label keys that administrators allow to be set on topics. Example: sla_tier,data_sensitivity,cost_center,app_owner.
73+
Description: A comma-separated list of Topic Property keys that are allowed to be exposed as metrics. Only keys explicitly listed here will be exposed.
10474

105-
Default: Empty string (if the feature is enabled but no keys are defined, no custom metric labels can be set).
106-
107-
maxCustomMetricLabelValueLength=<integer>
108-
109-
Description: The maximum character length for a custom metric label value.
110-
111-
Default: 128
75+
Default: Empty string (if the feature is enabled but no keys are defined, no custom metric labels will be exported).
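
As a rough illustration of how the comma-separated value might be turned into an allow-list, consider the sketch below. The class and method names are hypothetical, and the trimming, blank-entry, and de-duplication behavior are assumptions rather than anything this PIP specifies.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only; not part of Pulsar.
class AllowListParser {

    // Parse "key1, key2,..." into an ordered list of distinct, non-empty keys.
    // An empty or missing setting yields an empty list, so nothing is exported.
    static List<String> parseAllowedKeys(String raw) {
        if (raw == null || raw.isBlank()) {
            return List.of();
        }
        return Arrays.stream(raw.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .distinct()
                .collect(Collectors.toList());
    }
}
```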
11276

11377
### Public API
11478

115-
New pulsar-admin topics subcommands and corresponding REST API endpoints will be introduced:
116-
117-
#### Set Custom Metric Labels:
118-
119-
CLI: pulsar-admin topics set-custom-metric-labels <topic-name> --labels "key1=value1,key2=value2"
120-
121-
REST API: POST /admin/v2/topics/{tenant}/{namespace}/{topic}/custom-metric-labels with a JSON payload {"labels": {"key1":"value1", "key2":"value2"}}
122-
123-
Action: Sets or updates custom metric labels for the specified topic.
124-
125-
The broker (or admin client before sending) will validate that all provided keys (e.g., key1, key2) are present in the allowedCustomMetricLabelKeys list defined in broker.conf.
126-
127-
User cannot rewrite a label which already defined from Pulsar.
128-
129-
Invalid keys will result in an error. This operation will update the topic's policy and publish a change event to the system topic (__change_events) for that namespace.
130-
131-
#### Get Custom Metric Labels:
132-
133-
CLI: pulsar-admin topics get-custom-metric-labels <topic-name>
134-
135-
REST API: GET /admin/v2/topics/{tenant}/{namespace}/{topic}/custom-metric-labels
136-
137-
Action: Retrieves the currently set custom metric labels for the topic.
138-
139-
#### Remove Custom Metric Labels:
140-
141-
CLI:
142-
pulsar-admin topics remove-custom-metric-labels <topic-name> --labels "key1,key2" (to remove specific labels)
143-
pulsar-admin topics remove-custom-metric-labels <topic-name> --all (to remove all custom metric labels from the topic)
144-
145-
REST API: DELETE /admin/v2/topics/{tenant}/{namespace}/{topic}/custom-metric-labels with a query params keys=k1&keys=k2 or all=true.
146-
147-
Action: Removes the specified custom metric labels or all custom metric labels from the topic. This also updates the topic policy.
148-
14979
# Backward & Forward Compatibility
15080

15181
## Backward Compatibility
@@ -154,57 +84,45 @@ Disabled by Default: The feature will be disabled by default (exposeCustomTopicM
15484

15585
Existing Pulsar deployments will see no change in behavior or metric format.
15686

157-
No Impact if Unused: If the feature is enabled but allowedCustomMetricLabelKeys is not configured or no labels are set on topics, metrics will remain unchanged.
87+
No Impact if Unused: If the feature is enabled but allowedTopicPropertiesForMetrics is not configured, or no topic has a property matching an allowed key, metrics will remain unchanged.
15888

159-
Existing APIs: Existing pulsar-admin commands and REST APIs are unaffected. The new commands and endpoints are additive.
89+
Existing APIs: Existing pulsar-admin commands and REST APIs are unaffected.
16090

16191
## Forward Compatibility
16292

16393
Prometheus Systems: If a Pulsar broker with this feature enabled sends metrics with custom metric labels to an older Prometheus server or a monitoring system not expecting these additional labels, those systems will typically ignore the extra labels without issue.
16494

165-
Future Enhancements: Future Pulsar versions could extend this feature, for example, by allowing more dynamic management of allowedCustomMetricLabelKeys if deemed safe and necessary.
95+
Future Enhancements: Future Pulsar versions could extend this feature, for example, by allowing more dynamic management of allowedTopicPropertiesForMetrics if deemed safe and necessary.
16696

16797
OpenTelemetry Alignment: The key-value structure of custom metric labels aligns well with OpenTelemetry attributes, ensuring that this feature remains relevant and compatible with Pulsar's evolving metrics infrastructure.
16898

16999
# Testing Plan
170100

171101
A comprehensive testing strategy will be required:
172102

173-
Unit Tests: For new logic in pulsar-admin, REST API handlers, policy management, and metrics generation.
103+
Unit Tests: Test the filtering logic against allowedTopicPropertiesForMetrics.
174104

175105
Integration Tests:
176106

177107
Verify correct setting, getting, and removing of custom metric labels via admin tools and REST APIs.
178108

179-
Test validation logic for allowed keys, max labels per topic, and value length.
180-
181-
Ensure correct propagation of custom metric label policy changes via system topics and updates to broker policy caches.
109+
Test validation logic for allowed keys.
182110

183-
Verify that Prometheus metrics output accurately includes the custom metric labels for relevant topics and does not include them when the feature is disabled or labels are not set.
111+
End-to-End Flow: Set a property via pulsar-admin topics update-properties, scrape the metrics endpoint, and verify the label appears.
184112

185-
End-to-End Tests: Simulate a Pulsar cluster environment, set custom metric labels on topics, and scrape metrics using a Prometheus instance to confirm labels appear correctly and can be queried.
113+
Dynamic Updates: Verify that updating a property value changes the metric label value in subsequent scrapes.
186114

187-
Performance Tests: Assess any potential performance impact on brokers, particularly concerning policy updates and metrics generation, especially in environments with a large number of topics.
115+
Removal: Verify that removing a property removes the label.
188116

189117
# Documentation Plan
190118

191119
The official Apache Pulsar documentation will be updated to include:
192120

193121
Concepts Section: Explanation of the custom topic metric labels feature, its purpose, and how it helps with monitoring and alerting.
194122

195-
Administrator Guide:
196-
197-
Instructions on how to enable and configure the feature in broker.conf (e.g., exposeCustomTopicMetricLabelsEnabled, allowedCustomMetricLabelKeys, and other limits).
198-
199-
Best practices for defining allowedCustomMetricLabelKeys with cardinality management in mind.
200-
201-
User Guide / pulsar-admin Reference:
123+
Instructions on how to enable and configure the feature in broker.conf (e.g., exposeCustomTopicMetricLabelsEnabled, allowedTopicPropertiesForMetrics, and other limits).
202124

203-
Detailed syntax and examples for the new pulsar-admin topics set/get/remove-custom-metric-labels commands.
204-
205-
Guidance on choosing appropriate values for predefined keys to manage cardinality.
206-
207-
REST API Reference: Documentation for the new REST API endpoints.
125+
Best practices for defining allowedTopicPropertiesForMetrics with cardinality management in mind.
208126

209127
Monitoring Section: Notes on how these custom metric labels appear in Prometheus and how they can be used in PromQL queries, along with reminders about cardinality considerations for the Prometheus system.
210128

@@ -224,6 +142,12 @@ Description: Keeping Pulsar metrics unchanged and using Prometheus's relabel_con
224142

225143
Reason for Rejection: This shifts the implementation complexity and maintenance burden to the Prometheus configuration and external systems. It introduces risks of stale metadata and potential performance overhead on Prometheus. Crucially, it is not a Pulsar-native solution, which is the aim of this proposal.
226144

145+
C. Store labels in topic policies
146+
147+
Description: Instead of using topic properties, the custom metric labels could be stored in topic policies.
148+
149+
Reason for Rejection: Storing labels in topic policies adds complexity to the topic management and requires additional code to handle the policy updates.
150+
227151
# Links
228152

229153
* Mailing List discussion thread: https://lists.apache.org/thread/66l8cdhx5f7sv05mqfnlwc7s570frtzq
