Draft
Initial implementation exposing transceiver metrics, error rates, temperatures, LLDP neighbors, static metadata.
Replace 20 individual counter metrics for packet size buckets with two native Prometheus histograms (RX/TX). This maps SAI port stat fields to cumulative histogram buckets via a new `histogram` transform in the metrics config.
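A minimal sketch of the cumulative mapping described above, assuming the source counters are per-size-range packet counts. The function name, the dictionary layout, and the sample numbers are illustrative only; they are not taken from the exporter's code or config, and the actual `histogram` transform may work differently.

```python
from itertools import accumulate
from math import inf


def to_cumulative(per_bucket):
    """Turn {upper_bound: packets_in_that_range} into cumulative (le, count) pairs.

    Prometheus histogram buckets are cumulative: the bucket with le="255"
    counts every packet of size <= 255, not only those in the 128-255 range.
    """
    bounds = sorted(per_bucket)
    running = list(accumulate(per_bucket[b] for b in bounds))
    total = running[-1] if running else 0
    # The +Inf bucket always equals the total observation count.
    return list(zip(bounds, running)) + [(inf, total)]


# Hypothetical per-range RX counters, keyed by the upper bound in bytes.
rx_ranges = {64: 0, 127: 925589, 255: 721, 511: 41183, 1023: 212, 1518: 394547}

for le, count in to_cumulative(rx_ranges):
    print(f'le="{le}" {count}')
```

Because the buckets are cumulative, each bucket value is never smaller than the one before it, which is a cheap sanity check on the mapping.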
Proposed Changes
Implement the switch metrics on a Prometheus endpoint `:9100/metrics`, as described in EP-0003.
Notes to Reviewers
There are a few metrics that are borderline useful. Opinions are welcome:
`Ethernet124` as interface. Each active SFP module will create such a set of metrics (32 or 52 ports respectively)!
Fortunately, all of these are "simple" config entries that we can test-drive and remove later if they turn out not to be useful or to use too much metrics storage.
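To put a rough number on the storage concern, here is a small sketch that counts the series one interface contributes for a given metric prefix and scales it by the port count. It assumes the set in question is the `sonic_switch_transceiver_*` family; the helper name and the shortened sample text are hypothetical.

```python
def count_series(exposition, prefix, interface):
    """Count sample lines for one interface whose metric name starts with prefix.

    Lines starting with '#' (HELP/TYPE) never match because the prefix is a
    metric name, so only actual series are counted.
    """
    needle = f'interface="{interface}"'
    return sum(
        1
        for line in exposition.splitlines()
        if line.startswith(prefix) and needle in line
    )


# Shortened, illustrative excerpt of a scrape (not the full output).
SAMPLE = """\
# TYPE sonic_switch_transceiver_dom_voltage_volts gauge
sonic_switch_transceiver_dom_voltage_volts{interface="Ethernet124"} 3.264
sonic_switch_transceiver_rxlos{interface="Ethernet124",lane="1"} 0
sonic_switch_transceiver_rxlos{interface="Ethernet124",lane="2"} 0
sonic_switch_interface_admin_state{interface="Ethernet124"} 1
"""

per_port = count_series(SAMPLE, "sonic_switch_transceiver_", "Ethernet124")
for ports in (32, 52):
    print(ports, "ports ->", per_port * ports, "series")
```

Counted by hand in the `Ethernet124` example output in this description, the transceiver prefix matches 39 sample lines. If every port carried an active four-lane module, that would be about 1,248 series on a 32-port switch and 2,028 on a 52-port one.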
Example Prometheus Metrics Output
Example metrics, retrieved on `swi1-wdf4g-3`, limited to interface `Ethernet124` and non-interface metrics (long list):
admin@swi1-wdf4g-3:~$ curl -s localhost:9100/metrics | awk '!/Ethernet/ || /Ethernet124/'
# HELP sonic_scrape_duration_seconds Duration of the last metrics scrape in seconds
# TYPE sonic_scrape_duration_seconds gauge
sonic_scrape_duration_seconds 0.043446274
# HELP sonic_switch_info Device metadata as labels, always 1
# TYPE sonic_switch_info gauge
sonic_switch_info{asic="broadcom",firmware="11",hwsku="Accton-AS7726-32X",mac="94:ef:97:94:5e:52",platform="x86_64-accton_as7726_32x-r0"} 1
# HELP sonic_switch_interface_admin_state Admin state of the interface (1=up, 0=down)
# TYPE sonic_switch_interface_admin_state gauge
sonic_switch_interface_admin_state{interface="Ethernet124"} 1
# HELP sonic_switch_interface_anomaly_packets_total Total anomalous packets
# TYPE sonic_switch_interface_anomaly_packets_total counter
sonic_switch_interface_anomaly_packets_total{interface="Ethernet124",type="fragments"} 0
sonic_switch_interface_anomaly_packets_total{interface="Ethernet124",type="jabbers"} 0
sonic_switch_interface_anomaly_packets_total{interface="Ethernet124",type="rx_oversize"} 0
sonic_switch_interface_anomaly_packets_total{interface="Ethernet124",type="tx_oversize"} 0
sonic_switch_interface_anomaly_packets_total{interface="Ethernet124",type="undersize"} 0
sonic_switch_interface_anomaly_packets_total{interface="Ethernet124",type="unknown_protos"} 0
# HELP sonic_switch_interface_bytes_total Total bytes transferred
# TYPE sonic_switch_interface_bytes_total counter
sonic_switch_interface_bytes_total{direction="rx",interface="Ethernet124"} 6.57158677e+08
sonic_switch_interface_bytes_total{direction="tx",interface="Ethernet124"} 1.2311851e+08
# HELP sonic_switch_interface_discards_total Total interface discards
# TYPE sonic_switch_interface_discards_total counter
sonic_switch_interface_discards_total{direction="rx",interface="Ethernet124"} 46588
sonic_switch_interface_discards_total{direction="tx",interface="Ethernet124"} 0
# HELP sonic_switch_interface_dropped_packets_total Total SAI-level dropped packets
# TYPE sonic_switch_interface_dropped_packets_total counter
sonic_switch_interface_dropped_packets_total{direction="rx",interface="Ethernet124"} 0
sonic_switch_interface_dropped_packets_total{direction="tx",interface="Ethernet124"} 0
# HELP sonic_switch_interface_errors_total Total interface errors
# TYPE sonic_switch_interface_errors_total counter
sonic_switch_interface_errors_total{direction="rx",interface="Ethernet124"} 0
sonic_switch_interface_errors_total{direction="tx",interface="Ethernet124"} 0
# HELP sonic_switch_interface_fec_frames_total Total FEC frames
# TYPE sonic_switch_interface_fec_frames_total counter
sonic_switch_interface_fec_frames_total{interface="Ethernet124",type="correctable"} 0
sonic_switch_interface_fec_frames_total{interface="Ethernet124",type="symbol_errors"} 0
sonic_switch_interface_fec_frames_total{interface="Ethernet124",type="uncorrectable"} 0
# HELP sonic_switch_interface_neighbor_info LLDP neighbor metadata as labels, always 1
# TYPE sonic_switch_interface_neighbor_info gauge
sonic_switch_interface_neighbor_info{interface="Ethernet124",neighbor_mac="94:ef:97:94:21:42",neighbor_name="swi2-wdf4g-2",neighbor_port="Ethernet8"} 1
# HELP sonic_switch_interface_oper_state Operational state of the interface (1=up, 0=down)
# TYPE sonic_switch_interface_oper_state gauge
sonic_switch_interface_oper_state{interface="Ethernet124"} 1
# HELP sonic_switch_interface_packets_total Total packets transferred
# TYPE sonic_switch_interface_packets_total counter
sonic_switch_interface_packets_total{direction="rx",interface="Ethernet124",type="broadcast"} 0
sonic_switch_interface_packets_total{direction="rx",interface="Ethernet124",type="multicast"} 162981
sonic_switch_interface_packets_total{direction="rx",interface="Ethernet124",type="non_unicast"} 162981
sonic_switch_interface_packets_total{direction="rx",interface="Ethernet124",type="unicast"} 1.199271e+06
sonic_switch_interface_packets_total{direction="tx",interface="Ethernet124",type="broadcast"} 0
sonic_switch_interface_packets_total{direction="tx",interface="Ethernet124",type="multicast"} 162982
sonic_switch_interface_packets_total{direction="tx",interface="Ethernet124",type="non_unicast"} 162982
sonic_switch_interface_packets_total{direction="tx",interface="Ethernet124",type="unicast"} 953331
# HELP sonic_switch_interface_pfc_packets_total Total PFC packets
# TYPE sonic_switch_interface_pfc_packets_total counter
sonic_switch_interface_pfc_packets_total{direction="rx",interface="Ethernet124",priority="0"} 0
sonic_switch_interface_pfc_packets_total{direction="rx",interface="Ethernet124",priority="1"} 0
sonic_switch_interface_pfc_packets_total{direction="rx",interface="Ethernet124",priority="2"} 0
sonic_switch_interface_pfc_packets_total{direction="rx",interface="Ethernet124",priority="3"} 0
sonic_switch_interface_pfc_packets_total{direction="rx",interface="Ethernet124",priority="4"} 0
sonic_switch_interface_pfc_packets_total{direction="rx",interface="Ethernet124",priority="5"} 0
sonic_switch_interface_pfc_packets_total{direction="rx",interface="Ethernet124",priority="6"} 0
sonic_switch_interface_pfc_packets_total{direction="rx",interface="Ethernet124",priority="7"} 0
sonic_switch_interface_pfc_packets_total{direction="tx",interface="Ethernet124",priority="0"} 0
sonic_switch_interface_pfc_packets_total{direction="tx",interface="Ethernet124",priority="1"} 0
sonic_switch_interface_pfc_packets_total{direction="tx",interface="Ethernet124",priority="2"} 0
sonic_switch_interface_pfc_packets_total{direction="tx",interface="Ethernet124",priority="3"} 0
sonic_switch_interface_pfc_packets_total{direction="tx",interface="Ethernet124",priority="4"} 0
sonic_switch_interface_pfc_packets_total{direction="tx",interface="Ethernet124",priority="5"} 0
sonic_switch_interface_pfc_packets_total{direction="tx",interface="Ethernet124",priority="6"} 0
sonic_switch_interface_pfc_packets_total{direction="tx",interface="Ethernet124",priority="7"} 0
# HELP sonic_switch_interface_queue_length Current output queue length
# TYPE sonic_switch_interface_queue_length gauge
sonic_switch_interface_queue_length{interface="Ethernet124"} 0
# HELP sonic_switch_interface_rx_packet_size_bytes RX packet size distribution
# TYPE sonic_switch_interface_rx_packet_size_bytes histogram
sonic_switch_interface_rx_packet_size_bytes_bucket{interface="Ethernet124",le="64"} 0
sonic_switch_interface_rx_packet_size_bytes_bucket{interface="Ethernet124",le="127"} 925589
sonic_switch_interface_rx_packet_size_bytes_bucket{interface="Ethernet124",le="255"} 926310
sonic_switch_interface_rx_packet_size_bytes_bucket{interface="Ethernet124",le="511"} 967493
sonic_switch_interface_rx_packet_size_bytes_bucket{interface="Ethernet124",le="1023"} 967705
sonic_switch_interface_rx_packet_size_bytes_bucket{interface="Ethernet124",le="1518"} 1.362252e+06
sonic_switch_interface_rx_packet_size_bytes_bucket{interface="Ethernet124",le="2047"} 1.362252e+06
sonic_switch_interface_rx_packet_size_bytes_bucket{interface="Ethernet124",le="4095"} 1.362252e+06
sonic_switch_interface_rx_packet_size_bytes_bucket{interface="Ethernet124",le="9216"} 1.362252e+06
sonic_switch_interface_rx_packet_size_bytes_bucket{interface="Ethernet124",le="16383"} 1.362252e+06
sonic_switch_interface_rx_packet_size_bytes_bucket{interface="Ethernet124",le="+Inf"} 1.362252e+06
sonic_switch_interface_rx_packet_size_bytes_sum{interface="Ethernet124"} 0
sonic_switch_interface_rx_packet_size_bytes_count{interface="Ethernet124"} 1.362252e+06
# HELP sonic_switch_interface_tx_packet_size_bytes TX packet size distribution
# TYPE sonic_switch_interface_tx_packet_size_bytes histogram
sonic_switch_interface_tx_packet_size_bytes_bucket{interface="Ethernet124",le="64"} 1
sonic_switch_interface_tx_packet_size_bytes_bucket{interface="Ethernet124",le="127"} 1.063977e+06
sonic_switch_interface_tx_packet_size_bytes_bucket{interface="Ethernet124",le="255"} 1.075338e+06
sonic_switch_interface_tx_packet_size_bytes_bucket{interface="Ethernet124",le="511"} 1.116235e+06
sonic_switch_interface_tx_packet_size_bytes_bucket{interface="Ethernet124",le="1023"} 1.116245e+06
sonic_switch_interface_tx_packet_size_bytes_bucket{interface="Ethernet124",le="1518"} 1.116313e+06
sonic_switch_interface_tx_packet_size_bytes_bucket{interface="Ethernet124",le="2047"} 1.116313e+06
sonic_switch_interface_tx_packet_size_bytes_bucket{interface="Ethernet124",le="4095"} 1.116313e+06
sonic_switch_interface_tx_packet_size_bytes_bucket{interface="Ethernet124",le="9216"} 1.116313e+06
sonic_switch_interface_tx_packet_size_bytes_bucket{interface="Ethernet124",le="16383"} 1.116313e+06
sonic_switch_interface_tx_packet_size_bytes_bucket{interface="Ethernet124",le="+Inf"} 1.116313e+06
sonic_switch_interface_tx_packet_size_bytes_sum{interface="Ethernet124"} 0
sonic_switch_interface_tx_packet_size_bytes_count{interface="Ethernet124"} 1.116313e+06
# HELP sonic_switch_interfaces_total Number of interfaces by operational status
# TYPE sonic_switch_interfaces_total gauge
sonic_switch_interfaces_total{operational_status="down"} 30
sonic_switch_interfaces_total{operational_status="up"} 2
# HELP sonic_switch_ports_total Total number of physical ports
# TYPE sonic_switch_ports_total gauge
sonic_switch_ports_total 32
# HELP sonic_switch_ready Whether the switch is ready (1) or not (0)
# TYPE sonic_switch_ready gauge
sonic_switch_ready 1
# HELP sonic_switch_temperature_celsius Chassis temperature sensor reading in Celsius
# TYPE sonic_switch_temperature_celsius gauge
sonic_switch_temperature_celsius{sensor="CB_temp(0x4B)"} 27
sonic_switch_temperature_celsius{sensor="CPU_Core_0_temp"} 37
sonic_switch_temperature_celsius{sensor="CPU_Core_1_temp"} 37
sonic_switch_temperature_celsius{sensor="CPU_Core_2_temp"} 37
sonic_switch_temperature_celsius{sensor="CPU_Core_3_temp"} 37
sonic_switch_temperature_celsius{sensor="CPU_Package_temp"} 37
sonic_switch_temperature_celsius{sensor="FB_temp(0x4C)"} 32.5
sonic_switch_temperature_celsius{sensor="MB_FrontMAC_temp(0x49)"} 33.5
sonic_switch_temperature_celsius{sensor="MB_LeftCenter_temp(0x4A)"} 28
sonic_switch_temperature_celsius{sensor="MB_RearMAC_temp(0x48)"} 34.5
sonic_switch_temperature_celsius{sensor="PSU-1 temp sensor 1"} 46
sonic_switch_temperature_celsius{sensor="PSU-2 temp sensor 1"} 44
# HELP sonic_switch_temperature_high_threshold_celsius Chassis temperature sensor high threshold in Celsius
# TYPE sonic_switch_temperature_high_threshold_celsius gauge
sonic_switch_temperature_high_threshold_celsius{sensor="CB_temp(0x4B)"} 80
sonic_switch_temperature_high_threshold_celsius{sensor="CPU_Core_0_temp"} 82
sonic_switch_temperature_high_threshold_celsius{sensor="CPU_Core_1_temp"} 82
sonic_switch_temperature_high_threshold_celsius{sensor="CPU_Core_2_temp"} 82
sonic_switch_temperature_high_threshold_celsius{sensor="CPU_Core_3_temp"} 82
sonic_switch_temperature_high_threshold_celsius{sensor="CPU_Package_temp"} 82
sonic_switch_temperature_high_threshold_celsius{sensor="FB_temp(0x4C)"} 80
sonic_switch_temperature_high_threshold_celsius{sensor="MB_FrontMAC_temp(0x49)"} 80
sonic_switch_temperature_high_threshold_celsius{sensor="MB_LeftCenter_temp(0x4A)"} 80
sonic_switch_temperature_high_threshold_celsius{sensor="MB_RearMAC_temp(0x48)"} 80
sonic_switch_temperature_high_threshold_celsius{sensor="PSU-1 temp sensor 1"} 80
sonic_switch_temperature_high_threshold_celsius{sensor="PSU-2 temp sensor 1"} 80
# HELP sonic_switch_temperature_warning Chassis temperature sensor warning status (1=warning, 0=ok)
# TYPE sonic_switch_temperature_warning gauge
sonic_switch_temperature_warning{sensor="CB_temp(0x4B)"} 0
sonic_switch_temperature_warning{sensor="CPU_Core_0_temp"} 0
sonic_switch_temperature_warning{sensor="CPU_Core_1_temp"} 0
sonic_switch_temperature_warning{sensor="CPU_Core_2_temp"} 0
sonic_switch_temperature_warning{sensor="CPU_Core_3_temp"} 0
sonic_switch_temperature_warning{sensor="CPU_Package_temp"} 0
sonic_switch_temperature_warning{sensor="FB_temp(0x4C)"} 0
sonic_switch_temperature_warning{sensor="MB_FrontMAC_temp(0x49)"} 0
sonic_switch_temperature_warning{sensor="MB_LeftCenter_temp(0x4A)"} 0
sonic_switch_temperature_warning{sensor="MB_RearMAC_temp(0x48)"} 0
sonic_switch_temperature_warning{sensor="PSU-1 temp sensor 1"} 0
sonic_switch_temperature_warning{sensor="PSU-2 temp sensor 1"} 0
# HELP sonic_switch_transceiver_dom_rx_power_dbm Transceiver RX power in dBm
# TYPE sonic_switch_transceiver_dom_rx_power_dbm gauge
sonic_switch_transceiver_dom_rx_power_dbm{interface="Ethernet124",lane="1"} -0.6248210798265337
sonic_switch_transceiver_dom_rx_power_dbm{interface="Ethernet124",lane="2"} 0.19116290447072778
sonic_switch_transceiver_dom_rx_power_dbm{interface="Ethernet124",lane="3"} 0.29789470831855613
sonic_switch_transceiver_dom_rx_power_dbm{interface="Ethernet124",lane="4"} -0.24108863598207259
# HELP sonic_switch_transceiver_dom_temperature_celsius Transceiver temperature in Celsius
# TYPE sonic_switch_transceiver_dom_temperature_celsius gauge
sonic_switch_transceiver_dom_temperature_celsius{interface="Ethernet124"} 37.645
# HELP sonic_switch_transceiver_dom_threshold Transceiver DOM threshold value
# TYPE sonic_switch_transceiver_dom_threshold gauge
sonic_switch_transceiver_dom_threshold{direction="high",interface="Ethernet124",level="alarm",sensor="rx_power"} 3.5
sonic_switch_transceiver_dom_threshold{direction="high",interface="Ethernet124",level="alarm",sensor="temperature"} 75
sonic_switch_transceiver_dom_threshold{direction="high",interface="Ethernet124",level="alarm",sensor="tx_bias"} 90
sonic_switch_transceiver_dom_threshold{direction="high",interface="Ethernet124",level="alarm",sensor="tx_power"} 3.5
sonic_switch_transceiver_dom_threshold{direction="high",interface="Ethernet124",level="alarm",sensor="voltage"} 3.63
sonic_switch_transceiver_dom_threshold{direction="high",interface="Ethernet124",level="warning",sensor="rx_power"} 2.5
sonic_switch_transceiver_dom_threshold{direction="high",interface="Ethernet124",level="warning",sensor="temperature"} 70
sonic_switch_transceiver_dom_threshold{direction="high",interface="Ethernet124",level="warning",sensor="tx_bias"} 80
sonic_switch_transceiver_dom_threshold{direction="high",interface="Ethernet124",level="warning",sensor="tx_power"} 2.5
sonic_switch_transceiver_dom_threshold{direction="high",interface="Ethernet124",level="warning",sensor="voltage"} 3.465
sonic_switch_transceiver_dom_threshold{direction="low",interface="Ethernet124",level="alarm",sensor="rx_power"} -12.503
sonic_switch_transceiver_dom_threshold{direction="low",interface="Ethernet124",level="alarm",sensor="temperature"} -5
sonic_switch_transceiver_dom_threshold{direction="low",interface="Ethernet124",level="alarm",sensor="tx_bias"} 10
sonic_switch_transceiver_dom_threshold{direction="low",interface="Ethernet124",level="alarm",sensor="tx_power"} -7.501
sonic_switch_transceiver_dom_threshold{direction="low",interface="Ethernet124",level="alarm",sensor="voltage"} 3.05
sonic_switch_transceiver_dom_threshold{direction="low",interface="Ethernet124",level="warning",sensor="rx_power"} -11.5
sonic_switch_transceiver_dom_threshold{direction="low",interface="Ethernet124",level="warning",sensor="temperature"} 0
sonic_switch_transceiver_dom_threshold{direction="low",interface="Ethernet124",level="warning",sensor="tx_bias"} 20
sonic_switch_transceiver_dom_threshold{direction="low",interface="Ethernet124",level="warning",sensor="tx_power"} -6.499
sonic_switch_transceiver_dom_threshold{direction="low",interface="Ethernet124",level="warning",sensor="voltage"} 3.135
# HELP sonic_switch_transceiver_dom_tx_bias_milliamps Transceiver TX bias current in milliamps
# TYPE sonic_switch_transceiver_dom_tx_bias_milliamps gauge
sonic_switch_transceiver_dom_tx_bias_milliamps{interface="Ethernet124",lane="1"} 50.564
sonic_switch_transceiver_dom_tx_bias_milliamps{interface="Ethernet124",lane="2"} 51.556
sonic_switch_transceiver_dom_tx_bias_milliamps{interface="Ethernet124",lane="3"} 51.828
sonic_switch_transceiver_dom_tx_bias_milliamps{interface="Ethernet124",lane="4"} 53.384
# HELP sonic_switch_transceiver_dom_voltage_volts Transceiver supply voltage in Volts
# TYPE sonic_switch_transceiver_dom_voltage_volts gauge
sonic_switch_transceiver_dom_voltage_volts{interface="Ethernet124"} 3.264
# HELP sonic_switch_transceiver_info Transceiver static metadata as labels, always 1
# TYPE sonic_switch_transceiver_info gauge
sonic_switch_transceiver_info{interface="Ethernet124",model="S-QSFP-100G-CWDM",serial="F7Z2G4L ",type="QSFP28 or later",vendor="SWITCH2OPEN "} 1
# HELP sonic_switch_transceiver_rxlos Transceiver RX loss of signal (1=loss, 0=ok)
# TYPE sonic_switch_transceiver_rxlos gauge
sonic_switch_transceiver_rxlos{interface="Ethernet124",lane="1"} 0
sonic_switch_transceiver_rxlos{interface="Ethernet124",lane="2"} 0
sonic_switch_transceiver_rxlos{interface="Ethernet124",lane="3"} 0
sonic_switch_transceiver_rxlos{interface="Ethernet124",lane="4"} 0
# HELP sonic_switch_transceiver_txfault Transceiver TX fault (1=fault, 0=ok)
# TYPE sonic_switch_transceiver_txfault gauge
sonic_switch_transceiver_txfault{interface="Ethernet124",lane="1"} 0
sonic_switch_transceiver_txfault{interface="Ethernet124",lane="2"} 0
sonic_switch_transceiver_txfault{interface="Ethernet124",lane="3"} 0
sonic_switch_transceiver_txfault{interface="Ethernet124",lane="4"} 0
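As an illustration of how the DOM readings and the `sonic_switch_transceiver_dom_threshold` series fit together, the following sketch parses a few lines of the output and checks each RX power lane against the warning thresholds. The parser is a simplified stand-in written for this example (it only handles labelled sample lines and is not a full exposition-format parser), and the function names are hypothetical.

```python
import re

# Four lines copied from the example output; everything else is omitted.
SAMPLE = """\
sonic_switch_transceiver_dom_rx_power_dbm{interface="Ethernet124",lane="1"} -0.6248210798265337
sonic_switch_transceiver_dom_rx_power_dbm{interface="Ethernet124",lane="2"} 0.19116290447072778
sonic_switch_transceiver_dom_threshold{direction="high",interface="Ethernet124",level="warning",sensor="rx_power"} 2.5
sonic_switch_transceiver_dom_threshold{direction="low",interface="Ethernet124",level="warning",sensor="rx_power"} -11.5
"""

SAMPLE_LINE = re.compile(r'^(\w+)\{(.*)\}\s+(\S+)$')
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse(text):
    """Yield (name, labels, value) for labelled sample lines; other lines are skipped."""
    for line in text.splitlines():
        match = SAMPLE_LINE.match(line.strip())
        if match:
            name, labels, value = match.groups()
            yield name, dict(LABEL.findall(labels)), float(value)


def rx_power_within_warning(text):
    """Map (interface, lane) to True if RX power lies inside the warning range."""
    samples = list(parse(text))
    limits = {}
    for name, labels, value in samples:
        if (name == "sonic_switch_transceiver_dom_threshold"
                and labels.get("level") == "warning"
                and labels.get("sensor") == "rx_power"):
            limits[(labels["interface"], labels["direction"])] = value
    result = {}
    for name, labels, value in samples:
        if name == "sonic_switch_transceiver_dom_rx_power_dbm":
            iface = labels["interface"]
            # Raises KeyError if a threshold series is missing for this interface.
            low, high = limits[(iface, "low")], limits[(iface, "high")]
            result[(iface, labels["lane"])] = low <= value <= high
    return result


print(rx_power_within_warning(SAMPLE))
```

In a real deployment the same comparison would more naturally be written as a PromQL alerting rule that joins the reading with its threshold series; the Python version only shows how the label sets line up.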