
[Discussion] Adding energy consumption metrics to MLPerf Inference Benchmark #2558

@hongping-zh

Description


Discussion: Energy Metrics for MLPerf Inference

Context

MLPerf Inference currently reports throughput and latency metrics. As AI sustainability becomes a key concern, standardized energy efficiency metrics would complement existing benchmarks.

Observation

Through systematic benchmarking of quantized LLM inference (NF4, INT8, FP16) across NVIDIA Ada Lovelace and Blackwell architectures, we found that:

  1. Quantization's energy impact is non-trivial and model-size-dependent
  2. For models <3B parameters, NF4 quantization increases energy by 25-56%
  3. INT8 mixed-precision adds 17-33% energy overhead vs. FP16
  4. These trade-offs are not captured by throughput/latency alone
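The overhead figures above reduce to a ratio of measured energies for a fixed workload. A minimal sketch of that arithmetic, with hypothetical per-query energy numbers (not from our measurements):

```python
def energy_overhead_pct(e_variant_j: float, e_baseline_j: float) -> float:
    """Percent extra energy a quantized variant uses vs. an FP16 baseline,
    for the same workload (same queries, same generated tokens)."""
    return 100.0 * (e_variant_j - e_baseline_j) / e_baseline_j

# Hypothetical example: NF4 uses 1.4 J/query where FP16 uses 1.0 J/query,
# i.e. a 40% overhead, within the 25-56% range observed for <3B models.
print(round(energy_overhead_pct(1.4, 1.0), 1))
```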

Suggestion

Consider adding optional energy reporting to the MLPerf Inference benchmark:

  • Energy per query/token (J)
  • Average power draw (W)
  • Energy efficiency (tokens/J)

This would enable apples-to-apples energy comparison across hardware and quantization configurations.
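All three proposed metrics can be derived from timestamped power samples (e.g. polled via nvidia-smi or NVML) plus a token count. A minimal sketch, assuming `(seconds, watts)` samples; the function names are illustrative, not part of any MLPerf API:

```python
def energy_joules(samples: list[tuple[float, float]]) -> float:
    """Integrate (t_seconds, watts) samples with the trapezoidal rule."""
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        total += 0.5 * (p0 + p1) * (t1 - t0)
    return total

def energy_report(samples: list[tuple[float, float]], tokens: int) -> dict:
    """Compute the three proposed metrics for one benchmark run."""
    e = energy_joules(samples)
    duration = samples[-1][0] - samples[0][0]
    return {
        "energy_per_token_J": e / tokens,  # J per generated token
        "avg_power_W": e / duration,       # mean power over the run
        "tokens_per_J": tokens / e,        # energy efficiency
    }

# Example: a constant 200 W draw over 10 s while generating 500 tokens
# yields 2000 J total, i.e. 4 J/token and 0.25 tokens/J.
samples = [(float(t), 200.0) for t in range(11)]
print(energy_report(samples, 500))
```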
