Report per-column idle time in end-of-run metrics #695

@eric-tramel

Feature Request

Add per-column idle-time metrics to the end-of-run metrics emitted after a create run completes.

Motivation

For model-backed generation columns, it is hard to tell whether total runtime is dominated by model response latency or by time in which a column has no request in flight. When tuning generation plans, concurrency, scheduling, and model configuration, we should be able to see how much of the overall run each model-backed column spent idle, that is, not waiting on a model response.

Proposed Behavior

For each generation column that can issue model requests (likely the columns that use the model mixin / ModelConfig), report metrics such as:

  • request_wait_wall_time_s: wall-clock time during the run in which the column had at least one in-flight model request and was waiting on a response.
  • idle_time_s: wall-clock time during the run in which the column was not waiting on a model response, i.e. total_run_wall_time_s - request_wait_wall_time_s.
  • idle_pct_of_run: idle_time_s / total_run_wall_time_s.
  • Existing or related request metrics, if available, such as request count and summed request latency, should remain separate from wall-clock wait time.

The important distinction is that request_wait_wall_time_s should avoid double-counting overlapping requests for the same column. If a column has multiple concurrent requests in flight, the wall-clock wait metric should represent the union of those waiting intervals, not the sum of every individual request latency. That makes the idle calculation answer: "during how much of the total run was this column not waiting on a model response?"
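
The union-versus-sum distinction can be sketched in a few lines of Python. This is a minimal illustration, not the project's implementation: the function name `union_wall_time` and the representation of requests as `(start, end)` tuples in seconds are assumptions made for the example.

```python
from typing import Iterable, Tuple


def union_wall_time(intervals: Iterable[Tuple[float, float]]) -> float:
    """Return the total length of the union of (start, end) intervals.

    Hypothetical helper: overlapping or touching intervals are merged
    so that shared wall-clock time is counted only once.
    """
    total = 0.0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Gap before this interval: close out the previous merged span.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlap: extend the current merged span if needed.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


# Three requests for one column; the first two overlap between t=1 and t=4.
requests = [(0.0, 4.0), (1.0, 5.0), (8.0, 10.0)]

summed_latency_s = sum(end - start for start, end in requests)  # 4 + 4 + 2
request_wait_wall_time_s = union_wall_time(requests)            # [0, 5] and [8, 10]
```

Here the summed latency is 10 s while the wall-clock wait is 7 s, so in a 10 s run the column would be reported as idle for 3 s rather than 0 s.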

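A table-like end-of-run summary could look like this (illustrative values for a run with total_run_wall_time_s = 100; the headers request_wait_wall_s and idle_s abbreviate request_wait_wall_time_s and idle_time_s):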

column        model_config    request_wait_wall_s    idle_s    idle_pct_of_run    requests
prompt        gpt-4.1-mini    83.2                   16.8      16.8%              240
label         nemotron        42.5                   57.5      57.5%              120
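
A structured counterpart to this table might look like the sketch below. The names `ColumnIdleMetrics`, `build_column_idle_metrics`, and `format_row` are hypothetical and only illustrate how the proposed fields relate to each other; the existing metrics object, if there is one, may be shaped differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnIdleMetrics:
    """Hypothetical per-column record mirroring the table columns."""
    column: str
    model_config: str
    request_wait_wall_time_s: float
    idle_time_s: float
    idle_pct_of_run: float  # stored as a fraction in [0, 1], rendered as a percentage
    requests: int


def build_column_idle_metrics(
    column: str,
    model_config: str,
    request_wait_wall_time_s: float,
    total_run_wall_time_s: float,
    requests: int,
) -> ColumnIdleMetrics:
    # Idle time is whatever part of the run was not spent waiting on a response.
    idle = max(total_run_wall_time_s - request_wait_wall_time_s, 0.0)
    # Guard against a zero-length run to avoid division by zero.
    pct = idle / total_run_wall_time_s if total_run_wall_time_s > 0 else 0.0
    return ColumnIdleMetrics(
        column, model_config, request_wait_wall_time_s, idle, pct, requests
    )


def format_row(m: ColumnIdleMetrics) -> str:
    """Render one record as a left-aligned, fixed-width table row."""
    return (
        f"{m.column:<14}{m.model_config:<16}"
        f"{m.request_wait_wall_time_s:<23.1f}{m.idle_time_s:<10.1f}"
        f"{m.idle_pct_of_run:<19.1%}{m.requests}"
    )


# First row of the example table, assuming a 100 s run.
prompt_metrics = build_column_idle_metrics("prompt", "gpt-4.1-mini", 83.2, 100.0, 240)
prompt_row = format_row(prompt_metrics)
```

Keeping `idle_pct_of_run` as a fraction in the structured object and formatting it only at display time avoids mixing units between the log output and any programmatic consumer.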

Acceptance Criteria

  • End-of-run metrics after create include per-column idle metrics for columns that issue model requests.
  • The metrics are available wherever end-of-run metrics are currently surfaced, not only in logs, if a structured metrics object exists.
  • Overlapping in-flight requests for a column are handled as wall-clock intervals rather than double-counted summed latencies.
  • Non-model-backed columns are either omitted from this section or reported with an explicit not-applicable state.
  • Tests cover at least one mocked model-backed column with controlled request timing and one case with overlapping requests.
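
The test criterion could be met with an injectable clock, as in the sketch below. Everything here is an assumption for illustration: `FakeClock`, `InFlightTracker`, and its `request_started` / `request_finished` hooks are hypothetical names, and the tracker uses an in-flight counter rather than stored intervals to accumulate wall-clock wait time.

```python
class FakeClock:
    """Deterministic clock for tests; call it to read the current time."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


class InFlightTracker:
    """Hypothetical per-column tracker of wall-clock request wait time."""

    def __init__(self, clock) -> None:
        self._clock = clock
        self._in_flight = 0
        self._busy_since = None
        self.wait_wall_time_s = 0.0
        self.request_count = 0

    def request_started(self) -> None:
        # Start timing only when the column goes from idle to waiting.
        if self._in_flight == 0:
            self._busy_since = self._clock()
        self._in_flight += 1
        self.request_count += 1

    def request_finished(self) -> None:
        self._in_flight -= 1
        # Stop timing only when the last in-flight request completes.
        if self._in_flight == 0:
            self.wait_wall_time_s += self._clock() - self._busy_since
            self._busy_since = None


def test_overlapping_requests_are_not_double_counted() -> None:
    clock = FakeClock()
    tracker = InFlightTracker(clock)

    tracker.request_started()    # t=0, first request
    clock.advance(1.0)
    tracker.request_started()    # t=1, second request overlaps the first
    clock.advance(3.0)
    tracker.request_finished()   # t=4, one request still in flight
    clock.advance(1.0)
    tracker.request_finished()   # t=5, column becomes idle
    clock.advance(5.0)           # t=10, run ends

    total_run_wall_time_s = clock()
    assert tracker.request_count == 2
    assert tracker.wait_wall_time_s == 5.0  # union of [0, 4] and [1, 5]
    assert total_run_wall_time_s - tracker.wait_wall_time_s == 5.0
```

The counter approach needs no interval storage, which may matter for columns with many requests; the trade-off is that individual intervals are not available afterwards for finer breakdowns.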

Open Questions

  • Should idle_time_s and idle_pct_of_run be measured against the full run wall-clock duration, or only against the column's active window between first scheduled work and final completed work?
  • Should a later enhancement split idle time into dependency wait, scheduler/admission wait, local processing time, and no-work-complete time?
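
To make the first question concrete, the sketch below computes idle time under both candidate baselines. The function names and the `first_scheduled_s` / `last_completed_s` parameters are hypothetical; they stand in for whatever timestamps the scheduler actually records.

```python
from typing import Tuple


def idle_over_full_run(wait_s: float, run_s: float) -> Tuple[float, float]:
    """Idle seconds and idle fraction measured against the whole run."""
    idle = run_s - wait_s
    return idle, (idle / run_s if run_s > 0 else 0.0)


def idle_over_active_window(
    wait_s: float, first_scheduled_s: float, last_completed_s: float
) -> Tuple[float, float]:
    """Idle seconds and idle fraction measured against the column's active window."""
    window = last_completed_s - first_scheduled_s
    idle = window - wait_s
    return idle, (idle / window if window > 0 else 0.0)


# A column that waits on the model for 20 s in total, is first scheduled
# at t=20 and finishes at t=60, inside a 100 s run.
full_idle_s, full_idle_frac = idle_over_full_run(20.0, 100.0)
window_idle_s, window_idle_frac = idle_over_active_window(20.0, 20.0, 60.0)
```

With these numbers the column is 80% idle relative to the full run but 50% idle relative to its active window, so the two baselines can lead to different tuning conclusions for columns that start late or finish early.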

Metadata

Labels: enhancement (New feature or request)