Report per-column idle time in end-of-run metrics

## Feature Request

Add per-column idle-time metrics to the end-of-run metrics emitted after a `create` run completes.

## Motivation

For model-backed generation columns, it is hard to tell whether total runtime is dominated by model response latency or by time when a column has no in-flight request to wait on. When tuning generation plans, concurrency, scheduling, and model configuration, we should be able to see how much of the overall run each model-backed column spent idle, meaning not waiting on a model response.

## Proposed Behavior

For each generation column that can issue model requests (likely the columns that use the model mixin / `ModelConfig`), report metrics such as:

- `request_wait_wall_time_s`: wall-clock time during the run where the column had at least one in-flight model request and was waiting on a response.
- `idle_time_s`: wall-clock time during the run where the column was not waiting on a model response, i.e. `total_run_wall_time_s - request_wait_wall_time_s`.
- `idle_pct_of_run`: `idle_time_s / total_run_wall_time_s`, reported as a percentage.
- Existing or related request metrics, if available, such as request count and summed request latency, should remain separate from wall-clock wait time.

The important distinction is that `request_wait_wall_time_s` should avoid double-counting overlapping requests for the same column. If a column has multiple concurrent requests in flight, the wall-clock wait metric should represent the union of those waiting intervals, not the sum of every individual request latency. That makes the idle calculation answer: "during how much of the total run was this column not waiting on a model response?"
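The distinction between union wall-clock wait and summed latency can be sketched as follows. This is only an illustration under assumptions: the function names `union_wall_time` and `column_idle_metrics`, the `(start, end)` interval representation, and the `summed_request_latency_s` key are hypothetical and not part of the existing codebase.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # (start_s, end_s) of one request, hypothetical representation


def union_wall_time(intervals: List[Interval]) -> float:
    """Return the wall-clock time covered by the union of the intervals."""
    total = 0.0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Gap before this interval: close out the previous merged span.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlaps the current merged span: extend it instead of adding.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def column_idle_metrics(intervals: List[Interval], total_run_wall_time_s: float) -> Dict[str, float]:
    """Compute the proposed per-column metrics from request intervals."""
    wait = union_wall_time(intervals)
    idle = total_run_wall_time_s - wait
    pct = 100.0 * idle / total_run_wall_time_s if total_run_wall_time_s > 0 else 0.0
    return {
        "request_wait_wall_time_s": wait,
        "idle_time_s": idle,
        "idle_pct_of_run": pct,
        "request_count": len(intervals),
        # Kept separate on purpose: this double-counts overlapping requests.
        "summed_request_latency_s": sum(end - start for start, end in intervals),
    }


# Two overlapping requests plus one isolated request in a 100 s run.
example = column_idle_metrics([(0.0, 10.0), (5.0, 15.0), (20.0, 25.0)], 100.0)
```

In this example the summed latency is 25 s, but the column was only waiting for 20 s of wall-clock time, so idle time is 80 s rather than 75 s.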

A table-like end-of-run summary could look like:

```text
column        model_config    request_wait_wall_s    idle_s    idle_pct_of_run    requests
prompt        gpt-4.1-mini    83.2                   16.8      16.8%              240
label         nemotron        42.5                   57.5      57.5%              120
```

## Acceptance Criteria

- End-of-run metrics after `create` include per-column idle metrics for columns that issue model requests.
- The metrics are available wherever end-of-run metrics are currently surfaced, not only in logs, if a structured metrics object exists.
- Overlapping in-flight requests for a column are handled as wall-clock intervals rather than double-counted summed latencies.
- Non-model-backed columns are either omitted from this section or reported with an explicit not-applicable state.
- Tests cover at least one mocked model-backed column with controlled request timing and one case with overlapping requests.
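One way the overlapping-request test could be set up is with an in-flight counter driven by a fake clock, so timing is fully controlled. This is a sketch only: `WaitTracker`, `FakeClock`, and their methods are hypothetical names, and a real implementation would hook into wherever the column actually issues model requests.

```python
class FakeClock:
    """Deterministic clock for tests; time only moves when advance() is called."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


class WaitTracker:
    """Accumulates wall-clock time during which at least one request is in flight."""

    def __init__(self, clock) -> None:
        self._clock = clock
        self._in_flight = 0
        self._busy_since = None
        self.request_wait_wall_time_s = 0.0
        self.request_count = 0

    def request_started(self) -> None:
        if self._in_flight == 0:
            # Transition from idle to waiting: open a new wait span.
            self._busy_since = self._clock()
        self._in_flight += 1
        self.request_count += 1

    def request_finished(self) -> None:
        self._in_flight -= 1
        if self._in_flight == 0:
            # Last in-flight request completed: close the wait span.
            self.request_wait_wall_time_s += self._clock() - self._busy_since
            self._busy_since = None


def run_overlap_scenario() -> WaitTracker:
    """Requests A (0-10 s) and B (5-15 s) overlap; C (20-25 s) runs alone."""
    clock = FakeClock()
    tracker = WaitTracker(clock)
    tracker.request_started()   # A starts at t=0
    clock.advance(5)
    tracker.request_started()   # B starts at t=5
    clock.advance(5)
    tracker.request_finished()  # A ends at t=10
    clock.advance(5)
    tracker.request_finished()  # B ends at t=15
    clock.advance(5)
    tracker.request_started()   # C starts at t=20
    clock.advance(5)
    tracker.request_finished()  # C ends at t=25
    return tracker


tracker = run_overlap_scenario()
```

A test built on this scenario would assert that the wait time is 20 s (not the 25 s summed latency) and that three requests were counted.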

## Open Questions

- Should `idle_time_s` and `idle_pct_of_run` be measured against the full run wall-clock duration, or only against the column's active window between first scheduled work and final completed work?
- Should a later enhancement split idle time into dependency wait, scheduler/admission wait, local processing time, and no-work-complete time?


Report per-column idle time in end-of-run metrics #695

Feature Request

Motivation

Proposed Behavior

Acceptance Criteria

Open Questions

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Report per-column idle time in end-of-run metrics #695

Description

Feature Request

Motivation

Proposed Behavior

Acceptance Criteria

Open Questions

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions