eval crashes when scores are loaded but recompute is still needed #1880

@s6sebusc

Description

What happened?

calc_scores_per_stream expects an argument metrics_dict that maps regions to metrics. In run_evaluation, it is used as follows:

metrics_to_compute = recomputable_metrics if recomputable_metrics else metrics
stream_computed_scores = calc_scores_per_stream(
    reader, stream, regions_to_compute, metrics_to_compute, plot_score_maps
)

If scores were loaded from .json, recomputable_metrics is an empty dict, so calc_scores_per_stream is called with metrics_dict=metrics instead. But metrics is the plain evaluation.metrics list (an omegaconf ListConfig), not a dict mapping regions to metrics, so indexing it with a region name crashes:

Traceback (most recent call last):
  File "/p/home/jusers/buschow1/juwels/work/WeatherGenerator/.venv/bin/evaluate", line 10, in <module>
    sys.exit(evaluate())
             ^^^^^^^^^^
  File "/p/home/jusers/buschow1/juwels/work/WeatherGenerator/packages/evaluate/src/weathergen/evaluate/run_evaluation.py", line 105, in evaluate
    evaluate_from_args(sys.argv[1:], log_queue)
  File "/p/home/jusers/buschow1/juwels/work/WeatherGenerator/packages/evaluate/src/weathergen/evaluate/run_evaluation.py", line 157, in evaluate_from_args
    evaluate_from_config(cf, mlflow_client, log_queue)
  File "/p/home/jusers/buschow1/juwels/work/WeatherGenerator/packages/evaluate/src/weathergen/evaluate/run_evaluation.py", line 340, in evaluate_from_config
    results = [_process_stream(**task) for task in tasks]
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "/p/home/jusers/buschow1/juwels/work/WeatherGenerator/packages/evaluate/src/weathergen/evaluate/run_evaluation.py", line 255, in _process_stream
    stream_computed_scores = calc_scores_per_stream(
                             ^^^^^^^^^^^^^^^^^^^^^^^
  File "/p/home/jusers/buschow1/juwels/work/WeatherGenerator/packages/evaluate/src/weathergen/evaluate/utils/utils.py", line 128, in calc_scores_per_stream
    metrics = metrics_dict[region]
              ~~~~~~~~~~~~^^^^^^^^
  File "/p/home/jusers/buschow1/juwels/work/WeatherGenerator/.venv/lib/python3.12/site-packages/omegaconf/listconfig.py", line 218, in __getitem__
    self._format_and_raise(key=index, value=None, cause=e)
  File "/p/home/jusers/buschow1/juwels/work/WeatherGenerator/.venv/lib/python3.12/site-packages/omegaconf/base.py", line 231, in _format_and_raise
    format_and_raise(
  File "/p/home/jusers/buschow1/juwels/work/WeatherGenerator/.venv/lib/python3.12/site-packages/omegaconf/_utils.py", line 899, in format_and_raise
    _raise(ex, cause)
  File "/p/home/jusers/buschow1/juwels/work/WeatherGenerator/.venv/lib/python3.12/site-packages/omegaconf/_utils.py", line 797, in _raise
    raise ex.with_traceback(sys.exc_info()[2])  # set env var OC_CAUSE=1 for full trace
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/p/home/jusers/buschow1/juwels/work/WeatherGenerator/.venv/lib/python3.12/site-packages/omegaconf/listconfig.py", line 193, in __getitem__
    self._validate_get(index, None)
  File "/p/home/jusers/buschow1/juwels/work/WeatherGenerator/.venv/lib/python3.12/site-packages/omegaconf/listconfig.py", line 86, in _validate_get
    raise KeyValidationError(
omegaconf.errors.KeyValidationError: ListConfig indices must be integers or slices, not str
    full_key: evaluation.metrics[global]
    object_type=list
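
For illustration, a minimal sketch of the mismatch (only the indexing behaviour and the evaluation.metrics name are taken from the traceback; the metric names themselves are made up):

from omegaconf import OmegaConf

# Hypothetical config fragment mirroring evaluation.metrics from the traceback.
cf = OmegaConf.create({"evaluation": {"metrics": ["rmse", "bias"]}})

metrics = cf.evaluation.metrics  # a ListConfig, not a region -> metrics mapping
recomputable_metrics = {}        # empty because scores were loaded from .json

# The fallback in run_evaluation picks the raw list when the dict is empty:
metrics_to_compute = recomputable_metrics if recomputable_metrics else metrics

# calc_scores_per_stream then indexes metrics_dict with a region name, which
# raises omegaconf.errors.KeyValidationError, exactly as in the traceback above:
metrics_to_compute["global"]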

What are the steps to reproduce the bug?

This happens, for example, when I run the evaluation twice with the same config and plot_score_maps: true. The first run creates the .json; the second run loads it, but because of plot_score_maps it still has to recompute.
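
One way to guard against this (a sketch, not the actual fix; it assumes regions_to_compute holds the region names that calc_scores_per_stream will look up):

# Sketch: when loaded scores leave recomputable_metrics empty but a recompute
# is still forced, rebuild the region -> metrics mapping instead of passing
# the raw evaluation.metrics list through.
if recomputable_metrics:
    metrics_to_compute = recomputable_metrics
else:
    metrics_to_compute = {region: metrics for region in regions_to_compute}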

Hedgedoc link to logs and more information. This ticket is public, do not attach files directly.

No response

Metadata

Labels

bug (Something isn't working), eval (anything related to the model evaluation pipeline)

Status

Done
