Skip to content

Comments

changed model path to shared dir path#1799

Open
sbAsma wants to merge 14 commits intoecmwf:developfrom
sbAsma:sbAsma/develop/1790-remove-wrong-path
Open

changed model path to shared dir path#1799
sbAsma wants to merge 14 commits intoecmwf:developfrom
sbAsma:sbAsma/develop/1790-remove-wrong-path

Conversation

@sbAsma
Copy link
Contributor

@sbAsma sbAsma commented Feb 4, 2026

Description

Issue Number

Closes #1790

Is this PR a draft? Mark it as draft.

Checklist before asking for review

  • I have performed a self-review of my code
  • My changes comply with basic sanity checks:
    • I have fixed formatting issues with ./scripts/actions.sh lint
    • I have run unit tests with ./scripts/actions.sh unit-test
    • I have documented my code and I have updated the docstrings.
    • I have added unit tests, if relevant
  • I have tried my changes with data and code:
    • I have run the integration tests with ./scripts/actions.sh integration-test
    • (bigger changes) I have run a full training and I have written in the comment the run_id(s): launch-slurm.py --time 60
    • (bigger changes and experiments) I have shared a hegdedoc in the github issue with all the configurations and runs for this experiments
  • I have informed and aligned with people impacted by my change:
    • for config changes: the MatterMost channels and/or a design doc
    • for changes of dependencies: the MatterMost software development channel

@sbAsma sbAsma marked this pull request as ready for review February 10, 2026 11:46
path = get_path_model(run_id=run_id)
else:
path = Path(model_path) / run_id
path = _get_shared_wg_path() / "models" / run_id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is already a method for this.
I think we shouldn't change this. It is used when an explicit model_path was given. Otherwise the if-else block is redundant and we can just use the part in the if-block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed the else is redundant, I will revert and add a logging instead

"""

path_run = Path(cf.model_path) / run_id
path_run = _get_shared_wg_path() / "models" / run_id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same applies here. We could use this method directly. Maybe we can add the same if-else block as in the previous case. This enables the user to not add a model_path to the config and just use the regular path. But I'm not entirely sure if this is possible here. Maybe someone with more experience on the model and training process knows better. @sophie-xhonneux ?

@github-project-automation github-project-automation bot moved this to In Progress in WeatherGen-dev Feb 11, 2026
@sbAsma sbAsma requested a review from TillHae February 12, 2026 02:48
@sbAsma
Copy link
Contributor Author

sbAsma commented Feb 12, 2026

Thank you @TillHae for taking the time to review this PR, I added the requested changes ^^

mini_epoch : The mini_epoch to load. Default (-1) is the latest mini_epoch
"""

path_run = Path(cf.model_path) / run_id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use the same if-else block from config.py. If the user enters a model_path we should use it instead of creating it from the run_id.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's shared code that needs to be identical in two places, can we please write a function that ensures it's indeed identical.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use the same if-else block from config.py. If the user enters a model_path we should use it instead of creating it from the run_id.

Hi @TillHae, I changed it to https://github.com/ecmwf/WeatherGenerator/pull/1799/changes#diff-2e85cd26b59453d8cbf3b54ed76e4cf092f911b8063dde53addf8ea81f0ca28dR183

Please let me know if this is what you meant or not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sbAsma this is exactly what I meant. I think this should also fix the error from #1790. But I still have three small suggestions.

  • define a method for this in config.py, which we can then use in both places as @clessig requested. In that case we also don't need the get_model_path import anymore.
  • remove the additional variable user_defined_path and just use path_run as in config.py
  • Remove the space before .exists() in l. 185

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved the routine to config.py and re-used the method get_model_path(). Lemme know if this works.

@github-actions github-actions bot added bug Something isn't working infra Issues related to infrastructure model Related to model training or definition (not generic infra) labels Feb 19, 2026
mini_epoch : The mini_epoch to load. Default (-1) is the latest mini_epoch
"""

path_run = Path(cf.model_path) / run_id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sbAsma this is exactly what I meant. I think this should also fix the error from #1790. But I still have three small suggestions.

  • define a method for this in config.py, which we can then use in both places as @clessig requested. In that case we also don't need the get_model_path import anymore.
  • remove the additional variable user_defined_path and just use path_run as in config.py
  • Remove the space before .exists() in l. 185

def get_path_model(config: Config | None = None, run_id: str | None = None) -> Path:
"""Get the current runs model_path for storing model checkpoints."""
if config or run_id:
run_id = run_id if run_id else get_run_id_from_config(config)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you take a look on this. Maybe this is already everything we need.

path_run = Path(cf.model_path) / run_id
# If `run_id` is a full filesystem path to a model run directory, use it
# directly. This matches the behaviour of `config.load_run_config`, which
# accepts full-path run_id values.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this is no longer true. get_path_model doesn't handle the case of run_id being a full filesystem path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as far as I know, and correct me if I am wrong, run_id isn't supposed to be a full system path. @grassesi What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remember correctly, for some time in the beginning there was a wish that one can load a model directly by specifying a path. Since our infrastructure has matured quite a bit since then, I feel like it is not required anymore, but I dont know if this option is still in use. @clessig do you know if this is still actively used? can we go ahead and remove this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still good to have, especially with git worktrees or when setting up people during a hackathon or locally. Please don't remove.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it should be possible that run_id is a model path. The better solution is that we allow for model_path, result_path (and potential slurm_path) to be overwritten in the CLI.

# If a user provided `model_path` in the config, prefer it when the
# directory for the requested run exists. Otherwise fall back to the
# shared models directory.
if config is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think adding this to get_path_model is the wrong idea. We should create an entirely new method, which uses get_path_model. Then we can apply it to config.py ll. 232-237 and model_interface.py l. 184, because that is the actual same code that we have.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working infra Issues related to infrastructure model Related to model training or definition (not generic infra)

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

Integration test fails to run evaluation

5 participants