Skip to content

Load weights from a fit#2443

Open
ecole41 wants to merge 5 commits intomasterfrom
load_weights_from_fit
Open

Load weights from a fit#2443
ecole41 wants to merge 5 commits intomasterfrom
load_weights_from_fit

Conversation

@ecole41
Copy link
Copy Markdown
Collaborator

@ecole41 ecole41 commented Mar 25, 2026

This PR adds the load_weights_from_fit option which allows for the weights from a previous fit to be used as the initial weights of another fit, if the weight files are available for this previous fit.

@ecole41 ecole41 requested a review from scarlehoff March 25, 2026 12:33
@ecole41 ecole41 added the enhancement New feature or request label Mar 25, 2026
@@ -165,6 +165,13 @@ def parse_pdf(self, name, unpolarized_bc=None):

return pdf

def parse_load_weights_from_fit(self, name: str):
"""A fit in the results folder, containing at least a valid filter result."""
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of returning here a fit with a pase, I think it is better to read here the weight paths as a list in a produce rule (that takes load_weights_from_fit as input) and return that.

And if load_weights_from_fit is empty then return None.

This allows other tools (e.g. simunet or colibri) to override the behaviour more easily

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed this to produce a dictionary of paths, each assigned the corresponding replica index to make the extraction of the correct path easier in model_trainer

f"Fit to load weights from: {load_weights_from_fit} does not contain any weights file in the expected location (replica_*/weights.weights.h5)"
)


Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of these checks can be in the validphys produce, here you should only need to check whether the load and the load_from_fit are both defined.

For instance, right now this line

fit_folder.glob("nnfit/replica_*/weights.weights.h5"))

is effectively done twice once here once the model trainer. If you put it in the produce instead then it is only once (and with the extras of being able to override it easily and knowing for sure you have a folder, otherwise the check_fit would've failed)

basis,
fitbasis,
positivity_bound,
load_weights_dict,
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@scarlehoff I want to allow the option to not define load_weights_from_fit and have load_weights_dict as None but I am not sure how to do this here

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just set =None as default (in the produce, and if you want also here) and then when none do nothing

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When setting the default to None here, it never ran the produce function even when load_weights_from_fit was defined. But I can still just edit the produce function, this will work. I was wondering if there was maybe a way where the produce function doesn't have to be called at all when load_weights_from_fit is None

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see

produce function doesn't have to be called at all when load_weights_from_fit is None

No, I think if you ask for an action that requires something it runs over all production rules. But it's not a problem.

Copy link
Copy Markdown
Member

@scarlehoff scarlehoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Seems fine to me. I want to try to do the version for multireplica before merging, I think it should work fine.

from validphys.loader import Loader
from validphys.core import FitSpec

l = Loader()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the end you are not using these lines right?

if self.load_weights_dict:
weights_path= self.load_weights_dict[self.replicas[0]]
log.info("Loading weights from path: " + str(weights_path))
pdf_model.load_weights(weights_path)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In principle this should be able to load the weights for all replicas at once, so that it can also be used when running several replicas in parallel. I'll have a go when I test this PR.

@@ -165,6 +165,20 @@ def parse_pdf(self, name, unpolarized_bc=None):

return pdf

def produce_load_weights_dict(self, fit, load_weights_from_fit=None):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need fit here as part of the input?

@ecole41 ecole41 force-pushed the load_weights_from_fit branch from 7aabb07 to 82b7975 Compare March 30, 2026 14:30
@ecole41
Copy link
Copy Markdown
Collaborator Author

ecole41 commented Mar 30, 2026

@scarlehoff Do you understand why the commondata test is failing here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants