Skip to content

Ensemble filename and stoch restart fixes#409

Merged
alperaltuntas merged 3 commits intoNCAR:dev/ncarfrom
alperaltuntas:restart_file_fixes
Mar 12, 2026
Merged

Ensemble filename and stoch restart fixes#409
alperaltuntas merged 3 commits intoNCAR:dev/ncarfrom
alperaltuntas:restart_file_fixes

Conversation

@alperaltuntas
Copy link
Copy Markdown
Member

@alperaltuntas alperaltuntas commented Feb 27, 2026

This PR brings in the following changes:

  • Fix stoch restart file handling:
    • append ensemble instance suffix in multi-instance runs
    • add r_stoch file(s) to CESM rpointer files.
  • Introduce append_ensemble_appendix: To eliminate code duplication, this commit introduces append_ensemble_appendix, a routine to append the ensemble appendix to i/o filenames. If provided,
    the appendix is appended after the last occurence of ENSEMBLE_APPENDIX_PREFIX,
    a new runtime parameter (initially called RESTARTFILE_APPENDIX_PREFIX).

This PR is part of a series of PRs to fix various interrelated restart/multi-instance/hybrid run issues alongside: ESCOMP/FMS#8
ESCOMP/MOM_interface#311
ESMCI/cime#4944

To eliminate code duplication, this commit introduces append_ensemble_appendix,
a routine to append the ensemble appendix to i/o filenames. If provided,
the appendix is appended after the last occurence of ENSEMBLE_APPENDIX_PREFIX,
a new runtime parameter (initially called RESTARTFILE_APPENDIX_PREFIX).
- append ensemble instance suffix in multi-instance runs
- add it to CESM rpointer files.
@iangrooms
Copy link
Copy Markdown
Collaborator

Thanks @alperaltuntas for working on this. @kateboden ran a quick test using this PR code; results can be found at /glade/derecho/scratch/kboden/g.e30_a07g.G_JRA.TL319_t232_zstar_N75.2025.SKEB_DA.restart_test/.
Ensemble member 0040 (for example) writes two restarts with names

g.e30_a07g.G_JRA.TL319_t232_zstar_N75.2025.SKEB_DA.restart_test.mom6.r.0054-01-06-00000_0040.nc
g.e30_a07g.G_JRA.TL319_t232_zstar_N75.2025.SKEB_DA.restart_test.mom6_0040.r_stoch.0054-01-06-00000.nc

This is an improvement because previously the stochastic-physics restart file didn't have an instance number at all, but the location of the instance number in the stochastic-physics restart filename is inconsistent with the mom6 restart filename.

The rpointer file is named rpointer.ocn_0040.0054-01-06-00000 and its contents are

g.e30_a07g.G_JRA.TL319_t232_zstar_N75.2025.SKEB_DA.restart_test.mom6_0040.r.0054-01-06-00000.nc
g.e30_a07g.G_JRA.TL319_t232_zstar_N75.2025.SKEB_DA.restart_test.mom6_0040.r_stoch.0054-01-06-00000.nc

So the rpointer file contents are not consistent with the actual restart names. We have not tried to restart from these files yet.

This may be out of scope, but the instance number is placed inconsistently in some other filenames too. For example:

g.e30_a07g.G_JRA.TL319_t232_zstar_N75.2025.SKEB_DA.restart_test.mom6.h.sfc._0040.0054-01.nc
g.e30_a07g.G_JRA.TL319_t232_zstar_N75.2025.SKEB_DA.restart_test.mom6.h.static._0040.nc
g.e30_a07g.G_JRA.TL319_t232_zstar_N75.2025.SKEB_DA.restart_test.mom6.h.z._0040.0054-01.nc
g.e30_a07g.G_JRA.TL319_t232_zstar_N75.2025.SKEB_DA.restart_test.mom6.ic.0054-01-01_0040.nc
g.e30_a07g.G_JRA.TL319_t232_zstar_N75.2025.SKEB_DA.restart_test.mom6.h.Fram_Strait._0040.0054-01.nc.0856
ocean.stats._0040.nc
ocean.stats._0040

The history files all have it before the date while the ic file has it after the date. This is just aesthetic, so not a high priority.

@alperaltuntas
Copy link
Copy Markdown
Member Author

Thanks @iangrooms. If you haven't already, could you re-run these tests after merging all of the other PRs too (mentioned in the PR description):

ESCOMP/FMS#8
ESCOMP/MOM_interface#311
ESMCI/cime#4944

For convenience, I have a sandbox on derecho, where all these PRs are applied: /glade/work/altuntas/cesm.sandboxes/cesm3_0_alpha08d_feb27

@iangrooms
Copy link
Copy Markdown
Collaborator

@kateboden ran a second test using your sandbox; results can be found at /glade/derecho/scratch/kboden/g.e30_a07g.G_JRA.TL319_t232_zstar_N75.2025.SKEB_DA.restart_test_2/. The instance number is in the right place in all the filenames.

Copy link
Copy Markdown
Collaborator

@mnlevy1981 mnlevy1981 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments on the names of variables / functions; if you were just following how similar objects are named elsewhere in the code, I'm happy to approve as-is. (I didn't do any testing, but saw other comments about file names being correct in other peoples' tests)

Comment thread src/framework/MOM_io.F90 Outdated
Comment on lines +3001 to +3003
!> Append the ensemble appendix to a filename. If provided, the appendix is appended after
!! the last occurrence of the append_after substring in the filename.
subroutine append_ensemble_appendix(filename, append_after)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we rename this function insert_ensemble_id() or something in that vein? I mostly take issue with the append_ portion, because we are not always appending this string; I see that MOM and FMS use appendix and suffix to refer to the ensemble member identifier.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good!

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up renaming it as insert_ensemble_appendix, assuming that id refers to i while appendix refers to _000i for the i'th component (where 0<=i<=9)

Comment thread src/framework/MOM_io.F90 Outdated
Comment on lines +3015 to +3016
filename_appendix_t = trim(filename_appendix_t)
if (len(filename_appendix_t) == 0) return
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of trimming the string and then finding the length, you could just check (len_trim(filename_appendix_t) == 0)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, I actually switched from filename_appendix_t -> filename_appendix and used len_trim since filename_appendix_t wasn't a deferred-length string, unlike the other two.

Comment thread src/framework/MOM_io.F90 Outdated
Comment on lines +3018 to +3019
filename_t = trim(filename)
pos = len(filename_t)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, you can just use pos = len_trim(filename_t); you are going to change filename_t below anyway, so the assignment line isn't necessary

Copy link
Copy Markdown
Member Author

@alperaltuntas alperaltuntas Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added _t versions of these strings to prevent multiple calls and accidental omissions of trim() in the rest of the function. In fact, I think I should have:

filename_t = trim(adjustl(filename))

If you think _t versions of these strings cause unnecessary confusion, I am fine with removing them.

(Just noticed that trim call in 3039 is unnecessary, so it needs to be removed at the least.)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I managed to miss that 3018 was filename_t on the lefthand side and filename on the right; I think your comment about using adjustl() makes more sense than dropping filename_t -- I noticed the trim(adjustl()) construct elsewhere in MOM6 when looking for where GRID_FILE is read in.

Comment thread src/framework/MOM_io.F90 Outdated
call MOM_error(FATAL, "append_ensemble_appendix: The string "//trim(append_after)// &
" was not found in the filename "//trim(filename))
endif
pos = pos + len_trim(append_after_t) - 1
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opposite comment time :) Since append_after_t is deferred-length and is trim(append_after), don't you want len(append_after_t) instead of len_trim()?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, this should be just len(append_after_t). I think I initially had:

len_trim(append_after)

then changed append_after to append_after_t, but didn't update len_trim to len.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

public ocean_public_type_chksum
public get_ocean_grid, query_ocean_state
public get_eps_omesh
public stoch_restart_needed
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like having this as a stand-alone function!

Copy link
Copy Markdown
Collaborator

@iangrooms iangrooms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have any comments about the implementation, but in our ensemble data assimilation test this works.

@alperaltuntas
Copy link
Copy Markdown
Member Author

@mnlevy1981 I believe I addressed all your comments, but let me know otherwise. Thanks!

Copy link
Copy Markdown
Collaborator

@mnlevy1981 mnlevy1981 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making those changes, this looks great!

@alperaltuntas alperaltuntas merged commit b15bc97 into NCAR:dev/ncar Mar 12, 2026
52 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants