Skip to content
Draft
Show file tree
Hide file tree
Changes from 5 commits
Commits
Show all changes
37 commits
Select commit Hold shift + click to select a range
d3178e1
Add for_testing options to namelist handling to bypass init and run
ekluzek Aug 21, 2025
d65ecfb
Turn the bypass init and run logicals for testing on
ekluzek Aug 21, 2025
96e0c94
Add a namelist read and some logical settings to bypass init and run …
ekluzek Aug 21, 2025
f6272ef
Merge remote-tracking branch 'escomp/b4b-dev' into for_testing_bypass…
ekluzek Aug 21, 2025
7006740
Add use of abortutils so can make endrun calls
ekluzek Aug 21, 2025
93628b2
Add bypassing the run phase in the for_testing tests, and remove it f…
ekluzek Aug 22, 2025
28834c9
Update bld/namelist_files/namelist_definition_ctsm.xml
ekluzek Aug 22, 2025
256e692
Merge branch 'b4b-dev' into for_testing_bypass_framework
ekluzek Sep 23, 2025
0137dc2
Merge branch 'for_testing_bypass_framework' of github.com:ekluzek/CTS…
ekluzek Sep 23, 2025
69792af
Merge branch 'b4b-dev' into for_testing_bypass_framework
ekluzek Sep 29, 2025
de1ca06
Add namelist controls for self testing
ekluzek Jun 27, 2025
c1c7ca3
Add unit_test_shr directory to the main model build
ekluzek Aug 29, 2025
db4551c
Merge remote-tracking branch 'escomp/b4b-dev' into decomp_init_for_te…
ekluzek Aug 23, 2025
cb8e7ec
Merge remote-tracking branch 'escomp/b4b-dev' into decomp_init_for_te…
ekluzek Aug 22, 2025
09aa5ac
Balance check doesn't take time, so adjust the timers again for part3
ekluzek Jul 31, 2025
bf498ab
Add another timer within part3, and also turn off some of the history…
ekluzek Jul 31, 2025
3c54060
Add timers for clm_initialize2 that cover the whole subroutine
ekluzek Jul 31, 2025
ce2d68b
Change the test grid total size to 384 so can be divisible by either …
ekluzek Sep 2, 2025
2fc723f
Don't do the abort testing if not serial as different tasks won't be …
ekluzek Sep 6, 2025
d8d656b
Change a test to make it valid for clump_pproc or not
ekluzek Sep 8, 2025
4ce6b5f
Just do the checking over the local processor clumps and not all the …
ekluzek Sep 14, 2025
7dc7dcd
Resolve the conflicts
ekluzek Oct 1, 2025
2a46724
Remove some of the previous bypassing changes that aren't needed here
ekluzek Aug 22, 2025
c95b886
Move bypass code around a bit so that most timers aren't half in/half…
ekluzek Aug 27, 2025
86382c6
Also bypass the import fields for_testing option, and move the decomp…
ekluzek Aug 27, 2025
dac0ae0
Move the get_proc_bounds to inside the bypass
ekluzek Aug 29, 2025
e867afe
Changes to exit early when self test namelist option used for_testing…
ekluzek Jul 2, 2025
d19b894
Add asserts for scalars and also text scalars
ekluzek Aug 11, 2025
a5d5b5c
Revert most of 2fd081b544 so removing the changes regarding the addit…
ekluzek Oct 1, 2025
b3185c0
Move some for_testing namelist items into the selftests driver namelist
ekluzek Oct 1, 2025
1acf630
Remove the uneeded timers and get back to the 3 part timers as they s…
ekluzek Oct 1, 2025
2636975
Remove some changes from the baseline code that aren't needed especia…
ekluzek Oct 1, 2025
def2c97
Remove TestDecompInit for now, bring it in, in another PR
ekluzek Oct 2, 2025
bf947fb
Remove the update to Assertions and bring it in, in another PR
ekluzek Oct 2, 2025
80192dc
Remove update in DecompInitMod for now
ekluzek Oct 2, 2025
df19381
Merge branch 'b4b-dev' into for_testing_bypass_framework
ekluzek Oct 3, 2025
417922d
Merge branch 'b4b-dev' into for_testing_bypass_framework
ekluzek Oct 4, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions bld/CLMBuildNamelist.pm
Original file line number Diff line number Diff line change
Expand Up @@ -5290,6 +5290,7 @@ sub write_output_files {
push @groups, "clm_canopy_inparm";
push @groups, "prigentroughness";
push @groups, "zendersoilerod";
push @groups, "for_testing_options";
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I should add more error checking to the build-namelist for these bypass options. I also think that these testing options should maybe do things like ensure that history and restart files are off the like. There also might be some error checking that some of the advanced options are NOT turned on with the bypass options, and that sort of thing. And it will be important to make sure it's clear that these bypass and testing options are turned on -- and they don't get turned on accidentally. It will take some thinking to figure that out.

if (remove_leading_and_trailing_quotes($nl->get_value('snow_cover_fraction_method')) eq 'SwensonLawrence2012') {
push @groups, "scf_swenson_lawrence_2012_inparm";
}
Expand Down
16 changes: 16 additions & 0 deletions bld/namelist_files/namelist_definition_ctsm.xml
Original file line number Diff line number Diff line change
Expand Up @@ -1242,6 +1242,22 @@ Whether to use subgrid fluxes for snow
Whether snow on the vegetation canopy affects the radiation/albedo calculations
</entry>

<!-- ======================================================================================== -->
<!-- Namelist items for doing specific things for testing -->
<!-- for_testing section: -->
<!-- ======================================================================================== -->

<entry id="for_testing_bypass_init" type="logical" category="default_settings"
group="for_testing_options" >
For testing whether to bypass the rest of the initiatlization after the self test driver is run
Comment thread
ekluzek marked this conversation as resolved.
Outdated
</entry>

<entry id="for_testing_bypass_run" type="logical" category="default_settings"
group="for_testing_options" >
For testing whether to bypass most of the run phase other than the clock advance
</entry>


<entry id="for_testing_run_ncdiopio_tests" type="logical" category="default_settings"
group="clm_inparm" >
Whether to run some tests of ncdio_pio as part of the model run. This is
Expand Down
Original file line number Diff line number Diff line change
@@ -1,3 +1,5 @@
for_testing_bypass_init = .true.
for_testing_bypass_run = .true.
Comment thread
ekluzek marked this conversation as resolved.
Outdated
for_testing_run_ncdiopio_tests = .true.

! Turn off history, restarts, and output
Expand Down
7 changes: 7 additions & 0 deletions src/main/clm_driver.F90
Original file line number Diff line number Diff line change
Expand Up @@ -85,6 +85,7 @@ module clm_driver
use clm_instMod
use SoilMoistureStreamMod , only : PrescribedSoilMoistureInterp, PrescribedSoilMoistureAdvance
use SoilBiogeochemDecompCascadeConType , only : no_soil_decomp, decomp_method
use SelfTestDriver , only : for_testing_bypass_run_except_clock_advance
!
! !PUBLIC TYPES:
implicit none
Expand Down Expand Up @@ -165,6 +166,7 @@ subroutine clm_drv(doalb, nextsw_cday, declinp1, declin, rstwr, nlend, rdate, ro
! CalcIrrigationNeeded. Simply declaring this variable makes the ICE go away.
real(r8), allocatable :: dummy1_to_make_pgi_happy(:)
!-----------------------------------------------------------------------
if ( for_testing_bypass_run_except_clock_advance() ) return
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because this short circuits the code -- it's worth thinking if this should be a simple return statement or a more explicit if structure.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@samsrabin this is one thing I'd like to hear feedback on. Let me know what you think about this.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well this one isn't just an early return; it actually short-circuits the entire subroutine. So why not just wrap its call in an if statement?

Copy link
Copy Markdown
Member

@samsrabin samsrabin Aug 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm seeing now that the other returns are the same way, so same question there.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like your idea @samsrabin. For performance it's of course better to do at the higher level. For readability it might depend on where you want these things to be seen.

But, in this case it would remove this strange testing option from code that's a mix of science/software infrastructure to the higher level in the NUOPC cap that is pretty much just for SE's. So it helps with readability as well as separation of concerns.


! Determine processor bounds and clumps for this processor

Expand Down Expand Up @@ -1576,6 +1578,8 @@ subroutine clm_drv_init(bounds, &
integer :: fp, fc ! filter indices
!-----------------------------------------------------------------------

if ( for_testing_bypass_run_except_clock_advance() ) return

associate( &
snl => col%snl , & ! Input: [integer (:) ] number of snow layers

Expand Down Expand Up @@ -1657,6 +1661,7 @@ subroutine clm_drv_patch2col (bounds, &
! !LOCAL VARIABLES:
integer :: c,fc ! indices
! -----------------------------------------------------------------
if ( for_testing_bypass_run_except_clock_advance() ) return

! Note: lake points are excluded from many of the following
! averages. For some fields, this is because the field doesn't
Expand Down Expand Up @@ -1752,6 +1757,8 @@ subroutine write_diagnostic (bounds, nstep, lnd2atm_inst)
integer :: status(MPI_STATUS_SIZE) ! mpi status
!------------------------------------------------------------------------

if ( for_testing_bypass_run_except_clock_advance() ) return

call get_proc_global(ng=numg)

if (masterproc) then
Expand Down
10 changes: 9 additions & 1 deletion src/main/clm_initializeMod.F90
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,7 @@ module clm_initializeMod
use CLMFatesInterfaceMod , only : CLMFatesGlobals1,CLMFatesGlobals2
use CLMFatesInterfaceMod , only : CLMFatesTimesteps
use dynSubgridControlMod , only : dynSubgridControl_init, get_reset_dynbal_baselines
use SelfTestDriver , only : self_test_driver
use SelfTestDriver , only : self_test_driver, for_testing_bypass_init_after_self_tests
use SoilMoistureStreamMod , only : PrescribedSoilMoistureInit
use clm_instMod
!
Expand Down Expand Up @@ -67,6 +67,7 @@ subroutine initialize1(dtime)
use SoilBiogeochemDecompCascadeConType , only : decomp_cascade_par_init
use CropReprPoolsMod , only: crop_repr_pools_init
use HillslopeHydrologyMod, only: hillslope_properties_init
use SelfTestDriver , only: self_test_readnml
!
! !ARGUMENTS
integer, intent(in) :: dtime ! model time step (seconds)
Expand Down Expand Up @@ -104,6 +105,8 @@ subroutine initialize1(dtime)
call surfrd_get_num_patches(fsurdat, actual_maxsoil_patches, actual_numpft, actual_numcft)
call surfrd_get_nlevurb(fsurdat, actual_nlevurb)

call self_test_readnml( NLFilename )

! If fates is on, we override actual_maxsoil_patches. FATES dictates the
! number of patches per column. We still use numcft from the surface
! file though...
Expand Down Expand Up @@ -185,6 +188,7 @@ subroutine initialize2(ni,nj, currtime)
use FATESFireFactoryMod , only : scalar_lightning
use dynFATESLandUseChangeMod , only : dynFatesLandUseInit
use HillslopeHydrologyMod , only : InitHillslope
use SelfTestDriver , only : for_testing_bypass_init_after_self_tests
!
! !ARGUMENTS
integer, intent(in) :: ni, nj ! global grid sizes
Expand Down Expand Up @@ -467,6 +471,7 @@ subroutine initialize2(ni,nj, currtime)
call bgc_vegetation_inst%Init2(bounds_proc, NLFilename)
end if

if ( .not. for_testing_bypass_init_after_self_tests() )then
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to indent the lines below. I left it unindents until I saw if it could run, and that this is the spot where it should be.

if (use_cn) then

! NOTE(wjs, 2016-02-23) Maybe the rest of the body of this conditional should also
Expand Down Expand Up @@ -510,6 +515,7 @@ subroutine initialize2(ni,nj, currtime)
if (nsrest == nsrContinue ) then
call htapes_fieldlist()
end if
end if
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indent lines above.


! Read restart/initial info
is_cold_start = .false.
Expand Down Expand Up @@ -684,6 +690,7 @@ subroutine initialize2(ni,nj, currtime)
call hist_htapes_build()
end if

if ( .not. for_testing_bypass_init_after_self_tests() )then
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same thing about the indents.

It's also not completelly clear how many of these type of things there should be.

One reason that I need to do this here -- rather than having return statements, is that I need the timers to work. So if a timer has been started in the code above, I can't return in the middle without that timer being messed up. So it needs to exucute the part of the code where the timer is stopped.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@samsrabin this and the one above about the return statements are things I'd like to hear from you on. In here they need to be if statements as I say above, but in the above section they could be either.

Although, maybe because this is a weird pathway in the code -- the return statements should be preferred so it doesn't disrupt the code flow as much as if statements. It's easy to miss returns in the code, flow but that's only important when it's something that might happen commonly enough. There was a code standard in my past that recommended to not have return statements in the midst of subroutines, because they are easy to miss. And I do see that point...

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer early returns as much as possible, especially since as you said these are rare cases. Avoiding returns by using if statements creates its own readability issues: How far indented am I right now? Under what conditions?

The bigger readability problem with this module is how enormous the subroutines are. I'd prefer to see things like initialize2 getting refactored before we start worrying about ifs vs. returns. Maybe that could be something you do before this PR? I.e., refactor that subroutine to create at least two new ones:

  1. Everything inside the first new if
  2. Everything inside the second new if

Maybe also (3) everything between the new ifs.

I wouldn't want to see that done in this PR, though, in the interest of keeping things easy to review and test.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactoring could also avoid return statements too: Break subroutines into two smaller subroutines, with the call of the first being wrapped in an if statement.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good points, thanks for the comments. I'll be thinking about all this. I can't do a lot of work in refactoring clm_initialize and won't do it here. But maybe it would help to do some simple things in a different small PR. Hmmm...

! Initialize variables that are associated with accumulated fields.
! The following is called for both initial and restart runs and must
! must be called after the restart file is read
Expand Down Expand Up @@ -767,6 +774,7 @@ subroutine initialize2(ni,nj, currtime)
water_inst%waterdiagnosticbulk_inst, canopystate_inst, &
soilstate_inst, soilbiogeochem_carbonflux_inst)
end if
end if

! topo_glc_mec was allocated in initialize1, but needed to be kept around through
! initialize2 because it is used to initialize other variables; now it can be deallocated
Expand Down
5 changes: 4 additions & 1 deletion src/main/clm_instMod.F90
Original file line number Diff line number Diff line change
Expand Up @@ -208,6 +208,7 @@ subroutine clm_instInit(bounds)
use HillslopeHydrologyMod , only : SetHillslopeSoilThickness
use initVerticalMod , only : setSoilLayerClass
use DustEmisFactory , only : create_dust_emissions
use SelfTestDriver , only : for_testing_bypass_init_after_self_tests
!
! !ARGUMENTS
type(bounds_type), intent(in) :: bounds ! processor bounds
Expand Down Expand Up @@ -269,7 +270,9 @@ subroutine clm_instInit(bounds)
call humanindex_inst%Init(bounds)

! Initialize urban time varying data
call urbantv_inst%Init(bounds, NLFilename)
if ( .not. for_testing_bypass_init_after_self_tests() )then
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In some of my decompInit testing work, I found where the urbantv calll specifically was problematic, so I'm explicitly just avoiding this one call here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this isn't where the problem was.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I should probably remove this one.

call urbantv_inst%Init(bounds, NLFilename)
end if

! Initialize vertical data components

Expand Down
91 changes: 90 additions & 1 deletion src/self_tests/SelfTestDriver.F90
Original file line number Diff line number Diff line change
Expand Up @@ -9,14 +9,22 @@ module SelfTestDriver
use clm_varctl, only : for_testing_run_ncdiopio_tests
use decompMod, only : bounds_type
use TestNcdioPio, only : test_ncdio_pio
use abortutils, only : endrun

implicit none
private
save

! Public routines

public :: self_test_driver
public :: self_test_driver ! Run the self-tests asked for
public :: self_test_readnml ! Read in the general self testing options for overall code flow
public :: for_testing_bypass_init_after_self_tests ! For testing bypass the rest of the initialization after the self test driver was run
public :: for_testing_bypass_run_except_clock_advance ! For testing bypass most of the run phase other than the clock advance

! Private module data
logical :: for_testing_bypass_init ! For testing bypass the initialization phase after the self-test driver
logical :: for_testing_bypass_run ! For testing bypass most of the run phase except the time advance

character(len=*), parameter, private :: sourcefile = &
__FILE__
Expand Down Expand Up @@ -46,4 +54,85 @@ subroutine self_test_driver(bounds)

end subroutine self_test_driver

!-----------------------------------------------------------------------
subroutine self_test_readnml(NLFileName)
!
! !DESCRIPTION:
! Namelist read for the self-test driver. This includes bypass options
! that will be used in other parts of the code to bypass bits of the code
! for testing purposes.
!
! !USES:
use shr_nl_mod , only : shr_nl_find_group_name
use spmdMod, only : masterproc, mpicom
use shr_mpi_mod, only : shr_mpi_bcast
use clm_varctl, only : iulog
!
! !ARGUMENTS:
character(len=*), intent(in) :: NLFilename ! Namelist filename
!
! !LOCAL VARIABLES:
integer :: ierr ! error code
integer :: unitn ! unit for namelist file

! Namelist name: this has to be matched with the name in the read stqatement
character(len=*), parameter :: nmlname = 'for_testing_options'
!-----------------------------------------------------------------------

namelist /for_testing_options/ for_testing_bypass_init, for_testing_bypass_run
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the other for_testing namelist options should move to here as well.


! Initialize options to default values, in case they are not specified in
! the namelist

if (masterproc) then
write(iulog,*) 'Read in '//nmlname//' namelist'
open(newunit=unitn, status='old', file=NLFilename)
call shr_nl_find_group_name(unitn, nmlname, status=ierr)
if (ierr == 0) then
read(unit=unitn, nml=for_testing_options, iostat=ierr)
if (ierr /= 0) then
call endrun(msg="ERROR reading "//nmlname//"namelist", file=sourcefile, line=__LINE__)
end if
else
call endrun(msg="ERROR finding "//nmlname//"namelist", file=sourcefile, line=__LINE__)
end if
close(unitn)
end if

call shr_mpi_bcast (for_testing_bypass_init, mpicom)
call shr_mpi_bcast (for_testing_bypass_run, mpicom)

if (masterproc) then
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There also should be some error checking done here in the Fortran code.

write(iulog,*) ' '
write(iulog,*) nmlname//' settings:'
write(iulog,nml=for_testing_options)
write(iulog,*) ' '
end if

end subroutine self_test_readnml

!-----------------------------------------------------------------------

logical function for_testing_bypass_init_after_self_tests()
! Determine if should exit initialization early after having run the self tests
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both of these should get more sosphisticated and do things like NOT return true until after the self-tests are run. And ensure that the self_test namelist was read in, or else abort. The run phase bypass should maybe NOT return true until after the init phase has passed and things like that.

if ( for_testing_bypass_init ) then
for_testing_bypass_init_after_self_tests = .true.
else
for_testing_bypass_init_after_self_tests = .false.
end if
end function for_testing_bypass_init_after_self_tests

!-----------------------------------------------------------------------

logical function for_testing_bypass_run_except_clock_advance()
! Determine if should skip most of the run phase other than the clock advance
if ( for_testing_bypass_init ) then
for_testing_bypass_run_except_clock_advance = .true.
else
for_testing_bypass_run_except_clock_advance = .false.
end if
end function for_testing_bypass_run_except_clock_advance

!-----------------------------------------------------------------------

end module SelfTestDriver