In mcpl_merge_outfiles_mpi the order of merging, merged-file compression, and temporary file removal could be problematic on systems with limited disk space.
```c
  mcpl_outfile_t outfh = mcpl_merge_files( targetfn.c_str, nproc,
                                           (const char**)fns);
  if ( !mcpl_closeandgzip_outfile(outfh) )
    mcpl_error("mcpl_merge_outfiles_mpi: problems gzipping final output");
  //Remove worker files:
  for ( unsigned long iproc = 0; iproc < nproc; ++iproc ) {
    char * bn = mcpl_basename(fns[iproc]);
    size_t n = 128 + strlen(bn);
    char * buf = mcpl_internal_malloc(n);
    snprintf(buf,n,"MCPL: Removing file %s\n",bn);
    mcpl_internal_delete_file( fns[iproc] );
    mcpl_print(buf);
    free(bn);
    free(buf);
  }
  //Cleanup memory:
```
The files present on disk at points through the execution of mcpl_merge_outfiles_mpi are:
| point | worker files | .mcpl file | .mcpl.gz file | disk utilization (× final file size) |
| --- | --- | --- | --- | --- |
| before mcpl_merge_files | nproc | 0 | 0 | 1 |
| after mcpl_merge_files | nproc | 1 | 0 | 2 |
| during mcpl_closeandgzip_outfile | nproc | 1 | 1 | 3 |
| after mcpl_closeandgzip_outfile | nproc | 0 | 1 | 2 |
| after removing worker files | 0 | 0 | 1 | 1 |
On systems with limited disk space, the compression operation could fail after exhausting the available storage due to the unnecessary presence of the nproc worker files.
Possible solution
Moving the compression operation after worker-file removal reduces the maximum disk utilization to only twice the final file size:
```c
  mcpl_outfile_t outfh = mcpl_merge_files( targetfn.c_str, nproc,
                                           (const char**)fns);
  //Remove worker files:
  for ( unsigned long iproc = 0; iproc < nproc; ++iproc ) {
    char * bn = mcpl_basename(fns[iproc]);
    size_t n = 128 + strlen(bn);
    char * buf = mcpl_internal_malloc(n);
    snprintf(buf,n,"MCPL: Removing file %s\n",bn);
    mcpl_internal_delete_file( fns[iproc] );
    mcpl_print(buf);
    free(bn);
    free(buf);
  }
  if ( !mcpl_closeandgzip_outfile(outfh) )
    mcpl_error("mcpl_merge_outfiles_mpi: problems gzipping final output");
```
The files present on disk at points through the execution of this modified code would be:
| point | worker files | .mcpl file | .mcpl.gz file | disk utilization (× final file size) |
| --- | --- | --- | --- | --- |
| before mcpl_merge_files | nproc | 0 | 0 | 1 |
| after mcpl_merge_files | nproc | 1 | 0 | 2 |
| after removing worker files | 0 | 1 | 0 | 1 |
| during mcpl_closeandgzip_outfile | 0 | 1 | 1 | 2 |
| after mcpl_closeandgzip_outfile | 0 | 0 | 1 | 1 |
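The peak values in the two tables can be checked with a small stand-alone model. This is only an illustrative sketch, not MCPL code: `disk_state_t`, `disk_utilization`, `peak_disk_utilization`, `current_order`, and `proposed_order` are hypothetical names introduced here. It assumes, as the tables do, that the `nproc` worker files together occupy about one final file size, and it counts the `.mcpl.gz` file as a full file size, which is a pessimistic upper bound.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model (not part of MCPL): files present on disk at one point
   in time, each counted in units of the final merged file size. */
typedef struct {
  const char *point;  /* description of the point in the execution */
  int worker_files;   /* 1 while the nproc worker files are still on disk */
  int mcpl_file;      /* 1 while the merged .mcpl file is on disk */
  int mcpl_gz_file;   /* 1 while the .mcpl.gz file is on disk (upper bound) */
} disk_state_t;

/* Disk utilization at a single point, in units of the final file size. */
static int disk_utilization(const disk_state_t *s)
{
  return s->worker_files + s->mcpl_file + s->mcpl_gz_file;
}

/* Largest utilization over a sequence of points. */
static int peak_disk_utilization(const disk_state_t *states, size_t n)
{
  assert(states != NULL && n > 0);
  int peak = 0;
  for (size_t i = 0; i < n; ++i) {
    int u = disk_utilization(&states[i]);
    if (u > peak)
      peak = u;
  }
  return peak;
}

/* Current order: merge, compress, then remove worker files. */
static const disk_state_t current_order[] = {
  { "before mcpl_merge_files",          1, 0, 0 },
  { "after mcpl_merge_files",           1, 1, 0 },
  { "during mcpl_closeandgzip_outfile", 1, 1, 1 },
  { "after mcpl_closeandgzip_outfile",  1, 0, 1 },
  { "after removing worker files",      0, 0, 1 },
};

/* Proposed order: merge, remove worker files, then compress. */
static const disk_state_t proposed_order[] = {
  { "before mcpl_merge_files",          1, 0, 0 },
  { "after mcpl_merge_files",           1, 1, 0 },
  { "after removing worker files",      0, 1, 0 },
  { "during mcpl_closeandgzip_outfile", 0, 1, 1 },
  { "after mcpl_closeandgzip_outfile",  0, 0, 1 },
};
```

Calling `peak_disk_utilization(current_order, 5)` gives 3, while `peak_disk_utilization(proposed_order, 5)` gives 2, matching the tables. In practice the compressed file is usually smaller than the uncompressed one, so both peaks are upper bounds, but the proposed order still removes one full file size from the worst case.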
The code excerpt quoted at the top is from mcpl/mcpl_core/src/mcpl.c, lines 4530 to 4545 in 838417c.