Mark top-level Makefile .NOTPARALLEL to fix make -j build race#79

Open
darrylabbate wants to merge 1 commit into intel:main from darrylabbate:fix-parallel-make-shared-objects
@darrylabbate


The IMB-* targets each recurse into src_cpp/ and compile a shared set of intermediate objects (imb.o, benchmark_suites_collection.o, args_parser.o, scope.o, libyaml-cpp.a) to identical paths. Building two or more suites concurrently under 'make -j' lets multiple sub-makes write those same object files simultaneously, clobbering them and intermittently breaking the link, e.g.:

  benchmark_suites_collection.h: undefined reference to
    `BenchmarkSuitesCollection::pnames'
  imb.o: file not recognized: File truncated

The suite recipes invoke a plain 'make' (not $(MAKE) and not prefixed with '+'), so they are not recognized as recursive. With GNU Make < 4.4's pipe-based jobserver such sub-makes were cut off from the job pool and ran serially, which masked the race. GNU Make >= 4.4's FIFO jobserver connects non-recursive sub-makes to the parallel pool, so the suite builds now run concurrently and the race surfaces routinely (observed on Ubuntu 26.04 / RHEL 10, which ship make 4.4).

Add .NOTPARALLEL so the top-level suite builds are serialized regardless of -j. Parallelism within each suite sub-make is unaffected, and the suites build in seconds, so the wall-clock impact is negligible.
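
A minimal sketch of the change, assuming the top-level Makefile has one rule per suite that runs a plain 'make' in src_cpp/. The suite names and the TARGET variable below are illustrative and are not copied from the repository:

```makefile
# Sketch only: suite names and the TARGET variable are assumptions.
# With no prerequisites, .NOTPARALLEL makes this invocation of make run
# its targets serially, even when -j is given. Sub-makes started by the
# recipes can still run their own recipes in parallel.
.NOTPARALLEL:

all: IMB-MPI1 IMB-RMA

# Plain 'make' (not $(MAKE)), mirroring the recipes described above.
IMB-MPI1:
	make -C src_cpp TARGET=MPI1

IMB-RMA:
	make -C src_cpp TARGET=RMA
```

With this in place, 'make -j8 all' would build the two suites one after the other, so the shared objects in src_cpp/ are never written by two sub-makes at once.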

Signed-off-by: Darryl Abbate <drl@amazon.com>
@darrylabbate

Author

Refactoring the Makefile so that it emits per-suite objects is probably the better long-term solution, since it would allow the suites to build in parallel. In the meantime, declaring .NOTPARALLEL effectively restores the previous serial behavior under Make 4.4+.
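
For illustration, the per-suite refactor could look roughly like the following in the src_cpp/ makefile. This is a hypothetical sketch, not the repository's layout: OBJDIR, SRCS, the .cpp file names, and the build/ directory are all assumed names.

```makefile
# Hypothetical sketch: OBJDIR, SRCS and the build/ layout are assumptions.
# Each suite compiles into its own directory, so concurrent suite builds
# never write the same object file.
OBJDIR := build/$(TARGET)
SRCS   := imb.cpp benchmark_suites_collection.cpp args_parser.cpp scope.cpp
OBJS   := $(SRCS:%.cpp=$(OBJDIR)/%.o)

IMB-$(TARGET): $(OBJS)
	$(CXX) $(LDFLAGS) -o $@ $^

# Order-only prerequisite: create the directory without forcing rebuilds.
$(OBJDIR)/%.o: %.cpp | $(OBJDIR)
	$(CXX) $(CXXFLAGS) -c $< -o $@

$(OBJDIR):
	mkdir -p $@
```

Anything still shared between suites, such as libyaml-cpp.a, would need its own treatment (for example, building it once before any suite starts).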
