Fix logging segfault by leaking logger#470
Conversation
There is a race between activity and logger teardown, so we switch from a unique_ptr to a raw pointer and leak the logger so it persists across all activity teardowns. Signed-off-by: Lisanna Dettwyler <lisanna.dettwyler@gmail.com> (cherry picked from commit be84169)
📝 WalkthroughWalkthroughThe PR changes the global logger from an owning std::unique_ptr to a raw Logger* and updates initialization, factory call sites, wrapper allocations, subprocess contexts, and tests to release or assign raw pointers accordingly. ChangesLogger ownership model migration to raw pointers
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/libutil/unix/processes.cc (1)
213-222: 💤 Low valueUnnecessary allocation when
allowVforkis true.
newLoggeris allocated unconditionally on line 213, but it's only assigned tologgerwhen!options.allowVfork(line 215). WhenallowVforkis true, the allocation is made but never used—leaking memory in the parent even when not needed.Consider moving the allocation inside the conditional to avoid the unnecessary leak in the vfork path:
Proposed refactor
- auto newLogger = makeSimpleLogger().release(); ChildWrapperFunction wrapper = [&] { if (!options.allowVfork) { /* Set a simple logger, while leaking (not destroying) the parent logger. We don't want to run the parent logger's destructor since that will crash (e.g. when ~ProgressBar() tries to join a thread that doesn't exist. */ - logger = newLogger; + logger = makeSimpleLogger().release(); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/libutil/unix/processes.cc` around lines 213 - 222, The code unconditionally calls makeSimpleLogger().release() into newLogger but only uses it when !options.allowVfork, leaking memory when allowVfork is true; change the ChildWrapperFunction wrapper so the allocation is done inside the conditional (i.e., only allocate newLogger when !options.allowVfork) and assign logger = newLogger there, leaving no unused allocation. Update references to makeSimpleLogger(), newLogger, ChildWrapperFunction wrapper, options.allowVfork, and logger to reflect the moved allocation and ensure no unused pointer is left in the parent path.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/libutil/unix/processes.cc`:
- Around line 213-222: The code unconditionally calls
makeSimpleLogger().release() into newLogger but only uses it when
!options.allowVfork, leaking memory when allowVfork is true; change the
ChildWrapperFunction wrapper so the allocation is done inside the conditional
(i.e., only allocate newLogger when !options.allowVfork) and assign logger =
newLogger there, leaving no unused allocation. Update references to
makeSimpleLogger(), newLogger, ChildWrapperFunction wrapper, options.allowVfork,
and logger to reflect the moved allocation and ensure no unused pointer is left
in the parent path.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c6d9abde-a101-4638-acac-67f296681ae2
📒 Files selected for processing (10)
src/libexpr-tests/primops.ccsrc/libmain/loggers.ccsrc/libstore/daemon.ccsrc/libstore/unix/build/child.ccsrc/libstore/unix/build/derivation-builder.ccsrc/libutil-c/nix_api_util.ccsrc/libutil/include/nix/util/logging.hhsrc/libutil/logging.ccsrc/libutil/unix/processes.ccsrc/nix/build-remote/build-remote.cc
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/libutil-c/nix_api_util.cc`:
- Around line 259-261: When replacing the global logger in nix_set_logger, the
existing nix::logger must be cleaned up so its destroy callback runs; before
assigning the new CallbackLogger, check if nix::logger is non-null and delete it
(so CallbackLogger::~CallbackLogger invokes vtable.destroy), then assign
nix::logger = new CallbackLogger(*vtable, userdata). Make this change inside the
same block where CallbackLogger is created (referencing CallbackLogger,
nix::logger and nix_set_logger) and use delete (not free) to trigger proper
destructor semantics; keep the intended shutdown leak behavior unchanged by only
performing this delete on explicit replacements.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 24c016e9-4d71-4815-b544-313430127f5d
📒 Files selected for processing (2)
src/libutil-c/nix_api_util.ccsrc/libutil-tests/nix_api_util.cc
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/libutil-tests/nix_api_util.cc`:
- Around line 205-207: The inline comment is contradictory: the code calls
nix_set_logger(ctx, &captureLoggerVtable, &capture2) and then asserts
EXPECT_FALSE(capture.destroyed), so update the comment to match the test
behavior (replacement should NOT fire destroy). Change the comment referencing
the logger replacement to state that replacing the logger must NOT fire destroy
on the old userdata (or clarify the specific expected lifecycle), and ensure it
refers to captureLoggerVtable, capture2 and capture.destroyed so the intent
matches the ASSERT_EQ(nix_set_logger(...), NIX_OK) and
EXPECT_FALSE(capture.destroyed) lines.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 4f163b84-0e91-4341-999d-58533a809401
📒 Files selected for processing (1)
src/libutil-tests/nix_api_util.cc
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/libutil-tests/nix_api_util.cc (1)
146-148:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winComment may be outdated regarding destroy callback behavior.
The comment states that "the destroy callback runs while
captureis still alive", but with the leaked-logger model (raw pointer instead of unique_ptr), the destroy callback may never be invoked. The test at line 194-209 explicitly verifies that replacing a logger does NOT invoke destroy on the old userdata. This comment may be outdated from before the logger ownership model changed.📝 Suggested clarification
- // Restore the default logger before `capture` goes out of scope so - // the destroy callback runs while `capture` is still alive. + // Restore the default logger before `capture` goes out of scope. + // Note: destroy is not called on the old logger when replacing. Finally restoreLogger([] { nix::logger = nix::makeSimpleLogger().release(); });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/libutil-tests/nix_api_util.cc` around lines 146 - 148, The existing comment near Finally restoreLogger([] { nix::logger = nix::makeSimpleLogger().release(); }); incorrectly asserts that "the destroy callback runs while `capture` is still alive"; update the comment to reflect the current leaked-logger model (nix::logger is a raw pointer returned by makeSimpleLogger().release()) which means the old logger's destroy callback may not be invoked when replacing the logger, and reference the test behavior that verifies replacing a logger does NOT invoke destroy on the old userdata; make the comment precise about ownership semantics and why the destroy callback may never run instead of claiming it will.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/libutil-tests/nix_api_util.cc`:
- Around line 196-197: The comment near the Finally in nix_api_util.cc
incorrectly implies that nix::logger's teardown will call destroy on a live
capture; update the comment to match test behavior by either removing the
reference to destroy being called at teardown or explicitly stating that destroy
is not expected to be called (e.g., because the leaked-logger model with raw
pointers intentionally avoids destruction and line 207 verifies destroy is NOT
called when replacing the logger). Mention the relevant symbols Finally and
nix::logger/destroy so future readers see the intended lifecycle.
---
Outside diff comments:
In `@src/libutil-tests/nix_api_util.cc`:
- Around line 146-148: The existing comment near Finally restoreLogger([] {
nix::logger = nix::makeSimpleLogger().release(); }); incorrectly asserts that
"the destroy callback runs while `capture` is still alive"; update the comment
to reflect the current leaked-logger model (nix::logger is a raw pointer
returned by makeSimpleLogger().release()) which means the old logger's destroy
callback may not be invoked when replacing the logger, and reference the test
behavior that verifies replacing a logger does NOT invoke destroy on the old
userdata; make the comment precise about ownership semantics and why the destroy
callback may never run instead of claiming it will.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c32a9237-4967-4b41-bc04-6177bbff60b7
📒 Files selected for processing (1)
src/libutil-tests/nix_api_util.cc
| // Both captures must outlive the Finally so that the logger held | ||
| // by `nix::logger` at teardown can call destroy on a live capture. |
There was a problem hiding this comment.
Comment contradicts test behavior regarding destroy lifecycle.
The comment states that "the logger held by nix::logger at teardown can call destroy on a live capture", suggesting destroy will eventually be called. However:
- Line 207 verifies that destroy is NOT called when replacing the logger
- The leaked-logger model (using raw pointers) means loggers are intentionally not destroyed
- This contradicts the comment's implication that destroy will fire at teardown
Consider clarifying whether destroy is expected to be called at any point in the test lifecycle, or remove the reference to destroy being called.
📝 Suggested clarification
- // Both captures must outlive the Finally so that the logger held
- // by `nix::logger` at teardown can call destroy on a live capture.
+ // Both captures must outlive the Finally block.
+ // Note: destroy is not called when replacing loggers.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Both captures must outlive the Finally so that the logger held | |
| // by `nix::logger` at teardown can call destroy on a live capture. | |
| // Both captures must outlive the Finally block. | |
| // Note: destroy is not called when replacing loggers. |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/libutil-tests/nix_api_util.cc` around lines 196 - 197, The comment near
the Finally in nix_api_util.cc incorrectly implies that nix::logger's teardown
will call destroy on a live capture; update the comment to match test behavior
by either removing the reference to destroy being called at teardown or
explicitly stating that destroy is not expected to be called (e.g., because the
leaked-logger model with raw pointers intentionally avoids destruction and line
207 verifies destroy is NOT called when replacing the logger). Mention the
relevant symbols Finally and nix::logger/destroy so future readers see the
intended lifecycle.
Motivation
Cherry-picks upstream NixOS#15884.
Context
Add 👍 to pull requests you find important.
The Nix maintainer team uses a GitHub project board to schedule and track reviews.
Summary by CodeRabbit
Refactor
Tests