Skip to content

Fix logging segfault by leaking logger#470

Merged
edolstra merged 3 commits into
mainfrom
leak-logger
May 22, 2026
Merged

Fix logging segfault by leaking logger#470
edolstra merged 3 commits into
mainfrom
leak-logger

Conversation

@edolstra
Copy link
Copy Markdown
Collaborator

@edolstra edolstra commented May 22, 2026

Motivation

Cherry-picks upstream NixOS#15884.

Context


Add 👍 to pull requests you find important.

The Nix maintainer team uses a GitHub project board to schedule and track reviews.

Summary by CodeRabbit

  • Refactor

    • Logging internals reworked across startup, child processes, remote builds, and connection handling to improve stability and teardown behavior. No public APIs or intended behavior changes.
  • Tests

    • Updated logging-related tests to match the revised logger lifecycle and teardown semantics.

Review Change Stack

There is a race between activity and logger teardown, so we
switch from a unique_ptr to a raw pointer and leak the logger so it
persists across all activity teardowns.

Signed-off-by: Lisanna Dettwyler <lisanna.dettwyler@gmail.com>
(cherry picked from commit be84169)
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 22, 2026

📝 Walkthrough

Walkthrough

The PR changes the global logger from an owning std::unique_ptr to a raw Logger* and updates initialization, factory call sites, wrapper allocations, subprocess contexts, and tests to release or assign raw pointers accordingly.

Changes

Logger ownership model migration to raw pointers

Layer / File(s) Summary
Core logger ownership API change
src/libutil/include/nix/util/logging.hh, src/libutil/logging.cc
Global logger declaration changes from std::unique_ptr<Logger> to Logger*; core initialization calls .release() on makeSimpleLogger(true) and the JSON tee setup wraps the raw pointer in a temporary std::unique_ptr for makeTeeLogger() and then releases the result back to a raw pointer.
Entry-point logger factory call adaptations
src/libmain/loggers.cc, src/libutil-c/nix_api_util.cc, src/nix/build-remote/build-remote.cc
Top-level sites now assign factory results via .release() or use raw new (C API) so the global logger holds a raw pointer.
Daemon connection and subprocess logger context switching
src/libstore/daemon.cc, src/libstore/unix/build/child.cc, src/libstore/unix/build/derivation-builder.cc, src/libutil/unix/processes.cc
Daemon processConnection allocates TunnelLogger with new and assigns it to the raw logger; child init, derivation builder, and startProcess replace unique_ptr assignments with .release()/raw pointer assignments.
Test infrastructure adaptation for raw pointer logger
src/libexpr-tests/primops.cc, src/libutil-tests/nix_api_util.cc
CaptureLogging saves the previous logger as a raw pointer and installs a heap CaptureLogger; tests restore the logger with .release() calls and access the active logger by casting the raw logger pointer.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • DeterminateSystems/nix-src#448: Both PRs touch the C logger integration in src/libutil-c/nix_api_util.cc, including nix_set_logger/CallbackLogger construction.

Poem

🐰 I swapped the unique for a pointer so neat,
Release and restore with a gentle heartbeat,
Tests now cast where smart-ptrs once stood,
Leak on purpose to keep shutdowns good,
A tiny hop for logs, a rabbit’s cheer!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: converting the global logger from a unique_ptr (with ownership) to a raw pointer by releasing/leaking it to fix a segfault caused by race conditions during teardown.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch leak-logger

Comment @coderabbitai help to get the list of available commands and usage tips.

@edolstra edolstra enabled auto-merge May 22, 2026 15:17
cole-h
cole-h previously approved these changes May 22, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/libutil/unix/processes.cc (1)

213-222: 💤 Low value

Unnecessary allocation when allowVfork is true.

newLogger is allocated unconditionally on line 213, but it's only assigned to logger when !options.allowVfork (line 215). When allowVfork is true, the allocation is made but never used—leaking memory in the parent even when not needed.

Consider moving the allocation inside the conditional to avoid the unnecessary leak in the vfork path:

Proposed refactor
-    auto newLogger = makeSimpleLogger().release();
     ChildWrapperFunction wrapper = [&] {
         if (!options.allowVfork) {
             /* Set a simple logger, while leaking (not destroying)
                the parent logger. We don't want to run the parent
                logger's destructor since that will crash (e.g. when
                ~ProgressBar() tries to join a thread that doesn't
                exist. */
-            logger = newLogger;
+            logger = makeSimpleLogger().release();
         }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/libutil/unix/processes.cc` around lines 213 - 222, The code
unconditionally calls makeSimpleLogger().release() into newLogger but only uses
it when !options.allowVfork, leaking memory when allowVfork is true; change the
ChildWrapperFunction wrapper so the allocation is done inside the conditional
(i.e., only allocate newLogger when !options.allowVfork) and assign logger =
newLogger there, leaving no unused allocation. Update references to
makeSimpleLogger(), newLogger, ChildWrapperFunction wrapper, options.allowVfork,
and logger to reflect the moved allocation and ensure no unused pointer is left
in the parent path.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@src/libutil/unix/processes.cc`:
- Around line 213-222: The code unconditionally calls
makeSimpleLogger().release() into newLogger but only uses it when
!options.allowVfork, leaking memory when allowVfork is true; change the
ChildWrapperFunction wrapper so the allocation is done inside the conditional
(i.e., only allocate newLogger when !options.allowVfork) and assign logger =
newLogger there, leaving no unused allocation. Update references to
makeSimpleLogger(), newLogger, ChildWrapperFunction wrapper, options.allowVfork,
and logger to reflect the moved allocation and ensure no unused pointer is left
in the parent path.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c6d9abde-a101-4638-acac-67f296681ae2

📥 Commits

Reviewing files that changed from the base of the PR and between 3b7d197 and a8b192c.

📒 Files selected for processing (10)
  • src/libexpr-tests/primops.cc
  • src/libmain/loggers.cc
  • src/libstore/daemon.cc
  • src/libstore/unix/build/child.cc
  • src/libstore/unix/build/derivation-builder.cc
  • src/libutil-c/nix_api_util.cc
  • src/libutil/include/nix/util/logging.hh
  • src/libutil/logging.cc
  • src/libutil/unix/processes.cc
  • src/nix/build-remote/build-remote.cc

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/libutil-c/nix_api_util.cc`:
- Around line 259-261: When replacing the global logger in nix_set_logger, the
existing nix::logger must be cleaned up so its destroy callback runs; before
assigning the new CallbackLogger, check if nix::logger is non-null and delete it
(so CallbackLogger::~CallbackLogger invokes vtable.destroy), then assign
nix::logger = new CallbackLogger(*vtable, userdata). Make this change inside the
same block where CallbackLogger is created (referencing CallbackLogger,
nix::logger and nix_set_logger) and use delete (not free) to trigger proper
destructor semantics; keep the intended shutdown leak behavior unchanged by only
performing this delete on explicit replacements.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 24c016e9-4d71-4815-b544-313430127f5d

📥 Commits

Reviewing files that changed from the base of the PR and between a8b192c and 73c7c58.

📒 Files selected for processing (2)
  • src/libutil-c/nix_api_util.cc
  • src/libutil-tests/nix_api_util.cc

Comment thread src/libutil-c/nix_api_util.cc
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/libutil-tests/nix_api_util.cc`:
- Around line 205-207: The inline comment is contradictory: the code calls
nix_set_logger(ctx, &captureLoggerVtable, &capture2) and then asserts
EXPECT_FALSE(capture.destroyed), so update the comment to match the test
behavior (replacement should NOT fire destroy). Change the comment referencing
the logger replacement to state that replacing the logger must NOT fire destroy
on the old userdata (or clarify the specific expected lifecycle), and ensure it
refers to captureLoggerVtable, capture2 and capture.destroyed so the intent
matches the ASSERT_EQ(nix_set_logger(...), NIX_OK) and
EXPECT_FALSE(capture.destroyed) lines.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4f163b84-0e91-4341-999d-58533a809401

📥 Commits

Reviewing files that changed from the base of the PR and between 73c7c58 and 867546b.

📒 Files selected for processing (1)
  • src/libutil-tests/nix_api_util.cc

Comment thread src/libutil-tests/nix_api_util.cc Outdated
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 22, 2026

@github-actions github-actions Bot temporarily deployed to pull request May 22, 2026 15:35 Inactive
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/libutil-tests/nix_api_util.cc (1)

146-148: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Comment may be outdated regarding destroy callback behavior.

The comment states that "the destroy callback runs while capture is still alive", but with the leaked-logger model (raw pointer instead of unique_ptr), the destroy callback may never be invoked. The test at line 194-209 explicitly verifies that replacing a logger does NOT invoke destroy on the old userdata. This comment may be outdated from before the logger ownership model changed.

📝 Suggested clarification
-    // Restore the default logger before `capture` goes out of scope so
-    // the destroy callback runs while `capture` is still alive.
+    // Restore the default logger before `capture` goes out of scope.
+    // Note: destroy is not called on the old logger when replacing.
     Finally restoreLogger([] { nix::logger = nix::makeSimpleLogger().release(); });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/libutil-tests/nix_api_util.cc` around lines 146 - 148, The existing
comment near Finally restoreLogger([] { nix::logger =
nix::makeSimpleLogger().release(); }); incorrectly asserts that "the destroy
callback runs while `capture` is still alive"; update the comment to reflect the
current leaked-logger model (nix::logger is a raw pointer returned by
makeSimpleLogger().release()) which means the old logger's destroy callback may
not be invoked when replacing the logger, and reference the test behavior that
verifies replacing a logger does NOT invoke destroy on the old userdata; make
the comment precise about ownership semantics and why the destroy callback may
never run instead of claiming it will.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/libutil-tests/nix_api_util.cc`:
- Around line 196-197: The comment near the Finally in nix_api_util.cc
incorrectly implies that nix::logger's teardown will call destroy on a live
capture; update the comment to match test behavior by either removing the
reference to destroy being called at teardown or explicitly stating that destroy
is not expected to be called (e.g., because the leaked-logger model with raw
pointers intentionally avoids destruction and line 207 verifies destroy is NOT
called when replacing the logger). Mention the relevant symbols Finally and
nix::logger/destroy so future readers see the intended lifecycle.

---

Outside diff comments:
In `@src/libutil-tests/nix_api_util.cc`:
- Around line 146-148: The existing comment near Finally restoreLogger([] {
nix::logger = nix::makeSimpleLogger().release(); }); incorrectly asserts that
"the destroy callback runs while `capture` is still alive"; update the comment
to reflect the current leaked-logger model (nix::logger is a raw pointer
returned by makeSimpleLogger().release()) which means the old logger's destroy
callback may not be invoked when replacing the logger, and reference the test
behavior that verifies replacing a logger does NOT invoke destroy on the old
userdata; make the comment precise about ownership semantics and why the destroy
callback may never run instead of claiming it will.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c32a9237-4967-4b41-bc04-6177bbff60b7

📥 Commits

Reviewing files that changed from the base of the PR and between 867546b and 360bfcb.

📒 Files selected for processing (1)
  • src/libutil-tests/nix_api_util.cc

Comment on lines 196 to 197
// Both captures must outlive the Finally so that the logger held
// by `nix::logger` at teardown can call destroy on a live capture.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Comment contradicts test behavior regarding destroy lifecycle.

The comment states that "the logger held by nix::logger at teardown can call destroy on a live capture", suggesting destroy will eventually be called. However:

  • Line 207 verifies that destroy is NOT called when replacing the logger
  • The leaked-logger model (using raw pointers) means loggers are intentionally not destroyed
  • This contradicts the comment's implication that destroy will fire at teardown

Consider clarifying whether destroy is expected to be called at any point in the test lifecycle, or remove the reference to destroy being called.

📝 Suggested clarification
-    // Both captures must outlive the Finally so that the logger held
-    // by `nix::logger` at teardown can call destroy on a live capture.
+    // Both captures must outlive the Finally block.
+    // Note: destroy is not called when replacing loggers.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Both captures must outlive the Finally so that the logger held
// by `nix::logger` at teardown can call destroy on a live capture.
// Both captures must outlive the Finally block.
// Note: destroy is not called when replacing loggers.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/libutil-tests/nix_api_util.cc` around lines 196 - 197, The comment near
the Finally in nix_api_util.cc incorrectly implies that nix::logger's teardown
will call destroy on a live capture; update the comment to match test behavior
by either removing the reference to destroy being called at teardown or
explicitly stating that destroy is not expected to be called (e.g., because the
leaked-logger model with raw pointers intentionally avoids destruction and line
207 verifies destroy is NOT called when replacing the logger). Mention the
relevant symbols Finally and nix::logger/destroy so future readers see the
intended lifecycle.

@edolstra edolstra added this pull request to the merge queue May 22, 2026
Merged via the queue into main with commit 178e43a May 22, 2026
29 checks passed
@edolstra edolstra deleted the leak-logger branch May 22, 2026 16:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants