Fix macOS crash in GetAvailableServers (CGo write barrier)#8541
Fix macOS crash in GetAvailableServers (CGo write barrier)#8541myleshorton wants to merge 11 commits intomainfrom
Conversation
GetAvailableServers was crashing with a runtime.throw in bulkBarrierPreWrite when called from Swift via gomobile. The Servers() return value contains pointer-rich types (Options with any interfaces) whose copy triggers GC write barriers on the CGo callback stack, which the GC heap bitmap doesn't cover. Move the Servers() + json.Marshal work to a dedicated goroutine so it runs on a proper Go stack where write barriers function correctly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Apply the same goroutine-based fix to getAutoLocation() in both ffi/ffi.go and mobile/mobile.go. These functions marshal Server types containing pointer-rich Options interfaces, which can trigger GC write barrier panics on the CGo callback stack. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Fixes a macOS crash when GetAvailableServers is invoked from Swift via gomobile by ensuring pointer-rich server data and JSON marshaling run on a proper Go stack (instead of a cgo callback stack that can trip GC write barriers).
Changes:
- Move
ServerManager().Servers()andjson.Marshalinto a dedicated goroutine and return the result via a channel. - Add inline documentation explaining the cgo/write-barrier crash and why the goroutine workaround is needed.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
There was a problem hiding this comment.
Pull request overview
Fixes an intermittent macOS crash when server data is marshalled from a gomobile/CGo callback context by moving pointer-rich server retrieval + json.Marshal work onto a dedicated goroutine (i.e., a normal Go stack).
Changes:
- Update
LanternCore.GetAvailableServers()to fetchServers()and marshal JSON on a goroutine, returning the result via a channel. - Apply the same goroutine-hop approach to auto-location server marshalling in the mobile binding and the FFI entrypoint.
- Add explanatory comments documenting the GC write-barrier/CGo-stack failure mode being avoided.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
lantern-core/core.go |
Moves Servers() + json.Marshal off the CGo callback stack in GetAvailableServers() to avoid write barrier panics. |
lantern-core/mobile/mobile.go |
Runs GetServerByTag + json.Marshal for auto-location on a goroutine to avoid the same CGo stack issue in gomobile calls. |
lantern-core/ffi/ffi.go |
Runs GetServerByTag + json.Marshal for auto-location on a goroutine to avoid CGo stack write-barrier panics in the FFI path. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
Address PR review feedback: - Extract a generic RunOnGoStack[T] helper in utils/gostack.go that runs a closure on a dedicated goroutine and returns the result via channel. Includes defer/recover so a panic in the closure returns an error instead of blocking the caller forever. - Refactor all three call sites (GetAvailableServers, getAutoLocation in FFI, GetAutoLocation in mobile) to use the shared helper. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move the CGo safety concern into radiance where the pointer-rich types originate. Callers now use ServersJSON() and GetServerByTagJSON() which marshal internally and return plain []byte. - Remove utils/gostack.go (moved to radiance common/gostack.go) - GetAvailableServers uses Manager.ServersJSON() - getAutoLocation (FFI) and GetAutoLocation (mobile) use Core.GetServerByTagJSON() which delegates to Manager.GetServerByTagJSON() - Upgrade radiance to 477c01d which adds the new methods Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Fixes a macOS crash when GetAvailableServers (and related server lookups) are invoked from Swift via gomobile/CGo by avoiding returning/copying pointer-rich server structs on the CGo callback stack.
Changes:
- Switch
GetAvailableServers()to useServersJSON()instead ofServers()+json.Marshal. - Add
GetServerByTagJSON()to the core interface and route mobile/FFI auto-location lookup through it. - Bump
github.com/getlantern/radianceto a newer pseudo-version.
Reviewed changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
lantern-core/mobile/mobile.go |
Uses GetServerByTagJSON() for auto-location JSON to avoid copying server structs. |
lantern-core/ffi/ffi.go |
Same shift to GetServerByTagJSON() for the C FFI path. |
lantern-core/core.go |
Adds GetServerByTagJSON() API and switches available-servers retrieval to ServersJSON(). |
go.mod |
Updates the radiance dependency version. |
go.sum |
Adds new radiance checksums for the bumped versions. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
- Remove slog.Debug that converted potentially large []byte to string on every GetAvailableServers call regardless of log level - Update radiance to 405169d which extracts shared lookup logic and uses MarshalContext consistently Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
All callers are internal to lantern-core and access serverManager directly. Removing the unnecessary interface method and delegation wrapper simplifies the API surface — external callers should use GetServerByTagJSON which is CGo-safe. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The previous fix (using ServersJSON/GetServerByTagJSON) moved the marshal inside radiance but didn't address the root cause: the crash is in the gomobile-generated CGo wrapper (_cgoexp_proxymobile__*), which copies Go pointer-containing return values to the C thread stack. The GC heap bitmap doesn't cover that memory, so bulkBarrierPreWrite panics. The only reliable fix is to run the entire function body on a real Go goroutine stack via runOffCgoStack. The goroutine result is sent back through a channel (which only involves copying the final []byte/string, not pointer-rich intermediate types). The FFI path (ffi.go) doesn't need this because its //export functions return *C.char (C-allocated memory, no GC pointers). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move the goroutine hop helper to radiance so any project consuming radiance through gomobile can use it. Replace the local runOffCgoStack in mobile.go with common.RunOffCgoStack. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ServersJSON and GetServerByTagJSON in radiance now handle the goroutine hop internally, so callers don't need to wrap anything. mobile.go goes back to straightforward delegation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Problem
Lantern's macOS app crashes intermittently with a Go runtime panic:
The crash is in the gomobile-generated CGo wrapper, not in application code. Gomobile-exported functions run on a CGo callback stack. The generated
_cgoexp_wrapper copies Go pointer-containing return values to the C thread stack, whose memory the GC heap bitmap doesn't cover. When the write barrier fires,bulkBarrierPreWritepanics.This doesn't affect the FFI path (
ffi.go) because//exportfunctions return*C.char(C-allocated memory, no GC pointers).Fix
1. Goroutine hop for gomobile exports (
mobile/mobile.go): WrapGetAvailableServersandGetAutoLocationwithcommon.RunOffCgoStack(from radiance), which runs the entire function body on a real Go goroutine via channel. The pointer-rich intermediate types (sing-boxoption.OutboundwithOptions anyinterfaces) never exist on the CGo callback stack -- the gomobile wrapper only sees the final[]byte/stringresult.2. CGo-safe JSON methods in radiance (companion PR getlantern/radiance#361):
ServersJSON()andGetServerByTagJSON()marshal pointer-rich sing-box types inside radiance under the read lock, returning plain[]byte. This is a cleaner API regardless -- callers don't need to handleServerstructs with pointer-rich fields.Also:
GetServerByTagfrom theAppinterface (all callers are internal)slog.Debugthat allocated a string copy of the servers JSON on every callgo mod tidyTest plan