Skip to content

Conversation

@GregMefford
Copy link
Contributor

@GregMefford GregMefford commented Nov 21, 2025

This PR is a step in between the initial plumbing that was recently merged in #958 and the complete implementation that was spiked out in #289.

It adds a BEAM unwinder that supports x86 and ARM, using Frame Pointers when available and scanning the stack when they're not. For now, the frames that are passed back to the agent only identify which CodeHeader the PC belongs to, along with the PC itself. A final PR will actually do the additional work of reading the CodeHeader to resolve the symbol names.

image

Initial issues when the PR was opened

The reason I'm opening this initially as a draft is that it seems to no longer be possible to use ReadSymbols to read the r symbol like I was before. I get an error like the following:

The issue has been resolved now
ef.ReadSymbols undefined (type *pfelf.File has no field or method ReadSymbols)

In my previous code, I was doing this to read it:

	// TODO: We want to avoid reading static symbols to find the address of r here,
	// because it would be removed if the binary is stripped, but it seems that's the only
	// way to get it currently. Look into programmatically calculating it by disassembly.
	// If possible, get it exported in erl_etp.c
	symbols, err := ef.ReadSymbols()
	if err != nil {
		return nil, fmt.Errorf("failed to read symbols: %v", err)
	}

	// "r" symbol is from:
	// https://github.com/erlang/otp/blob/OTP-27.2.4/erts/emulator/beam/beam_ranges.c#L62
	r, err := symbols.LookupSymbolAddress(libpf.SymbolName("r"))

	if err != nil {
		return nil, fmt.Errorf("symbol 'r' not found: %v", err)
	}

So I guess the time has come to figure out how to handle that more correctly. What I've included in the draft PR won't work because I get the following error when trying to load to the BEAM instance:

Failed to load /home/greg/.asdf/installs/erlang/27.2.4/erts-15.2.2/bin/beam.smp (0xde6a87c971fbf0b9): symbol 'r' not found: symbol not found

@fabled
Copy link
Contributor

fabled commented Nov 21, 2025

The reason I'm opening this initially as a draft is that it seems to no longer be possible to use ReadSymbols to read the r symbol like I was before

Yes, the ReadSymbols function was removed as it reads and converts the whole symbol table using lot of unneeded memory. You can still extract the symbol via the VisitSymbols and giving a lambda that gathers the information needed (and abort the parsing when needed information is found. For an example, see https://github.com/open-telemetry/opentelemetry-ebpf-profiler/blob/main/nativeunwind/elfunwindinfo/elfgopclntab.go#L313-L320

But ideally the location would be extracted with disassembly if needed to allow working with stripped binaries.

@GregMefford GregMefford force-pushed the beam_minimal_unwinder branch from c07951b to a6e7994 Compare November 22, 2025 16:31
@GregMefford GregMefford marked this pull request as ready for review November 22, 2025 17:43
@GregMefford GregMefford requested review from a team as code owners November 22, 2025 17:43
{
state->SP_REGISTER += 8;
bpf_probe_read_user(&state->pc, sizeof(u64), (void *)state->SP_REGISTER);
if ((state->pc & 0x03) == 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is 0x03 a special value that should be documented or we can link to documentation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a comment for now to document it. I will probably go ahead and just read in the etp_tag_primary_size and etp_tag_primary_header symbols in a future iteration just to make it more clear and less likely to break in the future, but I'm guessing that's very unlikely to change anytime soon since that would mean shuffling the whole tagging system around to make room for more bits in the encoding scheme.

@GregMefford
Copy link
Contributor Author

GregMefford commented Dec 5, 2025

I timed out for today, but I think I incorporated most of the feedback, in case you get a chance to review and confirm. I will come back and do some more clean-up though. For example I'm actually not even using any of the VM offsets anymore since they're all just assumed / mirrored in the structs I'm defining in order to read in the data consecutively at this point.

@GregMefford GregMefford force-pushed the beam_minimal_unwinder branch from 1095f50 to ab2533f Compare December 5, 2025 13:02
@GregMefford
Copy link
Contributor Author

I finished the rest of the clean-up and rebased onto main so that the conflicts on the ebpf binaries should also be resolved now.

Copy link
Contributor

@fabled fabled left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great module few nits. Should be ready for merge after the non leaf marking move. Thanks!

@GregMefford GregMefford force-pushed the beam_minimal_unwinder branch from ab2533f to a1b2968 Compare December 5, 2025 23:00
@GregMefford
Copy link
Contributor Author

I think I got all the nits picked, but let me know if I misunderstood the comment about moving the nonleaf marking thing!

@fabled fabled merged commit aa1cbed into open-telemetry:main Dec 7, 2025
29 checks passed
@GregMefford GregMefford deleted the beam_minimal_unwinder branch December 7, 2025 23:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants