Implement trace-log correlation by mmat11 · Pull Request #942 · open-telemetry/opentelemetry-ebpf-instrumentation

mmat11 · 2025-11-27T12:36:22Z

Checklist

If this enhances / fixes / changes a core feature, I have updated the features documentation

codecov · 2025-11-27T12:42:21Z

Codecov Report

❌ Patch coverage is 18.84422% with 323 lines in your changes missing coverage. Please review.
✅ Project coverage is 46.07%. Comparing base (6a37792) to head (e9bcde0).
⚠️ Report is 2 commits behind head on main.

Files with missing lines	Patch %	Lines
pkg/internal/ebpf/logenricher/logenricher.go	0.00%	241 Missing ⚠️
pkg/appolly/discover/matcher.go	45.00%	20 Missing and 2 partials ⚠️
pkg/ebpf/common/common.go	0.00%	18 Missing ⚠️
pkg/appolly/discover/finder.go	18.75%	12 Missing and 1 partial ⚠️
pkg/ebpf/common/common_linux.go	52.38%	7 Missing and 3 partials ⚠️
pkg/internal/shardedqueue/queue.go	70.58%	9 Missing and 1 partial ⚠️
pkg/ebpf/tracer_linux.go	0.00%	4 Missing ⚠️
pkg/ebpf/tracer.go	25.00%	2 Missing and 1 partial ⚠️
pkg/appolly/discover/typer.go	86.66%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #942      +/-   ##
==========================================
- Coverage   46.49%   46.07%   -0.42%     
==========================================
  Files         267      270       +3     
  Lines       28044    28410     +366     
==========================================
+ Hits        13040    13091      +51     
- Misses      14127    14433     +306     
- Partials      877      886       +9

Flag	Coverage Δ
integration-test	`22.19% <12.50%> (-0.22%)`	⬇️
integration-test-arm	`0.00% <0.00%> (ø)`
integration-test-vm-${ARCH}-${KERNEL_VERSION}	`0.00% <0.00%> (ø)`
k8s-integration-test	`2.56% <0.00%> (-0.04%)`	⬇️
oats-test	`0.00% <0.00%> (ø)`
unittests	`47.35% <19.37%> (-0.40%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

rafaelroquetto

Another pass. We should run some scalability tests here before being able to merge this (e.g. deploy it in a big cluster and gather performance data). I am worried about a few points:

pressure on the ringbuffer: the shared ringbuffer will now be flooded by lines of logs of every tracked process - that alone is concerning, and it gets worse if we take into account the same ring buffer is used for other kinds of events
overhead of context-switching for each _write kprobe / CPU usage and ksoftirq handlers - you can gather performance data using perf for that

The handler for the logs should probably grab the event, and put it into a processing queue to be picked up by another thread or thread pool - doing heavy work like deserialising json or opening and closing files (i.e. I/O) is very slow and will block the ring buffer. That handler should be doing the minimum work possible and return (e.g. copy the event, enqueue it and return).

We should also track the target process stdout (or log sink), open it once for writing, and keep the file descriptor open until either the process or OBI itself terminates, as @grcevski suggested.

rafaelroquetto · 2025-12-05T17:24:33Z

+    // Loop 1, follow the dentry chain (up to a maximum of
+    // k_path_resolver_max_components) and store pointers to each dentry in
+    // dentry_arr
+    for (int i = 0; i < k_path_resolver_max_components; i++) {


I wonder if we could simplify this into a single loop by simply walking backwards and write directly to buf. Something like this (pseudo-code):

static __always_inline size_t resolve_path(char *buf, size_t buflen, struct path *path, const struct task_struct *task) { /* ... setup task_struct, etc ... */ /* these are the main pointers, we iterate backwards into dentry, and then * once we hit the current mnt_root, we swap the pointers, rinse and * repeat */ struct vfsmount *mnt; struct dentry *dentry; char *end = buf + buflen; *--end = '\0'; /* so here we start from the leaf path part */ mnt = path->mnt; dentry = path->dentry; for (u32 i = 0; i < k_path_resolver_max_components; ++i) { struct dentry *parent; /* while we are in the samen mnt, we keep prepending the path parts * for each dentry */ while (dentry != mnt->mnt_root) { int len = dentry->d_name.len; /* account for '/' + name */ if (end - buf < len + 1) { /* fail returning 0 bytes written */ return 0u; } /* copy name backwards - this is pseudo / regular C, need to read * the len, qstr, etc... with bpf_probe_read_kernel() or * BPF_CORE_READ() */ end -= len; /* read the path part and preprent '/' */ bpf_probe_read_kernel(end, len, dentry->d_name.name); *--end = '/'; /* move to parent dentry */ parent = dentry->d_parent; if (parent == dentry) break; dentry = parent; } /* if we getting here, we got to the current's dentry chain mnt_root - * so all we do is rinse and repeat and switch the dentry and mnt * pointers to the next level */ if (mnt == mnt->mnt_parent) break; dentry = mnt->mnt_mountpoint; mnt = mnt->mnt_parent; } // If we didn't write anything but the terminator, path is "/" if (*end == '\0') { if (end == buf) return 0; *--end = '/'; } /* return the number of bytes read, you can use 0 to check if anything was * written */ return buflen - end + buf; }

or something like that - then in userspace you can simply iterate bytes_written event from the end of the buffer to extract the path. This makes it simpler and does ithout the need for the map to collect dentry pointers

I'd like to avoid changing this code too much, as written on top of the file it's been taken from elastic/ebpf and ideally I'd use it as-is as I have the confidence it just works. This function has a lot of edge-cases handling and I don't wanna risk breaking it.
If the code is too much/too complex I can use the fd-based path construction the way I do for pipe_write, I preferred this as there are less moving parts (and less probes)

I don't know... the original code loops over the N components twice, and requires additional loading/storing from an array per part (plus a map lookup every time this is invoked) - whereas the approach I am proposing does without the array altogether and loops once (and no map lookups) in a single pass. This is something that we can test for.

The pseudo-code is lacking a few corner cases (truncation, root path / when no other parts exist, for instance) but these could be addressed.

Having said that, I understand this takes (human) time and see where you are coming from. I'd personally give it a shot (and potentially fail miserably) but ultimately it's up to you - so feel free to leave it as is.

An aside point, since this code comes from elastic/ebpf and is dual-licensed (GPL/BSD2) you may want to confirm whether we need to keep the copyright and license notice or somenthing like that.

For the first part, I'll give it a shot after load testing. For the second part, I think we should keep copyright, but I don't know how to do it. I added a comment at the top of the file with the source, should I copy something else? /cc @MrAlias

@mmat11 yeah that sounds good - come to think of it, you are right, let's do the testing first and see if that even matters.

mmat11 · 2025-12-05T21:24:59Z

Another pass. We should run some scalability tests here before being able to merge this (e.g. deploy it in a big cluster and gather performance data). I am worried about a few points:

pressure on the ringbuffer: the shared ringbuffer will now be flooded by lines of logs of every tracked process - that alone is concerning, and it gets worse if we take into account the same ring buffer is used for other kinds of events

I am using a dedicated ring buffer (see log_events), we can discuss whether we want its size configurable (or auto-adjusting depending on the number of tracked services)

overhead of context-switching for each _write kprobe / CPU usage and ksoftirq handlers - you can gather performance data using perf for that

Will do that 👍🏻

The handler for the logs should probably grab the event, and put it into a processing queue to be picked up by another thread or thread pool - doing heavy work like deserialising json or opening and closing files (i.e. I/O) is very slow and will block the ring buffer. That handler should be doing the minimum work possible and return (e.g. copy the event, enqueue it and return).

I didn't want to add additional complexity without having measured some metrics or observed some bottlenecks, maybe we can do this after that step?

We should also track the target process stdout (or log sink), open it once for writing, and keep the file descriptor open until either the process or OBI itself terminates, as @grcevski suggested.

👍🏻

rafaelroquetto · 2025-12-09T19:17:31Z

+}
+
+static __always_inline bool tty_driver_is_pty(struct tty_struct *tty) {
+    if (bpf_core_enum_value_exists(enum tty_driver_type__new, TTY_DRIVER_TYPE_PTY)) {


This is actualy not enough, you need

if (!bpf_core_field_exists(struct tty_driver, type)) { return false; }

otherwise it will fail in kernels that don't have that.

the type field is always present

rafaelroquetto · 2025-12-09T19:18:11Z

+}
+
+static __always_inline bool tty_driver_is_master(struct tty_struct *tty) {
+    if (bpf_core_enum_value_exists(enum tty_driver_subtype__new, PTY_TYPE_MASTER)) {


same here for subtype (if that also can vary amongst kernel versions)

mmat11 · 2026-01-08T11:38:25Z

@rafaelroquetto @grcevski I did add a couple of optimizations, did some tests on a real cluster and applied all the other suggestions (I just left out the pathresolver.h changes). Let me know if I'm missing something

grcevski · 2026-01-08T15:26:38Z

@rafaelroquetto @grcevski I did add a couple of optimizations, did some tests on a real cluster and applied all the other suggestions (I just left out the pathresolver.h changes). Let me know if I'm missing something

Thanks @mmat11 ! Reviewing this one more time is on my todo for today. I just noticed there are some conflicts with main.

mmat11 · 2026-01-08T15:29:40Z

@rafaelroquetto @grcevski I did add a couple of optimizations, did some tests on a real cluster and applied all the other suggestions (I just left out the pathresolver.h changes). Let me know if I'm missing something

Thanks @mmat11 ! Reviewing this one more time is on my todo for today. I just noticed there are some conflicts with main.

rebased!

grcevski

This looks great @mmat11!

I had one question remaining and it's probably me not being able to see how this is done. I see we have another criteria definition for further filtering of the services we want to include for logging, how is this wired in?

I would've thought we needed maybe an extension to the pipeline to have Matcher filter out pids we don't want.

grcevski · 2026-01-09T13:29:00Z

+	ebpfcommon.ForwardRingbuf(
+		&p.cfg.EBPF,
+		p.bpfObjects.LogEvents,
+		eventCtx.CommonPIDsFilter,


Do we need to pass the log pids filter here to the forward ring buff? The common pids filter is for the other bpf programs. Maybe that makes sense actually, you want to filter all logs for non instrumented services. In a sense, tracking logs doesn't make sense if the pid is not also instrumented for tracing.

IIRC I passed the common pids filter there because that filtering is happening at span level after data gets to userspace -- all I needed was to keep the ebpf map with pids aligned with the tracked processes (see AllowPID/BlockPID). Maybe I can add another check in ebpf which just calls valid_pid() on the pids map?

Yeah I think it's fine, I think it will work, since we require that the logged pids are also traced. I think the main issue is that I still don't see how the logs filtering works, how the additional rules affect certain apps. If you can point me to the code that does the logs criteria matching, I must be missing it. Perhaps I need to download your branch.

It works similarly to how generictracer's filter works, I add the pids key (ns + pid) to a map and check via ebpf the current pid is in that map (see pid_tracked() in logenricher.c)

the userspace tracer is responsible to reconcile this map (log_enricher_pids) with the discovered processes based on config

Sorry I wasn't clear, so imagine we have instrumented two ports for tracing, OTEL_EBPF_OPEN_PORT="8080,8081". I see you have LogEnricherServicesConfig, so let's say I have a config:

log_enricher: open_port: 8080

I'm wondering where is that extra discovery config used to avoid enrichment of logs for 8081?

I think I'm using the main discovery pids for filtering 🤦🏻‍♂️ , I'll fix it

Heh no problem, I just thought I was missing something. I think it looks great to be honest, with adding the correct filtering we are good to go IMO.

I had to change it at "matcher" level and pass the info all along the pipeline, the changes are in the last commit

grcevski

LGTM! Thanks for making this happen @mmat11 ! I just have one minor comment about adding a log statement if a user picks logging criteria broader than the instrumentation criteria. We can document as much as we want, but there will always be user errors and it would be good to expose them.

grcevski · 2026-01-15T16:24:43Z


-		return &ProcessMatch{Criteria: criteria, Process: proc}
+		return &ProcessMatch{Criteria: criteria, LogEnricherCriteria: logEnricherCriteria, Process: proc}
 	}


I would just ask one last thing :), if you don't mind. Before returning nil here can you add an if statement to check if the logEnricherCriteria is not empty and with info log tell us about it.

Something like:

if len(logEnricherCriteria) > 0 { m.Log.Info("avoiding log instrumentation for process, since it wasn't matched for instrumentation. Log enrichment criteria must be a subset of instrumentation matching criteria", "pid", proc.Pid... }

yes, good suggestion!

Signed-off-by: Mattia Meleleo <mattia.meleleo@coralogix.com>

mmat11 force-pushed the matt/trace-log branch 6 times, most recently from 38b68a9 to 2da8e85 Compare December 4, 2025 16:03

mmat11 force-pushed the matt/trace-log branch 2 times, most recently from 1cf9a17 to 46ecdcc Compare December 5, 2025 14:40

mmat11 changed the title ~~RFC: wip: trace-log correlation~~ Implement trace-log correlation Dec 5, 2025

mmat11 marked this pull request as ready for review December 5, 2025 14:40

mmat11 requested a review from a team as a code owner December 5, 2025 14:40

mmat11 force-pushed the matt/trace-log branch 2 times, most recently from 9ca4e7a to 080bacc Compare December 5, 2025 15:25

grcevski reviewed Dec 5, 2025

View reviewed changes

Comment thread pkg/internal/ebpf/logenricher/logenricher.go

Comment thread pkg/internal/ebpf/logenricher/logenricher.go Outdated

rafaelroquetto reviewed Dec 5, 2025

View reviewed changes

mmat11 force-pushed the matt/trace-log branch 2 times, most recently from 54af8d8 to a89aba2 Compare December 9, 2025 17:52

rafaelroquetto reviewed Dec 9, 2025

View reviewed changes

mmat11 force-pushed the matt/trace-log branch 7 times, most recently from 4f7b4c3 to 5e8e0f1 Compare December 18, 2025 13:31

mmat11 force-pushed the matt/trace-log branch from 5e8e0f1 to 6311be5 Compare December 30, 2025 16:06

mmat11 force-pushed the matt/trace-log branch from 6311be5 to 4119e70 Compare January 7, 2026 23:25

mmat11 force-pushed the matt/trace-log branch from 4119e70 to 5915475 Compare January 8, 2026 15:29

grcevski reviewed Jan 9, 2026

View reviewed changes

mmat11 force-pushed the matt/trace-log branch 3 times, most recently from 5cf2592 to 8dcb895 Compare January 15, 2026 02:16

grcevski approved these changes Jan 15, 2026

View reviewed changes

mmat11 added 3 commits January 15, 2026 17:30

bpf: add trace-log correlation

f840bfa

Signed-off-by: Mattia Meleleo <mattia.meleleo@coralogix.com>

rebase and use g_bpf_debug

812e747

fix log enricher pid filtering

e9bcde0

mmat11 force-pushed the matt/trace-log branch from 8dcb895 to e9bcde0 Compare January 15, 2026 16:30

grcevski merged commit 26231fb into open-telemetry:main Jan 15, 2026
56 checks passed

MrAlias added this to the v0.5.0 milestone Feb 10, 2026

MrAlias mentioned this pull request Feb 10, 2026

Release v0.5.0 #1274

Merged

Conversation

mmat11 commented Nov 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Checklist

Uh oh!

codecov Bot commented Nov 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Uh oh!

Uh oh!

rafaelroquetto left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

mmat11 commented Dec 5, 2025

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

mmat11 commented Jan 8, 2026

Uh oh!

grcevski commented Jan 8, 2026

Uh oh!

mmat11 commented Jan 8, 2026

Uh oh!

grcevski left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mmat11 Jan 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

grcevski left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mmat11 commented Nov 27, 2025 •

edited

Loading

codecov Bot commented Nov 27, 2025 •

edited

Loading

mmat11 Jan 9, 2026 •

edited

Loading