
Implement trace-log correlation #942

Merged
grcevski merged 3 commits into open-telemetry:main from coralogix:matt/trace-log
Jan 15, 2026
Conversation

@mmat11
Contributor

@mmat11 mmat11 commented Nov 27, 2025

Checklist

@codecov

codecov Bot commented Nov 27, 2025

Codecov Report

❌ Patch coverage is 18.84422% with 323 lines in your changes missing coverage. Please review.
✅ Project coverage is 46.07%. Comparing base (6a37792) to head (e9bcde0).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
pkg/internal/ebpf/logenricher/logenricher.go 0.00% 241 Missing ⚠️
pkg/appolly/discover/matcher.go 45.00% 20 Missing and 2 partials ⚠️
pkg/ebpf/common/common.go 0.00% 18 Missing ⚠️
pkg/appolly/discover/finder.go 18.75% 12 Missing and 1 partial ⚠️
pkg/ebpf/common/common_linux.go 52.38% 7 Missing and 3 partials ⚠️
pkg/internal/shardedqueue/queue.go 70.58% 9 Missing and 1 partial ⚠️
pkg/ebpf/tracer_linux.go 0.00% 4 Missing ⚠️
pkg/ebpf/tracer.go 25.00% 2 Missing and 1 partial ⚠️
pkg/appolly/discover/typer.go 86.66% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #942      +/-   ##
==========================================
- Coverage   46.49%   46.07%   -0.42%     
==========================================
  Files         267      270       +3     
  Lines       28044    28410     +366     
==========================================
+ Hits        13040    13091      +51     
- Misses      14127    14433     +306     
- Partials      877      886       +9     
Flag Coverage Δ
integration-test 22.19% <12.50%> (-0.22%) ⬇️
integration-test-arm 0.00% <0.00%> (ø)
integration-test-vm-${ARCH}-${KERNEL_VERSION} 0.00% <0.00%> (ø)
k8s-integration-test 2.56% <0.00%> (-0.04%) ⬇️
oats-test 0.00% <0.00%> (ø)
unittests 47.35% <19.37%> (-0.40%) ⬇️

Flags with carried forward coverage won't be shown.


@mmat11 mmat11 force-pushed the matt/trace-log branch 6 times, most recently from 38b68a9 to 2da8e85 on December 4, 2025 16:03
@mmat11 mmat11 force-pushed the matt/trace-log branch 2 times, most recently from 1cf9a17 to 46ecdcc on December 5, 2025 14:40
@mmat11 mmat11 changed the title RFC: wip: trace-log correlation Implement trace-log correlation Dec 5, 2025
@mmat11 mmat11 marked this pull request as ready for review December 5, 2025 14:40
@mmat11 mmat11 requested a review from a team as a code owner December 5, 2025 14:40
@mmat11 mmat11 force-pushed the matt/trace-log branch 2 times, most recently from 9ca4e7a to 080bacc on December 5, 2025 15:25
Comment thread pkg/internal/ebpf/logenricher/logenricher.go
Comment thread pkg/internal/ebpf/logenricher/logenricher.go Outdated
Contributor

@rafaelroquetto rafaelroquetto left a comment

Another pass. We should run some scalability tests here before being able to merge this (e.g. deploy it in a big cluster and gather performance data). I am worried about a few points:

  • pressure on the ringbuffer: the shared ringbuffer will now be flooded by lines of logs of every tracked process - that alone is concerning, and it gets worse if we take into account the same ring buffer is used for other kinds of events
  • overhead of context-switching for each _write kprobe / CPU usage and ksoftirq handlers - you can gather performance data using perf for that

The handler for the logs should probably grab the event, and put it into a processing queue to be picked up by another thread or thread pool - doing heavy work like deserialising json or opening and closing files (i.e. I/O) is very slow and will block the ring buffer. That handler should be doing the minimum work possible and return (e.g. copy the event, enqueue it and return).

We should also track the target process stdout (or log sink), open it once for writing, and keep the file descriptor open until either the process or OBI itself terminates, as @grcevski suggested.

Comment thread bpf/logenricher/logenricher.c Outdated
Comment thread bpf/logenricher/logenricher.c Outdated
Comment thread bpf/logenricher/logenricher.c
Comment thread bpf/logenricher/logenricher.c Outdated
Comment thread bpf/logenricher/logenricher.c Outdated
Comment thread bpf/logenricher/path_resolver.h
Comment thread bpf/logenricher/path_resolver.h
// Loop 1, follow the dentry chain (up to a maximum of
// k_path_resolver_max_components) and store pointers to each dentry in
// dentry_arr
for (int i = 0; i < k_path_resolver_max_components; i++) {
Contributor

I wonder if we could simplify this into a single loop by simply walking backwards and writing directly to buf. Something like this (pseudo-code):

static __always_inline size_t
resolve_path(char *buf, size_t buflen, struct path *path, const struct task_struct *task) {
    /* ... setup task_struct, etc ... */

    /* these are the main pointers, we iterate backwards into dentry, and then
     * once we hit the current mnt_root, we swap the pointers, rinse and
     * repeat
     */
    struct vfsmount *mnt;
    struct dentry *dentry;

    char *end = buf + buflen;

    *--end = '\0';

    /* so here we start from the leaf path part */
    mnt = path->mnt;
    dentry = path->dentry;

    for (u32 i = 0; i < k_path_resolver_max_components; ++i) {
        struct dentry *parent;

        /* while we are in the same mnt, we keep prepending the path parts
         * for each dentry
         */

        while (dentry != mnt->mnt_root) {
            int len = dentry->d_name.len;

            /* account for '/' + name */
            if (end - buf < len + 1) {
                /* fail returning 0 bytes written */
                return 0u;
            }

            /* copy name backwards - this is pseudo / regular C, need to read
             * the len, qstr, etc... with bpf_probe_read_kernel() or
             * BPF_CORE_READ()
             */
            end -= len;

            /* read the path part and prepend '/' */
            bpf_probe_read_kernel(end, len, dentry->d_name.name);

            *--end = '/';

            /* move to parent dentry */
            parent = dentry->d_parent;

            if (parent == dentry)
                break;

            dentry = parent;
        }

        /* if we get here, we've reached the current dentry chain's mnt_root -
         * so all we do is rinse and repeat, switching the dentry and mnt
         * pointers to the next level
         */
        if (mnt == mnt->mnt_parent)
            break;

        dentry = mnt->mnt_mountpoint;
        mnt = mnt->mnt_parent;
    }

    // If we didn't write anything but the terminator, path is "/"
    if (*end == '\0') {
        if (end == buf)
            return 0;
        *--end = '/';
    }

    /* return the number of bytes written - you can use 0 to check if anything
     * was written
     */
    return buflen - end + buf;
}

or something like that - then in userspace you can simply read bytes_written bytes from the end of the buffer to extract the path. This makes it simpler and does without the need for the map to collect dentry pointers

Contributor Author

I'd like to avoid changing this code too much; as noted at the top of the file, it's been taken from elastic/ebpf, and ideally I'd use it as-is since I'm confident it just works. This function handles a lot of edge cases and I don't want to risk breaking it.
If the code is too much/too complex I can use the fd-based path construction the way I do for pipe_write; I preferred this approach as there are fewer moving parts (and fewer probes)

Contributor

I don't know... the original code loops over the N components twice, and requires additional loading/storing from an array per part (plus a map lookup every time this is invoked), whereas the approach I am proposing does without the array altogether and makes a single pass with no map lookups. This is something that we can test for.

The pseudo-code is lacking a few corner cases (truncation, root path / when no other parts exist, for instance) but these could be addressed.

Having said that, I understand this takes (human) time and see where you are coming from. I'd personally give it a shot (and potentially fail miserably) but ultimately it's up to you - so feel free to leave it as is.

An aside: since this code comes from elastic/ebpf and is dual-licensed (GPL/BSD2), you may want to confirm whether we need to keep the copyright and license notice or something like that.

Contributor Author

For the first part, I'll give it a shot after load testing. For the second part, I think we should keep copyright, but I don't know how to do it. I added a comment at the top of the file with the source, should I copy something else? /cc @MrAlias

Contributor

@mmat11 yeah that sounds good - come to think of it, you are right, let's do the testing first and see if that even matters.

Comment thread bpf/logenricher/types.h Outdated
Comment thread pkg/internal/ebpf/logenricher/logenricher.go Outdated
@mmat11
Contributor Author

mmat11 commented Dec 5, 2025

> Another pass. We should run some scalability tests here before being able to merge this (e.g. deploy it in a big cluster and gather performance data). I am worried about a few points:
>
> pressure on the ringbuffer: the shared ringbuffer will now be flooded by lines of logs of every tracked process - that alone is concerning, and it gets worse if we take into account the same ring buffer is used for other kinds of events

I am using a dedicated ring buffer (see log_events), we can discuss whether we want its size configurable (or auto-adjusting depending on the number of tracked services)

> overhead of context-switching for each _write kprobe / CPU usage and ksoftirq handlers - you can gather performance data using perf for that

Will do that 👍🏻

> The handler for the logs should probably grab the event, and put it into a processing queue to be picked up by another thread or thread pool - doing heavy work like deserialising json or opening and closing files (i.e. I/O) is very slow and will block the ring buffer. That handler should be doing the minimum work possible and return (e.g. copy the event, enqueue it and return).

I didn't want to add additional complexity without having measured some metrics or observed some bottlenecks, maybe we can do this after that step?

> We should also track the target process stdout (or log sink), open it once for writing, and keep the file descriptor open until either the process or OBI itself terminates, as @grcevski suggested.

👍🏻

@mmat11 mmat11 force-pushed the matt/trace-log branch 2 times, most recently from 54af8d8 to a89aba2 on December 9, 2025 17:52
Comment thread bpf/logenricher/types.h Outdated
Comment thread bpf/logenricher/types.h Outdated
Comment thread bpf/logenricher/types.h Outdated
}

static __always_inline bool tty_driver_is_pty(struct tty_struct *tty) {
if (bpf_core_enum_value_exists(enum tty_driver_type__new, TTY_DRIVER_TYPE_PTY)) {
Contributor

This is actually not enough; you need

    if (!bpf_core_field_exists(struct tty_driver, type)) {
        return false;
    }

otherwise it will fail in kernels that don't have that.

Contributor Author

the type field is always present

Comment thread bpf/logenricher/types.h Outdated
}

static __always_inline bool tty_driver_is_master(struct tty_struct *tty) {
if (bpf_core_enum_value_exists(enum tty_driver_subtype__new, PTY_TYPE_MASTER)) {
Contributor

same here for subtype (if that also can vary amongst kernel versions)

Contributor Author

same here

Comment thread bpf/logenricher/logenricher.c Outdated
@mmat11 mmat11 force-pushed the matt/trace-log branch 7 times, most recently from 4f7b4c3 to 5e8e0f1 on December 18, 2025 13:31
@mmat11
Contributor Author

mmat11 commented Jan 8, 2026

@rafaelroquetto @grcevski I did add a couple of optimizations, did some tests on a real cluster and applied all the other suggestions (I just left out the pathresolver.h changes). Let me know if I'm missing something

@grcevski
Contributor

grcevski commented Jan 8, 2026

> @rafaelroquetto @grcevski I did add a couple of optimizations, did some tests on a real cluster and applied all the other suggestions (I just left out the pathresolver.h changes). Let me know if I'm missing something

Thanks @mmat11 ! Reviewing this one more time is on my todo for today. I just noticed there are some conflicts with main.

@mmat11
Contributor Author

mmat11 commented Jan 8, 2026

>> @rafaelroquetto @grcevski I did add a couple of optimizations, did some tests on a real cluster and applied all the other suggestions (I just left out the pathresolver.h changes). Let me know if I'm missing something
>
> Thanks @mmat11 ! Reviewing this one more time is on my todo for today. I just noticed there are some conflicts with main.

rebased!

Contributor

@grcevski grcevski left a comment

This looks great @mmat11!

I had one question remaining and it's probably me not being able to see how this is done. I see we have another criteria definition for further filtering of the services we want to include for logging, how is this wired in?

I would've thought we needed maybe an extension to the pipeline to have Matcher filter out pids we don't want.

ebpfcommon.ForwardRingbuf(
&p.cfg.EBPF,
p.bpfObjects.LogEvents,
eventCtx.CommonPIDsFilter,
Contributor

Do we need to pass the log pids filter here to the forward ring buffer? The common pids filter is for the other bpf programs. Maybe that makes sense actually: you want to filter all logs for non-instrumented services. In a sense, tracking logs doesn't make sense if the pid is not also instrumented for tracing.

Contributor Author

IIRC I passed the common pids filter there because that filtering is happening at span level after data gets to userspace -- all I needed was to keep the ebpf map with pids aligned with the tracked processes (see AllowPID/BlockPID). Maybe I can add another check in ebpf which just calls valid_pid() on the pids map?

Contributor

Yeah I think it's fine, I think it will work, since we require that the logged pids are also traced. I think the main issue is that I still don't see how the logs filtering works, how the additional rules affect certain apps. If you can point me to the code that does the logs criteria matching, I must be missing it. Perhaps I need to download your branch.

Contributor Author

@mmat11 mmat11 Jan 9, 2026

It works similarly to how generictracer's filter works: I add the pids key (ns + pid) to a map and check via ebpf that the current pid is in that map (see pid_tracked() in logenricher.c)

the userspace tracer is responsible for reconciling this map (log_enricher_pids) with the discovered processes based on config

Contributor

Sorry I wasn't clear, so imagine we have instrumented two ports for tracing, OTEL_EBPF_OPEN_PORT="8080,8081". I see you have LogEnricherServicesConfig, so let's say I have a config:

log_enricher:
  open_port: 8080

I'm wondering where is that extra discovery config used to avoid enrichment of logs for 8081?

Contributor Author

I think I'm using the main discovery pids for filtering 🤦🏻‍♂️ , I'll fix it

Contributor

Heh no problem, I just thought I was missing something. I think it looks great to be honest, with adding the correct filtering we are good to go IMO.

Contributor Author

I had to change it at "matcher" level and pass the info all along the pipeline, the changes are in the last commit

@mmat11 mmat11 force-pushed the matt/trace-log branch 3 times, most recently from 5cf2592 to 8dcb895 on January 15, 2026 02:16
Contributor

@grcevski grcevski left a comment

LGTM! Thanks for making this happen @mmat11 ! I just have one minor comment about adding a log statement if a user picks logging criteria broader than the instrumentation criteria. We can document as much as we want, but there will always be user errors and it would be good to expose them.


return &ProcessMatch{Criteria: criteria, Process: proc}
return &ProcessMatch{Criteria: criteria, LogEnricherCriteria: logEnricherCriteria, Process: proc}
}
Contributor

I would just ask one last thing :), if you don't mind. Before returning nil here, can you add an if statement to check whether logEnricherCriteria is not empty and, if so, log it at info level?

Something like:

if len(logEnricherCriteria) > 0 {
  m.Log.Info("avoiding log instrumentation for process, since it wasn't matched for instrumentation. Log enrichment criteria must be a subset of instrumentation matching criteria", "pid", proc.Pid...
}

Contributor Author

yes, good suggestion!

Contributor Author

done

@grcevski grcevski merged commit 26231fb into open-telemetry:main Jan 15, 2026
56 checks passed
@MrAlias MrAlias added this to the v0.5.0 milestone Feb 10, 2026
@MrAlias MrAlias mentioned this pull request Feb 10, 2026