ROX-30296: track POSIX ACL changes via inode_set_acl LSM hook#878
ROX-30296: track POSIX ACL changes via inode_set_acl LSM hook#878Stringy wants to merge 13 commits into
Conversation
📝 WalkthroughWalkthroughAdds a new eBPF LSM hook ChangesACL Event Tracking
Estimated code review effort: 4 (Complex) | ~60 minutes Sequence Diagram(s)sequenceDiagram
participant Kernel
participant check_inode_set_acl
participant trace_inode_set_acl
participant submit_acl_event
participant Ringbuffer
participant Userspace
Kernel->>check_inode_set_acl: inode_set_acl LSM hook
check_inode_set_acl-->>Kernel: return 0
Kernel->>trace_inode_set_acl: inode_set_acl tracepoint
trace_inode_set_acl->>trace_inode_set_acl: inode_is_monitored check
trace_inode_set_acl->>submit_acl_event: submit_acl_event(args, acl_name, kacl)
submit_acl_event->>Ringbuffer: reserve_event + submit FILE_ACTIVITY_ACL_SET
Ringbuffer->>Userspace: FileData::new decodes acl entries
Userspace->>Userspace: FileData::AclSet -> FileAclChange protobuf
Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #878 +/- ##
==========================================
+ Coverage 32.23% 32.61% +0.37%
==========================================
Files 21 21
Lines 2736 2919 +183
Branches 2736 2919 +183
==========================================
+ Hits 882 952 +70
- Misses 1851 1964 +113
Partials 3 3 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
46d84a9 to
11683ba
Compare
67a3b12 to
be7fb22
Compare
eae45f0 to
fc73ed7
Compare
|
Caution Failed to replace (edit) comment. This is likely due to insufficient permissions or the comment being deleted. Error details |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
fact/src/event/mod.rs (1)
538-555: 🎯 Functional Correctness | 🟡 Minor | ⚡ Quick winAdd the missing
Chownequality arm.While extending this manual matcher,
FileData::Chownstill falls through to_ => false, so two identical ownership-change events compare unequal.Proposed fix
(FileData::Unlink(this), FileData::Unlink(other)) => this == other, (FileData::Chmod(this), FileData::Chmod(other)) => this == other, + (FileData::Chown(this), FileData::Chown(other)) => this == other, (FileData::Rename(this), FileData::Rename(other)) => this == other, (FileData::SetXattr(this), FileData::SetXattr(other)) => this == other,🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@fact/src/event/mod.rs` around lines 538 - 555, The manual PartialEq implementation for FileData is missing the FileData::Chown arm, so identical ownership-change events currently fall through to false. Update the eq method in FileData’s PartialEq match to add a Chown(this), Chown(other) branch that compares the contained values the same way as the other variants, keeping the fallback _ => false only for truly different variants.
🧹 Nitpick comments (1)
fact/src/bpf/checks.rs (1)
34-42: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winConsider logging the underlying error before treating any load failure as "unsupported."
probe_hookreturnsfalsefor missing programs, wrong program type, and load failures alike, discarding the error fromprog.load(...). This is fine for genuinely unsupported kernels, but it also silently masks a real bug (e.g. a verifier failure inchecks.c) as "capability not supported," making such regressions hard to diagnose.♻️ Suggested diagnostic logging
fn probe_hook(obj: &mut aya::Ebpf, prog_name: &str, hook: &str, btf: &Btf) -> bool { let Some(prog) = obj.program_mut(prog_name) else { return false; }; let Ok(prog): Result<&mut Lsm, _> = prog.try_into() else { return false; }; - prog.load(hook, btf).is_ok() + match prog.load(hook, btf) { + Ok(()) => true, + Err(e) => { + debug!("Failed to probe {hook} support via {prog_name}: {e:#?}"); + false + } + } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@fact/src/bpf/checks.rs` around lines 34 - 42, `probe_hook` is swallowing real load errors by returning false for every failure case. Update the function to log the underlying error from `prog.load(hook, btf)` before mapping it to unsupported, while keeping the existing false behavior for missing programs and wrong program types. Use the `probe_hook` function and the `Lsm` load path to locate the change, and make sure the log clearly distinguishes load failures from genuinely unsupported kernels.Source: Path instructions
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@fact-ebpf/src/bpf/events.h`:
- Around line 193-199: The ACL copy loop in events.h is always assigning
entry.e_uid.val to args->event->acl.entries[i].e_id, but posix_acl_entry.e_uid
is only valid for ACL_USER and ACL_GROUP. Update the logic in the ACL entry
population path to branch on entry.e_tag and set e_id from e_uid.val only for
those tags; for ACL_USER_OBJ, ACL_GROUP_OBJ, ACL_MASK, and ACL_OTHER, assign
ACL_UNDEFINED_ID instead. Keep the change localized to the ACL event-building
loop so the rest of the field copying in args->event->acl.entries remains
unchanged.
---
Outside diff comments:
In `@fact/src/event/mod.rs`:
- Around line 538-555: The manual PartialEq implementation for FileData is
missing the FileData::Chown arm, so identical ownership-change events currently
fall through to false. Update the eq method in FileData’s PartialEq match to add
a Chown(this), Chown(other) branch that compares the contained values the same
way as the other variants, keeping the fallback _ => false only for truly
different variants.
---
Nitpick comments:
In `@fact/src/bpf/checks.rs`:
- Around line 34-42: `probe_hook` is swallowing real load errors by returning
false for every failure case. Update the function to log the underlying error
from `prog.load(hook, btf)` before mapping it to unsupported, while keeping the
existing false behavior for missing programs and wrong program types. Use the
`probe_hook` function and the `Lsm` load path to locate the change, and make
sure the log clearly distinguishes load failures from genuinely unsupported
kernels.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Enterprise
Run ID: 5b916de2-f11e-48d7-b345-c3ed9b660223
📒 Files selected for processing (16)
CHANGELOG.mdfact-ebpf/src/bpf/checks.cfact-ebpf/src/bpf/events.hfact-ebpf/src/bpf/main.cfact-ebpf/src/bpf/types.hfact-ebpf/src/lib.rsfact/src/bpf/checks.rsfact/src/bpf/mod.rsfact/src/event/mod.rsfact/src/metrics/kernel_metrics.rstests/event.pytests/server.pytests/test_acl.pytests/test_xattr.pytests/utils.pythird_party/stackrox
Molter73
left a comment
There was a problem hiding this comment.
Overall looks good! Thanks for tackling this one!
There's a couple comments, mostly questions so I expect we should be able to merge this soon.
| args->event->acl.acl_type = FACT_ACL_TYPE_ACCESS; | ||
| } | ||
|
|
||
| if (kacl == NULL) { |
There was a problem hiding this comment.
[ultra-nit] Can we negate this condition so the largest block is the one for the true condition and not the else?
| 0x01 => AclTag::UserObj, | ||
| 0x02 => AclTag::User, | ||
| 0x04 => AclTag::GroupObj, | ||
| 0x08 => AclTag::Group, | ||
| 0x10 => AclTag::Mask, | ||
| 0x20 => AclTag::Other, | ||
| other => AclTag::Unknown(other), |
There was a problem hiding this comment.
Have you considered creating a enum acl_tag_t in the bpf code to encode these values at a lower level?
There was a problem hiding this comment.
Good plan, will add.
| perm: e.perm as u32, | ||
| // ACL_UNDEFINED_ID (0xFFFFFFFF) for entries that don't | ||
| // carry a uid/gid (USER_OBJ, GROUP_OBJ, MASK, OTHER). | ||
| id: e.id.unwrap_or(0xFFFFFFFF), |
There was a problem hiding this comment.
| id: e.id.unwrap_or(0xFFFFFFFF), | |
| id: e.id.unwrap_or(u32::MAX), |
| .map(|e| fact_api::AclEntry { | ||
| tag: i32::from(e.tag), | ||
| perm: e.perm as u32, | ||
| // ACL_UNDEFINED_ID (0xFFFFFFFF) for entries that don't | ||
| // carry a uid/gid (USER_OBJ, GROUP_OBJ, MASK, OTHER). | ||
| id: e.id.unwrap_or(0xFFFFFFFF), | ||
| }) |
There was a problem hiding this comment.
Consider adding impl From<AclEntry> for fact_api::AclEntry and replacing this block with
| .map(|e| fact_api::AclEntry { | |
| tag: i32::from(e.tag), | |
| perm: e.perm as u32, | |
| // ACL_UNDEFINED_ID (0xFFFFFFFF) for entries that don't | |
| // carry a uid/gid (USER_OBJ, GROUP_OBJ, MASK, OTHER). | |
| id: e.id.unwrap_or(0xFFFFFFFF), | |
| }) | |
| .map(fact_api::AclEntry::from) |
| _ACL_VERSION = 2 | ||
| _ACL_UNDEFINED_ID = 0xFFFFFFFF | ||
|
|
||
| # Kernel ACL tag values (from include/uapi/linux/posix_acl.h) |
There was a problem hiding this comment.
This comment should be added to the main code as well, better if it is a permalink to github or elixir.bootlin
| The monitored_dir fixture is tracked by path prefix, but since | ||
| the ACL hook only monitors by inode, we need the directory to | ||
| be inode-tracked. monitored_dir is included in fact's paths | ||
| config, so fact tracks it by inode after the initial scan. |
There was a problem hiding this comment.
What? There's no configuration change happening in this test, what?
| # Verify the server is working by doing a chmod on a monitored file | ||
| process = Process.from_proc() | ||
| mode = 0o644 | ||
| os.chmod(test_file, mode) |
There was a problem hiding this comment.
Why chmod? Why not just setting an acl on that file? If you use different type of events a misconfiguration on the server could exclude all acl events and inadvertently pass the test when the acl event is wrongly generated.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
fact-ebpf/src/bpf/types.h (1)
82-103: 🗄️ Data Integrity & Integration | 🔵 Trivial | ⚡ Quick winConsider fixed-width types for wire-format enum fields.
acl_type_tandacl_tag_tare plain C enums with implementation-defined underlying size, but they're used directly as fields inacl_entry_t/theaclunion member (Line 100, Line 149) that gets copied byte-for-byte into a ring-buffer event later decoded by a separate Rust binary. If the enum's compiled width ever diverges from what the Rust decoder assumes, the ACL payload would be silently misparsed. Using an explicit fixed-width type (e.g.,unsigned char) fore_tag/acl_typeat the struct-field level (while keeping the enum for symbolic constants elsewhere) would make the cross-language layout contract explicit and robust.♻️ Proposed fix
struct acl_entry_t { - acl_tag_t e_tag; + unsigned char e_tag; // acl_tag_t value unsigned short e_perm; unsigned int e_id; };🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@fact-ebpf/src/bpf/types.h` around lines 82 - 103, `acl_type_t` and `acl_tag_t` are being used as wire-format field types in `acl_entry_t` and the `acl` union, but plain enums have implementation-defined size. Update the struct fields that are serialized across the ring buffer to use an explicit fixed-width type such as `unsigned char`, while keeping `acl_type_t` and `acl_tag_t` only as symbolic enums for constants. Make sure the `acl_entry_t` definition and the `acl` union member stay layout-compatible with the Rust decoder.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@fact-ebpf/src/bpf/types.h`:
- Around line 82-103: `acl_type_t` and `acl_tag_t` are being used as wire-format
field types in `acl_entry_t` and the `acl` union, but plain enums have
implementation-defined size. Update the struct fields that are serialized across
the ring buffer to use an explicit fixed-width type such as `unsigned char`,
while keeping `acl_type_t` and `acl_tag_t` only as symbolic enums for constants.
Make sure the `acl_entry_t` definition and the `acl` union member stay
layout-compatible with the Rust decoder.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Enterprise
Run ID: ba942558-b1cd-4991-8566-24551f09d46b
📒 Files selected for processing (6)
fact-ebpf/src/bpf/events.hfact-ebpf/src/bpf/types.hfact/src/bpf/checks.rsfact/src/bpf/mod.rsfact/src/event/mod.rstests/test_acl.py
🚧 Files skipped from review as they are similar to previous changes (5)
- fact/src/bpf/checks.rs
- fact-ebpf/src/bpf/events.h
- tests/test_acl.py
- fact/src/bpf/mod.rs
- fact/src/event/mod.rs
Description
Adds ACL tracking via inode_set_acl hook. Due to lack of path info in this hook, it is only possible to monitor inode-tracked files (i.e. those on the host file system) this is similar to how xattrs are tracked.
An extension to the BPF checks.c file has been added, because this hook does not exist on older kernels (where ACLs were applied via xattrs, so we still have coverage on those old kernels, just through a different mechanism.)
Tests factor in this optional nature of the hook, and will only run when we know the kernel supports it.
Relies on stackrox/stackrox#21357
Checklist
Automated testing
If any of these don't apply, please comment below.
Testing Performed
Tested locally, with setfacl, and the integration tests.
Summary by CodeRabbit