LOG-9359: Make ignore_older_secs configurable for audit inputs#3278
Conversation
|
@vparfonov: This pull request references LOG-9359 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.8.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/hold |
WalkthroughAdds an audit input tuning object with optional Audit Input Tuning
sequenceDiagram
participant User
participant API as "API types (Go)"
participant CRD as "CRD/CSV"
participant Gen as "Generator (Go)"
participant Template as "Vector TOML"
participant Collector as "Vector runtime"
User->>API: set spec.inputs[].audit.tuning.ignoreOlder
API->>CRD: CRD schema exposes `ignoreOlder`
User->>CRD: submit ClusterLogForwarder
Gen->>API: read Audit.Tuning.IgnoreOlder
Gen->>Template: generate TOML with ignore_older_secs
Collector->>Template: use generated config (ignore_older_secs applied)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Important Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional. ❌ Failed checks (1 error, 1 warning, 1 inconclusive)
✅ Passed checks (9 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
api/observability/v1/input_types.go (1)
276-279: ⚡ Quick winAdd the numeric CSV descriptor to
ignoreOlderSeconds.This integer field should include the
urn:alm:descriptor:com.tectonic.ui:numbermetadata so the OLM form renders it as a numeric input instead of generic text, consistent with other numeric audit settings.♻️ Proposed fix
- // +operator-sdk:csv:customresourcedefinitions:type=spec,displayName="Ignore Older Seconds" + // +operator-sdk:csv:customresourcedefinitions:type=spec,displayName="Ignore Older Seconds",xDescriptors={"urn:alm:descriptor:com.tectonic.ui:number"}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@api/observability/v1/input_types.go` around lines 276 - 279, The IgnoreOlderSeconds field needs the OLM numeric descriptor so the CSV form shows a numeric input; update the operator-sdk CSV comment above the IgnoreOlderSeconds declaration (the existing +operator-sdk:csv:customresourcedefinitions:type=spec,displayName="Ignore Older Seconds") to include x-descriptors="urn:alm:descriptor:com.tectonic.ui:number" (i.e. add ,x-descriptors="urn:alm:descriptor:com.tectonic.ui:number" to that comment) so the CRD/CSV metadata marks IgnoreOlderSeconds as a number.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@api/observability/v1/input_types.go`:
- Around line 276-279: The IgnoreOlderSeconds field needs the OLM numeric
descriptor so the CSV form shows a numeric input; update the operator-sdk CSV
comment above the IgnoreOlderSeconds declaration (the existing
+operator-sdk:csv:customresourcedefinitions:type=spec,displayName="Ignore Older
Seconds") to include x-descriptors="urn:alm:descriptor:com.tectonic.ui:number"
(i.e. add ,x-descriptors="urn:alm:descriptor:com.tectonic.ui:number" to that
comment) so the CRD/CSV metadata marks IgnoreOlderSeconds as a number.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 4f767cf4-7927-4852-8408-959b3cf72ced
⛔ Files ignored due to path filters (1)
api/observability/v1/zz_generated.deepcopy.gois excluded by!**/zz_generated*
📒 Files selected for processing (9)
api/observability/v1/input_types.gobundle/manifests/cluster-logging.clusterserviceversion.yamlbundle/manifests/observability.openshift.io_clusterlogforwarders.yamlconfig/crd/bases/observability.openshift.io_clusterlogforwarders.yamlconfig/manifests/bases/cluster-logging.clusterserviceversion.yamlinternal/generator/vector/input/audit.gointernal/generator/vector/input/audit_host_with_ignore_older.tomlinternal/generator/vector/input/audit_with_ignore_older.tomlinternal/generator/vector/input/source_test.go
| // +kubebuilder:validation:Optional | ||
| // +kubebuilder:validation:Minimum=1 | ||
| // +operator-sdk:csv:customresourcedefinitions:type=spec,displayName="Ignore Older Seconds" | ||
| IgnoreOlderSeconds *int64 `json:"ignoreOlderSeconds,omitempty"` |
There was a problem hiding this comment.
replace with a duration for Consistency with output tuning. Looks like there are library functions to address minimum https://kubernetes.io/docs/reference/using-api/cel/#kubernetes-quantity-library
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
docs/reference/operator/api_observability_v1.adoc (2)
1584-1588: ⚡ Quick winClarify the default value format and property name capitalization.
The description starts with "IgnoreOlder" (PascalCase) but the property name in the table is "ignoreOlder" (camelCase). Additionally, the default value "3600 (1 hour)" mixes a raw number with a time unit, which may confuse users about the expected input format.
Consider:
- Using "ignoreOlder" consistently (matching the property name)
- Clarifying the format: either "1h" (if Duration uses Go duration syntax) or "3600 seconds (1 hour)" if specifying seconds explicitly
- Adding a note about the expected Duration format (e.g., "Accepts Go duration format like '1h', '30m', '3600s'")
📝 Suggested documentation improvement
-|ignoreOlder|Duration| IgnoreOlder specifies the maximum duration since the last modification +|ignoreOlder|Duration| ignoreOlder specifies the maximum duration since the last modification of an audit log file before the collector ignores it. When the collector restarts, files that have not been modified within this time window may not be collected. Increase this value for audit sources with infrequent writes to prevent data loss. -The default value is 3600 (1 hour). +The default value is 1h (1 hour). +Accepts Go duration format (e.g., '30m', '1h', '24h').🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/reference/operator/api_observability_v1.adoc` around lines 1584 - 1588, Update the docs to use the camelCase property name "ignoreOlder" consistently (not "IgnoreOlder") and clarify the expected Duration format and default value: state the field accepts Go-style durations (e.g., "1h", "30m", "3600s") and set the default explicitly as "1h (equals 3600s)" or "3600s (1h)" so readers know both the canonical duration string and the equivalent seconds; reference the property name "ignoreOlder" and the Duration type in the description and add a short note like "Accepts Go duration format such as '1h', '30m', or '3600s'".
2645-2651: ⚡ Quick winConsider clarifying the relationship between credentials types and token field.
The documentation explains that
credentialscan be either a service account key file or an external_account configuration file, andtokenis only needed for external_account types. However, the mutual exclusivity or dependency between these fields could be more explicit.Consider adding a note like:
"When using service account authentication, onlycredentialsis required. When using Workload Identity Federation with an external_account credentials file, bothcredentialsandtokenmust be provided."📝 Suggested documentation clarification
|credentials|object| Credentials points to the secret containing the GCP credentials JSON file. For service account auth, this is a service_account key file. For Workload Identity Federation (WIF), this is an external_account configuration file. +When using service account authentication, only credentials is required. |token|object| Token specifies the source of the bearer token used as the subject token for GCP Workload Identity Federation token exchange. Only needed when the credentials -file is an external_account type. +file is an external_account type. Must be provided when using Workload Identity Federation.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/reference/operator/api_observability_v1.adoc` around lines 2645 - 2651, Update the docs to explicitly state the dependency between the credentials and token fields: clarify that when using service account auth only the credentials field (service_account key file) is required, and when using Workload Identity Federation the credentials field must be the external_account config AND the token field (subject token source) must also be provided; modify the descriptions for the credentials and token entries to include a short note like "Service account: credentials only" and "External_account (WIF): credentials + token required" so readers can immediately see the mutual relationship.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@api/observability/v1/input_types.go`:
- Around line 269-281: The IgnoreOlder field on AuditInputTuningSpec is typed as
time.Duration but semantically represents seconds; change it to an explicit
seconds type (either int64 Seconds or metav1.Duration) and update the CRD
validation and conversion sites accordingly: modify the IgnoreOlder declaration
on AuditInputTuningSpec, adjust the XValidation rule to match the new type
(e.g., ensure value >= 1), and update all places that consume/convert
IgnoreOlder (notably internal/generator/vector/input/audit.go and
internal/generator/vector/output/common/request.go) to handle the new units/type
explicitly (no implicit multiply by time.Second), and update JSON/schema
tags/CSV metadata as needed.
---
Nitpick comments:
In `@docs/reference/operator/api_observability_v1.adoc`:
- Around line 1584-1588: Update the docs to use the camelCase property name
"ignoreOlder" consistently (not "IgnoreOlder") and clarify the expected Duration
format and default value: state the field accepts Go-style durations (e.g.,
"1h", "30m", "3600s") and set the default explicitly as "1h (equals 3600s)" or
"3600s (1h)" so readers know both the canonical duration string and the
equivalent seconds; reference the property name "ignoreOlder" and the Duration
type in the description and add a short note like "Accepts Go duration format
such as '1h', '30m', or '3600s'".
- Around line 2645-2651: Update the docs to explicitly state the dependency
between the credentials and token fields: clarify that when using service
account auth only the credentials field (service_account key file) is required,
and when using Workload Identity Federation the credentials field must be the
external_account config AND the token field (subject token source) must also be
provided; modify the descriptions for the credentials and token entries to
include a short note like "Service account: credentials only" and
"External_account (WIF): credentials + token required" so readers can
immediately see the mutual relationship.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: a3646404-6f8c-435a-985c-8905510e9582
⛔ Files ignored due to path filters (1)
api/observability/v1/zz_generated.deepcopy.gois excluded by!**/zz_generated*
📒 Files selected for processing (10)
api/observability/v1/input_types.gobundle/manifests/cluster-logging.clusterserviceversion.yamlbundle/manifests/observability.openshift.io_clusterlogforwarders.yamlconfig/crd/bases/observability.openshift.io_clusterlogforwarders.yamlconfig/manifests/bases/cluster-logging.clusterserviceversion.yamldocs/reference/operator/api_observability_v1.adocinternal/generator/vector/input/audit.gointernal/generator/vector/input/audit_host_with_ignore_older.tomlinternal/generator/vector/input/audit_with_ignore_older.tomlinternal/generator/vector/input/source_test.go
🚧 Files skipped from review as they are similar to previous changes (3)
- config/manifests/bases/cluster-logging.clusterserviceversion.yaml
- internal/generator/vector/input/source_test.go
- internal/generator/vector/input/audit_with_ignore_older.toml
Add AuditInputTuningSpec with IgnoreOlder field (metav1.Duration) to the ClusterLogForwarder API, allowing users to configure how long audit log files can be idle before the collector skips them on restart. Uses metav1.Duration for human-readable values (e.g. "168h") and CEL validation to enforce minimum of 1 second. Default remains 3600s (1 hour) for backward compatibility. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Suggest making it a generic input parameter, could apply to all inputs. Make sense to increase the limit. Beware the tradeoff: this will increase the amount of old log data read on start-up, vector will read more old rotation files than it otherwise would have. Set the limit high enough for the problems you know about, but not too much higher. E.g. if the worse case is 1d maybe 2d is high enough - just an example, you have the data. |
The proposed change is consistent with how we defined tuning for Outputs where they are type specific and application input tuning. It may make sense to add this parameter in future to other inputs but I recommend waiting until we see an actual need given this is the first report for audit since the 6.x API has been in existence |
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jcantrill, vparfonov The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/hold cancel |
|
@vparfonov: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Description
The default
ignore_older_secsfor audit file sources is hardcoded to3600seconds (1 hour). This is too short for host audit logs (/var/log/audit/audit.log) and other audit sources that commonly have gaps exceeding one hour in production environments. When the collector restarts (e.g., during a ClusterLogForwarder update) and a file hasn't been modified within that window, audit logs are either silently lost or duplicated (due to upstream vector#17208).This PR adds an
AuditInputTuningSpecwith an ignoreOlder field (*time.Duration, consistent with output tuning specs) to the ClusterLogForwarder API, allowing users to configure this threshold per audit input. CEL validation enforces a minimum of 1 second. The default remains 3600s for backward compatibility.Example usage:
Changes:
/cc
/assign
Links
Summary by CodeRabbit
New Features
Documentation