Skip to content

feat: add CEL metrics tracking for expression evaluations#934

Closed
govindup63 wants to merge 4 commits intokubernetes-sigs:mainfrom
govindup63:feat/issue-190
Closed

feat: add CEL metrics tracking for expression evaluations#934
govindup63 wants to merge 4 commits intokubernetes-sigs:mainfrom
govindup63:feat/issue-190

Conversation

@govindup63
Copy link
Copy Markdown
Contributor

This PR implements CEL expression evaluation cost tracking and exposes metrics in the ResourceGroup instance status, addressing issue #190.

The implementation tracks the cost of all CEL expression evaluations during reconciliation and provides both total cost and per-resource cost breakdown in the instance status. This enables platform teams to monitor and optimize CEL expression performance.

Changes

Core Implementation

  • Added CEL metrics tracking in pkg/runtime/runtime.go:

    • CELMetrics struct to track total cost and per-resource costs
    • RecordCELCost() method to accumulate costs during expression evaluation
    • GetCELMetrics() method that returns a deep copy to protect internal state
    • ResetCELMetrics() method for resetting metrics (interface compliance)
    • Cost tracking integrated into all CEL evaluation points:
      • Static variable evaluation (evaluateStaticVariables)
      • Dynamic variable evaluation (evaluateDynamicVariables)
      • includeWhen expressions (ReadyToProcessResource)
      • readyWhen expressions (IsResourceReady)
  • Status exposure in pkg/controller/instance/controller_status.go:

    • Added celMetrics to instance status with totalCost and costPerResource
    • Added nil checks for safety when runtime creation fails
  • CRD schema updates in pkg/graph/crd/:

    • Added celMetrics as a default status field in CRD schema
    • Defined proper schema structure for totalCost (int64) and costPerResource (map[string]int64)
    • Added additionalProperties: true to status schema to support controller-injected fields

Testing

  • Added comprehensive unit tests in pkg/runtime/runtime_test.go:
    • TestCELMetrics - tests cost accumulation, retrieval, and reset
    • TestEvaluateExpressionCostTracking - verifies cost tracking is enabled
    • TestRuntimeCELCostIntegration - integration test for static variable cost tracking
    • TestIsResourceReadyCostTracking - tests readyWhen expression cost tracking
    • TestReadyToProcessResourceCostTracking - tests includeWhen expression cost tracking
    • TestCELCostMultipleResources - tests cost tracking across multiple resources
    • TestCELCostReadyWhen - focused test for readyWhen expressions

Example Output

After reconciliation, the instance status includes CEL metrics:

apiVersion: kro.run/v1alpha1
kind: WebApp
metadata:
  name: my-application
spec:
  # ... spec fields
status:
  celMetrics:
    totalCost: 76
    costPerResource:
      schema: 18        # Static variable evaluations
      deployment: 14    # Deployment resource expressions
      service: 22       # Service resource expressions
      ingress: 16       # Ingress includeWhen evaluation
      instance: 6       # Instance variable evaluations
  # ... other status fields

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: govindup63
Once this PR has been reviewed and has the lgtm label, please assign barney-s for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 3, 2026
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jan 3, 2026
@k8s-ci-robot
Copy link
Copy Markdown
Contributor

Hi @govindup63. Thanks for your PR.

I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jan 3, 2026
- Introduced CELMetrics struct to track total cost and cost per resource.
- Implemented methods to record, reset, and retrieve CEL metrics in ResourceGraphDefinitionRuntime.
- Updated expression evaluation methods to track costs during evaluations.
- Enhanced instance status preparation to include CEL metrics in the status map.
- Added comprehensive tests for CEL metrics functionality and cost tracking in various scenarios
- Updated instance reconciliation to clarify CEL metrics initialization.
- Improved status preparation to handle nil runtime scenarios and ensure CEL metrics are included only when available.
- Added default CEL metrics field to CRD schema for consistent tracking.
- Enhanced tests for CEL cost tracking across multiple resources and readiness evaluations.
- Introduced logging for CEL metrics tracking to aid in debugging.
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 3, 2026
@bschaatsbergen
Copy link
Copy Markdown
Member

bschaatsbergen commented Jan 4, 2026

Hey @govindup63, thanks for working on this! It would be good to align with @TusharMohapatra07 in #931, since they're doing similar work.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 12, 2026
@k8s-ci-robot
Copy link
Copy Markdown
Contributor

PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@TusharMohapatra07
Copy link
Copy Markdown

Hey @govindup63, thanks for working on this! It would be good to align with @TusharMohapatra07 in #931, since they're doing similar work.

Hey @bschaatsbergen! Hope you're doing great :)
could you please elaborate on the next steps ?

@govindup63
Copy link
Copy Markdown
Contributor Author

Hi @a-hilaly , just a gentle ping on this PR when you get a chance.
Would really appreciate your review. Thanks!

Copy link
Copy Markdown
Member

@a-hilaly a-hilaly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for working on this.

I’m going to close this PR for now. Exposing CEL evaluation cost in instance status adds a new public status surface, and I don’t think we have the design fully settled yet for what exactly should be tracked, how stable that contract should be, and whether this belongs in status vs other observability paths.

/close

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

@a-hilaly: Closed this PR.

Details

In response to this:

thanks for working on this.

I’m going to close this PR for now. Exposing CEL evaluation cost in instance status adds a new public status surface, and I don’t think we have the design fully settled yet for what exactly should be tracked, how stable that contract should be, and whether this belongs in status vs other observability paths.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants