feat(rules): refactor code to rename evaluators to rules#245

Draft
namrataghadi-galileo wants to merge 7 commits into main from feature/67290-replace-evaluators-to-rules

@namrataghadi-galileo

Contributor

Summary

  • What changed and why.

Scope

  • User-facing/API changes:
  • Internal changes:
  • Out of scope:

Risk and Rollout

  • Risk level: low / medium / high
  • Rollback plan:

Testing

  • Added or updated automated tests
  • Ran make check (or explained why not)
  • Manually verified behavior

Checklist

  • Linked issue/spec (if applicable)
  • Updated docs/examples for user-facing changes
  • Included any required follow-up tasks

@namrataghadi-galileo namrataghadi-galileo changed the title feat(evaluators): refactor code to rename evaluators to rules feat(rules): refactor code to rename evaluators to rules Jun 22, 2026
@codecov

codecov Bot commented Jun 22, 2026


@lan17 lan17 left a comment

Contributor

Why are we doing this?

Evaluator is a good name, no?

@namrataghadi-galileo

Contributor Author

@lan17 This is to comply with Cisco/Splunk's naming, since ACE is now offered as a product within Splunk Cloud Observability. The keyword "evaluators" is reserved for "metrics". Here's the proposal we are going ahead with after our internal discussions. Credit to @abhinav-galileo for coining this proposal.

{
  "scope": {
    "step_types": ["llm"],
    "stages": ["post"]
  },
  "condition": {
    "selector": { "path": "output" },
    "evaluator": {
      "name": "luna.toxicity",
      "config": {
        "operator": "gte",
        "threshold": 0.7
      }
    }
  },
  "action": { "decision": "deny" }
}

This PR is going to change: we will rename evaluators to checks, not rules.
So the terms are:

Evaluator = reusable compute/check primitive, e.g. Toxicity, Context Relevance, Regex, JSON (this complies with Splunk Observability nomenclature)
Provider = Luna, Azure Content Safety, Guardrails AI, Agent Control built-ins, etc.
Condition/check = selected input + evaluator + config/operator/threshold -> matched/not matched
Control = scope/stage + condition/check + action

In Agent Control, controls use evaluators through conditions/checks. The action is applied when the condition matches.
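
A rough sketch of how these four terms could nest, using plain dataclasses. The class and field names here (`Evaluator`, `Check`, `Control`, `selector_path`, and so on) are hypothetical and only mirror the vocabulary above; they are not the actual Agent Control models.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Evaluator:
    """Reusable compute primitive (e.g. toxicity) offered by a provider."""
    name: str      # e.g. "toxicity"
    provider: str  # e.g. "luna"


@dataclass(frozen=True)
class Check:
    """Selected input + evaluator + operator/threshold -> matched / not matched."""
    selector_path: str
    evaluator: Evaluator
    operator: str
    threshold: float


@dataclass(frozen=True)
class Control:
    """Scope/stage + check + action."""
    step_types: List[str]
    stages: List[str]
    check: Check
    decision: str


# The toxicity example from the proposal, expressed with these illustrative types.
toxicity_control = Control(
    step_types=["llm"],
    stages=["post"],
    check=Check(
        selector_path="output",
        evaluator=Evaluator(name="toxicity", provider="luna"),
        operator="gte",
        threshold=0.7,
    ),
    decision="deny",
)
```

The point of the layering is that `Evaluator` carries no threshold or action; those live on `Check` and `Control` respectively.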

@lan17

lan17 commented Jun 28, 2026

Contributor

That's the same as the current design, no?

@namrataghadi-galileo

Contributor Author

@lan17 There is a subtle difference from how it works today:

ControlDefinition = scope/stage + condition + action
condition = selector + evaluator_spec, or a boolean tree of conditions
evaluator_spec = evaluator + config

Example today:
{
  "scope": {
    "step_types": ["llm"],
    "stages": ["post"]
  },
  "condition": {
    "selector": { "path": "output" },
    "evaluator": {
      "name": "galileo.luna",
      "config": {
        "scorer_label": "toxicity",
        "operator": "gte",
        "threshold": 0.7
      }
    }
  },
  "action": { "decision": "deny" }
}

Here, "galileo.luna" is the evaluator, while "toxicity" is hidden inside its config. The proposal is to surface metrics like toxicity as evaluators in their own right, as below:

"evaluator": {
  "name": "luna.toxicity",
  "config": {
    "operator": "gte",
    "threshold": 0.7
  }
}

I would even go further and change this to:

"evaluator": {
  "name": "toxicity",
  "provider": "luna",
  "config": {
    "operator": "gte",
    "threshold": 0.7
  }
}
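
To make the difference concrete, here is a minimal sketch of a rewrite from today's evaluator spec to the two proposed shapes. The helper names `to_name_and_provider` and `to_dotted_name` are made up for illustration, and the rule "provider is the last segment of the old name" is an assumption inferred only from the "galileo.luna" to "luna" example above.

```python
from typing import Any, Dict


def to_name_and_provider(spec: Dict[str, Any]) -> Dict[str, Any]:
    """Lift scorer_label out of config into a separate name + provider pair."""
    config = dict(spec["config"])           # copy so the input is not mutated
    metric = config.pop("scorer_label")     # "toxicity" stops being hidden in config
    provider = spec["name"].split(".")[-1]  # assumption: "galileo.luna" -> "luna"
    return {"name": metric, "provider": provider, "config": config}


def to_dotted_name(spec: Dict[str, Any]) -> Dict[str, Any]:
    """Same rewrite, but with the provider folded into a dotted evaluator name."""
    split = to_name_and_provider(spec)
    return {
        "name": f'{split["provider"]}.{split["name"]}',
        "config": split["config"],
    }


# Today's evaluator spec, copied from the example above.
today = {
    "name": "galileo.luna",
    "config": {"scorer_label": "toxicity", "operator": "gte", "threshold": 0.7},
}

dotted = to_dotted_name(today)
separate = to_name_and_provider(today)
```

With the example input, `dotted["name"]` is "luna.toxicity" and `separate` carries "toxicity" and "luna" as separate fields; in both cases the operator and threshold stay in `config`.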

@lan17

lan17 commented Jun 28, 2026

Contributor

I'm still confused since evaluators currently are regex, Luna, etc, no?

@namrataghadi-galileo

Contributor Author

“Evaluator” is a good name, but Splunk Observability already uses it for a reusable metric that returns a score, boolean, or findings. In Agent Control, today’s evaluator also applies thresholds or matching logic and decides whether a control triggers, so it is closer to a check.
For example: Luna evaluates toxicity and returns a score; “toxicity ≥ 0.7” is the check. Similarly, regex evaluates whether a pattern exists, while the check decides to trigger on a match; JSON/SQL evaluators return validation findings, while checks decide which findings trigger.
The proposed terminology is: Provider implements Evaluators; Checks turn evaluator outputs into matched/not matched; Conditions combine checks; and Controls add scope and action. This keeps Agent Control consistent with Splunk Observability and avoids “evaluators inside evaluators.”

{
  "scope": {
    "step_types": ["llm"],
    "stages": ["post"]
  },
  "condition": {
    "check": {
      "selector": {"path": "output"},
      "evaluator": {
        "provider": "galileo.luna",
        "name": "toxicity",
        "config": {
          "scorer_id": "..."
        }
      },
      "match": {
        "operator": "gte",
        "value": 0.7
      }
    }
  },
  "action": {
    "decision": "deny"
  }
}
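
A small sketch of how this shape might be interpreted: the evaluator only produces a score, and the `match` block turns that score into matched/not matched. Everything here is an assumption for illustration: the `registry` lookup keyed by (provider, name), the `fake_toxicity` scorer, and every operator other than "gte" (the only one that appears in this thread). Scope filtering on `step_types`/`stages` is left out.

```python
import operator
from typing import Any, Callable, Dict, Optional, Tuple

# Only "gte" appears in the proposal; the other operators are assumed.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "gte": operator.ge,
    "gt": operator.gt,
    "lte": operator.le,
    "lt": operator.lt,
}

EvaluatorFn = Callable[[Any, Dict[str, Any]], float]
Registry = Dict[Tuple[str, str], EvaluatorFn]


def check_matches(check: Dict[str, Any], step: Dict[str, Any], registry: Registry) -> bool:
    """Run the evaluator on the selected input, then apply the match rule."""
    selected = step[check["selector"]["path"]]
    spec = check["evaluator"]
    evaluate = registry[(spec["provider"], spec["name"])]
    score = evaluate(selected, spec.get("config", {}))  # evaluator output: a score
    compare = OPERATORS[check["match"]["operator"]]
    return compare(score, check["match"]["value"])      # check output: matched or not


def apply_control(control: Dict[str, Any], step: Dict[str, Any], registry: Registry) -> Optional[str]:
    """Return the control's decision when its check matches, else None."""
    if check_matches(control["condition"]["check"], step, registry):
        return control["action"]["decision"]
    return None


def fake_toxicity(text: str, config: Dict[str, Any]) -> float:
    """Stand-in scorer; a real provider call would go here."""
    return 0.9 if "idiot" in text.lower() else 0.1


registry: Registry = {("galileo.luna", "toxicity"): fake_toxicity}
control = {
    "condition": {
        "check": {
            "selector": {"path": "output"},
            "evaluator": {"provider": "galileo.luna", "name": "toxicity", "config": {}},
            "match": {"operator": "gte", "value": 0.7},
        }
    },
    "action": {"decision": "deny"},
}

toxic_result = apply_control(control, {"output": "You idiot."}, registry)
clean_result = apply_control(control, {"output": "Thanks for asking."}, registry)
```

Under these assumptions, the toxic output yields "deny" and the clean output yields `None`, without the evaluator itself knowing anything about thresholds.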

@abhinav-galileo

Collaborator

@namrataghadi-galileo - I am not sure about adding another level of nesting with "check".

@namrataghadi-galileo namrataghadi-galileo marked this pull request as draft June 29, 2026 19:21

3 participants