Skill
azure-policy-advisor — source: .github/skills/azure-policy-advisor/SKILL.md
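Scope
Author the eval suite at .github/evals/azure-policy-advisor/:
- eval.yaml: suite config (executor, model, graders)
- tasks/positive-*.yaml
- tasks/negative-*.yaml
- .github/evals/manifest.yaml at tier: expanded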
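A quick pre-PR check of the suite layout could look like the sketch below. It uses the file patterns this issue names (eval.yaml, tasks/positive-*.yaml, tasks/negative-*.yaml). The helper name `missing_suite_files` and the "at least one match per pattern" rule are assumptions made for illustration, not part of waza or the repo tooling, and the manifest.yaml entry is not checked here.

```python
from pathlib import Path

# Patterns taken from this issue's scope; requiring at least one match for
# each pattern is an assumption, not a documented repo rule.
REQUIRED_PATTERNS = ("eval.yaml", "tasks/positive-*.yaml", "tasks/negative-*.yaml")


def missing_suite_files(suite_dir):
    """Return the patterns in REQUIRED_PATTERNS that match no file under suite_dir."""
    root = Path(suite_dir)
    return [pattern for pattern in REQUIRED_PATTERNS if not any(root.glob(pattern))]


if __name__ == "__main__":
    gaps = missing_suite_files(".github/evals/azure-policy-advisor")
    print("layout ok" if not gaps else "missing: " + ", ".join(gaps))
```

Run from the repository root, it prints either "layout ok" or the patterns that still have no matching file.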
Procedure
- /skill-bench azure-policy-advisor drafts the suite from the live SKILL.md.
- waza run .github/evals/azure-policy-advisor/eval.yaml -v locally.
- /skill-improve azure-policy-advisor to iterate on graders.
- Open PR.
- Mock CI runs automatically. A maintainer will dispatch a real-model run before merge.
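For repeated local runs, the waza step above can be wrapped in a small script. This is only a sketch: the command line is the one quoted in the procedure, while the helper names `waza_command` and `run_local` and the exit code 127 for a missing binary are illustrative choices, not something waza defines.

```python
import shutil
import subprocess
import sys

SKILL = "azure-policy-advisor"


def waza_command(skill=SKILL):
    """Build the argv for the local run quoted in the procedure."""
    return ["waza", "run", f".github/evals/{skill}/eval.yaml", "-v"]


def run_local(skill=SKILL):
    """Run the suite locally and return waza's exit code.

    Returns 127 (a conventional "command not found" code, chosen here as an
    assumption) when the waza binary is not on PATH.
    """
    if shutil.which("waza") is None:
        print("waza not found on PATH", file=sys.stderr)
        return 127
    return subprocess.run(waza_command(skill), check=False).returncode
```

Calling `run_local()` from the repository root runs the same command as the manual step and hands back its exit code, which makes it easy to chain into a pre-push hook.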
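Acceptance
- … mock executor.
- manifest.yaml entry added; PR description includes the real-model run summary.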
Conventions to follow
- Persona lock: refusal graders should accept the agent's own scope language.
- Don't add required_skills to a skill_invocation grader unless the skill genuinely invokes those sub-skills.
- Prompt graders need continue_session: true in their grader config.
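The two grader-config conventions above are mechanical enough to lint. The sketch below assumes each grader has already been loaded into a dict shaped like {"name": ..., "type": ..., "config": {...}}, and that prompt graders carry the type "prompt". Both the shape and that type string are hypothetical stand-ins, since this issue does not spell out the eval.yaml schema; adjust the keys to whatever the real file uses.

```python
def convention_problems(graders):
    """Return a list of convention warnings for grader dicts.

    Assumed (hypothetical) shape per grader:
        {"name": str, "type": str, "config": dict}
    """
    problems = []
    for grader in graders:
        name = grader.get("name", "<unnamed>")
        config = grader.get("config") or {}
        kind = grader.get("type")

        # Convention: prompt graders need continue_session: true.
        if kind == "prompt" and config.get("continue_session") is not True:
            problems.append(f"{name}: prompt grader needs continue_session: true")

        # Convention: required_skills only when the skill really invokes them.
        # A script cannot verify that, so it only flags the field for review.
        if kind == "skill_invocation" and config.get("required_skills"):
            problems.append(
                f"{name}: confirm the skill genuinely invokes {config['required_skills']}"
            )
    return problems
```

The persona-lock convention is a judgment call about refusal wording, so it is left to review and not checked here.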
Related