Skill
azure-naming-research (source: .github/skills/azure-naming-research/SKILL.md)

Scope
Author the eval suite at .github/evals/azure-naming-research/:
- eval.yaml: suite config (executor, model, graders)
- tasks/positive-*.yaml
- tasks/negative-*.yaml (off-topic / out-of-scope prompts)
- .github/evals/manifest.yaml entry at tier: expanded

Procedure
- /skill-bench azure-naming-research drafts the suite from the live SKILL.md.
- waza run .github/evals/azure-naming-research/eval.yaml -v locally; confirm that all tasks resolve and produce a score.
- /skill-improve azure-naming-research to iterate on graders.
- Open PR.
- Mock CI runs automatically. A maintainer will dispatch a real-model run before merge.

Acceptance
- ... mock executor.
- manifest.yaml entry added; PR description includes the real-model run summary.

Conventions to follow
- Persona lock: refusal graders should accept the agent's own scope language, not require a specific phrase.
- Don't add required_skills to a skill_invocation grader unless the skill genuinely invokes those sub-skills.
- Prompt graders need continue_session: true in their grader config.
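
The grader conventions above might look roughly like this in eval.yaml. This is only a sketch: the conventions name required_skills, skill_invocation, continue_session, and prompt graders, but the type/name/config layout and both grader names are assumptions, so check them against the suite that /skill-bench drafts.

```yaml
# Hypothetical excerpt from eval.yaml. The type/name/config layout is an
# assumption; only continue_session and required_skills come from the
# conventions in this issue.
graders:
  - type: prompt
    name: refusal_uses_scope_language    # hypothetical name
    config:
      continue_session: true             # convention: needed on prompt graders
  - type: skill_invocation
    name: invokes_azure_naming_research  # hypothetical name
    # No required_skills key here: add one only if the skill genuinely
    # invokes sub-skills.
```

Leaving required_skills out by default keeps the skill_invocation grader from failing on sub-skills that the skill never calls.
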
Related