Improve/skill review optimization #9

Open
popey wants to merge 2 commits into zjunlp:main from popey:improve/skill-review-optimization

popey commented Mar 16, 2026

Hullo @zjunlp 👋

I ran your skills through tessl skill review at work and found some targeted improvements. Here are the ten most improved:

![Score Card](score_card.png)

Here's the full before/after:

| Skill | Before | After | Change |
|-------|--------|-------|--------|
| alfworld-tool-user | 17% | 85% | +68% |
| scienceworld-inventory-focus | 25% | 85% | +60% |
| scienceworld-task-focuser | 17% | 75% | +58% |
| alfworld-inventory-management | 35% | 85% | +50% |
| alfworld-object-storer | 35% | 85% | +50% |
| alfworld-task-verifier | 35% | 85% | +50% |
| scienceworld-measurement-taker | 35% | 85% | +50% |
| scienceworld-threshold-evaluator | 35% | 85% | +50% |
| webshop-attribute-verifier | 50% | 100% | +50% |
| scienceworld-object-selector | 45% | 85% | +40% |
| scienceworld-tool-validator | 35% | 75% | +40% |
| alfworld-object-locator | 52% | 85% | +33% |
| alfworld-clean-object | 67% | 100% | +33% |
| alfworld-device-operator | 67% | 100% | +33% |
| alfworld-object-state-inspector | 67% | 100% | +33% |
| scienceworld-ambiguous-action-resolution | 52% | 85% | +33% |
| scienceworld-instruction-reader | 67% | 100% | +33% |
| scienceworld-living-entity-identifier | 68% | 100% | +32% |
| scienceworld-animal-identifier | 57% | 85% | +28% |
| alfworld-heat-object-with-appliance | 75% | 100% | +25% |
| alfworld-appliance-preparer | 75% | 100% | +25% |
| alfworld-object-cooler | 75% | 100% | +25% |
| alfworld-receptacle-closer | 75% | 100% | +25% |
| alfworld-receptacle-navigator | 75% | 100% | +25% |
| alfworld-temperature-regulator | 75% | 100% | +25% |
| scienceworld-tool-user | 67% | 92% | +25% |
| scienceworld-growth-focuser | 70% | 90% | +20% |
| alfworld-appliance-navigator | 67% | 85% | +18% |
| alfworld-locate-target-object | 67% | 85% | +18% |
| alfworld-object-heater | 67% | 85% | +18% |
| alfworld-object-state-modifier | 67% | 85% | +18% |
| alfworld-object-transporter | 67% | 85% | +18% |
| alfworld-receptacle-finder | 67% | 85% | +18% |
| alfworld-search-pattern-executor | 68% | 85% | +17% |
| alfworld-search-verifier | 67% | 85% | +18% |
| alfworld-tool-locator | 67% | 85% | +18% |
| alfworld-environment-scanner | 68% | 85% | +17% |
| scienceworld-conditional-placer | 67% | 85% | +18% |
| scienceworld-container-inspector | 67% | 85% | +18% |
| scienceworld-container-transfer | 67% | 85% | +18% |
| scienceworld-inventory-manager | 67% | 85% | +18% |
| scienceworld-item-fetcher | 67% | 85% | +18% |
| scienceworld-object-locator | 67% | 85% | +18% |
| scienceworld-object-placer | 67% | 85% | +18% |
| scienceworld-object-retriever | 67% | 85% | +18% |
| scienceworld-room-explorer | 67% | 85% | +18% |
| scienceworld-substance-fetcher | 67% | 85% | +18% |
| webshop-product-detail-check | 67% | 85% | +18% |
| webshop-product-evaluator | 67% | 85% | +18% |
| webshop-product-search | 67% | 85% | +18% |
| webshop-product-selector | 67% | 85% | +18% |
| webshop-purchase-initiator | 67% | 85% | +18% |
| webshop-query-interpreter | 67% | 85% | +18% |
| webshop-result-filter | 67% | 85% | +18% |
| webshop-search-executor | 67% | 85% | +18% |
| scienceworld-substance-preparator | 70% | 85% | +15% |
| scienceworld-circuit-connector | 75% | 85% | +10% |
| scienceworld-recipe-retriever | 75% | 85% | +10% |
| scienceworld-room-scanner | 75% | 85% | +10% |
| scienceworld-substance-cooler | 75% | 85% | +10% |
| scienceworld-task-parser | 75% | 85% | +10% |
| alfworld-storage-explorer | 75% | 85% | +10% |
| webshop-purchase-executor | 75% | 85% | +10% |
| webshop-initial-search | 77% | 85% | +8% |
| scienceworld-device-activator | 67% | 75% | +8% |
| scienceworld-liquid-pourer | 67% | 75% | +8% |
| scienceworld-mixture-creator | 75% | 100% | +25% |
| scienceworld-controlled-waiting | 75% | 85% | +10% |

81 skills improved across all three domains (ALFWorld, ScienceWorld, WebShop). Average score increased from 70% to 85%.

Changes made

The improvements focused on three main areas:

  1. Added "Use when..." trigger clauses to frontmatter descriptions — this is the primary signal tessl skill review uses to evaluate when a skill should activate. Many descriptions explained what the skill does but not when to use it.

  2. Added concrete examples with input/output scenarios where missing — skills with worked examples score significantly higher on actionability and workflow clarity.

  3. Structured workflows — reorganised prose-heavy instructions into numbered step sequences with clear decision points, making the skills more actionable for agents.

All domain-specific terminology (ALFWorld actions like `go to`, `take`, `put`; ScienceWorld actions like `teleport to`, `focus on`, `look around`; WebShop actions like `click[]`, `search[]`) was preserved throughout.
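To make the first change concrete, here is a minimal sketch of what a "Use when..." trigger clause looks like and how its presence could be checked locally. It assumes a skill is a Markdown file with a `---`-delimited frontmatter block containing a `description:` field. The description text, the regular expression, and the helper names `frontmatter_description` and `has_trigger_clause` are all hypothetical, and this is not how `tessl skill review` computes its scores.

```python
import re

# Hypothetical skill file. The description text is invented for illustration;
# the three commands are the ScienceWorld examples quoted in this PR.
SKILL_MD = """---
name: scienceworld-measurement-taker
description: Measures a property of an object with a tool. Use when the task asks for a temperature or other reading.
---
# Steps
1. teleport to kitchen
2. pick up thermometer
3. use thermometer on metal fork
"""

# Assumed trigger phrasing: "Use when ..." or "Use this skill when ...".
TRIGGER = re.compile(r"\buse (this skill )?when\b", re.IGNORECASE)


def frontmatter_description(text):
    """Return the `description:` value from a leading '---' block, or None."""
    match = re.match(r"---\n(.*?)\n---", text, re.DOTALL)
    if not match:
        return None
    for line in match.group(1).splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "description":
            return value.strip()
    return None


def has_trigger_clause(text):
    """True if the frontmatter description contains a trigger clause."""
    description = frontmatter_description(text)
    return bool(description and TRIGGER.search(description))


print(has_trigger_clause(SKILL_MD))
```

A description that only says what the skill does, such as `description: Measures a property of an object.`, would fail this check, which is the gap the first change addresses.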

Honest disclosure — I work at @tesslio where we build tooling around skills like these. Not a pitch - just saw room for improvement and wanted to contribute.

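Want to self-improve your skills? Just point your agent (Claude Code, Codex, etc.) at [this Tessl guide](https://docs.tessl.io/evaluate/optimize-a-skill-using-best-practices) and ask it to optimize your skill. Ping me - [@popey](https://github.com/popey) - if you hit any snags.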

Thanks in advance 🙏

popey and others added 2 commits March 16, 2026 16:20
Hullo @zjunlp 👋

I ran your ScienceWorld skills through `tessl skill review` at work and found some targeted improvements. Here's the full before/after:

| Skill | Before | After | Change |
|-------|--------|-------|--------|
| scienceworld-tool-user | 67% | 96% | +29% |
| scienceworld-conditional-placer | 67% | 93% | +26% |
| scienceworld-container-transfer | 67% | 93% | +26% |
| scienceworld-instruction-reader | 67% | 93% | +26% |
| scienceworld-inventory-manager | 67% | 93% | +26% |
| scienceworld-object-placer | 67% | 93% | +26% |
| scienceworld-object-retriever | 67% | 93% | +26% |
| scienceworld-living-entity-identifier | 68% | 94% | +26% |
| scienceworld-device-activator | 67% | 88% | +21% |
| scienceworld-liquid-pourer | 67% | 88% | +21% |
| scienceworld-container-inspector | 67% | 85% | +18% |
| scienceworld-item-fetcher | 67% | 85% | +18% |
| scienceworld-room-explorer | 67% | 85% | +18% |
| scienceworld-substance-fetcher | 67% | 85% | +18% |
| scienceworld-object-locator | 67% | 78% | +11% |

Average improvement: +22 percentage points across 15 skills.
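The average can be rechecked from the table with a few lines of Python. This is only a sanity check on the arithmetic: the scores are copied from the rows above, and the function name `average_improvement` is made up for the sketch.

```python
# (before, after) scores in percent, copied from the table above.
scores = {
    "scienceworld-tool-user": (67, 96),
    "scienceworld-conditional-placer": (67, 93),
    "scienceworld-container-transfer": (67, 93),
    "scienceworld-instruction-reader": (67, 93),
    "scienceworld-inventory-manager": (67, 93),
    "scienceworld-object-placer": (67, 93),
    "scienceworld-object-retriever": (67, 93),
    "scienceworld-living-entity-identifier": (68, 94),
    "scienceworld-device-activator": (67, 88),
    "scienceworld-liquid-pourer": (67, 88),
    "scienceworld-container-inspector": (67, 85),
    "scienceworld-item-fetcher": (67, 85),
    "scienceworld-room-explorer": (67, 85),
    "scienceworld-substance-fetcher": (67, 85),
    "scienceworld-object-locator": (67, 78),
}


def average_improvement(scores):
    """Mean of (after - before), in percentage points."""
    deltas = [after - before for before, after in scores.values()]
    return sum(deltas) / len(deltas)


avg = average_improvement(scores)
print(f"{len(scores)} skills, average improvement: +{avg:.1f} points")
# prints "15 skills, average improvement: +22.4 points"
```

The per-skill gains sum to 336 points over 15 skills, so the mean is 22.4, which rounds to the +22 quoted above.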

Changes made:

1. "Use when..." trigger clauses in frontmatter descriptions
Every skill description now includes an explicit "Use this skill when..."
clause so agents can reliably select the right skill for the job. This
was the single biggest driver of score improvement.

2. Concrete examples added
Skills that lacked worked examples now have step-by-step scenarios with
exact ScienceWorld commands (e.g., teleport to kitchen, pick up
thermometer, use thermometer on metal fork). All ScienceWorld-specific
terminology is preserved.

3. Tightened structure and reduced verbosity
- Removed vague references to nonexistent "reference documentation"
- Replaced verbose prose with action tables for quick scanning
- Made verification steps explicit (not "optional")
- Consolidated redundant Purpose/When to Use sections
- Inlined key domain knowledge instead of referencing external scripts

Honest disclosure — I work at @tesslio where we build tooling around
skills like these. Not a pitch - just saw room for improvement and
wanted to contribute.

Want to self-improve your skills? Just point your agent (Claude Code,
Codex, etc.) at this Tessl guide and ask it to optimize your skill:
https://docs.tessl.io/evaluate/optimize-a-skill-using-best-practices
Ping me - @popey - if you hit any snags.

Thanks in advance 🙏

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Hullo @zjunlp 👋

I ran your skills through `tessl skill review` at work and found some targeted improvements. Here's the full before/after:

![Score Card](score_card.png)

| Skill | Before | After | Change |
|-------|--------|-------|--------|
| alfworld-tool-user | 17% | 85% | +68% |
| scienceworld-inventory-focus | 25% | 85% | +60% |
| scienceworld-task-focuser | 17% | 75% | +58% |
| alfworld-inventory-management | 35% | 85% | +50% |
| alfworld-object-storer | 35% | 85% | +50% |
| alfworld-task-verifier | 35% | 85% | +50% |
| scienceworld-measurement-taker | 35% | 85% | +50% |
| scienceworld-threshold-evaluator | 35% | 85% | +50% |
| webshop-attribute-verifier | 50% | 100% | +50% |
| scienceworld-object-selector | 45% | 85% | +40% |
| scienceworld-tool-validator | 35% | 75% | +40% |
| alfworld-object-locator | 52% | 85% | +33% |
| alfworld-clean-object | 67% | 100% | +33% |
| alfworld-device-operator | 67% | 100% | +33% |
| alfworld-object-state-inspector | 67% | 100% | +33% |
| scienceworld-ambiguous-action-resolution | 52% | 85% | +33% |
| scienceworld-instruction-reader | 67% | 100% | +33% |
| scienceworld-living-entity-identifier | 68% | 100% | +32% |
| scienceworld-animal-identifier | 57% | 85% | +28% |
| alfworld-heat-object-with-appliance | 75% | 100% | +25% |
| alfworld-appliance-preparer | 75% | 100% | +25% |
| alfworld-object-cooler | 75% | 100% | +25% |
| alfworld-receptacle-closer | 75% | 100% | +25% |
| alfworld-receptacle-navigator | 75% | 100% | +25% |
| alfworld-temperature-regulator | 75% | 100% | +25% |
| scienceworld-tool-user | 67% | 92% | +25% |
| scienceworld-growth-focuser | 70% | 90% | +20% |
| alfworld-appliance-navigator | 67% | 85% | +18% |
| alfworld-locate-target-object | 67% | 85% | +18% |
| alfworld-object-heater | 67% | 85% | +18% |
| alfworld-object-state-modifier | 67% | 85% | +18% |
| alfworld-object-transporter | 67% | 85% | +18% |
| alfworld-receptacle-finder | 67% | 85% | +18% |
| alfworld-search-pattern-executor | 68% | 85% | +17% |
| alfworld-search-verifier | 67% | 85% | +18% |
| alfworld-tool-locator | 67% | 85% | +18% |
| alfworld-environment-scanner | 68% | 85% | +17% |
| scienceworld-conditional-placer | 67% | 85% | +18% |
| scienceworld-container-inspector | 67% | 85% | +18% |
| scienceworld-container-transfer | 67% | 85% | +18% |
| scienceworld-inventory-manager | 67% | 85% | +18% |
| scienceworld-item-fetcher | 67% | 85% | +18% |
| scienceworld-object-locator | 67% | 85% | +18% |
| scienceworld-object-placer | 67% | 85% | +18% |
| scienceworld-object-retriever | 67% | 85% | +18% |
| scienceworld-room-explorer | 67% | 85% | +18% |
| scienceworld-substance-fetcher | 67% | 85% | +18% |
| webshop-product-detail-check | 67% | 85% | +18% |
| webshop-product-evaluator | 67% | 85% | +18% |
| webshop-product-search | 67% | 85% | +18% |
| webshop-product-selector | 67% | 85% | +18% |
| webshop-purchase-initiator | 67% | 85% | +18% |
| webshop-query-interpreter | 67% | 85% | +18% |
| webshop-result-filter | 67% | 85% | +18% |
| webshop-search-executor | 67% | 85% | +18% |
| scienceworld-substance-preparator | 70% | 85% | +15% |
| scienceworld-circuit-connector | 75% | 85% | +10% |
| scienceworld-recipe-retriever | 75% | 85% | +10% |
| scienceworld-room-scanner | 75% | 85% | +10% |
| scienceworld-substance-cooler | 75% | 85% | +10% |
| scienceworld-task-parser | 75% | 85% | +10% |
| alfworld-storage-explorer | 75% | 85% | +10% |
| webshop-purchase-executor | 75% | 85% | +10% |
| webshop-initial-search | 77% | 85% | +8% |
| scienceworld-device-activator | 67% | 75% | +8% |
| scienceworld-liquid-pourer | 67% | 75% | +8% |
| scienceworld-mixture-creator | 75% | 77% | +2% |

**81 skills improved** across all three domains (ALFWorld, ScienceWorld, WebShop). Average score increased from 70% to 85%.

<details>
<summary>Changes made</summary>

The improvements focused on three main areas:

1. **Added "Use when..." trigger clauses** to frontmatter descriptions — this is the primary signal `tessl skill review` uses to evaluate when a skill should activate. Many descriptions explained *what* the skill does but not *when* to use it.

2. **Added concrete examples** with input/output scenarios where missing — skills with worked examples score significantly higher on actionability and workflow clarity.

3. **Structured workflows** — reorganised prose-heavy instructions into numbered step sequences with clear decision points, making the skills more actionable for agents.

All domain-specific terminology (ALFWorld actions like `go to`, `take`, `put`; ScienceWorld actions like `teleport to`, `focus on`, `look around`; WebShop actions like `click[]`, `search[]`) was preserved throughout.

</details>

Honest disclosure — I work at @tesslio where we build tooling around skills like these. Not a pitch - just saw room for improvement and wanted to contribute.

Want to self-improve your skills? Just point your agent (Claude Code, Codex, etc.) at [this Tessl guide](https://docs.tessl.io/evaluate/optimize-a-skill-using-best-practices) and ask it to optimize your skill. Ping me - [@popey](https://github.com/popey) - if you hit any snags.

Thanks in advance 🙏

zxlzr (Contributor) commented Mar 17, 2026

Thank you for your contribution. We will process it as soon as possible.
