Hullo @zjunlp 👋

I ran your ScienceWorld skills through `tessl skill review` at work and found some targeted improvements. Here's the full before/after:

| Skill | Before | After | Change |
|-------|--------|-------|--------|
| scienceworld-tool-user | 67% | 96% | +29% |
| scienceworld-conditional-placer | 67% | 93% | +26% |
| scienceworld-container-transfer | 67% | 93% | +26% |
| scienceworld-instruction-reader | 67% | 93% | +26% |
| scienceworld-inventory-manager | 67% | 93% | +26% |
| scienceworld-object-placer | 67% | 93% | +26% |
| scienceworld-object-retriever | 67% | 93% | +26% |
| scienceworld-living-entity-identifier | 68% | 94% | +26% |
| scienceworld-device-activator | 67% | 88% | +21% |
| scienceworld-liquid-pourer | 67% | 88% | +21% |
| scienceworld-container-inspector | 67% | 85% | +18% |
| scienceworld-item-fetcher | 67% | 85% | +18% |
| scienceworld-room-explorer | 67% | 85% | +18% |
| scienceworld-substance-fetcher | 67% | 85% | +18% |
| scienceworld-object-locator | 67% | 78% | +11% |

Average improvement: +22 percentage points across 15 skills.

Changes made:

1. "Use when..." trigger clauses in frontmatter descriptions

   Every skill description now includes an explicit "Use this skill when..." clause so agents can reliably select the right skill for the job. This was the single biggest driver of score improvement.

2. Concrete examples added

   Skills that lacked worked examples now have step-by-step scenarios with exact ScienceWorld commands (e.g., teleport to kitchen, pick up thermometer, use thermometer on metal fork). All ScienceWorld-specific terminology is preserved.

3. Tightened structure and reduced verbosity

   - Removed vague references to nonexistent "reference documentation"
   - Replaced verbose prose with action tables for quick scanning
   - Made verification steps explicit (not "optional")
   - Consolidated redundant Purpose/When to Use sections
   - Inlined key domain knowledge instead of referencing external scripts

Honest disclosure: I work at @tesslio where we build tooling around skills like these. Not a pitch - just saw room for improvement and wanted to contribute.

Want to self-improve your skills? Just point your agent (Claude Code, Codex, etc.) at this Tessl guide and ask it to optimize your skill: https://docs.tessl.io/evaluate/optimize-a-skill-using-best-practices

Ping me - @popey - if you hit any snags.

Thanks in advance 🙏

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
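For anyone who wants to re-derive the "+22 percentage points" figure for the 15 ScienceWorld skills listed above, here is a minimal sketch. The scores are copied by hand from that table; the variable names are illustrative, and this is not how `tessl skill review` computes anything.

```python
# (before, after) review scores in percent, copied from the 15-row table.
scores = {
    "scienceworld-tool-user": (67, 96),
    "scienceworld-conditional-placer": (67, 93),
    "scienceworld-container-transfer": (67, 93),
    "scienceworld-instruction-reader": (67, 93),
    "scienceworld-inventory-manager": (67, 93),
    "scienceworld-object-placer": (67, 93),
    "scienceworld-object-retriever": (67, 93),
    "scienceworld-living-entity-identifier": (68, 94),
    "scienceworld-device-activator": (67, 88),
    "scienceworld-liquid-pourer": (67, 88),
    "scienceworld-container-inspector": (67, 85),
    "scienceworld-item-fetcher": (67, 85),
    "scienceworld-room-explorer": (67, 85),
    "scienceworld-substance-fetcher": (67, 85),
    "scienceworld-object-locator": (67, 78),
}

# Per-skill change in percentage points, then the plain arithmetic mean.
changes = [after - before for before, after in scores.values()]
mean_change = sum(changes) / len(changes)

print(f"{len(changes)} skills, mean change {mean_change:+.1f} points")
```

The mean comes out at 22.4 points, which rounds to the quoted +22.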
Hullo @zjunlp 👋

I ran your skills through `tessl skill review` at work and found some targeted improvements. Here's the full before/after:

| Skill | Before | After | Change |
|-------|--------|-------|--------|
| alfworld-tool-user | 17% | 85% | +68% |
| scienceworld-inventory-focus | 25% | 85% | +60% |
| scienceworld-task-focuser | 17% | 75% | +58% |
| alfworld-inventory-management | 35% | 85% | +50% |
| alfworld-object-storer | 35% | 85% | +50% |
| alfworld-task-verifier | 35% | 85% | +50% |
| scienceworld-measurement-taker | 35% | 85% | +50% |
| scienceworld-threshold-evaluator | 35% | 85% | +50% |
| webshop-attribute-verifier | 50% | 100% | +50% |
| scienceworld-object-selector | 45% | 85% | +40% |
| scienceworld-tool-validator | 35% | 75% | +40% |
| alfworld-object-locator | 52% | 85% | +33% |
| alfworld-clean-object | 67% | 100% | +33% |
| alfworld-device-operator | 67% | 100% | +33% |
| alfworld-object-state-inspector | 67% | 100% | +33% |
| scienceworld-ambiguous-action-resolution | 52% | 85% | +33% |
| scienceworld-instruction-reader | 67% | 100% | +33% |
| scienceworld-living-entity-identifier | 68% | 100% | +32% |
| scienceworld-animal-identifier | 57% | 85% | +28% |
| alfworld-heat-object-with-appliance | 75% | 100% | +25% |
| alfworld-appliance-preparer | 75% | 100% | +25% |
| alfworld-object-cooler | 75% | 100% | +25% |
| alfworld-receptacle-closer | 75% | 100% | +25% |
| alfworld-receptacle-navigator | 75% | 100% | +25% |
| alfworld-temperature-regulator | 75% | 100% | +25% |
| scienceworld-tool-user | 67% | 92% | +25% |
| scienceworld-growth-focuser | 70% | 90% | +20% |
| alfworld-appliance-navigator | 67% | 85% | +18% |
| alfworld-locate-target-object | 67% | 85% | +18% |
| alfworld-object-heater | 67% | 85% | +18% |
| alfworld-object-state-modifier | 67% | 85% | +18% |
| alfworld-object-transporter | 67% | 85% | +18% |
| alfworld-receptacle-finder | 67% | 85% | +18% |
| alfworld-search-pattern-executor | 68% | 85% | +17% |
| alfworld-search-verifier | 67% | 85% | +18% |
| alfworld-tool-locator | 67% | 85% | +18% |
| alfworld-environment-scanner | 68% | 85% | +17% |
| scienceworld-conditional-placer | 67% | 85% | +18% |
| scienceworld-container-inspector | 67% | 85% | +18% |
| scienceworld-container-transfer | 67% | 85% | +18% |
| scienceworld-inventory-manager | 67% | 85% | +18% |
| scienceworld-item-fetcher | 67% | 85% | +18% |
| scienceworld-object-locator | 67% | 85% | +18% |
| scienceworld-object-placer | 67% | 85% | +18% |
| scienceworld-object-retriever | 67% | 85% | +18% |
| scienceworld-room-explorer | 67% | 85% | +18% |
| scienceworld-substance-fetcher | 67% | 85% | +18% |
| webshop-product-detail-check | 67% | 85% | +18% |
| webshop-product-evaluator | 67% | 85% | +18% |
| webshop-product-search | 67% | 85% | +18% |
| webshop-product-selector | 67% | 85% | +18% |
| webshop-purchase-initiator | 67% | 85% | +18% |
| webshop-query-interpreter | 67% | 85% | +18% |
| webshop-result-filter | 67% | 85% | +18% |
| webshop-search-executor | 67% | 85% | +18% |
| scienceworld-substance-preparator | 70% | 85% | +15% |
| scienceworld-circuit-connector | 75% | 85% | +10% |
| scienceworld-recipe-retriever | 75% | 85% | +10% |
| scienceworld-room-scanner | 75% | 85% | +10% |
| scienceworld-substance-cooler | 75% | 85% | +10% |
| scienceworld-task-parser | 75% | 85% | +10% |
| alfworld-storage-explorer | 75% | 85% | +10% |
| webshop-purchase-executor | 75% | 85% | +10% |
| webshop-initial-search | 77% | 85% | +8% |
| scienceworld-device-activator | 67% | 75% | +8% |
| scienceworld-liquid-pourer | 67% | 75% | +8% |
| scienceworld-mixture-creator | 75% | 77% | +2% |

**81 skills improved** across all three domains (ALFWorld, ScienceWorld, WebShop). Average score increased from 70% to 85%.

<details>
<summary>Changes made</summary>

The improvements focused on three main areas:

1. **Added "Use when..." trigger clauses** to frontmatter descriptions. This is the primary signal `tessl skill review` uses to evaluate when a skill should activate. Many descriptions explained *what* the skill does but not *when* to use it.

2. **Added concrete examples** with input/output scenarios where missing. Skills with worked examples score significantly higher on actionability and workflow clarity.

3. **Structured workflows**: reorganised prose-heavy instructions into numbered step sequences with clear decision points, making the skills more actionable for agents.

All domain-specific terminology (ALFWorld actions like `go to`, `take`, `put`; ScienceWorld actions like `teleport to`, `focus on`, `look around`; WebShop actions like `click[]`, `search[]`) was preserved throughout.

</details>

Honest disclosure: I work at @tesslio where we build tooling around skills like these. Not a pitch - just saw room for improvement and wanted to contribute.

Want to self-improve your skills? Just point your agent (Claude Code, Codex, etc.) at [this Tessl guide](https://docs.tessl.io/evaluate/optimize-a-skill-using-best-practices) and ask it to optimize your skill. Ping me - [@popey](https://github.com/popey) - if you hit any snags.

Thanks in advance 🙏
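The "Use when..." trigger clause mentioned in the description above can be spot-checked locally before running a full review. This is a minimal sketch, assuming each skill is a Markdown file with a `---`-delimited frontmatter block and a single-line `description:` field. The sample content, the regex and the function names are hypothetical, and the check does not reproduce how `tessl skill review` scores a skill.

```python
import re

# Hypothetical skill file; real files in the repository may be laid out differently.
SAMPLE_SKILL = """\
---
name: scienceworld-tool-user
description: Uses a tool on a target object. Use this skill when a task requires applying an instrument to an object.
---
1. teleport to kitchen
2. pick up thermometer
3. use thermometer on metal fork
"""

# Matches "Use when" or "Use this skill when", case-insensitively.
TRIGGER = re.compile(r"\buse (this skill )?when\b", re.IGNORECASE)


def frontmatter_description(text: str) -> str:
    """Return the single-line description from a leading --- block, or ""."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return ""
    for line in lines[1:]:
        if line.strip() == "---":  # end of the frontmatter block
            break
        if line.startswith("description:"):
            return line[len("description:"):].strip()
    return ""


def has_trigger_clause(text: str) -> bool:
    """True if the frontmatter description contains a "Use when" style clause."""
    return bool(TRIGGER.search(frontmatter_description(text)))


print(has_trigger_clause(SAMPLE_SKILL))
```

A description that only says what the skill does, such as `description: Does a thing.`, would fail this check. Multi-line YAML descriptions are out of scope for the sketch and would need a real YAML parser.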
Contributor
Thank you for your contribution. We will process it as soon as possible.