feat(skill): BM25-based smart skill retrieval for large catalogues by claudianus · Pull Request #726 · MoonshotAI/kimi-code

claudianus · 2026-06-13T19:30:26Z

Related Issue

Resolve #725

Problem

When a user has a large number of skills installed (e.g. 1,000+), the current implementation injects the entire skill listing into the system prompt — including name, description (250 chars), whenToUse, and file path for every skill.

With 1,530 real skills this amounts to ~118,000 tokens in the system prompt, which:

Consumes ~46% of a 256K context window
Loads all SKILL.md file bodies into memory at startup (~88 MB JS heap)
Forces the LLM to scan a massive listing to find the right skill — poor accuracy due to lost-in-the-middle effects

What Changed

A 3-tier auto-detection system based on invocable skill count. No feature flag needed — the tier is selected automatically at session start.

Skill count	Listing format	Strategy
≤ 80	Legacy full listing (name + desc + path)	Prompt-cache optimal for small catalogues
81–300	Compact (name + 80-char desc)	Skill search available
> 300	Names only	Skill search required

New files

skill/search.ts — BM25 search index with synonym expansion. Zero external dependencies. ~24ms to build for 1,500 skills, <1ms per query.

Modified files

skill/registry.ts — Auto-detect tier based on skill count. searchSkills() method with lazy index rebuild. renderSkillPrompt() lazy-loads SKILL.md content from disk only when a skill is activated.
skill/parser.ts — parseSkillMetaFromFile() reads only YAML frontmatter (~20 lines) instead of the full SKILL.md body. Uses LAZY_CONTENT_SENTINEL to distinguish "not loaded" from "genuinely empty body".
skill-tool.ts / .md — Skill tool extended with action: "search" alongside existing action: "load". Backward-compatible — skill field remains required.
profile/default/system.md — Two-step skill usage: search first, then load.
session/index.ts — Minor cleanup (removed flag wiring).
flags/registry.ts — No new flags needed.

Tests

New tests merged into existing registry.test.ts (search, tiers, lazy rebuild)
New integration-proof.test.ts validates what the LLM actually sees with real 1,530 skills
Existing scanner.test.ts updated for lazy loading via renderSkillPrompt
2,572 tests pass, 0 failures

Performance (measured with 1,530 real skills from antigravity-awesome-skills)

Metric	Before	After	Reduction
System prompt	118,120 tokens	8,436 tokens	93%
Startup memory	~88 MB	~4 MB	95%
Search latency	N/A (LLM scans listing)	0.0–0.2ms (BM25)	—
Lazy content load	N/A (eager at startup)	0.4ms per activation	—

Checklist

Tests added and passing (2,572/2,572)
Lint clean for modified files (oxlint + sherif)
TypeScript clean (tsc --noEmit, 0 errors)
Changeset generated (minor bump)
No co-author attribution or agent identity
No internal identifiers in diff
Backward-compatible (existing skill callers unchanged)

When >80 skills are installed, the system prompt switches from a full listing to a compact name-only format. The model discovers skills via the Skill tool's new action:"search" endpoint backed by a BM25 index. Three tiers by skill count: ≤ 80: legacy full listing (prompt-cache optimal) 81-300: compact name + description + search > 300: names only + search required Key changes: - skill/search.ts: BM25 index with synonym expansion (zero deps) - skill/registry.ts: auto-detect tier, lazy content loading - skill/parser.ts: parseSkillMetaFromFile (frontmatter-only) - skill-tool.ts: search action on existing Skill tool - system.md: search-first workflow instructions Performance (measured with 1,530 real skills): - System prompt: 118K → 8.4K tokens (93% reduction) - Startup memory: 88MB → 4MB (95% reduction) - Search latency: 0.0-0.2ms per query - Lazy content load: 0.4ms per skill activation Closes MoonshotAI#725

changeset-bot · 2026-06-13T19:30:31Z

🦋 Changeset detected

Latest commit: b1136f9

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package

Name	Type
@moonshot-ai/kimi-code	Minor

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4eca85320e

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-06-13T19:33:43Z

 export const SkillToolInputSchema: z.ZodType<SkillToolInput> = z.object({
  skill: z.string(),


Allow search calls without a skill name

When the model follows the new instructions/example and calls Skill as {"action":"search","query":"e2e test"}, preflight validation rejects it before execution() runs because this schema still makes skill required; loop/tool-call.ts validates the advertised JSON schema with AJV. This breaks the large-catalog search-first flow unless the model invents a dummy skill name, so skill should be optional for search or the schema should be discriminated by action.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-13T19:33:43Z

+import { join } from 'node:path';
+import { performance } from 'node:perf_hooks';
+
+const SKILLS_DIR = join(homedir(), '.kimi', 'skills');


Avoid depending on a local home skill catalog

These tests depend on ~/.kimi/skills existing and containing the author's real 1,530-skill catalog. In a clean CI or developer checkout this path is usually absent or has different contents, so the registry loads no/different skills and the listing assertions or results[0]! dereference fail. Build a temporary fixture catalog inside the test instead of reading the user's home directory.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-13T19:33:43Z

+      fencesFound++;
+      if (fencesFound === 2) break;
+    }
+    offset += line.length + 1;


Preserve CRLF frontmatter fences when slicing

For SKILL.md files written with CRLF line endings, split(/\r?\n/) removes a two-character newline but this offset only adds + 1, so frontmatterOnly is sliced before the second ---; parseSkillText then reports a missing closing fence and discovery skips the skill. This regresses Windows-authored skills that the full parser already accepts, so compute the fence position on the original buffer or account for \r\n lengths.

Useful? React with 👍 / 👎.

P1: Make skill field optional in SkillToolInputSchema — search-only calls no longer need a dummy skill name. Validation added for load action requiring skill parameter. P1: Replace home-dir-dependent integration tests with self-contained temp fixture (350 SKILL.md files in mkdtemp). Tests are now portable across CI and developer machines. P2: Fix CRLF fence offset in parseSkillMetaFromFile — split on \n and strip trailing \r to correctly account for 2-byte newlines when computing the slice boundary. Also: update skill-tool.test.ts contract to reflect optional skill.

Six smoke tests with real fixture files (350 SKILL.md): 1. Tier auto-detection (names-only for 350 skills) 2. Lazy content loading (sentinel → readFileSync) 3. BM25 search accuracy (8/8 domain queries correct) 4. Schema validation (search without skill name) 5. CRLF frontmatter parsing 6. Full model flow: search → pick → load → render

chatgpt-codex-connector Bot reviewed Jun 13, 2026

View reviewed changes

claudianus added 2 commits June 14, 2026 04:41

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(skill): BM25-based smart skill retrieval for large catalogues#726

feat(skill): BM25-based smart skill retrieval for large catalogues#726
claudianus wants to merge 3 commits into
MoonshotAI:mainfrom
claudianus:main

claudianus commented Jun 13, 2026

Uh oh!

changeset-bot Bot commented Jun 13, 2026 •

edited

Loading

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Jun 13, 2026

Uh oh!

chatgpt-codex-connector Bot Jun 13, 2026

Uh oh!

chatgpt-codex-connector Bot Jun 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		export const SkillToolInputSchema: z.ZodType<SkillToolInput> = z.object({
		skill: z.string(),

Conversation

claudianus commented Jun 13, 2026

Related Issue

Problem

What Changed

New files

Modified files

Tests

Performance (measured with 1,530 real skills from antigravity-awesome-skills)

Checklist

Uh oh!

changeset-bot Bot commented Jun 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🦋 Changeset detected

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 13, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 13, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 13, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

changeset-bot Bot commented Jun 13, 2026 •

edited

Loading