36 changes: 35 additions & 1 deletion extract-knowhow/commands/extract-knowhow.md
@@ -127,6 +127,8 @@ Read full `.jsonl` files. For sessions > 30,000 chars, split into 25,000-char se

**The goal is to extract tacit knowledge — the hard-won intuition, thinking frameworks, and principles that experts carry in their heads but never write down.** Skills should be useful to ANY researcher in the same subdomain, not just the original author.

**Prefer fewer, stronger skills over many weak ones.** If an item is borderline generic, borderline project-specific, or hard to reuse without the original context, skip it.

When extracting from a specific project, always ask: "Would this help a new PhD student entering this field?" If yes, extract it. If it only makes sense in the context of this particular project, generalize it or skip it.

**Generalize:** "For our LiFePO4 simulation, AMIX=0.05 worked" → "For GGA+U calculations on any transition metal oxide with localized d-electrons, reduce AMIX to 0.05"
@@ -163,6 +165,38 @@ Replace specific references with generic descriptions: "our internal dataset"
- Standard textbook knowledge with no novel application
- Any personally identifiable information

### Reuse Quality Bar

Only keep an item if it passes **all** of these checks:

1. **Transferable:** It applies to a recognizable class of problems in the subdomain, not just one project file or one dataset.
2. **Actionable:** A researcher could do something differently after reading it.
3. **Replicable:** The reasoning protocol is concrete enough that another researcher could follow it without hidden project context.
4. **Non-obvious:** It contains judgment, heuristics, failure diagnosis, or tradeoffs that are not just textbook definitions.
5. **Scoped correctly:** It is neither too broad ("validate results carefully") nor too narrow ("change line 214 in script X").

Reject items that fail any one of these checks.
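
The all-or-nothing rule above can be sketched as a small filter. This is only an illustration: the `checks` object, its key names, and both function names are hypothetical, and the command file itself does not define any such data structure.

```javascript
// Hypothetical key names mirroring the five checks listed above.
const QUALITY_CHECKS = [
  "transferable",
  "actionable",
  "replicable",
  "nonObvious",
  "scopedCorrectly",
];

// Return the names of the checks an item fails (an empty array means "keep").
function failedChecks(checks) {
  return QUALITY_CHECKS.filter((name) => checks[name] !== true);
}

// An item is kept only if it passes every check; one failure rejects it.
function passesQualityBar(checks) {
  return failedChecks(checks).length === 0;
}

// Example: a candidate that depends on hidden project context.
const candidate = {
  transferable: true,
  actionable: true,
  replicable: false,
  nonObvious: true,
  scopedCorrectly: true,
};
console.log(passesQualityBar(candidate)); // false
console.log(failedChecks(candidate)); // lists "replicable" as the failing check
```

Treating a missing key as a failure keeps the filter strict, which matches the "prefer fewer, stronger skills" guidance.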

### Specificity Calibration

- **Too general:** advice that could apply to almost any research project without change. Example: "Check your data quality before analysis."
- **Too specific:** advice that depends on one dataset, one repository, one file path, or one unpublished internal convention.
- **Good:** a reusable pattern with a clear trigger condition, action, and scientific rationale. Example: "When land-surface model validation maps look spatially sparse, first verify that remote CSV endpoints returned actual data rather than HTML error pages, because silent fetch failures often masquerade as missing observations."

When in doubt, rewrite toward the "Good" level or skip the item.

### Replicability Requirements

Each accepted item should make the hidden know-how operational:

- `title`: name the problem class or decision point, not a vague theme
- `description`: state the trigger condition, recommended action, and why it matters scientifically
- `reasoning_steps`: include 3-7 concrete steps or checks another researcher could actually follow
- `tools`: include only tools that materially support the workflow, not every tool mentioned in the session
- `pitfalls`: describe concrete failure modes, not generic warnings

If you cannot write a concrete reasoning protocol or concrete pitfalls, the item is probably not reusable enough to keep.
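
A structural version of these requirements could look like the sketch below. The field names come from the list above; the function name, the error messages, and the assumption that `reasoning_steps`, `tools`, and `pitfalls` are arrays of strings are illustrative only. A check like this can confirm the shape of an item, not whether its steps and pitfalls are actually concrete.

```javascript
// Return a list of structural problems for one extracted item.
// Assumes (hypothetically) that list-valued fields are arrays of strings.
function replicabilityProblems(item) {
  const problems = [];
  const isNonEmptyString = (v) => typeof v === "string" && v.trim().length > 0;
  const isStringArray = (v) => Array.isArray(v) && v.every(isNonEmptyString);

  if (!isNonEmptyString(item.title)) problems.push("title is missing");
  if (!isNonEmptyString(item.description)) problems.push("description is missing");

  const steps = item.reasoning_steps;
  if (!isStringArray(steps) || steps.length < 3 || steps.length > 7) {
    problems.push("reasoning_steps must contain 3-7 steps");
  }
  // An empty tools list is allowed: only tools that materially help belong here.
  if (!isStringArray(item.tools)) problems.push("tools must be a list of strings");
  if (!isStringArray(item.pitfalls) || item.pitfalls.length === 0) {
    problems.push("pitfalls must list at least one failure mode");
  }
  return problems;
}

// Example: a draft item with too few reasoning steps and no pitfalls.
const draft = {
  title: "Diagnosing sparse validation maps",
  description: "When validation maps look sparse, verify fetched data before blaming the model.",
  reasoning_steps: ["Inspect the raw responses from the remote endpoints"],
  tools: [],
  pitfalls: [],
};
console.log(replicabilityProblems(draft));
```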

### Output per item
```json
{
```
@@ -229,7 +263,7 @@ Assemble all results into a single JSON object:

Create directory: `mkdir -p ~/.openscientist`

Read the HTML template from the npm package at `templates/report.html` (installed alongside this command). The template contains all CSS, JS, and the interactive UI. Replace the `__REPORT_DATA__` placeholder in the template with the actual JSON data object from Step 6.2.
Read the HTML template from the npm package at `templates/report.html` (installed alongside this command). The template contains all CSS, JS, and the interactive UI. Replace only the `__REPORT_DATA__` token in the line `const DATA = __REPORT_DATA__;` with the raw JSON data object from Step 6.2, and preserve every other template line exactly.

If the template file is not found, fall back to writing a minimal HTML page that embeds the JSON data and displays it.
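
The replacement and fallback described above might be implemented along these lines. This is a sketch under stated assumptions, not the command's actual implementation: `fillTemplate`, `fallbackPage`, and `renderReport` are hypothetical names, and escaping `<` inside the JSON is an added precaution that the command text does not require.

```javascript
const fs = require("fs");

const ASSIGNMENT = "const DATA = __REPORT_DATA__;";

// Replace only the token inside the DATA assignment line; every other
// character of the template is left untouched.
function fillTemplate(template, data) {
  if (!template.includes(ASSIGNMENT)) {
    throw new Error("template is missing the DATA assignment line");
  }
  // Assumption: escape "<" so a "</script>" inside a string value cannot
  // close the script block early. "\u003c" is still valid JSON and JS.
  const json = JSON.stringify(data).replace(/</g, "\\u003c");
  // A replacer function prevents "$" sequences in the JSON from being
  // interpreted as replacement patterns by String.prototype.replace.
  return template.replace(ASSIGNMENT, () => `const DATA = ${json};`);
}

// Minimal fallback page that embeds the JSON and displays it as text.
function fallbackPage(data) {
  const escapeHtml = (s) =>
    s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
  const body = escapeHtml(JSON.stringify(data, null, 2));
  return `<!DOCTYPE html><html><head><meta charset="utf-8"><title>Report</title></head><body><pre>${body}</pre></body></html>`;
}

// Use the template when it exists, otherwise fall back to the minimal page.
function renderReport(templatePath, data) {
  if (!fs.existsSync(templatePath)) return fallbackPage(data);
  return fillTemplate(fs.readFileSync(templatePath, "utf-8"), data);
}

// Example with an inline stand-in for the real template.
const demo = fillTemplate(
  "<script>\nconst DATA = __REPORT_DATA__;\n</script>",
  { author: "A. Researcher", projects: [] }
);
console.log(demo);
```

Matching the whole assignment line rather than the bare token means a stray mention of the placeholder elsewhere in the template (for example in a comment) is never rewritten by accident.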

11 changes: 6 additions & 5 deletions extract-knowhow/templates/report.html
@@ -17,6 +17,7 @@
.stat-label{font-size:.8rem;color:#8b949e}
.author-bar{display:flex;justify-content:center;gap:1.5rem;margin-bottom:2rem;flex-wrap:wrap}
.author-field{display:flex;align-items:center;gap:.5rem;color:#8b949e;font-size:.9rem}
.author-field label{white-space:nowrap}
.author-field input{background:#161b22;border:1px solid #30363d;color:#e6edf3;padding:6px 12px;border-radius:4px;font-size:.9rem;width:200px}
.project{background:#161b22;border:1px solid #30363d;border-radius:8px;margin-bottom:1.5rem;overflow:hidden}
.project-header{padding:1rem 1.5rem;background:#1c2129;border-bottom:1px solid #30363d;display:flex;justify-content:space-between;align-items:center;flex-wrap:wrap;gap:.5rem}
@@ -77,9 +78,9 @@ <h1>🏛️ OpenScientist</h1>
</div>
</div>
<div class="author-bar">
<div class="author-field">Name: <input id="author-name" type="text" placeholder="e.g. Albert Einstein"></div>
<div class="author-field">Institution: <input id="author-inst" type="text" placeholder="e.g. ETH Zürich Physics"></div>
<div class="author-field">Role: <select id="author-role" style="background:#161b22;border:1px solid #30363d;color:#e6edf3;padding:6px 12px;border-radius:4px;font-size:.9rem">
<div class="author-field"><label for="author-name">Name:</label> <input id="author-name" type="text" placeholder="e.g. Albert Einstein"></div>
<div class="author-field"><label for="author-inst">Institution:</label> <input id="author-inst" type="text" placeholder="e.g. ETH Zürich Physics"></div>
<div class="author-field"><label for="author-role">Role:</label> <select id="author-role" style="background:#161b22;border:1px solid #30363d;color:#e6edf3;padding:6px 12px;border-radius:4px;font-size:.9rem">
<option value="">Select...</option>
<option value="Undergraduate">Undergraduate Student</option>
<option value="Master's Student">Master's Student</option>
@@ -107,7 +108,7 @@ <h1>🏛️ OpenScientist</h1>
</div>

<script>
// __REPORT_DATA__ is replaced by the prompt with actual extracted JSON
// Insert the extracted report JSON into the DATA assignment below.
const DATA = __REPORT_DATA__;

const CAT = {
@@ -120,7 +121,7 @@ <h1>🏛️ OpenScientist</h1>

const projects = DATA.projects || [];
document.getElementById('author-name').value = DATA.author || '';
document.getElementById('author-inst').value = DATA.email || '';
document.getElementById('author-inst').value = DATA.institution || '';
if(DATA.role) document.getElementById('author-role').value = DATA.role;
document.getElementById('rdate').textContent = DATA.date || new Date().toISOString().slice(0,10);
document.getElementById('s-proj').textContent = projects.length;
10 changes: 10 additions & 0 deletions extract-knowhow/tests/test-postinstall.js
@@ -9,6 +9,7 @@ const { execFileSync } = require("child_process");
const COMMANDS_DIR = path.join(os.homedir(), ".claude", "commands");
const TARGET = path.join(COMMANDS_DIR, "extract-knowhow.md");
const SCRIPT_DIR = path.join(__dirname, "..", "scripts");
const TEMPLATE = path.join(__dirname, "..", "templates", "report.html");

let passed = 0;
let failed = 0;
@@ -35,6 +36,15 @@ assert(fs.existsSync(TARGET), "Command file exists after install");
const content = fs.readFileSync(TARGET, "utf-8");
assert(content.includes("extract-knowhow"), "Command file contains expected content");
assert(content.startsWith("#"), "Command file starts with markdown header");
assert(content.includes("Prefer fewer, stronger skills over many weak ones."), "Command includes the stronger skill-selection guidance");
assert(content.includes("Only keep an item if it passes **all** of these checks:"), "Command includes the reuse quality bar");
assert(content.includes("If you cannot write a concrete reasoning protocol or concrete pitfalls"), "Command enforces replicability requirements");

console.log("\nTest: report template invariants");
const template = fs.readFileSync(TEMPLATE, "utf-8");
assert(template.includes("const DATA = __REPORT_DATA__;"), "Template keeps a single DATA placeholder assignment");
assert(!template.includes("__REPORT_DATA__ is replaced by"), "Template does not repeat the placeholder in comments");
assert(template.includes("DATA.institution || ''"), "Template maps institution field from DATA.institution");

console.log("\nTest: postuninstall.js");
execFileSync(process.execPath, [path.join(SCRIPT_DIR, "postuninstall.js")], { stdio: "pipe" });