Write PRDs, run discovery cycles, plan launches, and facilitate strategy sessions — from your terminal.
Shipwright gives PMs a real operating system for product work: framework-backed skills, orchestrated workflows, and quality gates that produce artifacts teams can execute.
Under the hood, Shipwright includes 46 skills, 7 specialist agents, and 18 chained workflows. The counts matter less than the contract: evidence-first outputs, explicit decisions, pass/fail gating, deterministic recovery, and optional adversarial review for high-stakes artifacts.
The skills are plain markdown files, so they're compatible with any AI coding tool that reads skill files (Cursor, Codex, Gemini CLI, and others). Agents, commands, and the original /start orchestrator are Claude Code-specific. This repo also includes a Codex-native bridge via AGENTS.md so plain-language prompts in Codex can still route through Shipwright's bounded research and framework selection.
Shipwright is not "better prompting." It is a quality system around prompting.
| Dimension | Raw AI prompting | Shipwright |
|---|---|---|
| Consistency | Format shifts each run | Stable output signature via output standard |
| Decision quality | Often descriptive, not decisive | Required Decision Frame with recommendation + trade-off + owner/date |
| Evidence discipline | Easy to mix assumptions and facts | Sourced claims + explicit unknowns |
| Readiness gating | "Looks good" is subjective | Binary pass/fail gates before scoring |
| Adversarial pressure | Critique depends on the same prompt that produced the work | Optional /challenge workflow and red-team review for pressure-testing finished artifacts |
| Recovery path | Ad hoc rewrites | Deterministic recovery playbooks |
| Handoff quality | Varies by prompt quality | Repeatable workflows with role constraints and checks |
A PM wrote a PRD recommending enterprise expansion as the top priority. Before sending it to engineering, they ran /challenge to pressure-test it.
Input:
/challenge Review this PRD at Standard depth before I send it to the eng lead.
What the red-team agent found:
| Claim Challenged | Attack Vector | Severity | Why This Is Vulnerable | What Would Resolve It |
|---|---|---|---|---|
| "Enterprise is our highest-growth segment" | Evidence Integrity | Moderate | Cited market report covers the category, not this product. No enterprise-specific pipeline or win-rate data. | Cite enterprise pipeline metrics or downgrade to hypothesis. |
| "Minimal incremental engineering cost" | Structural Honesty | Critical | SSO, audit logging, and SLA requirements are listed in the appendix but not reflected in the cost estimate or timeline. | Reconcile appendix requirements with the effort estimate or scope them out explicitly. |
| "Self-serve onboarding will scale to enterprise" | Decision Courage | Moderate | The PRD hedges with "may require some customization" but doesn't commit to whether enterprise onboarding is self-serve or high-touch. | Make the call: self-serve with guardrails, or dedicated onboarding. State the trade-off. |
Verdict: DEFEND. The enterprise thesis may still be right, but the cost estimate contradicts the appendix and the growth claim lacks product-specific evidence. The PM should route findings back before treating the PRD as settled.
The PM sent findings back to the producing agent, which revised the cost section and downgraded the growth claim to a hypothesis. A second /challenge pass returned CLEAR.
Most PM work falls into one of three patterns. If you're unsure where to begin, pick the path that matches this week's job.
/discover → /write-prd → /tech-handoff
Start with customer evidence, convert it into a structured PRD, then generate the engineering handoff package. You end with: discovery report, Working Backwards PRD, tech spec, design review, epics, and stories. Typical effort: a few focused sessions.
/customer-review → /strategy → /okrs
Synthesize customer signals, set strategic bets and boundaries, then draft and audit OKRs against those bets. You end with: customer intelligence report, strategy doc with kill criteria, and audited OKRs. Typical effort: a few focused sessions.
/strategy → /plan-launch → /sprint
Lock positioning, build the GTM launch plan, then turn it into execution-ready sprint scope. You end with: strategy doc, GTM launch plan, and sprint plan with stories. Typical effort: a few focused sessions.
Each path chains 3 workflows; run them in separate sessions or back-to-back. For full path details, see the workflows guide.
Option A: Plugin install (recommended)
claude plugin marketplace add EdgeCaser/shipwright
claude plugin install shipwright@shipwright

Option B: Script install (recommended for manual installs)
git clone https://github.com/EdgeCaser/shipwright.git
bash shipwright/scripts/sync.sh --install your-project/

This copies all skills, agents, commands, docs, and evals into your-project/.claude/, and drops a shipwright-sync.sh script you can re-run later to pull updates.
Option C: Manual install
git clone https://github.com/EdgeCaser/shipwright.git
cp -r shipwright/skills/ your-project/.claude/skills/
cp -r shipwright/agents/ your-project/.claude/agents/
cp -r shipwright/commands/ your-project/.claude/commands/
mkdir -p your-project/.claude/scripts/
cp -r shipwright/scripts/ your-project/.claude/scripts/

Using a different tool? See the cross-tool install guide.
If you are running directly from this repo in Codex, you do not need slash commands. The project-level AGENTS.md tells Codex to treat plain-language PM prompts as Shipwright work and to use the local research collector before broad interactive browsing when it is available.
cp shipwright/examples/CLAUDE.md.example your-project/CLAUDE.md
# Fill in your product name, personas, metrics, and priorities — even rough answers help

Then open Claude Code in your project and run:
/start I'm a PM at [company] working on [brief context]
That's it. The orchestrator reads your CLAUDE.md, picks up your context, and routes you to the right workflow. If you skip the CLAUDE.md, Shipwright still works — but outputs will be generic instead of tailored to your product.
You can also run workflows directly:
/discover /write-prd /plan-launch /strategy /sprint /okrs /challenge /status /quality-check
For the full workflow list and behavior, see using workflows.
When you already know the job to be done, run the workflow directly instead of routing through /start. For example, use /competitive for competitive analysis or /pricing for pricing work.
When a task needs fresh public research, keep the first pass narrow:
- do market sizing first, then positioning
- do competitive landscape first, then battlecards
- ask for findings inline before asking for a polished memo or saved file
This keeps web-heavy work bounded and reduces timeout risk on broad requests.
If you want to reduce search latency further without changing the conversational UX, Shipwright also includes scripts/collect-research.mjs, which can build a compact evidence pack from programmatic web search before the model synthesizes it. The helper now escalates automatically from a small first pass to broader subqueries, caches fresh evidence packs under .shipwright/cache/research/v1/ for 24 hours by default, and only asks the model to browse interactively for the remaining gaps. It also emits a facts.json sidecar with deterministic pricing, review, product, date, and package-registry facts, including adapter-backed metadata from npm, PyPI, and crates.io when available. If no Brave or Tavily key is configured, it still degrades gracefully by writing a needs-interactive-followup pack instead of failing hard, and those no-provider fallback packs are not cached. To clear the local cache manually, run node scripts/collect-research.mjs --clear-cache.
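The 24-hour cache behavior described above can be pictured as a simple freshness check. This is an illustrative sketch only, not the collector's actual implementation; the cache directory matches the README, but keying packs by a query hash is an assumption.

```shell
# Illustrative sketch of the 24-hour cache check described above.
# The cache directory is from the README; the pack file naming is
# a hypothetical stand-in -- the real collect-research.mjs may differ.
CACHE_DIR=".shipwright/cache/research/v1"
PACK="$CACHE_DIR/example-query-hash.json"   # hypothetical pack file name
MAX_AGE_MIN=$((24 * 60))

mkdir -p "$CACHE_DIR"
if [ -f "$PACK" ] && [ -n "$(find "$PACK" -mmin "-$MAX_AGE_MIN")" ]; then
  echo "cache hit: reuse $PACK"
else
  echo "cache miss: collect a fresh evidence pack"
  # node scripts/collect-research.mjs "<query>"
fi
```

On a cache miss the real helper collects a fresh pack; remember that no-provider fallback packs are deliberately excluded from this cache.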
You can use any skill directly without workflows, agents, or the orchestrator:
Read skills/execution/prd-development/SKILL.md and write a PRD for [feature].
Use standalone mode for one framework and one question. Move up to workflows when you need repeatable, multi-step output quality.
Want proof before adoption? Start here:
- Case studies for real-world proof points from production use
- Golden outputs for side-by-side baseline vs Shipwright comparisons
- Pass/fail gates for binary readiness checks
- Eval rubrics for scored quality dimensions
- Adversarial review rubric for calibrating Challenge Reports
- Failure modes and recovery playbooks for deterministic fixes
Every Shipwright artifact closes with the same three blocks. Here's a real example from a competitive brief:
## Decision Frame
Recommendation: Lead the first discovery call with revenue cycle friction (documentation
accuracy, prior auth denial rate) before surfacing automation capabilities. Do not open with
technology.
Trade-off: A slower first meeting vs. a pitch that lands before the client has confirmed the pain.
Confidence: High — revenue impact is quantifiable from published industry benchmarks, and
competitor capability gap is sourced from press releases and analyst reports.
Decision owner/date: PM (2026-03-15). Revisit after first discovery call.
## Unknowns and Evidence Gaps
- EHR platform(s) in use — determines integration path
- Payer mix breakdown — affects whether the documented revenue gap is material at this client's scale
- Whether any value-based contracts are already in place — changes the urgency framing
## Pass/Fail Readiness
PASS — competitive claims are sourced, revenue impact is quantified, discovery entry points are
ranked by evidence quality, and unknowns are listed with resolution path (first call).
FAIL condition: if competitive capability claims are taken from positioning pages only with no
outcome data, or if revenue impact has no source.

If you installed with scripts/sync.sh --install, your project has a shipwright-sync.sh script. After pulling new changes in the Shipwright repo, run it from your project directory:
bash shipwright-sync.sh # interactive — shows what changed, asks before updating
bash shipwright-sync.sh --yes # auto-update everything without prompting

The sync script compares every file against the Shipwright source and reports what's changed, what's new, and what's been removed. You can update all at once or file-by-file with diffs.
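For intuition, the per-file comparison can be sketched like this. It is a sketch under assumptions, not the actual shipwright-sync.sh logic; the fixture files stand in for the Shipwright source and your project copy.

```shell
# Sketch of a per-file "new vs changed" comparison, as shipwright-sync.sh
# is described doing. Fixture dirs stand in for the real source and
# project trees; the real script's logic may differ.
SRC=$(mktemp -d); DST=$(mktemp -d)
printf 'v2\n' > "$SRC/prd.md";  printf 'v1\n' > "$DST/prd.md"   # changed file
printf 'new\n' > "$SRC/okrs.md"                                  # new file

for f in $(cd "$SRC" && find . -type f); do
  if [ ! -f "$DST/$f" ]; then
    echo "new: $f"
  elif ! cmp -s "$SRC/$f" "$DST/$f"; then
    echo "changed: $f"
  fi
done
```

Files present in both trees with identical contents produce no output, which is why an up-to-date project syncs quietly.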
Shipwright includes an optional local Slack integration in slack-agent/. It lets you @mention a bot in Slack, route the message into Claude Code running on your machine, and post the reply back into the same thread.
Current behavior:
- runs locally through Slack Socket Mode, so no public webhook is required
- keeps Claude session continuity per Slack thread
- supports strict commands like `question:` and `status:`
- supports thread-scoped listening mode with `listen on`/`listen off`
- uses the project directory you configure via `PROJECT_CWD`
Important warning:
- this Slack agent is for personal use only
- it is not intended to be a shared Claude gateway for teammates
- it should be limited to allowlisted users and channels
- `READ_ONLY_MODE=true` is the recommended default
If you want a team-facing Slack product, use a proper API-backed architecture instead of routing requests through a local authenticated Claude Code session.
Setup, configuration, safety guidance, and supported commands are documented in slack-agent/README.md.
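Configuration happens through environment variables. A minimal sketch might look like the following; `PROJECT_CWD` and `READ_ONLY_MODE` are named in this README, while the Slack token variable names are assumptions to verify against slack-agent/README.md.

```shell
# Minimal slack-agent configuration sketch.
# PROJECT_CWD and READ_ONLY_MODE are documented above; the token
# variable names below are assumed -- confirm in slack-agent/README.md.
export SLACK_BOT_TOKEN="xoxb-..."   # assumed name: bot token
export SLACK_APP_TOKEN="xapp-..."   # assumed name: app-level Socket Mode token
export PROJECT_CWD="/path/to/your-project"
export READ_ONLY_MODE=true          # recommended safe default
```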
- Workflows guide: all commands, orchestration model, common paths
- Output standard: required sections, signature rules, decision framing
- Composition model: how skills, workflows, and agents compose
- AI vs non-AI guide: what to automate deterministically, what to keep agentic, and where to invest next
- Cross-tool install: Cursor, Codex, Gemini CLI, others
- Tool connections: MCP setup and integration patterns
- Skills catalog and Agents: source of truth for all components
PRs are welcome. Before opening one, run:
./scripts/validate.sh

See CONTRIBUTING.md for skill/workflow/rubric requirements and the submission checklist.
Built on ideas from PM practitioners and the AI coding agent community:
- Pawel Huryn's PM Skills Marketplace
- Dean Peters' Product-Manager-Skills
- Sachin Rekhi's Claude Code for PMs
- prodmgmt.world
- ccforpms.com
- VoltAgent's awesome-claude-code-subagents
MIT