Commit 2ffdb43

feat(cli): optimize startup time (#96)
* feat(cli): optimize startup time
* update readme
1 parent 3ed07c1 commit 2ffdb43

15 files changed

Lines changed: 1542 additions & 61 deletions
Lines changed: 159 additions & 0 deletions
@@ -0,0 +1,159 @@
1+
---
phase: design
title: System Design & Architecture
description: Define the technical architecture, components, and data models
feature: cli-startup-performance
---
7+
8+
# System Design & Architecture
9+
10+
## Architecture Overview
11+
12+
```mermaid
graph TD
    User[User or CI] --> Bin[dist/cli.js bin entry]
    Bin --> Bootstrap[Lightweight CLI Bootstrap]
    Bootstrap --> Manifest[Command Metadata Manifest]
    Manifest --> Commander[Commander Parser]
    Commander --> Help[Help and Version Output]
    Commander --> Dispatch[Lazy Action Dispatcher]
    Dispatch --> Init[init/phase/setup/lint/install handlers]
    Dispatch --> Memory[memory handlers]
    Dispatch --> Skill[skill handlers]
    Dispatch --> Agent[agent handlers and TUI]
    Dispatch --> Channel[channel handlers and bridge]
    Dispatch --> Docs[docs handlers]
    Bench[Benchmark Script] --> Bin
    CI[CI Gate] --> Bench
```
29+
30+
The optimized CLI should separate cheap command metadata from expensive command execution code.
31+
32+
The approved architecture is a two-step optimization:
33+
34+
1. Implement a lightweight static command metadata layer plus lazy action dispatcher first. This keeps TypeScript source maintainable and removes eager command-handler imports from startup/help paths.
35+
2. Run the benchmark after the lazy metadata/dispatcher refactor. If p50 remains above `50 ms`, add generated or bundled `dist` optimization using only existing repo tooling and without changing package manifests.
36+
37+
Key components:
38+
39+
- **Lightweight CLI bootstrap**: The published entrypoint that loads only Commander, version metadata, command metadata, and dispatch glue.
40+
- **Command metadata manifest**: Static data for command names, descriptions, arguments, and options. This enables help/version output without importing handlers.
41+
- **Lazy action dispatcher**: Imports the actual command module only when the selected action executes.
42+
- **Command handler modules**: Existing command implementations, refactored only as needed to avoid top-level imports that are not needed by the selected subcommand.
43+
- **Benchmark script**: Local and CI entrypoint for startup/help timing and representative command smoke checks.
44+
45+
## Data Models
46+
47+
### Command Metadata
48+
49+
```typescript
interface CliCommandDefinition {
  name: string;
  description: string;
  arguments?: CliArgumentDefinition[];
  options?: CliOptionDefinition[];
  subcommands?: CliCommandDefinition[];
  action?: LazyActionDefinition;
}

interface LazyActionDefinition {
  module: string;
  exportName: string;
}
```
64+
65+
The exact shape can be simpler if hand-written registration helpers are clearer. The design requirement is that help-visible command metadata is available without importing heavy handler modules.
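To make that requirement concrete, here is a minimal sketch of a manifest that renders root help from plain data and resolves a lazy action without importing any handler. The shapes of `CliArgumentDefinition` and `CliOptionDefinition`, the helper names `renderRootHelp` and `findAction`, the module paths, export names, and description strings are all assumptions made for illustration; they are not taken from the commit.

```typescript
// Assumed shapes: the design doc names these types but does not define them.
interface CliArgumentDefinition { name: string; description?: string; required?: boolean }
interface CliOptionDefinition { flags: string; description: string }

interface LazyActionDefinition { module: string; exportName: string }

interface CliCommandDefinition {
  name: string;
  description: string;
  arguments?: CliArgumentDefinition[];
  options?: CliOptionDefinition[];
  subcommands?: CliCommandDefinition[];
  action?: LazyActionDefinition;
}

// Placeholder entries: descriptions, module paths, and export names are hypothetical.
const exampleManifest: CliCommandDefinition[] = [
  {
    name: "lint",
    description: "(placeholder description)",
    action: { module: "./commands/lint.js", exportName: "lintCommand" },
  },
  {
    name: "agent",
    description: "(placeholder description)",
    action: { module: "./commands/agent.js", exportName: "agentCommand" },
  },
];

// Render a root help listing from metadata only; no handler module is touched.
function renderRootHelp(programName: string, commands: CliCommandDefinition[]): string {
  const width = Math.max(...commands.map((c) => c.name.length));
  const lines = [`Usage: ${programName} [options] [command]`, "", "Commands:"];
  for (const c of commands) {
    lines.push(`  ${c.name.padEnd(width)}  ${c.description}`);
  }
  return lines.join("\n");
}

// Resolve which module a command would load, still without importing it.
function findAction(
  commands: CliCommandDefinition[],
  name: string,
): LazyActionDefinition | undefined {
  return commands.find((c) => c.name === name)?.action;
}
```

Because both functions read only static data, the help path pays for string formatting and nothing else; the cost of a handler's dependency graph is deferred until `findAction` yields a module that is actually imported.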
66+
67+
### Benchmark Result
68+
69+
```typescript
70+
interface BenchmarkCaseResult {
71+
label: string;
72+
command: string[];
73+
iterations: number;
74+
minMs: number;
75+
p50Ms: number;
76+
p95Ms: number;
77+
maxMs: number;
78+
avgMs: number;
79+
failures: number;
80+
}
81+
```
82+
83+
## API Design
84+
85+
No public CLI API changes are allowed. Internal APIs may be introduced:
86+
87+
- `registerCommandMetadata(program, definitions)` to build Commander commands from metadata.
88+
- `lazyAction(modulePath, exportName)` to wrap `.action(...)` with dynamic import and error handling.
89+
- `runCliBenchmark(cases, options)` to execute benchmark cases with repeated child processes.
90+
91+
Existing command modules should continue exposing testable handler functions where practical.
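As an illustration of the lazy wrapper idea, the sketch below defers loading until the action runs. It deviates from the `lazyAction(modulePath, exportName)` signature listed above by taking a loader callback instead of a module path, purely so the example can run without real command modules; that parameter, the `Handler` and `ModuleLoader` types, and the error message are assumptions, and the project's existing error-handling wrapper is not reproduced here.

```typescript
// Hypothetical types for this sketch only.
type Handler = (...args: unknown[]) => unknown;
type ModuleLoader = () => Promise<Record<string, unknown>>;

// Returns a function suitable for passing to `.action(...)`.
// Nothing is loaded when lazyAction is called, only when the returned action executes.
function lazyAction(
  load: ModuleLoader,
  exportName: string,
): (...args: unknown[]) => Promise<void> {
  return async (...args: unknown[]): Promise<void> => {
    const mod = await load();
    const handler = mod[exportName];
    if (typeof handler !== "function") {
      // Surface a clear failure instead of "undefined is not a function".
      throw new Error(`Lazy action export "${exportName}" is not a function`);
    }
    await (handler as Handler)(...args);
  };
}

// In a real CLI the loader would typically be a dynamic import, for example:
//   lazyAction(() => import("./commands/lint.js"), "lintCommand")
// (module path and export name are hypothetical).
```

Passing a loader rather than a string also keeps the import specifier visible to static analysis, which may matter if a bundling step is added later.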
92+
93+
## Component Breakdown
94+
95+
### CLI Bootstrap
96+
97+
- Owns `program.name`, package version loading, root command metadata, and `program.parse`.
98+
- Must not import heavy command modules at top level.
99+
- Must not import `ink`, `react`, `inquirer`, `telegraf`, `@ai-devkit/agent-manager`, `@ai-devkit/memory`, or channel bridge code unless the chosen command requires them.
100+
101+
### Command Registration
102+
103+
- Keeps help text equivalent to current help output.
104+
- Registers command actions through lazy dispatch wrappers.
105+
- May be hand-written first to reduce risk; generated metadata is allowed if it improves maintainability.
106+
107+
### Command Handlers
108+
109+
- Existing command behavior remains the source of truth.
110+
- Heavy subcommand-specific dependencies should move into the action path when feasible. Example: `agent console` should be the path that loads Ink/React, not `agent --help`.
111+
- Shared utility imports are acceptable only when they are lightweight enough for the target.
112+
113+
### Build Output
114+
115+
- The build may produce generated or bundled `dist` artifacts.
116+
- Source maps or clear generated-file provenance must exist if output becomes hard to inspect.
117+
- `packages/cli/package.json` `bin` behavior must remain install-compatible.
118+
119+
### Benchmarking
120+
121+
- Benchmark direct built CLI execution after `npm run build`.
122+
- Use at least 20 iterations per startup/help command.
123+
- Record p50 and p95; p50 is the enforcement metric for `<50 ms`.
124+
- Use temporary directories/config for memory benchmark cases.
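The timing fields recorded per case can be derived from raw samples along these lines. This is a sketch only: the nearest-rank percentile method and the names `percentile`, `summarizeTimings`, and `TimingStats` are assumptions, since the document does not say how the benchmark utility computes p50 and p95.

```typescript
// Subset of BenchmarkCaseResult that is derived from raw samples.
interface TimingStats {
  minMs: number;
  p50Ms: number;
  p95Ms: number;
  maxMs: number;
  avgMs: number;
}

// Nearest-rank percentile over an ascending-sorted, non-empty array (assumed method).
function percentile(sortedAscending: number[], p: number): number {
  const rank = Math.ceil((p / 100) * sortedAscending.length);
  const index = Math.min(sortedAscending.length - 1, Math.max(0, rank - 1));
  return sortedAscending[index];
}

// Summarize per-process wall-clock durations in milliseconds.
function summarizeTimings(samplesMs: number[]): TimingStats {
  if (samplesMs.length === 0) {
    throw new Error("Cannot summarize an empty sample set");
  }
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const sum = sorted.reduce((total, value) => total + value, 0);
  return {
    minMs: sorted[0],
    p50Ms: percentile(sorted, 50),
    p95Ms: percentile(sorted, 95),
    maxMs: sorted[sorted.length - 1],
    avgMs: sum / sorted.length,
  };
}
```

Gating on p50 rather than the mean or the maximum keeps a single slow process spawn, which is common on shared CI runners, from failing the build.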
125+
126+
## Design Decisions
127+
128+
### 1. Optimize Current Node CLI First
129+
130+
Rust is intentionally out of scope. Measurements show most overhead comes from eager imports and CLI bootstrap shape, not local CPU-heavy work. The fastest low-risk path is to remove unnecessary Node module loading.
131+
132+
### 2. Preserve CLI Semantics
133+
134+
Performance work must be behavior-preserving. Any command output, option parsing, or exit-code change is a regression unless explicitly approved later.
135+
136+
### 3. Allow Bootstrap/Build Restructuring
137+
138+
The `<50 ms` target is aggressive. Dynamic imports alone may not be enough with native ESM file fanout, so the design allows a lightweight bootstrap, generated metadata, or bundled artifacts without adding dependencies.
139+
140+
Chosen path: do not start with bundling. Start with static metadata plus lazy dispatch because it is easier to review and preserves source/debug clarity. Treat bundling or generated `dist` output as a measured second step only if the first step does not meet the target.
141+
142+
### 4. No New Dependencies
143+
144+
The implementation must use existing repo tooling or plain Node scripts. If bundling is required, use tooling already available through the current lockfile without changing manifests, or implement a non-bundled fallback.
145+
146+
## Alternatives Considered
147+
148+
- **Action-only dynamic imports**: Simple and likely helpful, but may not hit `<50 ms` if command metadata still imports large modules.
149+
- **Static command metadata plus lazy handlers**: Chosen first step. Better startup characteristics while keeping source maintainable; requires keeping metadata and handler behavior aligned.
150+
- **Bundled bootstrap or CLI**: Conditional second step. Can reduce ESM file-load overhead; requires careful handling of dynamic imports, source maps, shebang, templates, and daemon entrypoints.
151+
- **Rust rewrite**: Best native startup potential, but too much scope for this feature and does not directly address command compatibility risk.
152+
153+
## Non-Functional Requirements
154+
155+
- Startup/help benchmark p50 `<50 ms` for required commands.
156+
- RSS for lightweight commands should drop materially from the current `100 MB+` seen on eager-import paths; the exact memory threshold is secondary to the startup target.
157+
- CI benchmark must avoid single-run flakiness through repeated sampling.
158+
- The implementation must remain portable on supported Node versions and existing npm workspace tooling.
159+
- No additional secrets, credentials, or network services are needed for tests.
Lines changed: 123 additions & 0 deletions
@@ -0,0 +1,123 @@
1+
---
phase: implementation
title: Implementation Guide
description: Technical implementation notes, patterns, and code guidelines
feature: cli-startup-performance
---
7+
8+
# Implementation Guide
9+
10+
## Development Setup
11+
12+
- Work in branch/worktree `feature-cli-startup-performance`.
13+
- Run `npm ci` in the feature worktree before phase work.
14+
- Run `npm run build` before benchmark commands so measurements use `packages/cli/dist/cli.js`.
15+
16+
## Code Structure
17+
18+
Expected touch points:
19+
20+
- `packages/cli/src/cli.ts`: root bootstrap and command registration.
21+
- `packages/cli/src/commands/*.ts`: command metadata/action split and lazy handler imports.
22+
- `packages/cli/src/__tests__/commands/*.test.ts`: command behavior regression tests.
23+
- `e2e/`: built CLI smoke coverage if bootstrap/build changes affect published behavior.
24+
- CI workflow files if benchmark gate is added there.
25+
26+
Current implementation deltas:
27+
28+
- `packages/cli/src/util/cli-benchmark.ts`: local startup benchmark utility and executable built script entrypoint.
29+
- `packages/cli/src/__tests__/util/cli-benchmark.test.ts`: TDD coverage for timing stats, failure accounting, required benchmark case list, and built-script root resolution.
30+
- `packages/cli/src/cli-command-manifest.ts`: shared lightweight top-level command manifest used by static help and lazy dispatch.
31+
- `packages/cli/src/__tests__/util/cli-command-manifest.test.ts`: coverage proving the manifest drives root help, command help, and dispatch paths.
32+
- `packages/cli/src/cli-runtime.ts`: lightweight static help rendering and lazy top-level command execution.
33+
- `packages/cli/src/__tests__/util/cli-runtime.test.ts`: TDD coverage for lightweight help/version, dispatch mapping, and lazy command registration.
34+
- `packages/cli/src/cli.ts`: thin bootstrap that handles static help/version and delegates real commands to the lazy dispatcher.
35+
- `packages/cli/package.json`: `benchmark:startup` script running `node dist/util/cli-benchmark.js`.
36+
- `.github/workflows/ci.yml`: CI benchmark step after build.
37+
38+
## Phase 6 Implementation Check
39+
40+
Alignment with the design:
41+
42+
- The root entrypoint now imports only package metadata plus lightweight bootstrap/dispatch helpers before command selection.
43+
- Root `--version`, root `--help`, and top-level command `--help` paths are served from static metadata and do not load the previous heavy command graph.
44+
- Real command execution imports only the selected top-level command module before Commander parsing.
45+
- Unknown command routing uses a lightweight Commander program populated from the shared manifest, preserving the existing unknown-command error without eager command-module imports.
46+
- The startup benchmark runs locally and in CI after build, with the `<50 ms` p50 gate enforced for version/help paths.
47+
48+
Deviations and follow-ups:
49+
50+
- Static help metadata duplicates command names/descriptions and selected option metadata. This is the main drift risk versus Commander-generated help and should be reviewed when commands change.
51+
- Lazy loading is currently at the top-level command group boundary. Heavy subcommand-specific dependencies inside groups such as `agent` and `channel` can be split further later, but this was not required to meet the startup/help target.
52+
- Representative real commands are smoke-measured in the benchmark table, but CI does not enforce a `10%` real-command regression threshold because there is no stored baseline in this implementation.
53+
54+
## Phase 8 Code Review Notes
55+
56+
- Reviewed the lightweight help metadata against real command registration. Fixed two public help parity gaps found during review: option-bearing command help now includes command-specific flags for `init`, `setup`, `lint`, and `install`; `channel --help` now includes `stop [name]`.
57+
- Refactored the runtime to expose `registerSelectedCommand` for direct branch coverage, while keeping `runSelectedCommand` as the CLI entrypoint dispatch API.
58+
- Verified exported helper usage with `rg`: new APIs are referenced only by `cli.ts`, tests, benchmark script entrypoint, and feature docs.
59+
- No new runtime dependencies, config keys, migrations, or irreversible state changes were introduced.
60+
61+
## Simplification Pass
62+
63+
- Consolidated top-level command metadata into `cli-command-manifest.ts`, so adding or changing a top-level command has one lightweight metadata entry used by both help rendering and dispatch resolution.
64+
- Removed the source `cli-full.ts` eager fallback. Unknown command handling now builds a lightweight Commander shell from the manifest, avoiding a second full command graph.
65+
- Consolidated `cli-bootstrap.ts` and `cli-dispatch.ts` into `cli-runtime.ts`, keeping one runtime module for help rendering and lazy command execution.
66+
- Updated the CLI entrypoint to fast-path `--version` before importing runtime code.
67+
- Added manifest tests to guard against future drift between root help, command help, and dispatch.
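The `--version` fast path described in this pass could look roughly like the sketch below. The `bootstrap` function, its parameters, and the signature given to `runSelectedCommand` are assumptions for illustration; only the name `runSelectedCommand` and the idea of answering `--version` before importing runtime code come from the notes above.

```typescript
// Assumed runtime surface: the real signature of runSelectedCommand is not shown in the docs.
interface CliRuntime {
  runSelectedCommand(argv: string[]): Promise<void>;
}

type RuntimeLoader = () => Promise<CliRuntime>;

// argv is expected without the node binary and script path (i.e. process.argv.slice(2)).
async function bootstrap(
  argv: string[],
  version: string,
  loadRuntime: RuntimeLoader,
  write: (text: string) => void,
): Promise<void> {
  // Fast path: answer --version without loading any runtime or command code.
  if (argv.length === 1 && argv[0] === "--version") {
    write(version);
    return;
  }
  // Everything else pays for the runtime import, and only then.
  const runtime = await loadRuntime();
  await runtime.runSelectedCommand(argv);
}

// In a real entrypoint the loader would typically be a dynamic import, for example:
//   bootstrap(process.argv.slice(2), version, () => import("./cli-runtime.js"), console.log)
```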
68+
69+
## Implementation Notes
70+
71+
### Core Features
72+
73+
- Keep root bootstrap lightweight. Avoid top-level imports of heavy command modules.
74+
- Keep help-visible command metadata available without importing handler dependencies.
75+
- Import command handlers dynamically only when a command action actually runs.
76+
- Implement static command metadata plus lazy dispatch as the first optimization step.
77+
- Introduce generated or bundled `dist` output only after benchmarking proves the first step misses the `<50 ms` target.
78+
- If generated or bundled output is introduced, keep the source architecture explicit and testable.
79+
80+
### Patterns & Best Practices
81+
82+
- Prefer small registration helpers over broad abstractions unless generated metadata becomes necessary.
83+
- Preserve existing `withErrorHandler` behavior around async command actions.
84+
- Keep command handler functions exported for direct unit testing.
85+
- Do not introduce new dependencies.
86+
87+
## Integration Points
88+
89+
- `packages/cli/package.json` `bin.ai-devkit` must remain compatible.
90+
- `packages/cli` build must continue copying `templates` into `dist/templates`.
91+
- `channel-daemon` launch logic must still resolve dev and built paths correctly.
92+
- Existing package imports from `@ai-devkit/agent-manager`, `@ai-devkit/memory`, and `@ai-devkit/channel-connector` should move behind lazy boundaries where possible.
93+
94+
## Error Handling
95+
96+
- Lazy import failures should surface as command failures with the same error handling conventions as existing commands.
97+
- Benchmark failures should print the failing command, p50, threshold, iterations, and failed process count.
98+
- Generated build failures should fail `npm run build` clearly.
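A gate check that reports the fields listed for benchmark failures might be sketched as follows. The function name `describeGateFailures`, the message layout, and the rule that any failed process also fails the gate are assumptions; the `BenchmarkCaseResult` fields come from the design doc.

```typescript
interface BenchmarkCaseResult {
  label: string;
  command: string[];
  iterations: number;
  minMs: number;
  p50Ms: number;
  p95Ms: number;
  maxMs: number;
  avgMs: number;
  failures: number;
}

// Returns one message per gated case that misses the target.
// A case passes only if p50 is strictly below the threshold and no process failed.
function describeGateFailures(
  results: BenchmarkCaseResult[],
  thresholdMs: number,
): string[] {
  return results
    .filter((r) => r.failures > 0 || r.p50Ms >= thresholdMs)
    .map(
      (r) =>
        `FAIL ${r.label}: command="${r.command.join(" ")}" ` +
        `p50=${r.p50Ms.toFixed(3)} ms threshold=<${thresholdMs} ms ` +
        `iterations=${r.iterations} failedProcesses=${r.failures}`,
    );
}
```

A caller would pass only the version/help cases to this check, print each returned message, and exit non-zero when the list is non-empty.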
99+
100+
## Performance Considerations
101+
102+
- Optimize for fresh process startup, not long-lived process warm paths.
103+
- Avoid unnecessary JSON/config/file reads before command selection.
104+
- Avoid loading TUI/React/Ink unless `agent console` runs.
105+
- Avoid loading memory database code unless a memory action runs.
106+
- Avoid loading Telegram/channel bridge code unless channel actions requiring them run.
107+
108+
Benchmark foundation evidence:
109+
110+
- `npm test -w packages/cli -- src/__tests__/util/cli-benchmark.test.ts` passed with 4 tests.
111+
- `npm test -w packages/cli -- src/__tests__/util/cli-runtime.test.ts src/__tests__/util/cli-command-manifest.test.ts src/__tests__/util/cli-benchmark.test.ts` passed with 18 tests after the final simplification pass.
112+
- `npm run build` passed for all 4 projects.
113+
- `AI_DEVKIT_CLI_BENCHMARK_ITERATIONS=1 npm run benchmark:startup -w packages/cli` executed all 15 configured cases with `0` failures. This single-iteration smoke run recorded pre-optimization startup timings of roughly `325-680 ms`, confirming that the benchmark captures the baseline the optimization is measured against.
114+
- After lightweight bootstrap, `npm run benchmark:startup -w packages/cli` with 20 iterations produced `0` failures. Startup/help p50 values were `24.070-25.226 ms`; `--version` p50 was `25.080 ms`.
115+
- After top-level lazy dispatch and CI gate wiring, `npm run benchmark:startup -w packages/cli` with 20 iterations exited `0`. Startup/help p50 values were `29.391-33.132 ms`; real command smoke p50 values were `75.028 ms` for `lint`, `239.437 ms` for `agent-list-json`, and `153.793 ms` for `memory-search`.
116+
- After the simplification pass, `npm run benchmark:startup -w packages/cli` with 20 iterations exited `0`. Startup/help p50 values were `24.085-25.149 ms`; real command smoke p50 values were `70.889 ms` for `lint`, `227.256 ms` for `agent-list-json`, and `149.253 ms` for `memory-search`.
117+
- After moving runtime modules next to `cli.ts`, `npm run benchmark:startup -w packages/cli` with 20 iterations exited `0`. Startup/help p50 values were `24.290-26.318 ms`; real command smoke p50 values were `71.420 ms` for `lint`, `255.848 ms` for `agent-list-json`, and `149.797 ms` for `memory-search`.
118+
119+
## Security Notes
120+
121+
- Benchmark scripts must not read or print secrets.
122+
- Channel and memory smoke tests should use temporary or project-isolated config paths.
123+
- No Telegram tokens, tmux sessions, or external network calls should be required in CI.
