All notable changes to @oni.bot/core are documented here.
Format follows Keep a Changelog. This project uses Semantic Versioning.
Automated bug pipeline sweep: 90 bugs found, fixed, and independently verified (BUG-0024 through BUG-0182). Primarily security hardening and reliability improvements across the entire codebase.
- Path traversal prevention: `persistSemantic()`, `persistEpisodic()`, `persistInternal()`, `safeSkillPath()`, and GitHub API calls now validate/sanitize paths (BUG-0064, 0075, 0100, 0101, 0106, 0163, 0166)
- Null-byte injection: Stripped from `sessionId`, `domain`, and `topic` fields before filesystem operations (BUG-0163, 0166)
- Command injection: `execSync` replaced with `execFileSync` (array args) in CLI commands; `shell: true` removed from spawn (BUG-0063, 0104)
- Prototype pollution: `deepMerge` filters `__proto__`/`constructor`/`prototype` keys; recursive strip in `modifiedInput` (BUG-0062, 0144)
- Mermaid injection: Bracket characters escaped in node labels to prevent diagram injection (BUG-0174)
- Role string sanitization: LLM message interpolation sites in supervisor sanitized (BUG-0173)
- LLM-generated skill validation: Content validated before writing to disk (BUG-0157)
- ReDoS prevention: Replaced vulnerable `rm -rf` regex with O(n) token-based check (BUG-0180)
- URL scheme validation: All model factories (`anthropic`, `openai`, `openrouter`, `google`, `ollama`) and `A2AClient` now reject non-HTTP(S) `baseUrl` schemes to prevent API key exfiltration (BUG-0105, 0141, 0142, 0143, 0170, 0179)
- API key validation: All model factories throw at construction time if API key is missing (BUG-0098, 0099, 0159)
- Error message sanitization: Raw API response bodies replaced with generic status messages across all model adapters, GitHub, Firecrawl, web-search, and A2A handlers (BUG-0076, 0102, 0111, 0155, 0161, 0164, 0165, 0167)
- TOCTOU race fix: `checkAllowedPath()` returns resolved path; all 6 filesystem handlers use it for I/O (BUG-0065, 0181)
- Fail-closed semantics: Empty `allowedPaths` now throws instead of allowing all access; hook timeouts fail-closed for security-critical events (BUG-0149, 0168, 0169)
- Default-deny permissions: Missing agent entries in permission config now deny by default (BUG-0172)
- Safety gate strictness: `approved` field checked with strict boolean equality (BUG-0162)
- Body size limits: A2A server enforces 1MB max body; `requestHandler()` enforces `MAX_BODY_SIZE` for Fetch-API callers (BUG-0077, 0145)
- Buffer overflow guards: MCP `StdioTransport` enforces `MAX_BUFFER_SIZE`; LSP client caps buffer at 64MB (BUG-0068, 0182)
- Unbounded growth caps: `ExperimentLog` capped at 1000 records; `DeadLetterQueue` at 100 per thread (BUG-0129, 0130)
- CORS enforcement: A2A server adds preflight handling and `Content-Type` enforcement (BUG-0171)
- PRNG hardening: `Math.random` replaced with `crypto.randomUUID` for HITL resume IDs, broker IDs, and `generateId()` (BUG-0095, 0103, 0122)
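
The prototype pollution entry above describes filtering dangerous keys during a deep merge. The following is a minimal sketch of that technique, not the package's actual `deepMerge`; the names `safeDeepMerge`, `BLOCKED_KEYS`, and `PlainObject` are hypothetical and exist only for illustration.

```typescript
// Hypothetical helper illustrating key filtering during a deep merge.
// This is NOT the deepMerge shipped in @oni.bot/core.
const BLOCKED_KEYS = new Set(["__proto__", "constructor", "prototype"]);

type PlainObject = Record<string, unknown>;

function isPlainObject(value: unknown): value is PlainObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function safeDeepMerge(target: PlainObject, source: PlainObject): PlainObject {
  // Copy the target so the caller's object is never mutated.
  const result: PlainObject = { ...target };
  for (const key of Object.keys(source)) {
    // Skip keys that could reach Object.prototype.
    if (BLOCKED_KEYS.has(key)) continue;
    const incoming = source[key];
    const existing = result[key];
    if (isPlainObject(incoming)) {
      // Recurse so blocked keys are also stripped from nested objects.
      result[key] = safeDeepMerge(isPlainObject(existing) ? existing : {}, incoming);
    } else {
      result[key] = incoming;
    }
  }
  return result;
}

// JSON.parse creates an own "__proto__" property, which a naive merge would follow.
const hostile = JSON.parse('{"__proto__": {"polluted": true}, "retries": 3}') as PlainObject;
const merged = safeDeepMerge({ timeout: 1000 }, hostile);
console.log(merged);
```

Filtering by key name before recursing keeps the check cheap and means nested payloads are sanitized by the same rule as top-level ones.
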
- Shared mutable state: 6 module-level counters/maps moved to per-instance scope to prevent cross-tenant interference (BUG-0070, 0082, 0083, 0092, 0123, 0134)
- Race conditions: `refreshTools()` coalescing lock prevents concurrent races; `fileVersions` re-read at increment point; race timeout properly cleared (BUG-0124, 0125, 0127)
- Resource leaks: `ReadableStream` reader released in `finally` block; `onAny()` returns unsubscribe function (BUG-0079, 0128)
- Lifecycle events: `SessionEnd` fires on all loop exit paths including HITL interrupt; telemetry span closes on interrupt (BUG-0080, 0086)
- Redis atomicity: `SET`+`ZADD` combined into single Lua script for `put()`; `DEL`+`ZREM` wrapped in `MULTI`/`EXEC` (BUG-0073, 0096, 0126)
- ESM compatibility: `require()` calls replaced with static ESM imports in `SkillLoader` and `safeSkillPath` (BUG-0078, 0081)
- Stream resilience: OpenRouter adapter handles usage-only SSE chunks and guards `choices` dereference; `SafetyGate.check()` handles non-string/non-array content (BUG-0153, 0178)
- Error handling: JSON-RPC handler catches sync throws; spawn processes get `error` event handlers; JSONL parsing skips malformed lines (BUG-0029, 0114, 0115, 0150)
- Input validation: `fromJSON` validates input shape; nested object schemas enforce required fields; `validate` callback invoked on resumed user input; `Object.freeze` no longer blocks VM result write-back (BUG-0024, 0112, 0151, 0152)
- Minor fixes: Empty CSV guard, PII regex `/g` flag removal, Slack `username`/`iconEmoji` passthrough, round-robin over full slot list, `structuredClone` fallback, dotless path guard in `getLoader`, null-safe `refreshTools`, unknown channel key filtering, state method signature alignment, E2B timeout cleanup, off-by-one in `fireSessionEnd` turn count, fallback truncation `continue` fix (BUG-0025, 0026, 0027, 0028, 0030, 0067, 0090, 0093, 0107, 0120, 0121, 0133, 0136, 0137, 0154, 0175)
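
The error handling entry above mentions JSONL parsing that skips malformed lines. As a rough illustration of that behavior, and assuming nothing about the package's real parser, a tolerant reader might look like this; `parseJsonl` and its return shape are hypothetical.

```typescript
// Hypothetical tolerant JSONL reader; the real parser in @oni.bot/core may differ.
function parseJsonl(text: string): { records: unknown[]; skipped: number } {
  const records: unknown[] = [];
  let skipped = 0;
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    // Blank lines are not errors; ignore them silently.
    if (trimmed === "") continue;
    try {
      records.push(JSON.parse(trimmed));
    } catch {
      // A malformed line is counted and skipped instead of aborting the whole file.
      skipped += 1;
    }
  }
  return { records, skipped };
}

const parsed = parseJsonl('{"id":1}\n{broken\n\n{"id":2}\n');
console.log(parsed.records.length, parsed.skipped);
```

Returning a skip count lets the caller log or alert on corruption without losing the valid records.
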
- 8 new test files covering prototype pollution, hooks eval bypass, PII regex safety, skill evolver ESM paths, DAG unsatisfiable deps, supervisor routing errors, wrap-agent loop errors, and A2A handler subscribe errors
- Deep equal cycle guard: Two separate WeakSets for independent object tracking prevent false equality on cyclic structures
- Direction-aware improvement check: `ExperimentalExecutor` now correctly evaluates optimization direction
- ESM error code: Postgres store uses `ERR_MODULE_NOT_FOUND` (correct ESM code) instead of `MODULE_NOT_FOUND`
- Streaming tool-call ID collisions: Counter-based IDs prevent duplicates in parallel tool calls
- SSE error propagation: Nested try/catch with `controller.error` for proper stream error handling
- Fallback truncation: `continue` instead of `break` so oversized messages don't halt truncation
- Silent parse failures: `console.warn` surfaced across all 5 model adapters and `responseFormat` parsers
- Postgres checkpoint deserialization: Runtime validation on restored checkpoint data
- PubSub subscriber errors: Errors in subscriber callbacks are now logged instead of silently swallowed
- Bridge tracer cleanup: `startTimes` map entries removed on `unsubscribe` to prevent memory leaks
- MCP client error propagation: `.catch()` reordered so callback errors propagate correctly
- Mermaid graph rendering: Router-to-target edges now render correctly in `toMermaid()`
- `contentLength()` accounting: Tool call tokens included in content length calculation
- `interruptAfter` parallel safety: Moved to post-loop pass to avoid interrupting mid-superstep
- `executeTools` parallel safety: `parallelSafe` check now matches `defineAgent` pattern
- EventBus dispose: `waitFor` promises correctly reject on bus disposal
- Harness hooks engine: Error isolation between hook handlers
- OpenAI adapter streaming: Robust chunk parsing for partial SSE frames
- Skill evolver: Pattern learner and evolver state management fixes
- Redis store: Connection lifecycle and error handling improvements
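
The fallback truncation entry above turns on the difference between `continue` and `break` inside a truncation loop. The sketch below shows why that matters, under the assumption that truncation walks messages newest to oldest against a character budget; the real adapters may count tokens or order messages differently, and `truncateToBudget` is a hypothetical name.

```typescript
interface Message {
  role: string;
  content: string;
}

// Hypothetical truncation routine; the budget unit (characters) is an assumption.
function truncateToBudget(messages: Message[], maxChars: number): Message[] {
  const kept: Message[] = [];
  let used = 0;
  // Walk newest to oldest so recent context is preferred.
  for (const message of [...messages].reverse()) {
    const size = message.content.length;
    if (used + size > maxChars) {
      // `continue` skips only this oversized message.
      // A `break` here would also discard every older message that still fits.
      continue;
    }
    kept.unshift(message); // restore chronological order
    used += size;
  }
  return kept;
}

const history: Message[] = [
  { role: "user", content: "old" },
  { role: "assistant", content: "x".repeat(50) },
  { role: "user", content: "new" },
];
const truncated = truncateToBudget(history, 10);
console.log(truncated.map((m) => m.content));
```

With a budget of 10 characters, the 50-character message is skipped while both short messages survive; a `break` at the same point would have kept only the newest one.
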
- Removed internal planning documents from repository (architecture docs, bug trackers, sprint plans)
- Moved developer guide to root level (`GUIDE.md`)
- Budget guardrails: `BudgetTracker.record()` is now called from `defineAgent` after each `model.chat()` via a RunContext callback; throws `BudgetExceededError` when `maxTokensPerRun`/`maxTokensPerAgent`/`maxCostPerRun` is exceeded.
- defineAgent ToolContext: `store` and stream `emit` now read from the live RunContext instead of being hard-coded to `null`/noop, so tools receive the real store and can emit custom stream events.
- parallelSafe enforcement: `parallelSafe: false` is now preserved through `ToolDefinition` and respected by `defineAgent`: when any tool in a batch has `parallelSafe: false`, all calls in that step execute sequentially.
- pathMap compile-time validation: `StateGraph.compile()` now validates conditional edge `pathMap` targets at compile time, throwing `NodeNotFoundError` instead of failing silently at runtime.
- spawnAgent in unsupervised swarms: `spawnAgent()` now throws a clear error when called on a swarm without a supervisor, rather than silently adding an agent that never executes.
- removeAgent edge cleanup: `removeAgent()` now removes stale static edges pointing to the removed node from `_edgesBySource`, preventing `NodeNotFoundError` when Pregel tries to route to a removed agent.
- Lifecycle events from defineAgent: `llm.request`, `llm.response`, `tool.call`, and `tool.result` events are now emitted and audit entries written for every model call and tool execution inside `defineAgent`.
- toolPermissions enforcement: `checkToolPermission()` is called before each tool execution in `defineAgent`, outside the tool's try/catch, so `ToolPermissionError` propagates as a real graph failure.
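
The pathMap validation entry above describes checking conditional edge targets when the graph is compiled rather than when it runs. A minimal sketch of that idea follows. It does not use the package's API: `validatePathMap` is hypothetical, and the local `NodeNotFoundError` class is a stand-in for the library's error of the same name.

```typescript
// Stand-in error class for illustration; the library exports its own NodeNotFoundError.
class NodeNotFoundError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "NodeNotFoundError";
  }
}

// Hypothetical validator: every pathMap target must name a registered node.
function validatePathMap(
  source: string,
  pathMap: Record<string, string>,
  nodes: ReadonlySet<string>,
): void {
  for (const [label, target] of Object.entries(pathMap)) {
    if (!nodes.has(target)) {
      throw new NodeNotFoundError(
        `Conditional edge from "${source}" maps "${label}" to unknown node "${target}"`,
      );
    }
  }
}

const knownNodes = new Set(["router", "approve", "reject"]);

// Passes: both targets are registered nodes.
validatePathMap("router", { yes: "approve", no: "reject" }, knownNodes);

// Fails early: "aprove" is a typo, caught before any graph execution.
let compileError = "";
try {
  validatePathMap("router", { yes: "aprove" }, knownNodes);
} catch (err) {
  compileError = (err as Error).message;
}
console.log(compileError);
```

Failing at compile time surfaces typos in target names immediately instead of on the first run that happens to take the broken branch.
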
- Token streaming (parallel nodes): Replaced module-level `_tokenHandler` global with `AsyncLocalStorage`; each node in a parallel fan-out now gets its own token handler, preventing tokens from being silently dropped or misrouted to the wrong node.
- HITL resume: `resume()` now looks up the session by `resumeId` (not just the first pending interrupt) and calls `markResumed()` so sessions transition from `"pending"` to `"resumed"` after being handled.
- Subgraph checkpointer restore: The child runner's original `checkpointer` is saved before being overwritten with a `NamespacedCheckpointer` and restored after the subgraph completes, preventing the child's checkpointer from leaking across invocations.
- Circuit breaker fallback: The user-supplied `fallback(state, error)` is now called with the real node state and the `CircuitBreakerOpenError` instance rather than `(undefined, undefined)`.
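
The token streaming entry above replaces a module-level handler with `AsyncLocalStorage`. The sketch below shows the general pattern using Node's `node:async_hooks` module. The helper names (`runNode`, `emitToken`, `tokenHandlerStorage`) are illustrative and are not claimed to match the package's internals.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

type TokenHandler = (token: string) => void;

// One storage instance; each run() call scopes its own handler.
const tokenHandlerStorage = new AsyncLocalStorage<TokenHandler>();

// Hypothetical emit helper: routes to whichever handler is active in the current context.
function emitToken(token: string): void {
  const handler = tokenHandlerStorage.getStore();
  if (handler) handler(token);
}

// Hypothetical node wrapper: everything called inside fn sees this node's handler.
function runNode<T>(handler: TokenHandler, fn: () => T): T {
  return tokenHandlerStorage.run(handler, fn);
}

const tokensA: string[] = [];
const tokensB: string[] = [];

runNode(
  (token) => {
    tokensA.push(token);
  },
  () => {
    emitToken("a1");
    // A second node scoped inside the first still gets its own handler.
    runNode(
      (token) => {
        tokensB.push(token);
      },
      () => emitToken("b1"),
    );
    // Back in the outer scope, the first handler is active again.
    emitToken("a2");
  },
);

// Outside any run() scope there is no handler, so this token goes nowhere.
emitToken("orphan");
console.log(tokensA, tokensB);
```

With a single module-level variable, the inner `runNode` call would have overwritten the outer handler. Scoping the handler to the call context avoids that, and the same context is carried across `await` boundaries.
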
- `oni-code` AI coding assistant extracted to `@oni.bot/code` (separate package)
- `sentinel` code analysis engine extracted to `@oni.bot/sentinel` (separate package)
- `oni` and `oni-code` CLI binaries removed from this package (install `@oni.bot/code` for the CLI)
- `./config` sub-module export (`@oni.bot/core/config`): JSONC config loader with env var substitution and hierarchical merge
- `"sideEffects": false`: enables full tree shaking in bundlers
- 5 model adapters (anthropic, openai, openrouter, google, ollama) — zero runtime dependencies
- 21 total exports: root + 20 named subpaths
- Structured error codes with `ONI_<CATEGORY>_<NAME>` taxonomy: every error carries code, category, recoverable flag, suggestion, and context
- Per-node timeouts via `addNode(name, fn, { timeout: ms })` with `NodeTimeoutError`
- Global default timeout via `compile({ defaults: { nodeTimeout: ms } })`
- Circuit breaker pattern: `addNode(name, fn, { circuitBreaker: { threshold, resetAfter, fallback? } })`
- Dead letter queue: `compile({ deadLetterQueue: true })` captures failed node inputs for recovery
- OpenTelemetry tracing adapter (`ONITracer`): zero-dep, user brings own tracer
- Backpressure streaming with `BoundedBuffer` (drop-oldest and error strategies)
- Testing utilities: `mockModel()`, `assertGraph()`, `createTestHarness()` via `@oni.bot/core/testing`
- `oni init` CLI command for project scaffolding
- 7 new error types: `NodeTimeoutError`, `CircuitBreakerOpenError`, `SwarmDeadlockError`, `ModelRateLimitError`, `ModelContextLengthError`, `CheckpointCorruptError`, `StoreKeyNotFoundError`
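
The backpressure entry above names two overflow strategies, drop-oldest and error. To make the difference concrete, here is a small sketch of a bounded buffer with both. It is an illustration only: `SketchBoundedBuffer` and its method names are hypothetical, and the package's `BoundedBuffer` may expose a different interface.

```typescript
type OverflowStrategy = "drop-oldest" | "error";

// Hypothetical bounded buffer; not the BoundedBuffer exported by @oni.bot/core.
class SketchBoundedBuffer<T> {
  private readonly items: T[] = [];
  private readonly capacity: number;
  private readonly strategy: OverflowStrategy;

  constructor(capacity: number, strategy: OverflowStrategy) {
    if (!Number.isInteger(capacity) || capacity < 1) {
      throw new RangeError("capacity must be a positive integer");
    }
    this.capacity = capacity;
    this.strategy = strategy;
  }

  push(item: T): void {
    if (this.items.length >= this.capacity) {
      if (this.strategy === "error") {
        // Surface the overflow to the producer.
        throw new Error("buffer full");
      }
      // drop-oldest: discard the head to make room for the new item.
      this.items.shift();
    }
    this.items.push(item);
  }

  shift(): T | undefined {
    return this.items.shift();
  }

  get size(): number {
    return this.items.length;
  }
}

const lossy = new SketchBoundedBuffer<number>(2, "drop-oldest");
lossy.push(1);
lossy.push(2);
lossy.push(3); // evicts 1
console.log(lossy.shift(), lossy.shift());
```

Drop-oldest suits streams where only recent data matters, while the error strategy suits consumers that must never miss an item and would rather fail loudly.
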
- `ONIError` now accepts optional `ONIErrorOptions` as second constructor parameter
- All existing error classes enhanced with structured codes (backward compatible)
- `set()` alias for `put()` on `InMemoryStore` and `NamespacedStore`
- `createReactAgent` now accepts `ONIModel` (auto-adapts chat to invoke)
- Package renamed from `@oni-bot/core` to `@oni.bot/core`
- Removed all external framework references from codebase and documentation
- Resolved TypeScript strict errors in swarm template `Send` casts
- `SwarmGraph` builder for multi-agent orchestration
- `SwarmGraph.hierarchical()` template: supervisor-workers pattern
- `SwarmGraph.fanOut()` template: parallel agent execution with reducer
- `SwarmGraph.pipeline()` template: linear chain with conditional transitions
- `SwarmGraph.peerNetwork()` template: decentralized agent handoffs
- `SwarmGraph.mapReduce()` template: parallel map with reducer
- `SwarmGraph.debate()` template: multi-round parallel debate with judge
- `SwarmGraph.hierarchicalMesh()` template: nested team coordination
- Lazy coordination auto-wiring (broker and pubsub) on `SwarmGraph`
- Handoff execution: agent `Handoff` returns converted to `Command` routing
- Retry-then-fallback: agents retry on failure, fall back to supervisor
- Replaced `SwarmLLM` with `ONIModel` in supervisor routing
- `ONIModel` interface and core LLM types
- Anthropic LLM adapter
- OpenAI LLM adapter
- Ollama LLM adapter
- Google Gemini LLM adapter
- Models export path and re-exports
- Tool framework with `defineTool()` and `ToolContext`
- `AgentContext` and agent types
- `defineAgent()` declarative agent factory
- `agent()` functional agent factory
- `addAgent()` on `StateGraph` to wire agent nodes
- Request/response and pub/sub coordination patterns
- Tool permission guardrails
- Budget tracking and cost control guardrails
- Content filtering guardrails
- Audit trail and guardrails exports
- `EventBus` for structured lifecycle events
- Guardrails and event bus wired into compile/execution pipeline
- Integration tests for agents, tools, and guardrails
- Removed direct `openai` dependency (adapters are now bring-your-own-client)
- `StateGraph` builder with `addNode()`, `addEdge()`, `addConditionalEdges()`
- Pregel execution engine (`ONIPregelRunner`) with parallel superstep execution
- `Command` and `Send` primitives for dynamic routing
- Subgraph support via `addSubgraph()`
- Functional API (`entrypoint`, `task`)
- `MemoryCheckpointer` and `NoopCheckpointer`
- `InMemoryStore` and `NamespacedStore` for shared key-value state
- Human-in-the-loop via `interrupt()` and `getUserInput()`
- Token streaming with `emitToken()` and `TokenStreamWriter`
- Stream modes: `values`, `updates`, `events`, `messages`, `custom`
- Graph inspection via `getGraphDef()`
- `createReactAgent` prebuilt for tool-calling loops
- `ToolNode` prebuilt for automatic tool dispatch
- Messages reducer (`messagesReducer`, `addMessages`, `removeMessages`)
- Retry policies with configurable backoff
- Swarm primitives: `AgentRegistry`, `AgentPool`, `Mailbox`, `Supervisor`
- Ephemeral channel support
- Map-reduce parallel fan-out pattern