Skip to content

Latest commit

 

History

History
540 lines (406 loc) · 20.8 KB

File metadata and controls

540 lines (406 loc) · 20.8 KB

Supertag CLI - Claude Code Context

SpecFlow Development

When implementing specs in .specify/specs/, invoke the SpecFlow skill.

This project uses spec-driven development. Specs are stored in .specify/specs/<feature-id>/ with:

  • spec.md - What and Why (requirements)
  • plan.md - How (technical design)
  • tasks.md - Work items with TDD enforcement

Triggers: "work on F-XXX", "implement F-XXX", "/specflow"

Documentation Locations

When releasing a new version, update ALL of these:

File Purpose Location
CHANGELOG.md Internal version history (detailed) ./CHANGELOG.md
README.md Technical documentation, CLI usage, MCP setup ./README.md
SKILL.md PAI skill documentation with USE WHEN triggers ./SKILL.md
CHANGELOG.md Public release notes (customer-facing) ~/work/web/invisible-store/tana/CHANGELOG.md
USER-GUIDE.md Customer-facing user guide ~/work/web/invisible-store/tana/USER-GUIDE.md
Marketing description Store listing and marketing copy ~/work/web/invisible-store/tana/index.html

Release Checklist

IMPORTANT: Update CHANGELOG.md BEFORE running release.sh

  1. Update CHANGELOG.md - Change [Unreleased] to [X.Y.Z] - YYYY-MM-DD
  2. Update version number in package.json
  3. Run bun run typecheck - Ensure TypeScript type checks pass
  4. Run bun run test:full - Ensure all tests pass
  5. Run ./release.sh X.Y.Z --push to build, tag, and push
  6. Update public CHANGELOG.md at ~/work/web/invisible-store/tana/CHANGELOG.md
  7. Update store listing if features changed

Note: The release script updates package.json version automatically if you pass a version argument. Step 2 can be skipped if using ./release.sh X.Y.Z.

PR Checklist

IMPORTANT: Run these checks before pushing a PR:

  1. Run bun run typecheck - TypeScript types must pass
  2. Run bun run test - Fast tests must pass (1741+ tests)
  3. Push and verify CI passes (GitHub Actions runs full test suite)

Key Architecture

  • Main CLI: supertag - query, write, sync, server, workspaces
  • Export CLI: supertag-export - Playwright browser automation
  • MCP Server: supertag-mcp - AI tool integration via Model Context Protocol

Important Technical Notes

Tana Input API Inline References

Two ways to create references via Input API:

  1. Inline reference in text (within node name or child text):

    <span data-inlineref-node="NODE_ID">Display Text</span>

    Example payload:

    {"name": "Meeting with <span data-inlineref-node=\"abc123\">John Doe</span> today"}
  2. Child reference node (entire child is a reference):

    {"children": [{"dataType": "reference", "id": "NODE_ID"}]}

IMPORTANT: Do NOT end a node with an inline reference - always add text after the closing </span> tag.

  • "Meeting with <span data-inlineref-node=\"id\">John</span> today"
  • "Meeting with <span data-inlineref-node=\"id\">John</span>"

Note: Tana Paste syntax ([[Node Name]], [[text^id]]) does NOT work in Input API - use the HTML span syntax above.

See src/mcp/tools/create.ts for implementation.

Config Namespace

Uses supertag namespace (not tana) to avoid conflicts with official Tana app:

  • Config: ~/.config/supertag/config.json
  • Data: ~/.local/share/supertag/
  • Cache: ~/.cache/supertag/

Export Format

Tana exports now wrap data in storeData object. The schema registry handles both formats.

Export Location

Tana JSON exports are stored at: ~/Documents/Tana-Export/main/ Files are named: {workspaceId}@{date}.json (e.g., M9rkJkwuED@2025-12-12.json)

Entity Detection (_flags)

Based on Tana developer insights from Odin Urdland:

  • Entity flag: props._flags % 2 === 1 (LSB set = entity) - NOTE: uses _flags with underscore prefix
  • User override: props._entityOverride (takes precedence if present)
  • Entities are "interesting" nodes: tagged items, library items, "Create new" items
  • Export contains ~13,735 entities with _flags=1 out of 1.3M total nodes

Entity Detection Priority (in order):

  1. props._entityOverride - Explicit user override (if true/false, use that)
  2. props._flags % 2 === 1 - Automatic entity flag from Tana
  3. props._ownerId.endsWith('_STASH') - Library items (inferred)
  4. Has tag in tag_applications table - Tagged items (inferred)

Key Files:

  • src/db/entity.ts - Entity detection functions (isEntity, isEntityById, findNearestEntityAncestor)
  • src/types/tana-dump.ts - Zod schema with _flags and .passthrough() to preserve props
  • tests/entity-detection.test.ts - Comprehensive entity detection tests

Field/Tuple Structures

Tana stores field values in tuple nodes. There are two patterns:

  1. Standard Field Tuples (extracted ✅):

    • Parent → Tuple → [Label, Value1, Value2, ...]
    • May or may not have _sourceId
    • Up to 50 children
  2. Mega-Tuple Flat Structure (not extracted ❌):

    • Single tuple with 50-1000+ children
    • Field labels: " - FieldName:" (indentation in name)
    • Values: " - Value text" (more indentation)
    • Labels and values are siblings, not parent-child
    • Used by daily briefings/AI features

Key insight: The _sourceId field is often missing on valid tuples. The isFieldTuple() function handles both cases.

See: docs/TANA-FIELD-STRUCTURES.md for full technical documentation.

Content Filtering for Embeddings

When generating embeddings, content is filtered to focus on meaningful nodes:

Default Filters (src/embeddings/content-filter.ts):

  • minLength: 15 - Minimum 15 characters (but entities bypass this)
  • excludeTimestamps: true - Exclude 1970-01-01... artifacts
  • excludeSystemTypes: true - Exclude system docTypes (tuple, metanode, viewDef, etc.)

Important: Entities bypass the minLength filter because short-named entities like "Animal #topic" are still meaningful. This ensures tagged items and library items always get embedded regardless of name length.

CLI Options for embed generate:

  • --min-length <n> - Override minimum length (default: 15)
  • --include-all - Bypass all content filters
  • --include-timestamps - Include timestamp nodes
  • --include-system - Include system docTypes
  • -t, --tag <tag> - Only embed nodes with specific supertag

Graph Query DSL (F-102)

Declarative graph query language for traversing typed relationships.

CLI: supertag gquery "<DSL>" [--explain] [--format json|csv|table] [--limit N]

DSL syntax:

FIND <supertag> [WHERE <field> <op> <value>]* [CONNECTED TO <supertag> [VIA <field>]]* [DEPTH <n>] RETURN <projection>

Examples:

supertag gquery "FIND person RETURN name"
supertag gquery 'FIND person WHERE name ~ Simon RETURN name'
supertag gquery 'FIND meeting CONNECTED TO person VIA Attendees RETURN name, person.name'
supertag gquery 'FIND person CONNECTED TO meeting RETURN name' --explain

MCP tool: tana_graph_query with query, workspace, limit, explain parameters.

Architecture: 4-stage pipeline — Tokenizer → Parser → Planner (validates tags/fields against DB) → Executor (orchestrates UnifiedQueryEngine + GraphTraversalService).

Key files:

  • src/query/graph-types.ts - Type definitions (GraphQueryAST, QueryPlan, QueryStep)
  • src/query/graph-parser.ts - Recursive descent parser
  • src/query/graph-planner.ts - Query planner with tag/field validation
  • src/query/graph-executor.ts - Executor via existing services
  • src/query/graph-query-service.ts - Service facade
  • src/commands/gquery.ts - CLI command
  • src/mcp/tools/graph-query.ts - MCP tool
  • tests/graph-parser.test.ts, tests/graph-planner.test.ts, tests/graph-executor.test.ts - Tests

Graph-Aware Embeddings (F-104)

Embedding generation enriches node text with supertag type and field values before embedding, improving semantic search for typed queries like "find meetings about X".

Enriched text format: [Type: #meeting] [Date: 2026-02-20] [Attendees: Daniel, Sarah] Weekly sync

CLI options for embed generate:

  • --graph-aware (default: enabled) - Enrich text with type/field prefixes
  • --no-graph-aware - Use legacy ancestor-based contextualization
  • --enrichment-preview <nodeId> - Preview enriched text for a single node without generating embeddings

Search query enrichment:

  • CLI: supertag search --semantic --type-hint meeting "weekly sync"
  • MCP: tana_semantic_search with typeHint: "meeting" parameter

Per-supertag config: ~/.config/supertag/embed-enrichment.json

{
  "defaults": {
    "includeTagName": true,
    "includeFields": ["options", "date", "instance"],
    "maxFieldsPerTag": 5
  },
  "overrides": {
    "meeting": { "includeFields": ["Date", "Attendees"], "maxFieldsPerTag": 3 },
    "person": { "includeFields": ["Email", "Company"] },
    "internal": { "disabled": true }
  }
}

When no config file exists, all tags are enriched with fields matching the default field types (default behavior).

Key files:

  • src/types/enrichment.ts - Type definitions and defaults
  • src/embeddings/graph-enricher.ts - Single-node and batch enrichment
  • src/embeddings/enrichment-config.ts - Config loader
  • src/embeddings/enrichment-truncator.ts - Token-aware truncation (512 token limit)
  • tests/embeddings/enrichment-*.test.ts - 25 unit tests

Workspace Database Paths

  • Workspace DB: ~/.local/share/supertag/workspaces/{alias}/tana-index.db
  • Default workspace: main
  • Full path: ~/.local/share/supertag/workspaces/main/tana-index.db

Always use the workspace-specific database, not the legacy path at ~/.local/share/supertag/tana-index.db.

Unified Workspace Resolver (Spec 052)

ALWAYS use resolveWorkspaceContext() for workspace resolution - never use resolveWorkspace() directly.

import { resolveWorkspaceContext } from '../config/workspace-resolver';

// Basic usage - uses default workspace
const ws = resolveWorkspaceContext();
console.log(ws.dbPath);  // ~/.local/share/supertag/workspaces/main/tana-index.db

// Specific workspace
const ws = resolveWorkspaceContext({ workspace: 'books' });

// For commands that create databases (sync, export)
const ws = resolveWorkspaceContext({
  workspace: options.workspace,
  requireDatabase: false,  // Don't throw if database doesn't exist
});

ResolvedWorkspace interface:

interface ResolvedWorkspace {
  alias: string;           // 'main', 'books', etc.
  config: WorkspaceConfig; // Full workspace config
  dbPath: string;          // Full path to SQLite database
  schemaPath: string;      // Full path to schema cache
  exportDir: string;       // Full path to export directory
  isDefault: boolean;      // Whether this is the default workspace
  nodeid?: string;         // Tana nodeid for API calls
  rootFileId: string;      // Tana rootFileId
}

Error handling:

  • WorkspaceNotFoundError - Workspace alias not in configuration
  • WorkspaceDatabaseMissingError - Database file doesn't exist (when requireDatabase: true)

MCP cache clear: Use tana_cache_clear tool or call clearWorkspaceCache() to refresh workspace data.

Query Builder Utilities (Spec 055)

ALWAYS use query builders for SQL construction - never build LIMIT/OFFSET or ORDER BY manually.

import { buildPagination, buildOrderBy, buildWhereClause } from '../db/query-builder';

// Pagination - returns { sql: "LIMIT ? OFFSET ?", params: [100, 0] }
const pagination = buildPagination({ limit: 100, offset: 0 });

// Order by - returns { sql: "ORDER BY created DESC", params: [] }
const orderBy = buildOrderBy({ sort: "created", direction: "DESC" }, []);

// Where clause - returns { sql: "WHERE status = ?", params: ["active"] }
const where = buildWhereClause([{ column: "status", operator: "=", value: "active" }]);

Usage pattern with sqlParts array:

const sqlParts = ["SELECT * FROM nodes WHERE field_name = ?"];
const params: SQLQueryBindings[] = [fieldName];

// Add ORDER BY
const orderBy = buildOrderBy({ sort: "created", direction: "DESC" }, []);
sqlParts.push(orderBy.sql);

// Add pagination
const pagination = buildPagination({ limit, offset });
if (pagination.sql) {
  sqlParts.push(pagination.sql);
  params.push(...(pagination.params as SQLQueryBindings[]));
}

return db.query(sqlParts.join(" ")).all(...params);

Available functions:

  • buildPagination({ limit?, offset? }) - LIMIT/OFFSET with safe defaults
  • buildOrderBy({ sort, direction }, allowedColumns) - ORDER BY with column whitelist validation
  • buildWhereClause(conditions) - WHERE with =, !=, >, <, LIKE, IN, IS NULL operators
  • buildSelectQuery(options) - Complete SELECT query composer

Key files:

  • src/db/query-builder.ts - Implementation (53 tests, 108 assertions)
  • Files using builders: field-query.ts, field-values.ts, tana-query-engine.ts, search.ts

When NOT to use:

  • Simple IN clauses already embedded in template literals (e.g., WHERE id IN (${placeholders})) - these are already safe with parameterized queries

Running from Source vs Binary

  • Binary: ./supertag - Compiled, may not have latest schema changes
  • Source: bun run src/index.ts - Always has latest code

After schema changes (like adding _flags support), you must either:

  1. Run from source: bun run src/index.ts sync
  2. Rebuild binary: ./scripts/build.sh

Testing Workflow

IMPORTANT: Use fast tests during development, full suite only before pushing.

bun run test          # Fast tests only (~12s) - use during development
bun run test:full     # ALL tests (~200s) - run before pushing/releasing
bun run test:slow     # Slow tests only

Do NOT run bun test directly - it runs ALL tests including slow ones. Use bun run test instead.

The build script (./scripts/build.sh) runs fast tests automatically.

Building After Implementation

IMPORTANT: After implementing any code changes, rebuild the binary:

./scripts/build.sh           # Build if source changed (runs fast tests first)
./scripts/build.sh --force   # Force rebuild
./scripts/build.sh --check   # Check if rebuild needed

The build script:

  1. Runs fast tests first (fails build if tests fail)
  2. Only rebuilds if source files changed (unless --force)
  3. Compiles to standalone supertag binary

Universal Format Options (Spec 060)

All query commands support 6 output formats via --format <type>:

Format Description Use Case
table Human-readable with emojis Interactive terminal use
json Pretty-printed JSON array API integration, jq processing
csv RFC 4180 compliant CSV Excel, spreadsheets
ids One ID per line xargs piping
minimal JSON with id, name, tags only Quick lookups
jsonl JSON Lines (streaming) Log processing, large datasets

Format resolution priority:

  1. --format <type> flag (highest)
  2. --json flag (legacy, maps to json)
  3. --pretty flag (legacy, maps to table)
  4. SUPERTAG_FORMAT environment variable
  5. Config file output.format setting
  6. Default: table (Unix-style TSV output)

Example usage:

# CSV export for spreadsheet
supertag search "meeting" --format csv > meetings.csv

# IDs for batch processing
supertag search --tag todo --format ids | xargs -I{} supertag nodes show {}

# JSON Lines for streaming
supertag nodes recent --format jsonl | while read -r line; do echo "$line" | jq .name; done

# Disable header row in CSV
supertag tags list --format csv --no-header

Commands supporting --format:

  • search (all modes: FTS, semantic, tagged)
  • nodes show, nodes refs, nodes recent
  • tags list, tags top, tags show, tags inheritance, tags fields
  • fields list, fields values, fields search

Key files:

  • src/utils/output-formatter.ts - Formatter implementations
  • src/utils/output-options.ts - Format resolution logic
  • tests/format-integration.test.ts - E2E format tests

Live Read Backend (F-097)

Read/search operations route through Tana's Local API when available, with SQLite fallback.

import { resolveReadBackend } from '../api/read-backend-resolver';

// Get the best available backend (never throws)
const backend = await resolveReadBackend({ workspace: 'main' });
const results = await backend.search('meeting notes', { limit: 20 });
const node = await backend.readNode('nodeId', 2); // depth=2
console.log(node.markdown);

Resolution order: --offline flag → cached backend → Local API healthy → SQLite fallback.

Key files:

  • src/api/read-backend.ts - TanaReadBackend interface + canonical types
  • src/api/local-api-read-backend.ts - Local API implementation
  • src/api/sqlite-read-backend.ts - SQLite implementation (wraps TanaQueryEngine + show.ts)
  • src/api/read-backend-resolver.ts - Backend resolver (never throws, session-cached)

CLI helper:

import { resolveReadBackendFromOptions } from './helpers';

// In command action handlers:
const backend = await resolveReadBackendFromOptions(options); // respects --offline, --workspace

What routes through the read backend:

  • FTS search (supertag search, tana_search MCP tool)
  • Node content (supertag nodes show, tana_node MCP tool)

What stays on SQLite:

  • Semantic search (embeddings are local-only)
  • Tagged search, nodes recent, tags list (need structured query support)
  • Reference graph, field queries, aggregation

Watch Mode (F-103)

Continuous monitoring of Tana changes with hook-based automation.

CLI: supertag sync watch [--interval <s>] [--filter-tag <tag>] [--on-change <cmd>] [--on-create <cmd>] [--on-modify <cmd>] [--on-delete <cmd>] [--event-log <path>] [--dry-run] [--max-failures <n>]

Architecture: Pre/post snapshot diffing around DeltaSyncService. Each poll cycle:

  1. Captures snapshot of current node state
  2. Runs delta-sync to fetch changes from Tana Local API
  3. Captures post-sync snapshot
  4. Diffs to detect creates, modifies, deletes
  5. Dispatches shell hooks with change details via env vars

Key files:

  • src/watch/watch-service.ts - Main watch loop with backoff
  • src/watch/differ.ts - Snapshot diffing (creates/modifies/deletes)
  • src/watch/snapshot.ts - Pre/post snapshot capture
  • src/watch/hook-runner.ts - Shell hook execution ("never throws" pattern)
  • src/watch/event-logger.ts - JSONL event logging
  • src/commands/sync.ts - CLI command registration (lines 643-735)
  • tests/watch/ - 6 test files, 70 tests

Design patterns: Interface narrowing for DI (Pick<Service, 'method'>), exponential backoff with cap, JSONL event logs, "never throws" for side effects.

Error Handling System (Spec 073)

Structured Errors - All errors extend StructuredError with consistent structure:

import { StructuredError } from '../utils/structured-errors';

// Creating errors
throw new StructuredError("WORKSPACE_NOT_FOUND", "Workspace 'test' not found", {
  details: { requestedWorkspace: "test", availableWorkspaces: ["main"] },
  suggestion: "Try one of: main",
  recovery: { canRetry: false, alternatives: ["main"] },
});

// For workspace errors, use specialized classes:
import { WorkspaceNotFoundError, WorkspaceDatabaseMissingError } from '../config/workspace-resolver';
throw new WorkspaceNotFoundError("test", ["main", "books"]);
throw new WorkspaceDatabaseMissingError("test", "/path/to/db");

Error Codes:

  • WORKSPACE_NOT_FOUND - Workspace alias not in config
  • DATABASE_NOT_FOUND - Database file missing
  • TAG_NOT_FOUND - Supertag doesn't exist
  • NODE_NOT_FOUND - Node ID not found
  • API_ERROR - Tana API request failed
  • VALIDATION_ERROR - Input validation failed
  • CONFIG_NOT_FOUND - Config file missing
  • UNKNOWN_ERROR - Unclassified error

Formatting for different contexts:

import { formatErrorForCli, formatErrorForMcp } from '../utils/error-formatter';

// CLI output (human-readable with colors)
console.error(formatErrorForCli(error));

// MCP output (structured JSON for AI agents)
return { error: formatErrorForMcp(error) };

Debug mode:

import { isDebugMode, setDebugMode, formatDebugError } from '../utils/debug';

// Enable debug mode (shows stack traces)
setDebugMode(true);

// Format with debug info when in debug mode
console.error(formatDebugError(error));

MCP error handling:

import { handleMcpError } from '../mcp/error-handler';

try {
  const result = await someOperation();
  return { content: [{ type: 'text', text: JSON.stringify(result) }] };
} catch (error) {
  return handleMcpError(error);  // Returns { isError: true, content: [...] }
}

Key files:

  • src/utils/structured-errors.ts - StructuredError class
  • src/utils/error-formatter.ts - CLI and MCP formatters
  • src/utils/error-registry.ts - Error code registry
  • src/utils/debug.ts - Debug mode utilities
  • src/mcp/error-handler.ts - MCP error handling
  • src/config/workspace-resolver.ts - Workspace error classes