| rule_format | Make each rule a high-level, concise, short, and clear one-liner. |
|---|---|
- README.md - For humans developing the FireGen Extension (setup, quick start, deployment)
- ARCHITECTURE.md - For AI agents developing the FireGen Extension (system design, patterns, technical deep-dive)
- LLMS.md - For AI agents consuming/integrating FireGen Extension into their applications (API reference, job schemas)
- AGENTS.md (this file) - Working directory rules for AI agents (Claude Code, Gemini CLI, etc.)
- TEST.md - Test data examples
- LLMS.md is for external consumers - AI agents integrating FireGen use LLMS.md for API schemas, job structure, and integration examples (consumer perspective: "how to USE")
- README.md is for internal developers - Human developers building/deploying FireGen use README.md for setup, installation, and development workflow (builder perspective: "how to BUILD")
This is the most important interface to end-users. All changes must maintain this structure.
```ts
firegen-jobs/{jobId}/
  // Core job data
  uid: string    // User ID
  model: string  // Model identifier (e.g., "veo-3.1-fast-generate-preview")
  status: "requested" | "starting" | "running" | "succeeded" | "failed" | "expired" | "canceled"

  // Model communication (raw data)
  request: Record<string, unknown>    // Raw request sent to model API
  response?: Record<string, unknown>  // Raw response from model API (includes tokens, safety, etc.)

  // Generated files (user access)
  files?: [
    {
      name: string       // Filename (e.g., "file0.mp4", "file1.png")
      gs: string         // GCS URI (gs://bucket/path/file0.mp4)
      https: string      // Signed URL (expires in 24h)
      mimeType?: string  // e.g., "video/mp4", "image/png"
      size?: number      // File size in bytes
    }
  ]

  // Errors
  error?: {
    code: string     // Error code (e.g., "VALIDATION_ERROR")
    message: string  // Human-readable error message
    details?: Record<string, unknown>  // Additional error context
  }

  // AI Assisted Mode only
  assisted?: {
    prompt: string     // Original user prompt (AI-assisted mode only)
    reasons: string[]  // AI reasoning chain (AI-assisted mode only)
  }

  // Metadata
  metadata: {
    version: string    // FireGen version
    createdAt: number  // Job creation timestamp (ms)
    updatedAt: number  // Last update timestamp (ms)

    // Polling metadata (async operations only)
    operation?: string  // Vertex AI operation name
    attempt?: number    // Poll attempts
    nextPoll?: number   // Next poll timestamp (ms)
    ttl?: number        // Job expiration timestamp (ms)
    lastError?: number  // Last error timestamp (ms)
  }
```

Key Design Principles:
- `request` and `response` are raw - exact original REST API parameters and responses of the model, no transformation
- `files` is user-facing - clean access URLs with sequential naming (file0, file1, file2...)
- `error` is for FireGen system errors - model errors stay in `response.error`
- `model` at root level - enables efficient querying by model type
- `assisted` is optional - only present for AI-assisted jobs
- `metadata` contains everything else - all temporal and diagnostic data in one namespace
- Use TypeScript as much as possible; it is the preferred primary language.
- Use `npx tsx` to run TypeScript code.
- Use `pnpm` instead of `npm` to manage dependencies.
- Install `package.json` dependencies using `pnpm add` with the `@latest` version.
- Never use keyword/pattern matching to solve problems; always use AI semantic understanding.
- When solving a problem, apply first-principles thinking: break it into smaller parts and solve each part step by step. Search online for similar problems and solutions.
- Always use semantic understanding and AI-native techniques for prompt tagging and classification.
- Use `zod` for schema definition and validation.
- Never use hard-coded keyword/substring rule engines for prompt tagging or classification.
- Create temp files (e.g., log files, debug code files) in the `/tmp` directory.
- Avoid git operations (clone, commit, push, pull, etc.) unless the user approves.
- Use `gcloud auth application-default login` to authenticate gcloud API requests in the development environment.
- Always read file content before editing, especially for front matter or top-of-file modifications, to avoid duplication.
- Error Logging: Always serialize Error objects before logging - use `serializeError(err)` from `lib/error-utils` instead of raw `err`, because Error objects have non-enumerable properties and serialize to `{}` in JSON/Firebase Functions logger.
- Mirror Vertex AI REST schemas exactly in every model request/response validator - no optional or extra fields beyond the official docs.
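A minimal sketch of the error-logging rule. The `serializeError` below is an illustrative stand-in for the project's `lib/error-utils` helper, not its actual implementation:

```typescript
// Illustrative stand-in for serializeError from lib/error-utils.
function serializeError(err: unknown): Record<string, unknown> {
  if (err instanceof Error) {
    // name/message/stack are non-enumerable, so copy them explicitly.
    return { name: err.name, message: err.message, stack: err.stack };
  }
  return { message: String(err) };
}

const err = new Error("quota exceeded");
console.log(JSON.stringify(err));                  // "{}" - fields are non-enumerable
console.log(JSON.stringify(serializeError(err)));  // includes name, message, stack
```

This is why logging a raw `err` produces an empty object in JSON-based loggers.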
- Explicit Exports Only - Never use `export * from`; always use explicit `export { X, Y, Z } from` syntax for clarity and intentionality.
- Minimal Public Interface - Treat each folder as a module; only export what external modules actually use; keep everything private by default; analyze actual usage before exporting.
- Latest versions only: FireGen always uses the newest model versions (Veo, Gemini, etc.)
- When new versions release: Old versions are removed from AI analyzer but kept in codebase for backward compatibility
- Example: When Veo 3.1 releases, Veo 3.0 is hidden from the AI analyzer (AI-assisted mode) but still works for direct API calls (explicit mode)
- AI Analyzer Tier: Only exposes the latest model versions
- Explicit Request Tier: Supports all model versions (new and old)
- AI Analyzer: Remove all old versions from AI hints - only latest versions visible
- Codebase: Keep old model adapters and schemas - don't delete them
- Tests: Update to expect new versions - remove tests for old versions
- Documentation: Mark old versions as "explicit requests only"
- Dependencies: Always use the `@latest` SDK version.
- Standalone over Inheritance - Use separate classes per model to enable parallel AI modifications without conflicts
- One Model = One Complete File - Each model adapter is self-contained with all necessary code
- No Version Flags - Create separate files for new versions instead of if-else branches (e.g., `veo-3.1.ts`, not an `isVeo31` flag).
- Single Responsibility - Each class handles one concern only.
- Open/Closed - New model versions = new files, not modifications to existing base classes
- Duplication over Coupling - Each schema file is self-contained; duplicating code is better than sharing schemas across models
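The architecture principles above can be sketched as follows. Class names, file paths, and request shapes are hypothetical, not the actual codebase:

```typescript
// Hypothetical sketch of one-adapter-per-model: each version is a complete,
// separate class (and file) rather than a shared base class with version flags.

// models/veo-3.0.ts (illustrative)
class Veo30Adapter {
  readonly model = "veo-3.0-generate-preview";
  buildRequest(prompt: string) {
    return { instances: [{ prompt }], parameters: { durationSeconds: 8 } };
  }
}

// models/veo-3.1.ts (illustrative) - a new version is a new file, even if
// most of the code is duplicated: duplication over coupling.
class Veo31Adapter {
  readonly model = "veo-3.1-fast-generate-preview";
  buildRequest(prompt: string) {
    return { instances: [{ prompt }], parameters: { durationSeconds: 8 } };
  }
}

console.log(new Veo31Adapter().model); // "veo-3.1-fast-generate-preview"
```

Because the classes share nothing, parallel AI edits to two model files can never conflict with each other.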
- Use actual REST API model names everywhere
- AI Hints: Primary reference must be actual model name (can mention nickname in parentheses for context only)
- Use `zod` for schema definition and validation.
- Each `.schema.ts` file must be completely self-contained - NEVER import schemas from other models; duplicate code instead to enable independent evolution.
- Schema structure must match the official Vertex AI REST API exactly - no transformations or additions.
- Schemas use semantically correct types - e.g., `durationSeconds: z.union([z.literal(4), z.literal(6), z.literal(8)])` uses integers because the API accepts integers, not strings.
- Both Request AND Response schemas must be defined in the same `.schema.ts` file.
- Types are inferred from schemas using `z.infer<typeof Schema>` - never manually duplicate type definitions.
- Use `schema.parse(request)` for validation in all model adapters.
- AI hints are auto-generated from schemas using `zodToJsonSchema()` - never hardcode JSON examples.
- Tests must expect REST API format - use `expect.any()` for AI-chosen values only.
- Export schemas publicly from `.schema.ts` and re-export from the main model file for easy imports.
- Schema Layer - `.schema.ts` describes WHAT fields do functionally (capabilities, valid values, constraints, relationships).
- AI Hints Layer - `ai-hints.ts` describes HOW users express intent (language patterns, decision rules, prompt examples).
- Golden Rule - Schema descriptions are functional documentation; user language patterns belong exclusively in AI hints.
- Never mix layers - NEVER include "user says..." patterns or keyword matching guidance in schema descriptions
AI hints must be auto-generated from Zod schemas to prevent drift:
- Use `zodToJsonSchema()` - Convert the Zod schema to JSON Schema format for AI hints (one function call only).
- No hardcoded JSON examples - Schema structure comes from Zod, not manual JSON strings.
- Update schema → hints update automatically - Changing Zod schema immediately updates AI guidance
- Concise over Verbose - Shortest possible prompts that remain logical and complete
- No Common Sense - Don't state obvious information LLMs already know
- No Keyword Matching - Use semantic understanding, not hard-coded keyword rules
- Action-Oriented - Write direct actions ("Extract gs:// URIs" not "You should extract...")
- Comprehensive in Primary - Full documentation in main model variant, minimal hints in secondary variants
- Critical Rules First - Most important detection rules at the top
- One AI Hints Prompt - In the file `models/{modelId}/ai-hints.ts`, export only one string constant, named `{MODEL}_AI_HINTS`, for AI agents to use.
- Incremental Fixes - Address test failures one category at a time
- Always Validate - Run full test suite after structural changes
- Balance Brevity - Hints must be concise but effective enough to guide AI correctly
- Multi-step AI pipelines return structured data + reasoning - Each AI step returns both typed data and reasoning chain for transparency
- Use JSON mode with schema validation - Always use AI's native JSON mode with schema definitions for type-safe outputs
- System instructions omit the role field - System instruction format is `{parts: [{text}]}`, not `{role: "user", parts: [{text}]}`.
- Pre/post-process layers are deterministic - Separate pure logic (URL extraction, validation) from AI calls.
- Semantic tagging for AI context - Convert complex data (URLs, references) to semantic tags before AI processing, restore after validation
- Single schema per AI generation step - Never combine multiple schemas in one generation - focus AI on one schema at a time
- Explicit data flow between steps - Return structured data explicitly instead of parsing AI text output with regex
- Reasoning chains are flat string arrays - Each reasoning line is a separate array item, not nested objects or concatenated strings
- One-liner reasoning format - Each reason must be concise: `"<decision>: <value> → <reason>"`.
- Reasoning feeds forward through the pipeline - Each step receives all previous reasoning as context.
- No keyword extraction from reasoning - Trust AI to infer semantically from reasoning text; avoid parsing with regex
- Relaxed validation for intermediate representations - Use lenient schema validation for tagged/intermediate formats, strict validation for final output
- Two-stage validation pattern - Validate intermediate format during generation, validate final format after transformation
- Validate immediately after AI generation - Use schema validation right after each AI step, don't defer
- Explicit state tracking between steps - Return critical state (like model selection) as structured data, not embedded in reasoning text
- Immutable prompts across steps - Don't add metadata labels to prompts; concatenate raw content only
- Expert persona for schema-based generation - Use domain expert system prompts for structured output generation tasks
- Step files are numbered and self-contained - Name pipeline steps `step1-*.ts`, `step2-*.ts`, each with a clear single responsibility.
- Orchestrator coordinates without logic - The main orchestrator file coordinates pipeline flow but contains no business logic.
- README in each major module - Document architecture, data flow, and design principles in module-level README
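A minimal sketch of these pipeline rules (step names, fields, and reasoning text are hypothetical; real steps would call an AI model instead of returning constants):

```typescript
// Each step returns typed data plus a flat string[] reasoning chain.
type StepResult<T> = { data: T; reasons: string[] };

// step1-select-model.ts (sketch): explains its decision as one-liners.
function step1SelectModel(prompt: string): StepResult<{ model: string }> {
  const model = "veo-3.1-fast-generate-preview"; // really chosen by an AI call
  return { data: { model }, reasons: [`model: ${model} → prompt asks for video`] };
}

// step2-build-request.ts (sketch): receives all prior reasoning as context
// and passes the accumulated chain forward.
function step2BuildRequest(
  prompt: string,
  model: string,
  priorReasons: string[],
): StepResult<{ request: Record<string, unknown> }> {
  return {
    data: { request: { instances: [{ prompt }] } },
    reasons: [...priorReasons, `durationSeconds: 8 → default length`],
  };
}

// Orchestrator: coordinates the flow, contains no business logic itself.
function runPipeline(prompt: string) {
  const s1 = step1SelectModel(prompt);
  const s2 = step2BuildRequest(prompt, s1.data.model, s1.reasons);
  return { model: s1.data.model, request: s2.data.request, reasons: s2.reasons };
}

const out = runPipeline("a cat surfing at sunset");
console.log(out.reasons); // two one-liner reasons, fed forward step to step
```

Note that critical state (the chosen model) travels as structured data between steps, never parsed back out of the reasoning text.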
- Run `pnpm run build` and `pnpm run test` to ensure everything works.