This document explains the technical design decisions, architecture, and implementation details of the OpenAI Codex OAuth plugin for OpenCode.
- Architecture Overview
- Stateless vs Stateful Mode
- Message ID Handling
- Reasoning Content Flow
- Request Pipeline
- Comparison with Codex CLI
- Design Rationale
```
┌─────────────┐
│  OpenCode   │  TUI/Desktop client
└──────┬──────┘
       │
       │ streamText() with AI SDK
       │
       ▼
┌──────────────────────────────┐
│  OpenCode Provider System    │
│  - Loads plugin              │
│  - Calls plugin.auth.loader()│
│  - Passes provider config    │
└──────┬───────────────────────┘
       │
       │ Custom fetch()
       │
       ▼
┌──────────────────────────────┐
│  This Plugin                 │
│  - AccountManager            │
│  - Multi-account rotation    │
│  - OAuth authentication     │
│  - Request transformation    │
│  - store:false handling      │
│  - Codex bridge prompts      │
│  - Toast notifications       │
└──────┬───────────────────────┘
       │
       │ HTTP POST with OAuth
       │
       ▼
┌──────────────────────────────┐
│  OpenAI Codex API            │
│  (ChatGPT Backend)           │
│  - Requires OAuth            │
│  - Supports store:false      │
│  - Returns SSE stream        │
└──────────────────────────────┘
```
The AccountManager handles multiple ChatGPT accounts with automatic rotation:
```typescript
class AccountManager {
  // Core state
  private accounts: ManagedAccount[] = [];
  private activeIndex = 0;
  private config: MultiAccountConfig;

  // Key methods
  async loadFromDisk(): Promise<void>              // Load accounts from JSON
  async importFromOpenCodeAuth(): Promise<void>    // Import from legacy auth
  async addAccount(...): Promise<ManagedAccount>   // Add new account
  async getNextAvailableAccount(model?): Promise<ManagedAccount | null>
  markRateLimited(account, retryAfterMs, model?)
  async ensureValidToken(account): Promise<boolean>
}
```

```
1. Request comes in with model name
   │
   ├─▶ getNextAvailableAccount(model)
   │     │
   │     ├─▶ Check current account availability
   │     │     ├─ consecutiveFailures < 3?
   │     │     ├─ globalRateLimitReset expired?
   │     │     └─ perModelRateLimit[model] expired?
   │     │
   │     ├─▶ If available: use current account
   │     │
   │     └─▶ If not: try next accounts in order
   │           │
   │           └─▶ If all rate limited: return least-limited
   │
   ├─▶ ensureValidToken(account)
   │     │
   │     ├─▶ Check expiration (5 min proactive refresh)
   │     └─▶ Refresh if needed
   │
   └─▶ executeRequest(account, input, init)
         │
         ├─▶ On 429: markRateLimited() + try next account
         ├─▶ On 401: markRefreshFailed() + try next account
         └─▶ On success: return response
```
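The selection logic above can be sketched roughly as follows. This is a simplified illustration, not the plugin's actual implementation: the `ManagedAccount` fields mirror the account-store schema, while `isAvailable` and `pickAccount` are hypothetical helper names.

```typescript
// Simplified sketch of the account-rotation logic described above.
interface ManagedAccount {
  email: string;
  consecutiveFailures: number;
  globalRateLimitReset: number | null;     // epoch ms; null = not limited
  rateLimitResets: Record<string, number>; // per-model reset times (epoch ms)
}

function isAvailable(acct: ManagedAccount, model: string, now: number): boolean {
  if (acct.consecutiveFailures >= 3) return false;
  if (acct.globalRateLimitReset !== null && acct.globalRateLimitReset > now) return false;
  const modelReset = acct.rateLimitResets[model];
  return modelReset === undefined || modelReset <= now;
}

// Try the active account first, then the rest in order; if every account is
// limited, fall back to the one whose limit expires soonest ("least-limited").
function pickAccount(
  accounts: ManagedAccount[],
  activeIndex: number,
  model: string,
  now = Date.now(),
): ManagedAccount | null {
  if (accounts.length === 0) return null;
  for (let i = 0; i < accounts.length; i++) {
    const acct = accounts[(activeIndex + i) % accounts.length];
    if (isAvailable(acct, model, now)) return acct;
  }
  const resetTime = (a: ManagedAccount) =>
    Math.max(a.globalRateLimitReset ?? 0, a.rateLimitResets[model] ?? 0);
  return accounts.reduce((best, acct) => (resetTime(acct) < resetTime(best) ? acct : best));
}
```

The "least-limited" fallback means a fully rate-limited pool still yields a usable account as soon as any window expires, instead of failing the request outright.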
```json
{
  "version": 1,
  "accounts": [
    {
      "index": 0,
      "email": "user@example.com",
      "userId": "user-xxx",
      "accountId": "acct-xxx",
      "planType": "plus",
      "addedAt": 1705000000000,
      "lastUsed": 1705001000000,
      "parts": {
        "refreshToken": "..."
      },
      "access": "eyJ...",
      "expires": 1705003600000,
      "rateLimitResets": {
        "gpt-5.2-codex": 1705002000000
      },
      "globalRateLimitReset": null,
      "consecutiveFailures": 0
    }
  ],
  "activeAccountIndex": 0
}
```

| Variable | Description | Default |
|---|---|---|
| `OPENCODE_OPENAI_QUIET` | Disable toast notifications | Off |
| `OPENCODE_OPENAI_DEBUG` | Enable debug logging | Off |
| `OPENCODE_OPENAI_STRATEGY` | `sticky`, `round-robin`, `hybrid` | `sticky` |
| `OPENCODE_OPENAI_PID_OFFSET` | PID-based account offset | Off |
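A minimal sketch of how variables like these are typically read. The variable names and defaults come from the table; the parsing helpers themselves are illustrative, not the plugin's actual code.

```typescript
// Illustrative helpers for reading the environment variables listed above.
type Strategy = "sticky" | "round-robin" | "hybrid";

function readStrategy(env: Record<string, string | undefined>): Strategy {
  const raw = env["OPENCODE_OPENAI_STRATEGY"]?.toLowerCase();
  if (raw === "round-robin" || raw === "hybrid") return raw;
  return "sticky"; // default per the table above
}

// Boolean flags like OPENCODE_OPENAI_QUIET / OPENCODE_OPENAI_DEBUG:
// treat any non-empty value other than "0"/"false" as enabled.
function isFlagSet(env: Record<string, string | undefined>, name: string): boolean {
  const raw = env[name];
  return raw !== undefined && raw !== "" && raw !== "0" && raw !== "false";
}
```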
The plugin uses `store: false` (stateless mode) because:

- ChatGPT Backend Requirement (confirmed via testing):

  ```
  // Attempt with store:true → 400 Bad Request
  {"detail":"Store must be set to false"}
  ```

- Codex CLI Behavior (`tmp/codex/codex-rs/core/src/client.rs:215-232`):

  ```rust
  // Codex CLI uses store:false for ChatGPT OAuth
  let azure_workaround = self.provider.is_azure_responses_endpoint();
  store: azure_workaround, // false for ChatGPT, true for Azure
  ```
Key Points:
- ✅ ChatGPT backend REQUIRES store:false (not optional)
- ✅ Codex CLI uses store:false for ChatGPT
- ✅ Azure requires store:true (different endpoint, not supported by this plugin)
- ✅ Stateless mode = no server-side conversation storage
Question: If there's no server storage, how does the LLM remember previous turns?
Answer: Full message history is sent in every request:
```typescript
// Turn 3 request contains ALL previous messages:
input: [
  { role: "developer", content: "..." },           // System prompts
  { role: "user", content: "write test.txt" },     // Turn 1 user
  { type: "function_call", name: "write", ... },   // Turn 1 tool call
  { type: "function_call_output", ... },           // Turn 1 tool result
  { role: "assistant", content: "Done!" },         // Turn 1 response
  { role: "user", content: "read it" },            // Turn 2 user
  { type: "function_call", name: "read", ... },    // Turn 2 tool call
  { type: "function_call_output", ... },           // Turn 2 tool result
  { role: "assistant", content: "Contents..." },   // Turn 2 response
  { role: "user", content: "what did you write?" } // Turn 3 user (current)
]
// All IDs stripped, item_reference filtered out
```

Context is maintained through:
- ✅ Full message history (LLM sees all previous messages)
- ✅ Full tool call history (LLM sees what it did)
- ✅ `reasoning.encrypted_content` (preserves reasoning between turns)

Source: Verified via `ENABLE_PLUGIN_REQUEST_LOGGING=1` logs
| Aspect | store:false (This Plugin) | store:true (Azure Only) |
|---|---|---|
| ChatGPT Support | ✅ Required | ❌ Rejected by API |
| Message History | ✅ Sent in each request (no IDs) | Stored on server |
| Message IDs | ❌ Must strip all | ✅ Required |
| AI SDK Compat | ❌ Must filter item_reference | ✅ Works natively |
| Context | Full history + encrypted reasoning | Server-stored conversation |
| Codex CLI Parity | ✅ Perfect match | ❌ Different mode |
Decision: Use store:false (only option for ChatGPT backend).
OpenCode/AI SDK sends two incompatible constructs:
```typescript
// Multi-turn request from OpenCode
const body = {
  input: [
    { type: "message", role: "developer", content: [...] },
    { type: "message", role: "user", content: [...], id: "msg_abc" },
    { type: "item_reference", id: "rs_xyz" }, // ← AI SDK construct
    { type: "function_call", id: "fc_123" }
  ]
};
```

Two issues:
- `item_reference` - AI SDK construct for server state lookup (not in Codex API spec)
- Message IDs - Cause "item not found" with `store: false`
ChatGPT Backend Requirement (confirmed via testing):
{"detail":"Store must be set to false"}Errors that occurred:
❌ "Item with id 'msg_abc' not found. Items are not persisted when `store` is set to false."
❌ "Missing required parameter: 'input[3].id'" (when item_reference has no ID)
Filter AI SDK Constructs + Strip IDs (lib/request/request-transformer.ts:114-135):
```typescript
export function filterInput(input: InputItem[]): InputItem[] {
  return input
    .filter((item) => {
      // Remove AI SDK constructs not supported by Codex API
      if (item.type === "item_reference") {
        return false; // AI SDK only - references server state
      }
      return true; // Keep all other items
    })
    .map((item) => {
      // Strip IDs from all items (stateless mode)
      if (item.id) {
        const { id, ...itemWithoutId } = item;
        return itemWithoutId as InputItem;
      }
      return item;
    });
}
```

Why this approach?
- ✅ Filter `item_reference` - Not in Codex API, AI SDK-only construct
- ✅ Keep all messages - LLM needs full conversation history for context
- ✅ Strip ALL IDs - Matches Codex CLI stateless behavior
- ✅ Future-proof - No ID pattern matching, handles any ID format
The plugin logs ID filtering for debugging:
```typescript
// Before filtering
console.log(`[openai-codex-plugin] Filtering ${originalIds.length} message IDs from input:`, originalIds);

// After filtering
console.log(`[openai-codex-plugin] Successfully removed all ${originalIds.length} message IDs`);

// Or warning if IDs remain
console.warn(`[openai-codex-plugin] WARNING: ${remainingIds.length} IDs still present after filtering:`, remainingIds);
```

Source: lib/request/request-transformer.ts:287-301
Challenge: How to maintain context across turns when store:false means no server-side storage?
Solution: Use reasoning.encrypted_content
```typescript
body.include = modelConfig.include || ["reasoning.encrypted_content"];
```

How it works:
- Turn 1: Model generates reasoning, encrypted content returned
- Client: Stores encrypted content locally
- Turn 2: Client sends encrypted content back in request
- Server: Decrypts content to restore reasoning context
- Model: Has full context without server-side storage
Flow Diagram:
```
Turn 1:
  Client → [Request without IDs] → Server
  Server → [Response + encrypted reasoning] → Client
  Client stores encrypted content locally

Turn 2:
  Client → [Request with encrypted content, no IDs] → Server
  Server decrypts reasoning context
  Server → [Response + new encrypted reasoning] → Client
```
Codex CLI equivalent (tmp/codex/codex-rs/core/src/client.rs:190-194):
```rust
let include: Vec<String> = if reasoning.is_some() {
    vec!["reasoning.encrypted_content".to_string()]
} else {
    vec![]
};
```

Source: lib/request/request-transformer.ts:303
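The client side of this round-trip can be sketched as follows. Only `store: false` and the `include: ["reasoning.encrypted_content"]` field come from the document; the item shapes and the `buildTurnRequest` helper are simplified illustrations.

```typescript
// Sketch of the encrypted-reasoning round-trip in stateless mode.
// Shapes are simplified; only `store`/`include` reflect the real request.
interface ReasoningItem {
  type: "reasoning";
  encrypted_content: string; // opaque blob returned by the server
}

interface RequestBody {
  store: false;
  input: unknown[];
  include: string[];
}

// Turn N request: replay the full history plus any encrypted reasoning
// item the server returned on turn N-1.
function buildTurnRequest(
  history: unknown[],
  priorReasoning: ReasoningItem | null,
): RequestBody {
  const input = priorReasoning ? [...history, priorReasoning] : [...history];
  return {
    store: false,
    input,
    include: ["reasoning.encrypted_content"], // ask the server to return it again
  };
}
```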
```
1. Original OpenCode Request
   ├─ model: "gpt-5-codex"
   ├─ input: [{ id: "msg_123", ... }, { id: "rs_456", ... }]
   └─ tools: [...]

2. Model Normalization
   ├─ Detect codex/gpt-5/codex-mini variants
   └─ Normalize to "gpt-5", "gpt-5-codex", or "codex-mini-latest"

3. Config Merging
   ├─ Global options (provider.openai.options)
   ├─ Model-specific options (provider.openai.models[name].options)
   └─ Result: merged config for this model

4. Message ID Filtering
   ├─ Remove ALL IDs from input array
   ├─ Log original IDs for debugging
   └─ Verify no IDs remain

5. System Prompt Handling (CODEX_MODE)
   ├─ Filter out OpenCode system prompts
   ├─ Preserve OpenCode env + AGENTS instructions when concatenated
   └─ Add Codex-OpenCode bridge prompt

6. Orphan Tool Output Handling
   ├─ Match function_call_output to function_call OR local_shell_call
   ├─ Match custom_tool_call_output to custom_tool_call
   └─ Convert unmatched outputs to assistant messages (preserve context)

7. Reasoning Configuration
   ├─ Set reasoningEffort (minimal/low/medium/high)
   ├─ Set reasoningSummary (auto/detailed)
   └─ Based on model variant

8. Prompt Caching & Session Headers
   ├─ Preserve host-supplied prompt_cache_key (OpenCode session id)
   ├─ Add conversation + account headers for Codex debugging when cache key exists
   └─ Leave headers unset if host does not provide a cache key

9. Final Body
   ├─ store: false
   ├─ stream: true
   ├─ instructions: Codex system prompt
   ├─ input: Filtered messages (no IDs)
   ├─ reasoning: { effort, summary }
   ├─ text: { verbosity }
   ├─ include: ["reasoning.encrypted_content"]
   └─ prompt_cache_key: conversation-scoped UUID
```
Source: lib/request/request-transformer.ts:265-329
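Putting the pipeline together, the final body of step 9 looks roughly like this. The field names match the pipeline outline; the concrete values are illustrative placeholders, not output from the real transformer.

```typescript
// Illustrative final request body after the transformation pipeline.
const finalBody = {
  model: "gpt-5-codex",
  store: false,
  stream: true,
  instructions: "<Codex system prompt + bridge prompt>",
  input: [
    // Full filtered history: all IDs stripped, item_reference removed
    { type: "message", role: "user", content: [{ type: "input_text", text: "read it" }] },
  ],
  reasoning: { effort: "medium", summary: "auto" },
  text: { verbosity: "medium" },
  include: ["reasoning.encrypted_content"],
  prompt_cache_key: "<opencode-session-uuid>", // host-supplied session id
};
```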
| Feature | Codex CLI | This Plugin | Match? |
|---|---|---|---|
| OAuth Flow | ✅ PKCE + ChatGPT login | ✅ Same | ✅ |
| store Parameter | false (ChatGPT) | false | ✅ |
| Message IDs | Stripped in stateless | Stripped | ✅ |
| reasoning.encrypted_content | ✅ Included | ✅ Included | ✅ |
| Model Normalization | "gpt-5" / "gpt-5-codex" / "codex-mini-latest" | Same | ✅ |
| Reasoning Effort | medium (default) | medium (default) | ✅ |
| Text Verbosity | medium (codex), low (gpt-5) | Same | ✅ |
| Feature | Codex CLI | This Plugin | Why? |
|---|---|---|---|
| Codex-OpenCode Bridge | N/A (native) | ✅ Custom prompt | OpenCode → Codex translation |
| OpenCode Prompt Filtering | N/A | ✅ Filter & replace | Remove OpenCode prompts, keep env/AGENTS |
| Orphan Tool Output Handling | ✅ Drop orphans | ✅ Convert to messages | Preserve context + avoid 400s |
| Usage-limit messaging | CLI prints status | ✅ Friendly error summary | Surface 5h/weekly windows in OpenCode |
| Per-Model Options | CLI flags | ✅ Config file | Better UX in OpenCode |
| Custom Model Names | No | ✅ Display names | UI convenience |
Pros of store:true:
- ✅ No ID filtering needed
- ✅ Server manages conversation
- ✅ Potentially more robust
Cons of store:true:
- ❌ Diverges from Codex CLI behavior
- ❌ Requires conversation ID management
- ❌ More complex error handling
- ❌ Unknown server-side storage limits
Decision: Use store:false for Codex parity and simplicity.
Alternative: Filter specific ID patterns (rs_*, msg_*, etc.)
Problem:
- ID patterns may change
- New ID types could be added
- Partial filtering is brittle
Solution: Remove ALL IDs
Rationale:
- Matches Codex CLI behavior exactly
- Future-proof against ID format changes
- Simpler implementation (no pattern matching)
- Clearer semantics (stateless = no IDs)
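The strip-everything rationale shows up clearly in a small before/after example. The item shapes are simplified; the plugin's `filterInput` does the same transformation with logging added.

```typescript
// Before: mixed IDs plus an item_reference, as the AI SDK sends them.
const before = [
  { type: "message", role: "user", content: "hi", id: "msg_abc" },
  { type: "item_reference", id: "rs_xyz" },
  { type: "function_call", name: "read", id: "fc_123" },
];

// After: item_reference dropped, every id stripped.
// No pattern matching on "msg_*" / "rs_*" / "fc_*" is needed.
const after = before
  .filter((item) => item.type !== "item_reference")
  .map(({ id: _id, ...rest }) => rest);
```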
Problem: OpenCode's system prompts are optimized for OpenCode's tool set and behavior patterns.
Solution: Replace OpenCode prompts with Codex-specific instructions.
Benefits:
- ✅ Explains tool name differences (apply_patch → edit)
- ✅ Documents available tools
- ✅ Maintains OpenCode working style
- ✅ Preserves Codex best practices
- ✅ 90% reduction in prompt tokens
Source: lib/prompts/codex-opencode-bridge.ts
Alternative: Single global config
Problem:
- `gpt-5-codex` optimal settings differ from `gpt-5-nano`
- Users want quick switching between quality levels
- No way to save "presets"
Solution: Per-model options in config
Benefits:
- ✅ Save multiple configurations
- ✅ Quick switching (no CLI args)
- ✅ Descriptive names ("Fast", "Balanced", "Max Quality")
- ✅ Persistent across sessions
Source: config/opencode-legacy.json (legacy) or config/opencode-modern.json (variants)
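A per-model configuration might look like this. This is a hedged sketch: the option key paths (`provider.openai.options`, `provider.openai.models[name].options`) come from the config-merging step of the pipeline, while the specific option and model names here are illustrative; the authoritative schema lives in the config files cited above.

```json
{
  "provider": {
    "openai": {
      "options": { "reasoningEffort": "medium" },
      "models": {
        "gpt-5-codex": {
          "name": "Max Quality",
          "options": { "reasoningEffort": "high", "textVerbosity": "medium" }
        },
        "gpt-5-nano": {
          "name": "Fast",
          "options": { "reasoningEffort": "minimal", "textVerbosity": "low" }
        }
      }
    }
  }
}
```

Model-specific options override the global `options` block, so each entry acts as a named preset that survives across sessions.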
Cause: Message ID leaked through filtering
Fix: Improved filterInput() removes ALL IDs
Prevention: Debug logging catches remaining IDs
Cause: OAuth access token expired
Fix: shouldRefreshToken() checks expiration
Prevention: Auto-refresh before requests
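The proactive-refresh check can be sketched like this. The 5-minute window comes from the rotation flow earlier in the document; the function body itself is an illustration, not the plugin's actual `shouldRefreshToken`.

```typescript
// Illustrative sketch: refresh when the access token is within 5 minutes
// of expiring, not only after it has already expired.
const REFRESH_WINDOW_MS = 5 * 60 * 1000;

function shouldRefreshToken(expiresAtMs: number, now = Date.now()): boolean {
  return expiresAtMs - now <= REFRESH_WINDOW_MS;
}
```

Refreshing ahead of expiry avoids a race where a token expires mid-request and surfaces as a 401.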
Cause: Azure doesn't support stateless mode
Workaround: Codex CLI uses store: true for Azure only
This Plugin: Only supports ChatGPT OAuth (no Azure)
Codex Bridge Prompt: ~550 tokens (~90% reduction vs full OpenCode prompt)
Benefit: Faster inference, lower costs
Prompt Caching: Uses promptCacheKey for session-based caching
Result: Reduced token usage on subsequent turns
Source: tmp/opencode/packages/opencode/src/provider/transform.ts:90-92
- Azure Support: Add `store: true` mode with ID management
- Version Detection: Adapt to OpenCode/AI SDK version changes
- Config Validation: Warn about unsupported options
- Test Coverage: Unit tests for all transformation functions
- Performance Metrics: Log token usage and latency

- AI SDK Updates: Changes to `.responses()` method
- OpenCode Changes: New message ID formats
- Codex API Changes: New request parameters
- CONFIG_FLOW.md - Configuration system guide
- Codex CLI Source - Official implementation
- OpenCode Source - OpenCode implementation