Fix Codex cost attribution for long turn contexts #1014
Pull request overview
This PR fixes Codex cost attribution when large turn_context JSONL rows are truncated before normal parsing, allowing model state to survive and preventing later model-less token_count events from falling back incorrectly.
Changes:
- Retains JSONL truncated-line prefixes and uses them to recover Codex turn_context model state.
- Gives explicit token_count event models precedence over prior turn-context state.
- Bumps Codex cache artifact version and adds regression coverage for long/oversized contexts and cache invalidation.
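The first change above can be sketched as follows. This is a minimal illustration, not the PR's code: `ScannedLine`, `scanLine`, and `recoverTurnContextModel` are hypothetical names (the PR's own helper is `extractCodexTurnContextModel`, whose body is not shown here), and the sketch assumes compact JSON with a `"model":"..."` pair inside a row that also carries the text `"turn_context"`.

```swift
import Foundation

// Hypothetical stand-in for a scanned JSONL line; the PR's real type is not shown here.
struct ScannedLine {
    let bytes: Data         // at most `prefixLimit` bytes of the original line
    let wasTruncated: Bool  // true when the original line exceeded the cap
}

/// Caps a raw line at `prefixLimit` bytes but keeps the prefix instead of dropping the line.
func scanLine(_ raw: Data, prefixLimit: Int) -> ScannedLine {
    ScannedLine(bytes: raw.prefix(prefixLimit), wasTruncated: raw.count > prefixLimit)
}

/// Pulls the first `"model":"..."` value out of a possibly incomplete turn_context row.
/// Assumption: compact JSON (no spaces around the colon) and no escaped quotes in the model name.
func recoverTurnContextModel(from prefix: Data) -> String? {
    let text = String(decoding: prefix, as: UTF8.self)
    guard text.contains("\"turn_context\"") else { return nil }
    guard let keyRange = text.range(of: "\"model\":\"") else { return nil }
    let rest = text[keyRange.upperBound...]
    // If the closing quote is missing, the value itself was cut off by truncation.
    guard let end = rest.firstIndex(of: "\"") else { return nil }
    let model = String(rest[..<end])
    return model.isEmpty ? nil : model
}
```

The truncated prefix is never handed to a JSON parser; only a narrow substring search runs on it, which matches the PR's stated intent of keeping partial JSON out of normal event parsing.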
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| Sources/CodexBarCore/Vendored/CostUsage/CostUsageJsonl.swift | Retains configured prefix bytes for truncated JSONL lines. |
| Sources/CodexBarCore/Vendored/CostUsage/CostUsageScanner.swift | Recovers Codex model state from truncated turn_context prefixes and updates model precedence. |
| Sources/CodexBarCore/Vendored/CostUsage/CostUsageCache.swift | Bumps Codex cache artifact version to v8. |
| Tests/CodexBarTests/CostUsageScannerBreakdownTests.swift | Adds Codex regression tests for long/oversized contexts, model precedence, and cache rebuild behavior. |
| Tests/CodexBarTests/CostUsageJsonlScannerTests.swift | Updates scanner expectations to verify retained truncated prefixes. |
| Tests/CodexBarTests/CostUsageCacheTests.swift | Updates Codex cache filename expectation to codex-v8.json. |
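The cache rows in the table suggest that the artifact version is part of the filename. A rough sketch of why a version bump invalidates old data, assuming the version lives only in the filename and the reader opens only the current name (`cacheArtifactName` and `isCurrentArtifact` are hypothetical helpers, not names from the PR):

```swift
/// Builds a versioned cache artifact name such as "codex-v8.json".
func cacheArtifactName(provider: String, version: Int) -> String {
    "\(provider)-v\(version).json"
}

/// A reader that accepts only the current name ignores anything written under an older version,
/// so the next scan has to rebuild the cache from the raw session logs.
func isCurrentArtifact(_ fileName: String, provider: String, currentVersion: Int) -> Bool {
    fileName == cacheArtifactName(provider: provider, version: currentVersion)
}
```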
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 616b43e080
    if let model = Self.extractCodexTurnContextModel(from: line.bytes) {
        currentModel = model
Prefer event model over recovered turn_context model
When a truncated turn_context line sets currentModel, later token_count events that include their own model can still be attributed to the stale context model because parseCodexFile resolves model as currentModel ?? modelFromInfo. This became a practical regression with this change because oversized contexts now update currentModel more often; if Codex emits a differing explicit model on token_count (for example after a model switch), usage will be booked to the wrong model. Keep the recovered context as fallback only, and let explicit event-level model win.
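The precedence this comment asks for can be sketched as a small resolver. This is an illustration only: `resolveModel` and its parameter names are hypothetical, and the default model is passed in explicitly rather than assumed.

```swift
/// Resolves the model for a token_count event.
/// Order: explicit model on the event, then the last turn_context model, then a default.
func resolveModel(eventModel: String?, turnContextModel: String?, defaultModel: String) -> String {
    eventModel ?? turnContextModel ?? defaultModel
}
```

With this order, a turn_context model recovered from a truncated prefix only fills in when the event carries no model of its own, so a mid-session model switch reported on the event is not overridden by stale context.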
Fixes #1013.
Summary
- Preserve Codex model state across long turn_context lines by extracting the model from the retained prefix before skipping normal JSON parsing.
- Prefer the model on a token_count event over the previous turn context model when both are available.
- Bump the Codex cache artifact from codex-v7.json to codex-v8.json so locally misattributed cache data is rebuilt.
- Add regression coverage for long and oversized turn_context rows, explicit event-model precedence, cache artifact invalidation, and the shared JSONL scanner prefix behavior.
Background
This came from a local CodexBar cost-history discrepancy observed on the Homebrew-installed CodexBar 0.26.1 app, not from a local development build. The generated local cost cache at ~/Library/Caches/CodexBar/cost-usage/codex-v6.json matched the UI and attributed a large part of the day to gpt-5. The raw Codex CLI session JSONL under ~/.codex/sessions/2026/05/18, however, showed that the session was using gpt-5.5.
The proximate failure was a long turn_context row. In the observed file, the row was a little over 32 KiB and contained "model":"gpt-5.5" near the start. The scanner limited Codex lines to a 32 KiB prefix and dropped truncated rows entirely, so it lost the current model. Later token_count events often do not repeat the model, which made the parser fall back to gpt-5.
Current main already has newer Codex cache/schema work (codex-v7.json), but the scanner still had the same failure mode: a long or oversized turn_context could be skipped before it updated model state. Bumping only the prefix size would reduce how often this happens, but it would still be a threshold-based fix. This patch keeps the prefix for truncated lines and uses it specifically to recover Codex turn-context model state, so the fix covers rows that exceed the full parse limit as long as the model appears in the retained prefix.
Why this shape
The regular JSON parser still skips truncated lines, so partial JSON is not fed into normal event parsing. The new prefix extraction is narrowly scoped to Codex turn_context model recovery, which is the state needed by later model-less token_count events. The cache bump is intentionally Codex-only and forces affected local cost caches to be rebuilt without changing remote quota or account behavior.
Validation
- swift test --filter CostUsageJsonlScannerTests
- swift test --filter CostUsageScannerBreakdownTests
- swift test --filter CostUsageCacheTests
- swift test --filter CostUsageJsonlPerformanceTests
- ./Scripts/lint.sh lint