Fix Codex cost attribution for long turn contexts #1014

Merged
steipete merged 3 commits into steipete:main from hhh2210:codex/fix-codex-long-turn-model
May 18, 2026
@hhh2210 hhh2210 commented May 17, 2026

Fixes #1013.

Summary

  • Retain the configured prefix bytes for truncated JSONL lines instead of clearing the buffer.
  • Preserve Codex model state from oversized turn_context lines by extracting the model from the retained prefix before skipping normal JSON parsing.
  • Prefer the explicit model on a token_count event over the previous turn context model when both are available.
  • Bump the Codex local cost cache artifact from codex-v7.json to codex-v8.json so locally misattributed cache data is rebuilt.
  • Add regression coverage for long and oversized turn_context rows, explicit event-model precedence, cache artifact invalidation, and the shared JSONL scanner prefix behavior.

Background

This came from a local CodexBar cost-history discrepancy observed on the Homebrew-installed CodexBar 0.26.1 app, not from a local development build. The generated local cost cache at ~/Library/Caches/CodexBar/cost-usage/codex-v6.json matched the UI and attributed a large part of the day to gpt-5. The raw Codex CLI session JSONL under ~/.codex/sessions/2026/05/18, however, showed that the session was using gpt-5.5.

The proximate failure was a long turn_context row. In the observed file, the row was a little over 32 KiB and contained "model":"gpt-5.5" near the start. The scanner limited Codex lines to a 32 KiB prefix and dropped truncated rows entirely, so it lost the current model. Later token_count events often do not repeat the model, which made the parser fall back to gpt-5.

Current main already has newer Codex cache/schema work (codex-v7.json), but the scanner still had the same failure mode: a long or oversized turn_context could be skipped before it updated model state. Bumping only the prefix size would reduce how often this happens, but it would still be a threshold-based fix. This patch keeps the prefix for truncated lines and uses it specifically to recover Codex turn-context model state, so the fix covers rows that exceed the full parse limit as long as the model appears in the retained prefix.
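As a rough illustration of the scanner behavior described above, here is a minimal sketch of a line scanner that keeps the capped prefix of an over-long line and marks it as truncated instead of clearing the buffer. The names `ScannedLine`, `scanLines`, and `prefixLimit` are hypothetical and are not taken from CostUsageJsonl.swift; the real scanner may stream from a file handle and differ in detail.

```swift
import Foundation

/// Hypothetical result type: one JSONL line, capped at a prefix limit.
struct ScannedLine {
    var bytes: [UInt8]      // at most `prefixLimit` bytes of the line
    var wasTruncated: Bool  // true when the line was longer than the limit
}

/// Splits `data` on newlines. Lines longer than `prefixLimit` keep their
/// first `prefixLimit` bytes and are flagged, rather than being dropped.
/// Assumes `prefixLimit > 0`; empty lines are skipped.
func scanLines(_ data: [UInt8], prefixLimit: Int) -> [ScannedLine] {
    var lines: [ScannedLine] = []
    var current: [UInt8] = []
    var truncated = false

    for byte in data {
        if byte == 0x0A { // "\n" ends the current line
            if !current.isEmpty {
                lines.append(ScannedLine(bytes: current, wasTruncated: truncated))
            }
            current.removeAll(keepingCapacity: true)
            truncated = false
        } else if current.count < prefixLimit {
            current.append(byte)
        } else {
            // Over the limit: stop copying, but keep the prefix we already have.
            truncated = true
        }
    }
    // Flush a final line that has no trailing newline.
    if !current.isEmpty {
        lines.append(ScannedLine(bytes: current, wasTruncated: truncated))
    }
    return lines
}
```

A caller can then route `wasTruncated` lines to a narrow recovery path while still keeping them away from the regular JSON parser.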

Why this shape

The regular JSON parser still skips truncated lines, so partial JSON is not fed into normal event parsing. The new prefix extraction is narrowly scoped to Codex turn_context model recovery, which is the state needed by later model-less token_count events. The cache bump is intentionally Codex-only and forces affected local cost caches to be rebuilt without changing remote quota or account behavior.
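The prefix extraction can be pictured roughly as follows. This is a sketch under stated assumptions, not the code in CostUsageScanner.swift: the function name `extractTurnContextModel(fromPrefix:)` is hypothetical, and it assumes compact JSON (no whitespace around colons), a `"type":"turn_context"` marker inside the retained prefix, and a model name without escaped quotes.

```swift
import Foundation

/// Hypothetical helper: recovers the model name from the retained prefix of
/// a truncated `turn_context` row without parsing the (incomplete) JSON.
func extractTurnContextModel(fromPrefix prefix: [UInt8]) -> String? {
    // The prefix may end mid-character, so decode leniently: invalid
    // sequences are replaced instead of failing the whole decode.
    let text = String(decoding: prefix, as: UTF8.self)

    // Only rows that identify themselves as turn_context are considered.
    guard text.contains("\"type\":\"turn_context\"") else { return nil }

    // Find the first `"model":"` key and read up to the closing quote.
    guard let keyRange = text.range(of: "\"model\":\"") else { return nil }
    let rest = text[keyRange.upperBound...]
    guard let end = rest.firstIndex(of: "\"") else { return nil } // value cut off
    let model = String(rest[..<end])
    return model.isEmpty ? nil : model
}
```

If the model value itself is cut off by the prefix limit, the sketch returns `nil`, which matches the caveat above that recovery only works when the model appears in the retained prefix.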

Validation

  • swift test --filter CostUsageJsonlScannerTests
  • swift test --filter CostUsageScannerBreakdownTests
  • swift test --filter CostUsageCacheTests
  • swift test --filter CostUsageJsonlPerformanceTests
  • ./Scripts/lint.sh lint

Copilot AI left a comment

Pull request overview

This PR fixes Codex cost attribution when large turn_context JSONL rows are truncated before normal parsing, allowing model state to survive and preventing later model-less token_count events from falling back incorrectly.

Changes:

  • Retains JSONL truncated-line prefixes and uses them to recover Codex turn_context model state.
  • Gives explicit token_count event models precedence over prior turn-context state.
  • Bumps Codex cache artifact version and adds regression coverage for long/oversized contexts and cache invalidation.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

File | Description
Sources/CodexBarCore/Vendored/CostUsage/CostUsageJsonl.swift | Retains configured prefix bytes for truncated JSONL lines.
Sources/CodexBarCore/Vendored/CostUsage/CostUsageScanner.swift | Recovers Codex model state from truncated turn_context prefixes and updates model precedence.
Sources/CodexBarCore/Vendored/CostUsage/CostUsageCache.swift | Bumps Codex cache artifact version to v8.
Tests/CodexBarTests/CostUsageScannerBreakdownTests.swift | Adds Codex regression tests for long/oversized contexts, model precedence, and cache rebuild behavior.
Tests/CodexBarTests/CostUsageJsonlScannerTests.swift | Updates scanner expectations to verify retained truncated prefixes.
Tests/CodexBarTests/CostUsageCacheTests.swift | Updates Codex cache filename expectation to codex-v8.json.

Comment thread Sources/CodexBarCore/Vendored/CostUsage/CostUsageScanner.swift Outdated
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 616b43e080

Comment on lines +799 to +800
    if let model = Self.extractCodexTurnContextModel(from: line.bytes) {
        currentModel = model
P2: Prefer event model over recovered turn_context model

When a truncated turn_context line sets currentModel, later token_count events that include their own model can still be attributed to the stale context model because parseCodexFile resolves model as currentModel ?? modelFromInfo. This became a practical regression with this change because oversized contexts now update currentModel more often; if Codex emits a differing explicit model on token_count (for example after a model switch), usage will be booked to the wrong model. Keep the recovered context as fallback only, and let explicit event-level model win.
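The suggested precedence can be sketched as a small resolver. The function name `resolveModel` and its parameter labels are hypothetical; only the ordering (explicit event model first, recovered context model second, default last) reflects the suggestion, and the `gpt-5` default is taken from the fallback described in the PR background.

```swift
/// Hypothetical resolver: the model stated on the token_count event wins,
/// the turn_context model is a fallback, and the default is a last resort.
func resolveModel(eventModel: String?,
                  contextModel: String?,
                  fallback: String = "gpt-5") -> String {
    // Equivalent to flipping `currentModel ?? modelFromInfo`
    // into `modelFromInfo ?? currentModel`.
    eventModel ?? contextModel ?? fallback
}
```

With this ordering, a token_count event emitted after a model switch is booked to its own model even if a stale context model is still held in scanner state.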

@steipete steipete merged commit 036b497 into steipete:main May 18, 2026
4 checks passed
Development

Successfully merging this pull request may close these issues.

Codex cost history can misattribute long gpt-5.5 sessions to gpt-5