Skip to content

CCM/OCM service improvements#3913

Open
nekohasekai wants to merge 95 commits intotestingfrom
ccm-ocm-improvements
Open

CCM/OCM service improvements#3913
nekohasekai wants to merge 95 commits intotestingfrom
ccm-ocm-improvements

Conversation

@nekohasekai
Copy link
Copy Markdown
Member

@nekohasekai nekohasekai commented Mar 14, 2026

Features:

  • Multiple credential
  • Balancer/Fallback credential
  • External credential/user
  • Reverse mode for external user
  • Reserve/Max 5h/weekly usage limit

TODO:

  • Update documentations
  • Test everything

@nekohasekai nekohasekai force-pushed the ccm-ocm-improvements branch 2 times, most recently from 0da7dd4 to fc8cfc4 Compare March 15, 2026 04:25
@nekohasekai nekohasekai force-pushed the testing branch 2 times, most recently from 5f049c1 to 92e07f6 Compare March 15, 2026 09:19
…ring

Extract credential interface from *defaultCredential to support both
default (OAuth) and external (remote proxy) credential types. External
credentials proxy requests to a remote ccm/ocm instance with bearer
token auth, poll a /status endpoint for utilization, and parse
aggregated rate limit headers from responses.

Add allow_external_usage user flag to control whether balancer/fallback
providers may select external credentials. Add status endpoint
(/ccm/v1/status, /ocm/v1/status) returning averaged utilization across
eligible credentials. Rewrite response rate limit headers for external
users with aggregated values.
Refuse to refresh tokens when the credential file is not writable,
preventing server-side invalidation of the old refresh token that
would make the credential permanently unusable after restart.
aTLS.NewListener returns *LazyConn, not *tls.Conn, so Go's
http.Server cannot detect TLS via type assertion and falls back
to HTTP/1.x. When ALPN negotiates h2, the client sends HTTP/2
frames that the server fails to parse, causing HTTP 520 errors
behind Cloudflare.

Wrap HTTP handlers with h2c.NewHandler to intercept the HTTP/2
client preface and dispatch to http2.Server.ServeConn, consistent
with DERP, v2rayhttp, naive, and v2raygrpclite services.
Allow two CCM/OCM instances to share credentials when only one has a
public IP, using yamux-multiplexed reverse connections.

Three credential modes:
- Normal: URL set, reverse=false — standard HTTP proxy
- Receiver: URL empty — waits for incoming reverse connection
- Connector: URL set, reverse=true — dials out to establish connection

Extend InterfaceUpdated to services so network changes trigger
reverse connection reconnection.
The consecutiveFailures counter in connectorLoop never resets,
causing backoff to permanently cap at 30-45s even after a
connection that served successfully for hours.

Reset the counter when connectorConnect ran for at least one
minute, indicating a successful session rather than a transient
dial/handshake failure.
InterfaceUpdated() writes reverseContext and reverseCancel without
synchronization while connectorLoop/connectorConnect goroutines
read them concurrently. close() also accesses reverseCancel without
a lock.

Fix by extending reverseAccess mutex to protect these fields:
- Add getReverseContext()/resetReverseContext() methods
- Pass context as parameter to connectorConnect
- Merge close() into a single lock acquisition
- Use resetReverseContext() in InterfaceUpdated()
connectorConnect() creates a bufio.NewReader to read the HTTP 101
upgrade response, but then passes the raw conn to yamux.Server().
If TCP coalesces the 101 response with initial yamux frames, the
bufio reader over-reads into its buffer and those bytes are lost
to yamux, causing session failure.

Wrap the bufio.Reader and raw conn into a bufferedConn so yamux
reads through the buffer first.
updateStateFromHeaders unconditionally applied header utilization
values even when they were lower than the current state, causing
poll-sourced values to be overwritten by stale header values.

Parse reset timestamps before utilization and only allow decreases
when the reset timestamp changes (indicating a new rate-limit
window). Also add math.Ceil to CCM external credential for
consistency with default credential.
WebSocket 101 upgrade responses do not include utilization headers
(confirmed via codex CLI source). Rate limit data is delivered
exclusively through in-band events (codex.rate_limits and error
events with status 429).

Previously, updateStateFromHeaders unconditionally bumped lastUpdated
even when no utilization headers were found, which suppressed polling
and left credential utilization permanently stale during WebSocket
sessions.

- Only bump lastUpdated when actual utilization data is parsed
- Parse in-band codex.rate_limits events to update credential state
- Detect in-band 429 error events to markRateLimited
- Fix WebSocket 429 retry to update old credential state before retry
The HTTP path rewrites utilization headers for external users via
rewriteResponseHeadersForExternalUser to show aggregated values.
The WebSocket upgrade headers were also rewritten, but in-band
codex.rate_limits events were forwarded unmodified, leaking
per-credential utilization to external users.
Add limit_5h and limit_weekly options as alternatives to reserve_5h
and reserve_weekly for capping credential utilization. The two are
mutually exclusive per window.

Fix computeAggregatedUtilization to scale per-credential utilization
relative to each credential's cap before averaging, so external users
see correct available capacity regardless of per-credential caps.

Fix pickLeastUsed to compare remaining capacity (cap - utilization)
instead of raw utilization, ensuring fair comparison across credentials
with different caps.
computeAggregatedUtilization used isAvailable() which only checks
permanent unavailability, so credentials rejected by upstream 400
still had their planWeight included in the total, inflating reported
capacity and diluting utilization.
After re-login with newer Claude Code (v2.1.75+), CCM refresh requests
returned persistent 429s. Root cause: CCM omitted the `scope` parameter
that the server now requires for tokens with `user:file_upload` scope.

Changes to fully match Claude Code's OAuth behavior:

- Add `scope` parameter to token refresh request body
- Parse `scope` from refresh response and store back
- Add `subscriptionType`/`rateLimitTier` to credential struct to
  preserve Claude Code's profile state on write-back
- Change credential file write to read-modify-write, preserving
  other top-level JSON keys (matches Claude Code's BP6 pattern)
- Same for macOS keychain write path
- Increase token expiry buffer from 1 min to 5 min (matching CC's
  isOAuthTokenExpired with 300s buffer)
- Add cross-process mkdir-based file lock compatible with Claude
  Code's proper-lockfile protocol (~/.claude.lock)
- Add post-failure recovery: re-read credentials from disk after
  refresh failure in case another process succeeded
- Add 401/403 "OAuth token has been revoked" recovery in proxy
  handler: reload credentials and retry once
@nekohasekai nekohasekai force-pushed the ccm-ocm-improvements branch from cbdf283 to 441c988 Compare March 22, 2026 09:02
@nekohasekai nekohasekai force-pushed the testing branch 4 times, most recently from 7d24f54 to fcc6017 Compare March 24, 2026 07:20
Codex CLI ignores x-codex-* headers in the WebSocket upgrade response
and only reads rate limits from in-band codex.rate_limits events.
Previously, the first synthetic event was gated by firstRealRequest
(after warmup), delaying usage display. Now send aggregated status
right after subscribing, so the client sees rate limits before the
first turn begins.
Remove the poll_interval config surface from CCM and OCM so both services fall back to the built-in 1h polling cadence again. Also isolate CCM credential lock mocking per test instance so the access-token refresh tests stop racing on shared global state.
…eam events

The initial synthetic event from 6721dff arrives before the Codex CLI's
response stream reader is active. Additionally, the shouldEmit gate in
updateStateFromHeaders suppresses the async replacement when values haven't
changed. Send aggregated status inline in proxyWebSocketUpstreamToClient
so the client receives it at the exact protocol position it expects.
…user connect

External credentials now properly increment consecutivePollFailures on
poll errors (matching defaultCredential behavior), marking the credential
as temporarily blocked. When a user with external_credential connects
and the credential is not usable, a forced poll is triggered to check
recovery.
…ocally

Strip all upstream rate limit headers and compute unified-status,
representative-claim, reset times, and surpassed-threshold from
aggregated utilization data. Never expose per-account overage or
fallback information. Remove per-credential unified state storage,
snapshot aggregation, and WebSocket synthetic rate limit events.
… 401

tryRefreshCredentials now returns error and calls markCredentialsUnavailable
when lock acquisition or file write permission fails. getAccessToken propagates
the error instead of silently returning the expired token. pollUsage handles
401 by attempting auth recovery and marking unavailable on failure. All
credential error paths now use Error log level instead of Debug. Startup
checks expired tokens eagerly via tryRefreshCredentials.
@nekohasekai nekohasekai force-pushed the ccm-ocm-improvements branch from 6ed4780 to 471c9c3 Compare March 28, 2026 10:07
…ests

Connector mode credentials were unconditionally blocked from local use
by unavailableError(), despite having a working forwardHTTPClient. Also
set credentialDialer in OCM connector mode to prevent nil panic in
WebSocket handler.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant