Problem
`providers/openai/models/gpt-5.4.toml` currently defines:

```toml
[limit]
context = 1_050_000
input = 272_000
output = 128_000
```
However, the OpenAI docs for GPT-5.4 say the model has a 1.05M context window, and also note:

> For models with a 1.05M context window (GPT-5.4 and GPT-5.4 pro), prompts with >272K input tokens are priced at 2x input and 1.5x output for the full session for standard, batch, and flex.

Source: https://developers.openai.com/api/docs/models/gpt-5.4
That wording reads like a pricing threshold, not a hard maximum accepted input size.
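Under the pricing-threshold reading, a request above 272K input tokens would still be accepted, just billed at multiplied rates for the whole session. A minimal sketch of that interpretation (the per-token rates here are placeholders, not OpenAI's actual prices):

```python
# Hypothetical illustration of the "pricing threshold" reading:
# inputs above 272K tokens are accepted, but the session is billed
# at 2x input / 1.5x output rates. Rates below are placeholders.
PRICING_THRESHOLD = 272_000

def session_cost(input_tokens: int, output_tokens: int,
                 input_rate: float, output_rate: float) -> float:
    """Return session cost; multipliers apply once input exceeds the threshold."""
    if input_tokens > PRICING_THRESHOLD:
        input_rate *= 2.0
        output_rate *= 1.5
    return input_tokens * input_rate + output_tokens * output_rate

# A 300K-token prompt is not rejected under this reading -- it just
# costs more per token than a sub-272K prompt would.
print(session_cost(300_000, 10_000, 1.0, 1.0))  # 2x / 1.5x rates apply
print(session_cost(100_000, 10_000, 1.0, 1.0))  # base rates apply
```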
Why this matters
Projects consuming models.dev use `limit.input` as an actual input cap for overflow handling / session compaction. With the current metadata, clients may compact at ~252K input tokens even when the backend can successfully accept much larger requests.
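To make the impact concrete, here is a hedged sketch of typical consumer logic (the function and the 20K-token headroom margin are hypothetical, not any specific client's API; the margin is simply one value that would yield the ~252K figure):

```python
# Hypothetical consumer logic: treat limit.input as a hard cap and
# compact the session once usage approaches it. Names and the margin
# value are illustrative only.
LIMIT_INPUT = 272_000          # from providers/openai/models/gpt-5.4.toml
COMPACTION_MARGIN = 20_000     # assumed headroom reserved for the next turn

def should_compact(session_input_tokens: int) -> bool:
    """Trigger compaction once input usage nears the advertised cap."""
    return session_input_tokens >= LIMIT_INPUT - COMPACTION_MARGIN

print(should_compact(252_000))  # compaction fires well below the 1.05M window
print(should_compact(200_000))  # still under the margin, no compaction
```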
Request
Could you clarify whether `272_000` is intended to be the real max input token limit for `openai/gpt-5.4`, or whether it is the pricing breakpoint?
If it is only the pricing breakpoint, would it make sense to:
- change `limit.input` to the true accepted max input size, or
- remove / relax `limit.input` for this model, and
- represent the >272K pricing behavior separately?
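If the 272K figure is only a pricing breakpoint, one possible shape for representing it separately could be (field names below are invented for illustration, not an existing models.dev schema):

```toml
[limit]
context = 1_050_000
input = 1_050_000        # or omit, if the backend accepts up to the context window
output = 128_000

# Hypothetical section, not current schema: long-context surcharge
[pricing.long_context]
input_threshold = 272_000
input_multiplier = 2.0
output_multiplier = 1.5
```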
If `272_000` is in fact the real hard input limit, could you link the authoritative source for that?