Add LiteLLM provider into responses_api_models #990

Open

imxj wants to merge 2 commits into NVIDIA-NeMo:main from imxj:feat/support_litellm_responses
Conversation


@imxj imxj commented Apr 1, 2026

Summary

  • Add a new responses_api_models/litellm_model server for LiteLLM proxy endpoints
  • LiteLLMModelServer extends openai_model's SimpleModelServer, overriding only responses() to normalize LiteLLM proxy quirks
  • Normalize hybrid chat.completion responses to the standard Responses API format (LiteLLM may downgrade /v1/responses internally)
  • Fix the Pydantic validation error caused by reasoning.effort="none" (a string) where native Responses API responses expect null
  • openai_model remains clean -- no proxy-specific workarounds
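
The subclassing approach described above can be sketched as follows. This is a hypothetical illustration only: SimpleModelServer's real interface is not shown in this PR, so the method signature, the dict-based payload, and the stand-in base class are all assumptions; only the class names, the responses() override, and the effort="none" fix come from the PR description.

```python
class SimpleModelServer:
    """Stand-in for openai_model's server (assumed interface)."""

    def responses(self, payload: dict) -> dict:
        # The real server would forward this to the upstream /v1/responses API.
        return payload


class LiteLLMModelServer(SimpleModelServer):
    """Overrides only responses() to smooth over LiteLLM proxy quirks."""

    def responses(self, payload: dict) -> dict:
        raw = super().responses(payload)
        # LiteLLM proxies may emit reasoning.effort="none" (a string) where
        # the Responses API schema expects null, which fails Pydantic
        # validation downstream; coerce it here.
        reasoning = raw.get("reasoning")
        if isinstance(reasoning, dict) and reasoning.get("effort") == "none":
            reasoning["effort"] = None
        return raw
```

Keeping the override this narrow is what lets openai_model stay free of proxy-specific workarounds.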

Test

  • Tested with GPT-5.4 via a LiteLLM-backed endpoint -- native response format, reasoning fix applied
  • Tested with Opus 4.6 via a LiteLLM-backed endpoint -- hybrid chat.completion response normalized
  • End-to-end rollout collection verified with both models
  • Added 11 unit tests for litellm_model (normalization + server integration)
  • Existing openai_model tests still pass after the revert


copy-pr-bot bot commented Apr 1, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@imxj imxj force-pushed the feat/support_litellm_responses branch 2 times, most recently from 805aaef to f0558e5, on April 1, 2026 00:42
@imxj imxj changed the title from "Add LiteLLM response normalization for Responses API proxy compatibility" to "Add LiteLLM provider into responses_api_models" on Apr 7, 2026
imxj and others added 2 commits April 6, 2026 18:52
LiteLLM proxies may return chat.completion format or hybrid response
objects when called via /v1/responses. This normalizes those responses
to the expected Responses API shape so downstream NeMoGymResponse
validation succeeds. Also fixes reasoning.effort="none" validation
error for native Responses API responses.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Jin Xu <jinx@nvidia.com>
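
The chat.completion-to-Responses normalization the commit message describes might look roughly like the sketch below. The field names are assumptions based on the public OpenAI chat.completion and Responses API shapes; the actual NeMoGymResponse schema and helper names in this PR are not shown here, so treat this as illustrative only.

```python
def normalize_to_responses(raw: dict) -> dict:
    """Convert a chat.completion-style payload into a Responses-style one.

    Native Responses payloads pass through unchanged; only payloads that a
    LiteLLM proxy downgraded to chat.completion format are rewritten.
    """
    if raw.get("object") != "chat.completion":
        return raw  # already in the native Responses API shape

    message = raw["choices"][0]["message"]
    return {
        "object": "response",
        "id": raw.get("id"),
        "model": raw.get("model"),
        "output": [
            {
                "type": "message",
                "role": message.get("role", "assistant"),
                "content": [
                    {"type": "output_text", "text": message.get("content", "")}
                ],
            }
        ],
    }
```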
Revert openai_model/app.py to its clean state and move the LiteLLM
response normalization logic into a new responses_api_models/litellm_model
that extends SimpleModelServer. This keeps proxy-specific workarounds
(reasoning.effort="none" fix, chat.completion->response normalization)
isolated from the native OpenAI model server.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Jin Xu <jinx@nvidia.com>
@imxj imxj force-pushed the feat/support_litellm_responses branch from 105e7c6 to 307b580 on April 7, 2026 01:52
