Add frozen model inference engine support for hosting reward models without weight update by ghShu · Pull Request #1055 · NovaSky-AI/SkyRL

ghShu · 2026-02-08T19:29:11Z

This patch adds support for dedicated reward model inference engines that use frozen_model=True (no weight sync, always active). This enables:

- LLM-as-Judge patterns (RLAIF, Constitutional AI)
- Process Reward Models (verifiers)
- Frozen reward models for scoring/evaluation
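
As a rough illustration of the LLM-as-Judge pattern listed above, the sketch below turns a frozen judge model's text output into a scalar reward. It is not code from this patch: `JudgeFn`, `JUDGE_TEMPLATE`, `parse_score`, and `judge_reward` are hypothetical names, and the judge is modeled as a plain callable rather than the actual reward inference client.

```python
import re
from typing import Callable, Optional

# Hypothetical stand-in for a frozen judge model: any callable mapping a prompt to text.
JudgeFn = Callable[[str], str]

# Hypothetical prompt template; a real judge prompt would be task-specific.
JUDGE_TEMPLATE = (
    "Rate the response to the question on a scale of 1 to 10.\n"
    "Question: {question}\nResponse: {response}\nScore:"
)


def parse_score(judge_output: str) -> Optional[float]:
    """Find the first standalone integer in 1..10 and map it linearly to [0, 1]."""
    match = re.search(r"\b(10|[1-9])\b", judge_output)
    if match is None:
        return None
    return (int(match.group(1)) - 1) / 9.0


def judge_reward(judge: JudgeFn, question: str, response: str, default: float = 0.0) -> float:
    """Query the judge once and fall back to `default` if no score can be parsed."""
    prompt = JUDGE_TEMPLATE.format(question=question, response=response)
    score = parse_score(judge(prompt))
    return default if score is None else score
```

Because the judge is never updated during training, a scorer like this can be called at any point in the loop without coordinating with weight synchronization.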

Changes:

- Add RewardInferenceConfig and PlacementGenerationEnvConfig to config
- Add pretrained_lora_path option to SkyRLLoraConfig
- Add reward_inference section to ppo_base_config.yaml
- Add get_reward_inference_client() method to BasePPOExp
- Add frozen_model parameter to create_ray_wrapped_inference_engines()
- Pass reward_inference_client to generator
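
To make the "no weight sync" behavior concrete, here is a minimal sketch of how a frozen flag could exclude an engine from weight updates. It assumes nothing about the real implementation: `EngineSketch` and `sync_weights` are hypothetical stand-ins, and only the `frozen_model` name is taken from the description above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EngineSketch:
    """Hypothetical stand-in for an inference engine; not the SkyRL class."""

    name: str
    frozen_model: bool = False
    weights_version: int = 0

    def update_weights(self, version: int) -> None:
        # A frozen engine rejects updates outright so misuse fails loudly.
        if self.frozen_model:
            raise RuntimeError(f"{self.name} is frozen and does not accept weight updates")
        self.weights_version = version


def sync_weights(engines: List[EngineSketch], version: int) -> List[str]:
    """Push a new weights version to non-frozen engines; return the names updated."""
    updated = []
    for engine in engines:
        if engine.frozen_model:
            continue  # reward/judge engines keep their original weights
        engine.update_weights(version)
        updated.append(engine.name)
    return updated
```

With a policy engine and a frozen reward engine in the same list, only the policy engine advances to the new weights version.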


ghShu wants to merge 1 commit into NovaSky-AI:main from ghShu:gshu/extend-inference-engine-for-frozen-reward-model
