feature: prime train routes full_finetune TOMLs to hosted endpoint by JannikSt · Pull Request #592 · PrimeIntellect-ai/prime

JannikSt · 2026-05-02T23:30:48Z

Note

Medium Risk
Adds new run-dispatch and deletion paths hitting /training/runs and changes log routing for multiple components; mistakes could start/stop the wrong run type or fetch incorrect logs, but changes are scoped to CLI command behavior.

Overview
prime train now detects full finetune TOMLs (type="full_finetune" or [deployment] GPU fields) and dispatches them to a new hosted full-FT endpoint (POST /training/runs) instead of the existing LoRA shared /rft/runs flow, including team association and env-file secret collection while intentionally not printing returned per-run tokens.

Run deletion now first attempts the hosted full-FT delete (DELETE /training/runs/{id}) and falls back to the LoRA path only on typed NotFoundError (new 404 exception class). Log fetching is expanded to support component-scoped logs (orchestrator/trainer/inference/env-server) with env_name selection, and RLRun gains an optional kind discriminator to represent these run types.

^{Reviewed by Cursor Bugbot for commit 2d15d7a. Bugbot is set up for automated code reviews on this repo. Configure here.}

…o hosted endpoint `prime train <toml>` stays the single entry point. When the TOML carries `type = "full_finetune"` (or a `[hosted]` block, or a `[deployment]` block matching prime-rl's qwen30b_math/rl.toml shape), the CLI routes to the new public API at /api/v1/training/runs instead of the LoRA shared-cluster path. Backwards compatible: configs without these markers run through the existing flow unchanged. * api/training.py: new HostedTrainingClient + build_payload_from_toml (whitelist-maps prime-rl example fields onto the API payload). * api/rl.py: surface `kind` on RLRun so `prime train delete` can route to the right endpoint based on run kind. * commands/rl.py: peek the TOML before strict RLConfig parse; on full-FT hand off to _dispatch_full_finetune_run with shared env-file/secrets plumbing. Delete looks up kind and dispatches via HostedTrainingClient for DEDICATED_FULL_FT runs. Tested end-to-end against local backend on rft-freyr cluster: dispatch + status mirroring + completion + delete all clean.

Drop the friction of looking up a cluster cuid for the common single- cluster setup. Backend now auto-picks the first uncordoned PrimeCluster when the field is omitted, so `prime train backend/examples/training/ reverse_text.toml` is zero-config.

…eyRef The platform materialises the per-run RFT API token into the per-run k8s Secret on dispatch and the chart binds it to the orchestrator pod's PRIME_API_KEY env var. The token already lives where prime-monitor needs it — surfacing it on stdout just makes it easy to leak into shared shell history or CI logs.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7b88b5f7d0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

- Drop prime_cluster_id from CLI plumbing entirely. Backend auto-picks the first uncordoned PrimeCluster; the CLI never threads a cluster id, removing a footgun (mistargeting a config to the wrong cluster). Drops the [hosted] discriminator path too — type = 'full_finetune' or a [deployment] block remain the only triggers. - codex P1 / cursor: --output json on the full-FT path no longer short-circuits with a {would_dispatch} preview. Mirrors the LoRA path's 'create then format' contract — automation that pipes the JSON to grab run_id now actually dispatches the run. - codex P2: env_file (deprecated, singular) is loaded BEFORE env_files (canonical, plural) so env_files wins on key collision. Matches the LoRA path's documented precedence.

build_payload_from_toml used to whitelist ~12 individual fields; the backend rebuilt a minimal TOML from them, dropping anything outside the whitelist (custom optim schedules, eval configs, custom scheduler params, …). E2E prime-rl runs that depended on those knobs silently diverged from `uv run rl @ rl.toml` behaviour. Now: ship the whole TOML as 'config' (companion to platform PR #1824 faa934d56). The backend's build_values takes the config dict directly so the same TOML works locally with prime-rl and remote-dispatched through prime-cli with no fork in behaviour.

Unrelated to the full-FT training payload change; accidentally picked up by 'git add -A'. Restoring inference.py to its 4b9be8f state. The streaming improvements will land in a sibling PR off main.

Cursor caught: when rl_client.get_run fails (e.g. pydantic ValidationError because a DEDICATED_FULL_FT row doesn't carry the LoRA-required RLRun fields), the prior except APIError block silently set kind=None and routed delete to the LoRA endpoint. The hosted helm release + namespace would stay live with no signal back. Restructure: try hosted_client.delete_run first; on HTTP 404 the backend's kind gate told us 'not a DEDICATED_FULL_FT', so we fall through to the LoRA path. Removes the get_run discriminator roundtrip entirely — and any pydantic surprise it could have raised.

Cursor flagged: distinguishing 404 by 'HTTP 404' not in str(e) is fragile — depends on the message format never changing AND on no unrelated error body coincidentally containing the substring. - Add NotFoundError(APIError) subclass; APIClient raises it for 404 responses (sync + async paths). - HostedTrainingClient.create_run / delete_run no longer catch Exception and rewrap into a generic APIError — typed APIError subclasses propagate so callers can branch by class. - prime train delete uses except NotFoundError as the 'not a hosted run, try LoRA' fallback signal.

Cursor caught: `if output != "json" and not yes` meant a researcher piping JSON for scripting would have full-FT runs auto-launched without ack. The LoRA path always prompts unless --yes regardless of output format. Match that contract — gate confirmation purely on --yes.

`_dispatch_full_finetune_run` was building the payload without a teamId, so every dispatched run landed on the caller's personal account even when `prime config view` showed a team selected. The LoRA path (rl.py:1216) already passes `app_config.team_id` — match that behaviour for the full-FT path. Backend's `CreateDedicatedRunRequest.team_id` is Optional, so the prior omission was silently wrong instead of returning a 400.

Backend's /api/v1/rft/runs/{run_id}/logs now accepts component + env_name params (dedicated full-FT). Surface them through the CLI: prime train logs <run_id> -c trainer prime train logs <run_id> -c inference prime train logs <run_id> -c env-server --env <name> Legacy --env <name>/<idx> still routes through the env-server-logs endpoint (shared-RFT pods, cluster_id-backed lookup). Dedicated env-server (slug, no slash) goes through the unified /logs route. Per-rank --pod-index intentionally not exposed yet: the chart's torchrun --local-ranks-filter=0 already collapses in-pod rank fan-out to rank 0 stdout, and Loki's pod-label indexing in this tenant doesn't actually filter the prime-job-* streams — per-pod inspection on multi-node runs is kubectl + the PVC log files for now.

…full-finetune # Conflicts: # packages/prime/src/prime_cli/api/rl.py # packages/prime/src/prime_cli/commands/rl.py

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

^{❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.}

^{Reviewed by Cursor Bugbot for commit 2d15d7a. Configure here.}

cursor · 2026-05-19T07:40:43Z

+                    env_name=env_name_q,
+                    env_index=env_index_q,
+                    tail_lines=t,
+                )


Legacy env-server log filters silently dropped

Medium Severity

The new legacy shared-RFT env-server branch (env with "/" qualifier) no longer passes search, regex, normalized_level, or since_seconds to rl_client.get_env_server_logs, even though the method accepts all four. The old code forwarded these filters, so users combining --env name/1 with --search, --level, or --since will now have their filters silently ignored — a regression from the refactor.

^{Reviewed by Cursor Bugbot for commit 2d15d7a. Configure here.}

JannikSt added 3 commits May 1, 2026 11:28

chatgpt-codex-connector Bot reviewed May 2, 2026

View reviewed changes

Comment thread packages/prime/src/prime_cli/commands/rl.py Outdated

Comment thread packages/prime/src/prime_cli/commands/rl.py Outdated

cursor Bot reviewed May 2, 2026

View reviewed changes

Comment thread packages/prime/src/prime_cli/commands/rl.py Outdated

JannikSt added 3 commits May 2, 2026 16:45

fix: move inference reasoning-content streaming changes off this PR

3142f87

Unrelated to the full-FT training payload change; accidentally picked up by 'git add -A'. Restoring inference.py to its 4b9be8f state. The streaming improvements will land in a sibling PR off main.

cursor Bot reviewed May 3, 2026

View reviewed changes

Comment thread packages/prime/src/prime_cli/commands/rl.py Outdated

cursor Bot reviewed May 3, 2026

View reviewed changes

Comment thread packages/prime/src/prime_cli/commands/rl.py Outdated

cursor Bot reviewed May 3, 2026

View reviewed changes

Comment thread packages/prime/src/prime_cli/commands/rl.py

JannikSt added 2 commits May 5, 2026 18:23

JannikSt requested review from JohannesHa, burnpiro, d42me, kcoopermiller and willccbb as code owners May 19, 2026 00:05

willccbb previously approved these changes May 19, 2026

View reviewed changes

JannikSt dismissed willccbb’s stale review via 2fe14da May 19, 2026 07:31

Merge remote-tracking branch 'origin/main' into feature/train-hosted-…

2d15d7a

…full-finetune # Conflicts: # packages/prime/src/prime_cli/api/rl.py # packages/prime/src/prime_cli/commands/rl.py

cursor Bot reviewed May 19, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feature: prime train routes full_finetune TOMLs to hosted endpoint#592

feature: prime train routes full_finetune TOMLs to hosted endpoint#592
JannikSt wants to merge 12 commits into
mainfrom
feature/train-hosted-full-finetune

JannikSt commented May 2, 2026 •

edited by cursor Bot

Loading

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cursor Bot left a comment

Uh oh!

cursor Bot May 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

JannikSt commented May 2, 2026 • edited by cursor Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 19, 2026

Choose a reason for hiding this comment

Legacy env-server log filters silently dropped

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

JannikSt commented May 2, 2026 •

edited by cursor Bot

Loading