feat: mmlu_pro and mmlu_prox benchmarks#988

Merged
bxyu-nvidia merged 7 commits into main from fsiino/mmlu-pro
Apr 5, 2026

Contributor

fsiino-nvidia commented Mar 31, 2026

Signed-off-by: Frankie Siino <fsiino@nvidia.com>
Signed-off-by: Frankie Siino <fsiino@nvidia.com>

copy-pr-bot bot commented Mar 31, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

Signed-off-by: Frankie Siino <fsiino@nvidia.com>
Signed-off-by: Frankie Siino <fsiino@nvidia.com>
fsiino-nvidia changed the title from "feat: mmlu_pro benchmark" to "feat: mmlu_pro and mmlu_prox benchmarks" Apr 2, 2026
Signed-off-by: Frankie Siino <fsiino@nvidia.com>
Signed-off-by: Frankie Siino <fsiino@nvidia.com>
fsiino-nvidia marked this pull request as ready for review April 3, 2026 20:26
Signed-off-by: Frankie Siino <fsiino@nvidia.com>
bxyu-nvidia merged commit b7b3398 into main Apr 5, 2026
6 checks passed
bxyu-nvidia deleted the fsiino/mmlu-pro branch April 5, 2026 19:58
kajalj22 pushed a commit that referenced this pull request Apr 7, 2026
mmlu-pro: https://wandb.ai/nvidia/fsiino-gym-dev/runs/mi6p08ns
83.90957446808511

mmlu-prox: https://wandb.ai/nvidia/fsiino-gym-dev/runs/fxhaochj
70.33903109674858

---------

Signed-off-by: Frankie Siino <fsiino@nvidia.com>
cmunley1 pushed a commit that referenced this pull request Apr 8, 2026
mmlu-pro: https://wandb.ai/nvidia/fsiino-gym-dev/runs/mi6p08ns
83.90957446808511

mmlu-prox: https://wandb.ai/nvidia/fsiino-gym-dev/runs/fxhaochj
70.33903109674858

---------

Signed-off-by: Frankie Siino <fsiino@nvidia.com>
Signed-off-by: cmunley1 <cmunley@nvidia.com>