dmis-lab/SciCON
SciCon

Code for scientific figure multiple-choice QA with contrastive decoding.

Method Overview

SciCon overview

SciCon is a simple contrastive decoding method for scientific figure multiple-choice QA.

  • The model first scores each answer candidate with the full multimodal input.
  • It then scores the same candidates again using a text-only version of the question.
  • SciCon subtracts the text-only prior, scaled by alpha, from the multimodal score.
  • This suppresses answers that are mainly favored by textual bias and promotes answers grounded in the figure.

In short, SciCon turns answer choices into an explicit prior and removes that prior during decoding so that the final prediction relies more on visual evidence.
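The scoring steps above can be sketched in a few lines. This is an illustrative sketch only, not the repository's actual code: the function names, the use of plain log-probability lists, and the `alpha` default are assumptions made for the example.

```python
def contrastive_scores(multimodal_scores, text_only_scores, alpha=1.0):
    """Subtract the alpha-scaled text-only prior from each candidate's
    multimodal score (both inputs are per-candidate log-probabilities)."""
    return [m - alpha * t for m, t in zip(multimodal_scores, text_only_scores)]

def pick_answer(candidates, multimodal_scores, text_only_scores, alpha=1.0):
    """Return the candidate with the highest contrastive score."""
    scores = contrastive_scores(multimodal_scores, text_only_scores, alpha)
    return max(zip(candidates, scores), key=lambda pair: pair[1])[0]

# Candidate "A" is favored by the text-only prior; after subtracting that
# prior, "B" (which the figure actually supports) wins.
print(pick_answer(["A", "B"], [2.0, 1.0], [1.5, 0.25]))
```

With `alpha = 0` this reduces to ordinary multimodal scoring; larger `alpha` penalizes candidates more aggressively for being predictable from the question text alone.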


What Is Included

  • evaluation script for contrastive decoding over answer candidates
  • automatic dataset path discovery under data/
  • support for mac, scifi, and mmsci
  • OpenAI-compatible API inference

What Is Not Included

  • dataset files
  • model weights
  • training code
  • built-in model serving

Data

This repository does not bundle the datasets. To run the released script on the same benchmarks, obtain the files from each benchmark's own dataset repository.

Place the prepared files under data/. The script will try to auto-detect standard layouts, and you can also pass paths manually through command-line arguments.

Example layout:

data/
  MAC_Bench/
    test.jsonl
  images/
    MAC_Bench/
      ...
  scifi/
    test.parquet
  mmsci/
    test.json
    images/
      ...

Setup

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Quick Start

Run against any OpenAI-compatible VLM endpoint:

python src/run_always_contrastive_all_candidate.py \
  --dataset mac \
  --api-base http://127.0.0.1:30000/v1 \
  --output-jsonl results/mac_predictions.jsonl

Supported datasets:

  • mac
  • scifi
  • mmsci

If --api-model is not provided, the script tries to auto-detect it from /v1/models.
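To see what auto-detection would pick up, you can query the same endpoint directly (the URL below assumes the local server from the Quick Start example):

```shell
# List the models the server exposes; --api-model auto-detection
# queries this same /v1/models endpoint.
curl http://127.0.0.1:30000/v1/models
```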

Serving Backends

The script expects an OpenAI-compatible API for a vision-language model.

Typical options:

  • sglang
  • vLLM
  • other compatible servers exposing /v1/models and chat/completions APIs

Example: sglang

python src/run_always_contrastive_all_candidate.py \
  --dataset mac \
  --api-base http://127.0.0.1:30000/v1 \
  --output-jsonl results/mac_predictions.jsonl
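The endpoint above can be served with sglang along these lines (a sketch: the model name is a placeholder, and exact flags depend on your sglang version):

```shell
# Launch an sglang OpenAI-compatible server on port 30000,
# matching the --api-base used in the example above.
python -m sglang.launch_server \
  --model-path Qwen/Qwen2-VL-7B-Instruct \
  --port 30000
```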

Example: vLLM

python src/run_always_contrastive_all_candidate.py \
  --dataset scifi \
  --api-base http://127.0.0.1:8000/v1 \
  --api-model your-vlm-name \
  --output-jsonl results/scifi_predictions.jsonl
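A matching vLLM server can be launched along these lines (a sketch: `your-vlm-name` is the same placeholder as in the example above, and recent vLLM versions provide the `vllm serve` entry point):

```shell
# Serve a vision-language model via vLLM's OpenAI-compatible
# server on port 8000, matching the --api-base above.
vllm serve your-vlm-name --port 8000
```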

Notes:

  • the served model must support image input
  • if /v1/models is unavailable or empty, set --api-model explicitly
  • some VLMs need serving-side options such as chat templates or multimodal limits

More Examples

Smoke test on a small subset:

python src/run_always_contrastive_all_candidate.py \
  --dataset mac \
  --api-base http://127.0.0.1:30000/v1 \
  --max-samples 10 \
  --output-jsonl results/mac_smoke.jsonl

Explicit MMSci paths:

python src/run_always_contrastive_all_candidate.py \
  --dataset mmsci \
  --input-mmsci-json data/mmsci/test.json \
  --image-root data/mmsci/images \
  --api-base http://127.0.0.1:30000/v1 \
  --output-jsonl results/mmsci_predictions.jsonl

Output

By default, outputs are written to:

results/<dataset>_predictions.jsonl

Override this with --output-jsonl if needed.


Citation

If you use this repository in your research, please cite:

@article{roh2026choices,
  title={When Choices Become Priors: Contrastive Decoding for Scientific Figure Multiple-Choice QA},
  author={Roh, Taeyun and Jo, Eun-yeong and Jang, Wonjune and Kang, Jaewoo},
  journal={arXiv preprint arXiv:2603.28026},
  year={2026}
}

Paper

When Choices Become Priors: Contrastive Decoding for Scientific Figure Multiple-Choice QA
Taeyun Roh, Eun-yeong Jo, Wonjune Jang, Jaewoo Kang

This repository contains the evaluation code accompanying the paper and is intended as a lightweight research release for scientific figure multiple-choice QA.
