Implement grading feature by w8385 · Pull Request #57 · Coduck-Team/coduck-backend

w8385 · 2026-01-27T04:15:21Z

Closes #56

Summary

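This PR introduces a Docker + isolate based grading system to the backend.

The grading execution environment is split into three layers, Docker → isolate → grader,
with the goal of achieving security, reproducibility, and extensibility at the same time.

uploads.zip
To test, extract the file above into the project root directory.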

Module Diagram

flowchart TD
    A[HTTP Handler] --> B[Grader]
    B --> C[Isolate]
    C --> D[Docker Exec / Cp]

    %% isolate internals
    subgraph Isolate Module
        C1[init]
        C2[copy]
        C3[compile]
        C4[execute]
        C5[cleanup]
    end

    C --> C1
    C --> C2
    C --> C3
    C --> C4
    C --> C5

    %% compile / execute configs
    B --> E[LanguageRegistry]
    E --> F[CompileConfig]
    E --> G[ExecuteConfig]

    %% errors
    D -->|DockerError| C
    C -->|IsolateError| B
    B -->|JudgeError| A
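
The error edges in the diagram (DockerError → IsolateError → JudgeError) can be sketched with `From` conversions and the `?` operator. This is a minimal illustration only: the variants and the `docker_exec`, `isolate_run`, and `grade` functions are hypothetical stand-ins, not the PR's actual definitions.

```rust
// Hypothetical, simplified error types; the PR's real enums may differ.
#[derive(Debug, PartialEq)]
pub enum DockerError {
    CommandFailed(String),
}

#[derive(Debug, PartialEq)]
pub enum IsolateError {
    Docker(DockerError),
}

#[derive(Debug, PartialEq)]
pub enum JudgeError {
    Isolate(IsolateError),
}

// Each layer wraps the error of the layer directly below it.
impl From<DockerError> for IsolateError {
    fn from(e: DockerError) -> Self {
        IsolateError::Docker(e)
    }
}

impl From<IsolateError> for JudgeError {
    fn from(e: IsolateError) -> Self {
        JudgeError::Isolate(e)
    }
}

// Lowest layer: stands in for a `docker exec` wrapper.
pub fn docker_exec(succeed: bool) -> Result<String, DockerError> {
    if succeed {
        Ok("done".to_string())
    } else {
        Err(DockerError::CommandFailed("exit 1".to_string()))
    }
}

// Middle layer: `?` converts DockerError into IsolateError.
pub fn isolate_run(succeed: bool) -> Result<String, IsolateError> {
    Ok(docker_exec(succeed)?)
}

// Top layer: `?` converts IsolateError into JudgeError.
pub fn grade(succeed: bool) -> Result<String, JudgeError> {
    Ok(isolate_run(succeed)?)
}
```

With this shape, each layer only needs to know about the error type of the layer directly beneath it.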

Architecture

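Docker
- Lowest layer: an adapter that wraps docker exec and docker cp
- Pins the grading environment and provides the first level of isolation
isolate
- Middle layer: runs isolate commands through the docker interface
- Sandboxing for running untrusted code
- Restricts process and file access
- (Planned) time and memory limits
grader
- Top layer: uses only the isolate interface
- Orchestrates compile → execute → result collection
LanguageRegistry
- Generates per-language compile/execute command profiles
- Ideally this would be defined cleanly in YAML or similar, but for now it is hardcoded in grader/config.rs
- Manages CompileConfig and ExecuteConfig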
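
As a rough sketch of what a hardcoded per-language profile might look like: the `Language`, `Profile`, and `profile_for` names below are hypothetical, and the compiler flags are assumptions (only `-Xss512m` and `Main` for Java appear in the reviewed diff). The real `CompileConfig` and `ExecuteConfig` in grader/config.rs may be structured differently.

```rust
// Hypothetical stand-in for the PR's `Language` type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Language {
    Cpp,
    Python,
    Java,
}

// One profile per language: an optional compile command plus an execute command.
#[derive(Debug, PartialEq)]
pub struct Profile {
    pub compile: Option<Vec<String>>, // None for interpreted languages
    pub execute: Vec<String>,
}

fn to_strings(parts: &[&str]) -> Vec<String> {
    parts.iter().map(|s| s.to_string()).collect()
}

// `source` is the source file name, `stem` the name without extension.
// The flags below are illustrative assumptions, not the PR's actual values.
pub fn profile_for(language: Language, source: &str, stem: &str) -> Profile {
    match language {
        Language::Cpp => Profile {
            compile: Some(to_strings(&["g++", "-O2", "-o", stem, source])),
            execute: vec![format!("./{}", stem)],
        },
        Language::Python => Profile {
            compile: None,
            execute: to_strings(&["python3", source]),
        },
        Language::Java => Profile {
            compile: Some(to_strings(&["javac", source])),
            execute: to_strings(&["java", "-Xss512m", "Main"]),
        },
    }
}
```

Keeping the table in one function means adding a language touches a single `match`, which is the property the description aims for.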

Handler Explain

1. `initialize_isolate`

POST /problems/{problem_id}/init

Initializes the Docker-based isolate sandbox
Copies the contents of the local upload directory into the sandbox box
Inside isolate
- /box/ is created
- Files under uploads/{problem_id}/ are copied to the box root

2. `compile_file`

POST /problems/{problem_id}/{category}/{filename}/compile?language=cpp

Compiles inside the box
Retrieves the executable based on the compile output path
Inside isolate
- The build artifact is produced (executable or class file)
Local
- uploads/{problem_id}/{executable} is created

3. `execute_file`

POST /problems/{problem_id}/{category}/{filename}/execute?language=cpp

Runs inside the isolate box
Collects stdout / stderr / exit code

{
    "exit_code": 0,
    "message": "File hello executed successfully in category solutions for problem ID 0",
    "stderr": "OK (0.002 sec real, 0.006 sec wall)\n",
    "stdout": "Hello, World!\n"
}
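
In the sample response above, `stderr` carries isolate's status line with real and wall times. If a client wanted those numbers, a parser along these lines could extract them. This is a sketch that assumes exactly the `OK (… sec real, … sec wall)` shape shown in the sample; `parse_isolate_ok` is a hypothetical helper, and other isolate status messages are not handled.

```rust
/// Parses a line like "OK (0.002 sec real, 0.006 sec wall)" into
/// (real_seconds, wall_seconds). Returns None for any other shape.
pub fn parse_isolate_ok(line: &str) -> Option<(f64, f64)> {
    // Strip the fixed "OK (" prefix and the closing parenthesis.
    let inner = line.trim().strip_prefix("OK (")?.strip_suffix(')')?;
    // Split "0.002 sec real, 0.006 sec wall" at the comma.
    let (real_part, wall_part) = inner.split_once(", ")?;
    let real: f64 = real_part.strip_suffix(" sec real")?.parse().ok()?;
    let wall: f64 = wall_part.strip_suffix(" sec wall")?.parse().ok()?;
    Some((real, wall))
}
```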

4. `generator`

POST /problems/{problem_id}/generate?count=10

Compile gen.cpp
For each test
- Generate the input
- Run the reference solution
- Retrieve the result files

Inside isolate
- tests/input/*.in
- tests/answer/*.a
Local
- uploads/{problem_id}/tests/input/*.in
- uploads/{problem_id}/tests/answer/*.a

{
    "message": "Generated 10 test cases using gen for problem ID 0"
}
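
The file layout listed above follows a zero-padded naming scheme (`{:02}`), as the handler code in this PR shows. A small sketch of how the box-relative and local paths relate; `TestPaths`, `test_paths`, and `local_path` are hypothetical helper names, not part of the PR.

```rust
/// Box-relative paths for one generated test case.
#[derive(Debug, PartialEq)]
pub struct TestPaths {
    pub input: String,
    pub answer: String,
}

/// Builds the paths for test `index`, using the two-digit padding
/// that the generator handler uses (`{:02}`).
pub fn test_paths(index: u32) -> TestPaths {
    TestPaths {
        input: format!("tests/input/{:02}.in", index),
        answer: format!("tests/answer/{:02}.a", index),
    }
}

/// Maps a box-relative path to its location under the local upload directory.
pub fn local_path(upload_dir: &str, problem_id: u32, box_path: &str) -> String {
    format!("{}/{}/{}", upload_dir, problem_id, box_path)
}
```

Note that `{:02}` pads to at least two digits, so indices of 100 or more simply produce three-digit names.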

5. `checker`

POST /problems/{problem_id}/{category}/{filename}/check?language=cpp

Compile the submission
Compile wcmp.cpp
For each test
- Run the submission → produce output
- Run wcmp → determine the verdict

Inside isolate
- tests/output/*.out is created
Local
- No files are retrieved (results are returned as JSON only)

{
    "test_0": {
        "verdict": "ok \"1684907\"\nOK (0.001 sec real, 0.001 sec wall)\n"
    },
    ...
    "test_9": {
        "verdict": "ok \"992346\"\nOK (0.001 sec real, 0.001 sec wall)\n"
    }
}
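
The `verdict` field above is the raw checker stderr followed by isolate's status line. A consumer could reduce it to a structured result roughly as follows. This is a sketch that relies only on the `ok …` first line visible in the sample; the `Verdict` enum and `parse_verdict` are hypothetical, and every non-`ok` first line is treated as a rejection without further classification.

```rust
#[derive(Debug, PartialEq)]
pub enum Verdict {
    /// The checker's first line started with "ok"; `comment` is the rest of that line.
    Accepted { comment: String },
    /// Anything else; `raw` keeps the first line for display.
    Rejected { raw: String },
}

/// Classifies checker output by its first line only.
pub fn parse_verdict(checker_stderr: &str) -> Verdict {
    let first_line = checker_stderr.lines().next().unwrap_or("").trim();
    match first_line.strip_prefix("ok") {
        // Require "ok" to be a whole word, so "okay..." is not accepted.
        Some(rest) if rest.is_empty() || rest.starts_with(char::is_whitespace) => {
            Verdict::Accepted {
                comment: rest.trim().to_string(),
            }
        }
        _ => Verdict::Rejected {
            raw: first_line.to_string(),
        },
    }
}
```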

6. `cleanup_isolate`

DELETE /problems/{problem_id}/cleanup

Deletes the isolate box

Result

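A grading environment with hardened security
Minimal code changes when adding a new language
A structure that can later be extended to distributed grading workers, e.g. on k8s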

Copilot

Pull request overview

This PR implements a Docker + isolate based grading system for competitive programming problems. The implementation uses a 3-tier architecture (Docker → isolate → grader) to provide secure, isolated code execution with compile, execute, test generation, and checking capabilities.

Changes:

Added grading infrastructure with Docker and isolate sandbox integration
Implemented 6 new HTTP endpoints for problem compilation, execution, test generation, checking, and sandbox management
Created modular architecture with separate docker, isolate, and grader layers with language-specific compilation/execution configurations

Reviewed changes

Copilot reviewed 26 out of 26 changed files in this pull request and generated 42 comments.

File	Description
src/lib.rs	Added judge_manager routes and module declarations
src/judge_manager/mod.rs	Module definition for judge manager handlers
src/judge_manager/handlers.rs	HTTP handlers for initialize, compile, execute, generator, checker, and cleanup operations
src/isolate/path.rs	Helper for constructing isolate box paths
src/isolate/mod.rs	Isolate wrapper module with init, compile, execute, copy, and cleanup operations
src/isolate/init.rs	Isolate box initialization implementation
src/isolate/execute.rs	Code execution within isolate sandbox
src/isolate/copy.rs	File copying to/from isolate box
src/isolate/compile.rs	Code compilation within isolate sandbox
src/isolate/cleanup.rs	Isolate box cleanup implementation
src/isolate/box_id.rs	BoxId type wrapper for isolate box identification
src/grader/result.rs	Result types for compile and execute operations
src/grader/mod.rs	Grader module definition
src/grader/grader.rs	Grader trait defining the grading interface
src/grader/docker_isolate.rs	DockerIsolateGrader implementation of Grader trait
src/grader/config.rs	Language-specific compilation and execution configurations
src/errors/mod.rs	Added docker, isolate, and judge error modules
src/errors/judge.rs	JudgeError enum and IntoResponse implementation
src/errors/isolate.rs	IsolateError enum with Display implementation
src/errors/docker.rs	DockerError enum with Display implementation
src/docker/mod.rs	Docker module definition
src/docker/exec.rs	Docker exec command wrapper with output handling
src/docker/cp.rs	Docker cp command wrapper for file copying
docker-compose.yml	Docker Compose configuration for grader container
Dockerfile	Docker image with gcc, Python, Java, isolate, and testlib
Cargo.toml	Added async-trait dependency

Comments suppressed due to low confidence (1)

src/judge_manager/handlers.rs:206

The new judge_manager module lacks test coverage. Given that the existing codebase has comprehensive tests for the file_manager module (see tests/file_manager/handlers.rs), similar test coverage should be added for judge_manager handlers to maintain consistency. Consider adding integration tests for initialize_isolate, compile_file, execute_file, generator, checker, and cleanup_isolate.

use crate::errors::JudgeError;
use crate::file_manager::Language;
use crate::grader::config::{CompileConfig, ExecuteConfig};
use crate::grader::docker_isolate::DockerIsolateGrader;
use crate::grader::grader::Grader;
use axum::extract::Query;
use axum::{extract::Path, response::IntoResponse, Json};
use serde::Deserialize;

const UPLOAD_DIR: &str = "uploads";

pub async fn initialize_isolate(
    Path(problem_id): Path<u32>,
) -> Result<impl IntoResponse, JudgeError> {
    let grader = DockerIsolateGrader::new("coduck-grader");
    grader.init(problem_id).await?;
    grader
        .copy_to_box(problem_id, &format!("{}/{}/.", UPLOAD_DIR, problem_id), ".")
        .await?;

    Ok(Json(serde_json::json!({
        "message": format!("Sandbox initialized for problem ID {}", problem_id)
    })))
}

#[derive(Deserialize)]
pub struct OptionLanguage {
    language: Option<Language>,
}

pub async fn compile_file(
    Path((problem_id, category, filename)): Path<(u32, String, String)>,
    params: Query<OptionLanguage>,
) -> Result<impl IntoResponse, JudgeError> {
    let language = params.language.clone().unwrap();

    let grader = DockerIsolateGrader::new("coduck-grader");
    let result = grader
        .compile(
            problem_id,
            CompileConfig::new(&category, &filename, language),
        )
        .await?;

    let output_path = result.output.clone();
    grader
        .copy_from_box(
            problem_id,
            &output_path,
            &format!("{}/{}/{}", UPLOAD_DIR, problem_id, output_path),
        )
        .await?;

    Ok(Json(serde_json::json!({
        "message": format!("File {} compiled successfully in category {} for problem ID {}", filename, category, problem_id),
    })))
}

pub async fn execute_file(
    Path((problem_id, category, filename)): Path<(u32, String, String)>,
    params: Query<OptionLanguage>,
) -> Result<impl IntoResponse, JudgeError> {
    let language = params.language.clone().unwrap();

    let grader = DockerIsolateGrader::new("coduck-grader");
    let result = grader
        .execute(
            problem_id,
            ExecuteConfig::new(&category, &filename, language),
        )
        .await?;

    Ok(Json(serde_json::json!({
        "message": format!("File {} executed successfully in category {} for problem ID {}", filename, category, problem_id),
        "stdout": result.stdout,
        "stderr": result.stderr,
        "exit_code": result.exit_code,
    })))
}

#[derive(Deserialize)]
pub struct OptionGenerateCount {
    count: Option<u32>,
}

pub async fn generator(
    Path(problem_id): Path<u32>,
    params: Query<OptionGenerateCount>,
) -> Result<impl IntoResponse, JudgeError> {
    let count = params.count.unwrap_or(10);

    let grader = DockerIsolateGrader::new("coduck-grader");
    grader
        .compile(
            problem_id,
            CompileConfig::new("files", "gen.cpp", Language::Cpp),
        )
        .await?;

    for i in 0..count {
        grader
            .execute(
                problem_id,
                ExecuteConfig::new("files", "gen", Language::Cpp)
                    .stdout(&format!("tests/input/{:02}.in", i))
                    .arg(&i.to_string()),
            )
            .await?;

        grader
            .copy_from_box(
                problem_id,
                &format!("tests/input/{:02}.in", i),
                &format!("{}/{}/tests/input/{:02}.in", UPLOAD_DIR, problem_id, i),
            )
            .await?;

        grader
            .execute(
                problem_id,
                ExecuteConfig::new("solutions", "mcs", Language::Cpp)
                    .stdin(&format!("tests/input/{:02}.in", i))
                    .stdout(&format!("tests/answer/{:02}.a", i)),
            )
            .await?;

        grader
            .copy_from_box(
                problem_id,
                &format!("tests/answer/{:02}.a", i),
                &format!("{}/{}/tests/answer/{:02}.a", UPLOAD_DIR, problem_id, i),
            )
            .await?;
    }

    Ok(Json(serde_json::json!({
        "message": format!("Generated {} test cases using {} for problem ID {}", count, "gen", problem_id)
    })))
}

pub async fn checker(
    Path((problem_id, category, filename)): Path<(u32, String, String)>,
    params: Query<OptionLanguage>,
) -> Result<impl IntoResponse, JudgeError> {
    let language = params.language.clone().unwrap();

    let grader = DockerIsolateGrader::new("coduck-grader");
    let submission_result = grader
        .compile(
            problem_id,
            CompileConfig::new(&category, &filename, language),
        )
        .await?
        .output
        .clone();
    let executable = submission_result.split('/').last().unwrap();
    println!("submission_result: {}", submission_result);
    println!("executable: {}", executable);

    grader
        .compile(
            problem_id,
            CompileConfig::new("files", "wcmp.cpp", Language::Cpp),
        )
        .await?;
    println!("check compiled");

    let mut json_result = Json(serde_json::json!({}));
    for i in 0..10 {
        grader
            .execute(
                problem_id,
                ExecuteConfig::new(&category, &executable, Language::Cpp)
                    .stdin(&format!("tests/input/{:02}.in", i))
                    .stdout(&format!("tests/output/{:02}.out", i)),
            )
            .await?;
        println!("executed test {}", i);

        let result = grader
            .execute(
                problem_id,
                ExecuteConfig::new("files", "wcmp", Language::Cpp)
                    .arg(&format!("tests/input/{:02}.in", i))
                    .arg(&format!("tests/output/{:02}.out", i))
                    .arg(&format!("tests/answer/{:02}.a", i)),
            )
            .await?;

        json_result.0[format!("test_{}", i)] = serde_json::json!({
            "verdict": result.stderr,
        });
    }

    Ok(json_result)
}

pub async fn cleanup_isolate(Path(problem_id): Path<u32>) -> Result<impl IntoResponse, JudgeError> {
    let grader = DockerIsolateGrader::new("coduck-grader");
    grader.cleanup(problem_id).await?;

    Ok(Json(serde_json::json!({
        "message": format!("Sandbox cleaned up for problem ID {}", problem_id)
    })))
}

Copilot · 2026-02-02T13:20:34Z

src/judge_manager/handlers.rs

+    params: Query<OptionLanguage>,
+) -> Result<impl IntoResponse, JudgeError> {
+    let language = params.language.clone().unwrap();
+


The .unwrap() call will panic if the language parameter is not provided. Consider returning a proper error response instead. For example, return a JudgeError variant indicating that the language parameter is required.

Copilot · 2026-02-02T13:20:34Z

src/judge_manager/handlers.rs

+
+    let mut json_result = Json(serde_json::json!({}));
+    for i in 0..10 {
+        grader


The number of test cases is hardcoded to 10. This should match the test count used in the generator function or be configurable. Consider accepting this as a parameter or reading it from a configuration.
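
One way to address this, sketched under the assumption that both handlers accept the same optional `count` query parameter: resolve the count through a single shared helper. `DEFAULT_TEST_COUNT`, `MAX_TEST_COUNT`, and `resolve_test_count` are hypothetical names, and the upper bound is an illustrative choice rather than something the PR defines.

```rust
/// Default used when the request does not specify a count
/// (matches the `unwrap_or(10)` in the generator handler).
pub const DEFAULT_TEST_COUNT: u32 = 10;

/// Hypothetical upper bound to keep a single request from running unbounded work.
pub const MAX_TEST_COUNT: u32 = 100;

/// Shared by the generator and the checker so both loop over the same range.
pub fn resolve_test_count(requested: Option<u32>) -> u32 {
    requested.unwrap_or(DEFAULT_TEST_COUNT).min(MAX_TEST_COUNT)
}
```

The checker loop would then iterate over `0..resolve_test_count(params.count)` instead of `0..10`.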

docker-compose.yml

Copilot · 2026-02-02T13:20:35Z

src/judge_manager/handlers.rs

+        .compile(
+            problem_id,
+            CompileConfig::new("files", "wcmp.cpp", Language::Cpp),
+        )


The checker filename "wcmp.cpp" is hardcoded. Consider making this configurable or passing it as a parameter to support different checker implementations.

Copilot · 2026-02-02T13:20:35Z

src/grader/config.rs

+                "-Xss512m",
+                "Main",
+            ],
+            _ => vec![],


The match arms for unsupported languages return an empty vector. This could lead to silent failures where no execution occurs. Consider returning an error for unsupported languages instead of silently doing nothing.
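
A minimal sketch of the alternative this comment describes: returning an error for unsupported languages rather than an empty command. The `Language` variants, `ConfigError`, and `execute_command` are hypothetical stand-ins and may not match grader/config.rs.

```rust
// Hypothetical language set; `Text` stands in for any non-executable type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Language {
    Cpp,
    Java,
    Text,
}

#[derive(Debug, PartialEq)]
pub enum ConfigError {
    UnsupportedLanguage(Language),
}

/// Returns the execute command, or an explicit error instead of an empty vector.
pub fn execute_command(language: Language, stem: &str) -> Result<Vec<String>, ConfigError> {
    match language {
        Language::Cpp => Ok(vec![format!("./{}", stem)]),
        Language::Java => Ok(vec![
            "java".to_string(),
            "-Xss512m".to_string(),
            "Main".to_string(),
        ]),
        // Fail loudly so the caller can map this to an HTTP error.
        other => Err(ConfigError::UnsupportedLanguage(other)),
    }
}
```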

Copilot · 2026-02-02T13:20:44Z

src/judge_manager/handlers.rs

+    Path((problem_id, category, filename)): Path<(u32, String, String)>,
+    params: Query<OptionLanguage>,
+) -> Result<impl IntoResponse, JudgeError> {
+    let language = params.language.clone().unwrap();


The .unwrap() call will panic if the language parameter is not provided. Consider returning a proper error response instead. For example, return a JudgeError variant indicating that the language parameter is required.

Suggested change

-    let language = params.language.clone().unwrap();
+    let language = params
+        .language
+        .clone()
+        .ok_or(JudgeError::InvalidRequest(
+            "language parameter is required".to_string(),
+        ))?;

Copilot · 2026-02-02T13:20:44Z

src/judge_manager/handlers.rs

+            )
+            .await?;
+        println!("executed test {}", i);
+


Debug print statements should be removed or replaced with proper logging using a logging framework (e.g., tracing or log). These println statements are not appropriate for production code.

Copilot · 2026-02-02T13:20:44Z

src/judge_manager/handlers.rs

+    let grader = DockerIsolateGrader::new("coduck-grader");
+    grader.init(problem_id).await?;
+    grader
+        .copy_to_box(problem_id, &format!("{}/{}/.", UPLOAD_DIR, problem_id), ".")


The problem_id parameter is used directly in file paths without validation. Consider validating that problem_id doesn't contain path traversal sequences (e.g., "..", "/") to prevent potential security issues where malicious input could access files outside the intended directory.
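
In the handlers shown in this PR, `problem_id` is extracted as `Path<u32>`, so it cannot itself carry `..` or `/`. The same concern does apply to the `category` and `filename` string segments that are interpolated into paths. A conservative allowlist check might look like the sketch below; `is_safe_segment` is a hypothetical helper, and the allowed character set is an assumption.

```rust
/// Returns true if `segment` is safe to use as a single path component:
/// non-empty, not "." or "..", and made only of ASCII letters, digits,
/// '.', '_' or '-'. Separators such as '/' and '\\' are rejected by the allowlist.
pub fn is_safe_segment(segment: &str) -> bool {
    !segment.is_empty()
        && segment != "."
        && segment != ".."
        && segment
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || matches!(c, '.' | '_' | '-'))
}
```

A handler could reject the request with a 4xx error when either segment fails this check, before building any path.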

Copilot · 2026-02-02T13:20:44Z

Dockerfile

+RUN git clone https://github.com/ioi/isolate.git
+RUN cd isolate && make install


The RUN git clone https://github.com/ioi/isolate.git step fetches and installs a third-party tool directly from a mutable Git branch without pinning to a specific commit or verifying integrity, which exposes the build to supply-chain compromise if the repository or network is tampered with. An attacker who gains control over that Git ref could inject arbitrary code into the image and thus the grading environment. Pin this dependency to an immutable commit or signed release and add integrity verification (or vendor it) instead of cloning the moving default branch at build time.

Copilot · 2026-02-02T13:20:45Z

Dockerfile

+RUN curl -L https://raw.githubusercontent.com/MikeMirzayanov/testlib/master/testlib.h \
+    -o /usr/include/testlib.h


The RUN curl -L https://raw.githubusercontent.com/MikeMirzayanov/testlib/master/testlib.h step downloads source code from a mutable remote branch without any checksum or signature verification, creating a supply-chain risk if that URL or branch is compromised. Because this header will be compiled into all judged programs, a malicious change upstream could result in arbitrary code execution inside your grading environment. Pin this asset to an immutable version (e.g., a specific commit) and verify its integrity via a known hash or signature instead of relying on the moving master branch.

reddevilmidzy

Great work. Please also write unit tests and e2e tests 🙏

src/docker/cp.rs

w8385 added 17 commits January 27, 2026 12:55

feat: /judge/init

4764810

refactor: remove unnecessary IntoResponse implementations

cfc6d35

refactor: migrate judge components to grader module

2b84dcd

feat: /judge/compile

7ce815f

feat: /judge/cleanup

3f3a929

feat: /judge/execute

72e6003

fix: add classpath option to Java execution configuration

efdcad0

refactor: add ExecuteConfig builders

4a08ead

arg, stdin, stdout

feat: include output path in CompileResult

423b696

refactor: update copy functions to use string paths instead of Path

8636864

refactor: replace PathBuf with String for box_root function and updat…

b048d58

…e copy functions

feat: /judge/generator

c938c50

refactor: change arg method to &str instead of String

75dde35

feat: /judge/checker

7b590e8

refactor: remove unused error variants from JudgeError enum

61cdecf

refactor: remove unused fields from CompileConfig and ExecuteConfig s…

47c17be

…tructs

refactor: remove unused stdout field from compilation response

bd0387e

w8385 requested review from Copilot, reddevilmidzy and utilForever February 2, 2026 13:11

Copilot started reviewing on behalf of w8385 February 2, 2026 13:11

w8385 marked this pull request as ready for review February 2, 2026 13:11

w8385 added 2 commits February 2, 2026 22:15

refactor: remove unused stdout field from CompileResult struct

2a88bdc

refactor: remove unused stdout field from CompileResult in compile.rs

f30bff8

Copilot AI reviewed Feb 2, 2026

reddevilmidzy requested changes Feb 2, 2026

w8385 added 3 commits February 2, 2026 22:56

refactor: simplify docker cp operations by extracting helper functions

577c92c

refactor: clean up Dockerfile by removing apt cache after installation

c573429

refactor: handle Docker command spawn errors with proper error mapping

75ba7b4

w8385 wants to merge 22 commits into main from 56-judge


2 participants
