
Implement grading feature #57

Open
w8385 wants to merge 22 commits into main from 56-judge

@w8385 commented Jan 27, 2026

Closes #56

Summary

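This PR introduces a Docker + isolate based grading system to the backend.

The grading execution environment is split into three layers, Docker → isolate → grader,
with the goal of achieving security, reproducibility, and extensibility at the same time.

uploads.zip
To test, extract the file above into the project root directory.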


Module Diagram

flowchart TD
    A[HTTP Handler] --> B[Grader]
    B --> C[Isolate]
    C --> D[Docker Exec / Cp]

    %% isolate internals
    subgraph Isolate Module
        C1[init]
        C2[copy]
        C3[compile]
        C4[execute]
        C5[cleanup]
    end

    C --> C1
    C --> C2
    C --> C3
    C --> C4
    C --> C5

    %% compile / execute configs
    B --> E[LanguageRegistry]
    E --> F[CompileConfig]
    E --> G[ExecuteConfig]

    %% errors
    D -->|DockerError| C
    C -->|IsolateError| B
    B -->|JudgeError| A
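
The error edges in the diagram (DockerError → IsolateError → JudgeError) could be wired with `From` conversions so that `?` lifts an error one layer at a time. The following is a minimal sketch under that assumption; the variant names (`ExecFailed`, `Docker`, `Isolate`) and the functions are hypothetical, and the actual enums in src/errors/ may look different.

```rust
use std::fmt;

// Hypothetical variants for illustration; the PR's actual enums may differ.
#[derive(Debug, PartialEq)]
enum DockerError {
    ExecFailed(String),
}

#[derive(Debug, PartialEq)]
enum IsolateError {
    Docker(DockerError),
}

#[derive(Debug, PartialEq)]
enum JudgeError {
    Isolate(IsolateError),
}

impl From<DockerError> for IsolateError {
    fn from(e: DockerError) -> Self {
        IsolateError::Docker(e)
    }
}

impl From<IsolateError> for JudgeError {
    fn from(e: IsolateError) -> Self {
        JudgeError::Isolate(e)
    }
}

impl fmt::Display for JudgeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            JudgeError::Isolate(IsolateError::Docker(DockerError::ExecFailed(msg))) => {
                write!(f, "judge error: isolate error: docker exec failed: {msg}")
            }
        }
    }
}

// Lowest layer: stands in for a failing `docker exec` call.
fn docker_exec() -> Result<(), DockerError> {
    Err(DockerError::ExecFailed("container not running".to_string()))
}

// Middle layer: `?` converts DockerError into IsolateError.
fn isolate_init() -> Result<(), IsolateError> {
    docker_exec()?;
    Ok(())
}

// Top layer: `?` converts IsolateError into JudgeError.
fn grade() -> Result<(), JudgeError> {
    isolate_init()?;
    Ok(())
}

fn main() {
    if let Err(e) = grade() {
        println!("{e}");
    }
}
```

With this shape each layer only names the error type directly below it, which matches the rule that the grader talks only to the isolate interface.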

Architecture

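  • Docker
    The lowest layer: an adapter that wraps docker exec and docker cp
    Pins the grading environment and provides the first level of isolation

  • isolate
    The middle layer: runs isolate commands through the docker interface
    Sandboxing for running untrusted code
    Restricts process / file access
    (Planned) time / memory limits

  • grader
    The top layer: uses only the isolate interface
    Orchestrates compile → execute → result collection

  • LanguageRegistry
    Builds per-language compile/execute command profiles
    I wanted to do this cleanly with YAML or similar, but for now it is hardcoded inside grader/config.rs
    Manages CompileConfig and ExecuteConfig.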
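
As a rough illustration of what a hardcoded per-language profile could look like, here is a minimal sketch. It is not the PR's grader/config.rs: the `Language` variants, the `Profile` struct, and the compiler flags are assumptions, with only `-Xss512m` and `Main` taken from the Java arguments quoted in the review below.

```rust
// Hypothetical language set and command lines; the PR's grader/config.rs
// may use different variants and flags.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Language {
    Cpp,
    Python,
    Java,
}

#[derive(Debug, PartialEq)]
struct Profile {
    // None means the language needs no compile step.
    compile: Option<Vec<String>>,
    execute: Vec<String>,
}

// Converts a slice of string slices into owned arguments.
fn to_args(parts: &[&str]) -> Vec<String> {
    parts.iter().map(|s| s.to_string()).collect()
}

// Builds the compile/execute command lines for a source file stem such as "hello".
fn profile(language: Language, stem: &str) -> Profile {
    match language {
        Language::Cpp => {
            let source = format!("{stem}.cpp");
            Profile {
                compile: Some(to_args(&["g++", "-O2", "-o", stem, source.as_str()])),
                execute: vec![format!("./{stem}")],
            }
        }
        Language::Python => {
            let source = format!("{stem}.py");
            Profile {
                compile: None,
                execute: to_args(&["python3", source.as_str()]),
            }
        }
        Language::Java => {
            let source = format!("{stem}.java");
            Profile {
                compile: Some(to_args(&["javac", source.as_str()])),
                execute: to_args(&["java", "-Xss512m", "Main"]),
            }
        }
    }
}

fn main() {
    println!("{:?}", profile(Language::Cpp, "hello"));
}
```

Because the `match` is exhaustive, adding a variant to the enum forces a profile to be written for it, which is one way to keep the change for a new language small.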


Handler Details

1. initialize_isolate

POST /problems/{problem_id}/init
  • Initializes the Docker-based isolate sandbox
  • Copies the contents of the local upload directory into the sandbox box
  • Inside isolate
    • /box/ is created
    • Files under uploads/{problem_id}/ are copied to the box root

2. compile_file

POST /problems/{problem_id}/{category}/{filename}/compile?language=cpp
  • Compiles inside the box
  • Retrieves the executable based on the compile output path
  • Inside isolate
    • Compile artifacts are produced (an executable or class files)
  • Local
    • uploads/{problem_id}/{executable} is created

3. execute_file

POST /problems/{problem_id}/{category}/{filename}/execute?language=cpp
  • Runs inside the isolate box
  • Collects stdout / stderr / exit code
{
    "exit_code": 0,
    "message": "File hello executed successfully in category solutions for problem ID 0",
    "stderr": "OK (0.002 sec real, 0.006 sec wall)\n",
    "stdout": "Hello, World!\n"
}

4. generator

POST /problems/{problem_id}/generate?count=10
  1. Compile gen.cpp
  2. For each test
    • Generate the input
    • Run the reference solution
    • Retrieve the result files
  • Inside isolate
    • tests/input/*.in
    • tests/answer/*.a
  • Local
    • uploads/{problem_id}/tests/input/*.in
    • uploads/{problem_id}/tests/answer/*.a
{
    "message": "Generated 10 test cases using gen for problem ID 0"
}

5. checker

POST /problems/{problem_id}/{category}/{filename}/check?language=cpp
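  1. Compile the submitted code
  2. Compile wcmp.cpp
  3. For each test
    • Run the submitted code → produce output
    • Run wcmp → determine the verdict
  • Inside isolate
    • tests/output/*.out is created
  • Local
    • No files are retrieved (results are returned only as JSON)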
{
    "test_0": {
        "verdict": "ok \"1684907\"\nOK (0.001 sec real, 0.001 sec wall)\n"
    },
    ...
    "test_9": {
        "verdict": "ok \"992346\"\nOK (0.001 sec real, 0.001 sec wall)\n"
    }
}
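
In the sample response the `verdict` field carries two lines: the checker's own message followed by isolate's status line. A client could reduce that string to a structured verdict. The sketch below assumes the checker message is always the first line and starts with `ok` on acceptance, as in the sample; `Verdict` and `parse_verdict` are hypothetical names, and the rejected message in `main` is made-up input.

```rust
// Hypothetical verdict type; not part of the PR.
#[derive(Debug, PartialEq)]
enum Verdict {
    Accepted,
    // Carries the checker's first line unchanged.
    Rejected(String),
}

// Classifies the checker's stderr by its first line only.
// Assumption: an accepted answer starts with "ok", as in the sample response.
fn parse_verdict(stderr: &str) -> Verdict {
    let first = stderr.lines().next().unwrap_or("").trim();
    if first == "ok" || first.starts_with("ok ") {
        Verdict::Accepted
    } else {
        Verdict::Rejected(first.to_string())
    }
}

fn main() {
    let sample = "ok \"1684907\"\nOK (0.001 sec real, 0.001 sec wall)\n";
    println!("{:?}", parse_verdict(sample));
    println!("{:?}", parse_verdict("wrong answer 1st numbers differ\n"));
}
```

Anything that does not start with `ok` is treated as rejected here, so an empty stderr also counts as a rejection rather than a silent pass.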

6. cleanup_isolate

DELETE /problems/{problem_id}/cleanup
  • Deletes the isolate box

Result

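  • A grading environment with stronger security
  • Minimal code changes when adding a language
  • A structure that can later be extended to distributed grading workers such as k8s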

@w8385 w8385 marked this pull request as ready for review February 2, 2026 13:11
Copilot AI left a comment

Pull request overview

This PR implements a Docker + isolate based grading system for competitive programming problems. The implementation uses a 3-tier architecture (Docker → isolate → grader) to provide secure, isolated code execution with compile, execute, test generation, and checking capabilities.

Changes:

  • Added grading infrastructure with Docker and isolate sandbox integration
  • Implemented 6 new HTTP endpoints for problem compilation, execution, test generation, checking, and sandbox management
  • Created modular architecture with separate docker, isolate, and grader layers with language-specific compilation/execution configurations

Reviewed changes

Copilot reviewed 26 out of 26 changed files in this pull request and generated 42 comments.

| File | Description |
| --- | --- |
| src/lib.rs | Added judge_manager routes and module declarations |
| src/judge_manager/mod.rs | Module definition for judge manager handlers |
| src/judge_manager/handlers.rs | HTTP handlers for initialize, compile, execute, generator, checker, and cleanup operations |
| src/isolate/path.rs | Helper for constructing isolate box paths |
| src/isolate/mod.rs | Isolate wrapper module with init, compile, execute, copy, and cleanup operations |
| src/isolate/init.rs | Isolate box initialization implementation |
| src/isolate/execute.rs | Code execution within isolate sandbox |
| src/isolate/copy.rs | File copying to/from isolate box |
| src/isolate/compile.rs | Code compilation within isolate sandbox |
| src/isolate/cleanup.rs | Isolate box cleanup implementation |
| src/isolate/box_id.rs | BoxId type wrapper for isolate box identification |
| src/grader/result.rs | Result types for compile and execute operations |
| src/grader/mod.rs | Grader module definition |
| src/grader/grader.rs | Grader trait defining the grading interface |
| src/grader/docker_isolate.rs | DockerIsolateGrader implementation of Grader trait |
| src/grader/config.rs | Language-specific compilation and execution configurations |
| src/errors/mod.rs | Added docker, isolate, and judge error modules |
| src/errors/judge.rs | JudgeError enum and IntoResponse implementation |
| src/errors/isolate.rs | IsolateError enum with Display implementation |
| src/errors/docker.rs | DockerError enum with Display implementation |
| src/docker/mod.rs | Docker module definition |
| src/docker/exec.rs | Docker exec command wrapper with output handling |
| src/docker/cp.rs | Docker cp command wrapper for file copying |
| docker-compose.yml | Docker Compose configuration for grader container |
| Dockerfile | Docker image with gcc, Python, Java, isolate, and testlib |
| Cargo.toml | Added async-trait dependency |
Comments suppressed due to low confidence (1)

src/judge_manager/handlers.rs:206

  • The new judge_manager module lacks test coverage. Given that the existing codebase has comprehensive tests for the file_manager module (see tests/file_manager/handlers.rs), similar test coverage should be added for judge_manager handlers to maintain consistency. Consider adding integration tests for initialize_isolate, compile_file, execute_file, generator, checker, and cleanup_isolate.
use crate::errors::JudgeError;
use crate::file_manager::Language;
use crate::grader::config::{CompileConfig, ExecuteConfig};
use crate::grader::docker_isolate::DockerIsolateGrader;
use crate::grader::grader::Grader;
use axum::extract::Query;
use axum::{extract::Path, response::IntoResponse, Json};
use serde::Deserialize;

const UPLOAD_DIR: &str = "uploads";

pub async fn initialize_isolate(
    Path(problem_id): Path<u32>,
) -> Result<impl IntoResponse, JudgeError> {
    let grader = DockerIsolateGrader::new("coduck-grader");
    grader.init(problem_id).await?;
    grader
        .copy_to_box(problem_id, &format!("{}/{}/.", UPLOAD_DIR, problem_id), ".")
        .await?;

    Ok(Json(serde_json::json!({
        "message": format!("Sandbox initialized for problem ID {}", problem_id)
    })))
}

#[derive(Deserialize)]
pub struct OptionLanguage {
    language: Option<Language>,
}

pub async fn compile_file(
    Path((problem_id, category, filename)): Path<(u32, String, String)>,
    params: Query<OptionLanguage>,
) -> Result<impl IntoResponse, JudgeError> {
    let language = params.language.clone().unwrap();

    let grader = DockerIsolateGrader::new("coduck-grader");
    let result = grader
        .compile(
            problem_id,
            CompileConfig::new(&category, &filename, language),
        )
        .await?;

    let output_path = result.output.clone();
    grader
        .copy_from_box(
            problem_id,
            &output_path,
            &format!("{}/{}/{}", UPLOAD_DIR, problem_id, output_path),
        )
        .await?;

    Ok(Json(serde_json::json!({
        "message": format!("File {} compiled successfully in category {} for problem ID {}", filename, category, problem_id),
    })))
}

pub async fn execute_file(
    Path((problem_id, category, filename)): Path<(u32, String, String)>,
    params: Query<OptionLanguage>,
) -> Result<impl IntoResponse, JudgeError> {
    let language = params.language.clone().unwrap();

    let grader = DockerIsolateGrader::new("coduck-grader");
    let result = grader
        .execute(
            problem_id,
            ExecuteConfig::new(&category, &filename, language),
        )
        .await?;

    Ok(Json(serde_json::json!({
        "message": format!("File {} executed successfully in category {} for problem ID {}", filename, category, problem_id),
        "stdout": result.stdout,
        "stderr": result.stderr,
        "exit_code": result.exit_code,
    })))
}

#[derive(Deserialize)]
pub struct OptionGenerateCount {
    count: Option<u32>,
}

pub async fn generator(
    Path(problem_id): Path<u32>,
    params: Query<OptionGenerateCount>,
) -> Result<impl IntoResponse, JudgeError> {
    let count = params.count.unwrap_or(10);

    let grader = DockerIsolateGrader::new("coduck-grader");
    grader
        .compile(
            problem_id,
            CompileConfig::new("files", "gen.cpp", Language::Cpp),
        )
        .await?;

    for i in 0..count {
        grader
            .execute(
                problem_id,
                ExecuteConfig::new("files", "gen", Language::Cpp)
                    .stdout(&format!("tests/input/{:02}.in", i))
                    .arg(&i.to_string()),
            )
            .await?;

        grader
            .copy_from_box(
                problem_id,
                &format!("tests/input/{:02}.in", i),
                &format!("{}/{}/tests/input/{:02}.in", UPLOAD_DIR, problem_id, i),
            )
            .await?;

        grader
            .execute(
                problem_id,
                ExecuteConfig::new("solutions", "mcs", Language::Cpp)
                    .stdin(&format!("tests/input/{:02}.in", i))
                    .stdout(&format!("tests/answer/{:02}.a", i)),
            )
            .await?;

        grader
            .copy_from_box(
                problem_id,
                &format!("tests/answer/{:02}.a", i),
                &format!("{}/{}/tests/answer/{:02}.a", UPLOAD_DIR, problem_id, i),
            )
            .await?;
    }

    Ok(Json(serde_json::json!({
        "message": format!("Generated {} test cases using {} for problem ID {}", count, "gen", problem_id)
    })))
}

pub async fn checker(
    Path((problem_id, category, filename)): Path<(u32, String, String)>,
    params: Query<OptionLanguage>,
) -> Result<impl IntoResponse, JudgeError> {
    let language = params.language.clone().unwrap();

    let grader = DockerIsolateGrader::new("coduck-grader");
    let submission_result = grader
        .compile(
            problem_id,
            CompileConfig::new(&category, &filename, language),
        )
        .await?
        .output
        .clone();
    let executable = submission_result.split('/').last().unwrap();
    println!("submission_result: {}", submission_result);
    println!("executable: {}", executable);

    grader
        .compile(
            problem_id,
            CompileConfig::new("files", "wcmp.cpp", Language::Cpp),
        )
        .await?;
    println!("check compiled");

    let mut json_result = Json(serde_json::json!({}));
    for i in 0..10 {
        grader
            .execute(
                problem_id,
                ExecuteConfig::new(&category, &executable, Language::Cpp)
                    .stdin(&format!("tests/input/{:02}.in", i))
                    .stdout(&format!("tests/output/{:02}.out", i)),
            )
            .await?;
        println!("executed test {}", i);

        let result = grader
            .execute(
                problem_id,
                ExecuteConfig::new("files", "wcmp", Language::Cpp)
                    .arg(&format!("tests/input/{:02}.in", i))
                    .arg(&format!("tests/output/{:02}.out", i))
                    .arg(&format!("tests/answer/{:02}.a", i)),
            )
            .await?;

        json_result.0[format!("test_{}", i)] = serde_json::json!({
            "verdict": result.stderr,
        });
    }

    Ok(json_result)
}

pub async fn cleanup_isolate(Path(problem_id): Path<u32>) -> Result<impl IntoResponse, JudgeError> {
    let grader = DockerIsolateGrader::new("coduck-grader");
    grader.cleanup(problem_id).await?;

    Ok(Json(serde_json::json!({
        "message": format!("Sandbox cleaned up for problem ID {}", problem_id)
    })))
}



params: Query<OptionLanguage>,
) -> Result<impl IntoResponse, JudgeError> {
let language = params.language.clone().unwrap();

Copilot AI Feb 2, 2026

The .unwrap() call will panic if the language parameter is not provided. Consider returning a proper error response instead. For example, return a JudgeError variant indicating that the language parameter is required.


let mut json_result = Json(serde_json::json!({}));
for i in 0..10 {
grader

Copilot AI Feb 2, 2026

The number of test cases is hardcoded to 10. This should match the test count used in the generator function or be configurable. Consider accepting this as a parameter or reading it from a configuration.
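
One way to keep the generator and the checker in agreement is to resolve the count from the same optional query parameter and a shared default, and to build the per-test paths in one place. This is a minimal sketch under that assumption; `DEFAULT_TEST_COUNT`, `TestCountParam`, `resolve_count`, and `test_paths` are hypothetical names, while the path patterns follow the `{:02}` naming used in the handlers above.

```rust
// Shared default so generator and checker cannot drift apart.
// Hypothetical constant; the PR currently uses the literal 10 in both places.
const DEFAULT_TEST_COUNT: u32 = 10;

// Stand-in for a query-parameter struct such as OptionGenerateCount.
struct TestCountParam {
    count: Option<u32>,
}

// Falls back to the shared default when the parameter is absent.
fn resolve_count(param: &TestCountParam) -> u32 {
    param.count.unwrap_or(DEFAULT_TEST_COUNT)
}

// Returns (input, output, answer) paths for test index `i`.
fn test_paths(i: u32) -> (String, String, String) {
    (
        format!("tests/input/{:02}.in", i),
        format!("tests/output/{:02}.out", i),
        format!("tests/answer/{:02}.a", i),
    )
}

fn main() {
    let count = resolve_count(&TestCountParam { count: None });
    for i in 0..count {
        let (input, output, answer) = test_paths(i);
        println!("{input} {output} {answer}");
    }
}
```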

Comment on lines +161 to +164
.compile(
problem_id,
CompileConfig::new("files", "wcmp.cpp", Language::Cpp),
)

Copilot AI Feb 2, 2026

The checker filename "wcmp.cpp" is hardcoded. Consider making this configurable or passing it as a parameter to support different checker implementations.

"-Xss512m",
"Main",
],
_ => vec![],

Copilot AI Feb 2, 2026

The match arms for unsupported languages return an empty vector. This could lead to silent failures where no execution occurs. Consider returning an error for unsupported languages instead of silently doing nothing.
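
A fallible lookup is one way to make the unsupported case visible. The sketch below is an assumption about how that could look, not the code in grader/config.rs: the `Language` variants, `ConfigError`, and `execute_command` are hypothetical, and only the Java arguments come from the snippet quoted above.

```rust
// Hypothetical language set; `Text` stands in for any variant with no runner.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Language {
    Cpp,
    Java,
    Text,
}

// Hypothetical error type for configuration lookups.
#[derive(Debug, PartialEq)]
enum ConfigError {
    UnsupportedLanguage(Language),
}

// Returns the execute command, or an error instead of an empty vector.
fn execute_command(language: Language, stem: &str) -> Result<Vec<String>, ConfigError> {
    match language {
        Language::Cpp => Ok(vec![format!("./{stem}")]),
        Language::Java => Ok(vec![
            "java".to_string(),
            "-Xss512m".to_string(),
            "Main".to_string(),
        ]),
        other => Err(ConfigError::UnsupportedLanguage(other)),
    }
}

fn main() {
    println!("{:?}", execute_command(Language::Cpp, "hello"));
    println!("{:?}", execute_command(Language::Text, "notes"));
}
```

The caller can then map the error to an HTTP response with `?`, rather than running an empty command inside the sandbox.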

Path((problem_id, category, filename)): Path<(u32, String, String)>,
params: Query<OptionLanguage>,
) -> Result<impl IntoResponse, JudgeError> {
let language = params.language.clone().unwrap();

Copilot AI Feb 2, 2026

The .unwrap() call will panic if the language parameter is not provided. Consider returning a proper error response instead. For example, return a JudgeError variant indicating that the language parameter is required.

Suggested change
- let language = params.language.clone().unwrap();
+ let language = params
+     .language
+     .clone()
+     .ok_or(JudgeError::InvalidRequest(
+         "language parameter is required".to_string(),
+     ))?;

)
.await?;
println!("executed test {}", i);

Copilot AI Feb 2, 2026

Debug print statements should be removed or replaced with proper logging using a logging framework (e.g., tracing or log). These println statements are not appropriate for production code.

let grader = DockerIsolateGrader::new("coduck-grader");
grader.init(problem_id).await?;
grader
.copy_to_box(problem_id, &format!("{}/{}/.", UPLOAD_DIR, problem_id), ".")

Copilot AI Feb 2, 2026

The problem_id parameter is used directly in file paths without validation. Consider validating that problem_id doesn't contain path traversal sequences (e.g., "..", "/") to prevent potential security issues where malicious input could access files outside the intended directory.
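
In the handlers shown above, `problem_id` is extracted as `Path<u32>`, so it cannot carry `..` or `/`; the `category` and `filename` segments, however, are plain `String`s that end up in box paths. A check along these lines could be applied to those segments. This is a minimal sketch; `is_safe_component` is a hypothetical helper, not something in the PR.

```rust
// Accepts a single path segment only: no separators, no "." or "..",
// no NUL bytes, and not empty. Hypothetical helper for illustration.
fn is_safe_component(name: &str) -> bool {
    !name.is_empty()
        && name != "."
        && name != ".."
        && !name.contains('/')
        && !name.contains('\\')
        && !name.contains('\0')
}

fn main() {
    for candidate in ["hello.cpp", "solutions", "..", "../etc/passwd", ""] {
        println!("{:?} -> {}", candidate, is_safe_component(candidate));
    }
}
```

A stricter allowlist (for example, only ASCII letters, digits, `_`, `-`, and `.`) would be a reasonable alternative if filenames are known to follow a fixed pattern.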

Comment on lines +12 to +13
RUN git clone https://github.com/ioi/isolate.git
RUN cd isolate && make install

Copilot AI Feb 2, 2026

The RUN git clone https://github.com/ioi/isolate.git step fetches and installs a third-party tool directly from a mutable Git branch without pinning to a specific commit or verifying integrity, which exposes the build to supply-chain compromise if the repository or network is tampered with. An attacker who gains control over that Git ref could inject arbitrary code into the image and thus the grading environment. Pin this dependency to an immutable commit or signed release and add integrity verification (or vendor it) instead of cloning the moving default branch at build time.

Comment on lines +17 to +18
RUN curl -L https://raw.githubusercontent.com/MikeMirzayanov/testlib/master/testlib.h \
-o /usr/include/testlib.h

Copilot AI Feb 2, 2026

The RUN curl -L https://raw.githubusercontent.com/MikeMirzayanov/testlib/master/testlib.h step downloads source code from a mutable remote branch without any checksum or signature verification, creating a supply-chain risk if that URL or branch is compromised. Because this header will be compiled into all judged programs, a malicious change upstream could result in arbitrary code execution inside your grading environment. Pin this asset to an immutable version (e.g., a specific commit) and verify its integrity via a known hash or signature instead of relying on the moving master branch.

@reddevilmidzy reddevilmidzy left a comment

Great work. Please also write unit tests and e2e tests 🙏
