arg, stdin, stdout
Pull request overview
This PR implements a Docker + isolate based grading system for competitive programming problems. The implementation uses a 3-tier architecture (Docker → isolate → grader) to provide secure, isolated code execution with compile, execute, test generation, and checking capabilities.
Changes:
- Added grading infrastructure with Docker and isolate sandbox integration
- Implemented 6 new HTTP endpoints for problem compilation, execution, test generation, checking, and sandbox management
- Created modular architecture with separate docker, isolate, and grader layers with language-specific compilation/execution configurations
Reviewed changes
Copilot reviewed 26 out of 26 changed files in this pull request and generated 42 comments.
| File | Description |
|---|---|
| src/lib.rs | Added judge_manager routes and module declarations |
| src/judge_manager/mod.rs | Module definition for judge manager handlers |
| src/judge_manager/handlers.rs | HTTP handlers for initialize, compile, execute, generator, checker, and cleanup operations |
| src/isolate/path.rs | Helper for constructing isolate box paths |
| src/isolate/mod.rs | Isolate wrapper module with init, compile, execute, copy, and cleanup operations |
| src/isolate/init.rs | Isolate box initialization implementation |
| src/isolate/execute.rs | Code execution within isolate sandbox |
| src/isolate/copy.rs | File copying to/from isolate box |
| src/isolate/compile.rs | Code compilation within isolate sandbox |
| src/isolate/cleanup.rs | Isolate box cleanup implementation |
| src/isolate/box_id.rs | BoxId type wrapper for isolate box identification |
| src/grader/result.rs | Result types for compile and execute operations |
| src/grader/mod.rs | Grader module definition |
| src/grader/grader.rs | Grader trait defining the grading interface |
| src/grader/docker_isolate.rs | DockerIsolateGrader implementation of Grader trait |
| src/grader/config.rs | Language-specific compilation and execution configurations |
| src/errors/mod.rs | Added docker, isolate, and judge error modules |
| src/errors/judge.rs | JudgeError enum and IntoResponse implementation |
| src/errors/isolate.rs | IsolateError enum with Display implementation |
| src/errors/docker.rs | DockerError enum with Display implementation |
| src/docker/mod.rs | Docker module definition |
| src/docker/exec.rs | Docker exec command wrapper with output handling |
| src/docker/cp.rs | Docker cp command wrapper for file copying |
| docker-compose.yml | Docker Compose configuration for grader container |
| Dockerfile | Docker image with gcc, Python, Java, isolate, and testlib |
| Cargo.toml | Added async-trait dependency |
Comments suppressed due to low confidence (1)
src/judge_manager/handlers.rs:206
- The new judge_manager module lacks test coverage. Given that the existing codebase has comprehensive tests for the file_manager module (see tests/file_manager/handlers.rs), similar test coverage should be added for judge_manager handlers to maintain consistency. Consider adding integration tests for initialize_isolate, compile_file, execute_file, generator, checker, and cleanup_isolate.
use crate::errors::JudgeError;
use crate::file_manager::Language;
use crate::grader::config::{CompileConfig, ExecuteConfig};
use crate::grader::docker_isolate::DockerIsolateGrader;
use crate::grader::grader::Grader;
use axum::extract::Query;
use axum::{extract::Path, response::IntoResponse, Json};
use serde::Deserialize;

const UPLOAD_DIR: &str = "uploads";

pub async fn initialize_isolate(
    Path(problem_id): Path<u32>,
) -> Result<impl IntoResponse, JudgeError> {
    let grader = DockerIsolateGrader::new("coduck-grader");
    grader.init(problem_id).await?;
    grader
        .copy_to_box(problem_id, &format!("{}/{}/.", UPLOAD_DIR, problem_id), ".")
        .await?;
    Ok(Json(serde_json::json!({
        "message": format!("Sandbox initialized for problem ID {}", problem_id)
    })))
}

#[derive(Deserialize)]
pub struct OptionLanguage {
    language: Option<Language>,
}

pub async fn compile_file(
    Path((problem_id, category, filename)): Path<(u32, String, String)>,
    params: Query<OptionLanguage>,
) -> Result<impl IntoResponse, JudgeError> {
    let language = params.language.clone().unwrap();
    let grader = DockerIsolateGrader::new("coduck-grader");
    let result = grader
        .compile(
            problem_id,
            CompileConfig::new(&category, &filename, language),
        )
        .await?;
    let output_path = result.output.clone();
    grader
        .copy_from_box(
            problem_id,
            &output_path,
            &format!("{}/{}/{}", UPLOAD_DIR, problem_id, output_path),
        )
        .await?;
    Ok(Json(serde_json::json!({
        "message": format!("File {} compiled successfully in category {} for problem ID {}", filename, category, problem_id),
    })))
}

pub async fn execute_file(
    Path((problem_id, category, filename)): Path<(u32, String, String)>,
    params: Query<OptionLanguage>,
) -> Result<impl IntoResponse, JudgeError> {
    let language = params.language.clone().unwrap();
    let grader = DockerIsolateGrader::new("coduck-grader");
    let result = grader
        .execute(
            problem_id,
            ExecuteConfig::new(&category, &filename, language),
        )
        .await?;
    Ok(Json(serde_json::json!({
        "message": format!("File {} executed successfully in category {} for problem ID {}", filename, category, problem_id),
        "stdout": result.stdout,
        "stderr": result.stderr,
        "exit_code": result.exit_code,
    })))
}

#[derive(Deserialize)]
pub struct OptionGenerateCount {
    count: Option<u32>,
}

pub async fn generator(
    Path(problem_id): Path<u32>,
    params: Query<OptionGenerateCount>,
) -> Result<impl IntoResponse, JudgeError> {
    let count = params.count.unwrap_or(10);
    let grader = DockerIsolateGrader::new("coduck-grader");
    grader
        .compile(
            problem_id,
            CompileConfig::new("files", "gen.cpp", Language::Cpp),
        )
        .await?;
    for i in 0..count {
        grader
            .execute(
                problem_id,
                ExecuteConfig::new("files", "gen", Language::Cpp)
                    .stdout(&format!("tests/input/{:02}.in", i))
                    .arg(&i.to_string()),
            )
            .await?;
        grader
            .copy_from_box(
                problem_id,
                &format!("tests/input/{:02}.in", i),
                &format!("{}/{}/tests/input/{:02}.in", UPLOAD_DIR, problem_id, i),
            )
            .await?;
        grader
            .execute(
                problem_id,
                ExecuteConfig::new("solutions", "mcs", Language::Cpp)
                    .stdin(&format!("tests/input/{:02}.in", i))
                    .stdout(&format!("tests/answer/{:02}.a", i)),
            )
            .await?;
        grader
            .copy_from_box(
                problem_id,
                &format!("tests/answer/{:02}.a", i),
                &format!("{}/{}/tests/answer/{:02}.a", UPLOAD_DIR, problem_id, i),
            )
            .await?;
    }
    Ok(Json(serde_json::json!({
        "message": format!("Generated {} test cases using {} for problem ID {}", count, "gen", problem_id)
    })))
}

pub async fn checker(
    Path((problem_id, category, filename)): Path<(u32, String, String)>,
    params: Query<OptionLanguage>,
) -> Result<impl IntoResponse, JudgeError> {
    let language = params.language.clone().unwrap();
    let grader = DockerIsolateGrader::new("coduck-grader");
    let submission_result = grader
        .compile(
            problem_id,
            CompileConfig::new(&category, &filename, language),
        )
        .await?
        .output
        .clone();
    let executable = submission_result.split('/').last().unwrap();
    println!("submission_result: {}", submission_result);
    println!("executable: {}", executable);
    grader
        .compile(
            problem_id,
            CompileConfig::new("files", "wcmp.cpp", Language::Cpp),
        )
        .await?;
    println!("check compiled");
    let mut json_result = Json(serde_json::json!({}));
    for i in 0..10 {
        grader
            .execute(
                problem_id,
                ExecuteConfig::new(&category, &executable, Language::Cpp)
                    .stdin(&format!("tests/input/{:02}.in", i))
                    .stdout(&format!("tests/output/{:02}.out", i)),
            )
            .await?;
        println!("executed test {}", i);
        let result = grader
            .execute(
                problem_id,
                ExecuteConfig::new("files", "wcmp", Language::Cpp)
                    .arg(&format!("tests/input/{:02}.in", i))
                    .arg(&format!("tests/output/{:02}.out", i))
                    .arg(&format!("tests/answer/{:02}.a", i)),
            )
            .await?;
        json_result.0[format!("test_{}", i)] = serde_json::json!({
            "verdict": result.stderr,
        });
    }
    Ok(json_result)
}

pub async fn cleanup_isolate(Path(problem_id): Path<u32>) -> Result<impl IntoResponse, JudgeError> {
    let grader = DockerIsolateGrader::new("coduck-grader");
    grader.cleanup(problem_id).await?;
    Ok(Json(serde_json::json!({
        "message": format!("Sandbox cleaned up for problem ID {}", problem_id)
    })))
}
    params: Query<OptionLanguage>,
) -> Result<impl IntoResponse, JudgeError> {
    let language = params.language.clone().unwrap();
The .unwrap() call will panic if the language parameter is not provided. Consider returning a proper error response instead. For example, return a JudgeError variant indicating that the language parameter is required.
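The idea in this comment can be sketched in isolation. The `Language` and `JudgeError` types below are simplified stand-ins, not the PR's real definitions, and the `InvalidRequest` variant and the `require_language` helper are assumptions made for illustration.

```rust
// Hypothetical stand-ins for the PR's types; the real `Language` lives in
// `crate::file_manager` and the real `JudgeError` in `crate::errors`.
#[derive(Debug, Clone, PartialEq)]
pub enum Language {
    Cpp,
    Python,
}

#[derive(Debug, PartialEq)]
pub enum JudgeError {
    // Assumed variant; the PR's enum may name or shape this differently.
    InvalidRequest(String),
}

/// Turns the optional query value into a `Result`, so a handler can apply `?`
/// and answer with an error response instead of panicking.
pub fn require_language(language: Option<Language>) -> Result<Language, JudgeError> {
    language.ok_or_else(|| {
        JudgeError::InvalidRequest("language parameter is required".to_string())
    })
}
```

In a handler this would presumably read `let language = require_language(params.language.clone())?;`, which keeps the three call sites identical.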
let mut json_result = Json(serde_json::json!({}));
for i in 0..10 {
    grader
The number of test cases is hardcoded to 10. This should match the test count used in the generator function or be configurable. Consider accepting this as a parameter or reading it from a configuration.
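One possible shape for this, as a sketch only: both handlers resolve the count through the same helper and build file names through the same function. `DEFAULT_TEST_COUNT`, `resolve_test_count`, `TestPaths`, `test_paths`, and `all_test_paths` are hypothetical names; the path patterns are the ones used in the handlers.

```rust
/// Default used when the caller gives no `count`; mirrors the generator's `unwrap_or(10)`.
pub const DEFAULT_TEST_COUNT: u32 = 10;

/// Hypothetical helper: resolves the number of test cases from an optional query value.
pub fn resolve_test_count(count: Option<u32>) -> u32 {
    count.unwrap_or(DEFAULT_TEST_COUNT)
}

/// Paths for one test case, using the same `{:02}` naming as the handlers.
#[derive(Debug, PartialEq)]
pub struct TestPaths {
    pub input: String,
    pub output: String,
    pub answer: String,
}

/// Builds the input, output, and answer paths for test case `index`.
pub fn test_paths(index: u32) -> TestPaths {
    TestPaths {
        input: format!("tests/input/{:02}.in", index),
        output: format!("tests/output/{:02}.out", index),
        answer: format!("tests/answer/{:02}.a", index),
    }
}

/// Every test case the checker should visit for a given (optional) count.
pub fn all_test_paths(count: Option<u32>) -> Vec<TestPaths> {
    (0..resolve_test_count(count)).map(test_paths).collect()
}
```

With this in place the checker loop could iterate over `all_test_paths(params.count)` rather than `0..10`, assuming the checker endpoint also accepted a `count` query parameter.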
    .compile(
        problem_id,
        CompileConfig::new("files", "wcmp.cpp", Language::Cpp),
    )
The checker filename "wcmp.cpp" is hardcoded. Consider making this configurable or passing it as a parameter to support different checker implementations.
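A minimal sketch of a configurable checker name, assuming a hypothetical `checker` query parameter that falls back to the current value. `CheckerOptions` and its methods are illustrative and not part of the PR.

```rust
/// Checker used when the request does not name one; matches the value hardcoded in the PR.
pub const DEFAULT_CHECKER_SOURCE: &str = "wcmp.cpp";

/// Hypothetical query parameters for the checker endpoint.
#[derive(Debug, Default)]
pub struct CheckerOptions {
    pub checker: Option<String>,
}

impl CheckerOptions {
    /// Source file to compile, e.g. "wcmp.cpp".
    pub fn source(&self) -> &str {
        self.checker.as_deref().unwrap_or(DEFAULT_CHECKER_SOURCE)
    }

    /// Executable name: the source name with its last extension removed.
    pub fn executable(&self) -> &str {
        let source = self.source();
        source.rsplit_once('.').map_or(source, |(stem, _ext)| stem)
    }
}
```

The `source()` value would go to `CompileConfig::new` and the `executable()` value to `ExecuteConfig::new`, following the `gen.cpp` / `gen` and `wcmp.cpp` / `wcmp` pairing already used in the handlers.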
        "-Xss512m",
        "Main",
    ],
    _ => vec![],
The match arms for unsupported languages return an empty vector. This could lead to silent failures where no execution occurs. Consider returning an error for unsupported languages instead of silently doing nothing.
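As a rough illustration of the alternative, the lookup can return a `Result` so that an unsupported language surfaces as an error. Everything here is a stand-in: the `Language` variants, the `ConfigError` type, and the C++ and Python command lines are placeholders, and only the `-Xss512m` and `Main` arguments come from the reviewed snippet.

```rust
// Stand-in enum; the PR's real `Language` lives in `crate::file_manager`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Language {
    Cpp,
    Python,
    Java,
    Other,
}

// Hypothetical error type for configuration lookups.
#[derive(Debug, PartialEq)]
pub enum ConfigError {
    UnsupportedLanguage(Language),
}

/// Builds the command line used to run a program.
/// The C++ and Python commands are placeholders chosen for this sketch.
pub fn run_command(language: Language, executable: &str) -> Result<Vec<String>, ConfigError> {
    match language {
        Language::Cpp => Ok(vec![format!("./{executable}")]),
        Language::Python => Ok(vec![
            "/usr/bin/python3".to_string(),
            format!("{executable}.py"),
        ]),
        Language::Java => Ok(vec![
            "/usr/bin/java".to_string(),
            "-Xss512m".to_string(),
            "Main".to_string(),
        ]),
        // Fail loudly instead of returning an empty command.
        other => Err(ConfigError::UnsupportedLanguage(other)),
    }
}
```

The caller then decides how to map `ConfigError` into the layer's own error type, rather than discovering later that nothing was executed.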
    Path((problem_id, category, filename)): Path<(u32, String, String)>,
    params: Query<OptionLanguage>,
) -> Result<impl IntoResponse, JudgeError> {
    let language = params.language.clone().unwrap();
The .unwrap() call will panic if the language parameter is not provided. Consider returning a proper error response instead. For example, return a JudgeError variant indicating that the language parameter is required.
Suggested change
-    let language = params.language.clone().unwrap();
+    let language = params
+        .language
+        .clone()
+        .ok_or(JudgeError::InvalidRequest(
+            "language parameter is required".to_string(),
+        ))?;
        )
        .await?;
    println!("executed test {}", i);
Debug print statements should be removed or replaced with proper logging using a logging framework (e.g., tracing or log). These println statements are not appropriate for production code.
    let grader = DockerIsolateGrader::new("coduck-grader");
    grader.init(problem_id).await?;
    grader
        .copy_to_box(problem_id, &format!("{}/{}/.", UPLOAD_DIR, problem_id), ".")
The problem_id parameter is used directly in file paths without validation. Consider validating that problem_id doesn't contain path traversal sequences (e.g., "..", "/") to prevent potential security issues where malicious input could access files outside the intended directory.
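In the handlers shown, `problem_id` is extracted as a `u32`, so it cannot itself carry `..` or `/`. The `category` and `filename` values are `String`s, though, and a check along these lines may be more relevant to them. The sketch below uses only `std::path`; `PathError` and `validate_component` are hypothetical names, and how the PR's config types turn these strings into paths is not shown here.

```rust
use std::path::{Component, Path};

// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum PathError {
    Empty,
    NotASingleComponent(String),
}

/// Accepts a value only if it is exactly one normal path component:
/// no separators, no "." or "..", no root prefix, no trailing slash.
pub fn validate_component(value: &str) -> Result<&str, PathError> {
    if value.is_empty() {
        return Err(PathError::Empty);
    }
    let mut components = Path::new(value).components();
    match (components.next(), components.next()) {
        // `components()` normalizes inputs such as "foo/" or "foo/.", so the
        // single component must also equal the original string.
        (Some(Component::Normal(part)), None) if part.to_str() == Some(value) => Ok(value),
        _ => Err(PathError::NotASingleComponent(value.to_string())),
    }
}
```

A handler could call `validate_component(&category)?` and `validate_component(&filename)?` before building any path, given a conversion from `PathError` into its error type.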
RUN git clone https://github.com/ioi/isolate.git
RUN cd isolate && make install
The RUN git clone https://github.com/ioi/isolate.git step fetches and installs a third-party tool directly from a mutable Git branch without pinning to a specific commit or verifying integrity, which exposes the build to supply-chain compromise if the repository or network is tampered with. An attacker who gains control over that Git ref could inject arbitrary code into the image and thus the grading environment. Pin this dependency to an immutable commit or signed release and add integrity verification (or vendor it) instead of cloning the moving default branch at build time.
RUN curl -L https://raw.githubusercontent.com/MikeMirzayanov/testlib/master/testlib.h \
    -o /usr/include/testlib.h
The RUN curl -L https://raw.githubusercontent.com/MikeMirzayanov/testlib/master/testlib.h step downloads source code from a mutable remote branch without any checksum or signature verification, creating a supply-chain risk if that URL or branch is compromised. Because this header will be compiled into all judged programs, a malicious change upstream could result in arbitrary code execution inside your grading environment. Pin this asset to an immutable version (e.g., a specific commit) and verify its integrity via a known hash or signature instead of relying on the moving master branch.
reddevilmidzy left a comment
Nice work. Please also write unit tests and e2e tests 🙏
Closes #56
Summary
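This PR introduces a Docker + isolate based grading system to the backend.
The grading execution environment is split into three tiers, Docker → isolate → grader,
with the goal of achieving security, reproducibility, and extensibility together.
uploads.zip
Extract the file above into the project root directory to test.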
Module Diagram
flowchart TD
    A[HTTP Handler] --> B[Grader]
    B --> C[Isolate]
    C --> D[Docker Exec / Cp]

    %% isolate internals
    subgraph Isolate Module
        C1[init]
        C2[copy]
        C3[compile]
        C4[execute]
        C5[cleanup]
    end
    C --> C1
    C --> C2
    C --> C3
    C --> C4
    C --> C5

    %% compile / execute configs
    B --> E[LanguageRegistry]
    E --> F[CompileConfig]
    E --> G[ExecuteConfig]

    %% errors
    D -->|DockerError| C
    C -->|IsolateError| B
    B -->|JudgeError| A
Architecture
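Docker
- Lowest layer: an adapter that wraps docker exec and docker cp
- Pins the grading environment and provides the first level of isolation
isolate
- Middle layer: runs isolate commands through the docker interface
- Sandboxing for running untrusted code
- Restricts process and file access
- (Later) time and memory limits
grader
- Top layer: uses only the isolate interface
- Orchestrates compile → execute → result collection
LanguageRegistry
- Generates per-language compile/execute command profiles
- I wanted to do this cleanly with something like YAML, but for now it is hardcoded inside grader/config.rs
- Manages CompileConfig and ExecuteConfig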
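The layering and error flow described above (DockerError → IsolateError → JudgeError) can be sketched with `From` conversions so that `?` carries errors upward one layer at a time. This is a synchronous, simplified illustration, not the PR's code: the enum variants, function names, the stubbed `docker_exec`, and the use of the problem ID as the isolate box ID are all assumptions.

```rust
// Simplified stand-ins for the PR's error enums; variants are assumptions.
#[derive(Debug, PartialEq)]
pub enum DockerError {
    ExecFailed(String),
}

#[derive(Debug, PartialEq)]
pub enum IsolateError {
    Docker(DockerError),
}

#[derive(Debug, PartialEq)]
pub enum JudgeError {
    Isolate(IsolateError),
}

impl From<DockerError> for IsolateError {
    fn from(err: DockerError) -> Self {
        IsolateError::Docker(err)
    }
}

impl From<IsolateError> for JudgeError {
    fn from(err: IsolateError) -> Self {
        JudgeError::Isolate(err)
    }
}

/// Bottom layer. A stub that returns the command line it would pass to
/// `docker exec`; it fails only when the container name is empty.
fn docker_exec(container: &str, args: &[&str]) -> Result<String, DockerError> {
    if container.is_empty() {
        return Err(DockerError::ExecFailed("container name is empty".to_string()));
    }
    Ok(args.join(" "))
}

/// Middle layer. Builds an isolate invocation and talks only to the docker layer.
fn isolate_run(container: &str, box_id: u32, command: &[&str]) -> Result<String, IsolateError> {
    let box_arg = format!("--box-id={box_id}");
    let mut args = vec!["isolate", box_arg.as_str(), "--run", "--"];
    args.extend_from_slice(command);
    // `?` converts DockerError into IsolateError through `From`.
    Ok(docker_exec(container, &args)?)
}

/// Top layer. Talks only to the isolate layer.
pub fn grade(container: &str, problem_id: u32) -> Result<String, JudgeError> {
    // `?` converts IsolateError into JudgeError through `From`.
    Ok(isolate_run(container, problem_id, &["./solution"])?)
}
```

Keeping each conversion one step wide means the grader never has to name `DockerError` directly, which matches the rule that the top layer uses only the isolate interface.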
Handler Explanation
1. initialize_isolate
   - Creates /box/
   - Files under uploads/{problem_id}/ are copied to the box root
2. compile_file
   - Creates uploads/{problem_id}/{executable}
3. execute_file
   {
     "exit_code": 0,
     "message": "File hello executed successfully in category solutions for problem ID 0",
     "stderr": "OK (0.002 sec real, 0.006 sec wall)\n",
     "stdout": "Hello, World!\n"
   }
4. generator
   - Compiles gen.cpp
   - tests/input/*.in
   - tests/answer/*.a
   - uploads/{problem_id}/tests/input/*.in
   - uploads/{problem_id}/tests/answer/*.a
   { "message": "Generated 10 test cases using gen for problem ID 0" }
5. checker
   - Compiles wcmp.cpp
   - Creates tests/output/*.out
   {
     "test_0": { "verdict": "ok \"1684907\"\nOK (0.001 sec real, 0.001 sec wall)\n" },
     ...
     "test_9": { "verdict": "ok \"992346\"\nOK (0.001 sec real, 0.001 sec wall)\n" }
   }
6. cleanup_isolate
Result