Kernelguard pre-queue check addition #473
Open
SinatrasC wants to merge 4 commits into gpu-mode:main from
- Introduced KernelGuard for validating submissions before processing.
- Implemented error handling for rejected submissions in the backend.
- Updated database methods to mark submissions as hacked when flagged.
- Enhanced tests to cover new KernelGuard functionality and error scenarios.
- Added a new kernelguard.py module for managing submission analysis and pre-checks.

- Updated Python version requirement from 3.10 to 3.11 in pyproject.toml and uv.lock.
- Added `kernelguard` dependency to manage submission pre-checks.
- Enhanced error handling in submission processes to include KernelGuard rejection scenarios.
- Implemented pre-check logic in the submission workflow to prevent blocked submissions from queuing.
- Updated tests to reflect changes in submission handling and pre-check logic.
Summary
Integrates KernelGuard as a submission pre-check gate that detects and rejects exploit submissions before GPU execution. The check runs at the API boundary: before the job is enqueued on the async path, and before GPU dispatch on the sync path, so blocked payloads never consume worker resources. Flagged submissions are recorded with a "hacked" status in the database. The feature is off by default (`KERNELGUARD_ENABLED`) and supports both fail-open and fail-closed modes for handling analyzer outages.
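The gating behaviour described above can be illustrated with a minimal sketch. This is not the code in `src/libkernelbot/kernelguard.py`. `KernelGuardRejected` is a name from this PR, while `GuardSettings`, `AnalyzerUnavailable`, `precheck`, and the `analyze` callable are hypothetical names introduced only for illustration. The `fail_open=False` default is also an assumption.

```python
from dataclasses import dataclass
from typing import Callable


class KernelGuardRejected(Exception):
    """Raised when the analyzer flags a submission (exception name from the PR)."""


class AnalyzerUnavailable(Exception):
    """Hypothetical: the analyzer timed out or could not be run."""


@dataclass
class GuardSettings:
    # Hypothetical container; fields mirror KERNELGUARD_ENABLED / KERNELGUARD_FAIL_OPEN.
    enabled: bool = False    # the PR states the feature is off by default
    fail_open: bool = False  # assumed default, not confirmed by the PR


def precheck(code: str, analyze: Callable[[str], bool], settings: GuardSettings) -> bool:
    """Return True if the submission may proceed to the queue or the GPU."""
    if not settings.enabled:
        return True  # gate disabled: nothing is analyzed
    try:
        flagged = analyze(code)  # True means "looks like an exploit"
    except AnalyzerUnavailable:
        if settings.fail_open:
            return True  # fail-open: an outage lets the submission through
        raise  # fail-closed: an outage blocks the submission
    if flagged:
        raise KernelGuardRejected("submission flagged by pre-check")
    return True
```

In this sketch the caller would run `precheck` before enqueueing or dispatching, and translate the two exception types into distinct responses, in the way the PR describes for `KernelGuardRejected` (400) and `KernelBotError` (503).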
Changes
- `src/libkernelbot/kernelguard.py` - CLI wrapper for `kernelguard` tool with env-var configuration
- `tests/test_kernelguard.py` - unit tests for mode gating, rejection, fail-open/closed
- `backend.py` - precheck in `submit_full()` with `skip_precheck` flag, `submission_started` guard
- `main.py` - precheck before `enqueue_background_job()` in async endpoint
- `api_utils.py` - catch `KernelGuardRejected` (400) and `KernelBotError` (503) separately
- `background_submission_manager.py` - handle `KernelGuardRejected` with "hacked" job status
- `leaderboard_db.py` - `mark_submission_hacked()` method
- `pyproject.toml` - add `kernelguard>=0.1.1`, bump `requires-python` to `>=3.11`

Environment Variables

- `KERNELGUARD_ENABLED`
- `KERNELGUARD_TIMEOUT_SEC`
- `KERNELGUARD_PROFILE`
- `KERNELGUARD_CONFIG`
- `KERNELGUARD_FAIL_OPEN`
- `KERNELGUARD_COMMAND`

Test plan

- `test_kernelguard.py` - 6 tests: mode gating, rejection, fail-open, fail-closed, CLI delegation
- `test_backend.py` - 2 tests: hacked submission recording with/without `pre_sub_id`
- `test_background_submission_manager.py` - 1 test: hacked status propagation
- `KERNELGUARD_ENABLED=1` on staging, submit known exploit, verify 400 rejection
- `KERNELGUARD_FAIL_OPEN=1`, verify submissions pass through on outage
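For reviewers trying the staging steps, the environment variables listed in this PR could be read roughly as follows. This is a sketch under assumptions, not the module's actual parsing: `KernelGuardConfig`, `load_config`, and `_flag` are hypothetical names, and the set of truthy strings, the 30-second timeout default, and the `kernelguard` command default are all assumed. Only the variable names and the `=1` form come from the PR.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumed set of truthy values; the PR itself only shows "=1".
_TRUTHY = {"1", "true", "yes", "on"}


def _flag(env: Mapping[str, str], name: str, default: bool = False) -> bool:
    """Interpret an environment variable as a boolean flag."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUTHY


@dataclass(frozen=True)
class KernelGuardConfig:
    # Hypothetical container for the six variables named in the PR.
    enabled: bool
    fail_open: bool
    timeout_sec: float
    profile: Optional[str]
    config_path: Optional[str]
    command: str


def load_config(env: Mapping[str, str] = os.environ) -> KernelGuardConfig:
    """Build a config from an environment-like mapping."""
    return KernelGuardConfig(
        enabled=_flag(env, "KERNELGUARD_ENABLED"),      # off unless set
        fail_open=_flag(env, "KERNELGUARD_FAIL_OPEN"),  # default assumed
        timeout_sec=float(env.get("KERNELGUARD_TIMEOUT_SEC", "30")),  # default assumed
        profile=env.get("KERNELGUARD_PROFILE"),
        config_path=env.get("KERNELGUARD_CONFIG"),
        command=env.get("KERNELGUARD_COMMAND", "kernelguard"),  # default assumed
    )
```

Passing a plain dict instead of `os.environ` keeps the parsing easy to exercise in unit tests without mutating the process environment.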