ci: remove dataset downloads from Python test scripts#1458
ci: remove dataset downloads from Python test scripts#1458ramakrishnap-nv wants to merge 6 commits into
Conversation
All datasets needed by Python and server tests are now committed to the repo or have been replaced with already-committed files. - Commit datasets/solomon/In/r107.txt (7.5 KB) for test_solomon() - Update warm-start tests to use committed afiro_original.mps instead of a2864.mps (53 MB) and square41.mps (19 MB); drop the PDLP iteration invariant assertion (covered by C++ pdlp_test.cu:1844) - Remove download_pdlp_test_dataset.sh, download_miplib_test_dataset.sh, and get_test_data.sh calls from ci/test_python.sh, ci/test_wheel_cuopt.sh, and ci/test_wheel_cuopt_server.sh - Remove dead --tsp code from ci/test_wheel_cuopt.sh - Add .gitignore exceptions for datasets/solomon/In/r107.txt Saves 5–15 min of download time and reduces network dependency per CI run. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (1)
✅ Files skipped from review due to trivial changes (1)
📝 WalkthroughWalkthroughCI scripts stop downloading datasets during runs and now export ChangesTest dataset handling
PR test summary
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@python/cuopt/cuopt/tests/linear_programming/test_lp_solver.py`:
- Around line 622-627: The warm-start test in test_lp_solver currently runs the
second Solver.Solve call without checking its result, so it only validates that
the path executes. Update the test to capture the second solve output and assert
a concrete success/correctness signal from it, using the existing solver.Solve
and settings.set_pdlp_warm_start_data flow to verify the warm-started solution
is numerically correct rather than just error-free.
In `@python/cuopt/cuopt/tests/linear_programming/test_python_API.py`:
- Around line 475-480: The two problem.solve(settings) calls in
test_python_API.py are only checking that execution succeeds, so add explicit
assertions on the returned solve results to validate correctness. Use the
existing test flow around get_pdlp_warm_start_data() and
set_pdlp_warm_start_data() to verify the first and second solve outcomes,
including solution status and any numerical values the test already inspects.
Make the assertions specific to the solve return object or problem state so
regressions in solution quality are caught.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 5ded3d8f-6633-4520-aee8-b8265e5ef6dc
⛔ Files ignored due to path filters (1)
datasets/solomon/In/r107.txtis excluded by!datasets/**/*.txt
📒 Files selected for processing (7)
.gitignoreci/test_python.shci/test_wheel_cuopt.shci/test_wheel_cuopt_server.shpython/cuopt/cuopt/tests/linear_programming/test_lp_solver.pypython/cuopt/cuopt/tests/linear_programming/test_python_API.pypython/cuopt_server/cuopt_server/tests/test_pdlp_warmstart.py
💤 Files with no reviewable changes (3)
- ci/test_wheel_cuopt.sh
- ci/test_python.sh
- ci/test_wheel_cuopt_server.sh
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Drop _CRASH_PATTERNS and all crash-detection logic - Simplify _is_test_job to plain startswith (exact-match fallback unneeded) - _analyze_job_log returns a list directly instead of (list, crash) tuple - Remove verbose module/function/class docstrings Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Address CodeRabbit review: both solve calls in test_warm_start were unchecked, so regressions in solution quality would go undetected. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@ci/utils/pr_test_summary.py`:
- Around line 70-78: The _api helper currently makes a blocking
urllib.request.urlopen call to GitHub without any timeout, so update _api in
pr_test_summary.py to pass a finite timeout through the request path. Use the
existing _api function and its urllib.request.urlopen call as the fix point, and
keep the timeout consistent with the other network calls in this file so a
stalled connection cannot hang the CI step indefinitely.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 4017d091-0082-455b-a405-a01a1c2ec84d
📒 Files selected for processing (3)
ci/utils/pr_test_summary.pypython/cuopt/cuopt/tests/linear_programming/test_lp_solver.pypython/cuopt/cuopt/tests/linear_programming/test_python_API.py
🚧 Files skipped from review as they are similar to previous changes (2)
- python/cuopt/cuopt/tests/linear_programming/test_lp_solver.py
- python/cuopt/cuopt/tests/linear_programming/test_python_API.py
| def _api(path, token, method, payload): | ||
| req = urllib.request.Request( | ||
| f"https://api.github.com{path}", | ||
| data=json.dumps(payload).encode(), | ||
| method=method, | ||
| headers={**_headers(token), "Content-Type": "application/json"}, | ||
| ) | ||
| with urllib.request.urlopen(req) as resp: | ||
| return json.loads(resp.read()) |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
Add an HTTP timeout in _api calls.
Line 77 performs a blocking GitHub API request without a timeout, unlike the other network paths in this file. A stalled connection can hang the CI step indefinitely.
Suggested fix
def _api(path, token, method, payload):
@@
- with urllib.request.urlopen(req) as resp:
+ with urllib.request.urlopen(req, timeout=_HTTP_TIMEOUT_SEC) as resp:
return json.loads(resp.read())📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def _api(path, token, method, payload): | |
| req = urllib.request.Request( | |
| f"https://api.github.com{path}", | |
| data=json.dumps(payload).encode(), | |
| method=method, | |
| headers={**_headers(token), "Content-Type": "application/json"}, | |
| ) | |
| with urllib.request.urlopen(req) as resp: | |
| return json.loads(resp.read()) | |
| def _api(path, token, method, payload): | |
| req = urllib.request.Request( | |
| f"https://api.github.com{path}", | |
| data=json.dumps(payload).encode(), | |
| method=method, | |
| headers={**_headers(token), "Content-Type": "application/json"}, | |
| ) | |
| with urllib.request.urlopen(req, timeout=_HTTP_TIMEOUT_SEC) as resp: | |
| return json.loads(resp.read()) |
🧰 Tools
🪛 ast-grep (0.44.0)
[warning] 76-76: Request-controlled URL passed to urlopen; validate against an allowlist to prevent SSRF.
Context: urllib.request.urlopen(req)
Note: [CWE-918] Server-Side Request Forgery (SSRF).
(urlopen-unsanitized-data)
[info] 72-72: use jsonify instead of json.dumps for JSON output
Context: json.dumps(payload)
Note: [CWE-116] Improper Encoding or Escaping of Output.
(use-jsonify)
🪛 Ruff (0.15.18)
[error] 77-77: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@ci/utils/pr_test_summary.py` around lines 70 - 78, The _api helper currently
makes a blocking urllib.request.urlopen call to GitHub without any timeout, so
update _api in pr_test_summary.py to pass a finite timeout through the request
path. Use the existing _api function and its urllib.request.urlopen call as the
fix point, and keep the timeout consistent with the other network calls in this
file so a stalled connection cannot hang the CI step indefinitely.
swath1.mps (1.7 MB) is used by test_heuristics_only and test_incumbent_callbacks; it was not committed and was previously provided by download_miplib_test_dataset.sh which this PR removed. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Datasets needed by Python and server tests are now committed to the repo or replaced with already-committed alternatives, eliminating all network downloads from CI test scripts.
What changes:
datasets/solomon/In/r107.txt(7.5 KB Solomon VRPTW benchmark) fortest_solomon()a2864.mps(53 MB compressed) andsquare41.mps(19 MB compressed) with the already-committedafiro_original.mpsin all three warm-start tests (test_lp_solver.py,test_python_API.py,test_pdlp_warmstart.py). The PDLP iteration invariant assertion is dropped — it is already covered bycpp/tests/linear_programming/pdlp_test.cu:1844across 7 instances including a2864; the Python tests now only verify warm-start API behavior.download_pdlp_test_dataset.sh,download_miplib_test_dataset.sh, andget_test_data.shcalls fromci/test_python.sh,ci/test_wheel_cuopt.sh, andci/test_wheel_cuopt_server.sh. MIPLIB tests already useswath1.mpsandneos5-free-bound.mps, both of which are committed.--tspdownload and run fromci/test_wheel_cuopt.sh..gitignoreexceptions sodatasets/solomon/In/r107.txtis tracked.Savings: 2-3 minutes of download time and no external network dependency per CI run.