Add Python Vulnerability Lookup tool #261

Merged
simonw merged 2 commits into main from claude/vulnerability-lookup-tool-mhR8d
Mar 29, 2026

simonw (Owner) commented Mar 29, 2026

Fetch https://gist.githubusercontent.com/simonw/b11683fd23e12c764380e99196d1aa71/raw/112f6e754ce87715ba3ae0ce8238b83eee7a0182/spec.md with curl to /tmp and read it

That describes a CORS-enabled API for looking up vulnerabilities

Build a tool where I can copy and paste in either a pyproject.toml file OR a requirements.txt file and it extracts the packages listed there (both required and dev / optional dependencies) and uses the API above to look up any vulnerabilities - sending a version if one was provided, otherwise no version - and then displays that vulnerability information in a neat readable way on the page

The original pretty-printed JSON for each reported package should also be available but hidden in a details/summary and should have a Copy button

It should work by making a single hit to the vulnerabilities POST API and then parallel calls to the GET API for fetching details of each one

Use "uvx rodney --help" as your browser testing tool for this

Also add a feature where you can optionally give it the URL to a requirements.txt or pyproject.toml file OR the URL to a GitHub repo e.g. https://github.com/simonw/datasette OR just a simonw/datasette path and it uses the raw github API to guess the possible URL of both a requirements.txt or a pyproject.toml file and checks both to see if they 200 and then combines the requirements from both and does the lookup for those.
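
The URL-guessing step described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names `candidateUrls` and `fetchExisting` are hypothetical, the `HEAD` ref on raw.githubusercontent.com is assumed to resolve to the default branch, and GitHub `blob` page URLs are not converted.

```javascript
// Hypothetical helper: turn user input into a list of raw file URLs to try.
function candidateUrls(input) {
  const value = input.trim();
  // A direct link to a dependency file is used unchanged.
  if (/^https?:\/\/.+\/(requirements[^/]*\.txt|pyproject\.toml)$/i.test(value)) {
    return [value];
  }
  // Accept "owner/repo" or "https://github.com/owner/repo" (optional trailing slash).
  const match = value.match(
    /^(?:https?:\/\/github\.com\/)?([\w.-]+)\/([\w.-]+?)(?:\.git)?\/?$/
  );
  if (!match) return [];
  const [, owner, repo] = match;
  // Assumption: "HEAD" resolves to the repository's default branch.
  const base = `https://raw.githubusercontent.com/${owner}/${repo}/HEAD`;
  return [`${base}/requirements.txt`, `${base}/pyproject.toml`];
}

// Fetch every candidate in parallel and keep only those that return 200.
// fetchImpl is injectable so the function can be exercised without a network.
async function fetchExisting(urls, fetchImpl = fetch) {
  const responses = await Promise.all(
    urls.map((url) => fetchImpl(url).catch(() => null))
  );
  const found = [];
  for (let i = 0; i < urls.length; i++) {
    const response = responses[i];
    if (response && response.status === 200) {
      found.push({ url: urls[i], text: await response.text() });
    }
  }
  return found;
}
```

The texts returned by `fetchExisting` would then be parsed separately and their requirements combined before the lookup.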

That feature should be bookmarkable: hitting the tool with ?path=simonw/datasette or ?path=https://...url-to-requirements.txt should automatically run the code again for that, and when you use the feature it should update the URL bar to reflect that state

Make sure document.title updates to reflect the ?path= thing that was used
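
One possible shape for the bookmarkable `?path=` state, using the standard `URL` and `URLSearchParams` APIs. The function names, `BASE_TITLE`, and the `runLookup` callback are placeholders invented for this sketch, not names taken from the tool.

```javascript
// Read the ?path= value from a location.search string, or null if absent.
function readPathParam(search) {
  const value = new URLSearchParams(search).get("path");
  return value && value.trim() ? value.trim() : null;
}

// Build the shareable URL for a path, preserving the rest of the URL.
function shareUrl(href, path) {
  const url = new URL(href);
  if (path) url.searchParams.set("path", path);
  else url.searchParams.delete("path");
  return url.toString();
}

// Placeholder title; the tool's real title may differ.
const BASE_TITLE = "Python Vulnerability Lookup";
function titleFor(path) {
  return path ? `${path} - ${BASE_TITLE}` : BASE_TITLE;
}

// Browser-only wiring: update the URL bar and title, then run the lookup.
// runLookup is a hypothetical callback supplied by the page.
function applyState(path, runLookup) {
  history.replaceState(null, "", shareUrl(location.href, path));
  document.title = titleFor(path);
  return runLookup(path);
}
```

On page load, something like `const path = readPathParam(location.search); if (path) applyState(path, runLookup);` would re-run the lookup for a bookmarked URL. Characters such as `/` may appear percent-encoded in the URL bar and are decoded again by `readPathParam`.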

Summary

This PR adds a new interactive web-based tool for checking Python package vulnerabilities against the OSV.dev database. The tool allows users to paste dependency files or fetch them from GitHub repositories and displays detailed vulnerability information.

Key Features

  • Multiple input methods: Paste pyproject.toml or requirements.txt content directly, or fetch from GitHub repositories
  • Flexible parsing: Automatically detects and parses both pyproject.toml (including [project] dependencies and [dependency-groups]) and requirements.txt formats
  • GitHub integration: Supports shorthand notation (owner/repo), full GitHub URLs, and direct file URLs
  • OSV.dev integration: Queries the OSV.dev API to fetch vulnerability data for detected packages
  • Rich UI:
    • Color-coded severity badges (Critical, High, Moderate, Low)
    • Detailed vulnerability information including summaries, aliases, and affected version ranges
    • Expandable raw JSON view for each vulnerability with copy-to-clipboard functionality
    • Summary statistics showing total vulnerabilities and clean packages
  • Responsive design: Mobile-friendly layout with proper styling for various screen sizes
  • URL state management: Supports URL parameters to share vulnerability checks (e.g., ?path=owner/repo)
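
As a rough illustration of the requirements.txt side of the parsing, here is a minimal sketch. `parseRequirementLine` and `parseRequirements` are hypothetical names, not the functions in this PR, and the sketch assumes that only an exact `==` pin should be sent as a version.

```javascript
// Hypothetical helper: parse one requirements.txt line into {name, version}.
// Returns null for blank lines, comments, pip options ("-r other.txt") and bare URLs.
function parseRequirementLine(line) {
  // Strip comments: "#" at the start of the line or preceded by whitespace.
  const text = line.replace(/(^|\s)#.*$/, "").trim();
  if (!text || text.startsWith("-")) return null;
  if (/^[a-z+]+:\/\//i.test(text)) return null;
  // Drop environment markers, which follow a ";".
  const spec = text.split(";")[0].trim();
  const match = spec.match(
    /^([A-Za-z0-9][A-Za-z0-9._-]*)\s*(?:\[[^\]]*\])?\s*(.*)$/
  );
  if (!match) return null;
  // Normalise the name: lowercase, runs of "-", "_" and "." become "-".
  const name = match[1].toLowerCase().replace(/[-_.]+/g, "-");
  // Assumption: only an exact pin yields a single version to look up.
  const pin = match[2].match(/^={2,3}\s*([^\s,*]+)$/);
  return { name, version: pin ? pin[1] : null };
}

// Parse a whole file, skipping lines that do not name a package.
function parseRequirements(text) {
  return text.split(/\r?\n/).map(parseRequirementLine).filter(Boolean);
}
```

Range specifiers such as `>=2.0` produce `version: null`, which matches the request to send no version when none was provided.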

Implementation Details

  • Uses the OSV.dev batch query API (/v1/querybatch) for efficient vulnerability lookups
  • Implements custom TOML array parsing to extract dependencies from various TOML sections
  • Handles PEP 508 dependency specification format
  • Deduplicates packages across multiple dependency files
  • Provides user feedback through status messages during fetching and querying
  • Stores vulnerability JSON data for easy copying without re-fetching
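
The "one batch POST, then parallel GETs" flow might look something like the sketch below. It assumes the public OSV.dev endpoints `/v1/querybatch` and `/v1/vulns/{id}` and the `PyPI` ecosystem name; the function names are illustrative, `fetchImpl` is injectable for testing, and pagination via `next_page_token` is left out.

```javascript
const OSV = "https://api.osv.dev/v1";

// Build the /v1/querybatch body; version is included only when known.
function buildBatchBody(packages) {
  return {
    queries: packages.map(({ name, version }) => {
      const query = { package: { name, ecosystem: "PyPI" } };
      if (version) query.version = version;
      return query;
    }),
  };
}

// One POST covering every package, then parallel GETs for the full records.
async function lookup(packages, fetchImpl = fetch) {
  const response = await fetchImpl(`${OSV}/querybatch`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildBatchBody(packages)),
  });
  if (!response.ok) throw new Error(`querybatch failed: ${response.status}`);
  const { results = [] } = await response.json();

  // results[i] corresponds to packages[i]; fetch each distinct id only once.
  const ids = [
    ...new Set(results.flatMap((r) => (r.vulns || []).map((v) => v.id))),
  ];
  const details = await Promise.all(
    ids.map(async (id) => {
      const res = await fetchImpl(`${OSV}/vulns/${encodeURIComponent(id)}`);
      if (!res.ok) throw new Error(`vuln ${id} failed: ${res.status}`);
      return [id, await res.json()];
    })
  );
  const byId = new Map(details);
  // Attach the full vulnerability records to each input package.
  return packages.map((pkg, i) => ({
    ...pkg,
    vulns: ((results[i] || {}).vulns || []).map((v) => byId.get(v.id)),
  }));
}
```

Collecting ids into a `Set` before the GETs means a vulnerability shared by several packages is fetched once.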

https://claude.ai/code/session_01FofeT3UVvJGdSBqi4vkQg6

Single-file HTML tool that parses pyproject.toml or requirements.txt files,
queries the OSV.dev API for known vulnerabilities, and displays results with
severity badges, affected version ranges, and expandable raw JSON.

Supports fetching dependencies directly from GitHub repos via ?path=owner/repo
parameter, with bookmarkable URLs and document title updates.

https://claude.ai/code/session_01FofeT3UVvJGdSBqi4vkQg6

claude bot (Contributor) commented Mar 29, 2026

Claude finished @simonw's task


Code Review Complete

I've thoroughly reviewed the Python vulnerability lookup tool implementation. Here's my comprehensive feedback:

  • Read and analyze the HTML file code
  • Evaluate code quality and best practices
  • Check for potential bugs or issues
  • Assess performance considerations
  • Review security concerns
  • Evaluate test coverage
  • Provide comprehensive feedback

Code Quality & Best Practices ✅

Strengths:

  • Clean, well-structured HTML with semantic markup
  • Comprehensive CSS with responsive design and good accessibility (focus states, proper contrast)
  • Modular JavaScript functions with clear separation of concerns
  • Good error handling with user-friendly status messages
  • Consistent code style and naming conventions

Minor improvements:

  • Consider extracting inline styles to CSS classes (lines 194-202)
  • Function parseRequirementsTxt at line 213 could benefit from more robust comment handling

Potential Bugs & Issues ⚠️

Critical Issues Found:

  1. TOML parsing fragility (lines 226-256): The custom TOML array parser is brittle and could break on valid TOML:

    • Doesn't handle escaped quotes in strings: "string with \" quote"
    • Fails on multi-line strings or complex TOML structures
    • No validation of TOML syntax
  2. Regex issues (lines 218, 238):

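    • Package name regex [a-zA-Z0-9_.-]+ is looser than the PEP 508 name grammar: it does match hyphens, but it also accepts names that begin or end with ., _ or -, whereas PEP 508 requires a name to start and end with a letter or digit
    • Could therefore accept malformed package names instead of rejecting them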
  3. Error handling gaps:

    • No timeout handling for fetch requests (lines 556, 624, 650)
    • Network failures could leave UI in disabled state
    • No handling of malformed JSON responses from OSV.dev API
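
For the escaped-quote case in item 1, a character-by-character scan is one option if a full TOML library is not wanted. The sketch below is a hypothetical helper, not the parser in this PR. It handles basic strings with backslash escapes, single-quoted literal strings, and `#` comments between items. It does not handle multi-line strings or nested arrays, and it keeps the escaped character as-is rather than decoding sequences such as `\n` or `\uXXXX`.

```javascript
// Hypothetical scanner: collect the string items of one TOML array body,
// i.e. the text between "[" and "]".
function tomlArrayStrings(body) {
  const items = [];
  let i = 0;
  while (i < body.length) {
    const ch = body[i];
    if (ch === "#") {
      // Comment between items: skip to the end of the line.
      while (i < body.length && body[i] !== "\n") i++;
    } else if (ch === '"') {
      // Basic string: a backslash escapes the next character.
      let value = "";
      i++;
      while (i < body.length && body[i] !== '"') {
        if (body[i] === "\\" && i + 1 < body.length) {
          value += body[i + 1];
          i += 2;
        } else {
          value += body[i++];
        }
      }
      i++; // step past the closing quote
      items.push(value);
    } else if (ch === "'") {
      // Literal string: no escapes, ends at the next single quote.
      const end = body.indexOf("'", i + 1);
      if (end === -1) break; // unterminated literal string
      items.push(body.slice(i + 1, end));
      i = end + 1;
    } else {
      i++; // whitespace, commas, anything else
    }
  }
  return items;
}
```

A dependency such as `"pkg; python_version < \"3.11\""` then comes through as a single item with its inner quotes intact.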

Performance Considerations ⚡

Good practices:

  • Efficient use of Promise.all() for concurrent API calls (lines 554, 653)
  • Deduplication prevents redundant API calls (lines 266-272, 579-585)
  • Disabling the button while a request is in flight prevents duplicate submissions

Areas for optimization:

  • Large vulnerability JSON storage in memory (jsonStore at line 354) could cause memory issues with many vulnerabilities
  • Consider lazy loading of raw JSON details instead of storing all upfront
  • Batch API calls are well-implemented but could add request size limiting

Security Concerns 🔒

Well-handled security aspects:

  • Proper HTML escaping via escapeHtml() function (lines 480-484)
  • Safe URL construction with encodeURIComponent() (lines 432, 650)
  • rel="noopener" on external links (line 432)
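
For readers unfamiliar with the pattern, the three points above commonly look something like this. It is a generic sketch, not the code at the cited lines: `escapeHtmlSketch` and `vulnLink` are hypothetical names, and the `https://osv.dev/vulnerability/<id>` link format is an assumption about where the tool links to.

```javascript
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Escape text before interpolating it into an HTML string.
function escapeHtmlSketch(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

// Build an external link: the id is URL-encoded in the href and
// HTML-escaped in the link text, and the link carries rel="noopener".
function vulnLink(id) {
  const href = `https://osv.dev/vulnerability/${encodeURIComponent(id)}`;
  return `<a href="${href}" target="_blank" rel="noopener">${escapeHtmlSketch(id)}</a>`;
}
```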

Security recommendations:

  • Add Content Security Policy (CSP) headers to prevent XSS
  • Consider input validation for GitHub repo patterns to prevent SSRF
  • Raw URLs from user input are fetched without validation (line 556) - could be used for SSRF attacks

Test Coverage 🧪

Missing test coverage:
This is a client-side only HTML file with no automated tests. Consider adding:

  • Unit tests for parsing functions (parseRequirementsTxt, parsePyprojectToml)
  • Integration tests for API interactions
  • UI tests for error states and edge cases
  • Tests for the TOML parsing edge cases identified above

Recommendations

High Priority:

  1. Fix TOML parsing: Consider using a proper TOML parser library or add extensive validation
  2. Add request timeouts: Prevent hanging requests from blocking the UI
  3. Improve error recovery: Ensure UI remains functional after network errors
  4. Input validation: Validate GitHub URLs before fetching to prevent SSRF

Medium Priority:

  5. Add CSP headers for additional security
  6. Implement proper PEP 508 dependency parsing
  7. Add request size limits for very large dependency files
  8. Consider adding basic unit tests
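
Recommendations 2 and 3 could be approached along these lines. This is a sketch under assumptions: `fetchWithTimeout` and `withButtonDisabled` are hypothetical helpers, the timeout uses the standard `AbortController`, and `fetchImpl` is injectable so the behaviour can be checked without a network.

```javascript
// Reject if the request takes longer than `ms` milliseconds.
async function fetchWithTimeout(url, ms, fetchImpl = fetch, init = {}) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    return await fetchImpl(url, { ...init, signal: controller.signal });
  } finally {
    clearTimeout(timer); // always clear the timer, on success or failure
  }
}

// Run a task with the button disabled, re-enabling it on success or failure
// so a network error cannot leave the UI stuck.
async function withButtonDisabled(button, task) {
  button.disabled = true;
  try {
    return await task();
  } finally {
    button.disabled = false;
  }
}
```

A caller would wrap the whole lookup in `withButtonDisabled` and report any rejection through the existing status message area.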

Overall Assessment: This is a well-crafted tool with excellent UI/UX and solid architecture. The main concerns are around the custom TOML parser robustness and some security hardening opportunities. The code demonstrates good JavaScript practices and thoughtful error handling in most areas.

The implementation successfully delivers all the promised features and provides a great user experience for Python vulnerability scanning.


cloudflare-workers-and-pages bot commented Mar 29, 2026

Deploying tools with Cloudflare Pages

Latest commit: 63d2b48
Status: ⚡️ Build in progress...

simonw (Owner, Author) commented Mar 29, 2026

make sure that hitting <enter> in the load from URL text input submits that form
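
One way this could be handled in script is sketched below. `submitOnEnter` and `wireUrlInput` are hypothetical names, and the `loadFromUrl` callback stands in for whatever the page already runs when the button is clicked.

```javascript
// Return a keydown handler that calls submit() when Enter is pressed,
// ignoring Enter keystrokes that belong to an IME composition.
function submitOnEnter(submit) {
  return (event) => {
    if (event.key !== "Enter" || event.isComposing) return;
    event.preventDefault();
    submit();
  };
}

// Browser wiring: attach the handler to the URL text input.
// Both arguments are supplied by the page; nothing here runs on its own.
function wireUrlInput(input, loadFromUrl) {
  input.addEventListener(
    "keydown",
    submitOnEnter(() => loadFromUrl(input.value))
  );
}
```

An alternative with no key handling at all is to wrap the input and its button in a `<form>` and listen for the form's `submit` event, since browsers submit a form when Enter is pressed in a text input.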

claude bot (Contributor) commented Mar 29, 2026

Claude finished @simonw's task



@simonw simonw merged commit d9bada9 into main Mar 29, 2026
1 of 4 checks passed
