exp: use allow_fail True for downloads after checking the index contents #3657
aignas wants to merge 3 commits into bazel-contrib:main
This makes the parser logic a little more sophisticated, but it also starts handling the yank reason. This fixes the issue where a `data-yank` attribute that is present but has no value would be interpreted as a yanked package. With this it should start working.

This implementation assumes that tag values contain HTML-escaped sequences, and it unescapes them when returning the strings.

The possibilities that this gives us are:
- Use `data-requires-python` to potentially discard any Python packages that are unsupported in the `select_whl` function.

Work towards bazel-contrib#260.
Work towards bazel-contrib#2731.
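The attribute handling described above can be sketched in plain Python. This is only an illustration, not the Starlark parser from this PR: the class name `SimpleIndexParser`, the function `parse_links`, and the dictionary keys are hypothetical, and the attribute is spelled `data-yanked` as in PEP 592. The sketch keeps "attribute absent", "attribute present without a value", and "attribute present with a reason" apart, so the caller can decide how to treat each case.

```python
from html.parser import HTMLParser


class SimpleIndexParser(HTMLParser):
    """Collects anchor metadata from a package index page (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # HTMLParser has already unescaped HTML entities in attribute values.
        # A bare attribute (no "=value") arrives with the value None.
        attr_map = dict(attrs)
        if "href" not in attr_map:
            return
        self.links.append({
            "href": attr_map["href"],
            # Presence and reason are tracked separately on purpose.
            "yanked_present": "data-yanked" in attr_map,
            "yank_reason": attr_map.get("data-yanked") or "",
            "requires_python": attr_map.get("data-requires-python") or "",
        })


def parse_links(html_text):
    """Returns one dict per <a href=...> found in html_text."""
    parser = SimpleIndexParser()
    parser.feed(html_text)
    parser.close()
    return parser.links
```

With the presence flag and the reason kept as separate fields, a later selection step could filter on `requires_python` or on the yank status without re-reading the HTML.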
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This experimental pull request introduces a significant change to how Python packages are resolved and downloaded from PyPI. Instead of directly attempting downloads, the system now first queries the PyPI index contents to identify available packages and their precise download URLs. This approach aims to improve reliability, enable cross-compilation by fetching platform-specific wheels, and lay the groundwork for more sophisticated package resolution, especially in multi-index or private index scenarios. The changes also refine the handling of "yanked" packages by allowing a more descriptive status.
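The "query the index first, then download" idea can be sketched as follows. This is a hedged Python illustration under stated assumptions, not the code in this PR: `pick_index_per_package` and its input shape are hypothetical, and the "first index that lists the package wins" policy is an assumption made for the example. Only the name normalization follows a known rule (PEP 503).

```python
import re


def normalize_name(name):
    """PEP 503 normalization: runs of '-', '_' and '.' become '-', lowercased."""
    return re.sub(r"[-_.]+", "-", name).lower()


def pick_index_per_package(requested, index_listings):
    """Maps each requested package to the first index whose root page lists it.

    index_listings is a dict of index URL -> project names found on that
    index's root page; dict insertion order is treated as index priority
    (an assumption for this sketch).
    """
    chosen = {}
    for index_url, names in index_listings.items():
        available = {normalize_name(n) for n in names}
        for pkg in requested:
            key = normalize_name(pkg)
            if key in available and key not in chosen:
                chosen[key] = "{}/{}/".format(index_url.rstrip("/"), key)
    return chosen
```

Packages that no index lists are simply absent from the result, so the caller can report them instead of attempting a download that is known to fail.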
Code Review
This pull request significantly improves the handling of multiple PyPI indexes by introducing a more efficient method for querying index roots and refactoring the HTML parser to support modern features like yanked packages. A security audit found no medium, high, or critical vulnerabilities, confirming adherence to Bazel's security model and absence of injection or data leakage issues. However, a critical error handling placeholder (fail("TODO")) for unavailable indexes still needs to be addressed.
    if not result.success:
        fail("TODO")
Calling fail() here will cause the entire dependency resolution to crash if an index URL is unavailable or returns an error. This could be due to transient network issues or a misconfigured URL. It would be more resilient to handle this failure gracefully by logging a warning and skipping the faulty index, allowing the process to continue with the remaining ones.
Suggested change:

    -if not result.success:
    -    fail("TODO")
    +if not result.success:
    +    # TODO: Add logging to warn the user about the unreachable index.
    +    continue
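The skip-and-continue behaviour suggested here can be sketched in Python. This is an illustration only: `collect_index_pages`, the `Result` tuple, and the `download` and `warn` callables are hypothetical stand-ins, with `Result.success` playing the role of the `success` field checked in the snippet above.

```python
from collections import namedtuple

# Hypothetical stand-in for the download result checked in the review snippet.
Result = namedtuple("Result", ["success", "output"])


def collect_index_pages(index_urls, download, warn=print):
    """Fetches every index, skipping the ones that fail instead of aborting."""
    pages = {}
    for url in index_urls:
        result = download(url)
        if not result.success:
            # Warn and move on so one unreachable index does not stop the rest.
            warn("skipping unreachable index: {}".format(url))
            continue
        pages[url] = result.output
    return pages
```

Passing `warn` as a parameter keeps the sketch testable; whether a warning or a hard failure is the right default is a design decision for the PR itself.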
        {obj}`http_file` for docs.
    cache: {type}`struct` the `pypi_cache` instance.
    get_auth: A function to get auth information. Used in tests.
    parse_index: TODO
The documentation for the parse_index parameter is a 'TODO'. Please provide a clear explanation of its purpose to improve maintainability.
Suggested change:

    -parse_index: TODO
    +parse_index: {type}`bool` Whether to parse the content as a root index page (e.g. `/simple/`) instead of a package-specific page.
Stacked on #3656.
I am leaving this here in case someone is interested and motivated to
collaborate on this (i.e. help with writing code and doing testing with their
setup).
Summary:
Things that I would like to see in the final solution:
use for each package. That way we can be sure that it works as we see that
the package names for some of the packages are not the same as the package
name itself. (you can do this if you output the return value of the index.
Experiment for #2632
Related to #3260