-
-
Notifications
You must be signed in to change notification settings - Fork 645
refactor: optimize venv creation for nvidia and pkgutil style namespace packages #3460
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: optimize venv creation for nvidia and pkgutil style namespace packages #3460
Conversation
…into refactor.optimize.venv.namespace.packages
….namespace.packages
….namespace.packages
….namespace.packages
…into fix.conflict.merging.files
….namespace.packages
…r.optimize.venv.pkgutil.namespace.packages
Summary of ChangesHello @rickeylev, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly refactors the virtual environment (venv) creation process to improve efficiency and reduce build conflicts, particularly when dealing with Python namespace packages. By introducing mechanisms to automatically detect and correctly handle Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a valuable optimization for creating virtual environments, specifically targeting pkgutil-style namespace packages and the nvidia package family. By intelligently identifying these namespace packages, the changes reduce analysis-time overhead and conflicts, which is a great performance improvement. The addition of comprehensive tests to validate this new logic is also excellent. However, I've found a critical issue in the implementation that nullifies the optimization for pkgutil and nvidia packages. Please see the detailed comment.
…into refactor.optimize.venv.namespace.packages
…r.optimize.venv.pkgutil.namespace.packages
|
Doh, forgot to remove some debugging files... |
…into refactor.optimize.venv.pkgutil.namespace.packages
|
PTAL. Debug code removed and tests fixed. |
|
Should we close #3401 as part of this? |
|
Yes, marked this as fixing #3401. I'm not sure what optimizations are left action-count wise. |
When pkgutil style namespace packages are used, multiple distributions provide
the same venv path (e.g.
foo/__init__.py). The venv symlink logic then tries tosymlink the
foo/directory as it looks like the highest linkable directory. Whenconflict merging logic runs later, it then has to flatten a depset with all the files
in the conflicting distributions.
To fix, have whl_library() try to guess when a file is a pkgutil namespace package.
These are then pass onto py_library's venv building logic so it can treat the
directories as not directly linkable. A conflict still occurs, but it only
contains the single
__init__.pyfile.Along the way, special case the "nvidia" package name and always treat it as a namespace
package. This is because nvidia packages aren't strictly correct: each has a blank
__init__.pyfile (which marks it as a regular package, not namespace package). Specialcasing like this is undesirable, but it greatly reduces the number of conflicts if
e.g. torch is installed, and I couldn't find any other metadata to indicate it's a
namespace package.
Along the way, add some hints to AGENTS.md so they understand repository rules better.
Fixes #3401