Skip to content

Conversation

@agriyakhetarpal
Copy link
Member

@agriyakhetarpal agriyakhetarpal commented Mar 29, 2025

Description

This PR is the first in a series of PRs aiming to unblock #109 and also towards strides in other areas; such as being able to build a package from one directory twice, see pyodide/pyodide#5031. In particular, this adds two new flags: --no-isolation/-n and --skip-dependency-check/-x, to the pyodide build CLI. These will allow us not to use an isolated virtual environment when building a wheel for a package if required.

Closes #136

See also

Some TODO items

  • Get "unisolated" builds working
  • Get tests passing
  • Investigate and fix the issue where the compiler wrappers are ending up in temporary directories
  • Gotta add tests!!!
    • Basic tests that see if the isolation is working
    • Integration test with NumPy/SciPy/etc.
  • Add a CHANGELOG entry

@agriyakhetarpal agriyakhetarpal marked this pull request as ready for review March 30, 2025 01:20
@agriyakhetarpal
Copy link
Member Author

Hi @ryanking13, I've been able to build and test out the feature(s) I've added in this PR, but one problem still exists: the compiler wrappers for Emscripten are still being placed inside temporary directories as a part of the build process for some reason: https://github.com/pyodide/pyodide-build/actions/runs/14151180877/job/39644562112?pr=152#step:11:449

What this means in action is that if I were to build NumPy without isolation via the below command invocation in a persistent build directory, say, in build/ inside the NumPy repo root:

pyodide build . \
-Csetup-args="--cross-file=$(pyodide config get meson_cross_file)" \
-Csetup-args="-Dblas=none" -Csetup-args="-Dlapack=none" \
-nx \ # i.e., --no-isolation and --skip-dependency-check args, which this PR adds
-Cbuild-dir=build`

I am able to build a NumPy wheel. However, if I were to rerun the same command, hoping to make use of the caching provided by the persistent build directory and achieve a speedup for the subsequent run(s) , it would fail because Meson stores its build metadata inside the build directory, which also contains information about the compiler, etc. Meson then fails to find the compiler, as the temporary directory used the previous wheel build where the compiler wrappers were stored no longer exists.

When I build NumPy with a persistent build dir with native compilation on my macOS machine, the compilers are picked up correctly from conda-forge each time – this makes me think that this is a pyodide-build specific issue. Maybe it has something to do with pywasmcross?

I think you would be the best person to know why this is the case and on how we can address it, so I thought I should ask you about it! :)

@ryanking13
Copy link
Member

Hi @agriyakhetarpal, thanks for working on this.

the compiler wrappers for Emscripten are still being placed inside temporary directories as a part of the build process for some reason:

Yep, we are using temporary directory for now, but I think you can change it. We've already changed it in recipe build to store the symlinks in the build directory (#96), so I think you can do the similar thing in the pyodide build.

The directory where the symlinks are stored is decided here and here. I think you can pass the build_dir parameter here to store the symlink.

config_settings: ConfigSettingsType,
output_lockfile: str | None,
isolation: bool = True,
skip_dependency_check: bool = False,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(No action is required for this PR)

It may be weird to build all the dependencies in a no-isolation mode. But let's go as-is for now. I personally want to deprecate the feature to build the dependencies if package maintainers in Python ecosystem starts to build Pyodide wheel themselves.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Joe Marshall added this and we were always a bit unsure if it made sense as a feature.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I added these extra args to normalise what passes through these functions, even if unused. I agree that it makes sense to drop it entirely. I don't think I've used it personally, and I I haven't seen it being used in the main Pyodide repo.

builder = _ProjectBuilder(srcdir, runner=_gen_runner(build_env))

if not skip_dependency_check:
missing = builder.check_dependencies(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

check_dependencies looks like a union of

  1. get_requires_for_build (== dependencies for the build backend such as setuptools, meson)
  2. build_system_requires (== dependencies in the pyproject.toml)

In the isolated build, we've been dealing with them separately. I guess we can use check_dependencies there too (no action required for this PR).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved this to a new issue: #175.

@agriyakhetarpal agriyakhetarpal marked this pull request as draft March 31, 2025 00:35
@agriyakhetarpal
Copy link
Member Author

Yep, we are using temporary directory for now, but I think you can change it. We've already changed it in recipe build to store the symlinks in the build directory (#96), so I think you can do the similar thing in the pyodide build.

The directory where the symlinks are stored is decided here and here. I think you can pass the build_dir parameter here to store the symlink.

Thanks a lot for the pointer! I'm able to get subsequent builds for NumPy working. :D

I was able to compile NumPy in 77 seconds on my machine, and taking advantage of the build caching through the persistence of the build dir, numpy-tests compiles in 8 seconds.

@agriyakhetarpal agriyakhetarpal marked this pull request as ready for review March 31, 2025 02:48
@agriyakhetarpal agriyakhetarpal force-pushed the feat/support-non-isolated-builds branch from 217185c to 6088aba Compare April 1, 2025 17:58
agriyakhetarpal added a commit that referenced this pull request Apr 1, 2025
The integration tests have been failing on `main` since CMake 4 came out
due to an uninterpretable caching issue. Upgrading to `geos`'s latest
v3.13.1 release seems to fix it. This PR has been stemmed off from #152.
@ryanking13
Copy link
Member

Thanks again Agriya! I didn't forget this PR, but I've been a little busy this week and haven't been keeping up with all the PRs. I would like to review this in detail, so please wait for me until the weekend! 🙏

@agriyakhetarpal
Copy link
Member Author

No rush, thank you! :)

Copy link
Member

@ryanking13 ryanking13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some minor comments, but it looks good to me overall.

I was able to compile NumPy in 77 seconds on my machine, and taking advantage of the build caching through the persistence of the build dir, numpy-tests compiles in 8 seconds.

Nice!

BTW, some of the Google's Bazel-based packages were relying on no-isolated builds, such as JAX. There were a lot of other build issues as well, but I think would make one step closer to supporting those packages. Thank you!

Comment on lines 287 to +288
@contextmanager
def _create_symlink_dir(env: dict[str, str], build_dir: Path | None):
if build_dir:
# If we're running under build-recipes, leave the symlinks in
# the build directory. This helps with reproducing.
symlink_dir = build_dir / "pywasmcross_symlinks"
shutil.rmtree(symlink_dir, ignore_errors=True)
symlink_dir.mkdir()
yield symlink_dir
return

# Running from "pyodide build". Put symlinks in a temporary directory.
# TODO: Add a debug option to save the symlinks.
with TemporaryDirectory() as symlink_dir_str:
yield Path(symlink_dir_str)
def _create_symlink_dir(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are not using a temporary directory anymore, so I don't think this needs to be a context manager anymore. No cleanup is done at the function end.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this, and it still requires a temporary directory to be created in one more case, as get_build_env also allows build_dir to be None. It looks like it's a little more complex than expected, so I hope it's okay to leave this and try it out in a follow-up PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved it to a new issue: #176

@agriyakhetarpal
Copy link
Member Author

Many thanks for the detailed review, @ryanking13! I'll merge this now, and I've opened issues for adding follow-ups.

@agriyakhetarpal agriyakhetarpal merged commit 557762c into main Apr 5, 2025
18 checks passed
@agriyakhetarpal agriyakhetarpal deleted the feat/support-non-isolated-builds branch April 5, 2025 22:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support --no-isolation flag from pypa/build for "unisolated" package builds

4 participants