Skip to content

Conversation

@Peter554
Copy link
Collaborator

@Peter554 Peter554 commented Nov 10, 2025

Closed in favor of
#264


A demo showing a complete port of build_graph to rust, for your interest/consideration @seddonym

The benchmark results from my machine:

just benchmark-build-graph-rust-vs-python octoenergy /Users/peter.byfield/projects/kraken-core/src
Benchmarking graph building for package: octoenergy
============================================================

Python version without cache (build_graph):
  Time:    3.127s
  Modules: 65757
  Imports: 221797

Python version with cache - first run (cold cache):
  Time:    3.478s
  Modules: 65757
  Imports: 221797

Python version with cache - second run (warm cache):
  Time:    2.647s
  Modules: 65757
  Imports: 221797

Rust version without cache (build_graph_rust):
  Time:    1.931s
  Modules: 65757
  Imports: 221797

Rust version with cache - first run (cold cache):
  Time:    1.949s
  Modules: 65757
  Imports: 221797

Rust version with cache - second run (warm cache):
  Time:    0.967s
  Modules: 65757
  Imports: 221797

============================================================
Comparison:
  Python (no cache):          3.127s
  Python (cold cache):        3.478s  (0.90x speedup)
  Python (warm cache):        2.647s  (1.18x speedup)
  Rust (no cache):            1.931s  (1.62x vs Python no cache)
  Rust (cold cache):          1.949s  (1.60x vs Python no cache)
  Rust (warm cache):          0.967s  (3.23x vs Python no cache)

  Python cache speedup:       1.18x
  Rust cache speedup:         2.00x

@Peter554 Peter554 self-assigned this Nov 10, 2025
@codspeed-hq
Copy link

codspeed-hq bot commented Nov 10, 2025

CodSpeed Performance Report

Merging #263 will improve performances by ×2.4

Comparing Peter554:build-graph-rust (efa9e4b) with main (558e62c)

Summary

⚡ 5 improvements
✅ 21 untouched
⏩ 23 skipped1

Benchmarks breakdown

Mode Benchmark BASE HEAD Change
WallTime test_build_django_from_cache_a_few_misses[15] 153.7 ms 73.9 ms ×2.1
WallTime test_build_django_from_cache_a_few_misses[2] 151.2 ms 71.5 ms ×2.1
WallTime test_build_django_from_cache_a_few_misses[350] 282 ms 145.6 ms +93.73%
WallTime test_build_django_from_cache_no_misses 142.8 ms 68.8 ms ×2.1
WallTime test_build_django_uncached 157.4 ms 64.7 ms ×2.4

Footnotes

  1. 23 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@seddonym seddonym self-requested a review November 10, 2025 12:55
Copy link
Collaborator

@seddonym seddonym left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very exciting! I can confirm that it's noticeably faster locally, on a large code base.

With cache: 2.3s -> 0.9s (Subsecond for the first time!)
Without cache: 3.2s -> 2.5s.

It would be interesting to see how we could move this over while retaining the same Python-based testing approach, in particulary things like interacting with fake file systems. How far away are we from having a PR that switches to this implementation while still retaining the testing approach?


# Create the graph_builder
package_spec = rust.PackageSpec(package_name, package_directory)
graph_builder = rust.GraphBuilder(package_spec)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not convinced this should be a mutable class - why not just a function?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes made it a function now - that's more pythonic

rust_graph = graph_builder.build()

# Wrap the rust graph in our ImportGraph wrapper
graph = ImportGraph()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should encapsulate this so we don't need to access private attributes here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done it

More pythonic than builder pattern.
self
}

pub fn build(&self) -> Graph {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added error handling a bit late 0c28293, would have been better to do this from the start 🙈

1000 is still enough, and it allows deadlocks to show up in the
test suite if we've gotten the logic wrong.
@Peter554
Copy link
Collaborator Author

It would be interesting to see how we could move this over while retaining the same Python-based testing approach, in particulary things like interacting with fake file systems. How far away are we from having a PR that switches to this implementation while still retaining the testing approach?

@seddonym Yeah this is the main change related to this PR. I don't really know how to integrate the abstract file system approach, at least not without compromising performance.

See the discover_and_parse_modules function in the PR. Module discovery and parsing are now operating in parallel, so we don't have to discover all the modules before we can begin parsing. Communication between the module discovery and module parsing threads is done over a channel. This is one of the main performance tricks in this PR. I don't really know how we could adapt that approach to the abstract/fake file system.

I'm a bit unsure how much the abstract/fake file system help us. It makes the tests a bit faster, but I wonder how much. Are we sure we couldn't get away with testing on the actual file system (temporary files)? If I add an assert 0 at the top of build_graph only 67 tests fail - running that number of tests using temporary files might actually just be fine. If it's needed the abstract/fake file system helps us, but if not it kind of gets in the way IMO.

@Peter554 Peter554 changed the title Demo: Build graph rust Demo: Build graph rust (Parallel) Nov 12, 2025
@Peter554
Copy link
Collaborator Author

I'll close this PR since #264 is almost as fast and is conceptually simpler.

@Peter554 Peter554 closed this Nov 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants