Skip to content

trie: swap out the pools in StackTrie with an arena allocator #32227

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from

Conversation

omerfirmak
Copy link
Contributor

@omerfirmak omerfirmak commented Jul 17, 2025

Arena allocators are a perfect match for something like StackTrie where allocations happens in stack-like "frames" and can then be deallocated in-bulk at the end of the "allocation frame"

We can construct the "allocation frames" every time we create a new extension/branch node and decontstruct the frame by deallocating everything allocated on the arena during that frame.

This saves us from dealing with individual allocations/deallocations

goos: linux
goarch: amd64
pkg: github.com/ethereum/go-ethereum/trie
cpu: AMD Ryzen 7 5700U with Radeon Graphics         
              │ /tmp/old.txt │            /tmp/new.txt             │
              │    sec/op    │   sec/op     vs base                │
Insert100K-16    40.57m ± 1%   31.01m ± 1%  -23.55% (p=0.000 n=10)

@omerfirmak omerfirmak requested a review from rjl493456442 as a code owner July 17, 2025 07:43
@rjl493456442 rjl493456442 self-assigned this Jul 17, 2025
Arena allocators are a perfect match for something like StackTrie
where allocations happens in stack-like "frames" and can then be
deallocation at the end of the "allocation frame"

We can construct the "allocation frames" every time we create a new
extension/branch node and decontstruct the frame by deallocating everything
allocated on the arena during that frame.

This saves us from dealing with individual allocations/deallocations
@rjl493456442
Copy link
Member

I haven't really dive in the implementation details, quick question: is it possible to only track the arena offset for the branch node? It feels a bit overcomplicated for tracking extension node as well.

@omerfirmak
Copy link
Contributor Author

I haven't really dive in the implementation details, quick question: is it possible to only track the arena offset for the branch node? It feels a bit overcomplicated for tracking extension node as well.

Yeah, extension node makes it a bit more tricky. But it should be possible to track branches only as well. Performance wise, it shouldn't really make any difference.

I wanted to be on the safe side wrt maintaining a similar peak allocation size as the original implementation, so I went for both types.

Btw, I managed to snap sync to sepolia with this. So it is highly likely that implementation is correct.

@omerfirmak
Copy link
Contributor Author

This PR took 20mins less to finish the initial phase of snap sync.

PR: INFO [07-18|12:57:10.970] Syncing: state download in progress synced=100.00% state=350.56GiB accounts=307,257,191@61.95GiB slots=1,336,088,498@278.15GiB codes=1,719,210@10.46GiB eta=-5m36.165s

Master: INFO [07-18|13:17:25.727] Syncing: state download in progress synced=100.00% state=350.57GiB accounts=307,766,814@61.95GiB slots=1,336,091,723@278.16GiB codes=1,719,330@10.46GiB eta=-45.924s

n = t.nodeAllocator.Alloc().reset()
n.typ = hashedNode
n.val = e.val
t.pushAllocationFrame() // for the "new" st that might end up being a branch or a ext
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason to push frame here?

The case here is:

     +----------------+                                  +--------------------------+
     |   Extension    |                                  |         FullNode         |
     +----------------+                                  +--------------------------+
              |                     ------------->            +----------------+     
              |                                               |   Extension    |     
              v                                               +----------------+     
+--------------------------+                             +--------------------------+
|         FullNode         |                             |         FullNode         |
+--------------------------+                             +--------------------------+
                                                                                     

the old st (extension) will be converted to branch node if the break is the first byte; or another extension node if the break is non-first byte;

  • for case a; the frame of the old st was already tracked;
  • for case b; the frame of the newly created branch node will be tracked at line 303

no reason to track the additional frame here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

frame of the old st gets popped regardless of the case at L274 and we unconditionally reuse st, so we need a new frame for it.

the frame pushed on L303 is for the newly created branch node at L301 as the child of st. So they need to have separate frames.

@rjl493456442
Copy link
Member

Generally I don't think it's necessary to track the extension node.

Given that the child of the extension node MUST be a full node (branch node). It makes sense to track the branch node for its children, but the child of extension is only 1. Let's say now there is an extension node with frame offset 100; then the child of it (branch node) will have offset 101.

@@ -223,8 +270,14 @@ func (t *StackTrie) insert(st *stNode, key, value []byte, path []byte) {
// Break on the non-last byte, insert an intermediate
// extension. The path prefix of the newly-inserted
// extension should also contain the different byte.
n = newExt(st.key[diffidx+1:], st.children[0])
t.hash(n, append(path, st.key[:diffidx+1]...))
e := makeExt(t.tmpNode.reset(), st.key[diffidx+1:], st.children[0]) // build a temporary extension node to hash
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't track the frame for extension node, then you can allocate a node from the arena, set the type as extension and link the child, call t.hash(newNode) then. You can get rid of this tmpNode trick.

Copy link
Contributor Author

@omerfirmak omerfirmak Jul 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if that is strictly true. If t.hash(newNode) ends up hashing a child fullnode, it will pop a frame and deallocate newNode, no? Regardless of extnode tracking.

@omerfirmak
Copy link
Contributor Author

Generally I don't think it's necessary to track the extension node.

Given that the child of the extension node MUST be a full node (branch node). It makes sense to track the branch node for its children, but the child of extension is only 1. Let's say now there is an extension node with frame offset 100; then the child of it (branch node) will have offset 101.

I agree that it is not necessary, but I would rather not change the logic here since we will probably have to re-test the entire thing. It wouldn't significantly simplify the code either.

I will leave it up to you, either let me know how you want to proceed so I can make the changes needed or feel free to make changes directly.

@omerfirmak omerfirmak closed this Aug 21, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants