Conversation

@CoreyKaylor (Owner)

Optimizations suggested by @sebastienros in #191
I've only compared against v0.20.0, not the previous comparer implementations. The current branch outperforms the v0.20.0 baseline by between 5% and 20%.

Separately, it feels like there might be a more optimal route for sorted Guid keys as well (which I assume is a common scenario). I'll do a little digging, but if anyone knows something I don't, feel free to share.

Memory

- Allocation-free comparers confirmed: a near-constant ~56 B overhead regardless of operation count (100 ops → 56 B; 10,000 ops → 56-59 B)

Read Performance - All Comparers (1000 ops, 64B values)

| Comparer | Time | vs Native |
|---|---|---|
| LengthOnly | 24.9 μs | -75% |
| ReverseSignedInt | 99.6 μs | -1% |
| Default (Native) | 100.8 μs | baseline |
| SignedInt | 103.0 μs | +2% |
| ReverseUnsignedInt | 105.8 μs | +5% |
| UnsignedInt | 107.3 μs | +6% |
| ReverseBitwise | 110.8 μs | +10% |
| Bitwise | 113.2 μs | +12% |
| Utf8String | 114.1 μs | +13% |
| ReverseLength | 118.4 μs | +17% |
| Length | 119.4 μs | +18% |
| ReverseUtf8String | 163.9 μs | +63% |
| HashCode | 172.1 μs | +71% |

Write Performance - All Comparers (1000 ops, 64B values)

| Comparer | Time | vs Native |
|---|---|---|
| LengthOnly | 82.3 μs | -65% |
| UnsignedInt | 200.4 μs | -15% |
| SignedInt | 202.5 μs | -14% |
| ReverseLength | 219.0 μs | -7% |
| ReverseSignedInt | 221.7 μs | -6% |
| ReverseUnsignedInt | 224.8 μs | -5% |
| ReverseBitwise | 228.6 μs | -3% |
| Default (Native) | 236.5 μs | baseline |
| Length | 237.6 μs | +0.5% |
| Utf8String | 237.8 μs | +0.5% |
| Bitwise | 294.5 μs | +25% |
| ReverseUtf8String | 301.4 μs | +27% |
| HashCode | 314.7 μs | +33% |

Integer Keys (10000 ops, 4-byte keys)

| Comparer | Time | vs Native |
|---|---|---|
| SignedInt | 307 μs | -81% |
| UnsignedInt | 312 μs | -81% |
| ReverseSignedInt | 1,012 μs | -38% |
| ReverseUnsignedInt | 1,007 μs | -38% |
| Default (Native) | 1,633 μs | baseline |

Notes

- LengthOnly is fastest but compares only by length (no content comparison)
- SignedInt/UnsignedInt provide major gains for integer keys
- HashCode is consistently the slowest across all scenarios
- Most custom comparers perform within ±15% of native for general byte data

The review comments below are anchored on this code:

```csharp
/// </remarks>
#if NET5_0_OR_GREATER
[SkipLocalsInit]
#endif
```
Contributor

Is MDBValue read-only?

Owner (Author)

Yes

Contributor

This could be something to leverage: marking it `readonly`, and adding the `in` keyword to some methods so that the value is passed by reference automatically.

Here is the ChatGPT explanation:

Short answer:
They reduce copies, prevent accidental mutation, and enable better compiler optimizations—especially for larger structs.

Details:

1) `readonly struct`

Marking a struct as `readonly` guarantees it's immutable after construction.

Advantages

- **No defensive copies:** the compiler knows instance methods won't mutate fields, so it doesn't create hidden copies when the struct is accessed through `in`, readonly fields, or properties.
- **Clear intent and safety:** prevents accidental field mutation and enforces immutability at compile time.
- **Better optimizations:** the JIT can make stronger assumptions, sometimes improving inlining and register usage.
- **Thread-safety by design:** immutable value types are naturally safer to share.

Cost

- You must ensure all instance fields are readonly and methods don't mutate state.

2) `in` parameters

`in` passes a struct by readonly reference instead of by value.

Advantages

- **Avoids copying large structs:** especially useful when structs exceed ~16 bytes or are passed frequently.
- **Expresses intent:** signals "this method will not modify the argument."
- **Interoperates with readonly structs:** no defensive copies when calling methods on a `readonly struct`.

Cost

- **Indirection:** very small structs can be slower due to pointer dereferencing.
- **Readonly rules:** attempting mutation causes compile errors, or hidden copies if the struct isn't readonly.

3) Using both together (best case)

This is where the real benefit shows up.

- `readonly struct` + `in` parameters ⇒ zero copies, no defensive cloning, maximum safety.
- Methods called on the struct don't trigger hidden temporaries.
- Ideal for math types, vectors, coordinates, timestamps, and domain value objects.

Practical guidance

- Use `readonly struct` when:
  - The type is logically immutable
  - It's used frequently or passed around a lot
- Use `in` when:
  - The struct is medium-to-large
  - The method is hot-path or allocation/copy sensitive
- Don't bother for tiny structs (e.g., two ints).

Bottom line:
You get immutability guarantees, fewer copies, and better performance—when used selectively and intentionally.
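The `readonly struct` + `in` combination described above can be sketched as follows. `Point3` and `Dot` are made-up illustration names, not part of this library:

```csharp
using System;

var p = new Point3(1, 2, 3);
var q = new Point3(4, 5, 6);

// Because Point3 is a readonly struct, calling members through an `in`
// reference causes no defensive copy: the compiler knows nothing can mutate.
Console.WriteLine(p.LengthSquared()); // 14
Console.WriteLine(Dot(in p, in q));   // 32

// `in` passes by readonly reference: no copy of the struct is made.
static double Dot(in Point3 a, in Point3 b) =>
    a.X * b.X + a.Y * b.Y + a.Z * b.Z;

// Hypothetical value type for illustration only.
readonly struct Point3
{
    public readonly double X, Y, Z;
    public Point3(double x, double y, double z) { X = x; Y = y; Z = z; }
    public double LengthSquared() => X * X + Y * Y + Z * Z;
}
```

For a 24-byte struct like this, `in` avoids three 8-byte field copies per call, which is exactly the "medium-to-large struct" case the guidance above targets.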

@CoreyKaylor (Owner, Author) commented Dec 16, 2025

I overlooked an aspect of the question.

  1. P/Invoke signatures must use `ref` - functions like `mdb_cursor_get` write back values (native code sets the size/data pointers), so `in` can't be used.
  2. The `CompareFunction` delegate must use `ref` - this is called *from* native code, and the marshalling requires `ref`, not `in`:
     `delegate int CompareFunction(ref MDBValue left, ref MDBValue right);`
  3. `IComparer<T>` passes by value - the standard interface signature is `Compare(T x, T y)`, not `Compare(in T x, in T y)`.
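Point 2 can be illustrated without touching native code. `FakeMDBValue` and `FakeCompareFunction` below are hypothetical stand-ins that mirror the ref-based delegate shape (the real `CompareFunction` is marshalled to LMDB's comparator callback):

```csharp
using System;

// The ref-based delegate shape native code calls back into. A lambda with
// explicit `ref` parameters satisfies it; `in` parameters would not match.
FakeCompareFunction bySize = (ref FakeMDBValue l, ref FakeMDBValue r) =>
    l.Size < r.Size ? -1 : l.Size > r.Size ? 1 : 0;

var a = new FakeMDBValue { Size = 4 };
var b = new FakeMDBValue { Size = 8 };
Console.WriteLine(bySize(ref a, ref b)); // -1

delegate int FakeCompareFunction(ref FakeMDBValue left, ref FakeMDBValue right);

// Hypothetical stand-in with the same size + data-pointer layout idea.
struct FakeMDBValue
{
    public nuint Size;
    public IntPtr Data;
}
```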

Owner (Author)

That said, there are probably several interop methods that could benefit from changing to `in` instead of `ref`.

Contributor

`ref` is fine; `in` just behaves like a `ref` without the caller needing to use the `ref` keyword.
It's fine if not all methods use `in` or `ref`, like `Compare`. It's not a requirement; it's just that `readonly` allows the usage of `in`. We can still use `ref` when it's required.

> mdb_cursor_get write back values...

I haven't checked the code, but maybe a custom mutable and reusable struct (or class) could be passed to these, and then the library creates immutable ones.

Just brainstorming in case we can find patterns.
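One possible shape for that brainstorm, with entirely hypothetical names: a mutable staging struct that the native call could fill in, converted to an immutable view afterwards.

```csharp
using System;

// Sketch only: NativeValueBuffer / ValueView are not LmdbSharp's actual API.
var staging = new NativeValueBuffer();
staging.Size = 8;                  // pretend native code wrote these fields
staging.Data = (IntPtr)0x1234;

var view = staging.ToView();       // library hands out only the immutable view
Console.WriteLine(view.Size);      // 8

// Mutable on purpose: safe to pass by ref so native code can write into it.
struct NativeValueBuffer
{
    public nuint Size;
    public IntPtr Data;
    public readonly ValueView ToView() => new ValueView(Size, Data);
}

// Immutable readonly struct: what callers actually receive.
readonly struct ValueView
{
    public readonly nuint Size;
    public readonly IntPtr Data;
    public ValueView(nuint size, IntPtr data) { Size = size; Data = data; }
}
```

The idea is that mutation is confined to initialization, so everything downstream can still use `readonly` + `in` freely.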

Contributor

Even with readonly structs we can pass them to be updated, as long as we know when that happens, ideally only during initialization:

```csharp
// Requires System.Runtime.CompilerServices (for Unsafe).
private (MDBResultCode resultCode, MDBValue key, MDBValue value) Get(CursorOperation operation)
{
    MDBValue mdbKey = default;
    MDBValue mdbValue = default;

    unsafe
    {
        // Unsafe.AsRef(in ...) reinterprets the readonly reference as a
        // writable one, letting native code fill in the struct fields.
        var result = mdb_cursor_get(
            _handle,
            ref Unsafe.AsRef<MDBValue>(in mdbKey),
            ref Unsafe.AsRef<MDBValue>(in mdbValue),
            operation);
        return (result, mdbKey, mdbValue);
    }
}
```

Contributor

And this PR is fine; if you want to make changes, maybe isolate them in separate PRs. No need to dead-lock PRs.

Owner (Author)

I was less hung up on the PR comments, though I don't mind queueing up your feedback before pulling it in. I was more delayed by a company party and corresponding activities last week.

@CoreyKaylor (Owner, Author)

Benchmarks compared to v0.20.0 are quite a bit slower. Trying to narrow in on specifically what is causing the slowdown.

ReadBenchmarks

| OpsPerTx | ValueSize | v0.20.0 | optimizations | Change |
|---|---|---|---|---|
| 1 | 8 | 78.18 ns | 169.0 ns | +116% slower |
| 1 | 64 | 76.81 ns | 151.2 ns | +97% slower |
| 1 | 256 | 76.48 ns | 146.1 ns | +91% slower |
| 100 | 8 | 5,952 ns | 11,058 ns | +86% slower |
| 100 | 64 | 5,865 ns | 9,270 ns | +58% slower |
| 100 | 256 | 6,423 ns | 12,128 ns | +89% slower |
| 1000 | 8 | 86,140 ns | 134,786 ns | +56% slower |
| 1000 | 64 | 96,559 ns | 157,259 ns | +63% slower |
| 1000 | 256 | 99,618 ns | 179,171 ns | +80% slower |

WriteBenchmarks

| OpsPerTx | ValueSize | v0.20.0 | optimizations | Change |
|---|---|---|---|---|
| 1 | 8 | 43.75 μs | 64.21 μs | +47% slower |
| 1 | 64 | 44.45 μs | 61.27 μs | +38% slower |
| 1 | 256 | 44.07 μs | 52.43 μs | +19% slower |
| 100 | 8 | 53.98 μs | 68.49 μs | +27% slower |
| 100 | 64 | 51.70 μs | 65.30 μs | +26% slower |
| 100 | 256 | 64.77 μs | 82.80 μs | +28% slower |
| 1000 | 8 | 161.62 μs | 198.13 μs | +23% slower |
| 1000 | 64 | 198.75 μs | 339.59 μs | +71% slower |
| 1000 | 256 | 272.54 μs | 405.36 μs | +49% slower |

@CoreyKaylor (Owner, Author)

Summary of Tests

| Test | Read (1 op) | Write (1 op) | Status |
|---|---|---|---|
| v0.20.0 baseline | 78 ns | 44 μs | ✓ Fast |
| MDBValue only (readonly struct + helpers, with ref) | 77 ns | 45 μs | ✓ Fast |
| ref→in only (struct MDBValue, with in) | 102 ns | 54 μs | ⚠️ Partial regression |
| Full optimizations (readonly + in) | 169 ns | 64 μs | ✗ Full regression |

Conclusion

  1. MDBValue changes are fine - `readonly struct`, `[SkipLocalsInit]`, and the new helper methods cause no performance impact when used with `ref` P/Invoke
  2. `ref`→`in` is the culprit - this P/Invoke change alone causes a ~30% regression
  3. The combination is worse - `ref`→`in` with `readonly struct` together cause a ~100% regression (double the impact)

@CoreyKaylor merged commit 5805814 into main Dec 23, 2025 (3 checks passed)
@CoreyKaylor deleted the optimizations branch December 23, 2025 14:39
@AlgorithmsAreCool (Collaborator) commented Dec 23, 2025

> **Summary of Tests**
>
> | Test | Read (1 op) | Write (1 op) | Status |
> |---|---|---|---|
> | v0.20.0 baseline | 78 ns | 44 μs | ✓ Fast |
> | MDBValue only (readonly struct + helpers, with ref) | 77 ns | 45 μs | ✓ Fast |
> | ref→in only (struct MDBValue, with in) | 102 ns | 54 μs | ⚠️ Partial regression |
> | Full optimizations (readonly + in) | 169 ns | 64 μs | ✗ Full regression |
>
> **Conclusion**
>
> 1. MDBValue changes are fine - `readonly struct`, `[SkipLocalsInit]`, and the new helper methods cause no performance impact when used with `ref` P/Invoke
> 2. `ref`→`in` is the culprit - this P/Invoke change alone causes a ~30% regression
> 3. The combination is worse - `ref`→`in` with `readonly struct` together cause a ~100% regression (double the impact)

Interesting, there must be some hidden defensive copying going on in there. I wonder if it is in the LibraryImport-generated source.

Curiously, the IL and ASM outputs are the same for each of the three argument styles, so I'm not sure where the regression is coming from:

https://sharplab.io/#v2:C4LghgzgtgPgAgJgIwFgBQcAMACOSB0ASgK4B2wAllAKb4DCA9lAA4UA21ATgMpcBuFAMbUIAbnRZcBEuSq0AkuS4NmvTgOFiJAZlwJsdbAG902M7l1kIYAGbVcAFmyFqNgIKcA5gApOr7ACyACIAQgBqYGzE9jYMDACUxqbmKTYUAB7UACbeweGR0QBU2MzAnNgAvNgAZLEJySlmJmiNrYEAnnAArN6lnPHiLW0Avg3Yo0NmY3CWpNZ2jtiKHj4UpIGhEVExcYnNbdhpmTl5W0UlZZU1dfFjKfsH5gGdPX0Dd+YTKV/m07Pz9jgThcNhcYCyAHlSGx2itfP4/OCGND2ht8ttDrskpNGkdsrlNgVqMU+ldarsPk1KY1nt1emV3jjvmMflMcQBtIJsNjyFgMTjAbwAIhsYAA1tQ2BQAEZC+IAXTGzE4FD4YGAgKQADZsFZbPZqOkNZx1kCOnTTkTijdBhN0OzuGKKMwADIMQSRCCKCjARUYXSIrLImG6ub67AQMrEQTANFnajofZrY2kSLYQPB1GkZMRigAL2og2wya4qbY6eoSJR2Gl7Q1xSy6rAtqAA
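For reference, the best-known defensive-copy trap is easy to reproduce in isolation: calling a non-readonly member through an `in` parameter operates on a hidden copy. `Counter` here is a made-up type, and this may or may not be the actual mechanism behind the P/Invoke regression:

```csharp
using System;

var c = new Counter();

BumpByIn(in c);
Console.WriteLine(c.Value);  // 0 - Increment() ran on a defensive copy

BumpByRef(ref c);
Console.WriteLine(c.Value);  // 1 - Increment() ran on the original

// `in` parameter + non-readonly member call = hidden defensive copy,
// because the compiler must preserve the readonly guarantee.
static void BumpByIn(in Counter counter) => counter.Increment();
static void BumpByRef(ref Counter counter) => counter.Increment();

struct Counter
{
    public int Value;
    public void Increment() => Value++;
}
```

Marking `Counter` as a `readonly struct` (with a non-mutating method) would eliminate the copy, which matches the earlier guidance that `in` only pays off together with `readonly struct`.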

@CoreyKaylor (Owner, Author)

I'll make another pass at it, removing LibraryImport in favor of DllImport, and see if the performance is the same. While I've had LibraryImport in place since it was introduced, I'm not sure it's bought much in terms of the implementation. If removing it speeds things up with `in` included, it's probably worth dropping.

@AlgorithmsAreCool (Collaborator)

The LibraryImport itself isn't doing much besides pinning the args and passing them on to the inner DllImport. And I'm pretty sure that pinning a value type is a no-op, so I don't think there's any harm in it.

