-
Notifications
You must be signed in to change notification settings - Fork 34
Optimize #eql?
#58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Optimize #eql?
#58
Conversation
This PR is inspired by tom-pang#56, and assumes that code will be merged, so uses it in the benchmarks here: https://gist.github.com/ms-ati/fa8002ef8a0ce00716e9aa6510d3d4d9 It is common in our code, as in any idiomatic code using value objects in loops or pipelines, to call `#with` many times, returning a new immutable object each time with 1 or more fields replaced with new values. The optimizations in this PR eliminate a number of extra Hash and Array instantiations that were occurring each time, in favor of iterating only over the constant `VALUE_ATTRS` array and doing key lookups in the given Hash parameter in the hot paths. Per the gist above, this increases ips (iterations per second) 2.29x, from 335.9 to 769.6 on my machine.
Speed improvements: | When you have... | Delta IPS | | -------------------- | --------- | | Same instance | 15.6x | | Different last field | 10.2x | | Equal fields | 1.4x | Benchmarking gist: https://gist.github.com/ms-ati/3e62cca57ca32cd17ad2e2e2b14cc65e Explanation of changes: It came as a surprise to see the humble `#eql?` comparison in profiling -- we had not realized that in Ruby, if one wants to short-circuit comparison of a single instance of an object to itself, for example, that this logic must be explicitly added to `#eql?`. This accounts for *15.6x* difference in iterations-per-second of same-instance comparison. Furthermore, the cost of pre-calculating `hash` is already being paid at instantation of every value object. So we add an early exit if the hash does not match. This accounts for *10.2x* difference in iterations-per-second of comparisons of instances that differ in only the last field. Finally, the cost of instantiating two temporary Array instances is significant when comparing two distinct instances which have equal field values. So we iterate the field comparisons without the temporary Arrays. This accounts for the *1.4x* difference in iterations-per-second of comparisons of distinct instances with equal field values.
Codecov Report
@@ Coverage Diff @@
## master #58 +/- ##
==========================================
- Coverage 96.42% 96.07% -0.36%
==========================================
Files 1 1
Lines 84 102 +18
==========================================
+ Hits 81 98 +17
- Misses 3 4 +1
Continue to review full report at Codecov.
|
|
One potential downside is that this code enforces that fields which have non-equal What is the Ruby language policy on unequal hash implies not eql? I believe this is the case in Java but not sure here. |
|
Ok, interesting. One data point on "can unequal A very common example of instances of different classes attempting full interchangeability with one another is ActiveSupport's TimeWithZone, which goes to some lengths to be interchangeable at the instance level with Ruby's Time. And in fact, it does ensure that both tz = Time.zone.now
t = tz.to_time
[t.class == tz.class, t == tz, t.hash == tz.hash]
# [false, true, true]
A = Value.new(time)
A.new(tz) == A.new(t)
# trueSo far so good! I am looking for authoritative documentation that Ruby classes are expected to honor the contract that |
|
Ok, as I had hoped, this is part of the Ruby language specification for Object#hash:
|
|
So... even more interesting. It turns out that, because the existing Values gem pre-computes # existing gem, not this branch
A = Value.new(:arr)
x = A.new([1])
y = A.new([1])
[x == y, x.hash == y.hash]
#=> [true, true]
# EDGE CASE: mutate value of attribute of y
y.arr << 2
y
#=> #<A arr=[1, 2]>
[x == y, x.hash == y.hash]
#=> [false, true] |
NOTE: Currently this commit is based of of PR #57, but can be separated if necessary.
Speed improvements:
Benchmarking gist:
https://gist.github.com/ms-ati/3e62cca57ca32cd17ad2e2e2b14cc65e
Explanation of changes:
It came as a surprise to see the humble
#eql?comparison in profiling --we had not realized that in Ruby, if one wants to short-circuit comparison
of a single instance of an object to itself, for example, that this logic
must be explicitly added to
#eql?. This accounts for 15.6x differencein iterations-per-second of same-instance comparison.
Furthermore, the cost of pre-calculating
hashis already being paid atinstantiation of every value object. So we add an early exit if the hash
does not match. This accounts for 10.2x difference in iterations-per-second
of comparisons of instances that differ in only the last field.
Finally, the cost of instantiating two temporary Array instances is
significant when comparing two distinct instances which have equal field
values. So we iterate the field comparisons without the temporary
Arrays. This accounts for the 1.4x difference in iterations-per-second
of comparisons of distinct instances with equal field values.