Skip to content

Conversation

@ms-ati
Copy link
Contributor

@ms-ati ms-ati commented Apr 7, 2017

NOTE: Currently this commit is based of of PR #57, but can be separated if necessary.

Speed improvements:

When you have... Delta IPS
Same instance 15.6x
Different last field 10.2x
Equal fields 1.4x

Benchmarking gist:
https://gist.github.com/ms-ati/3e62cca57ca32cd17ad2e2e2b14cc65e

Explanation of changes:

It came as a surprise to see the humble #eql? comparison in profiling --
we had not realized that in Ruby, if one wants to short-circuit comparison
of a single instance of an object to itself, for example, that this logic
must be explicitly added to #eql?. This accounts for 15.6x difference
in iterations-per-second of same-instance comparison.

Furthermore, the cost of pre-calculating hash is already being paid at
instantiation of every value object. So we add an early exit if the hash
does not match. This accounts for 10.2x difference in iterations-per-second
of comparisons of instances that differ in only the last field.

Finally, the cost of instantiating two temporary Array instances is
significant when comparing two distinct instances which have equal field
values. So we iterate the field comparisons without the temporary
Arrays. This accounts for the 1.4x difference in iterations-per-second
of comparisons of distinct instances with equal field values.

ms-ati added 2 commits March 23, 2017 11:38
This PR is inspired by tom-pang#56, and
assumes that code will be merged, so uses it in the benchmarks here:
https://gist.github.com/ms-ati/fa8002ef8a0ce00716e9aa6510d3d4d9

It is common in our code, as in any idiomatic code using value objects
in loops or pipelines, to call `#with` many times, returning a new immutable
object each time with 1 or more fields replaced with new values.

The optimizations in this PR eliminate a number of extra Hash and Array
instantiations that were occurring each time, in favor of iterating only over
the constant `VALUE_ATTRS` array and doing key lookups in the given
Hash parameter in the hot paths.

Per the gist above, this increases ips (iterations per second) 2.29x, from
335.9 to 769.6 on my machine.
Speed improvements:

| When you have...     | Delta IPS |
| -------------------- | --------- |
| Same instance        |     15.6x |
| Different last field |     10.2x |
| Equal fields         |      1.4x |

Benchmarking gist:
https://gist.github.com/ms-ati/3e62cca57ca32cd17ad2e2e2b14cc65e

Explanation of changes:

It came as a surprise to see the humble `#eql?` comparison in profiling --
we had not realized that in Ruby, if one wants to short-circuit comparison
of a single instance of an object to itself, for example, that this logic
must be explicitly added to `#eql?`. This accounts for *15.6x* difference
in iterations-per-second of same-instance comparison.

Furthermore, the cost of pre-calculating `hash` is already being paid at
instantation of every value object. So we add an early exit if the hash
does not match. This accounts for *10.2x* difference in iterations-per-second
of comparisons of instances that differ in only the last field.

Finally, the cost of instantiating two temporary Array instances is
significant when comparing two distinct instances which have equal field
values. So we iterate the field comparisons without the temporary
Arrays. This accounts for the *1.4x* difference in iterations-per-second
of comparisons of distinct instances with equal field values.
@codecov-io
Copy link

codecov-io commented Apr 7, 2017

Codecov Report

Merging #58 into master will decrease coverage by 0.35%.
The diff coverage is 96%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #58      +/-   ##
==========================================
- Coverage   96.42%   96.07%   -0.36%     
==========================================
  Files           1        1              
  Lines          84      102      +18     
==========================================
+ Hits           81       98      +17     
- Misses          3        4       +1
Impacted Files Coverage Δ
lib/values.rb 96.07% <96%> (-0.36%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 121e18d...a627364. Read the comment docs.

@ms-ati
Copy link
Contributor Author

ms-ati commented Apr 7, 2017

One potential downside is that this code enforces that fields which have non-equal #hash values will cause the Value objects containing them to be unequal.

What is the Ruby language policy on unequal hash implies not eql? I believe this is the case in Java but not sure here.

@ms-ati
Copy link
Contributor Author

ms-ati commented Apr 7, 2017

Ok, interesting. One data point on "can unequal hash imply not eql?"

A very common example of instances of different classes attempting full interchangeability with one another is ActiveSupport's TimeWithZone, which goes to some lengths to be interchangeable at the instance level with Ruby's Time. And in fact, it does ensure that both hash and eql? match, so enclosing Value objects do compare as equal:

tz = Time.zone.now
t = tz.to_time

[t.class == tz.class, t == tz, t.hash == tz.hash]
# [false, true, true]

A = Value.new(time)
A.new(tz) == A.new(t)
# true

So far so good! I am looking for authoritative documentation that Ruby classes are expected to honor the contract that eql? being true implies equal hash, and will post here if I can find it.

@ms-ati
Copy link
Contributor Author

ms-ati commented Apr 7, 2017

Ok, as I had hoped, this is part of the Ruby language specification for Object#hash:

Generates a Fixnum hash value for this object. This function must have the property that a.eql?(b) implies a.hash == b.hash.

@ms-ati
Copy link
Contributor Author

ms-ati commented Apr 8, 2017

So... even more interesting. It turns out that, because the existing Values gem pre-computes #hash but does not enforce the requirement that all attribute values are frozen, the existing gem allows inconsistent behavior in the edge case where an attribute value is mutated. For example:

# existing gem, not this branch
A = Value.new(:arr)

x = A.new([1])
y = A.new([1])

[x == y, x.hash == y.hash]
#=> [true, true]

# EDGE CASE: mutate value of attribute of y
y.arr << 2
y
#=> #<A arr=[1, 2]>

[x == y, x.hash == y.hash]
#=> [false, true]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants