Skip to content

Conversation

Yannic
Copy link

@Yannic Yannic commented Jul 22, 2025

This change adds a fast-path to Iterables.get() if the iterable is of typeImmutableCollection that allows access to the element in O(1) instead of O(n). This extends the existing optimization that allowed constant-time access for iterables of type List.

@Yannic
Copy link
Author

Yannic commented Jul 22, 2025

Hi Guava team,

I've read through the contribution guidelines, which suggest creating an issue first for adding significant new features. I think this feature is small enough and has an uncontroversial API, so I went ahead and created the PR right away given that I did not have to spend signifiant time on the implementation.

If you have things you'd like to discuss, I'm happy to do so either here or I can open an issue for this.

Thanks,
Yannic

@cpovirk
Copy link
Member

cpovirk commented Jul 22, 2025

Users can use this method to, e.g., retrieve a random element from ImmutableSet (e.g., to select a random node to connect to from a pool of nodes in a distributed system).

Would it work to use immutableSet().asList().get(i)? I think there are some unusual cases in which ImmutableCollection.asList() is expensive, but the implementation for "ordinary" ImmutableSet implementations are cheap.

@cpovirk
Copy link
Member

cpovirk commented Jul 23, 2025

Sorry, let me put a little more usefully:

  • Some immutable collections support a constant-time asList() view with a constant-time get(...) method. This includes collections that do not support internalArray() (e.g., RegularImmutableMap.keySet()).
  • I suspect that every collection that does support internalArray() also supports the fast asList()+get(...), but I haven't gone to verify that. (If not, that's something that is probably easy to fix once it's identified.)
  • Given that, the potential advantage of an internalArray()-based Collections2.getElement(Collection, int) over asList().get(...) would have to be either:
    • in the fast case, the marginal reduction in allocation from avoiding a lightweight asList() view (almost certainly too small to be worthwhile) or
    • in the worst case, in which asList() would have to allocate a full array and internalArray() is not available, the relative savings from "just" allocating over n elements instead of copying n+m elements to an array (potentially more significant when it comes up, but I'd hope it comes up rarely, and I'd hope there's an entirely separate option that's even better for those users)

@eamonnmcmanus
Copy link
Member

Rather than adding a new method, perhaps you could optimize the existing [Iterables.get](https://javadoc.io/static/com.google.guava/guava/33.4.8-jre/com/google/common/collect/Iterables.html#get(java.lang.Iterable,int) methods? As @cpovirk suggests, that could use asList() to avoid explicitly depending on the implementation details of the various immutable collections.

This change adds a new method `Collections2.getElement(Collection, int)`, which
returns the i-th element of the collection.

Unlike existing methods via Java `Stream` or `Iterable`, the method in this
change supports fast access for `ImmutableCollection`s backed by an internal
array in `O(1)` instead of `O(n)` with the existing approach.

Users can use this method to, e.g., retrieve a random element from
`ImmutableSet` (e.g., to select a random node to connect to from a pool of
nodes in a distributed system).
@Yannic Yannic force-pushed the yannic-collections-getelement branch from d5742ed to 94e2948 Compare July 24, 2025 15:21
@Yannic Yannic changed the title [collect] Add method to get i-th element of Collections [collect] Optimize Iterables.get() for ImmutableCollection Jul 24, 2025
@Yannic
Copy link
Author

Yannic commented Jul 24, 2025

Moved the code to Iterables.get().

I'm a bit on the fence for asList() vs. internalArray(): the extra allocation is small, so I agree that that shouldn't be an issue. I'll definitely keep an eye on it when rolling this into our codebase, but I suspect things will be just fine. The part I'm more worried about is when isList() does a copy which could be more expensive than finding the element via the iterator, especially when the collection is larger.

I think this needs some benchmarking before it should be merged. Would you mind already running CI on it please?

@netdpb netdpb requested a review from cpovirk July 28, 2025 16:32
@netdpb netdpb added package=collect type=performance Related to performance P3 no SLO java Pull requests that update Java code labels Jul 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
java Pull requests that update Java code P3 no SLO package=collect type=performance Related to performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants