-
Notifications
You must be signed in to change notification settings - Fork 856
Description
Cadence has been using consistent hashing based host-shard management. However this logic comes up with significant drawbacks which come to surface at high scale. Issues include:
- Shard Load Skew
- Shard Data Payload Skew
- Heterogeneous Host SKUs
- Shared Hosts
- Individual Host Problems
- Shutdowns, Restarts, Host Turn-up
and the list goes on.
With consistent hashing, shards are pinned to hosts. When a host is having a problem due to the factors listed above, adding a new host doesn't guarantee taking load from the problematic host. Similarly, removing a host might just shift the issues to the next host. Even though Cadence uses and recommends using many hosts in a cluster (~16K), these issues would still happen.
A host, as long as it reports itself as healthy to the membership protocol, could get traffic even if it was shutting down or just starting up.
With a high frequency, high availability environments, this logic poses an availability risk. Cadence will implement a load aware and smarter Shard Manager that can move shards among hosts to uniformly distribute the traffic and gracefully handle host maintenance events.
Metadata
Metadata
Assignees
Labels
Type
Projects
Status