-
Notifications
You must be signed in to change notification settings - Fork 869
feat(shard-distributor): Add ping verification to ephemeral shard creator #7496
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(shard-distributor): Add ping verification to ephemeral shard creator #7496
Conversation
a043f8b to
f9d6569
Compare
0046451 to
9353d41
Compare
After creating a shard, the ephemeral shard creator now pings the owner to verify that: 1. The shard was created successfully 2. The owner can be reached via gRPC 3. The owner actually owns the shard This provides end-to-end validation of the shard creation and routing mechanisms in the canary test environment. Changes: - Switch from using ShardDistributor client to using Spectators - Add pingShardOwner() method that sends canary ping after creation - Verify executor ID and ownership in ping response - Log warnings if executor ID mismatches or ownership is incorrect - Add test coverage for ping verification flow - Use refactored *spectatorclient.Spectators type Signed-off-by: Jakob Haahr Taankvist <jht@uber.com>
Integrate all canary components into the module: - Create SpectatorPeerChooser and manage its lifecycle - Provide canary client using the dispatcher - Create and start the pinger component - Register ping handler as a YARPC server - Wire spectators to peer chooser on startup This connects all the pieces needed for executor-to-executor canary testing via ping/pong requests. Dependencies: - Requires SpectatorPeerChooser - Requires PingHandler - Requires Pinger component Signed-off-by: Jakob Haahr Taankvist <jht@uber.com>
Signed-off-by: Jakob Haahr Taankvist <jht@uber.com>
Signed-off-by: Jakob Haahr Taankvist <jht@uber.com>
Signed-off-by: Jakob Haahr Taankvist <jht@uber.com>
Signed-off-by: Jakob Haahr Taankvist <jht@uber.com>
3efbb33 to
150093f
Compare
|
|
||
| fx.Provide( | ||
| func(t *grpc.Transport) peer.Transport { return t }, | ||
| ), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have transport in fx.supply, do we need this? it seems redundant
| }), | ||
|
|
||
| fx.Provide( | ||
| func(t *grpc.Transport) peer.Transport { return t }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have transport in fx.supply, do we need this? it seems redundant
|
|
||
| for executorID, state := range namespaceState.Executors { | ||
| if now.Sub(state.LastHeartbeat) > p.cfg.HeartbeatTTL { | ||
| p.logger.Info("Executor has not reported a heartbeat recently", tag.ShardExecutor(executorID), tag.ShardNamespace(p.namespaceCfg.Name), tag.Value(state.LastHeartbeat)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice!
What changed?
The ephemeral shard creator now pings shard owners immediately after creation to verify end-to-end functionality.
This also wires up the spectator, and dispatcher and fixed namespace pinger.
Why?
This adds canary verification that validates:
Previously, the shard creator only verified that GetShardOwner returned successfully, but didn't verify that the executor was actually reachable or owned the shard.
How did you test it?
Unit tests
Potential risks
Low risk - this only affects the canary test environment and adds verification without changing core shard creation logic.
Documentation