-
-
Notifications
You must be signed in to change notification settings - Fork 4.2k
fix: Issue with etcd key disappearing and unable to auto-re-register #4960
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
这种极端情况确实存在,需要重新注册服务 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
希望pr能够补充相应单测
core/discov/publisher.go
Outdated
@@ -125,6 +125,8 @@ func (p *Publisher) keepAliveAsync(cli internal.EtcdClient) error { | |||
} | |||
|
|||
threading.GoSafe(func() { | |||
ticker := time.NewTicker(5 * time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里可以调整到30s,兜底check不需要这么频繁
core/discov/publisher.go
Outdated
@@ -135,6 +137,16 @@ func (p *Publisher) keepAliveAsync(cli internal.EtcdClient) error { | |||
} | |||
return | |||
} | |||
case <-ticker.C: | |||
resp, err := cli.Get(cli.Ctx(), p.fullKey) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这部分可以复用 doKeepAlive 的逻辑,感觉可以写成:
resp, err := cli.Get(cli.Ctx(), p.fullKey)
if err != nil {
logc.Errorf(cli.Ctx(), "etcd publisher check register key error: %s", err.Error())
continue
}
if len(resp.Kvs) == 0 {
logc.Errorf(cli.Ctx(), "etcd publisher key %s not found, will re-register", p.fullKey)
if err := p.doKeepAlive(); err != nil {
logc.Errorf(cli.Ctx(), "etcd publisher KeepAlive: %s", err.Error())
}
return
}
这两天太忙了,周六或者周日我再完善下。 |
优化了下,从定时器检查改为watch这个key,如果收到delete事件就会重新注册。 |
f979851
to
04fe8dd
Compare
Thanks for your contribution! Will be merged in v1.9.0, which will be released in August. |
1411476
to
7dc1613
Compare
7dc1613
to
e833031
Compare
e833031
to
585d307
Compare
Problem
There is an issue in production where when go-zero uses etcd, under certain conditions, the registered key disappears, causing frontend services to be unable to find backend services.
Reproduction conditions
Then I simulated it locally by deleting the registered key in etcd. It won't regenerate, even if the service process still exists. So I used a simple strategy for re-registration: if getting this key doesn't exist, then put it. After using the new strategy, deleting this key will regenerate it.
(Since the time spent handling this was relatively short, only 30 minutes, the PR may not be very complete. Welcome to discuss with the author)