retain valid certs on fetch failures #1567
/ok-to-test
- fetches now record failed attempts as well.
- validate that valid certificates are retained across fetch attempts despite CA failures
@howardjohn can you please take a look?
},
// we don't have a valid existing certificate
None => {
tracing::debug!(%id, "certificate fetch failed ({err}) and no valid existing certificate, retrying in {retry_delay:?}");
I think we could make this a warn rather than a debug. Do we continue to retry indefinitely?
tracing::debug!(%id, "certificate fetch failed ({err}), retrying in {retry:?}");
let refresh_at = Instant::now() + retry;
(CertState::Unavailable(err), refresh_at)
tracing::debug!(%id, "certificate fetch failed ({err}), retrying in {retry_delay:?}");
Could we move this log within `Some((valid_cert, cert_expiry_instant))`? And clarify that we are using the existing valid certificate.
@keithmattix or @Stevenjin8 could you PTAL?
@@ -362,6 +362,10 @@ impl Worker {
// Note that we are using a backoff-per-unique-identity-request. This is to prevent issues
// when a cert cannot be fetched for Pod A, but that should not stall retries for
// pods B, C, and D.

// Check if we should retain the existing valid certificate
can we move this above the big comment block?
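For readers following the hunk above, the "backoff-per-unique-identity-request" idea in the code comment could be sketched roughly as follows. This is only an illustration under assumptions: the `Backoffs` type, its fields, and the doubling policy are hypothetical and are not taken from ztunnel's `Worker`. The point it shows is that each identity keeps its own failure count, so repeated failures for one pod do not lengthen the retry delay of another.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Hypothetical per-identity exponential backoff tracker (not ztunnel's actual type).
struct Backoffs {
    base: Duration,
    max: Duration,
    /// Consecutive failure count, keyed by identity.
    failures: HashMap<String, u32>,
}

impl Backoffs {
    fn new(base: Duration, max: Duration) -> Self {
        Backoffs { base, max, failures: HashMap::new() }
    }

    /// Record a failure for `id` and return the delay before its next retry.
    /// The delay doubles per consecutive failure and is capped at `max`.
    fn next_delay(&mut self, id: &str) -> Duration {
        let n = self.failures.entry(id.to_string()).or_insert(0);
        let delay = self
            .base
            .saturating_mul(2u32.saturating_pow(*n))
            .min(self.max);
        *n = n.saturating_add(1);
        delay
    }

    /// Forget the failure history for `id` after a successful fetch.
    fn reset(&mut self, id: &str) {
        self.failures.remove(id);
    }
}
```

With this shape, a failing "pod-a" only advances its own counter; a first failure for "pod-b" still starts from the base delay.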
Retain existing valid certificates when new fetch attempts fail, improving service reliability during CA outages. Implements backoff scheduling that respects certificate expiry times.
This PR addresses Istio issue #56452.
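As a rough illustration of the behavior described above, the decision taken after a failed fetch might look like the sketch below. It is not the PR's actual code: the `CertState` variants and the `on_fetch_failure` function are simplified, hypothetical stand-ins, and capping the retry time at the certificate's expiry is one assumed reading of "respects certificate expiry times".

```rust
use std::time::{Duration, Instant};

/// Hypothetical, simplified certificate state; ztunnel's real enum differs.
#[derive(Debug, PartialEq)]
enum CertState {
    /// A usable certificate and the instant at which it expires.
    Available { expires_at: Instant },
    /// No usable certificate; holds the last fetch error.
    Unavailable(String),
}

/// Decide the next state and refresh time after a failed fetch.
/// `existing` is the expiry of the currently held certificate, if any.
fn on_fetch_failure(
    existing: Option<Instant>,
    err: String,
    retry_delay: Duration,
    now: Instant,
) -> (CertState, Instant) {
    match existing {
        // The held certificate is still valid: keep serving it and retry
        // at the backoff time or at expiry, whichever comes first (assumed policy).
        Some(expires_at) if expires_at > now => {
            let refresh_at = (now + retry_delay).min(expires_at);
            (CertState::Available { expires_at }, refresh_at)
        }
        // No certificate, or it has already expired: record the error and retry.
        _ => (CertState::Unavailable(err), now + retry_delay),
    }
}
```

Under these assumptions, a CA outage shorter than the remaining certificate lifetime never moves the identity into the unavailable state.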