Describe the bug
After upgrading the Fabric8 Kubernetes client to version 7.4.0, we have intermittently been seeing an informer sync issue.
Our liveness probe checks the health of our informers. If an informer is unhealthy, the probe fails and the pod restarts. Since the upgrade to 7.4.0, our service pod has restarted several times; upon investigation, we found that some informers were being reported as unhealthy.
We call isRunning(), isWatching(), and hasSynced() on each informer and consider it healthy only if all three return true.
After downgrading the Kubernetes client to version 7.3.1, the pod no longer restarts and all informers report healthy.
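The health predicate described above can be sketched as follows. This is a minimal, self-contained sketch: InformerStatus is a hypothetical stand-in interface mirroring the three status methods we poll on Fabric8's SharedIndexInformer (isRunning, isWatching, hasSynced), so the example runs without a cluster or the client dependency.

```java
// Minimal sketch of the informer health predicate used by our liveness probe.
// InformerStatus is a stand-in for the three SharedIndexInformer status
// methods, so this compiles and runs without a Kubernetes cluster.
public class InformerHealthCheck {

    interface InformerStatus {
        boolean isRunning();
        boolean isWatching();
        boolean hasSynced();
    }

    // Healthy only when the informer is running, watching, and has synced.
    static boolean isHealthy(InformerStatus s) {
        return s.isRunning() && s.isWatching() && s.hasSynced();
    }

    public static void main(String[] args) {
        // Simulate the DOWN state from the logs below:
        // running and synced, but no longer watching.
        InformerStatus degraded = new InformerStatus() {
            public boolean isRunning()  { return true; }
            public boolean isWatching() { return false; }
            public boolean hasSynced()  { return true; }
        };
        System.out.println(isHealthy(degraded) ? "UP" : "DOWN");
    }
}
```

In production the same predicate is evaluated against the real SharedIndexInformer instances; with 7.4.0, isWatching() intermittently flips to false while the other two checks stay true.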
I have attached a reproducer here. It can be deployed to any Kubernetes cluster: deploy it and let it run for three to four hours, and you will observe the pod restart several times due to liveness failures caused by the informer issue.
cc: @manusa
Fabric8 Kubernetes Client version
7.4.0
Steps to reproduce
- Upgrade the Kubernetes Client to version 7.4.0
- Use a SharedIndexInformer as part of the application's health check
- Configure the Kubernetes liveness probe to check the SharedIndexInformer's health
- Deploy the application in a Kubernetes cluster and let it run for three to four hours
Expected behavior
The SharedIndexInformer should remain healthy (running, watching, and synced) with Kubernetes client version 7.4.0, as it does with 7.3.1.
Runtime
other (please specify in additional context)
Kubernetes API Server version
1.32
Environment
Linux, other (please specify in additional context)
Fabric8 Kubernetes Client Logs
HealthResponseImpl[status=UP, details={PodInformer hasSynced=true, PodInformer isRunning=true, PodInformer isWatching=true}]
HealthResponseImpl[status=DOWN, details={PodInformer hasSynced=true, PodInformer isRunning=true, PodInformer isWatching=false}]
Sep 23, 2025 11:08:00 AM io.helidon.Main shutdown
INFO: Shutdown requested by JVM shutting down
Sep 23, 2025 11:08:00 AM io.helidon.webserver.ServerListener listen
INFO: [0x53e218d9] @default socket closed.
Sep 23, 2025 11:08:00 AM io.helidon.Main shutdown
INFO: Shutdown finished
Additional context
livenessProbe:
  failureThreshold: 1
  httpGet:
    path: /health
    port: 8080
    scheme: HTTP
  initialDelaySeconds: 30
  periodSeconds: 15
  successThreshold: 1
  timeoutSeconds: 2
kubectl version output:
Client Version: v1.32.1
Kustomize Version: v5.5.0
Server Version: v1.32.1
We are using Oracle Kubernetes Engine (OKE) clusters to deploy our applications.