Updated DRA APIs to v1 #116
/assign |
Please see the following for how to get the DRA driver to work with all of v1, v1alpha1, and v1alpha2: #115 (comment) |
9fddc5c to 111c26c
32dad5f to 1ef61dd
deployments/helm/dra-example-driver/templates/validatingwebhookconfiguration.yaml
1ef61dd to 8fa192c
I am using the example driver in some Kubernetes documentation and want to queue up a PR that references the tag of the image that will be released with this change (in kubernetes/website#51979). Will the new image released be
Also, for what it's worth, I ran through that tutorial with a local build of the example driver from this branch and it worked great, so I can confirm that it works with k/k 1.34 and the v1 APIs haha 😇 |
Yes, I'd like to publish a v0.2.0 image and chart soon after this merges.
Thanks for checking! |
8fa192c to 682e7fc
682e7fc to b30950a
I know I saw the changes from NVIDIA/k8s-dra-driver-gpu@920a287 integrated in the PR at some point, but they seem to have been dropped... |
I dropped them because I was thinking the multiversion compatibility would be handled with runtime-config in kind-cluster-config.yaml? |
Also having
[control-plane-check] kube-apiserver is not healthy after 4m0.00028186s
A control plane component may have crashed or exited when started by the container runtime.
error: error execution phase wait-control-plane: failed while waiting for the control plane to start: kube-apiserver check failed at https://172.18.0.2:6443/livez: client rate limiter Wait returned an error: context deadline exceeded
When excluding changes from NVIDIA/k8s-dra-driver-gpu@920a287 and with |
It looks like |
With
[api-check] The API server is not healthy after 4m0.001581692s
Unfortunately, an error has occurred:
context deadline exceeded
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
1.33 cluster (correctly uses v1beta2)
To recap:
So I think defining both v1beta1 and v1beta2 in runtime config doesn't work because v1beta2 is not available in 1.32 |
0b396d1 to 2061e73
258f0c3 to 9395ab2
deployments/helm/dra-example-driver/templates/validatingwebhookconfiguration.yaml
I ran through all the valid combinations of v1beta1/v1beta2/v1 and 1.32/1.33/1.34 and things are working as I expect, so overall LGTM. @alimaazamat As soon as the k8s.io dependencies and kind image for the final 1.34.0 release are available let's get a clean CI run before squashing and merging this. Feel free to squash earlier too if that's easier. |
6f8a7b6 to 01e57c2
01e57c2 to a63aae2
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: alimaazamat, nojnhuh The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
fixes #115