-
Notifications
You must be signed in to change notification settings - Fork 431
fix: update digest lookup to use versioned RepoTags instead of latest #1935
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
deb4294 to
e789ee7
Compare
|
PTAL @cdesiniotis @tariq1890 |
| image: nvcr.io/nvidia/k8s/dcgm-exporter:4.4.1-4.6.0-distroless@sha256:ed2dfcb708949de649ab8e1b23521cfd1eba89774dbbe662b6835f1ffcaadb1a | ||
| - name: dcgm-image | ||
| image: nvcr.io/nvidia/cloud-native/dcgm@sha256:99187d6b023689f50cf065c77b96ba5aacfa26a618854608a1e31da5e826b765 | ||
| image: nvcr.io/nvidia/cloud-native/dcgm:4.4.1-2-ubuntu22.04@sha256:d42cd2afe032d9bfcb101cc2f739683b123a6c744f2732f221362a3cac776806 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be picking the ubi9 tags not ubuntu22.04
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I matched the version with the deployments/gpu-operator/values.yaml file which has ubuntu22.04 for dcgm-exporter .
values.yaml changed:-
- version: 4.4.2-1-ubuntu22.04
+ version: 4.4.2-1-ubi9
| image: nvcr.io/nvidia/driver@sha256:35359117c5cdf786694d2fdba2ba038e7f673c5d0243c9ed4dc6cdaf6e675e4a | ||
| - name: device-plugin-image | ||
| image: nvcr.io/nvidia/k8s-device-plugin@sha256:2d16df5f3f12081b4bd6b317cf697e5c7a195c53cec7e0bab756db02a06b985c | ||
| image: nvcr.io/nvidia/k8s-device-plugin:v0.18.0@sha256:2d16df5f3f12081b4bd6b317cf697e5c7a195c53cec7e0bab756db02a06b985c |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are we adding the version numbers here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renovate always pulls the main:latest digest and updates the sha256. However, we have a specicfic requirement:
the sha in the bundle/manifests/gpu-operator-certified.clusterserviceversion.yaml must match the version specified in values.yaml
To achieve this and to ensure both files are updated to the same version we need an identifier. with this change , renovate should first update the tags/version in values.yaml , then use that version to fetch the corresponding sha256 and update the bundle/manifests/gpu-operator-certified.clusterserviceversion.yaml .
if we dont associate the version with the sha renovate will not be able to identify which sha belongs to which version and it will use main:latest .
a8eb42b to
6f8277b
Compare
rajathagasthya
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This largely looks good to me. Just a couple of questions.
| "[-\\s]*value:\\s*\"?(?<depName>[^:\"]+)(?::(?<currentValue>[^@\"]+))?@(?<currentDigest>sha256:[a-f0-9]{64})\"?", | ||
| "[-\\s]*image: (?<depName>.*?)(?::(?<currentValue>.*?))?@(?<currentDigest>sha256:[a-f0-9]{64})", | ||
| "- name: (?<suffix>[\\w-]+)[-\\s]*image: (?<depName>.*?)(?::(?<currentValue>.*?))?@(?<currentDigest>sha256:[a-f0-9]{64})" | ||
| "[-\\s]*repository:\\s*(?<repo>\\S+)\\s*\\n(?:\\s*#.*\\n|\\s*\\n)*[-\\s]*image:\\s*(?<image>\\S+)\\s*\\n(?:\\s*#.*\\n|\\s*\\n)*[-\\s]*version:\\s*(?<currentValue>\\S+)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regexes are hard to understand. Do you mind giving a quick rundown of what this is doing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suppose we have image format as this in values.yaml :-
driverManager:
repository: nvcr.io/nvidia/cloud-native
image: k8s-driver-manager
# When choosing a different version of k8s-driver-manager, DO NOT downgrade to a version lower than v0.6.4
# to ensure k8s-driver-manager stays compatible with gpu-operator starting from v24.3.0
version: v0.9.1
imagePullPolicy: IfNotPresent
env: []
[-\\s]*repository:\\s*(?<repo>\\S+)\\s*\\n
: [-\\s]* : - (list preseent or not ) or indentation match (optional)
: repository: match repository word
: \\s*(?<repo>\\S+): capture nvcr.io/nvidia/cloud-native
: \s*\n : ends at new line
(?:\\s*#.*\\n|\\s*\\n)*
: to ignore any blank lines, comment (#) or any formatting issue.
[-\\s]*image:\\s*(?<image>\\S+)\\s*\\n :
only change here is capture image: k8s-driver-manager
another
(?:\\s*#.*\\n|\\s*\\n)* :
ignore blank , line comments etc .
these lines will be ignored
# When choosing a different version of k8s-driver-manager, DO NOT downgrade to a version lower than v0.6.4
# to ensure k8s-driver-manager stays compatible with gpu-operator starting from v24.3.0
(?:\\s*#.*\\n|\\s*\\n)*[-\\s]*version:\\s* : captures version
(?\S+)" : use version value v0.9.1` as currentValue and query docker data source for new version due to rule L:46 and L:47:
```
"depNameTemplate": "{{repo}}/{{image}}",
"datasourceTemplate": "docker",
| ], | ||
| "matchStrings": [ | ||
| "[-\\s]*repository:\\s*(?<repo>\\S+)\\s*\\n(?:\\s*#.*\\n|\\s*\\n)*[-\\s]*image:\\s*(?<image>\\S+)\\s*\\n(?:\\s*#.*\\n|\\s*\\n)*[-\\s]*version:\\s*(?<currentValue>\\S+)" | ||
| "(?<depName>[^:\"]+):(?<currentValue>[^@\"]+)@(?<currentDigest>sha256:[a-f0-9]{64})" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suppose we have sha image in file bundle/manifests/gpu-operator-certified.clusterserviceversion.yaml :
image: nvcr.io/nvidia/cloud-native/k8s-driver-manager:v0.9.1@sha256:c549346eb993fda62e9bf665aabaacc88abc06b0b24e69635427d4d71c2d5ed4
(?<depName>[^:\"]+) :
renovate captures depName : nvcr.io/nvidia/cloud-native/k8s-driver-manager [^:\"] : match any characters , stop when we reach : or " , final match will be anything before the version tag v0.9.1 .
:(?<currentValue>[^@\"]+)@ :
start matching after :, captures all characters upto @ , stop at @ , avoid " qoutotes currentValue=v0.9.1
(?<currentDigest>sha256:[a-f0-9]{64}) :
start with sha256
matches only characters [a-f] and [0-9]
match valid sha update 64 characters
currentDigest = c549346eb993fda62e9bf665aabaacc88abc06b0b24e69635427d4d71c2d5ed4
| "[-\\s]*value:\\s*\"?(?<depName>[^:\"]+)(?::(?<currentValue>[^@\"]+))?@(?<currentDigest>sha256:[a-f0-9]{64})\"?", | ||
| "[-\\s]*image: (?<depName>.*?)(?::(?<currentValue>.*?))?@(?<currentDigest>sha256:[a-f0-9]{64})", | ||
| "- name: (?<suffix>[\\w-]+)[-\\s]*image: (?<depName>.*?)(?::(?<currentValue>.*?))?@(?<currentDigest>sha256:[a-f0-9]{64})" | ||
| "[-\\s]*repository:\\s*(?<repo>\\S+)\\s*\\n(?:\\s*#.*\\n|\\s*\\n)*[-\\s]*image:\\s*(?<image>\\S+)\\s*\\n(?:\\s*#.*\\n|\\s*\\n)*[-\\s]*version:\\s*(?<currentValue>\\S+)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suppose we have image format as this in values.yaml :-
driverManager:
repository: nvcr.io/nvidia/cloud-native
image: k8s-driver-manager
# When choosing a different version of k8s-driver-manager, DO NOT downgrade to a version lower than v0.6.4
# to ensure k8s-driver-manager stays compatible with gpu-operator starting from v24.3.0
version: v0.9.1
imagePullPolicy: IfNotPresent
env: []
[-\\s]*repository:\\s*(?<repo>\\S+)\\s*\\n
: [-\\s]* : - (list preseent or not ) or indentation match (optional)
: repository: match repository word
: \\s*(?<repo>\\S+): capture nvcr.io/nvidia/cloud-native
: \s*\n : ends at new line
(?:\\s*#.*\\n|\\s*\\n)*
: to ignore any blank lines, comment (#) or any formatting issue.
[-\\s]*image:\\s*(?<image>\\S+)\\s*\\n :
only change here is capture image: k8s-driver-manager
another
(?:\\s*#.*\\n|\\s*\\n)* :
ignore blank , line comments etc .
these lines will be ignored
# When choosing a different version of k8s-driver-manager, DO NOT downgrade to a version lower than v0.6.4
# to ensure k8s-driver-manager stays compatible with gpu-operator starting from v24.3.0
(?:\\s*#.*\\n|\\s*\\n)*[-\\s]*version:\\s* : captures version
(?\S+)" : use version value v0.9.1` as currentValue and query docker data source for new version due to rule L:46 and L:47:
```
"depNameTemplate": "{{repo}}/{{image}}",
"datasourceTemplate": "docker",
| ], | ||
| "matchStrings": [ | ||
| "[-\\s]*repository:\\s*(?<repo>\\S+)\\s*\\n(?:\\s*#.*\\n|\\s*\\n)*[-\\s]*image:\\s*(?<image>\\S+)\\s*\\n(?:\\s*#.*\\n|\\s*\\n)*[-\\s]*version:\\s*(?<currentValue>\\S+)" | ||
| "(?<depName>[^:\"]+):(?<currentValue>[^@\"]+)@(?<currentDigest>sha256:[a-f0-9]{64})" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suppose we have sha image in file bundle/manifests/gpu-operator-certified.clusterserviceversion.yaml :
image: nvcr.io/nvidia/cloud-native/k8s-driver-manager:v0.9.1@sha256:c549346eb993fda62e9bf665aabaacc88abc06b0b24e69635427d4d71c2d5ed4
(?<depName>[^:\"]+) :
renovate captures depName : nvcr.io/nvidia/cloud-native/k8s-driver-manager [^:\"] : match any characters , stop when we reach : or " , final match will be anything before the version tag v0.9.1 .
:(?<currentValue>[^@\"]+)@ :
start matching after :, captures all characters upto @ , stop at @ , avoid " qoutotes currentValue=v0.9.1
(?<currentDigest>sha256:[a-f0-9]{64}) :
start with sha256
matches only characters [a-f] and [0-9]
match valid sha update 64 characters
currentDigest = c549346eb993fda62e9bf665aabaacc88abc06b0b24e69635427d4d71c2d5ed4
| "nvcr.io/nvidia/kubevirt-gpu-device-plugin", | ||
| "nvcr.io/nvidia/k8s-device-plugin" | ||
| ], | ||
| "versioning": "regex:^v?(?<major>\\d+)\\.(?<minor>\\d+)\\.(?<patch>\\d+)$", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
\.(?\d+)$` :
^: start of string
v? : optionsla v0.18.0 or 0.18.0
(?<major>\\d+): major version in digit
\\. : dot match
\\.(?<minor>\\d+) : minor version in digit
\. : dot match
(?\d+) : patch version in digit
$: end of string
it will match :
v0.18.0
0.18.0
it will not match
0.18.0-ubuntu
v0.18 etc
| ], | ||
| "versioningTemplate": "{{depName}}{{#if newValue}}:{{newValue}}{{/if}}{{#if newDigest}}@{{newDigest}}{{/if}}", | ||
| "datasourceTemplate": "docker" | ||
| "depNameTemplate": "{{repo}}/{{image}}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this depNameTemplate ensures matching updates in file bundle/manifests/gpu-operator-certified.clusterserviceversion.yaml for depName variable
| image: ghcr.io/nvidia/gpu-operator:main-latest | ||
| - name: dcgm-exporter-image | ||
| image: nvcr.io/nvidia/k8s/dcgm-exporter@sha256:7c0ac4430bb0a5868b7404a0e06c47e02b0375b61aadd614385ad0bc2d43815a | ||
| image: nvcr.io/nvidia/k8s/dcgm-exporter:4.4.2-4.7.0-distroless@sha256:7c0ac4430bb0a5868b7404a0e06c47e02b0375b61aadd614385ad0bc2d43815a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shivakunv Let's separate changes to this file which introduces a new tagging convention into a different PR. The reason is we may need to check with RedHat to make sure this doesn't break their certification.
Once that change is done, we can run your renovate changes to see if things are updating correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes to this file will not be separated, because the purpose of Renovate ( here ) is to update both values.yaml and the bundle/manifests/gpu-operator-certified.clusterserviceversion.yaml file simultaneously.
Currently, the Renovate bot is broken for the bundle/manifests/gpu-operator-certified.clusterserviceversion.yaml file, as it sometimes opens the wrong PR. The Renovate PR for values.yaml is working fine.
Both PRs (opened by renovate) should open together and be merged into main.
Until this change is merged into main, we need to manually review and update the PR created by Renovate bot for thebundle/manifests/gpu-operator-certified.clusterserviceversion.yaml . @rajathagasthya
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @rajathagasthya for your review
@cdesiniotis PTAL
Signed-off-by: Shiva Kumar (SW-CLOUD) <shivaku@nvidia.com>
6f8277b to
f3e6ed9
Compare
|
PTAL @tariq1890 Regarding this section: I have only added TODO comments. |
Current Issue:
Renovate always pulls the digest from the latest tag, which results in a SHA difference.
However, the version in the
values.yamlfile is not updated in this scenario, so the digest in the manifests becomes mismatched and stale.Solution:
Open a PR only when the RepoTags value in values.yaml changes.
Then search for that exact digest in
bundle/manifests/gpu-operator-certified.clusterserviceversion.yamlinstead of using the digest from latest.
check open PRs:
https://github.com/shivakunv/gpu-operator-shiva/pulls