Skip to content

Conversation

@shivakunv
Copy link
Contributor

@shivakunv shivakunv commented Nov 24, 2025

Current Issue:
Renovate always pulls the digest from the latest tag, which results in a SHA difference.
However, the version in the values.yaml file is not updated in this scenario, so the digest in the manifests becomes mismatched and stale.

Solution:
Open a PR only when the RepoTags value in values.yaml changes.
Then search for that exact digest in
bundle/manifests/gpu-operator-certified.clusterserviceversion.yaml
instead of using the digest from latest.

check open PRs:
https://github.com/shivakunv/gpu-operator-shiva/pulls

@shivakunv shivakunv force-pushed the grouptagcommitwithbundle branch from deb4294 to e789ee7 Compare November 24, 2025 18:37
@shivakunv shivakunv marked this pull request as ready for review November 24, 2025 18:38
@shivakunv shivakunv self-assigned this Nov 24, 2025
@shivakunv
Copy link
Contributor Author

shivakunv commented Nov 24, 2025

PTAL @cdesiniotis @tariq1890

@coveralls
Copy link

coveralls commented Nov 24, 2025

Coverage Status

coverage: 23.618%. remained the same
when pulling f3e6ed9 on grouptagcommitwithbundle
into 75d1b96 on main.

image: nvcr.io/nvidia/k8s/dcgm-exporter:4.4.1-4.6.0-distroless@sha256:ed2dfcb708949de649ab8e1b23521cfd1eba89774dbbe662b6835f1ffcaadb1a
- name: dcgm-image
image: nvcr.io/nvidia/cloud-native/dcgm@sha256:99187d6b023689f50cf065c77b96ba5aacfa26a618854608a1e31da5e826b765
image: nvcr.io/nvidia/cloud-native/dcgm:4.4.1-2-ubuntu22.04@sha256:d42cd2afe032d9bfcb101cc2f739683b123a6c744f2732f221362a3cac776806
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be picking the ubi9 tags not ubuntu22.04

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I matched the version with the deployments/gpu-operator/values.yaml file which has ubuntu22.04 for dcgm-exporter .

values.yaml changed:-
- version: 4.4.2-1-ubuntu22.04
+ version: 4.4.2-1-ubi9

image: nvcr.io/nvidia/driver@sha256:35359117c5cdf786694d2fdba2ba038e7f673c5d0243c9ed4dc6cdaf6e675e4a
- name: device-plugin-image
image: nvcr.io/nvidia/k8s-device-plugin@sha256:2d16df5f3f12081b4bd6b317cf697e5c7a195c53cec7e0bab756db02a06b985c
image: nvcr.io/nvidia/k8s-device-plugin:v0.18.0@sha256:2d16df5f3f12081b4bd6b317cf697e5c7a195c53cec7e0bab756db02a06b985c
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we adding the version numbers here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renovate always pulls the main:latest digest and updates the sha256. However, we have a specicfic requirement:
the sha in the bundle/manifests/gpu-operator-certified.clusterserviceversion.yaml must match the version specified in values.yaml

To achieve this and to ensure both files are updated to the same version we need an identifier. with this change , renovate should first update the tags/version in values.yaml , then use that version to fetch the corresponding sha256 and update the bundle/manifests/gpu-operator-certified.clusterserviceversion.yaml .

if we dont associate the version with the sha renovate will not be able to identify which sha belongs to which version and it will use main:latest .

@shivakunv shivakunv force-pushed the grouptagcommitwithbundle branch 3 times, most recently from a8eb42b to 6f8277b Compare December 3, 2025 05:26
Copy link
Contributor

@rajathagasthya rajathagasthya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This largely looks good to me. Just a couple of questions.

"[-\\s]*value:\\s*\"?(?<depName>[^:\"]+)(?::(?<currentValue>[^@\"]+))?@(?<currentDigest>sha256:[a-f0-9]{64})\"?",
"[-\\s]*image: (?<depName>.*?)(?::(?<currentValue>.*?))?@(?<currentDigest>sha256:[a-f0-9]{64})",
"- name: (?<suffix>[\\w-]+)[-\\s]*image: (?<depName>.*?)(?::(?<currentValue>.*?))?@(?<currentDigest>sha256:[a-f0-9]{64})"
"[-\\s]*repository:\\s*(?<repo>\\S+)\\s*\\n(?:\\s*#.*\\n|\\s*\\n)*[-\\s]*image:\\s*(?<image>\\S+)\\s*\\n(?:\\s*#.*\\n|\\s*\\n)*[-\\s]*version:\\s*(?<currentValue>\\S+)"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regexes are hard to understand. Do you mind giving a quick rundown of what this is doing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suppose we have image format as this in values.yaml :-

driverManager:
    repository: nvcr.io/nvidia/cloud-native
    image: k8s-driver-manager
    # When choosing a different version of k8s-driver-manager, DO NOT downgrade to a version lower than v0.6.4
    # to ensure k8s-driver-manager stays compatible with gpu-operator starting from v24.3.0
    version: v0.9.1
    imagePullPolicy: IfNotPresent
    env: []

[-\\s]*repository:\\s*(?<repo>\\S+)\\s*\\n
: [-\\s]* : - (list preseent or not ) or indentation match (optional)
: repository: match repository word
: \\s*(?<repo>\\S+): capture nvcr.io/nvidia/cloud-native
: \s*\n : ends at new line

(?:\\s*#.*\\n|\\s*\\n)*
: to ignore any blank lines, comment (#) or any formatting issue.

[-\\s]*image:\\s*(?<image>\\S+)\\s*\\n :
only change here is capture image: k8s-driver-manager

another
(?:\\s*#.*\\n|\\s*\\n)* :
ignore blank , line comments etc .
these lines will be ignored
# When choosing a different version of k8s-driver-manager, DO NOT downgrade to a version lower than v0.6.4
# to ensure k8s-driver-manager stays compatible with gpu-operator starting from v24.3.0

(?:\\s*#.*\\n|\\s*\\n)*[-\\s]*version:\\s* : captures version
(?\S+)" : use version value v0.9.1` as currentValue and query docker data source for new version due to rule L:46 and L:47:
```
"depNameTemplate": "{{repo}}/{{image}}",
"datasourceTemplate": "docker",


],
"matchStrings": [
"[-\\s]*repository:\\s*(?<repo>\\S+)\\s*\\n(?:\\s*#.*\\n|\\s*\\n)*[-\\s]*image:\\s*(?<image>\\S+)\\s*\\n(?:\\s*#.*\\n|\\s*\\n)*[-\\s]*version:\\s*(?<currentValue>\\S+)"
"(?<depName>[^:\"]+):(?<currentValue>[^@\"]+)@(?<currentDigest>sha256:[a-f0-9]{64})"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suppose we have sha image in file bundle/manifests/gpu-operator-certified.clusterserviceversion.yaml :
image: nvcr.io/nvidia/cloud-native/k8s-driver-manager:v0.9.1@sha256:c549346eb993fda62e9bf665aabaacc88abc06b0b24e69635427d4d71c2d5ed4

(?<depName>[^:\"]+) :
renovate captures depName : nvcr.io/nvidia/cloud-native/k8s-driver-manager [^:\"] : match any characters , stop when we reach : or " , final match will be anything before the version tag v0.9.1 .

:(?<currentValue>[^@\"]+)@ :
start matching after :, captures all characters upto @ , stop at @ , avoid " qoutotes currentValue=v0.9.1

(?<currentDigest>sha256:[a-f0-9]{64}) :
start with sha256
matches only characters [a-f] and [0-9]
match valid sha update 64 characters
currentDigest = c549346eb993fda62e9bf665aabaacc88abc06b0b24e69635427d4d71c2d5ed4

"[-\\s]*value:\\s*\"?(?<depName>[^:\"]+)(?::(?<currentValue>[^@\"]+))?@(?<currentDigest>sha256:[a-f0-9]{64})\"?",
"[-\\s]*image: (?<depName>.*?)(?::(?<currentValue>.*?))?@(?<currentDigest>sha256:[a-f0-9]{64})",
"- name: (?<suffix>[\\w-]+)[-\\s]*image: (?<depName>.*?)(?::(?<currentValue>.*?))?@(?<currentDigest>sha256:[a-f0-9]{64})"
"[-\\s]*repository:\\s*(?<repo>\\S+)\\s*\\n(?:\\s*#.*\\n|\\s*\\n)*[-\\s]*image:\\s*(?<image>\\S+)\\s*\\n(?:\\s*#.*\\n|\\s*\\n)*[-\\s]*version:\\s*(?<currentValue>\\S+)"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suppose we have image format as this in values.yaml :-

driverManager:
    repository: nvcr.io/nvidia/cloud-native
    image: k8s-driver-manager
    # When choosing a different version of k8s-driver-manager, DO NOT downgrade to a version lower than v0.6.4
    # to ensure k8s-driver-manager stays compatible with gpu-operator starting from v24.3.0
    version: v0.9.1
    imagePullPolicy: IfNotPresent
    env: []

[-\\s]*repository:\\s*(?<repo>\\S+)\\s*\\n
: [-\\s]* : - (list preseent or not ) or indentation match (optional)
: repository: match repository word
: \\s*(?<repo>\\S+): capture nvcr.io/nvidia/cloud-native
: \s*\n : ends at new line

(?:\\s*#.*\\n|\\s*\\n)*
: to ignore any blank lines, comment (#) or any formatting issue.

[-\\s]*image:\\s*(?<image>\\S+)\\s*\\n :
only change here is capture image: k8s-driver-manager

another
(?:\\s*#.*\\n|\\s*\\n)* :
ignore blank , line comments etc .
these lines will be ignored
# When choosing a different version of k8s-driver-manager, DO NOT downgrade to a version lower than v0.6.4
# to ensure k8s-driver-manager stays compatible with gpu-operator starting from v24.3.0

(?:\\s*#.*\\n|\\s*\\n)*[-\\s]*version:\\s* : captures version
(?\S+)" : use version value v0.9.1` as currentValue and query docker data source for new version due to rule L:46 and L:47:
```
"depNameTemplate": "{{repo}}/{{image}}",
"datasourceTemplate": "docker",


],
"matchStrings": [
"[-\\s]*repository:\\s*(?<repo>\\S+)\\s*\\n(?:\\s*#.*\\n|\\s*\\n)*[-\\s]*image:\\s*(?<image>\\S+)\\s*\\n(?:\\s*#.*\\n|\\s*\\n)*[-\\s]*version:\\s*(?<currentValue>\\S+)"
"(?<depName>[^:\"]+):(?<currentValue>[^@\"]+)@(?<currentDigest>sha256:[a-f0-9]{64})"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suppose we have sha image in file bundle/manifests/gpu-operator-certified.clusterserviceversion.yaml :
image: nvcr.io/nvidia/cloud-native/k8s-driver-manager:v0.9.1@sha256:c549346eb993fda62e9bf665aabaacc88abc06b0b24e69635427d4d71c2d5ed4

(?<depName>[^:\"]+) :
renovate captures depName : nvcr.io/nvidia/cloud-native/k8s-driver-manager [^:\"] : match any characters , stop when we reach : or " , final match will be anything before the version tag v0.9.1 .

:(?<currentValue>[^@\"]+)@ :
start matching after :, captures all characters upto @ , stop at @ , avoid " qoutotes currentValue=v0.9.1

(?<currentDigest>sha256:[a-f0-9]{64}) :
start with sha256
matches only characters [a-f] and [0-9]
match valid sha update 64 characters
currentDigest = c549346eb993fda62e9bf665aabaacc88abc06b0b24e69635427d4d71c2d5ed4

"nvcr.io/nvidia/kubevirt-gpu-device-plugin",
"nvcr.io/nvidia/k8s-device-plugin"
],
"versioning": "regex:^v?(?<major>\\d+)\\.(?<minor>\\d+)\\.(?<patch>\\d+)$",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

\.(?\d+)$` :

^: start of string
v? : optionsla v0.18.0 or 0.18.0
(?<major>\\d+): major version in digit
\\. : dot match
\\.(?<minor>\\d+) : minor version in digit
\. : dot match
(?\d+) : patch version in digit
$: end of string

it will match :
v0.18.0
0.18.0

it will not match
0.18.0-ubuntu
v0.18 etc

],
"versioningTemplate": "{{depName}}{{#if newValue}}:{{newValue}}{{/if}}{{#if newDigest}}@{{newDigest}}{{/if}}",
"datasourceTemplate": "docker"
"depNameTemplate": "{{repo}}/{{image}}",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this depNameTemplate ensures matching updates in file bundle/manifests/gpu-operator-certified.clusterserviceversion.yaml for depName variable

image: ghcr.io/nvidia/gpu-operator:main-latest
- name: dcgm-exporter-image
image: nvcr.io/nvidia/k8s/dcgm-exporter@sha256:7c0ac4430bb0a5868b7404a0e06c47e02b0375b61aadd614385ad0bc2d43815a
image: nvcr.io/nvidia/k8s/dcgm-exporter:4.4.2-4.7.0-distroless@sha256:7c0ac4430bb0a5868b7404a0e06c47e02b0375b61aadd614385ad0bc2d43815a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shivakunv Let's separate changes to this file which introduces a new tagging convention into a different PR. The reason is we may need to check with RedHat to make sure this doesn't break their certification.

Once that change is done, we can run your renovate changes to see if things are updating correctly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes to this file will not be separated, because the purpose of Renovate ( here ) is to update both values.yaml and the bundle/manifests/gpu-operator-certified.clusterserviceversion.yaml file simultaneously.

Currently, the Renovate bot is broken for the bundle/manifests/gpu-operator-certified.clusterserviceversion.yaml file, as it sometimes opens the wrong PR. The Renovate PR for values.yaml is working fine.
Both PRs (opened by renovate) should open together and be merged into main.

Until this change is merged into main, we need to manually review and update the PR created by Renovate bot for thebundle/manifests/gpu-operator-certified.clusterserviceversion.yaml . @rajathagasthya

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @rajathagasthya for your review
@cdesiniotis PTAL

Signed-off-by: Shiva Kumar (SW-CLOUD) <shivaku@nvidia.com>
@shivakunv shivakunv force-pushed the grouptagcommitwithbundle branch from 6f8277b to f3e6ed9 Compare December 13, 2025 06:48
@shivakunv
Copy link
Contributor Author

PTAL @tariq1890
Since @cdesiniotis has already confirmed that this image tagging convention will work, the renovate.json file will not impact our build process.

Regarding this section:

- name: vfio-manager-image
   image: ghcr.io/nvidia/k8s-driver-manager:0d30fa9c

I have only added TODO comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants