@nikhaild nikhaild commented Dec 3, 2025

Add support for toggling the `hostNetwork` field in the dcgm-exporter DaemonSet pod spec.
With host networking enabled, a Prometheus server that runs outside the k8s cluster overlay
network can still reach the dcgm-exporter pods by scraping each DaemonSet pod's port directly
on its node. This is easier to reason about and deal with than, say, scraping a NodePort
service's port on each node.
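
For illustration, here is roughly what the rendered DaemonSet pod template could look like with the toggle on. This is a sketch, not the exact output of this PR: the container name and the `9400` metrics port are assumptions (9400 is the usual dcgm-exporter default), and `dnsPolicy: ClusterFirstWithHostNet` is shown only because Kubernetes recommends it for host-network pods that still need cluster DNS.

```yaml
# Illustrative fragment of a DaemonSet pod template with host networking on.
spec:
  template:
    spec:
      hostNetwork: true                    # pod shares the node's network namespace
      dnsPolicy: ClusterFirstWithHostNet   # assumption: keeps cluster DNS resolution working
      containers:
        - name: dcgm-exporter              # hypothetical container name
          ports:
            - name: metrics
              containerPort: 9400          # assumed default metrics port; bound on the node itself
```

With this in place, an external scraper would target `<node-ip>:9400` on each GPU node directly.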

Fixes #1086

copy-pr-bot bot commented Dec 3, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@nikhaild nikhaild force-pushed the pull-request/dcgm-exporter-hostnetwork branch from e12adfc to 9c1b323 Compare December 3, 2025 07:23
@nikhaild nikhaild marked this pull request as ready for review December 3, 2025 07:28
Add support to toggle `hostNetwork` field of pod spec for
dcgm-exporter daemonset pod spec, allowing dcgm-exporter pods
to be scraped by say prometheus-server that runs outside of the
bounds of the k8s cluster overlay network (and still be able
to reach dcgm-exporter pods).

Fixes NVIDIA#1086

Signed-off-by: Nikhil R Deshpande <nikhil-nd.deshpande@broadcom.com>
@nikhaild nikhaild force-pushed the pull-request/dcgm-exporter-hostnetwork branch from 9c1b323 to d858d24 Compare December 3, 2025 16:41
@rajathagasthya
Contributor

/ok-to-test d858d24

@tariq1890
Contributor

@nikhaild, thanks for your contribution. In general, we like to ensure that our generated YAML manifests are aligned with the upstream helm chart of dcgm-exporter. It would be good if you can open a PR/issue in NVIDIA/dcgm-exporter and get the maintainers of dcgm-exporter to weigh in on this first.

@nikhaild
Author

nikhaild commented Dec 10, 2025

> In general, we like to ensure that our generated YAML manifests are aligned with the upstream helm chart of dcgm-exporter. It would be good if you can open a PR/issue in NVIDIA/dcgm-exporter and get the maintainers of dcgm-exporter to weigh in on this first.

Thanks @tariq1890 for taking a look!

The upstream dcgm-exporter helm chart already supports templating this `hostNetwork` field (added a while ago, tracked in PR#64). Code ref:

      {{- if .Values.hostNetwork }}
      hostNetwork: {{ .Values.hostNetwork }}

It just isn't given an explicit default in the dcgm-exporter helm chart's values.yaml, but it looks like some folks already use it (ref issue#495).
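
Going by the template excerpt above, a values override along these lines should be enough to turn it on with the upstream chart. Treat it as a sketch: the key name is taken from `.Values.hostNetwork` in the excerpt, and the file name is hypothetical.

```yaml
# my-values.yaml (hypothetical override file for the upstream dcgm-exporter chart)
# Because the template guards the field with `if .Values.hostNetwork`,
# leaving this unset or false omits hostNetwork from the pod spec entirely.
hostNetwork: true
```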

Could you clarify what exactly I should ask of the dcgm-exporter maintainers? I'd appreciate it if you could elaborate a bit.

Development

Successfully merging this pull request may close these issues.

[Feature Request] Add hostNetwork mode for dcgmExporter

3 participants