
Agent DaemonSet: Host Log Access Permission Denied & Kubelet Metrics Endpoint Resolution Failure #134

@alimorseltelostouch

Description


When deploying the OpenObserve Collector agent DaemonSet using this Helm chart, we are encountering two issues in a specific managed Kubernetes environment: a permission denied error when attempting to read host logs from /var/log/pods, and a failure to resolve the node hostname when scraping Kubelet statistics.

To Reproduce

Steps to reproduce the behavior:

Deploy the openobserve/openobserve-collector Helm chart using the following command:

helm --namespace openobserve-collector \
  upgrade --install o2c openobserve/openobserve-collector \
  --set exporters."otlphttp/openobserve".endpoint=*** \
  --set exporters."otlphttp/openobserve".headers.Authorization="****" \
  --set exporters."otlphttp/openobserve_k8s_events".endpoint=*** \
  --set exporters."otlphttp/openobserve_k8s_events".headers.Authorization="***" \
  --set securityContext.runAsUser=0 \
  --set securityContext.runAsGroup=0 \
  --set securityContext.runAsNonRoot=false \
  --set podSecurityContext.runAsUser=0 \
  --set podSecurityContext.runAsGroup=0 \
  --create-namespace

The target is a managed Kubernetes environment (Tanzu) where host filesystem access requires an explicit security context configuration and node hostnames are not reliably resolvable via cluster DNS.

Expected behavior

The OpenObserve Collector agent DaemonSet should be able to:

  • Access and read log files located at /var/log/pods/*/*/*.log on the host node.
  • Successfully resolve the node's Kubelet endpoint (typically on port 10250) to scrape metrics.

Actual behavior

1. Permission denied accessing host logs:
The filelog receiver fails to read logs from the host with a "permission denied" error on the /var/log/pods directory.
Error snippet from agent pod logs:

{"kind": "receiver", "name": "filelog/std", "data_type": "logs", "component": "fileconsumer", "error": "no files match the configured criteria\nfind files with '/var/log/pods/*/*/*.log' pattern: open .: permission denied"}
We have investigated this by deploying a minimal test DaemonSet pod with a hostPath mount for /var/log and setting its securityContext to run with root privileges (runAsUser: 0, runAsGroup: 0). This test pod was successfully able to list and read files within /var/log/pods, demonstrating that the underlying environment does permit such access when the pod has the necessary privileges.
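A trimmed-down sketch of the kind of test manifest used (image, names, and namespace are illustrative):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: hostlog-access-test          # illustrative name
  namespace: default
spec:
  selector:
    matchLabels:
      app: hostlog-access-test
  template:
    metadata:
      labels:
        app: hostlog-access-test
    spec:
      securityContext:
        runAsUser: 0                 # root, i.e. the privileges we expect the agent to need
        runAsGroup: 0
        runAsNonRoot: false
      containers:
        - name: reader
          image: busybox:1.36        # illustrative image
          command: ["sh", "-c", "ls -l /var/log/pods && sleep 3600"]
          volumeMounts:
            - name: varlog
              mountPath: /var/log
              readOnly: true
      volumes:
        - name: varlog
          hostPath:
            path: /var/log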
Upon inspecting the YAML of a failing OpenObserve agent pod, we observed that the spec.securityContext block is present but empty ({}) and that no securityContext is configured at the container level, so the agent pod is not running with the root-level privileges needed to read these host files. Since we are deploying via the Helm chart, what is the recommended way to configure the agent DaemonSet's pod template in values.yaml with the necessary securityContext settings (runAsUser: 0, runAsGroup: 0)?
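In values.yaml terms, the kind of configuration we are after looks roughly like the sketch below; whether the chart expects these keys at the top level or under an agent-specific section is exactly what we are unsure of, so the key paths here are assumptions rather than confirmed chart values:

agent:
  podSecurityContext:          # assumed key; should render into the pod-level spec.securityContext (currently {})
    runAsUser: 0
    runAsGroup: 0
    runAsNonRoot: false
  securityContext:             # assumed key; container-level security context
    runAsUser: 0
    runAsGroup: 0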

2. Kubelet metrics endpoint resolution failure:
The kubeletstats receiver is unable to connect to the node's Kubelet endpoint, reporting a DNS lookup failure for the node hostname.
Error snippet from agent pod logs:

Error scraping metrics {"kind": "receiver", "name": "kubeletstats", "data_type": "metrics", "error": "Get \"https://tt-poc-cluster-worker-node-pool-1-54c795dc6cxz9cw9-r8lmg:10250/stats/summary\": dial tcp: lookup tt-poc-cluster-worker-node-pool-1-54c795dc6cxz9cw9-r8lmg on 100.64.0.10:53: no such host", "scraper": "kubeletstats"}
The pod correctly obtains the node name via the spec.nodeName field reference, but the cluster's DNS service cannot resolve that hostname. This may well be down to the networking setup of this managed environment; still, can the kubeletstats receiver, as configured by this chart, be pointed at the Kubelet endpoint using the node's IP address instead of its hostname, for environments where hostname resolution is unreliable? If so, is this configurable via the chart's values.yaml?
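As a concrete illustration of the workaround we have in mind (standard OpenTelemetry Collector configuration; we have not confirmed whether the chart exposes these knobs), the agent container would receive the node IP from the Downward API:

env:
  - name: K8S_NODE_IP              # variable name is illustrative
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP

and the kubeletstats receiver would reference that IP instead of the node name:

receivers:
  kubeletstats:
    collection_interval: 30s
    auth_type: serviceAccount
    endpoint: https://${env:K8S_NODE_IP}:10250
    insecure_skip_verify: true     # likely needed, since the kubelet serving certificate is typically issued for the hostname rather than the IP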

Environment

Kubernetes distribution: managed Kubernetes service (Tanzu)
OpenObserve Collector Helm chart version: latest
OpenTelemetry Collector version: latest
Deployment method: Helm chart

Additional context

We previously deployed this same Helm chart version in a different Kubernetes environment (AWS EKS) without encountering these issues, which suggests the problems are influenced by the configuration or policies of the environment we are currently deploying into.

We would appreciate guidance on:

  • The correct and recommended parameters in values.yaml to apply a securityContext with runAsUser: 0 and runAsGroup: 0 to the agent DaemonSet pod for host log access.
  • Whether the kubeletstats receiver can be configured via the chart to use the node IP instead of the hostname for the Kubelet endpoint, as a workaround for DNS resolution issues.
