tl;dr: the `values.yaml` of openobserve-collector is over-complicated. A simpler setup can be achieved with the upstream OpenTelemetry collector chart.

I am reviewing the code of openobserve-collector and would like to ask some questions about how it works.

I'm currently running a Kubernetes cluster with OpenObserve deployed in the `monitoring` namespace. Instead of using the openobserve-collector chart, I am using the upstream OpenTelemetry collector chart with presets enabled. The setup fits in a relatively concise helmfile:
```yaml
repositories:
  - name: open-telemetry
    url: https://open-telemetry.github.io/opentelemetry-helm-charts

releases:
  - name: collector-agent
    namespace: monitoring
    chart: open-telemetry/opentelemetry-collector
    version: 0.111.2
    values:
      - image:
          repository: ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector-k8s
        mode: daemonset
        presets:
          logsCollection:
            enabled: true
          hostMetrics:
            enabled: true
          kubernetesAttributes:
            enabled: true
            extractAllPodLabels: true
            extractAllPodAnnotations: false
          kubeletMetrics:
            enabled: true
        config: &CONFIG
          receivers:
            kubeletstats:
              insecure_skip_verify: true
          exporters:
            otlp/openobserve:
              endpoint: http://openobserve.monitoring.svc:5081
              headers:
                Authorization: {{
                  printf "%s:%s"
                    (fetchSecretValue "ref+k8s://v1/Secret/monitoring/openobserve-root-user/ZO_ROOT_USER_EMAIL")
                    (fetchSecretValue "ref+k8s://v1/Secret/monitoring/openobserve-root-user/ZO_ROOT_USER_PASSWORD")
                  | b64enc | print "Basic " | quote
                }}
                organization: default
                stream-name: default
              tls:
                insecure: true
          service:
            pipelines:
              logs:
                exporters:
                  - otlp/openobserve
              metrics:
                exporters:
                  - otlp/openobserve
              traces:
                exporters:
                  - otlp/openobserve
        resources: {} # -- snip --
  - name: collector-cluster
    namespace: monitoring
    chart: open-telemetry/opentelemetry-collector
    version: 0.111.2
    values:
      - image:
          repository: ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector-k8s
        mode: deployment
        replicaCount: 1
        presets:
          clusterMetrics:
            enabled: true
          kubernetesEvents:
            enabled: true
        config: *CONFIG
        resources: {} # -- snip --
```
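The templated `Authorization` value in the exporter config simply base64-encodes `email:password` and prefixes it with `Basic `. A minimal Python sketch of the same computation (the credentials shown are placeholders; the real ones come from the `openobserve-root-user` Secret):

```python
import base64


def basic_auth_header(email: str, password: str) -> str:
    """Mirror of the helmfile pipeline: printf "%s:%s" ... | b64enc | print "Basic "."""
    token = base64.b64encode(f"{email}:{password}".encode()).decode()
    return f"Basic {token}"


# Placeholder credentials, for illustration only.
print(basic_auth_header("user@example.com", "hunter2"))
```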
The `helmfile.yaml` defines two releases; the one called `collector-agent` handles log ingestion. The generated collector config can be obtained with:

```shell
kubectl get -n monitoring configmap collector-agent-opentelemetry-collector-agent -o jsonpath='{.data.relay}'
```

Upstream OpenTelemetry collector generated configuration:
```yaml
exporters:
  debug: {}
  otlp/openobserve:
    endpoint: http://openobserve.monitoring.svc:5081
    headers:
      Authorization: Basic ZGV2QGJhYnltcmkub3JnOmNocmlzMTIzNA==
      organization: default
      stream-name: otel-chart
    tls:
      insecure: true
extensions:
  health_check:
    endpoint: ${env:MY_POD_IP}:13133
processors:
  batch: {}
  k8sattributes:
    extract:
      labels:
        - from: pod
          key_regex: (.*)
          tag_name: $$1
      metadata:
        - k8s.namespace.name
        - k8s.deployment.name
        - k8s.statefulset.name
        - k8s.daemonset.name
        - k8s.cronjob.name
        - k8s.job.name
        - k8s.node.name
        - k8s.pod.name
        - k8s.pod.uid
        - k8s.pod.start_time
    filter:
      node_from_env_var: K8S_NODE_NAME
    passthrough: false
    pod_association:
      - sources:
          - from: resource_attribute
            name: k8s.pod.ip
      - sources:
          - from: resource_attribute
            name: k8s.pod.uid
      - sources:
          - from: connection
  memory_limiter:
    check_interval: 5s
    limit_percentage: 80
    spike_limit_percentage: 25
receivers:
  filelog:
    exclude:
      - /var/log/pods/monitoring_collector-agent-opentelemetry-collector*_*/opentelemetry-collector/*.log
    include:
      - /var/log/pods/*/*/*.log
    include_file_name: false
    include_file_path: true
    operators:
      - id: container-parser
        max_log_size: 102400
        type: container
    retry_on_failure:
      enabled: true
    start_at: end
  hostmetrics:
    collection_interval: 10s
    root_path: /hostfs
    scrapers:
      cpu: null
      disk: null
      filesystem:
        exclude_fs_types:
          fs_types:
            - autofs
            - binfmt_misc
            - bpf
            - cgroup2
            - configfs
            - debugfs
            - devpts
            - devtmpfs
            - fusectl
            - hugetlbfs
            - iso9660
            - mqueue
            - nsfs
            - overlay
            - proc
            - procfs
            - pstore
            - rpc_pipefs
            - securityfs
            - selinuxfs
            - squashfs
            - sysfs
            - tracefs
          match_type: strict
        exclude_mount_points:
          match_type: regexp
          mount_points:
            - /dev/*
            - /proc/*
            - /sys/*
            - /run/k3s/containerd/*
            - /var/lib/docker/*
            - /var/lib/kubelet/*
            - /snap/*
      load: null
      memory: null
      network: null
  jaeger:
    protocols:
      grpc:
        endpoint: ${env:MY_POD_IP}:14250
      thrift_compact:
        endpoint: ${env:MY_POD_IP}:6831
      thrift_http:
        endpoint: ${env:MY_POD_IP}:14268
  kubeletstats:
    auth_type: serviceAccount
    collection_interval: 20s
    endpoint: ${env:K8S_NODE_IP}:10250
    insecure_skip_verify: true
  otlp:
    protocols:
      grpc:
        endpoint: ${env:MY_POD_IP}:4317
      http:
        endpoint: ${env:MY_POD_IP}:4318
  prometheus:
    config:
      scrape_configs:
        - job_name: opentelemetry-collector
          scrape_interval: 10s
          static_configs:
            - targets:
                - ${env:MY_POD_IP}:8888
  zipkin:
    endpoint: ${env:MY_POD_IP}:9411
service:
  extensions:
    - health_check
  pipelines:
    logs:
      exporters:
        - otlp/openobserve
      processors:
        - k8sattributes
        - memory_limiter
        - batch
      receivers:
        - otlp
        - filelog
    metrics:
      exporters:
        - otlp/openobserve
      processors:
        - k8sattributes
        - memory_limiter
        - batch
      receivers:
        - otlp
        - prometheus
        - hostmetrics
        - kubeletstats
    traces:
      exporters:
        - otlp/openobserve
      processors:
        - k8sattributes
        - memory_limiter
        - batch
      receivers:
        - otlp
        - jaeger
        - zipkin
  telemetry:
    metrics:
      address: ${env:MY_POD_IP}:8888
```
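Note that the generated `filelog` receiver needs only a single operator of `type: container`, which auto-detects the runtime log format (docker JSON, containerd, CRI-O) internally. As an illustration only (this is my own approximation, not the collector's actual code), the format routing it subsumes amounts to:

```python
def detect_container_log_format(line: str) -> str:
    """Rough sketch of the routing the `container` operator performs internally."""
    if line.startswith("{"):
        return "docker"  # docker writes JSON lines
    first_field = line.split(" ", 1)[0]
    if first_field.endswith("Z"):
        return "containerd"  # RFC 3339 UTC timestamp ending in Z
    return "crio"  # RFC 3339 timestamp with a numeric offset


print(detect_container_log_format('{"log":"hi\\n","stream":"stdout"}'))           # docker
print(detect_container_log_format("2025-01-21T15:27:44.323513488Z stdout F hi"))  # containerd
print(detect_container_log_format("2025-01-21T15:27:44+00:00 stdout F hi"))       # crio
```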
Here is an example log entry from OpenObserve using the above upstream OpenTelemetry collector chart:
```json
{
  "_timestamp": 1737473264323746,
  "app": "openobserve",
  "apps_kubernetes_io_pod_index": "0",
  "body": "2025-01-21T15:27:44.323513488+00:00 INFO actix_web::middleware::logger: 172.18.0.4 \"GET /api/default/otel_chart/_values?fields=k8s_container_name&size=10&start_time=1737472364215000&end_time=1737473264215000&sql=U0VMRUNUICogRlJPTSAib3RlbF9jaGFydCIg&type=logs HTTP/1.1\" 200 250 \"-\" \"http://localhost:32020/web/logs?stream_type=logs&stream=otel_chart&period=15m&refresh=0&sql_mode=false&query=YXBwX2t1YmVybmV0ZXNfaW9fbmFtZSA9ICdjaHJpcy13b3JrZXItbWFpbnMn&type=stream_explorer&defined_schemas=user_defined_schema&org_identifier=default&quick_mode=false&show_histogram=true\" \"Mozilla/5.0 (X11; Linux x86_64; rv:134.0) Gecko/20100101 Firefox/134.0\" 0.099962",
  "controller_revision_hash": "openobserve-69f6d688f6",
  "dropped_attributes_count": 0,
  "k8s_container_name": "openobserve",
  "k8s_container_restart_count": "1",
  "k8s_namespace_name": "monitoring",
  "k8s_node_name": "khris-worker",
  "k8s_pod_name": "openobserve-0",
  "k8s_pod_start_time": "2025-01-20T22:14:56Z",
  "k8s_pod_uid": "1c857c0a-066e-40ba-8676-6c874631f1ca",
  "k8s_statefulset_name": "openobserve",
  "log_file_path": "/var/log/pods/monitoring_openobserve-0_1c857c0a-066e-40ba-8676-6c874631f1ca/openobserve/1.log",
  "log_iostream": "stdout",
  "logtag": "F",
  "name": "openobserve",
  "severity": 0,
  "statefulset_kubernetes_io_pod_name": "openobserve-0"
}
```
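The flat field names in that entry come from the OTLP resource-attribute keys being flattened on ingestion: dots, slashes, and other non-alphanumeric characters become underscores, so `k8s.pod.name` arrives as `k8s_pod_name` and the `apps.kubernetes.io/pod-index` label as `apps_kubernetes_io_pod_index`. A sketch of that mapping (my approximation of the behavior, not OpenObserve's actual code):

```python
import re


def flatten_key(attr_key: str) -> str:
    """Approximate the field-name flattening: non-alphanumerics -> underscore, lowercased."""
    return re.sub(r"[^A-Za-z0-9]", "_", attr_key).lower()


print(flatten_key("k8s.pod.name"))                  # k8s_pod_name
print(flatten_key("apps.kubernetes.io/pod-index"))  # apps_kubernetes_io_pod_index
```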
Meanwhile, openobserve-collector's default `values.yaml` specifies complex routing rules and regex parsers with named capture groups to extract metadata from log file names:
openobserve-helm-chart/charts/openobserve-collector/values.yaml
Lines 130 to 170 in b146f80
```yaml
# Find out which format is used by kubernetes
- type: router
  id: get-format
  routes:
    - output: parser-docker
      expr: 'body matches "^\\{"'
    - output: parser-crio
      expr: 'body matches "^[^ Z]+ "'
    - output: parser-containerd
      expr: 'body matches "^[^ Z]+Z"'
# Parse CRI-O format
- type: regex_parser
  id: parser-crio
  regex: "^(?P<time>[^ Z]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$"
  output: extract_metadata_from_filepath
  timestamp:
    parse_from: attributes.time
    layout_type: gotime
    layout: "2006-01-02T15:04:05.999999999Z07:00"
# Parse CRI-Containerd format
- type: regex_parser
  id: parser-containerd
  regex: "^(?P<time>[^ ^Z]+Z) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$"
  output: extract_metadata_from_filepath
  timestamp:
    parse_from: attributes.time
    layout: "%Y-%m-%dT%H:%M:%S.%LZ"
# Parse Docker format
- type: json_parser
  id: parser-docker
  output: extract_metadata_from_filepath
  timestamp:
    parse_from: attributes.time
    layout: "%Y-%m-%dT%H:%M:%S.%LZ"
# Extract metadata from file path
- type: regex_parser
  id: extract_metadata_from_filepath
  regex: '^.*\/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[a-f0-9\-]{36})\/(?P<container_name>[^\._]+)\/(?P<restart_count>\d+)\.log$'
  parse_from: attributes["log.file.path"]
  cache:
    size: 128 # default maximum amount of Pods per Node is 110
```
Seeing that the upstream chart's config can produce logs with the metadata `k8s_pod_name`, `k8s_namespace_name`, etc. (via the `k8sattributes` processor) from a much simpler configuration, why does openobserve-collector's `values.yaml` carry these regexes?