Component(s)
- logs
- otlpexporter
- otlpreceiver
What happened?
Description
We have a standard tiered k8s collector setup:
- opentelemetry-operator installed
- per node collector with all the various receivers we use (otlp, zipkin, jaeger, statsd, others); exports only otlp
- per node collectors ONLY for managing k8s telemetry, driven by io.opentelemetry.discovery.logs annotations on pods. This is the first config provided below
- central collector which only takes in otlp grpc/http. This is the second config provided below
- export everything to datadog
When exporting DIRECTLY to datadog from the node collectors it works fine. We wanted to move to the central collector so we can isolate our API keys and such, but exporting to that collector fails with the following:
{"level":"error","ts":"2025-06-20T21:29:49.452Z","caller":"internal/queue_sender.go:57","msg":"Exporting failed. Dropping data.","resource":{"service.instance.id":"042d389e-99da-4f48-8c9d-ed9da1c98bbc","service.name":"otelcol-contrib","service.version":"0.128.0"},"otelcol.component.id":"otlp/default","otelcol.component.kind":"exporter","otelcol.signal":"logs","error":"not retryable error: Permanent error: rpc error: code = Unimplemented desc = unknown service opentelemetry.proto.collector.logs.v1.LogsService","dropped_items":2,"stacktrace":"go.opentelemetry.io/collector/exporter/exporterhelper/internal.NewQueueSender.func1\n\tgo.opentelemetry.io/collector/exporter@v0.128.0/exporterhelper/internal/queue_sender.go:57\ngo.opentelemetry.io/collector/exporter/exporterhelper/internal/queuebatch.(*disabledBatcher[...]).Consume\n\tgo.opentelemetry.io/collector/exporter@v0.128.0/exporterhelper/internal/queuebatch/disabled_batcher.go:22\ngo.opentelemetry.io/collector/exporter/exporterhelper/internal/queuebatch.(*asyncQueue[...]).Start.func1\n\tgo.opentelemetry.io/collector/exporter@v0.128.0/exporterhelper/internal/queuebatch/async_queue.go:47"}
Both the node and central collectors are on 0.128.0. I've also tried exporting via HTTP and get a similar 404 on the logs path.
My understanding is that anything otlp -> otlp should work out of the box. At least that's been my experience.
I also have the debug exporter currently running on the node collector to verify that the filelog receiver_creator is actually working. I do see the test app logs.
Again, the entire pipeline works wonderfully EXCEPT for sending logs to the central collector. Metrics and traces both transit fine all the way to Datadog.
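For reference, the otlp -> otlp hop in question boils down to something like this minimal sketch on the node side (the component names and the central endpoint here are illustrative, not our real service names):
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: "0.0.0.0:4317"
exporters:
  # points at the central collector's grpc port; endpoint is illustrative
  otlp:
    endpoint: "central-collector.observability:4317"
    tls:
      insecure: true
service:
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [otlp]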
Steps to Reproduce
pod annotations
io.opentelemetry.discovery.logs/config: |
max_log_size: "2MiB"
operators:
- type: container
id: container-parser
- type: json_parser
id: json-parser
io.opentelemetry.discovery.logs/enabled: "true"
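For anyone unfamiliar with the discovery annotations: my understanding is that receiver_creator turns the annotation above into a filelog receiver for the annotated pod, roughly equivalent to a static sketch like the following (the include path is an assumption based on the standard /var/log/pods layout; discovery scopes it to the real pod):
filelog:
  # path pattern is illustrative; discovery fills in the pod-specific path
  include:
    - /var/log/pods/<namespace>_<pod-name>_<pod-uid>/<container>/*.log
  # log.file.path shows up in the debug output below, so file paths appear to be included
  include_file_path: true
  max_log_size: "2MiB"
  operators:
    - type: container
      id: container-parser
    - type: json_parser
      id: json-parser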
node collector config
apiVersion: opentelemetry.io/v1beta1
kind: OpenTelemetryCollector
metadata:
name: otel-k8s-collector
namespace: observability
spec:
tolerations:
- operator: Exists
podAnnotations:
linkerd.io/inject: disabled
targetAllocator:
enabled: false
allocationStrategy: per-node
prometheusCR:
enabled: false
podMonitorSelector: {}
serviceMonitorSelector: {}
serviceAccount: otelcontribcol
mode: daemonset
# needed to read logs
volumes:
- name: varlogpods
hostPath:
path: /var/log/pods
volumeMounts:
- name: varlogpods
mountPath: /var/log/pods
readOnly: true
env:
- name: K8S_NODE_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: spec.nodeName
serviceAccount: otelcontribcol
config:
receivers:
receiver_creator/metrics:
watch_observers: [ k8s_observer ]
discovery:
enabled: true
receivers:
receiver_creator/logs:
watch_observers: [ k8s_observer ]
discovery:
enabled: true
receivers:
kubeletstats/default:
collection_interval: 10s
auth_type: 'serviceAccount'
endpoint: '${env:K8S_NODE_NAME}:10250'
insecure_skip_verify: true
metric_groups:
- node
- pod
- container
prometheus/collector_metrics:
config:
scrape_configs:
- job_name: "otelcol"
scrape_interval: 60s
static_configs:
- targets: ["0.0.0.0:8888"]
labels:
# env will get injected centrally
service_name: "otel-k8s-collector"
# the following two are internal tags we want to use to identify a given collector config + collector pair
collector_mode: "daemonset"
collector_name: "otel-k8s-collector"
processors:
k8sattributes:
auth_type: "serviceAccount"
passthrough: false
extract:
metadata:
- k8s.pod.name
- k8s.pod.uid
- k8s.deployment.name
- k8s.namespace.name
- k8s.node.name
- k8s.pod.start_time
- k8s.cluster.uid
# Pod labels which can be fetched via K8sattributeprocessor
# labels:
# - tag_name: key1
# key: label1
# from: pod
# - tag_name: key2
# key: label2
# from: pod
# Pod association using resource attributes and connection
pod_association:
- sources:
- from: resource_attribute
name: k8s.pod.uid
- from: resource_attribute
name: k8s.pod.ip
- from: connection
attributes/tags:
actions:
- key: "managed-collector"
value: "true"
action: "upsert"
- key: "collector_name"
value: "otel-k8s-collector"
action: "upsert"
batch/metrics:
timeout: 100s
batch/traces:
timeout: 100s
batch/logs:
timeout: 100s
exporters:
otlp/default:
tls:
insecure: true
endpoint: "otel-grpc.default:4317"
retry_on_failure: {}
sending_queue:
queue_size: 100000
otlphttp/default:
endpoint: "otel-http.default:4318"
tls:
insecure: true
debug/default:
verbosity: normal
use_internal_logger: false
extensions:
health_check/default:
endpoint: 0.0.0.0:13133
path: /
k8s_observer:
auth_type: serviceAccount
node: ${env:K8S_NODE_NAME}
observe_ingresses: true
observe_pods: true
observe_services: true
service:
telemetry:
metrics:
readers:
- pull:
exporter:
prometheus:
host: "0.0.0.0"
port: 8888
logs:
encoding: json
level: "warn"
extensions: [health_check/default, k8s_observer]
pipelines:
metrics/default:
receivers: [kubeletstats/default, prometheus/collector_metrics, receiver_creator/metrics]
processors: [attributes/tags, batch/metrics]
exporters: [otlp/default]
logs/default:
receivers: [receiver_creator/logs]
processors: [attributes/tags, k8sattributes, batch/logs]
exporters: [debug/default, otlp/default]
Central collector config (with some unrelated internal bits removed)
receivers:
otlp/default:
protocols:
grpc:
endpoint: "0.0.0.0:4317"
http:
endpoint: "0.0.0.0:4318"
prometheus/collector_metrics:
config:
scrape_configs:
- job_name: "otelcol"
scrape_interval: 60s
static_configs:
- targets: ["0.0.0.0:8888"]
labels:
env: "dev"
service_name: "otel-collector"
# the following two are internal tags we want to use to identify a given collector config + collector pair
collector_mode: "deployment"
collector_name: "otel-collector"
connectors:
datadog/connector:
traces:
ignore_resources: ["(GET|POST) /healthcheck"]
processors:
# centralized tag stuff
attributes/tags:
actions:
- key: "env"
value: "dev"
action: "upsert"
- key: "collector_name"
value: "otel-collector"
action: "upsert"
# this is to remove spans we don't ever want to see and not have to change every app
filter/health_check:
spans:
exclude:
match_type: regexp
span_names:
- "grpc.health.v1.Health"
- "sql-ping"
- "/health"
- "sql-stmt-close" # not a valuable span for us right now
- "IntrospectionQuery" # graphql noise
filter/drop_metrics:
metrics:
exclude:
match_type: regexp
metric_names:
- "go_godebug_.*"
- "go_cgo_go_.*"
batch/traces:
send_batch_size: 100
send_batch_max_size: 1000
timeout: 10s
batch/metrics:
send_batch_size: 100
send_batch_max_size: 1000
timeout: 100s
batch/logs:
send_batch_size: 100
send_batch_max_size: 1000
timeout: 100s
memory_limiter/default:
limit_mib: 12000
spike_limit_mib: 2000
check_interval: 100ms
extensions:
health_check/default:
endpoint: "0.0.0.0:13133"
path: "/"
exporters:
# for testing
debug/default:
verbosity: normal
use_internal_logger: false
# datadog
datadog/default:
api:
key: "${env:DATADOG_API_KEY}"
otlp/jaeger:
tls:
insecure: true
endpoint: "jaeger-collector.jaeger:4319"
retry_on_failure:
sending_queue:
queue_size: 500000
service:
telemetry:
metrics:
readers:
- pull:
exporter:
prometheus:
host: "0.0.0.0"
port: 8888
logs:
encoding: json
level: "warn"
extensions: [health_check/default]
pipelines:
metrics/datadog:
receivers: [otlp/default, prometheus/collector_metrics]
processors: [attributes/tags, filter/drop_metrics, batch/metrics]
exporters: [datadog/default]
traces/jaeger:
receivers: [otlp/default]
processors: [attributes/tags, memory_limiter/default, filter/health_check, batch/traces]
exporters: [otlp/jaeger]
logs:
receivers: [otlp/default]
processors: [attributes/tags, memory_limiter/default, batch/logs]
exporters: [datadog/default]
I see log entries from the debug exporter like so:
otel-k8s-collector-collector-4s995 otc-container {"appLang":"go","level":"info","message":"Finished cleanup. Shutting down.","spanId":"unknown","time":"2025-06-20T21:28:45Z","traceId":"unknown","traceSampled":"false"} log.iostream=stderr traceId=unknown traceSampled=false appLang=go spanId=unknown log.file.path=/var/log/pods/thundercats_go-grpc-service-example-c48d4f79-8zxl2_7dbf8ca6-b72f-4ac3-95d6-e4dda7b50182/go-grpc-service-example/0.log logtag=F time=2025-06-20T21:28:45Z level=info message=Finished cleanup. Shutting down. managed-collector=true collector_name=otel-k8s-collector
Expected Result
Logs would flow to datadog with no error.
Actual Result
The k8s node collector's logs exporter throws an error regardless of transport.
Collector version
0.127.0/0.128.0
Environment information
OpenTelemetry Collector configuration
Log output
Additional context
No response
Activity
lusis commented on Jun 24, 2025
I'm closing this but wanted to offer a bit of information for anyone else who finds this.
Our problem was that we were sending to the wrong collector endpoint. The reason it was confusing is that the error came from the receiving collector not having a logs pipeline defined, even though it had an otlp receiver. It was just confusing that the RPC isn't even enabled when you don't have the pipeline. I would have expected a different error message than a generic Unimplemented.
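For anyone hitting the same Unimplemented error (or a 404 on the HTTP logs path): the otlp receiver only registers the logs service when at least one logs pipeline actually references it, so the collector you point at needs something like this minimal sketch (component names and the debug exporter are illustrative):
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: "0.0.0.0:4317"
exporters:
  debug:
    verbosity: normal
service:
  pipelines:
    # without a logs pipeline using the otlp receiver, the collector never
    # registers opentelemetry.proto.collector.logs.v1.LogsService and grpc
    # answers with Unimplemented
    logs:
      receivers: [otlp]
      exporters: [debug]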