
Unable to send logs to otlp receiver #40848

Closed
@lusis

Description

Component(s)

  • logs
  • otlpexporter
  • otlpreceiver

What happened?

Description

We have a standard tiered k8s collector setup:

  • opentelemetry-operator installed
  • per-node collector with all the various receivers we use (otlp, zipkin, jaeger, statsd, others); it exports only otlp
  • per-node collectors ONLY for managing k8s telemetry (this is the first config provided below)
  • io.opentelemetry.discovery.logs annotations on pods
  • central collector which only takes in otlp grpc/http (this is the second config provided below)
  • export everything to Datadog

Exporting DIRECTLY to Datadog from the node collectors works fine. We wanted to move to the central collector, where we prefer to isolate our API keys and the like, but exporting to that collector fails with the following:

{"level":"error","ts":"2025-06-20T21:29:49.452Z","caller":"internal/queue_sender.go:57","msg":"Exporting failed. Dropping data.","resource":{"service.instance.id":"042d389e-99da-4f48-8c9d-ed9da1c98bbc","service.name":"otelcol-contrib","service.version":"0.128.0"},"otelcol.component.id":"otlp/default","otelcol.component.kind":"exporter","otelcol.signal":"logs","error":"not retryable error: Permanent error: rpc error: code = Unimplemented desc = unknown service opentelemetry.proto.collector.logs.v1.LogsService","dropped_items":2,"stacktrace":"go.opentelemetry.io/collector/exporter/exporterhelper/internal.NewQueueSender.func1\n\tgo.opentelemetry.io/collector/exporter@v0.128.0/exporterhelper/internal/queue_sender.go:57\ngo.opentelemetry.io/collector/exporter/exporterhelper/internal/queuebatch.(*disabledBatcher[...]).Consume\n\tgo.opentelemetry.io/collector/exporter@v0.128.0/exporterhelper/internal/queuebatch/disabled_batcher.go:22\ngo.opentelemetry.io/collector/exporter/exporterhelper/internal/queuebatch.(*asyncQueue[...]).Start.func1\n\tgo.opentelemetry.io/collector/exporter@v0.128.0/exporterhelper/internal/queuebatch/async_queue.go:47"}

Both the node and central collectors are on 0.128.0. I've also tried exporting via HTTP and got a similar 404 on the logs path.
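
For context on the HTTP case: my understanding is that the otlphttp exporter appends /v1/logs to the configured base endpoint unless a per-signal endpoint is set. A minimal sketch of how that would look (the hostname is just the one from our config below, and logs_endpoint is only shown to illustrate the override):

exporters:
  otlphttp/default:
    # the exporter appends /v1/traces, /v1/metrics, /v1/logs to this base endpoint
    endpoint: "http://otel-http.default:4318"
    # optional per-signal override if the receiving side serves logs on a different path
    logs_endpoint: "http://otel-http.default:4318/v1/logs"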

My understanding is that ANYTHING otlp -> otlp should work out of the box. At least that's been my experience.

I also have the debug exporter running on the node collector to verify that the filelog receiver created by receiver_creator is actually working. I do see the test app logs there.

Again, the entire pipeline works wonderfully EXCEPT for sending logs to the central collector. Metrics and traces both transit fine all the way to Datadog.

Steps to Reproduce

pod annotations

        io.opentelemetry.discovery.logs/config: |
          max_log_size: "2MiB"
          operators:
            - type: container
              id: container-parser
            - type: json_parser
              id: json-parser
        io.opentelemetry.discovery.logs/enabled: "true"
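
Roughly speaking, receiver_creator's discovery mode turns that annotation into a per-container filelog receiver. A hand-written equivalent would look something like the sketch below; the include path is illustrative and is actually filled in per discovered container:

receivers:
  filelog:
    # the real path is generated per discovered container; this one is illustrative
    include:
      - /var/log/pods/<namespace>_<pod>_<uid>/<container>/*.log
    max_log_size: "2MiB"
    operators:
      - type: container
        id: container-parser
      - type: json_parser
        id: json-parser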

node collector config

apiVersion: opentelemetry.io/v1beta1
kind: OpenTelemetryCollector
metadata:
  name: otel-k8s-collector
  namespace: observability
spec:
  tolerations:
    - operator: Exists
  podAnnotations:
    linkerd.io/inject: disabled

  targetAllocator:
    enabled: false
    allocationStrategy: per-node
    prometheusCR:
      enabled: false
      podMonitorSelector: {}
      serviceMonitorSelector: {}
    serviceAccount: otelcontribcol

  mode: daemonset

  # needed to read logs
  volumes:
    - name: varlogpods
      hostPath:
        path: /var/log/pods
  volumeMounts:
    - name: varlogpods
      mountPath: /var/log/pods
      readOnly: true
  env:
    - name: K8S_NODE_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: spec.nodeName
  serviceAccount: otelcontribcol
  config:
    receivers:
      receiver_creator/metrics:
        watch_observers: [ k8s_observer ]
        discovery:
          enabled: true
        receivers:
      receiver_creator/logs:
        watch_observers: [ k8s_observer ]
        discovery:
          enabled: true
        receivers:
      kubeletstats/default:
        collection_interval: 10s
        auth_type: 'serviceAccount'
        endpoint: '${env:K8S_NODE_NAME}:10250'
        insecure_skip_verify: true
        metric_groups:
          - node
          - pod
          - container
      prometheus/collector_metrics:
        config:
          scrape_configs:
            - job_name: "otelcol"
              scrape_interval: 60s
              static_configs:
                - targets: ["0.0.0.0:8888"]
                  labels:
                    # env will get injected centrally
                    service_name: "otel-k8s-collector"
                    # the following two are internal tags we want to use to identify a given collector config + collector pair
                    collector_mode: "daemonset"
                    collector_name: "otel-k8s-collector"
    processors:
      k8sattributes:
        auth_type: "serviceAccount"
        passthrough: false
        extract:
          metadata:
            - k8s.pod.name
            - k8s.pod.uid
            - k8s.deployment.name
            - k8s.namespace.name
            - k8s.node.name
            - k8s.pod.start_time
            - k8s.cluster.uid
          # Pod labels which can be fetched via K8sattributeprocessor
          # labels:
          #   - tag_name: key1
          #     key: label1
          #     from: pod
          #   - tag_name: key2
          #     key: label2
          #     from: pod
        # Pod association using resource attributes and connection
        pod_association:
          - sources:
              - from: resource_attribute
                name: k8s.pod.uid
              - from: resource_attribute
                name: k8s.pod.ip
              - from: connection
      attributes/tags:
        actions:
          - key: "managed-collector"
            value: "true"
            action: "upsert"
          - key: "collector_name"
            value: "otel-k8s-collector"
            action: "upsert"
      batch/metrics:
        timeout: 100s
      batch/traces:
        timeout: 100s
      batch/logs:
        timeout: 100s
    exporters:
      otlp/default:
        tls:
          insecure: true
        endpoint: "otel-grpc.default:4317"
        retry_on_failure: {}
        sending_queue:
          queue_size: 100000
      otlphttp/default:
        endpoint: "otel-http.default:4318"
        tls:
          insecure: true
      debug/default:
        verbosity: normal
        use_internal_logger: false
    extensions:
      health_check/default:
        endpoint: 0.0.0.0:13133
        path: /
      k8s_observer:
        auth_type: serviceAccount
        node: ${env:K8S_NODE_NAME}
        observe_ingresses: true
        observe_pods: true
        observe_services: true
    service:
      telemetry:
        metrics:
          readers:
            - pull:
                exporter:
                  prometheus:
                    host: "0.0.0.0"
                    port: 8888
        logs:
          encoding: json
          level: "warn"
      extensions: [health_check/default, k8s_observer]
      pipelines:
        metrics/default:
          receivers: [kubeletstats/default, prometheus/collector_metrics, receiver_creator/metrics]
          processors: [attributes/tags, batch/metrics]
          exporters: [otlp/default]
        logs/default:
          receivers: [receiver_creator/logs]
          processors: [attributes/tags, k8sattributes, batch/logs]
          exporters: [debug/default, otlp/default]

Central collector config (with some unrelated internal bits removed)

receivers:
  otlp/default:
    protocols:
      grpc:
        endpoint: "0.0.0.0:4317"
      http:
        endpoint: "0.0.0.0:4318"
  prometheus/collector_metrics:
    config:
      scrape_configs:
        - job_name: "otelcol"
          scrape_interval: 60s
          static_configs:
            - targets: ["0.0.0.0:8888"]
              labels:
                env: "dev"
                service_name: "otel-collector"
                # the following two are internal tags we want to use to identify a given collector config + collector pair
                collector_mode: "deployment"
                collector_name: "otel-collector"
connectors:
  datadog/connector:
    traces:
      ignore_resources: ["(GET|POST) /healthcheck"]
processors:
  # centralized tag stuff
  attributes/tags:
    actions:
      - key: "env"
        value: "dev"
        action: "upsert"
      - key: "collector_name"
        value: "otel-collector"
        action: "upsert"
  # this is to remove spans we don't ever want to see and not have to change every app
  filter/health_check:
    spans:
      exclude:
        match_type: regexp
        span_names:
          - "grpc.health.v1.Health"
          - "sql-ping"
          - "/health"
          - "sql-stmt-close" # not a valuable span for us right now
          - "IntrospectionQuery" # graphql noise
  filter/drop_metrics:
    metrics:
      exclude:
        match_type: regexp
        metric_names:
          - "go_godebug_.*"
          - "go_cgo_go_.*"

  batch/traces:
    send_batch_size: 100
    send_batch_max_size: 1000
    timeout: 10s
  batch/metrics:
    send_batch_size: 100
    send_batch_max_size: 1000
    timeout: 100s
  batch/logs:
    send_batch_size: 100
    send_batch_max_size: 1000
    timeout: 100s
  memory_limiter/default:
    limit_mib: 12000
    spike_limit_mib: 2000
    check_interval: 100ms
extensions:
  health_check/default:
    endpoint: "0.0.0.0:13133"
    path: "/"
exporters:
  # for testing
  debug/default:
    verbosity: normal
    use_internal_logger: false
  # datadog
  datadog/default:
    api:
      key: "${env:DATADOG_API_KEY}"
  otlp/jaeger:
    tls:
      insecure: true
    endpoint: "jaeger-collector.jaeger:4319"
    retry_on_failure:
    sending_queue:
      queue_size: 500000
service:
  telemetry:
    metrics:
      readers:
        - pull:
            exporter:
              prometheus:
                host: "0.0.0.0"
                port: 8888
    logs:
      encoding: json
      level: "warn"
  extensions: [health_check/default]
  pipelines:
    metrics/datadog:
      receivers: [otlp/default, prometheus/collector_metrics]
      processors: [attributes/tags, filter/drop_metrics, batch/metrics]
      exporters: [datadog/default]
    traces/jaeger:
      receivers: [otlp/default]
      processors: [attributes/tags, memory_limiter/default, filter/health_check, batch/traces]
      exporters: [otlp/jaeger]
    logs:
      receivers: [otlp/default]
      processors: [attributes/tags, memory_limiter/default, batch/logs]
      exporters: [datadog/default]

I see log entries from the debug exporter like so:

otel-k8s-collector-collector-4s995 otc-container {"appLang":"go","level":"info","message":"Finished cleanup. Shutting down.","spanId":"unknown","time":"2025-06-20T21:28:45Z","traceId":"unknown","traceSampled":"false"} log.iostream=stderr traceId=unknown traceSampled=false appLang=go spanId=unknown log.file.path=/var/log/pods/thundercats_go-grpc-service-example-c48d4f79-8zxl2_7dbf8ca6-b72f-4ac3-95d6-e4dda7b50182/go-grpc-service-example/0.log logtag=F time=2025-06-20T21:28:45Z level=info message=Finished cleanup. Shutting down. managed-collector=true collector_name=otel-k8s-collector

Expected Result

Logs would flow to Datadog with no errors.

Actual Result

The node collector's logs exporter throws the error above regardless of transport (gRPC or HTTP).

Collector version

0.127.0/0.128.0


Activity

Labels bug (Something isn't working) and needs triage (New item requiring triage) added on Jun 20, 2025

lusis (Author) commented on Jun 24, 2025

I'm closing this but wanted to offer a bit of information for anyone else who finds this.

Our problem was that we were sending to the wrong collector endpoint. What made it confusing is that the error came from the receiving collector not having a logs pipeline defined, even though it did have an otlp receiver. The LogsService RPC simply isn't enabled when no logs pipeline uses that receiver, and I would have expected a more descriptive error message than a generic Unimplemented.
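
For anyone hitting the same Unimplemented error: the collector only registers opentelemetry.proto.collector.logs.v1.LogsService on the gRPC endpoint when at least one logs pipeline lists the otlp receiver. A minimal sketch of a receiving side that does enable it (the debug exporter here is just a stand-in):

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: "0.0.0.0:4317"
exporters:
  debug:
    verbosity: normal
service:
  pipelines:
    # without a logs pipeline that references the otlp receiver, the
    # LogsService RPC is not served and senders get Unimplemented
    logs:
      receivers: [otlp]
      exporters: [debug]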
