datadog-agent: Tagger error on Kubernetes

Describe what happened: I am trying to set up log collection for Docker containers on a Kubernetes cluster.

Snippet from my config:

    logs:
    - type: docker
      image: "datadog/agent"
      service: datadog
      source: datadog
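
For context, the logs section above only tells the agent which containers to tail; the logs agent itself also has to be enabled. A minimal sketch of that extra piece, assuming the standard logs_enabled setting in datadog.yaml (or, equivalently, the DD_LOGS_ENABLED environment variable that appears later in this thread):

    # datadog.yaml (or set DD_LOGS_ENABLED=true on the agent container)
    logs_enabled: true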

Logs are not being sent to Datadog, and I see a lot of these errors in the DD container log:

2018-01-10 16:47:50 UTC | WARN | (tagger.go:248 in Tag) | error collecting from kubelet: container docker://3cacb7a1688be922adcb89f39950d4526cafb1c0e92ee050f630fe0e90132b0b not found in podlist

Describe what you expected: Working log collection.

Steps to reproduce the issue: Deploy the DD agent on Kubernetes with a DaemonSet.
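
Roughly the kind of DaemonSet involved (a trimmed, hedged sketch rather than the exact manifest used here; the API group, image tag, secret name, and mounts are assumptions based on the usual Agent 6 DaemonSet examples, not taken from this report):

    apiVersion: extensions/v1beta1        # assumed; DaemonSets still lived in this group on k8s 1.8
    kind: DaemonSet
    metadata:
      name: dd-agent
    spec:
      template:
        metadata:
          labels:
            app: dd-agent
        spec:
          containers:
            - name: dd-agent
              image: datadog/agent:6.0.0-beta.7   # assumed tag, matching the agent status output below
              env:
                - name: DD_API_KEY
                  valueFrom:
                    secretKeyRef:
                      name: datadog-secret        # illustrative secret name
                      key: api-key
                - name: DD_LOGS_ENABLED
                  value: "true"
              volumeMounts:
                - name: dockersocket
                  mountPath: /var/run/docker.sock # required for docker metric and log collection
                  readOnly: true
          volumes:
            - name: dockersocket
              hostPath:
                path: /var/run/docker.sock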

Additional environment details (Operating System, Cloud provider, etc):

Kubernetes 1.8.6, Docker 0.13

agent status

===================== Agent (v6.0.0-beta.7) =====================

Status date: 2018-01-10 16:49:09.760419 UTC
Pid: 7
Python Version: 2.7.14
Logs:
Check Runners: 10
Log Level: info

Paths

Config File: /etc/datadog-agent/datadog.yaml
conf.d: /etc/datadog-agent/conf.d
checks.d: /etc/datadog-agent/checks.d

Clocks

NTP offset: 0.00326513 s
System UTC time: 2018-01-10 16:49:09.760419 UTC

Host Info

bootTime: 2018-01-09 20:04:23.000000 UTC
kernelVersion: 4.4.65-k8s
os: linux
platform: debian
platformFamily: debian
platformVersion: 9.3
procs: 63
uptime: 74247
virtualizationRole: guest
virtualizationSystem: xen

Hostnames

ec2-hostname: ip-172-20-174-85.ec2.internal
hostname: i-0d84be79a0113f36b
instance-id: i-0d84be79a0113f36b
socket-fqdn: dd-agent-j6b4h
socket-hostname: dd-agent-j6b4h

========= Collector

Running Checks

cpu
---
  Total Runs: 29
  Metrics: 6, Total Metrics: 168
  Events: 0, Total Events: 0
  Service Checks: 0, Total Service Checks: 0

disk
----
  Total Runs: 29
  Metrics: 160, Total Metrics: 4640
  Events: 0, Total Events: 0
  Service Checks: 0, Total Service Checks: 0

docker
------
  Total Runs: 29
  Metrics: 266, Total Metrics: 7342
  Events: 0, Total Events: 5
  Service Checks: 1, Total Service Checks: 29

file_handle
-----------
  Total Runs: 29
  Metrics: 1, Total Metrics: 29
  Events: 0, Total Events: 0
  Service Checks: 0, Total Service Checks: 0

io
--
  Total Runs: 29
  Metrics: 52, Total Metrics: 1472
  Events: 0, Total Events: 0
  Service Checks: 0, Total Service Checks: 0

kube_dns
--------
  Total Runs: 29
  Metrics: 41, Total Metrics: 1189
  Events: 0, Total Events: 0
  Service Checks: 0, Total Service Checks: 0

load
----
  Total Runs: 29
  Metrics: 6, Total Metrics: 174
  Events: 0, Total Events: 0
  Service Checks: 0, Total Service Checks: 0

memory
------
  Total Runs: 29
  Metrics: 14, Total Metrics: 406
  Events: 0, Total Events: 0
  Service Checks: 0, Total Service Checks: 0

network
-------
  Total Runs: 29
  Metrics: 20, Total Metrics: 580
  Events: 0, Total Events: 0
  Service Checks: 0, Total Service Checks: 0

ntp
---
  Total Runs: 29
  Metrics: 1, Total Metrics: 28
  Events: 0, Total Events: 0
  Service Checks: 1, Total Service Checks: 29

uptime
------
  Total Runs: 29
  Metrics: 1, Total Metrics: 29
  Events: 0, Total Events: 0
  Service Checks: 0, Total Service Checks: 0

Loading Errors

docker_daemon
-------------
  Core Check Loader:
    Check docker_daemon not found in Catalog
    
  JMX Check Loader:
    check is not a jmx check, or unable to determine if it's so
    
  Python Check Loader:
    No module named docker_daemon

======== JMXFetch

Initialized checks

no checks

Failed checks

no checks

========= Forwarder

CheckRunsV1: 29
IntakeV1: 6
RetryQueueSize: 0
Success: 64
TimeseriesV1: 29

API Keys status

https://6-0-0-app.agent.datadoghq.com,*************************4aa08: API Key valid

========= DogStatsD

Checks Metric Sample: 16694
Event: 6
Events Flushed: 6
Number Of Flushes: 29
Series Flushed: 12074
Service Check: 377
Service Checks Flushed: 393
Dogstatsd Metric Sample: 697

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 3
  • Comments: 17 (6 by maintainers)

Most upvoted comments

I got the same error.

[ AGENT ] 2018-03-08 10:15:13 UTC | WARN | (tagger.go:246 in Tag) | error collecting from kube-service-collector: container docker://320fd8ad981a5b62d67d7d89b287af341f0e62581fc4b56e866ad774d78b540f not found in podList
[ AGENT ] 2018-03-08 10:15:13 UTC | WARN | (tagger.go:246 in Tag) | error collecting from kubelet: container docker://320fd8ad981a5b62d67d7d89b287af341f0e62581fc4b56e866ad774d78b540f not found in podList
[ AGENT ] 2018-03-08 10:15:13 UTC | WARN | (tagger.go:246 in Tag) | error collecting from kubelet: container docker://7e4d3e55e060ce3077bdc675526ab1562677c9d0e072596d275de46ed7c3d64c not found in podList
[ AGENT ] 2018-03-08 10:15:13 UTC | WARN | (tagger.go:246 in Tag) | error collecting from kube-service-collector: container docker://7e4d3e55e060ce3077bdc675526ab1562677c9d0e072596d275de46ed7c3d64c not found in podList
[ AGENT ] 2018-03-08 10:15:13 UTC | WARN | (tagger.go:246 in Tag) | error collecting from kubelet: container docker://330e0acf6f5f8715f9df19711cf5874835c793e22e355424a306df1c0348a31c not found in podList
[ AGENT ] 2018-03-08 10:15:13 UTC | WARN | (tagger.go:246 in Tag) | error collecting from kube-service-collector: container docker://330e0acf6f5f8715f9df19711cf5874835c793e22e355424a306df1c0348a31c not found in podList
[ AGENT ] 2018-03-08 10:15:13 UTC | WARN | (tagger.go:246 in Tag) | error collecting from kubelet: container docker://f31c94bd9762573cf96bfbc8a85c2d699285102eec6ef47af0bdca809b289a1e not found in podList
[ AGENT ] 2018-03-08 10:15:13 UTC | WARN | (tagger.go:246 in Tag) | error collecting from kube-service-collector: container docker://f31c94bd9762573cf96bfbc8a85c2d699285102eec6ef47af0bdca809b289a1e not found in podList
[ AGENT ] 2018-03-08 10:15:13 UTC | WARN | (tagger.go:246 in Tag) | error collecting from kubelet: container docker://6355e733d9f4d06267ff7ed2c156344516502e9986692921a69a6fadb4c1df46 not found in podList
[ AGENT ] 2018-03-08 10:15:13 UTC | WARN | (tagger.go:246 in Tag) | error collecting from kube-service-collector: container docker://6355e733d9f4d06267ff7ed2c156344516502e9986692921a69a6fadb4c1df46 not found in podList
[ AGENT ] 2018-03-08 10:15:14 UTC | WARN | (tagger.go:246 in Tag) | error collecting from kubelet: container docker://3ab71114c06ebd2ed0b641f604ef7e13034399c86de2bf38d323139e3db86735 not found in podList
  • Kubernetes v1.8.7-gke
  • Datadog Agent v6.0.2
  • Install agent using Helm (stable/datadog)
values.yaml
image:
  repository: datadog/agent               # Agent6
  tag: 6.0.2  # Use 6.0.0-jmx to enable jmx fetch collection
  pullPolicy: IfNotPresent

daemonset:
  enabled: true
  updateStrategy: RollingUpdate

deployment:
  enabled: false
  replicas: 1

kubeStateMetrics:
  enabled: true

datadog:
  apiKey: xxxxx
  name: dd-agent
  logLevel: WARNING
  collectEvents: false
  env:
    - name: DD_LOGS_ENABLED
      value: "true"
  leaderLeaseDuration: 600s
  confd:
    kubernetes.yaml: |-
      init_config:
      instances:
        - port: 4194
          collect_events: True
          namespace_name_regexp: .*
    docker_daemon.yaml: |-
      logs:
        - type: docker
          service: docker
          source: docker
  resources:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      cpu: 256m
      memory: 512Mi

rbac:
  create: true
  serviceAccountName: default

tolerations: []

kube-state-metrics:
  rbac:
    create: true
    serviceAccountName: default

@mfpierre when will the 6.1.1 release be available as a Helm chart? Why doesn’t every new release result in a new version of the Helm chart? Related to #1447

Hi everyone, we have a fix https://github.com/DataDog/datadog-agent/pull/1345 that should resolve the logging issues and will be included in the next 6.1 release.

We’re aware of an issue on the Kubernetes side where static pods are not correctly updated in the kubelet podlist (#1447), but we’ll keep an eye on this to see if there could be other issues.

Hi everyone,

Just a quick note that my Datadog agent is reporting the same kind of errors. I have an open ticket about this here: https://help.datadoghq.com/hc/en-us/requests/130699 (with logs from the agent).

I have a k8s 1.9.2 cluster. My Datadog agent is v6, deployed with the stable chart you provide.

I deploy my agent with this command:

helm install --name ${DD_AGENT_RELEASE_NAME} -f deploy/resources/datadog/values.yml stable/datadog

with the following values.yml:

# Copied from here: https://github.com/kubernetes/charts/blob/master/stable/datadog/values.yaml

# Default values for datadog.
image:
  # This chart is compatible with different images, please choose one
  # repository: datadog/docker-dd-agent  # Agent5
  repository: datadog/agent          # Agent6 (beta)
  # repository: datadog/dogstatsd      # Standalone DogStatsD6 (beta)
  tag: latest-jmx
  pullPolicy: Always

# NB! Normally you need to keep Datadog DaemonSet enabled!
# The exceptional case could be a situation when you need to run
# single DataDog pod per every namespace, but you do not need to
# re-create a DaemonSet for every non-default namespace install.
# Note, that StatsD and DogStatsD work over UDP, so you may not
# get guaranteed delivery of the metrics in Datadog-per-namespace setup!
daemonset:
  enabled: true
  ## Bind ports on the hostNetwork. Useful for CNI networking where hostPort might
  ## not be supported. The ports will need to be available on all hosts. Can be
  ## used for custom metrics instead of a service endpoint.
  ## WARNING: Make sure that hosts using this are properly firewalled otherwise
  ## metrics and traces will be accepted from any host able to connect to this host.
  # useHostNetwork: true

  ## Sets the hostPort to the same value of the container port. Can be used as
  ## for sending custom metrics. The ports will need to be available on all
  ## hosts.
  ## WARNING: Make sure that hosts using this are properly firewalled otherwise
  ## metrics and traces will be accepted from any host able to connect to this host.
  useHostPort: true

  ## Annotations to add to the DaemonSet's Pods
  # podAnnotations:
  #   scheduler.alpha.kubernetes.io/tolerations: '[{"key": "example", "value": "foo"}]'

  ## Allow the DaemonSet to schedule on tainted nodes (requires Kubernetes >= 1.6)
  # tolerations: []

  ## Allow the DaemonSet to perform a rolling update on helm update
  ## ref: https://kubernetes.io/docs/tasks/manage-daemon/update-daemon-set/
  # updateStrategy: RollingUpdate

# Apart from DaemonSet, deploy Datadog agent pods and related service for
# applications that want to send custom metrics. Provides DogStatsD service.
#
# HINT: If you want to use datadog.collectEvents, keep deployment.replicas set to 1.
deployment:
  enabled: false
  replicas: 1

## deploy the kube-state-metrics deployment
## ref: https://github.com/kubernetes/charts/tree/master/stable/kube-state-metrics
##
kubeStateMetrics:
  enabled: true

datadog:
  ## You'll need to set this to your Datadog API key before the agent will run.
  ## ref: https://app.datadoghq.com/account/settings#agent/kubernetes
  ##
  apiKey: 'TOCHANGE'

  ## dd-agent container name
  ##
  name: dd-agent

  ## Set logging verbosity.
  ## ref: https://github.com/DataDog/docker-dd-agent#environment-variables
  ##
  logLevel: WARNING

  ## Un-comment this to make each node accept non-local statsd traffic.
  ## ref: https://github.com/DataDog/docker-dd-agent#environment-variables
  ##
  # nonLocalTraffic: true

  ## Set host tags.
  ## ref: https://github.com/DataDog/docker-dd-agent#environment-variables
  ##
  # tags:

  ## Enables event collection from the kubernetes API
  ## ref: https://github.com/DataDog/docker-dd-agent#environment-variables
  ##
  collectEvents: true

  ## Un-comment this to enable APM and tracing, on ports 7777 and 8126
  ## ref: https://github.com/DataDog/docker-dd-agent#tracing-from-the-host
  ##
  apmEnabled: true

  ## The dd-agent supports many environment variables
  ## ref: https://github.com/DataDog/docker-dd-agent#environment-variables
  ##
  env:
    # https://docs.datadoghq.com/guides/process/
    - name: DD_PROCESS_AGENT_ENABLED
      value: "true"
    # https://app.datadoghq.com/logs/onboarding/container
    - name: DD_LOGS_ENABLED
      value: "true"
    # https://github.com/DataDog/datadog-agent/tree/master/Dockerfiles/agent#kubernetes-integration
    - name: DD_LEADER_ELECTION
      value: "true"
    # https://github.com/DataDog/datadog-agent/tree/master/Dockerfiles/agent#event-collection
    - name: DD_COLLECT_KUBERNETES_EVENTS
      value: "true"

  ## The dd-agent supports detailed process and container monitoring and
  ## requires control over the volume and volumeMounts for the daemonset
  ## or deployment.
  ## ref: https://docs.datadoghq.com/guides/process/
  ##
  volumes:
    - hostPath:
        path: /etc/passwd
      name: passwd
  volumeMounts:
    - name: passwd
      mountPath: /etc/passwd
      readOnly: true

  ## Enable leader election mechanism for event collection
  ##
  leaderElection: true

  ## Set the lease time for leader election
  ##
  # leaderLeaseDuration: 600

  ## Provide additional service definitions
  ## Each key will become a file in /conf.d/auto_conf
  ## ref: https://github.com/DataDog/docker-dd-agent#configuration-files
  ##
  # autoconf:
  #   kubernetes_state.yaml: |-
  #     docker_images:
  #       - kube-state-metrics
  #     init_config:
  #     instances:
  #       - kube_state_url: http://%%host%%:%%port%%/metrics

  ## Provide additional service definitions
  ## Each key will become a file in /conf.d
  ## ref: https://github.com/DataDog/docker-dd-agent#configuration-files
  ##
  confd:
  #   redisdb.yaml: |-
  #     init_config:
  #     instances:
  #       - host: "name"
  #         port: "6379"
  # https://app.datadoghq.com/logs/onboarding/container
  # https://github.com/DataDog/datadog-agent/tree/master/Dockerfiles/agent#configuration-file-example
    logs.yaml: |-
      init_config:
      instances:
        [{}]
      logs:
        - type: docker
          service: myapp
          source: myapp-logs

  ## Provide additional service checks
  ## Each key will become a file in /checks.d
  ## ref: https://github.com/DataDog/docker-dd-agent#configuration-files
  ##
  # checksd:
  #   service.py: |-

  ## dd-agent resource requests and limits
  ## Ref: http://kubernetes.io/docs/user-guide/compute-resources/
  ##
  resources:
    requests:
      cpu: 100m
      memory: 64Mi
    limits:
      cpu: 256m
      memory: 256Mi

rbac:
  ## If true, create & use RBAC resources
  create: false

  ## Ignored if rbac.create is true
  serviceAccountName: default

tolerations: []

kube-state-metrics:
  rbac:
    create: false

    ## Ignored if rbac.create is true
    serviceAccountName: default

I’m facing the same issue on beta9, except I’m not getting logs even with the same config as two comments above, i.e. lots of the following in the agent logs:

datadog-agent-6wnqb datadog-agent 2018-02-12 13:21:27 UTC | WARN | (tagger.go:247 in Tag) | error collecting from kubelet: container docker://338ee23f86516d2740c17fd5a920655316f9ed1597d00b46bab600033e80573f not found in podlist
datadog-agent-6wnqb datadog-agent 2018-02-12 13:21:27 UTC | WARN | (tagger.go:247 in Tag) | error collecting from kubelet: container docker://6c85647890f782587a4066cad6ad8a38e48a1ff448cced139d43d5670d5cc279 not found in podlist
datadog-agent-6wnqb datadog-agent 2018-02-12 13:21:27 UTC | WARN | (tagger.go:247 in Tag) | error collecting from kubelet: container docker://6c85647890f782587a4066cad6ad8a38e48a1ff448cced139d43d5670d5cc279 not found in podlist
datadog-agent-6wnqb datadog-agent 2018-02-12 13:21:27 UTC | WARN | (tagger.go:247 in Tag) | error collecting from kubelet: container docker://e85550c58ba96f0c73df8a5fe0e5220a320da601a6626e2a97ce5a36362f606c not found in podlist
datadog-agent-6wnqb datadog-agent 2018-02-12 13:21:27 UTC | WARN | (tagger.go:247 in Tag) | error collecting from kubelet: container docker://e85550c58ba96f0c73df8a5fe0e5220a320da601a6626e2a97ce5a36362f606c not found in podlist
datadog-agent-6wnqb datadog-agent 2018-02-12 13:21:27 UTC | WARN | (tagger.go:247 in Tag) | error collecting from kubelet: container docker://e447687e5547fb6e5a003c2baf67566df809516c90b5f34d03b6c67536f44d60 not found in podlist
datadog-agent-6wnqb datadog-agent 2018-02-12 13:21:27 UTC | WARN | (tagger.go:247 in Tag) | error collecting from kubelet: container docker://e447687e5547fb6e5a003c2baf67566df809516c90b5f34d03b6c67536f44d60 not found in podlist
datadog-agent-6wnqb datadog-agent 2018-02-12 13:21:27 UTC | WARN | (tagger.go:247 in Tag) | error collecting from kubelet: container docker://ae191ca31af5253190bd7e531d1dd221a8c3190d0036dd67d95be3d0a06a106d not found in podlist
datadog-agent-6wnqb datadog-agent 2018-02-12 13:21:27 UTC | WARN | (tagger.go:247 in Tag) | error collecting from kubelet: container docker://8c4afa89049125610a9b8d0459259c5495e79a6d7d52f03a7a9748fc72578835 not found in podlist
datadog-agent-6wnqb datadog-agent 2018-02-12 13:21:27 UTC | WARN | (tagger.go:247 in Tag) | error collecting from kubelet: container docker://012eee5cf8a6880b3ddbdc931192ed1fdfa58a7f1e5d517a40ade75c0ed6f418 not found in podlist
datadog-agent-6wnqb datadog-agent 2018-02-12 13:21:27 UTC | WARN | (tagger.go:247 in Tag) | error collecting from kubelet: container docker://4f4d0fd57b579145253bb04dfc955c1cebabf7924af20f38ec8114e4dee8214e not found in podlist
datadog-agent-6wnqb datadog-agent 2018-02-12 13:21:27 UTC | WARN | (tagger.go:247 in Tag) | error collecting from kubelet: container docker://2cbf1bea1ffea441c6787800bc23b15597a868dcacd7a4671fc08fb77df9b77a not found in podlist
datadog-agent-6wnqb datadog-agent 2018-02-12 13:21:27 UTC | WARN | (tagger.go:247 in Tag) | error collecting from kubelet: container docker://3a79db3bf02f5a428bcf79624965b26dd4ca042187b8d55c18084ac226c34e78 not found in podlist
datadog-agent-6wnqb datadog-agent 2018-02-12 13:21:27 UTC | WARN | (tagger.go:247 in Tag) | error collecting from kubelet: container docker://f727753e7b87975e37783994b0a2342d28a192c93d26b572bc4be40cb56ec81f not found in podlist
datadog-agent-6wnqb datadog-agent 2018-02-12 13:21:27 UTC | WARN | (tagger.go:247 in Tag) | error collecting from kubelet: container docker://f727753e7b87975e37783994b0a2342d28a192c93d26b572bc4be40cb56ec81f not found in podlist
datadog-agent-6wnqb datadog-agent 2018-02-12 13:21:27 UTC | WARN | (tagger.go:247 in Tag) | error collecting from kubelet: container docker://8b1a01e0dea30d6197ed9edca8dfbe49d5ba818c19d099541f2cdd9c56fe9013 not found in podlist
datadog-agent-6wnqb datadog-agent 2018-02-12 13:21:27 UTC | WARN | (tagger.go:247 in Tag) | error collecting from kubelet: container docker://4d08f6a6b058d0475a74b123c619596b892b69fceefe867dd7c51c1f46fee326 not found in podlist
datadog-agent-6wnqb datadog-agent 2018-02-12 13:21:27 UTC | WARN | (tagger.go:247 in Tag) | error collecting from kubelet: container docker://4d08f6a6b058d0475a74b123c619596b892b69fceefe867dd7c51c1f46fee326 not found in podlist
datadog-agent-6wnqb datadog-agent 2018-02-12 13:21:27 UTC | WARN | (tagger.go:247 in Tag) | error collecting from kubelet: container docker://98d55025d794ef4fe1ae42854bb2ad21934d83c89144661b29aa0dade14d3a29 not found in podlist
datadog-agent-6wnqb datadog-agent 2018-02-12 13:21:27 UTC | WARN | (tagger.go:247 in Tag) | error collecting from kubelet: container docker://4b02e64f81611660cb49d0ac403fa439051338c3777bd33001a06c5203d95f69 not found in podlist

Pod metrics/tagging seem to work fine though.

Edit: disregard the above; I wasn’t paying close enough attention, and I was mounting the checks configs in /opt/datadog-agent/conf.d instead of just /conf.d.
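
For anyone else hitting the same thing, a rough sketch of what that corrected mount can look like in the DaemonSet pod spec (the ConfigMap name here is purely illustrative, not from this thread):

    # on the agent container
    volumeMounts:
      - name: confd
        mountPath: /conf.d      # the agent image copies anything in /conf.d into its real conf.d at startup
        readOnly: true
    # on the pod spec
    volumes:
      - name: confd
        configMap:
          name: dd-confd        # illustrative ConfigMap holding e.g. logs.yaml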

When I mount them in the correct location, I get some different errors though:

datadog-agent-kjlsf datadog-agent 2018-02-12 13:52:02 UTC | ERROR | (integration_config.go:105 in buildLogsSources) | Invalid file path: ..2982_12_02_13_52_00.704149069/datadog.yaml
datadog-agent-kjlsf datadog-agent 2018-02-12 13:52:02 UTC | ERROR | (integration_config.go:105 in buildLogsSources) | Invalid file path: ..2982_12_02_13_52_00.704149069/docker_daemon.yaml

They’re under these odd paths because that’s how k8s mounts ConfigMaps as files. Despite these errors, the logs-agent does seem to attempt to read the container logs though:

datadog-agent-kjlsf datadog-agent 2018-02-12 13:52:23 UTC | ERROR | (scanner.go:118 in listContainers) | Can't tail containers, Error response from daemon: client is newer than server (client API version: 1.25, server API version: 1.24)
datadog-agent-kjlsf datadog-agent 2018-02-12 13:52:23 UTC | ERROR | (scanner.go:119 in listContainers) | Is datadog-agent part of docker user group?

It seems like our Docker (1.12.6, CoreOS) is too old? @macat if you don’t mind me asking, what Docker version are you running? In your initial comment you said 0.13, but that seems wrong, I’m guessing it’s 1.13.x?

OK, thank you. Actually, I realized that error does not prevent my logs from being shipped. After removing the image filter, all logs are being transferred to Datadog, which is great. The agent even picked up the Kubernetes attributes from the containers, so I’m very happy.
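
Based on that, the working version of the original snippet is presumably just the docker source with the image filter dropped, something along these lines (the service/source values are only examples):

    logs:
      - type: docker
        service: docker      # example value; use whatever service/source tags fit your containers
        source: docker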