datadog-agent: Agent does not start with read-only file system

Our security team asked me to make the root file system of all containers read-only. However, I found that the Datadog agent dies on startup and cannot run on a read-only file system.

Log output

2023-01-18T09:21:53.820+01:00 | [s6-init] making user provided files available at /var/run/s6/etc...exited 0.
2023-01-18T09:21:53.906+01:00 | [s6-init] ensuring user provided files have correct perms...exited 0.
2023-01-18T09:21:53.945+01:00 | [fix-attrs.d] applying ownership & permissions fixes...
2023-01-18T09:21:53.959+01:00 | [fix-attrs.d] done.
2023-01-18T09:21:53.959+01:00 | [cont-init.d] executing container initialization scripts...
2023-01-18T09:21:53.959+01:00 | [cont-init.d] 01-check-apikey.sh: executing...
2023-01-18T09:21:53.960+01:00 | [cont-init.d] 01-check-apikey.sh: exited 0.
2023-01-18T09:21:53.962+01:00 | [cont-init.d] 50-ci.sh: executing...
2023-01-18T09:21:53.972+01:00 | ln: failed to create symbolic link '/etc/datadog-agent/datadog.yaml': Read-only file system
2023-01-18T09:21:53.972+01:00 | [cont-init.d] 50-ci.sh: exited 0.
2023-01-18T09:21:53.972+01:00 | [cont-init.d] 50-ecs.sh: executing...
2023-01-18T09:21:53.990+01:00 | rm: cannot remove '/etc/datadog-agent/conf.d/cpu.d/conf.yaml.default': Read-only file system
2023-01-18T09:21:53.990+01:00 | rm: cannot remove '/etc/datadog-agent/conf.d/network.d/conf.yaml.default': Read-only file system
2023-01-18T09:21:53.990+01:00 | rm: cannot remove '/etc/datadog-agent/conf.d/io.d/conf.yaml.default': Read-only file system
2023-01-18T09:21:53.990+01:00 | rm: cannot remove '/etc/datadog-agent/conf.d/ntp.d/conf.yaml.default': Read-only file system
2023-01-18T09:21:53.990+01:00 | rm: cannot remove '/etc/datadog-agent/conf.d/uptime.d/conf.yaml.default': Read-only file system
2023-01-18T09:21:53.990+01:00 | rm: cannot remove '/etc/datadog-agent/conf.d/disk.d/conf.yaml.default': Read-only file system
2023-01-18T09:21:53.990+01:00 | rm: cannot remove '/etc/datadog-agent/conf.d/load.d/conf.yaml.default': Read-only file system
2023-01-18T09:21:53.990+01:00 | rm: cannot remove '/etc/datadog-agent/conf.d/file_handle.d/conf.yaml.default': Read-only file system
2023-01-18T09:21:53.990+01:00 | rm: cannot remove '/etc/datadog-agent/conf.d/memory.d/conf.yaml.default': Read-only file system
2023-01-18T09:21:53.993+01:00 | [cont-init.d] 50-ecs.sh: exited 123.
2023-01-18T09:21:54.020+01:00 | [cont-finish.d] executing container finish scripts...
2023-01-18T09:21:54.022+01:00 | [cont-finish.d] done.
2023-01-18T09:21:54.023+01:00 | [s6-finish] waiting for services.
2023-01-18T09:21:54.227+01:00 | [s6-finish] sending all processes the TERM signal.
2023-01-18T09:21:57.262+01:00 | [s6-finish] sending all processes the KILL signal and exiting.
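
In the log above, the only init script that reports a non-zero exit is 50-ecs.sh (exit code 123), immediately before the finish scripts run. As a minimal sketch for picking that out of a longer log, the Python snippet below scans for `[cont-init.d] <script>: exited <code>.` lines; the function name `failed_init_scripts` and the shortened sample are illustrative assumptions, not part of the agent.

```python
import re

# Matches lines such as "[cont-init.d] 50-ecs.sh: exited 123."
EXIT_RE = re.compile(r"\[cont-init\.d\] (?P<script>\S+): exited (?P<code>\d+)\.")


def failed_init_scripts(log_text):
    """Return (script, exit_code) pairs for init scripts that exited non-zero."""
    failures = []
    for line in log_text.splitlines():
        match = EXIT_RE.search(line)
        if match and int(match.group("code")) != 0:
            failures.append((match.group("script"), int(match.group("code"))))
    return failures


# Two lines copied from the log output above.
sample = """\
2023-01-18T09:21:53.972+01:00 | [cont-init.d] 50-ci.sh: exited 0.
2023-01-18T09:21:53.993+01:00 | [cont-init.d] 50-ecs.sh: exited 123.
"""

print(failed_init_scripts(sample))  # [('50-ecs.sh', 123)]
```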

Agent Environment

I am pulling the agent from public.ecr.aws/datadog/agent:latest. I do not see a version number in the log. I included it as a sidecar in my AWS ECS task definition.

Describe what happened: After setting "readonlyRootFilesystem": true, in the task definition, the Datadog agent isn’t able to start.

Describe what you expected: Datadog agent should run as normal.

Steps to reproduce the issue: Run the agent as a sidecar in AWS ECS. Set "readonlyRootFilesystem": true, in your container task definition.

Additional environment details (Operating System, Cloud provider, etc): AWS ECS

About this issue

  • State: open
  • Created a year ago
  • Reactions: 24
  • Comments: 22 (1 by maintainers)

Most upvoted comments

Funny, I’m just now checking this off on my InfoSec checklist… Perfect timing?

+1 waiting for Datadog agent to work with read-only FS.

@kayman-mk 100% agree, this is definitely the concern we have. I suspect the solution might end up being the configuration I recommended and a promise from DD that the filesystem will not be changed without proper notice. And some extra caution that our stacks are nothing alike, results may vary.

FWIW, our pipelines for our agents always grab the latest DD image, build, and deploy on a routine schedule. We haven’t had any issues since, and there have been updates.

I suppose a script that monitors syslog messages for permission errors on writes to files outside of the mounted volumes would save some headaches, but I’m going to cross that bridge when DD breaks. I have a feeling the agents are well engineered and won’t be throwing many surprises.
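
A rough sketch of what such a monitoring script might look like, in Python. It assumes the errors appear in the same form as in the issue's log (a quoted path followed by "Read-only file system"); the function name `unexpected_write_failures` and the mount list in the example are hypothetical and would need to match your own volume configuration.

```python
import re
from pathlib import PurePosixPath

# Matches coreutils-style errors such as:
#   rm: cannot remove '/etc/.../conf.yaml.default': Read-only file system
READ_ONLY_RE = re.compile(r"'(?P<path>/[^']*)': Read-only file system")


def unexpected_write_failures(log_lines, writable_mounts):
    """Return paths that failed with a read-only error outside the given mounts."""
    mounts = [PurePosixPath(m) for m in writable_mounts]
    flagged = []
    for line in log_lines:
        match = READ_ONLY_RE.search(line)
        if not match:
            continue
        path = PurePosixPath(match.group("path"))
        # A path is covered if it equals a mount point or lies beneath one.
        covered = any(m == path or m in path.parents for m in mounts)
        if not covered:
            flagged.append(str(path))
    return flagged


# Two error lines from the issue's log; the mount list is a made-up example.
lines = [
    "ln: failed to create symbolic link '/etc/datadog-agent/datadog.yaml': Read-only file system",
    "rm: cannot remove '/etc/datadog-agent/conf.d/cpu.d/conf.yaml.default': Read-only file system",
]
print(unexpected_write_failures(lines, ["/etc/datadog-agent/conf.d"]))
```

Running something like this against the container logs after each image update would surface any new path the agent tries to write to before it becomes an outage.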

Good solution, @tomwire, but I am a little afraid that I will run into problems if I update the agent and the new version needs a different set of files than the one before.
