amazon-ecs-agent: ECS agent fails to launch with latest AMI/Docker
After rebasing our AMI on the latest ECS-optimized AMI to get Docker 1.11.1, I’m seeing the ECS agent fail to start with this message:
docker: Error response from daemon: rpc error: code = 2 desc = "oci runtime error: rootfs (\"/var/lib/docker/devicemapper/mnt/14f9d36675aabe5b45170f0e2ee9206ed61421959c497a7c468091ad0df7d425/rootfs\") does not exist".
The beginning of my docker log file looks like this:
Mon Jun 6 01:10:48 UTC 2016\n
time="2016-06-06T01:10:48.591674132Z" level=info msg="New containerd process, pid: 2720\n"
time="2016-06-06T01:10:49Z" level=warning msg="containerd: low RLIMIT_NOFILE changing to max" current=1024 max=4096
time="2016-06-06T01:10:49.712144800Z" level=info msg="devmapper: Creating filesystem ext4 on device docker-202:1-263764-base"
\nMon Jun 6 01:11:04 UTC 2016\n
time="2016-06-06T01:11:04.943144528Z" level=info msg="previous instance of containerd still alive (2720)"
time="2016-06-06T01:11:08.987919664Z" level=fatal msg="Error starting daemon: error initializing graphdriver: Device is Busy"
\nMon Jun 6 01:11:15 UTC 2016\n
time="2016-06-06T01:11:16.000378921Z" level=info msg="previous instance of containerd still alive (2720)"
time="2016-06-06T01:11:16.032042379Z" level=info msg="devmapper: Creating filesystem ext4 on device docker-202:1-263764-base"
time="2016-06-06T01:11:18.418423073Z" level=info msg="devmapper: Successfully created filesystem ext4 on device docker-202:1-263764-base"
time="2016-06-06T01:11:18.486581118Z" level=info msg="Graph migration to content-addressability took 0.00 seconds"
time="2016-06-06T01:11:19.323088453Z" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
time="2016-06-06T01:11:19.361038386Z" level=warning msg="Your kernel does not support cgroup blkio weight"
time="2016-06-06T01:11:19.361066723Z" level=warning msg="Your kernel does not support cgroup blkio weight_device"
time="2016-06-06T01:11:19.361172724Z" level=warning msg="mountpoint for pids not found"
time="2016-06-06T01:11:19.361957135Z" level=info msg="Loading containers: start."
time="2016-06-06T01:11:19.362075330Z" level=info msg="Loading containers: done."
time="2016-06-06T01:11:19.362092940Z" level=info msg="Daemon has completed initialization"
time="2016-06-06T01:11:19.362129999Z" level=info msg="Docker daemon" commit="5604cbe/1.11.1" graphdriver=devicemapper version=1.11.1
time="2016-06-06T01:11:19.375313966Z" level=info msg="API listen on /var/run/docker.sock"
time="2016-06-06T01:11:50Z" level=error msg="containerd: start container" error="oci runtime error: rootfs (\"/var/lib/docker/devicemapper/mnt/bc3d9e8c25aff497e5c69c0951607a7527399a80e289ba477aa1ba9248520914/rootfs\") does not exist" id=ca65f0918a43843fc84a130381efc347da2602fa9a0273402e5de2edf78efd4a
No doubt some of my scripting around Docker startup no longer plays nicely with the way the ECS-optimized AMI's Docker expects the storage to be configured. Any suggestions as to what it might be?
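For anyone triaging a similar log, a minimal sketch of filtering it down to the lines that matter. The helper name docker_log_problems is hypothetical, and /var/log/docker is only an assumed default location; pass the real path if the log lives elsewhere.

```shell
#!/bin/bash
# Print only the error- and fatal-level lines from a Docker daemon log.
# docker_log_problems is a hypothetical helper name, and /var/log/docker
# is an assumed default path; pass the actual log file as the first argument.
docker_log_problems() {
    local logfile="${1:-/var/log/docker}"
    # Daemon log lines carry a level=... field, as in the excerpt above.
    grep -E 'level=(error|fatal)' "$logfile"
}
```

Run against the excerpt above, this would surface the "Device is Busy" graphdriver failure and the later "rootfs ... does not exist" error without the surrounding info and warning lines.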
About this issue
- Original URL
- State: closed
- Created 8 years ago
- Comments: 20 (9 by maintainers)
We’ve also been encountering this issue, thankfully in a non-production environment. I got the docker daemon running again by manually restarting the instance, and that allowed our cluster to connect and run the needed container.
The instance is spun up by an AutoScalingGroup. Here’s the output as requested (amazon-ecs-docker-log-errors.txt); the user-data for the launch configuration is below.
user-data:
echo ECS_CLUSTER=[cluster_name] > /etc/ecs/ecs.config
yum install -y docker
service docker start
usermod -a -G docker ec2-user
Hope this helps!
@alexmac My apologies for the delay in response; I got pretty busy last week and this week with DockerCon.
So, a bit of background on what is happening and how:
When we build the AMI, we include a BlockDeviceMapping for an empty EBS volume. At boot, upstart on the instance starts running various software, including cloud-init. Among other things like setting up SSH using the public key you specified when launching the instance, cloud-init is used to configure the instance on boot. The ECS-optimized AMI specifies some cloud-config configuration in a file located at /etc/cloud/cloud.cfg.d/90_ecs.cfg and tells cloud-init to invoke docker-storage-setup through the cloud-init-per helper as a bootcmd. The cloud-config configuration is read very early in the boot process, prior to Docker being started, and bootcmds in particular are executed early in the boot process (this is different from normal user-data scripts, which are executed toward the end). We picked a bootcmd as it was a good way for us to ensure that docker-storage-setup ran before Docker was started the very first time.
I haven’t used Packer before, but there are a few different general techniques you might be able to apply. For example:
- [...] cloud-config configuration when the source instance is launched that overrides the bootcmd
- [...] /var/lib/docker, and remove /etc/sysconfig/docker-storage)
- [...] BlockDeviceMapping for the second volume (as /dev/xvdcz) without a snapshot
- [...] docker-storage-setup should run and set up the second volume as the LVM thin pool
- [...] BlockDeviceMapping for the second volume (as /dev/xvdcz) without a snapshot
- [...] /dev/xvdcz explicitly at launch through the BlockDeviceMapping parameter of RunInstances and use docker ps to wait for initialization to finish prior to stopping Docker.
I haven’t tested each of these, but hopefully this helps give you some general ideas of how you can approach it.
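The idea of using docker ps to wait for initialization before stopping Docker could be sketched roughly as below. This is an untested illustration and not part of the AMI: the function name wait_for_docker, the 60-second default timeout, and the one-second polling interval are all assumptions.

```shell
#!/bin/bash
# Poll `docker ps` until the daemon answers, or give up after a timeout.
# wait_for_docker, the 60-second default, and the 1-second interval are
# illustrative assumptions, not something shipped with the ECS-optimized AMI.
wait_for_docker() {
    local timeout="${1:-60}"   # maximum number of seconds to wait
    local waited=0
    # `docker ps` exits non-zero while the daemon is not yet accepting requests.
    until docker ps >/dev/null 2>&1; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "docker did not become ready within ${timeout}s" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

In an image-build script this might be called as wait_for_docker 120 && service docker stop, so that Docker is only stopped once its first-time storage initialization has finished.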