amazon-ecs-agent: ECS agent fails to launch with latest AMI/Docker
After rebasing our AMI off the latest ECS optimized AMI to get version 1.11.1 of docker I’m seeing the ECS agent fail to start with this message:
docker: Error response from daemon: rpc error: code = 2 desc = "oci runtime error: rootfs (\"/var/lib/docker/devicemapper/mnt/14f9d36675aabe5b45170f0e2ee9206ed61421959c497a7c468091ad0df7d425/rootfs\") does not exist".
The beginning of my docker log file looks like this:
Mon Jun 6 01:10:48 UTC 2016\n
time="2016-06-06T01:10:48.591674132Z" level=info msg="New containerd process, pid: 2720\n"
time="2016-06-06T01:10:49Z" level=warning msg="containerd: low RLIMIT_NOFILE changing to max" current=1024 max=4096
time="2016-06-06T01:10:49.712144800Z" level=info msg="devmapper: Creating filesystem ext4 on device docker-202:1-263764-base"
\nMon Jun 6 01:11:04 UTC 2016\n
time="2016-06-06T01:11:04.943144528Z" level=info msg="previous instance of containerd still alive (2720)"
time="2016-06-06T01:11:08.987919664Z" level=fatal msg="Error starting daemon: error initializing graphdriver: Device is Busy"
\nMon Jun 6 01:11:15 UTC 2016\n
time="2016-06-06T01:11:16.000378921Z" level=info msg="previous instance of containerd still alive (2720)"
time="2016-06-06T01:11:16.032042379Z" level=info msg="devmapper: Creating filesystem ext4 on device docker-202:1-263764-base"
time="2016-06-06T01:11:18.418423073Z" level=info msg="devmapper: Successfully created filesystem ext4 on device docker-202:1-263764-base"
time="2016-06-06T01:11:18.486581118Z" level=info msg="Graph migration to content-addressability took 0.00 seconds"
time="2016-06-06T01:11:19.323088453Z" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
time="2016-06-06T01:11:19.361038386Z" level=warning msg="Your kernel does not support cgroup blkio weight"
time="2016-06-06T01:11:19.361066723Z" level=warning msg="Your kernel does not support cgroup blkio weight_device"
time="2016-06-06T01:11:19.361172724Z" level=warning msg="mountpoint for pids not found"
time="2016-06-06T01:11:19.361957135Z" level=info msg="Loading containers: start."
time="2016-06-06T01:11:19.362075330Z" level=info msg="Loading containers: done."
time="2016-06-06T01:11:19.362092940Z" level=info msg="Daemon has completed initialization"
time="2016-06-06T01:11:19.362129999Z" level=info msg="Docker daemon" commit="5604cbe/1.11.1" graphdriver=devicemapper version=1.11.1
time="2016-06-06T01:11:19.375313966Z" level=info msg="API listen on /var/run/docker.sock"
time="2016-06-06T01:11:50Z" level=error msg="containerd: start container" error="oci runtime error: rootfs (\"/var/lib/docker/devicemapper/mnt/bc3d9e8c25aff497e5c69c0951607a7527399a80e289ba477aa1ba9248520914/rootfs\") does not exist" id=ca65f0918a43843fc84a130381efc347da2602fa9a0273402e5de2edf78efd4a
No doubt some of my scripting around Docker startup is no longer playing nicely with the way the ECS-optimized AMI expects Docker storage to be configured. Any suggestions as to what it might be?
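When reading a log like the one above, it can help to filter out everything except the error and fatal entries. A minimal sketch follows; the helper name docker_log_problems is hypothetical, and the default path /var/log/docker is an assumption, since the thread does not state where the log lives.

```shell
#!/bin/sh
# Hypothetical helper: print only error/fatal entries from a Docker daemon log.
# The default path is an assumption; pass the real log file as the first argument.
docker_log_problems() {
    log_file="${1:-/var/log/docker}"
    # The daemon's log lines carry their severity as level=<name>.
    grep -E 'level=(error|fatal)' "$log_file"
}
```

Run against the excerpt above, this would leave just the "Device is Busy" fatal entry and the "rootfs ... does not exist" error entry, which makes the sequence of failures easier to see.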
About this issue
- State: closed
- Created 8 years ago
- Comments: 20 (9 by maintainers)
We’ve also been encountering this issue, thankfully in a non-production environment. I got the Docker daemon running again by manually restarting the instance, and that allowed our cluster to connect and run the needed container.
The instance is spun up by an Auto Scaling group. Here’s the output as requested (amazon-ecs-docker-log-errors.txt), and below is the user-data for the launch configuration.
user-data:
echo ECS_CLUSTER=[cluster_name] > /etc/ecs/ecs.config
yum install -y docker
service docker start
usermod -a -G docker ec2-user
Hope this helps!
@alexmac My apologies for the delay in response; I got pretty busy last week and this week with DockerCon.
So, a bit of background on what is happening and how:
When we build the AMI, we include a BlockDeviceMapping for an empty EBS volume. At boot, upstart on the instance starts running various software, including cloud-init. Among other things like setting up SSH using the public key you specified when launching the instance, cloud-init is used to configure the instance on boot. The ECS-optimized AMI specifies some cloud-config configuration in a file located at /etc/cloud/cloud.cfg.d/90_ecs.cfg and tells cloud-init to invoke docker-storage-setup through the cloud-init-per helper as a bootcmd. The cloud-config configuration is read very early in the boot process, prior to Docker being started, and bootcmds in particular are executed early in the boot process (this is different from normal user-data scripts, which are executed toward the end). We picked a bootcmd as it was a good way for us to ensure that docker-storage-setup ran before Docker was started the very first time.
I haven’t used Packer before, but there are a few different general techniques you might be able to apply. For example:
- […] cloud-config configuration when the source instance is launched that overrides the bootcmd
- […] /var/lib/docker, and remove /etc/sysconfig/docker-storage)
- […] BlockDeviceMapping for the second volume (as /dev/xvdcz) without a snapshot
- […] docker-storage-setup should run and set up the second volume as the LVM thin pool
- […] BlockDeviceMapping for the second volume (as /dev/xvdcz) without a snapshot
- […] /dev/xvdcz explicitly at launch through the BlockDeviceMapping parameter of RunInstances and use docker ps to wait for initialization to finish prior to stopping Docker.
I haven’t tested each of these, but hopefully this helps give you some general ideas of how you can approach it.
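The suggestions that mention waiting on docker ps, stopping Docker, and removing /var/lib/docker and /etc/sysconfig/docker-storage could be read as a cleanup script run on the source instance before the AMI is created. A minimal sketch under that reading: it assumes the same service command used in the user-data earlier in this thread, the function names, retry count, and delay are hypothetical, and it has not been tested against a real ECS-optimized AMI.

```shell
#!/bin/sh
# Hypothetical helper: poll "docker ps" until the daemon answers,
# giving up after a fixed number of attempts.
wait_for_docker() {
    attempts="${1:-30}"   # assumed default: 30 tries
    delay="${2:-2}"       # assumed default: 2 seconds between tries
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if docker ps >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Hypothetical helper: once Docker has finished initializing, stop it and
# remove the state named in the suggestions above.
# Destructive: this deletes all images and containers on the instance.
clean_docker_state() {
    wait_for_docker || return 1
    service docker stop || return 1
    rm -rf /var/lib/docker
    rm -f /etc/sysconfig/docker-storage
}
```

On the source instance this might be invoked as clean_docker_state (as root) just before the image is created. Waiting first is meant to avoid stopping the daemon while it is still creating the devicemapper base filesystem, which is where the "Device is Busy" failure shows up in the log earlier in the thread.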