docker-alpine: apk fetch hangs
Fetch of the apk index just hangs. I hit this now on an Ubuntu server and on Docker for Windows.
Step 1/14 : FROM maven:3.3.9-jdk-8-alpine
---> dd9d4e1cd9db
Step 2/14 : RUN apk update && apk upgrade && apk add --no-cache --update ca-certificates bash wget curl tree libxml2-utils putty git && rm -rf /var/lib/apt/lists/* && rm -rf /var/cache/apk/*
---> Running in 536cbd484c36
fetch http://dl-cdn.alpinelinux.org/alpine/v3.5/main/x86_64/APKINDEX.tar.gz
Docker version 17.03.1-ce, build c6d412e
About this issue
- Original URL
- State: closed
- Created 7 years ago
- Reactions: 59
- Comments: 39 (2 by maintainers)
Links to this issue
Commits related to this issue
- Added parameter on build to avoid apk to hang (#178) Made this small change on the playbook to include `--network host` parameter on docker build command to avoid apk to hang when trying to fetch Alp... — committed to IBM/deploy-ibm-cloud-private by silveiralexf 5 years ago
- Added parameter on build to avoid apk to hang (#178) Made this small change on the playbook to include `--network host` parameter on docker build command to avoid apk to hang when trying to fetch Alp... — committed to yussufsh/deploy-ibm-cloud-private by silveiralexf 5 years ago
- Set network_mode: host https://github.com/gliderlabs/docker-alpine/issues/307 — committed to ikaruswill/drone-yamls by ikaruswill 4 years ago
- jenkins dind using network host, see: https://github.com/gliderlabs/docker-alpine/issues/307 — committed to GeraldWodni/kern.js by GeraldWodni 4 years ago
- Fix apk update sometimes hanging at 'fetch http://dl-cdn.alpinelinux.org/alpine/v3.11/main/x86_64/APKINDEX.tar.gz' solution from https://github.com/gliderlabs/docker-alpine/issues/307#issuecomment-35... — committed to 1ttric/versefind by invalid-email-address 4 years ago
- https://github.com/gliderlabs/docker-alpine/issues/307 — committed to mrabbah/corteza-server by mrabbah 2 years ago
- updated NFS mount settings in camQuick.sh updated NFS mount settings lines 63-66 Added 2.1.0.2 support, support for ICP docker package and printing of worker IPs. Support for ICP 2.1.0.3 add cfc-c... — committed to IBM/deploy-ibm-cloud-private by deleted user 6 years ago
I added `--network host` to my docker client commands, e.g. `docker build --network host …`
I had a similar issue. We have a docker-in-docker build container within a Rancher 2 / Kubernetes environment. I had to decrease the MTU of the inner docker service by adding `"mtu": 1200` to `/etc/docker/daemon.json`. The host server's MTU is 1500.
I only have this with dind + Kubernetes. However, it doesn't happen if I use `--network host` or `--net host`. I am using the weave overlay network.
Seems like a DNS issue. Not sure why; I've set correct DNS settings in `%programdata%\docker\config\daemon.json`. Got around this by using https.
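For reference, a hedged sketch of the https workaround (same mirror host, only the scheme swapped):

```sh
# Switch apk's repository URLs from http to https, then retry the update
sed -i 's|http://dl-cdn.alpinelinux.org|https://dl-cdn.alpinelinux.org|g' /etc/apk/repositories
apk update
```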
After 4 hours of debugging, I managed to solve this by changing the `docker build` command in the gitlab-ci file to pass `--network host` (the same change is explained in more detail below).
source: https://github.com/docker-library/docker/issues/103#issuecomment-478619847
@evanrich My gitlab CI was using docker:dind as a service container, and my main build container had a docker client in it, which I used to connect to the service container. My repo has a dockerfile in it that I need the gitlab runner to build. My .gitlab-ci.yaml file contained a plain `docker build` command.
This builds my docker image. One of the layers in the dockerfile runs `apk update`. This command hangs, causing the `docker build` command, and the CI as a whole, to fail. However, if I modify my .gitlab-ci.yaml file to pass `--network host` to `docker build`, docker will run the `apk update` command from my dockerfile without hanging. Observation: networking is hard.
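In effect, the change amounts to something like this (a sketch; the image name is an assumption):

```sh
# before: apk update hangs during the build
docker build -t myimage .
# after: the build shares the host network namespace and apk update completes
docker build --network host -t myimage .
```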
Also facing this thing from time to time. Here’s a typical output of GitLab CI when fetching fails:
Manually stopping and retrying the stuck CI job helps, but there’s no guarantee of reliability
If you come here from Drone CI and its docker plugin, set an MTU that fits your network in the plugin's settings. It could probably save you some hours of debugging and desperate attempts:
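A sketch of what the plugin's MTU setting amounts to under the hood, assuming the plugin passes the value through to the docker daemon it launches (1450 is just an example value; match it to your overlay network):

```sh
# Start the embedded docker daemon with an explicit, lowered MTU
dockerd --mtu=1450
```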
I believe that the problem is that in docker the MTU is lower than on the host. The way this is supposed to work is via path MTU discovery, but fastly appears to block the PMTU ICMP packets (I guess as part of their DDoS defence). The way to "fix" this properly is to enable MSS clamping on the host. https://blog.ipspace.net/2013/01/tcp-mss-clamping-what-is-it-and-why-do.html
The other alternative is to use a different mirror that does not block the PMTU traffic.
It seems like fastly is filtering ICMP "fragmentation needed" packets, which means that PMTU discovery does not work. This can be a problem if your traffic goes via a network link that has an MTU lower than 1500 (typically tunnels/VPNs, PPPoE and similar). This can be worked around by enabling TCP MSS clamping in the network.
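A minimal sketch of MSS clamping on a Linux gateway or docker host (this is the standard iptables form; the chain may differ in your setup):

```sh
# Clamp the TCP MSS on forwarded SYN packets to the path MTU, so the remote
# end never sends segments larger than the smallest link can carry
iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN \
  -j TCPMSS --clamp-mss-to-pmtu
```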
Old problem, but it still happens!
For me, none of the options worked by themselves! I will mention some of the steps that alleviated the problem and allowed me to generate the image, even if it took 2 or 3 attempts, which is already good, since before I could not generate the image at all!
1# Repository change: for any mirror, add a RUN line or join an existing RUN:
`RUN echo "http://dl-4.alpinelinux.org/alpine/v3.12/main" > /etc/apk/repositories && apk update …`
The official list is here: https://mirrors.alpinelinux.org/
2# The one that behaved best was changing the DNS of the image; add a RUN line or join an existing RUN:
`RUN printf "nameserver 208.67.222.222\nnameserver 8.8.4.4\nnameserver 1.1.1.1\nnameserver 9.9.9.9\nnameserver 8.8.8.8" > /etc/resolv.conf && apk update && apk add ...`
*** This line must be included in every RUN that updates.
3# Change the Docker DNS: on Ubuntu, just edit the file /etc/default/docker (e.g. `sudo gedit /etc/default/docker`) and include the line:
`DOCKER_OPTS="--dns 208.67.222.222 --dns 8.8.8.8 --dns 1.1.1.1 --dns 8.8.4.4 --dns 208.67.220.220 --dns 9.9.9.9"`
I was able to narrow down the issue, and it is IPv6. If the docker host has IPv6 enabled, you are pretty much f**** as apk fetch from inside the container will get stuck trying to fetch from
`dl-cdn.alpinelinux.org`, which returns "dualstack" results, but we all know that IPv6 does not work in containers. apk gets fully stuck without ever timing out or trying to use the IPv4 addresses, which would likely work.
That problem is a huge PITA as normal debugging techniques will not give any usable results:
- `--network host` does not matter
- `ping` or `nslookup` on `dl-cdn.alpinelinux.org` from inside the container works too
- `wget` works (curl is absent from the base image)

UPDATE: we have a working hack.
I can confirm that the https://stackoverflow.com/a/41497555/99834 hack works on both docker and podman, mainly adding `--dns-opt='options single-request' --sysctl net.ipv6.conf.all.disable_ipv6=1` when running/building the containers.

Hi, thanks for this. It helped my build. I remember I found an article about MTU that may be useful for more information: https://medium.com/@liejuntao001/fix-docker-in-docker-network-issue-in-kubernetes-cc18c229d9e5
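A runnable form of the flags confirmed above (the image tag is an assumption; the flags are exactly as reported):

```sh
# Force single-request DNS resolution and disable IPv6 inside the container,
# so apk does not get stuck waiting on unreachable IPv6 addresses
docker run --rm \
  --dns-opt='options single-request' \
  --sysctl net.ipv6.conf.all.disable_ipv6=1 \
  alpine:3.12 apk update
```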
Are you not using Auto DevOps? I haven't specified a .gitlab-ci.yml file yet. I seem to have worked around part of it by switching to alpine.global.ssl.fastly.net, but I get this
and it just hangs at installing binutils every time. Found this: https://github.com/gliderlabs/docker-alpine/issues/279 . It seems to be a widespread issue in k8s due to lower MTU.
I was able to get slightly further by changing my mirror from a fastly mirror to mirror.clarkson.edu using
`RUN sed -i 's/http\:\/\/dl-cdn.alpinelinux.org/http\:\/\/mirror.clarkson.edu/g' /etc/apk/repositories`. Builds are running; will update when they finish.
Edit: just finished successfully… build 174 (that's how many tries it's taken to get this to work).
I've seen the k8s issue quite a bit. Wireshark shows fastly getting stuck sending oversized packets with a do-not-fragment flag. I don't think this is the OP's issue though, as it's Docker for Windows.
I recently started running into a similar issue as well, though on Linux, and the behavior is the same: apk fails fetching, mostly on the index. Again, I pulled up Wireshark and recreated the problem. I see things going smoothly, then the apk process seems to stop ACK'ing segments from the fastly server. Fastly starts throttling and resending segments, and it lags out.
I've never recreated this with curl, but it looks like apk uses a built-in BSD libfetch for its HTTP communications, so maybe there's a bug in there?
My understanding of network communication is just enough to get me this far, so here's a link to the Wireshark log of the communications; hopefully an Alpine dev has a better understanding and can parse out a clue or find the problem.
We just got hit by this, running the Drone docker plugin in a Kubernetes cluster. Decreasing the MTU to the value used by the `eth0` interface on the docker plugin container fixed the issue; thank you so much for sharing this fix.

What I absolutely do not understand is how it worked for almost a year without this workaround. We didn't change anything about our cluster or Drone setup, or the Alpine versions used in our pipelines. If someone has discovered more information about this, please do share.
@andremarianiello so you mean you set `--network host` for the dockerized docker daemon? Where did you set it?

Doesn't seem like a DNS issue, since the name has resolved. Unfortunately, for now, I'm at the same point: names are resolved, but I can't connect to anything.
Running the official Drone helm chart on k3os (v0.11), I had to set the MTU to 1450 for my build to finish and not stall on fetching the APKINDEX.
@ncopa How can we check to see if our docker mtu is lower than our host mtu?
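One way to compare them (a sketch; the interface names and the image tag are assumptions):

```sh
# MTU of the host's primary interface
ip link show eth0 | grep -o 'mtu [0-9]*'
# MTU inside a container on the default bridge network
docker run --rm alpine:3.12 ip link show eth0 | grep -o 'mtu [0-9]*'
```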
Yeah, I was treating this as a different issue because it has slightly different characteristics and is not the same as #279.
The Wireshark link in https://github.com/gliderlabs/docker-alpine/issues/307#issuecomment-387613710 shows a different traffic behavior. Instead of the traffic getting killed at the bridge, it is never ACK'd by libfetch, and Fastly's TCP session gets stuck trying to recover. I don't know if it's even fastly's fault, as on the surface it seems to be doing the right thing.