actions-runner-controller: containers created by GitHub workflows have the wrong MTU (dockerMTU is not applied)

Describe the bug: Docker containers created by GitHub workflows (dind, Docker in Docker) have no working network access.

To Reproduce Steps to reproduce the behavior:

  1. Create a workflow that runs in a Docker container, e.g.:
name: Setup Go

on:
  workflow_dispatch:

jobs:
  setupgo:
    runs-on: [kubernetes]
    container:
      image: 'ubuntu:latest'
    steps:
    - name: Set up Go
      uses: actions/setup-go@v2
      with:
        go-version: 1.17
    
    - name: sleep
      shell: bash
      run: sleep 300
  2. Exec into the dind sidecar container and confirm that the mtu parameter is propagated properly from the runner deployment spec:
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: my-runner
spec:
  template:
    spec:
      organization: abc
      ephemeral: true
      dockerMTU: 1440
$ kubectl exec -it my-runner-xyz -c docker -- /bin/sh 
$ ps auxww
    1 root      0:00 docker-init -- dockerd --host=unix:///var/run/docker.sock --host=tcp://0.0.0.0:2376 --tlsverify --tlscacert /certs/server/ca.pem --tlscert /certs/server/cert.pem --tlskey /certs/server/key.pem --mtu 1440
   53 root      0:03 dockerd --host=unix:///var/run/docker.sock --host=tcp://0.0.0.0:2376 --tlsverify --tlscacert /certs/server/ca.pem --tlscert /certs/server/cert.pem --tlskey /certs/server/key.pem --mtu 1440
   61 root      0:04 containerd --config /var/run/docker/containerd/containerd.toml --log-level info
   ...
$ ip a
  ...
  4: eth0@if700: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1440 qdisc noqueue state UP
      link/ether d2:ee:d5:2f:04:ae brd ff:ff:ff:ff:ff:ff
      inet 100.96.2.200/32 brd 100.96.2.200 scope global eth0
         valid_lft forever preferred_lft forever
      inet6 fe80::d0ee:d5ff:fe2f:4ae/64 scope link
         valid_lft forever preferred_lft forever
  3. Run the workflow and exec into the workflow's Docker container:
$ kubectl exec -it my-runner-xyz -c docker -- /bin/sh
$ docker ps
$ docker exec -it <containerID> bash
$ apt update
$ apt install curl -y
$ curl -vOL https://github.com/actions/go-versions/releases/download/1.17.6-1668090892/go-1.17.6-linux-x64.tar.gz
  0     0    0     0    0     0      0      0 --:--:--  0:00:54 --:--:--     0
  -> nothing is happening...
$ apt install iproute2 -y
$ ip a 
  ...
  eth0@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
  -> wrong MTU

Expected behavior: Downloading files should work from Docker containers created by the workflow. If I manually start a Docker container from the dind sidecar container, there are no network problems:

$ kubectl exec -it my-runner-xyz -c docker -- /bin/sh
$ docker run -it --rm ubuntu bash
$ apt update
$ apt install curl -y
$ curl -vOL https://github.com/actions/go-versions/releases/download/1.17.6-1668090892/go-1.17.6-linux-x64.tar.gz
   100  128M  100  128M    0     0  22.9M      0  0:00:05  0:00:05 --:--:-- 23.7M
   --> download completed
$ apt install iproute2 -y
$ ip a 
  ...
  eth0@if10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1440
  -> correct MTU

Environment:

  • Helm Chart Version 0.14
  • Controller Version: 0.20.2
  • Deployment Method: Helm + Kustomize + fluxV2
  • Kubernetes: ip-in-ip Calico on AWS

About this issue

  • State: open
  • Created 2 years ago
  • Comments: 15 (4 by maintainers)

Most upvoted comments

Hello,

I’ve come across this issue as well since the Kubernetes network I’m setting up uses a smaller MTU (1420 instead of the usual 1500).

Although the MTU is configurable with the dockerMTU setting on the Runner spec, it only affects the default Docker bridge network and does not propagate to networks created by the GitHub runner. This poses a problem when running workflows that use containers, such as the actions/checkout@v2 one.
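To see this gap on a running runner, one option is to list every Docker network next to its MTU option. The sketch below is only an illustration: `list_network_mtus` is a name made up here, and it assumes the `docker` CLI can reach the dind daemon. Networks created by the workflow are expected to show `default`, meaning no MTU option was set and the bridge driver falls back to 1500.

```shell
#!/usr/bin/env bash

# Print "<network name><TAB><mtu>" for every network the daemon knows about.
# "default" means the network has no explicit MTU option set.
# list_network_mtus is a hypothetical helper name, not part of docker or ARC.
list_network_mtus() {
    local name mtu
    docker network ls --format '{{.Name}}' | while read -r name; do
        mtu=$(docker network inspect "$name" \
            --format '{{index .Options "com.docker.network.driver.mtu"}}' 2>/dev/null)
        printf '%s\t%s\n' "$name" "${mtu:-default}"
    done
}
```

Calling `list_network_mtus` from the dind sidecar while a container job is running prints one tab-separated line per network.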

So, I followed up on the workaround proposed by @FalconerTC here and changed it to support the sidecar dind container, since /etc/docker/daemon.json is not available there:

#!/usr/bin/env bash

# Inspired by https://github.com/actions-runner-controller/actions-runner-controller/issues/848#issuecomment-929394653
# Inspired by https://github.com/actions/runner/issues/775#issuecomment-927826684

if [[ $1 == "network" && $2 == "create" ]]; then
    shift 2 # drop "network" and "create"; the rest is passed through unchanged

    # Read the MTU of the default bridge network; empty if it cannot be determined.
    MTU=$(/usr/local/bin/docker.bin network inspect bridge --format '{{index .Options "com.docker.network.driver.mtu"}}' 2>/dev/null)
    if [[ -n "$MTU" ]]; then
        # Create the network with the same MTU as the default bridge.
        /usr/local/bin/docker.bin network create --opt "com.docker.network.driver.mtu=$MTU" "$@"
    else
        /usr/local/bin/docker.bin network create "$@"
    fi
else
    # Any other command goes straight to the real docker binary.
    /usr/local/bin/docker.bin "$@"
fi

This has the added benefit of propagating whatever MTU is set on the default docker bridge, which works in both dind and docker in runner situations.

I’ve pushed a Docker image with the change to tiagomelo/github-actions-runner. It places the script as a shim in front of the docker command so that the Docker MTU propagates to newly created networks.
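The shim only takes effect if it is what gets run as `docker`, with the real client moved aside to the `docker.bin` path the script calls. A minimal sketch of that wiring follows. It assumes the real client lives in `/usr/local/bin`, which is not confirmed for the published image, and `install_docker_shim` with its arguments is a hypothetical name used only for illustration.

```shell
#!/usr/bin/env bash

# install_docker_shim <shim-script> [bin-dir]
# Renames <bin-dir>/docker to <bin-dir>/docker.bin and installs the shim as
# <bin-dir>/docker. bin-dir defaults to /usr/local/bin (an assumption).
install_docker_shim() {
    local shim_src bin_dir
    shim_src=$1
    bin_dir=${2:-/usr/local/bin}

    # Refuse to run twice: a second run would move the shim over the real binary.
    if [ -e "$bin_dir/docker.bin" ]; then
        echo "already installed: $bin_dir/docker.bin exists" >&2
        return 1
    fi

    mv "$bin_dir/docker" "$bin_dir/docker.bin" || return 1
    cp "$shim_src" "$bin_dir/docker" || return 1
    chmod 0755 "$bin_dir/docker"
}
```

If the client lives in a different directory, the hard-coded `/usr/local/bin/docker.bin` path in the shim would need to change to match.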

However, since using containers in GitHub Actions is pretty common and, apparently, so is having Kubernetes overlay networks with MTUs smaller than 1500, I think this should be integrated into the official summerwind/actions-runner image. I’m willing to open a PR with the change if you wish.

Meanwhile, this can be tested by creating a runner with the custom runner image:

apiVersion: actions.summerwind.dev/v1alpha1
kind: Runner
metadata:
  name: test-runner
spec:
  dockerEnabled: true
  dockerMTU: 1420
  image: tiagomelo/docker-actions-runner
  repository: <github-repository>

You can validate this is working with:

kubectl exec -it -c runner test-runner -- /bin/bash

within the runner:

docker network inspect bridge --format '{{index .Options "com.docker.network.driver.mtu"}}' 2>/dev/null
# returns 1420 (or whatever MTU is set)

docker run --rm rancher/curl https://github.com >/dev/null
# should complete

docker network create test

docker network inspect test --format '{{index .Options "com.docker.network.driver.mtu"}}' 2>/dev/null
# returns 1420 (or whatever MTU is set)

docker run --rm --network test rancher/curl https://github.com >/dev/null
# should complete

docker network rm test
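
The manual checks above could also be scripted. A rough sketch, assuming the `docker` CLI is on the PATH inside the runner; `check_mtu_propagation` and the temporary network name are made up for illustration:

```shell
#!/usr/bin/env bash

# Create a throwaway network, compare its MTU option with the default
# bridge's, and clean up. Returns 0 on a match, 1 on a mismatch, and 2 if
# the network could not be created.
# check_mtu_propagation is a hypothetical helper name.
check_mtu_propagation() {
    local net fmt bridge_mtu net_mtu
    net="mtu-check-$$"
    fmt='{{index .Options "com.docker.network.driver.mtu"}}'

    bridge_mtu=$(docker network inspect bridge --format "$fmt" 2>/dev/null)
    docker network create "$net" >/dev/null || return 2
    net_mtu=$(docker network inspect "$net" --format "$fmt" 2>/dev/null)
    docker network rm "$net" >/dev/null

    if [ -n "$bridge_mtu" ] && [ "$bridge_mtu" = "$net_mtu" ]; then
        echo "ok: new networks inherit MTU $bridge_mtu"
    else
        echo "mismatch: bridge=${bridge_mtu:-unset} new network=${net_mtu:-unset}"
        return 1
    fi
}
```

With the stock runner image the function is expected to report a mismatch, since new networks carry no MTU option there.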

Edit: Published a repository with the image under tiagoblackcode/github-actions-runner.

Hello,

I finally had some time to work on this. Please check out #1201, which adds a new Docker image to the repository that propagates the MTU setting to networks created in GitHub workflows.

I’m willing to help you with support.

I’m going to prepare a PR with the secondary image, and documentation regarding it, in the next couple of days.

I agree with the general sentiment that the issue is not in ARC but instead in actions/runner or even in Docker, honestly. Since Docker supports the --mtu flag, it should use that MTU value as the default for any networks created, unless specified otherwise.

However, since it seems we’re kind of stuck on where to implement this, I suggest the following:

  • Document the issue and the workaround in the Troubleshooting section.
  • Optionally, create a secondary image in ARC with the fix, and a USE AT YOUR OWN RISK warning. Alternatively, point to my image, but I’d rather have this in the official repository since it has more visibility. I suggest a secondary image because the fix may have unintended consequences we’re not aware of at the moment.
  • Bring visibility to the actions/runner#775 issue.

This way ARC users have a documented way to fix the issue while we wait for actions/runner to support this.

@mumoshu what are your thoughts on this?

@NicklasWallgren regarding https://github.com/actions/runner/pull/1650, instead of having an env variable I think it’s better to inspect the bridge network, figure out its MTU and use it to create the network.

Edit:

There’s an open PR in https://github.com/moby/moby/pull/43197 to propagate the MTU set in --mtu to networks created without an explicit MTU.

Can this be reopened, since the issue still persists? We’re running into the same issue where we use dind and buildah to create container images.

@tiagoblackcode Were you able to create a PR for this?