actions-runner-controller: MTU change doesn't cascade to containers

Describe the bug

I run actions-runner-controller on GKE, which uses a default MTU of 1460. I’ve been investigating outbound connections (npm install, apk add, etc.) that have been freezing periodically. I saw the following PR (https://github.com/actions-runner-controller/actions-runner-controller/pull/385) and set

dockerMTU: 1460

for my runners, and confirmed the change:

runner@ubuntu-20:/$ ifconfig | grep mtu
br-d788e7849cda: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1460
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1460
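
For reference, this is roughly how I’m applying the setting, via the dockerMTU field in the RunnerDeployment spec (a minimal sketch with placeholder names, not my exact manifest):

cat <<'EOF' | kubectl apply -f -
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: example-runnerdeploy               # placeholder name
spec:
  template:
    spec:
      repository: example-org/example-repo # placeholder repository
      dockerMTU: 1460                       # match the GKE node MTU
EOF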

But I continue to see inconsistent outbound activity on workflows that use a job container (https://docs.github.com/en/actions/learn-github-actions/workflow-syntax-for-github-actions#jobsjob_idcontainer). It seems the MTU change isn’t being applied to these containers:

runner@ubuntu-20:/$ docker exec -it 88fef7adee1a4e198f47864a7acfe454_redactedname_84e321 bash

root@72de5b32b2d5:/__w/workdir/workdir# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
6: eth0@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:ac:12:00:02 brd ff:ff:ff:ff:ff:ff
    inet 172.18.0.2/16 brd 172.18.255.255 scope global eth0
       valid_lft forever preferred_lft forever
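
As far as I can tell, dockerMTU only applies to the default bridge (docker0); the runner attaches job containers to a per-job github_network_* bridge, which is created without an MTU option and so falls back to 1500. A quick way to compare the two (the github_network_* name below is a placeholder for whichever network the job created):

# default bridge, the one dockerMTU configures: shows com.docker.network.driver.mtu
docker network inspect bridge --format '{{json .Options}}'
# per-job network created by actions/runner: no MTU option set
docker network inspect github_network_<id> --format '{{json .Options}}'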

Is there a way to use a custom MTU for workflows that use a job container?

Checks

  • My actions-runner-controller version (v0.x.y) does support the feature
  • I’m using an unreleased version of the controller I built from HEAD of the default branch

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Comments: 19

Most upvoted comments

@callum-tait-pbx I think this is worth covering in our documentation, as the MTU issue itself is not GKE-specific

@mumoshu You’re absolutely right, my mistake.

I’m not very keen to do that either, as I consider this something that needs to be fixed on the GitHub and actions/runner side, don’t you think? 🤔

GKE has resolved the ticket on their side and there’s a usable workaround here for any other MTU issues. Closing this.

I want to confirm the workaround also works for me.

I’m not, no. I haven’t seen any issues with it today. Here’s the output from a Stop Containers task on the workflow I linked earlier:

Stop and remove container: 2e835be9889146d38677c36a2f7c09e1_artifactoryrtrclouddockernodechromelatest_26410e
/usr/local/bin/docker rm --force bae0166a551ea03d6a2ab130e934c2665292332cfc1f8a0133db487106716f33
bae0166a551ea03d6a2ab130e934c2665292332cfc1f8a0133db487106716f33
Remove container network: github_network_4dc6c7707c5e4a6ab795ac422f3c2395
/usr/local/bin/docker network rm github_network_4dc6c7707c5e4a6ab795ac422f3c2395
github_network_4dc6c7707c5e4a6ab795ac422f3c2395