actions-runner-controller: MTU change doesn't cascade to containers
Describe the bug
I run actions-runner-controller on GKE, which uses a default MTU of 1460. I’ve been investigating outbound connections (`npm install`, `apk add`, etc.) that have been freezing periodically. I saw https://github.com/actions-runner-controller/actions-runner-controller/pull/385 and set `dockerMTU: 1460` for my runners, and confirmed the change took effect:
```
runner@ubuntu-20:/$ ifconfig | grep mtu
br-d788e7849cda: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
docker0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1460
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1460
```
But I continue to see inconsistent outbound activity in workflows that use `container` (https://docs.github.com/en/actions/learn-github-actions/workflow-syntax-for-github-actions#jobsjob_idcontainer). It seems the MTU change isn’t applied to those containers:
```
runner@ubuntu-20:/$ docker exec -it 88fef7adee1a4e198f47864a7acfe454_redactedname_84e321 bash
root@72de5b32b2d5:/__w/workdir/workdir# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
6: eth0@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:ac:12:00:02 brd ff:ff:ff:ff:ff:ff
    inet 172.18.0.2/16 brd 172.18.255.255 scope global eth0
       valid_lft forever preferred_lft forever
```
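One way to confirm the mismatch from inside the job container is a don’t-fragment ping sized to the suspected path MTU. A sketch (the 1460-byte figure is GKE’s default; the header arithmetic is standard IPv4 + ICMP, and `github.com` is just an illustrative target):

```shell
# An IPv4 header (20 bytes) plus an ICMP header (8 bytes) leaves this much
# payload in a 1460-byte MTU:
MTU=1460
PAYLOAD=$((MTU - 20 - 8))
echo "probe payload: ${PAYLOAD} bytes"   # 1432

# Inside the job container, a don't-fragment ping at this size should go
# through, while one byte more should fail with "Message too long" if the
# path MTU really is 1460 (run where a network is available):
#   ping -c 1 -M do -s ${PAYLOAD} github.com
#   ping -c 1 -M do -s $((PAYLOAD + 1)) github.com
```

If the larger probe hangs or errors while the smaller one succeeds, packets are being silently dropped by the MTU mismatch rather than by the registry or mirror being hit.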
Is there a way to apply a custom MTU to workflows that use `container`?
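One detail worth noting: the job container sits on a freshly created user-defined bridge (the `br-…` interface above, still at 1500), and `dockerMTU` only reaches `docker0`. A hedged sketch of a possible fix, assuming Docker 23.0+ where `daemon.json` accepts `default-network-opts` so user-defined bridges inherit an MTU (the file path and restart step will vary by runner image; `/tmp` is used here only for illustration):

```shell
# Sketch: default user-defined bridge networks to MTU 1460 so the per-job
# networks created by actions/runner inherit it (Docker 23.0+ only).
# The real file is /etc/docker/daemon.json, followed by a dockerd restart.
cat > /tmp/daemon.json <<'EOF'
{
  "mtu": 1460,
  "default-network-opts": {
    "bridge": {
      "com.docker.network.driver.mtu": "1460"
    }
  }
}
EOF
cat /tmp/daemon.json

# On older Docker, the option has to be passed per network instead, which
# actions/runner does not expose for its auto-created job networks:
#   docker network create -o com.docker.network.driver.mtu=1460 mynet
```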
Checks
- My actions-runner-controller version (v0.x.y) does support the feature
- I’m using an unreleased version of the controller I built from HEAD of the default branch
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Comments: 19
@callum-tait-pbx I think this is worth covering in our documentation, as the MTU issue itself is not GKE-specific
@mumoshu You’re absolutely right, my mistake.
I’m not very keen to do that either, as I consider this something that needs to be fixed on the GitHub and actions/runner side, isn’t it? 🤔
GKE has resolved the ticket on their side and there’s a usable workaround here for any other MTU issues. Closing this.
I want to confirm the workaround also works for me.
I’m not, no. Haven’t seen any issues with it today. Here’s output from a Stop Containers task on the workflow I linked earlier
https://github.com/kubernetes/test-infra/issues/23741#issuecomment-929518690
Current ETA from Google on a fix is Thursday.