skaffold: Recently started getting "Too many open files..." spinning up cluster on Apple M1

Expected behavior

That my project would deploy and spin up without any issues.

Actual behavior

I think it is safe to say that this is isolated to Apple Silicon.

I get all of these “Too many open files…” warnings, the deployment slows to a snail’s crawl, there are other messages along the lines of “failed to port forward <service> is taken, retrying…”, and eventually it acts as if it is trying to rebuild the images again. Basically it is a mess…

Interestingly, you can access the services, but none of the communication between them works.
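
For what it’s worth, the warnings look like skaffold is exhausting a file-descriptor limit, possibly because kqueue-based file watching opens a descriptor per watched file. A minimal diagnostic sketch, assuming that is the cause; these are standard macOS commands, not output captured from this setup:

# Inspect the open-file limits seen by the shell that launches skaffold.
ulimit -n                 # soft per-process limit for this shell
ulimit -Hn                # hard per-process limit
launchctl limit maxfiles  # system-wide soft and hard limits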

Without -vdebug:

https://user-images.githubusercontent.com/59094233/123156973-d2762e80-d41e-11eb-8e55-f91cba88a532.mov

With -vdebug:

https://user-images.githubusercontent.com/59094233/123155811-68a95500-d41d-11eb-8b5f-b680ba2d7f76.mov

So far I’ve only tested that it still works on AMD and Intel Linux and Windows (WSL2) machines. There it works perfectly fine: the application spins up quickly and without issues.

I don’t have an Intel Mac around at this point in time to test with, but I’d imagine it works fine there as well. At least, that is what I was working on for two years up until about a month ago.

I’m pretty sure this was working perfectly about a month ago on an M1 Mac that has since been reformatted… but I could be mistaken. However, I reinstalled everything following the exact same steps.

Information

  • Skaffold version: 1.26.1
  • Operating system: macOS 11.4
  • Installed via: skaffold.dev, but I also tried brew with the same results; in both cases using the darwin-arm64 version
  • Contents of skaffold.yaml:

You can comment out individual artifacts, or manifests, to deploy just one service at a time. The results are the same: a ton of “Too many open files…” errors, and eventually it becomes unresponsive.

apiVersion: skaffold/v2beta12
kind: Config
build:
  artifacts:
  - image: admin
    context: admin
    sync:
      manual:
      - src: "src/**/*.php"
        dest: .
      - src: "conf/**/*.conf"
        dest: .
      - src: "src/Templates/**/*.tbs"
        dest: .
      - src: "src/css/**/*.css"
        dest: .
      - src: "src/js/**/*.js"
        dest: .
    docker:
      dockerfile: Dockerfile
  - image: admin-v2
    context: admin-v2
    sync:
      manual:
      - src: 'src/**/*.ts'
        dest: .
      - src: 'src/**/*.tsx'
        dest: .
      - src: '**/*.json'
        dest: .
      - src: 'public/**/*.html'
        dest: .
      - src: 'src/assets/sass/**/*.scss'
        dest: .
      - src: 'src/build/**/*.js'
        dest: .
    docker:
      dockerfile: Dockerfile.dev
  - image: api
    context: api
    sync:
      manual:
      - src: "**/*.py"
        dest: .
    docker:
      dockerfile: Dockerfile.dev
  - image: api-v2
    context: api-v2
    sync:
      manual:
      - src: "**/*.py"
        dest: .
    docker:
      dockerfile: Dockerfile.dev
  - image: client
    context: client
    sync:
      manual:
      - src: 'src/**/*.js'
        dest: .
      - src: 'src/**/*.jsx'
        dest: .
      - src: '**/*.json'
        dest: .
      - src: 'public/**/*.html'
        dest: .
      - src: 'src/assets/sass/**/*.scss'
        dest: .
      - src: 'src/build/**/*.js'
        dest: .
    docker:
      dockerfile: Dockerfile.dev
  - image: postgres
    context: postgres
    sync:
      manual:
      - src: "**/*.sql"
        dest: .
    docker:
      dockerfile: Dockerfile.dev
  local:
    push: false
deploy:
  kubectl:
    manifests:
      - k8s/dev/ingress.yaml
      - k8s/dev/postgres.yaml
      - k8s/dev/client.yaml
      - k8s/dev/admin.yaml
      - k8s/dev/admin-v2.yaml
      - k8s/dev/api.yaml
      - k8s/dev/api-v2.yaml
    defaultNamespace: dev

Steps to reproduce the behavior

  1. Unfortunately I can’t post the project because it is proprietary. I’d suspect anyone with even a modestly sized or complex project will also run into the issue.
  2. skaffold dev --port-forward -n dev
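
As a quick experiment to pair with step 2 (an assumption of mine, not something verified in this report): raising the shell’s open-file limit before starting skaffold shows whether the descriptor limit is the bottleneck; 65536 is an arbitrary value.

# Raise the soft open-file limit for this shell only, then start skaffold.
ulimit -n 65536
skaffold dev --port-forward -n dev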

These are the other dependencies:

  • Docker Desktop for Apple Silicon 3.4.0 (65384)
  • Kubernetes v1.21.1 (enabled via Docker Desktop)
  • Minikube v1.21.0 (arm64)
  • Skaffold v1.26.1 (arm64)
  • Homebrew 3.2.0

I’m pretty sure only arm64 builds can be installed for these, and I haven’t tried the non-arm versions.

This is what I’ve tried:

  • Several reformats of the system.
  • Using virtualization.framework and hypervisor.framework.
  • Installing everything in a terminal running as “Universal” or “Apple Silicon”.
  • Running with VS Code as “Universal” or “Apple Silicon”.
  • Running with VS Code as “Intel”.
  • Installing all in VS Code as “Intel” and running in VS Code as “Intel”.
  • 8GB and 16GB M1 models.
  • 4 CPU cores and 6GB RAM in Docker Desktop.
  • Minikube has 2 CPU cores and 4GB RAM.
  • Using Docker driver in Minikube.

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 34 (15 by maintainers)

Most upvoted comments

I tried to look into the sync/dockerignore code involved here, but didn’t find anything obvious that would cause things to behave differently on my M1.

While poking around, I did notice that the issue resolves itself by using a local skaffold build instead of the official release.

skaffold-darwin-amd64

With skaffold-darwin-amd64 it works like a charm; the error is gone. It would be great if the arm64 version could be fixed.

@sampullman: While poking around, I did notice that the issue resolves itself by using a local skaffold build instead of the official release.

Ah! 💡 We’re currently cross-compiling Skaffold for darwin/arm64 but have to disable cgo (#5286) as we don’t have the required headers and libraries available. I’ve been meaning to retool our release process and this provides the impetus.
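
Roughly the difference, as a sketch rather than the actual release scripts (the ./cmd/skaffold package path and output names are assumptions):

# Cross-compiling darwin/arm64 from a non-Mac host: cgo has to be
# disabled because the Apple headers and libraries aren’t available.
CGO_ENABLED=0 GOOS=darwin GOARCH=arm64 go build -o skaffold-darwin-arm64 ./cmd/skaffold

# Building natively on an M1 Mac: cgo defaults to enabled, and this is
# the configuration reported to work.
go build -o skaffold ./cmd/skaffold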

@ryan-efendy - by “locally compiled version” I mean literally cloning the https://github.com/GoogleContainerTools/skaffold repository on an M1 Mac and running make && make install and using that skaffold binary directly, rather than using a downloaded one. The bug seems to only exist in the cross-compiled version that is available for download from brew and/or the releases page here on github.
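
Spelled out as commands, assuming a working Go toolchain (the verification steps at the end are an addition, not from the comment above):

git clone https://github.com/GoogleContainerTools/skaffold.git
cd skaffold
make && make install
# Confirm the shell now resolves to the freshly built binary rather than
# the downloaded release; the install location can vary.
which skaffold
skaffold version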

This is my workaround

node-setup-daemon-set.yaml

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-setup
  namespace: kube-system
  labels:
    k8s-app: node-setup
spec:
  selector:
    matchLabels:
      name: node-setup
  template:
    metadata:
      labels:
        name: node-setup
    spec:
      containers:
        - name: node-setup
          image: ubuntu
          command: ["/bin/sh", "-c"]
          args:
            [
              "/script/node-setup.sh; while true; do echo Sleeping && sleep 3600; done",
            ]
          volumeMounts:
            - name: node-setup-script
              mountPath: /script
          securityContext:
            allowPrivilegeEscalation: true
            privileged: true
      volumes:
        - name: node-setup-script
          configMap:
            name: node-setup-script
            defaultMode: 0755
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: node-setup-script
  namespace: kube-system
data:
  node-setup.sh: |
    #!/bin/bash
    # change the file-watcher max-count on each node to 524288

    # insert the new value into the system config
    sysctl -w fs.inotify.max_user_watches=524288
    sysctl -w fs.inotify.max_user_instances=512

    # check that the new value was applied
    cat /proc/sys/fs/inotify/max_user_watches
    cat /proc/sys/fs/inotify/max_user_instances
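
Applying and checking the workaround is standard kubectl usage; the label selector matches the manifest above:

kubectl apply -f node-setup-daemon-set.yaml
# Check that the DaemonSet pod is running on each node and that the
# sysctl values were actually applied (see the script’s cat commands).
kubectl -n kube-system get pods -l name=node-setup
kubectl -n kube-system logs -l name=node-setup --tail=20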