kubernetes: kubectl discovery dramatically slower on macOS than Linux

What happened?

We use a lot of CRDs (frequently more than 1,000) in the @crossplane project. This has historically caused a bunch of problems with client-side API discovery, among other things. At the time of writing the latest releases of kubectl use a discovery client rate limiter with a generous QPS and burst (50 QPS, with a burst of 300), so client-side rate limiting doesn’t become a problem until the API server is serving more than 300 API groups.
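
For illustration, here’s a minimal client-go sketch of where those limits live - the QPS and Burst fields on rest.Config are real; the kubeconfig path and panic-based error handling are just for the example:

package main

import (
	"fmt"

	"k8s.io/client-go/discovery"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a rest.Config from a kubeconfig. The path is illustrative.
	config, err := clientcmd.BuildConfigFromFlags("", "/home/user/.kube/config")
	if err != nil {
		panic(err)
	}

	// The client-side rate limits discussed above. kubectl configures
	// similarly generous defaults for its discovery client.
	config.QPS = 50
	config.Burst = 300

	dc, err := discovery.NewDiscoveryClientForConfig(config)
	if err != nil {
		panic(err)
	}

	// Discovery fetches the resource list for every API group, so with
	// 300+ groups the burst is what keeps rate limiting out of the way.
	groups, err := dc.ServerGroups()
	if err != nil {
		panic(err)
	}
	fmt.Printf("discovered %d API groups\n", len(groups.Groups))
}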

Despite rate limiting no longer coming into effect, we noticed that discovery was still very slow, taking 10-20 seconds to complete. After having a few folks at @upbound run a little test (see https://github.com/crossplane/crossplane/issues/2895#issuecomment-1162688419 and the following comments) we noticed that discovery seems slow specifically on macOS - it’s dramatically faster on Linux.

Here’s discovery happening on Linux with 1,577 CRDs across 343 API groups in around a second:

$ kubectl --kubeconfig=$HOME/.kube/config.eks version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.5", GitCommit:"c285e781331a3785a7f436042c65c5641ce8a9e9", GitTreeState:"archive", BuildDate:"1980-01-01T00:00:00Z", GoVersion:"go1.17.10", Compiler:"gc", Platform:"linux/arm64"}
Server Version: version.Info{Major:"1", Minor:"22+", GitVersion:"v1.22.9-eks-a64ea69", GitCommit:"540410f9a2e24b7a2a870ebfacb3212744b5f878", GitTreeState:"clean", BuildDate:"2022-05-12T19:15:31Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}

$ kubectl --kubeconfig=$HOME/.kube/config.eks get crd|wc -l
1577

$ kubectl --kubeconfig=$HOME/.kube/config.eks api-versions|wc -l
343

$ rm -rf ~/.kube/cache

$ time kubectl --kubeconfig=$HOME/.kube/config.eks get nodes
NAME                                            STATUS   ROLES    AGE   VERSION
ip-192-168-103-188.us-west-2.compute.internal   Ready    <none>   8h    v1.22.9-eks-810597c
ip-192-168-63-15.us-west-2.compute.internal     Ready    <none>   8h    v1.22.9-eks-810597c
ip-192-168-93-0.us-west-2.compute.internal      Ready    <none>   8h    v1.22.9-eks-810597c
kubectl --kubeconfig=$HOME/.kube/config.eks get nodes  0.17s user 0.21s system 35% cpu 1.067 total

$ du -hs ~/.kube/cache
7.5M    /home/negz/.kube/cache

$ speedtest
Retrieving speedtest.net configuration...
Testing from CenturyLink (REDACTED)...
Retrieving speedtest.net server list...
Selecting best server based on ping...
Hosted by Ziply Fiber (Seattle, WA) [6.71 km]: 12.355 ms
Testing download speed................................................................................
Download: 443.00 Mbit/s
Testing upload speed......................................................................................................
Upload: 410.68 Mbit/s

Here’s the same test run from macOS against the exact same EKS cluster, taking 12+ seconds.

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.5", GitCommit:"c285e781331a3785a7f436042c65c5641ce8a9e9", GitTreeState:"archive", BuildDate:"1980-01-01T00:00:00Z", GoVersion:"go1.17.10", Compiler:"gc", Platform:"darwin/arm64"}
Server Version: version.Info{Major:"1", Minor:"22+", GitVersion:"v1.22.9-eks-a64ea69", GitCommit:"540410f9a2e24b7a2a870ebfacb3212744b5f878", GitTreeState:"clean", BuildDate:"2022-05-12T19:15:31Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}
$ rm -rf ~/.kube/cache
$ time kubectl get nodes
NAME                                            STATUS   ROLES    AGE   VERSION
ip-192-168-103-188.us-west-2.compute.internal   Ready    <none>   47h   v1.22.9-eks-810597c
ip-192-168-63-15.us-west-2.compute.internal     Ready    <none>   47h   v1.22.9-eks-810597c
ip-192-168-93-0.us-west-2.compute.internal      Ready    <none>   47h   v1.22.9-eks-810597c
kubectl get nodes  0.28s user 0.42s system 5% cpu 12.459 total

What did you expect to happen?

I expected kubectl discovery on macOS to be as fast as it was on Linux.

How can we reproduce it (as minimally and precisely as possible)?

  1. Create an EKS cluster.
  2. Create ~1,500 CRDs across ~350 API groups (see the sketch after this list for one way to generate them).
  3. Delete the discovery cache in ~/.kube/cache.
  4. Time how long the first kubectl command you run against the EKS cluster takes to complete.
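
For step 2, a hypothetical sketch of one way to mass-create the CRDs - a small Go program that prints minimal CRD manifests across many API groups, suitable for piping to kubectl apply -f - (all group and kind names are invented for illustration):

package main

import "fmt"

// Prints ~1,400 minimal CRD manifests spread across ~350 API groups.
// Usage: go run gencrds.go | kubectl apply -f -
func main() {
	const groups, kindsPerGroup = 350, 4
	for g := 0; g < groups; g++ {
		for k := 0; k < kindsPerGroup; k++ {
			fmt.Printf(`---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: kind%[2]ds.group%[1]d.example.org
spec:
  group: group%[1]d.example.org
  names:
    kind: Kind%[2]d
    listKind: Kind%[2]dList
    plural: kind%[2]ds
    singular: kind%[2]d
  scope: Namespaced
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
`, g, k)
		}
	}
}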

Anything else we need to know?

Refer to https://github.com/crossplane/crossplane/issues/2895#issuecomment-1162688419 and subsequent comments for more examples of folks reproducing this.

Kubernetes version

Note that the version of kubectl was the same on Mac and Linux for my tests.
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.5", GitCommit:"c285e781331a3785a7f436042c65c5641ce8a9e9", GitTreeState:"archive", BuildDate:"1980-01-01T00:00:00Z", GoVersion:"go1.17.10", Compiler:"gc", Platform:"linux/arm64"}
Server Version: version.Info{Major:"1", Minor:"22+", GitVersion:"v1.22.9-eks-a64ea69", GitCommit:"540410f9a2e24b7a2a870ebfacb3212744b5f878", GitTreeState:"clean", BuildDate:"2022-05-12T19:15:31Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}

Cloud provider

AWS - EKS

OS version

macOS 12.4, M1 Max

Linux as below

# On Linux:
$ cat /etc/os-release
BUG_REPORT_URL="https://github.com/NixOS/nixpkgs/issues"
BUILD_ID="22.05.20220528.d108690"
DOCUMENTATION_URL="https://nixos.org/learn.html"
HOME_URL="https://nixos.org/"
ID=nixos
LOGO="nix-snowflake"
NAME=NixOS
PRETTY_NAME="NixOS 22.05 (Quokka)"
SUPPORT_URL="https://nixos.org/community.html"
VERSION="22.05 (Quokka)"
VERSION_CODENAME=quokka
VERSION_ID="22.05"%
$ uname -a
Linux mael 5.18.0 #1-NixOS SMP Sun May 22 19:52:31 UTC 2022 aarch64 GNU/Linux

Install tools

N/A

Container runtime (CRI) and version (if applicable)

N/A

Related plugins (CNI, CSI, …) and versions (if applicable)

N/A

Most upvoted comments

Watch out for anything where writes trigger fsync(): there was an issue on macOS where the only way for an app to be confident that a write had completed was to trigger a whole-filesystem sync. Libraries that care about data integrity opted for the slower-but-effective approach. I’ve stopped using macOS so this hint is from a while back, but it might be relevant.
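
To make the macOS distinction concrete, here’s a darwin-only sketch of the two flush strengths. F_FULLFSYNC is the fcntl command in question, and recent Go versions already issue it from (*os.File).Sync on darwin - so any library calling Sync per cache file pays the whole-device flush:

//go:build darwin

package main

import (
	"os"

	"golang.org/x/sys/unix"
)

func main() {
	f, err := os.Create("/tmp/cache-entry")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	if _, err := f.Write([]byte("cached discovery data")); err != nil {
		panic(err)
	}

	// Plain fsync on macOS pushes data to the drive but lets the drive
	// keep it in its own volatile cache - fast, but not fully durable.
	if err := unix.Fsync(int(f.Fd())); err != nil {
		panic(err)
	}

	// F_FULLFSYNC additionally asks the drive to flush its cache to
	// stable storage. This is the slow, whole-device flush described
	// above, and what (*os.File).Sync does on darwin in recent Go.
	if _, err := unix.FcntlInt(f.Fd(), unix.F_FULLFSYNC, 0); err != nil {
		panic(err)
	}
}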

Given that it is indeed only a cache, I lean toward optimizing for write performance rather than data safety.

I don’t care a ton if we drop a cache file. I care a lot if a failed write leaves a partially written file that pollutes the cache and makes the user delete it manually.

I don’t know which of those we’re opening ourselves up to by writing without sync. Can we find out?
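
One standard answer is the write-to-temp-then-rename pattern - a sketch, not necessarily what client-go does today. rename(2) is atomic on POSIX filesystems, so readers see either the old cache file or the new one, never a partial write; skipping fsync only risks losing the update entirely on a crash, which is acceptable for a re-fetchable cache:

package main

import (
	"os"
	"path/filepath"
)

// writeCacheFile atomically replaces path with data. Readers observe
// either the old contents or the new contents, never a torn write.
// No fsync is issued, so a machine crash may lose the update - fine
// for a cache that can simply be re-fetched.
func writeCacheFile(path string, data []byte) error {
	// Create the temp file in the target directory so the rename
	// below stays on one filesystem and remains atomic.
	tmp, err := os.CreateTemp(filepath.Dir(path), ".cache-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // best-effort cleanup; harmless after a successful rename

	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

func main() {
	if err := writeCacheFile("/tmp/discovery.json", []byte("{}")); err != nil {
		panic(err)
	}
}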

Also note that this should fix a lot of that problem: https://github.com/kubernetes/enhancements/pull/3364

Thanks for the tip @sftim!

The issue definitely seems to be in the *disk.CachedDiscoveryClient. When I apply the diff below to switch over to a *memory.memCacheClient the slowness disappears and macOS performance matches that of Linux - about a second for kubectl get nodes to perform discovery:

$ bin/kubectl api-resources|wc -l
1633

$ time bin/kubectl get nodes
NAME                                            STATUS   ROLES    AGE    VERSION
ip-192-168-103-188.us-west-2.compute.internal   Ready    <none>   5d3h   v1.22.9-eks-810597c
ip-192-168-63-15.us-west-2.compute.internal     Ready    <none>   5d3h   v1.22.9-eks-810597c
ip-192-168-93-0.us-west-2.compute.internal      Ready    <none>   5d3h   v1.22.9-eks-810597c
bin/kubectl get nodes  0.13s user 0.06s system 17% cpu 1.058 total

# We're using an in-memory cache - no cache directory is being created
$ ls ~/.kube
config  configbak  kind.yaml
diff --git a/staging/src/k8s.io/cli-runtime/pkg/genericclioptions/config_flags.go b/staging/src/k8s.io/cli-runtime/pkg/genericclioptions/config_flags.go
index 0d604b9c2fb..12d7c51da0b 100644
--- a/staging/src/k8s.io/cli-runtime/pkg/genericclioptions/config_flags.go
+++ b/staging/src/k8s.io/cli-runtime/pkg/genericclioptions/config_flags.go
@@ -22,13 +22,12 @@ import (
        "regexp"
        "strings"
        "sync"
-       "time"

        "github.com/spf13/pflag"

        "k8s.io/apimachinery/pkg/api/meta"
        "k8s.io/client-go/discovery"
-       diskcached "k8s.io/client-go/discovery/cached/disk"
+       "k8s.io/client-go/discovery/cached/memory"
        "k8s.io/client-go/rest"
        "k8s.io/client-go/restmapper"
        "k8s.io/client-go/tools/clientcmd"
@@ -275,17 +274,8 @@ func (f *ConfigFlags) toDiscoveryClient() (discovery.CachedDiscoveryInterface, e
        config.Burst = f.discoveryBurst
        config.QPS = f.discoveryQPS

-       cacheDir := defaultCacheDir
-
-       // retrieve a user-provided value for the "cache-dir"
-       // override httpCacheDir and discoveryCacheDir if user-value is given.
-       if f.CacheDir != nil {
-               cacheDir = *f.CacheDir
-       }
-       httpCacheDir := filepath.Join(cacheDir, "http")
-       discoveryCacheDir := computeDiscoverCacheDir(filepath.Join(cacheDir, "discovery"), config.Host)
-
-       return diskcached.NewCachedDiscoveryClientForConfig(config, discoveryCacheDir, httpCacheDir, time.Duration(6*time.Hour))
+       dc, err := discovery.NewDiscoveryClientForConfig(config)
+       return memory.NewMemCacheClient(dc), err
 }

 // ToRESTMapper returns a mapper.