kubernetes: kubectl discovery dramatically slower on macOS than Linux

What happened?

We use a lot of CRDs (frequently more than 1,000) in the @crossplane project. This has historically caused a bunch of problems with client-side API discovery, among other things. At the time of writing the latest releases of kubectl use a discovery client rate limiter with a generous QPS and burst (50 QPS, with a burst of 300), so client-side rate limiting doesn’t become a problem until the API server is serving more than 300 API groups.
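
For illustration, here’s a minimal client-go sketch of where those limits live - the QPS and Burst fields on rest.Config are real; the kubeconfig path and panic-based error handling are just for the example:

package main

import (
	"fmt"

	"k8s.io/client-go/discovery"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a rest.Config from a kubeconfig. The path is illustrative.
	config, err := clientcmd.BuildConfigFromFlags("", "/home/user/.kube/config")
	if err != nil {
		panic(err)
	}

	// The client-side rate limits discussed above. kubectl configures
	// similarly generous defaults for its discovery client.
	config.QPS = 50
	config.Burst = 300

	dc, err := discovery.NewDiscoveryClientForConfig(config)
	if err != nil {
		panic(err)
	}

	// Discovery fetches the resource list for every API group, so with
	// 300+ groups the burst is what keeps rate limiting out of the way.
	groups, err := dc.ServerGroups()
	if err != nil {
		panic(err)
	}
	fmt.Printf("discovered %d API groups\n", len(groups.Groups))
}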

Despite rate limiting no longer coming into effect, we noticed that discovery was still very slow, taking 10-20 seconds to complete. After having a few folks at @upbound run a little test (see https://github.com/crossplane/crossplane/issues/2895#issuecomment-1162688419 and the following comments) we noticed that discovery seems slow specifically on macOS - it’s dramatically faster on Linux.

Here’s discovery happening on Linux with 1,577 CRDs across 343 API groups in around a second:

$ kubectl --kubeconfig=$HOME/.kube/config.eks version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.5", GitCommit:"c285e781331a3785a7f436042c65c5641ce8a9e9", GitTreeState:"archive", BuildDate:"1980-01-01T00:00:00Z", GoVersion:"go1.17.10", Compiler:"gc", Platform:"linux/arm64"}
Server Version: version.Info{Major:"1", Minor:"22+", GitVersion:"v1.22.9-eks-a64ea69", GitCommit:"540410f9a2e24b7a2a870ebfacb3212744b5f878", GitTreeState:"clean", BuildDate:"2022-05-12T19:15:31Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}

$ kubectl --kubeconfig=$HOME/.kube/config.eks get crd|wc -l
1577

$ kubectl --kubeconfig=$HOME/.kube/config.eks api-versions|wc -l
343

$ rm -rf ~/.kube/cache

$ time kubectl --kubeconfig=$HOME/.kube/config.eks get nodes
NAME                                            STATUS   ROLES    AGE   VERSION
ip-192-168-103-188.us-west-2.compute.internal   Ready    <none>   8h    v1.22.9-eks-810597c
ip-192-168-63-15.us-west-2.compute.internal     Ready    <none>   8h    v1.22.9-eks-810597c
ip-192-168-93-0.us-west-2.compute.internal      Ready    <none>   8h    v1.22.9-eks-810597c
kubectl --kubeconfig=$HOME/.kube/config.eks get nodes  0.17s user 0.21s system 35% cpu 1.067 total

$ du -hs ~/.kube/cache
7.5M    /home/negz/.kube/cache

$ speedtest
Retrieving speedtest.net configuration...
Testing from CenturyLink (REDACTED)...
Retrieving speedtest.net server list...
Selecting best server based on ping...
Hosted by Ziply Fiber (Seattle, WA) [6.71 km]: 12.355 ms
Testing download speed................................................................................
Download: 443.00 Mbit/s
Testing upload speed......................................................................................................
Upload: 410.68 Mbit/s

Here’s the same test run from macOS against the exact same EKS cluster, taking 12+ seconds.

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.5", GitCommit:"c285e781331a3785a7f436042c65c5641ce8a9e9", GitTreeState:"archive", BuildDate:"1980-01-01T00:00:00Z", GoVersion:"go1.17.10", Compiler:"gc", Platform:"darwin/arm64"}
Server Version: version.Info{Major:"1", Minor:"22+", GitVersion:"v1.22.9-eks-a64ea69", GitCommit:"540410f9a2e24b7a2a870ebfacb3212744b5f878", GitTreeState:"clean", BuildDate:"2022-05-12T19:15:31Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}
$ rm -rf ~/.kube/cache
$ time kubectl get nodes
NAME                                            STATUS   ROLES    AGE   VERSION
ip-192-168-103-188.us-west-2.compute.internal   Ready    <none>   47h   v1.22.9-eks-810597c
ip-192-168-63-15.us-west-2.compute.internal     Ready    <none>   47h   v1.22.9-eks-810597c
ip-192-168-93-0.us-west-2.compute.internal      Ready    <none>   47h   v1.22.9-eks-810597c
kubectl get nodes  0.28s user 0.42s system 5% cpu 12.459 total

What did you expect to happen?

I expected kubectl discovery on macOS to be as fast as it was on Linux.

How can we reproduce it (as minimally and precisely as possible)?

  1. Create an EKS cluster.
  2. Create ~1,500 CRDs across ~350 API groups (see the sketch after this list for one way to generate them).
  3. Delete the discovery cache in ~/.kube/cache.
  4. Time how long the first kubectl command you run against the EKS cluster takes to complete.
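
For step 2, a hypothetical sketch of one way to mass-create the CRDs - a small Go program that prints minimal CRD manifests across many API groups, suitable for piping to kubectl apply -f - (all group and kind names are invented for illustration):

package main

import "fmt"

// Prints ~1,400 minimal CRD manifests spread across ~350 API groups.
// Usage: go run gencrds.go | kubectl apply -f -
func main() {
	const groups, kindsPerGroup = 350, 4
	for g := 0; g < groups; g++ {
		for k := 0; k < kindsPerGroup; k++ {
			fmt.Printf(`---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: kind%[2]ds.group%[1]d.example.org
spec:
  group: group%[1]d.example.org
  names:
    kind: Kind%[2]d
    listKind: Kind%[2]dList
    plural: kind%[2]ds
    singular: kind%[2]d
  scope: Namespaced
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
`, g, k)
		}
	}
}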

Anything else we need to know?

Refer to https://github.com/crossplane/crossplane/issues/2895#issuecomment-1162688419 and subsequent comments for more examples of folks reproducing this.

Kubernetes version

Note that the version of kubectl was the same on Mac and Linux for my tests.
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.5", GitCommit:"c285e781331a3785a7f436042c65c5641ce8a9e9", GitTreeState:"archive", BuildDate:"1980-01-01T00:00:00Z", GoVersion:"go1.17.10", Compiler:"gc", Platform:"linux/arm64"}
Server Version: version.Info{Major:"1", Minor:"22+", GitVersion:"v1.22.9-eks-a64ea69", GitCommit:"540410f9a2e24b7a2a870ebfacb3212744b5f878", GitTreeState:"clean", BuildDate:"2022-05-12T19:15:31Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}

Cloud provider

AWS - EKS

OS version

macOS 12.4, M1 Max

Linux as below

# On Linux:
$ cat /etc/os-release
BUG_REPORT_URL="https://github.com/NixOS/nixpkgs/issues"
BUILD_ID="22.05.20220528.d108690"
DOCUMENTATION_URL="https://nixos.org/learn.html"
HOME_URL="https://nixos.org/"
ID=nixos
LOGO="nix-snowflake"
NAME=NixOS
PRETTY_NAME="NixOS 22.05 (Quokka)"
SUPPORT_URL="https://nixos.org/community.html"
VERSION="22.05 (Quokka)"
VERSION_CODENAME=quokka
VERSION_ID="22.05"%
$ uname -a
Linux mael 5.18.0 #1-NixOS SMP Sun May 22 19:52:31 UTC 2022 aarch64 GNU/Linux

Install tools

N/A

Container runtime (CRI) and version (if applicable)

N/A

Related plugins (CNI, CSI, …) and versions (if applicable)

N/A

Most upvoted comments

Watch out for anything where writes trigger fsync(): there was an issue on macOS where the only way for an app to be confident that a write had completed was to trigger a whole-filesystem sync. Libraries that care about data integrity opted for the slower-but-effective approach. I’ve stopped using macOS so this hint is from a while back, but it might be relevant.
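
To make the macOS distinction concrete, here’s a darwin-only sketch of the two flush strengths. F_FULLFSYNC is the fcntl command in question, and recent Go versions already issue it from (*os.File).Sync on darwin - so any library calling Sync per cache file pays the whole-device flush:

//go:build darwin

package main

import (
	"os"

	"golang.org/x/sys/unix"
)

func main() {
	f, err := os.Create("/tmp/cache-entry")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	if _, err := f.Write([]byte("cached discovery data")); err != nil {
		panic(err)
	}

	// Plain fsync on macOS pushes data to the drive but lets the drive
	// keep it in its own volatile cache - fast, but not fully durable.
	if err := unix.Fsync(int(f.Fd())); err != nil {
		panic(err)
	}

	// F_FULLFSYNC additionally asks the drive to flush its cache to
	// stable storage. This is the slow, whole-device flush described
	// above, and what (*os.File).Sync does on darwin in recent Go.
	if _, err := unix.FcntlInt(f.Fd(), unix.F_FULLFSYNC, 0); err != nil {
		panic(err)
	}
}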

Given that it is indeed only a cache, I lean toward optimizing for write performance rather than data safety.

I don’t care a ton if we drop a cache file. I care a lot if a failed write leaves a partially written file that pollutes the cache and makes the user delete it manually.

I don’t know which of those we’re opening ourselves up to by writing without sync. Can we find out?
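
One standard answer is the write-to-temp-then-rename pattern - a sketch, not necessarily what client-go does today. rename(2) is atomic on POSIX filesystems, so readers see either the old cache file or the new one, never a partial write; skipping fsync only risks losing the update entirely on a crash, which is acceptable for a re-fetchable cache:

package main

import (
	"os"
	"path/filepath"
)

// writeCacheFile atomically replaces path with data. Readers observe
// either the old contents or the new contents, never a torn write.
// No fsync is issued, so a machine crash may lose the update - fine
// for a cache that can simply be re-fetched.
func writeCacheFile(path string, data []byte) error {
	// Create the temp file in the target directory so the rename
	// below stays on one filesystem and remains atomic.
	tmp, err := os.CreateTemp(filepath.Dir(path), ".cache-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // best-effort cleanup; harmless after a successful rename

	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

func main() {
	if err := writeCacheFile("/tmp/discovery.json", []byte("{}")); err != nil {
		panic(err)
	}
}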

Also note that this should fix a lot of that problem: https://github.com/kubernetes/enhancements/pull/3364

Thanks for the tip @sftim!

The issue definitely seems to be in the *disk.CachedDiscoveryClient. When I apply the diff below to switch over to a *memory.memCacheClient the slowness disappears and macOS performance matches that of Linux - about a second for kubectl get nodes to perform discovery:

$ bin/kubectl api-resources|wc -l
1633

$ time bin/kubectl get nodes
NAME                                            STATUS   ROLES    AGE    VERSION
ip-192-168-103-188.us-west-2.compute.internal   Ready    <none>   5d3h   v1.22.9-eks-810597c
ip-192-168-63-15.us-west-2.compute.internal     Ready    <none>   5d3h   v1.22.9-eks-810597c
ip-192-168-93-0.us-west-2.compute.internal      Ready    <none>   5d3h   v1.22.9-eks-810597c
bin/kubectl get nodes  0.13s user 0.06s system 17% cpu 1.058 total

# We're using an in-memory cache - no cache directory is being created
$ ls ~/.kube
config  configbak  kind.yaml
diff --git a/staging/src/k8s.io/cli-runtime/pkg/genericclioptions/config_flags.go b/staging/src/k8s.io/cli-runtime/pkg/genericclioptions/config_flags.go
index 0d604b9c2fb..12d7c51da0b 100644
--- a/staging/src/k8s.io/cli-runtime/pkg/genericclioptions/config_flags.go
+++ b/staging/src/k8s.io/cli-runtime/pkg/genericclioptions/config_flags.go
@@ -22,13 +22,12 @@ import (
        "regexp"
        "strings"
        "sync"
-       "time"

        "github.com/spf13/pflag"

        "k8s.io/apimachinery/pkg/api/meta"
        "k8s.io/client-go/discovery"
-       diskcached "k8s.io/client-go/discovery/cached/disk"
+       "k8s.io/client-go/discovery/cached/memory"
        "k8s.io/client-go/rest"
        "k8s.io/client-go/restmapper"
        "k8s.io/client-go/tools/clientcmd"
@@ -275,17 +274,8 @@ func (f *ConfigFlags) toDiscoveryClient() (discovery.CachedDiscoveryInterface, e
        config.Burst = f.discoveryBurst
        config.QPS = f.discoveryQPS

-       cacheDir := defaultCacheDir
-
-       // retrieve a user-provided value for the "cache-dir"
-       // override httpCacheDir and discoveryCacheDir if user-value is given.
-       if f.CacheDir != nil {
-               cacheDir = *f.CacheDir
-       }
-       httpCacheDir := filepath.Join(cacheDir, "http")
-       discoveryCacheDir := computeDiscoverCacheDir(filepath.Join(cacheDir, "discovery"), config.Host)
-
-       return diskcached.NewCachedDiscoveryClientForConfig(config, discoveryCacheDir, httpCacheDir, time.Duration(6*time.Hour))
+       dc, err := discovery.NewDiscoveryClientForConfig(config)
+       return memory.NewMemCacheClient(dc), err
 }

 // ToRESTMapper returns a mapper.