karpenter-provider-aws: Nodes come up slower than expected

Version

Karpenter: v0.8.1

Kubernetes: v1.21.1

Expected Behavior

Brandon (on Slack) suggested that my nodes should be coming up a little faster and asked me to post an issue so the team could look into it.

Actual Behavior

I have an EKS cluster running the latest CNI plugin (v1.10.2), and I am using the latest version of Karpenter to scale nodes out dynamically. That works great, but I am seeing some slowness when a new node is added that I can’t explain and would like to understand and, if possible, improve. The following are my observations; I have found these times to be fairly consistent over multiple tests.

Our core cluster nodes have room for 6 pods of our application. When I add a new pod to the cluster and it gets scheduled on an existing node, the pod is ready and available in ~40s.

Adding two pods, so that Karpenter has to create a new node, results in a t3.medium:

  • Without linkerd-proxy
    • node is ready in ~ 80-90s
    • pods sit in ContainerCreating for 3m, waiting on the CNI
    • pods ready and available at ~ 3m45s
  • With linkerd-proxy
    • node is ready in ~ 80-90s
    • init container executes almost immediately.
    • pods sit in PodInitializing for ~3m10s (until the proxy is ready)
    • pods fully ready in ~ 4m

Adding eight new pods, Karpenter creates a t3.2xlarge:

  • With linkerd-proxy
    • node is ready in ~ 80-90s
    • aws-node pod takes 30s to start
    • init containers execute at ~2m20s
    • pods sit in PodInitializing for ~6m40s (until the proxy is ready)
    • pods fully ready in ~ 7m30s

During this time the pods report that they are waiting on the CNI to be available, but the CNI only takes ~30s to come up after the node is ready. It also doesn’t explain why it takes so much longer for a larger node than a smaller one. I suspect this might be related: https://github.com/aws/amazon-vpc-cni-k8s/pull/1943, but I don’t know when that will make it into a release.
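
For reference, this is roughly how I watch the timings (illustrative commands; the app label and prod namespace match the specs below):

kubectl get pods -n prod -l app=dn -w                 # watch how long pods sit in ContainerCreating / PodInitializing
kubectl get events -n prod --sort-by=.lastTimestamp   # scheduling and CNI events in time order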

I have also created an issue on the aws/amazon-vpc-cni-k8s repo: https://github.com/aws/amazon-vpc-cni-k8s/issues/1956

Steps to Reproduce the Problem

Use Karpenter to create a small node (e.g. t3.small) and note the times, then have it create a large node (e.g. t3.xlarge) and note the difference in how long it takes for your pods to become available.
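
One way to drive this, as a rough sketch (replica counts are illustrative; the deployment and namespace names match the specs below):

# From a baseline where the core nodes are full, add 2 replicas to trigger a small node
kubectl scale deployment/app -n prod --replicas=8
kubectl get nodes -w          # note how long the new node takes to become Ready

# Then add 8 replicas so Karpenter picks a larger instance type, and compare
kubectl scale deployment/app -n prod --replicas=16
kubectl get pods -n prod -l app=dn -w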

Resource Specs and Logs

Provisioner spec

---
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: docnetwork-app

spec:
  ttlSecondsUntilExpired: 21600 # 6 Hours = 60 * 60 * 6 Seconds
  ttlSecondsAfterEmpty: 30

  taints:
    - key: dedicated
      value: dn-prod
      effect: NoSchedule

  # Labels are arbitrary key-values that are applied to all nodes
  labels:
    app: dn
    dedicated: dn-prod
    group: app-dedicated
    ManagedBy: karpenter
    # node-role.kubernetes.io/karpenter: app-dedicated

  requirements:
  # - key: node.kubernetes.io/instance-type
  #   operator: In
  #   values: ["c5.xlarge"]
  - key: karpenter.sh/capacity-type
    operator: In
    values: ["spot", "on-demand"]
  - key: kubernetes.io/arch
    operator: In
    values:
    - amd64

  provider:
    instanceProfile: dn-main-eks-karpenter-instance-profile
    blockDeviceMappings:
      - deviceName: /dev/xvda
        ebs:
          volumeSize: 50Gi
          volumeType: gp2
          encrypted: false
          deleteOnTermination: true

    securityGroupSelector:
      aws:eks:cluster-name: dn-main-eks
    subnetSelector:
      Name: dn-main-private-*
    tags:
      Name: dn-main-eks-node
      datadog: "yes"
      ManagedBy: karpenter

Deployment Spec:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
  namespace: prod
  annotations:
    ignore-check.kube-linter.io/no-read-only-root-fs: "PM2 needs access to write to
      root fs"
    ignore-check.kube-linter.io/run-as-non-root: "PM2 needs access to write to root
      fs"
    ignore-check.kube-linter.io/no-anti-affinity: "Not supported by karpenter"
  labels:
    app: dn
spec:
  replicas: 2 # Prod value is 16
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
  selector:
    matchLabels:
      app: dn
  template:
    metadata:
      annotations:
        config.alpha.linkerd.io/proxy-wait-before-exit-seconds: "30"
        linkerd.io/inject: enabled
        # The following configures our custom metrics picked up by datadog's openmetrics integration.
        ad.datadoghq.com/app.check_names: |
          ["openmetrics"]
        ad.datadoghq.com/app.init_configs: |
          [{}]
        ad.datadoghq.com/app.instances: |
          [
            {
              "prometheus_url": "http://%%host%%:3001/cluster_metrics",
              "namespace": "prom",
              "metrics": [
                {
                  "docnetwork_api_http_requests_total": "docnetwork_api_http_requests_total",
                  "docnetwork_read_replica_choice_count": "docnetwork_read_replica_choice_count",
                  "docnetwork_application_query_time_milliseconds": "docnetwork_application_query_time_milliseconds",
                  "docnetwork_notifications_created_count": "docnetwork_notifications_created_count",
                  "docnetwork_app_transaction_success_total": "docnetwork_app_transaction_success_total",
                  "docnetwork_app_transaction_failure_total": "docnetwork_app_transaction_failure_total",
                  "docnetwork_app_transaction_absolute_amount_cents": "docnetwork_app_transaction_absolute_amount_cents",
                  "docnetwork_get_access_time_milliseconds": "docnetwork_get_access_time_milliseconds",
                  "docnetwork_slow_time_seconds": "docnetwork_slow_time_seconds",
                  "docnetwork_pending_query_pressure_primary": "docnetwork_pending_query_pressure_primary",
                  "docnetwork_pending_query_pressure_replicas": "docnetwork_pending_query_pressure_replicas",
                  "docnetwork_app_memory_rss_bytes": "docnetwork_app_memory_rss_bytes",
                  "docnetwork_app_memory_heap_total_bytes": "docnetwork_app_memory_heap_total_bytes",
                  "docnetwork_app_memory_heap_used_bytes": "docnetwork_app_memory_heap_used_bytes",
                  "docnetwork_app_memory_external_bytes": "docnetwork_app_memory_external_bytes",
                  "docnetwork_http_requests_total": "docnetwork_http_requests_total",
                  "docnetwork_http_request_time_nanoseconds": "docnetwork_http_request_time_nanoseconds",
                  "docnetwork_api_http_requests_total": "docnetwork_api_http_requests_total",
                  "docnetwork_api_http_request_time_nanoseconds": "docnetwork_api_http_request_time_nanoseconds",
                  "docnetwork_api_http_request_error_total": "docnetwork_api_http_request_error_total",
                  "docnetwork_authentication_failed_total": "docnetwork_authentication_failed_total",
                  "docnetwork_authentication_success_total": "docnetwork_authentication_success_total",
                  "docnetwork_app_oauth2_successful_authentication_count": "docnetwork_app_oauth2_successful_authentication_count",
                  "docnetwork_app_oauth2_failed_authentication_count": "docnetwork_app_oauth2_failed_authentication_count",
                  "docnetwork_app_oauth_refresh_token_success_total": "docnetwork_app_oauth_refresh_token_success_total",
                  "docnetwork_app_oauth_refresh_token_failed_total": "docnetwork_app_oauth_refresh_token_failed_total",
                  "docnetwork_app_csv_parse_time_milliseconds": "docnetwork_app_csv_parse_time_milliseconds",
                  "docnetwork_app_xlsx_parse_time_milliseconds": "docnetwork_app_xlsx_parse_time_milliseconds",
                  "docnetwork_imported_registration_successful": "docnetwork_imported_registration_successful",
                  "docnetwork_patient_imported_rows_total": "docnetwork_patient_imported_rows_total",
                  "docnetwork_health_log_route_time_milliseconds": "docnetwork_health_log_route_time_milliseconds",
                  "docnetwork_app_custom_reports_profiles_count": "docnetwork_app_custom_reports_profiles_count",
                  "docnetwork_app_custom_reports_questions_count": "docnetwork_app_custom_reports_questions_count",
                  "docnetwork_app_custom_reports_pdf_count": "docnetwork_app_custom_reports_pdf_count",
                  "docnetwork_app_custom_reports_csv_count": "docnetwork_app_custom_reports_csv_count",
                  "docnetwork_app_custom_reports_include_images_count": "docnetwork_app_custom_reports_include_images_count",
                  "docnetwork_app_custom_reports_include_expired_count": "docnetwork_app_custom_reports_include_expired_count",
                  "docnetwork_profiles_register_failed": "docnetwork_profiles_register_failed",
                  "docnetwork_profiles_register_successful": "docnetwork_profiles_register_successful",
                  "docnetwork_provider_registration_success": "docnetwork_provider_registration_success"
                }
              ],
              "password": "%%env_APP_METRICS_ENDPOINT_PASSWORD%%",
              "username": "user-prometheus",
              "use_legacy_auth_encoding": "true",
              "auth_type": "basic",
              "tags": ["env:prod"]
            }
          ]
      labels:
        app: dn
    spec:
      containers:
      - name: app
        image: 878788551012.dkr.ecr.us-east-1.amazonaws.com/dn-testing-app:2022.0329.0715 # {"$imagescan": "app"}
        imagePullPolicy: "Always"
        env:
        - name: DN_ENVIRONMENT
          value: "development"
        ports:
        - containerPort: 3000
          name: dn-app
        - containerPort: 3001
          name: metrics
        resources:
          requests:
            memory: "1G"
            cpu: ".5"
          limits:
            memory: "4G"
            cpu: "2"
        livenessProbe:
          httpGet:
            path: /healthz
            port: 3000
          initialDelaySeconds: 5
          periodSeconds: 5
          timeoutSeconds: 2
          successThreshold: 1
          failureThreshold: 3
        readinessProbe:
          httpGet:
            scheme: HTTP
            path: /healthz
            port: 3000
          initialDelaySeconds: 10
          periodSeconds: 2
          timeoutSeconds: 2
          successThreshold: 3
          failureThreshold: 1
      serviceAccountName: app
      terminationGracePeriodSeconds: 60 # 1Min - Default is 30 Seconds
      priorityClassName: docnetwork-application
      tolerations:
      - key: "dedicated"
        operator: "Equal"
        value: "dn-prod"
        effect: "NoSchedule"
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: group
                operator: In
                values:
                - app-dedicated

Controller Logs for small batch:

2022-04-07T15:33:26.788Z	INFO	Successfully created the logger.
2022-04-07T15:33:26.788Z	INFO	Logging level set to: debug
{"level":"info","ts":1649345606.8944435,"logger":"fallback","caller":"injection/injection.go:61","msg":"Starting informers..."}
2022-04-07T15:33:26.894Z	INFO	controller	Initializing with version v0.8.1	{"commit": "8280611"}
2022-04-07T15:33:26.894Z	DEBUG	controller.aws	Using AWS region us-east-1	{"commit": "8280611"}
2022-04-07T15:33:26.894Z	DEBUG	controller.aws	Discovered caBundle, length 1066	{"commit": "8280611"}
2022-04-07T15:33:26.894Z	DEBUG	controller.aws.launchtemplate	Hydrating the launch template cache with names matching "Karpenter-dn-main-eks-*"	{"commit": "8280611"}
2022-04-07T15:33:27.184Z	DEBUG	controller.aws.launchtemplate	Finished hydrating the launch template cache with 1 items	{"commit": "8280611"}
I0407 15:33:27.204351       1 leaderelection.go:243] attempting to acquire leader lease karpenter/karpenter-leader-election...
2022-04-07T15:33:27.204Z	INFO	controller	starting metrics server	{"commit": "8280611", "path": "/metrics"}
I0407 15:33:42.786013       1 leaderelection.go:253] successfully acquired lease karpenter/karpenter-leader-election
2022-04-07T15:33:42.786Z	DEBUG	controller.events	Normal	{"commit": "8280611", "object": {"kind":"ConfigMap","namespace":"karpenter","name":"karpenter-leader-election","uid":"73244275-6dfb-416b-8f82-b45e9297c73b","apiVersion":"v1","resourceVersion":"14121116"}, "reason": "LeaderElection", "message": "karpenter-546cf59795-zzcgd_35301933-d7c6-40df-bb5d-51516c8361e8 became leader"}
2022-04-07T15:33:42.786Z	DEBUG	controller.events	Normal	{"commit": "8280611", "object": {"kind":"Lease","namespace":"karpenter","name":"karpenter-leader-election","uid":"6650b7aa-5a39-4696-8162-a5e9f3752a7b","apiVersion":"coordination.k8s.io/v1","resourceVersion":"14121117"}, "reason": "LeaderElection", "message": "karpenter-546cf59795-zzcgd_35301933-d7c6-40df-bb5d-51516c8361e8 became leader"}
2022-04-07T15:33:42.786Z	INFO	controller.controller.counter	Starting EventSource	{"commit": "8280611", "reconciler group": "karpenter.sh", "reconciler kind": "Provisioner", "source": "kind source: /, Kind="}
2022-04-07T15:33:42.786Z	INFO	controller.controller.counter	Starting EventSource	{"commit": "8280611", "reconciler group": "karpenter.sh", "reconciler kind": "Provisioner", "source": "kind source: /, Kind="}
2022-04-07T15:33:42.786Z	INFO	controller.controller.counter	Starting Controller	{"commit": "8280611", "reconciler group": "karpenter.sh", "reconciler kind": "Provisioner"}
2022-04-07T15:33:42.786Z	INFO	controller.controller.provisioning	Starting EventSource	{"commit": "8280611", "reconciler group": "karpenter.sh", "reconciler kind": "Provisioner", "source": "kind source: /, Kind="}
2022-04-07T15:33:42.786Z	INFO	controller.controller.provisioning	Starting Controller	{"commit": "8280611", "reconciler group": "karpenter.sh", "reconciler kind": "Provisioner"}
2022-04-07T15:33:42.787Z	INFO	controller.controller.volume	Starting EventSource	{"commit": "8280611", "reconciler group": "", "reconciler kind": "PersistentVolumeClaim", "source": "kind source: /, Kind="}
2022-04-07T15:33:42.787Z	INFO	controller.controller.volume	Starting EventSource	{"commit": "8280611", "reconciler group": "", "reconciler kind": "PersistentVolumeClaim", "source": "kind source: /, Kind="}
2022-04-07T15:33:42.787Z	INFO	controller.controller.volume	Starting Controller	{"commit": "8280611", "reconciler group": "", "reconciler kind": "PersistentVolumeClaim"}
2022-04-07T15:33:42.787Z	INFO	controller.controller.termination	Starting EventSource	{"commit": "8280611", "reconciler group": "", "reconciler kind": "Node", "source": "kind source: /, Kind="}
2022-04-07T15:33:42.787Z	INFO	controller.controller.termination	Starting Controller	{"commit": "8280611", "reconciler group": "", "reconciler kind": "Node"}
2022-04-07T15:33:42.787Z	INFO	controller.controller.node	Starting EventSource	{"commit": "8280611", "reconciler group": "", "reconciler kind": "Node", "source": "kind source: /, Kind="}
2022-04-07T15:33:42.787Z	INFO	controller.controller.node	Starting EventSource	{"commit": "8280611", "reconciler group": "", "reconciler kind": "Node", "source": "kind source: /, Kind="}
2022-04-07T15:33:42.787Z	INFO	controller.controller.node	Starting EventSource	{"commit": "8280611", "reconciler group": "", "reconciler kind": "Node", "source": "kind source: /, Kind="}
2022-04-07T15:33:42.787Z	INFO	controller.controller.node	Starting Controller	{"commit": "8280611", "reconciler group": "", "reconciler kind": "Node"}
2022-04-07T15:33:42.787Z	INFO	controller.controller.podmetrics	Starting EventSource	{"commit": "8280611", "reconciler group": "", "reconciler kind": "Pod", "source": "kind source: /, Kind="}
2022-04-07T15:33:42.787Z	INFO	controller.controller.podmetrics	Starting Controller	{"commit": "8280611", "reconciler group": "", "reconciler kind": "Pod"}
2022-04-07T15:33:42.787Z	INFO	controller.controller.nodemetrics	Starting EventSource	{"commit": "8280611", "reconciler group": "", "reconciler kind": "Node", "source": "kind source: /, Kind="}
2022-04-07T15:33:42.787Z	INFO	controller.controller.nodemetrics	Starting EventSource	{"commit": "8280611", "reconciler group": "", "reconciler kind": "Node", "source": "kind source: /, Kind="}
2022-04-07T15:33:42.787Z	INFO	controller.controller.nodemetrics	Starting EventSource	{"commit": "8280611", "reconciler group": "", "reconciler kind": "Node", "source": "kind source: /, Kind="}
2022-04-07T15:33:42.787Z	INFO	controller.controller.nodemetrics	Starting Controller	{"commit": "8280611", "reconciler group": "", "reconciler kind": "Node"}
2022-04-07T15:33:42.887Z	INFO	controller.controller.counter	Starting workers	{"commit": "8280611", "reconciler group": "karpenter.sh", "reconciler kind": "Provisioner", "worker count": 10}
2022-04-07T15:33:42.887Z	INFO	controller.controller.provisioning	Starting workers	{"commit": "8280611", "reconciler group": "karpenter.sh", "reconciler kind": "Provisioner", "worker count": 10}
2022-04-07T15:33:42.907Z	INFO	controller.controller.volume	Starting workers	{"commit": "8280611", "reconciler group": "", "reconciler kind": "PersistentVolumeClaim", "worker count": 1}
2022-04-07T15:33:42.907Z	INFO	controller.controller.termination	Starting workers	{"commit": "8280611", "reconciler group": "", "reconciler kind": "Node", "worker count": 10}
2022-04-07T15:33:42.907Z	INFO	controller.controller.node	Starting workers	{"commit": "8280611", "reconciler group": "", "reconciler kind": "Node", "worker count": 10}
2022-04-07T15:33:42.907Z	INFO	controller.controller.podmetrics	Starting workers	{"commit": "8280611", "reconciler group": "", "reconciler kind": "Pod", "worker count": 1}
2022-04-07T15:33:42.910Z	INFO	controller.controller.nodemetrics	Starting workers	{"commit": "8280611", "reconciler group": "", "reconciler kind": "Node", "worker count": 1}
2022-04-07T15:33:43.683Z	DEBUG	controller.provisioning	Discovered 402 EC2 instance types	{"commit": "8280611", "provisioner": "default"}
2022-04-07T15:33:43.979Z	DEBUG	controller.provisioning	Discovered subnets: [subnet-0537bf15c0a12d36c (us-east-1d) subnet-077a499ddde3c6a0a (us-east-1c) subnet-090dee37fba35334f (us-east-1a)]	{"commit": "8280611", "provisioner": "default"}
2022-04-07T15:33:44.171Z	DEBUG	controller.provisioning	Discovered EC2 instance types zonal offerings	{"commit": "8280611", "provisioner": "default"}
2022-04-07T15:33:44.178Z	INFO	controller.provisioning	Waiting for unschedulable pods	{"commit": "8280611", "provisioner": "default"}
2022-04-07T15:33:44.178Z	INFO	controller.provisioning	Waiting for unschedulable pods	{"commit": "8280611", "provisioner": "docnetwork-app"}
2022-04-07T15:34:46.369Z	INFO	controller.provisioning	Batched 2 pods in 1.055290345s	{"commit": "8280611", "provisioner": "docnetwork-app"}
2022-04-07T15:34:46.421Z	DEBUG	controller.provisioning	Discovered subnets: [subnet-0537bf15c0a12d36c (us-east-1d) subnet-077a499ddde3c6a0a (us-east-1c) subnet-090dee37fba35334f (us-east-1a)]	{"commit": "8280611", "provisioner": "docnetwork-app"}
2022-04-07T15:34:46.882Z	DEBUG	controller.provisioning	Discovered security groups: [sg-031f9ae733fb95e0b]	{"commit": "8280611", "provisioner": "docnetwork-app"}
2022-04-07T15:34:46.883Z	DEBUG	controller.provisioning	Discovered kubernetes version 1.21	{"commit": "8280611", "provisioner": "docnetwork-app"}
2022-04-07T15:34:46.992Z	DEBUG	controller.provisioning	Discovered ami-0319c884da18af515 for query /aws/service/eks/optimized-ami/1.21/amazon-linux-2/recommended/image_id	{"commit": "8280611", "provisioner": "docnetwork-app"}
2022-04-07T15:34:47.036Z	DEBUG	controller.provisioning	Discovered launch template Karpenter-dn-main-eks-10790469361113335367	{"commit": "8280611", "provisioner": "docnetwork-app"}
2022-04-07T15:34:50.165Z	INFO	controller.provisioning	Launched instance: i-0b9bdd13281a2bc94, hostname: ip-10-4-80-129.ec2.internal, type: t3.medium, zone: us-east-1a, capacityType: spot	{"commit": "8280611", "provisioner": "docnetwork-app"}
2022-04-07T15:34:50.174Z	INFO	controller.provisioning	Created node with 2 pods requesting {"cpu":"1410m","memory":"2086245Ki","pods":"7"} from types c3.large, c4.large, c5a.large, t3.medium, t3a.medium and 306 other(s)	{"commit": "8280611", "provisioner": "docnetwork-app"}
2022-04-07T15:34:50.194Z	INFO	controller.provisioning	Waiting for unschedulable pods	{"commit": "8280611", "provisioner": "docnetwork-app"}

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 25 (14 by maintainers)

Most upvoted comments

The PR to backport the image pull performance fix to containerd-1.5.x was denied. The fix was included in containerd-1.6.4, however, which I’ve briefly tested alongside k8s 1.21+. I do not have a timeline for when containerd-1.6.x will be built into the EKS Optimized AL2 image.
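
A quick way to confirm which runtime and version a node is actually running is the runtime version the node itself reports, e.g.:

kubectl get nodes -o custom-columns=NAME:.metadata.name,RUNTIME:.status.nodeInfo.containerRuntimeVersion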

There are a few possible workarounds to improve image pull performance in the meantime (a sketch of the launch-template approach follows the list):

  1. Revert to Docker. The Docker container runtime appears to be immune to the image pull performance issue, and Docker is the default container runtime used by the EKS optimized image and is well tested as a result. Karpenter defaults to containerd when launching nodes, but it’s possible to create a custom launch template and remove the --container-runtime containerd flag, which reverts the node to Docker.

  2. Supply the --serialize-image-pulls=true flag to kubelet. This flag defaults to “true” in kubelet but is set to “false” in the EKS Optimized image. You can override the kubelet config by supplying --serialize-image-pulls=true in a custom launch template. Note: setting the value to “true” could degrade image pull performance when many pods are launched from many different images.

  3. Install containerd-1.6.4 in place of containerd-1.4.13. containerd 1.6.4 contains the fix that improves image pull performance. It’s certainly possible to install 1.6.4 on nodes through a custom launch template or by building a custom AMI. I highly recommend that this configuration be tested thoroughly before being used for production workloads.
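
For options 1 and 2, the user data in the custom launch template boils down to a variation of the standard EKS bootstrap call. A rough sketch, assuming a plain bash user data script for the dn-main-eks cluster (exact bootstrap arguments should be checked against your AMI version):

#!/bin/bash
# Option 1: omit the --container-runtime containerd flag so the node falls back to Docker.
# Option 2: keep containerd but ask kubelet to serialize image pulls.
/etc/eks/bootstrap.sh dn-main-eks \
  --kubelet-extra-args '--serialize-image-pulls=true'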