kubernetes: Image credential provider plugins not being called when native in-tree cloud provider is detected

What happened?

It appears image credential provider plugins are not called when an in-tree cloud provider is enabled. Even after disabling the in-tree providers via the DisableKubeletCloudCredentialProviders feature gate, the kubelet still detects that it is running on an EC2 instance, tries the amazon-ecr provider, and never consults any of the configured plugins.
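
For reference, the gate can be set either on the kubelet command line (--feature-gates=DisableKubeletCloudCredentialProviders=true) or in the kubelet config file. A minimal sketch of the config-file form (kubeadm installs typically keep this at /var/lib/kubelet/config.yaml; the exact path may differ on other setups):

    # Sketch only: enabling the feature gate in KubeletConfiguration
    apiVersion: kubelet.config.k8s.io/v1beta1
    kind: KubeletConfiguration
    featureGates:
      DisableKubeletCloudCredentialProviders: true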

I have an EC2-hosted Ubuntu 22.04 instance on which I installed a single-node test cluster using kubeadm. I have a container image hosted in an ECR registry under a different account, to which the EC2 instance's instance profile does not have access by default.

Kubelet verbosity is turned up to 5 (`-v 5`) to capture the following logs.

During kubelet initialization I see my credential provider plugin ("ecr-credential-provider") being registered, but when the Docker keyring is initialized the default providers are added as well, with the plugin last in the list:

kubelet[4027101]: I1003 20:18:18.288482 4027101 plugins.go:45] Registered credential provider "ecr-credential-provider"
kubelet[4027101]: I1003 20:18:18.288513 4027101 plugins.go:72] Registering credential provider: .dockercfg
kubelet[4027101]: I1003 20:18:18.288523 4027101 plugins.go:72] Registering credential provider: amazon-ecr
kubelet[4027101]: I1003 20:18:18.288532 4027101 azure_credentials.go:150] Azure config unspecified, disabling
kubelet[4027101]: I1003 20:18:18.288540 4027101 plugins.go:72] Registering credential provider: ecr-credential-provider

I attempt to deploy a pod using my ECR-hosted container image. The in-tree cloud provider detects that it is running on an EC2 instance and enables itself, then sees that I have enabled the DisableKubeletCloudCredentialProviders feature gate and reports itself as disabled. Even so, the provider returns a default DockerConfig. My guess is that this default DockerConfig prevents the kubelet from going on to check the other credential providers, but I have not been able to trace that through yet.
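
One quick sanity check, assuming the feature gate is passed as a kubelet command-line flag rather than through the config file, is to confirm that it actually reached the running process:

    # Illustrative check: prints the --feature-gates flag of the running kubelet, if set
    ps -o args= -C kubelet | grep -o -- '--feature-gates=[^ ]*'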

kubelet[4027101]: I1003 20:18:22.361434 4027101 aws_credentials.go:111] found 'ec2' in uuid ec2bd289-e71a-111d-546f-6b3a8048df19
kubelet[4027101]:  from /sys/devices/virtual/dmi/id/product_uuid, enabling legacy AWS credential provider
kubelet[4027101]: I1003 20:18:22.361670 4027101 aws_credentials.go:164] AWS credential provider is now disabled. Please refer to sig-cloud-provider for guidance on external credential provider integration for AWS
kubelet[4027101]: I1003 20:18:22.361859 4027101 kuberuntime_image.go:47] "Pulling image without credentials" image="861807767978.dkr.ecr.us-east-2.amazonaws.com/ecr-test:latest"
kubelet[4027101]: E1003 20:18:22.385959 4027101 remote_image.go:216] "PullImage from image service failed" err="rpc error: code = Unknown desc = Error response from daemon: Head \"https://861807767978.dkr.ecr.us-east-2.amazonaws.com/v2/ecr-test/manifests/latest\": no basic auth credentials" image="861807767978.dkr.ecr.us-east-2.amazonaws.com/ecr-test:latest"

It goes straight to "Pulling image without credentials", which fails because the repository is private and needs the credentials from the configured plugin.

This does not appear to be a problem when running on a local bare-metal machine, where EC2 is not detected.

What did you expect to happen?

Whether or not the in-tree credential providers are checked, the configured plugin providers should also be queried for credentials whenever an image matches their configured match strings.

How can we reproduce it (as minimally and precisely as possible)?

Created an Ubuntu EC2 instance and installed Kubernetes (1.23.12 in my case) using kubeadm.

Added the following to the kubelet ExecStart arguments:

    --image-credential-provider-bin-dir=/var/lib/kubelet/plugins \
    --image-credential-provider-config=/etc/kubernetes/kubelet/credential-provider-config.yaml \
    -v 5
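
(Equivalently, on a kubeadm DEB install these flags can be supplied through KUBELET_EXTRA_ARGS; the exact drop-in layout may differ on other setups.)

    # Sketch: /etc/default/kubelet (RPM installs use /etc/sysconfig/kubelet instead)
    KUBELET_EXTRA_ARGS="--image-credential-provider-bin-dir=/var/lib/kubelet/plugins --image-credential-provider-config=/etc/kubernetes/kubelet/credential-provider-config.yaml -v=5"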

In the referenced /etc/kubernetes/kubelet/credential-provider-config.yaml file, added the following content:

apiVersion: kubelet.config.k8s.io/v1alpha1
kind: CredentialProviderConfig
providers:
  - name: ecr-credential-provider
    matchImages:
      - “*.dkr.ecr.*.amazonaws.com”
    defaultCacheDuration: "12h"
    apiVersion: credentialprovider.kubelet.k8s.io/v1alpha1
    env:
      - name: PATH
        value: /usr/sbin:/usr/bin:/sbin:/bin:/usr/local/sbin:/usr/local/bin
      - name: HOME
        value: /root

Placed my ecr-credential-provider binary in the /var/lib/kubelet/plugins directory.

Configured AWS credentials for the plugin to use (aws configure) under the /root/.aws directory.

Verified the credentials were working correctly by running:

HOME=/root sudo /var/lib/kubelet/plugins/ecr-credential-provider <<< '{"apiVersion": "credentialprovider.kubelet.k8s.io/v1alpha1", "kind": "CredentialProviderRequest", "image": "670132432684.dkr.ecr.us-east-2.amazonaws.com/brtest:latest"}'

And verified it was able to return credentials.
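
For reference, a successful invocation prints a CredentialProviderResponse on stdout, roughly of this shape (the values below are placeholders, not real output):

    {
      "apiVersion": "credentialprovider.kubelet.k8s.io/v1alpha1",
      "kind": "CredentialProviderResponse",
      "cacheKeyType": "Registry",
      "auth": {
        "670132432684.dkr.ecr.us-east-2.amazonaws.com": {
          "username": "AWS",
          "password": "<redacted ECR authorization token>"
        }
      }
    }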

Tried to deploy a pod using this ECR-hosted image. The pod fails with:

kubelet[4027101]: E1003 20:18:22.385959 4027101 remote_image.go:216] "PullImage from image service failed" err="rpc error: code = Unknown desc = Error response from daemon: Head \"https://861807767978.dkr.ecr.us-east-2.amazonaws.com/v2/ecr-test/manifests/latest\": no basic auth credentials" image="861807767978.dkr.ecr.us-east-2.amazonaws.com/ecr-test:latest"

The "no basic auth credentials" error, together with the absence of any "Getting image XXX credentials from external exec plugin" message, shows that the plugin is never called to provide credentials.

Anything else we need to know?

No response

Kubernetes version

clientVersion:
  buildDate: "2022-09-21T12:20:29Z"
  compiler: gc
  gitCommit: c6939792865ef0f70f92006081690d77411c8ed5
  gitTreeState: clean
  gitVersion: v1.23.12
  goVersion: go1.17.13
  major: "1"
  minor: "23"
  platform: linux/amd64
serverVersion:
  buildDate: "2022-09-21T12:13:07Z"
  compiler: gc
  gitCommit: c6939792865ef0f70f92006081690d77411c8ed5
  gitTreeState: clean
  gitVersion: v1.23.12
  goVersion: go1.17.13
  major: "1"
  minor: "23"
  platform: linux/amd64

Cloud provider

cloud-provider-aws

OS version

# On Linux:
$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.1 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.1 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy

$ uname -a
Linux ip-172-31-41-99 5.15.0-1020-aws #24-Ubuntu SMP Thu Sep 1 16:04:17 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Install tools

Kubeadm following docs here: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/

Container runtime (CRI) and version (if applicable)

~$ apt info containerd
Package: containerd
Version: 1.5.9-0ubuntu3
Built-Using: golang-1.18 (= 1.18-1ubuntu1)
Priority: optional
Section: admin
Origin: Ubuntu
Maintainer: Ubuntu Developers <ubuntu-devel-discuss@lists.ubuntu.com>
Original-Maintainer: Debian Go Packaging Team <pkg-go-maintainers@lists.alioth.debian.org>
Bugs: https://bugs.launchpad.net/ubuntu/+filebug
Installed-Size: 106 MB
Depends: runc (>= 1.0.0~rc2~), libc6 (>= 2.34)
Breaks: docker.io (<< 19.03.13-0ubuntu4)
Homepage: https://containerd.io
Download-Size: 27.0 MB
APT-Sources: http://us-east-2.ec2.archive.ubuntu.com/ubuntu jammy/main amd64 Packages
Description: daemon to control runC
 Containerd is a daemon to control runC, built for performance and density.
 Containerd leverages runC's advanced features such as seccomp and user
 namespace support as well as checkpoint and restore for cloning and live
 migration of containers.
 .
 This package contains the binaries.

Related plugins (CNI, CSI, …) and versions (if applicable)

No response

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 17 (17 by maintainers)

Most upvoted comments

called it 😃

[screenshot]

After adding a bunch of debugging, it looks like I may have found the cause for this case. And good/bad, it was my fault! 😄

Thanks for the pointers to where things get wired together. I was able to add some debug statements there and found my matchImages string actually had some oddly formatted quotes rather than normal double-quotes. I think this must have happened from copying/pasting from a formatted web page.

Once I saw that and updated the credential provider config file to use standard quotes, after a restart the kubelet was able to call out to the plugin, get credentials, and successfully pull down the image. 🎆
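
For anyone hitting the same symptom, the corrected entry simply uses plain ASCII double quotes, e.g.:

    matchImages:
      - "*.dkr.ecr.*.amazonaws.com"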

I’m downgrading now to 1.22 to see if that was the case there. Will just wait to validate that, but I guess I can likely close this issue now.

Thanks a lot for all your help!

The kubelet calls RegisterCredentialProviderPlugins, which calls credentialprovider.RegisterCredentialProvider(provider.Name, plugin), which registers the provider into a global providers variable that is used by NewDockerKeyring(). The keyring returned by NewDockerKeyring() is then assigned as the kuberuntime manager's keyring.