kubernetes: Improve kubectl cp, so it doesn't require the tar binary in the container

/kind feature

What happened: Kubectl cp currently requires the container we’re copying into to include the tar binary. This is problematic when the container image is minimal and only includes the main binary run in the container and nothing else.

What you expected to happen: Docker now has docker cp, which can copy files into a running container without any prerequisites on the container itself. Kubectl cp could use that mechanism. Obviously, this will require introducing a new feature into CRI, so it’s not a small task.

Why we need this: This will enable users to debug an existing (running) container, which is based on the scratch image and contains nothing else but the main app binary. Users would be able to get any binary they need into the container. An alternative solution could be to mount an additional volume (possibly from another container image) into a running pod (if that feature is ever implemented).

About this issue

  • Original URL
  • State: open
  • Created 6 years ago
  • Reactions: 117
  • Comments: 46 (18 by maintainers)

Most upvoted comments

I think the best option is to have CRI support this. CRI could, for example, use a trusted host tar implementation and make various improvements. With dockershim deprecated, it will be feasible to focus on doing this only in CRI.

To get a change like this in, you probably need to file an enhancement under SIG Node and work with that group to make it happen. I don’t have anywhere near enough free time to make this happen currently, but I think this needs to be fixed rather than, say, removing this functionality because we have issues securing it.

/lifecycle frozen

/remove-lifecycle stale

/remove-lifecycle rotten (no "-" after /remove-lifecycle)

This is going to need a few things to be viable:

  • ephemeral containers need to be stable, they’re still alpha / not enabled by default
  • someone probably needs to design this and write a KEP
  • someone will need to follow through on implementing it

An alternative approach would be a kubectl plugin (or plugins), once at least the first point is true.

/remove-lifecycle stale

/remove-lifecycle-rotten

+1 for implementing this through ephemeral containers.

Executing the command in the root namespace, especially something we implement ourselves, makes me really nervous. I would eventually like to eliminate all places where the Kubelet interacts directly with container-owned files.

kubectl cp works by wrapping an exec command that depends on the tar binary, so the container must be running and must include tar. We can implement this function without that dependency.
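To make the tar dependency concrete, here is a minimal local sketch of the two halves of such a copy: one side packs files into a tar stream, the other unpacks it. It uses only Python's standard `tarfile` module and runs entirely in memory. The `REMOTE_UNPACK_CMD` argv is a hypothetical illustration of what would be exec'd inside the container, not the exact command line kubectl uses.

```python
import io
import tarfile


def pack(files: dict[str, bytes]) -> bytes:
    """Build an uncompressed tar stream in memory, like `tar cf -` would emit."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def unpack(stream: bytes) -> dict[str, bytes]:
    """Read a tar stream back into a name -> content mapping, like `tar xf -`."""
    out = {}
    with tarfile.open(fileobj=io.BytesIO(stream), mode="r") as tar:
        for member in tar.getmembers():
            if member.isfile():
                out[member.name] = tar.extractfile(member).read()
    return out


# Hypothetical argv for the receiving side (an assumption for illustration).
# This is the step that fails when the image ships no tar binary.
REMOTE_UNPACK_CMD = ["tar", "-xmf", "-", "-C", "/mnt/data"]
```

The packing half can run anywhere; only the unpacking half has to execute inside the container, which is why a scratch or distroless image breaks the copy.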

Some people are really looking forward to this feature:

  • https://github.com/kubernetes/kubectl/issues/454
  • https://github.com/kubeflow/pipelines/issues/1213
  • https://github.com/GoogleContainerTools/skaffold/issues/1814
  • https://github.com/Jeffwan/ml-benchmark/issues/1

Is anyone actively working on this? I would really like to start contributing to it. Is this feature on track? @kubernetes/sig-cli-feature-requests

cc /@luksa

/remove-lifecycle stale

@luksa will the WIP “debug container” feature satisfy your use case? https://github.com/kubernetes/features/issues/277

/cc @verb

Sorry that should be https://github.com/kubernetes/enhancements, the “s” got dropped from the link. Also known as “KEPs” – the process for Kubernetes features. https://github.com/kubernetes/enhancements/tree/master/keps

Instead of doing this via the CRI, couldn’t the kubelet get the file(s) directly and do the tarring itself, with this exposed via a new API?

Kubelet leaves the implementation of the container filesystem up to the CRI implementation at this point (since dockershim is now gone we only have CRI mode), so we still need some way to access the container filesystem. So most likely any variation suggested above would require changes to kubelet, CRI API, and the CRI implementations. It will require a KEP in SIG Node to make a change like this.

@corneliusweig @BenTheElder In 1.18 we’ll add the ability to target a container namespace with an ephemeral container, so that setting shareProcessNamespace in advance will no longer be necessary. With Linux containers you can use the /proc/$pid/root magic symlink so that you don’t need CAP_SYS_ADMIN, but I’m not sure how you’d map a container name to $pid.
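As a rough sketch of the /proc/$pid/root idea: from a process that shares the target's PID namespace, a path inside the target container can be addressed through that process's root link. The helper name `path_in_container` is hypothetical, and the PID is assumed to be known already, which is exactly the open question above.

```python
import posixpath


def path_in_container(pid: int, path: str) -> str:
    """Map an absolute in-container path to its /proc/<pid>/root view."""
    if not path.startswith("/"):
        raise ValueError("expected an absolute path inside the container")
    # normpath only collapses "." and ".." lexically; it does NOT guard
    # against symlinks inside the container's filesystem.
    rel = posixpath.normpath(path).lstrip("/")
    return posixpath.join("/proc", str(pid), "root", rel)
```

For example, `path_in_container(1234, "/etc/hosts")` yields `/proc/1234/root/etc/hosts`. Opening that path still requires sufficient permissions over the target process.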

Just because it’s possible doesn’t make it the best solution. +1 to a KEP considering alternatives.

Big 👍 for the KEP. And if somebody has a POC, we can get some user feedback by distributing as a krew.dev plugin.

The work I’m doing with openat2 would allow for that type of access to be done safely (and I’m also working on a library which will do userspace emulation on older kernels). To be honest, I’m also equally nervous about running a command in a container and then using its output to do filesystem operations.
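To illustrate why acting on container-produced output is worrying, here is a minimal userspace check that a tar member name would resolve inside the destination directory. The function name `resolves_inside` is hypothetical. This is a lexical-plus-realpath check and is inherently racy (a symlink can be swapped between the check and the write), so it is not a substitute for kernel-enforced resolution such as openat2 with RESOLVE_IN_ROOT or RESOLVE_BENEATH.

```python
import os


def resolves_inside(dest_dir: str, member_name: str) -> bool:
    """Return True if member_name, placed under dest_dir, stays inside dest_dir."""
    root = os.path.realpath(dest_dir)
    # An absolute member_name replaces root entirely in os.path.join,
    # so it is rejected by the commonpath comparison below.
    target = os.path.realpath(os.path.join(root, member_name))
    return os.path.commonpath([root, target]) == root
```

A caller would skip or reject any archive member for which this returns False before writing anything to disk.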

Since Traefik’s container doesn’t include the /bin/tar command, I solved my problem by using a Pod that runs alpine:latest and shares a volume with Traefik’s pods.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: traefik-data-claim
  namespace: kube-system
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 100Mi
---
apiVersion: v1
kind: Pod
metadata:
  name: traefik-cp-pod
  namespace: kube-system
spec:
  containers:
    - name: alpine
      image: alpine:latest
      command: ["tail", "-f", "/dev/null"]
      volumeMounts: 
      - name: traefik-data-volume
        mountPath: /mnt/data
  volumes:
    - name: traefik-data-volume
      persistentVolumeClaim:
        claimName: traefik-data-claim

Then I use kubectl --namespace=kube-system cp data.toml traefik-cp-pod:/mnt/data/ to copy files to and from Traefik’s volume.

Obviously, once this issue is solved, I’ll be able to remove the Alpine overhead.

moving to distroless in more places is going to make this more painful FYI @yuwenma

Yes, definitely understood; Bug 1096726 has put docker in the headlines again.

I suggest two ways to solve this problem. One is to fix it as in #39292 (https://github.com/moby/moby/pull/39292): pass the root to chroot for the chrooted Tar/Untar. The other is to implement kubectl cp by using an ephemeral container to exchange data with the target container, although that may take quite a while to become available. #227