kubernetes: Pod.Status.PodIP not updated during postStart lifecycle hook

What happened:

When using a postStart lifecycle hook, the Pod.Status.PodIP is not updated while the hook is running, even though the pod has been assigned an IP via the CNI plugin and networking is set up.

What you expected to happen:

The Pod.Status.PodIP should be set as soon as it’s available.

How to reproduce it (as minimally and precisely as possible):

The following manifest reproduces the issue. Apply it, then watch the pod status.

apiVersion: v1
kind: Pod
metadata:
  name: test-calico-post-start
spec:
  containers:
  - name: test-calico
    image: byrnedo/alpine-curl:latest
    command:
    - sh
    - -c
    - "sleep 400000"
    lifecycle:
      postStart:
        exec:
          command: ["/bin/sh", "-c", "sleep 120 && result=$(curl http://www.google.com 1> /var/log/output.log 2> /var/log/output-error.log); echo $result"]

The postStart lifecycle hook here sleeps for 120 seconds. During those first 120 seconds, the pod status remains ContainerCreating and the podIP is not set; it is set only after the hook completes.

Anything else we need to know?:

Why is this important? It prevents Calico network policy from operating correctly during the postStart hook, since we cannot learn the pod’s IP address. (When Calico is also the CNI plugin we do learn it, but Calico is designed to run in “policy-only mode” on top of other CNI plugins, such as the AWS VPC CNI plugin.) See https://github.com/projectcalico/libcalico-go/issues/1125 and https://github.com/projectcalico/felix/issues/2008 for users hitting this problem.

Environment:

  • Kubernetes version (use kubectl version): v1.16.3
  • Cloud provider or hardware configuration: AWS
  • OS (e.g: cat /etc/os-release): Ubuntu 16.04
  • Kernel (e.g. uname -a): 4.4.0-1098-aws
  • Install tools: kubeadm
  • Network plugin and version (if this is a network-related bug): amazon-vpc-cni-k8s v1.5 with Calico v3.8.1
  • Others:

About this issue

  • State: open
  • Created 5 years ago
  • Reactions: 8
  • Comments: 65 (36 by maintainers)

Most upvoted comments

To prevent init container start from hanging, we can put the call to the postStart hook (for the init container) in a goroutine whose execution time is bounded (e.g. 2 minutes). If the execution time exceeds the limit, return a timeout error for the init container start.

Does SyncPod() still block waiting for the hook to complete or timeout to fire? If so, then this doesn’t fully solve our problem, because we will still be blocked from getting the podIP while that hook is executing.

If we need to keep the init container hooks synchronous, then I would prefer the solution where we trigger a pod status update, including the pod IP, immediately after sandbox creation and before entering any blocking code that starts containers.
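That ordering can be sketched as below. This is an illustrative Go sketch only: syncPod, createSandbox, startContainers, and the statusUpdates channel stand in for kubelet’s sync loop, CNI result, container start path, and status manager, and are not kubelet’s real API.

```go
package main

import "fmt"

// syncPod sketches the proposed ordering: publish the pod IP as soon as
// the sandbox exists, then run the (possibly long-blocking) container
// starts and postStart hooks.
func syncPod(createSandbox func() (podIP string), startContainers func(), statusUpdates chan<- string) {
	podIP := createSandbox()

	// Report the IP immediately, before any blocking hook execution, so
	// Pod.Status.PodIP is populated while the postStart hook runs.
	statusUpdates <- podIP

	// Only now enter the blocking path (container start + postStart hook).
	startContainers()
}

func main() {
	updates := make(chan string, 1)
	syncPod(
		func() string { return "10.0.0.42" }, // fake CNI result
		func() {},                            // fake blocking container start
		updates,
	)
	fmt.Println(<-updates) // 10.0.0.42
}
```

The point of the sketch is purely the ordering: the status update is sent before startContainers is entered, so a slow hook no longer delays the IP becoming visible.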