kubernetes: Couldn't find network status for {namespace}/{pod_name} through plugin: invalid network status for
Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see http://kubernetes.io/docs/troubleshooting/.): NO
What keywords did you search in Kubernetes issues before filing this one? (If you have found any duplicates, you should instead reply there.): “invalid network status for” “Couldn’t find network status for”
Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT
Kubernetes version (use kubectl version):
1.6.0
Environment:
- Cloud provider or hardware configuration:
- OS (e.g. from /etc/os-release): NAME="Ubuntu" VERSION="14.04.5 LTS, Trusty Tahr" ID=ubuntu ID_LIKE=debian PRETTY_NAME="Ubuntu 14.04.5 LTS" VERSION_ID="14.04"
- Kernel (e.g. uname -a): Linux HOSTNAME_REDACTED 3.13.0-44-generic #73-Ubuntu SMP Tue Dec 16 00:22:43 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
- Install tools:
- Others:
What happened:
After upgrading a cluster from 1.5.3 to 1.6.0, I see errors like the following in kubelet’s log:
/var/log/upstart/kubelet.log:W0403 15:38:19.584738 25905 docker_sandbox.go:263] Couldn't find network status for default/prometheus-node-exporter-2n784 through plugin: invalid network status for
What you expected to happen:
- For this error to not be generated if it’s a false positive
- For the underlying bug to be fixed if this is a real error but it’s a bug in kubelet/kubernetes (and not some environmental trigger)
- For the error message to be more useful in the case the error is legitimate
How to reproduce it (as minimally and precisely as possible): Upgrade a cluster from 1.5.3 to 1.6.0 and watch kubelet’s logs.
Anything else we need to know: I am using a hyperkube image that I built from kubernetes source, and the compiled kubelet binary downloaded from the kubernetes project. This is the same way I deployed 1.5.3 and previous versions.
I am using flanneld as an overlay network and not configuring any CNI networking / network-plugin options to kubelet.
I tracked this error message down to the following code.
pkg/kubelet/dockershim/docker_sandbox.go
// getIPFromPlugin interrogates the network plugin for an IP.
func (ds *dockerService) getIPFromPlugin(sandbox *dockertypes.ContainerJSON) (string, error) {
    metadata, err := parseSandboxName(sandbox.Name)
    if err != nil {
        return "", err
    }
    msg := fmt.Sprintf("Couldn't find network status for %s/%s through plugin", metadata.Namespace, metadata.Name)
    cID := kubecontainer.BuildContainerID(runtimeName, sandbox.ID)
    networkStatus, err := ds.network.GetPodNetworkStatus(metadata.Namespace, metadata.Name, cID)
    if err != nil {
        // This might be a sandbox that somehow ended up without a default
        // interface (eth0). We can't distinguish this from a more serious
        // error, so callers should probably treat it as non-fatal.
        return "", err
    }
    if networkStatus == nil {
        return "", fmt.Errorf("%v: invalid network status for", msg)
    }
    return networkStatus.IP.String(), nil
}

// getIP returns the ip given the output of `docker inspect` on a pod sandbox,
// first interrogating any registered plugins, then simply trusting the ip
// in the sandbox itself. We look for an ipv4 address before ipv6.
func (ds *dockerService) getIP(sandbox *dockertypes.ContainerJSON) (string, error) {
    if sandbox.NetworkSettings == nil {
        return "", nil
    }
    if sharesHostNetwork(sandbox) {
        // For sandboxes using host network, the shim is not responsible for
        // reporting the IP.
        return "", nil
    }
    if IP, err := ds.getIPFromPlugin(sandbox); err != nil {
        glog.Warningf("%v", err)
    } else if IP != "" {
        return IP, nil
    }
    // TODO: trusting the docker ip is not a great idea. However docker uses
    // eth0 by default and so does CNI, so if we find a docker IP here, we
    // conclude that the plugin must have failed setup, or forgotten its ip.
    // This is not a sensible assumption for plugins across the board, but if
    // a plugin doesn't want this behavior, it can throw an error.
    if sandbox.NetworkSettings.IPAddress != "" {
        return sandbox.NetworkSettings.IPAddress, nil
    }
    return sandbox.NetworkSettings.GlobalIPv6Address, nil
}
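The nil-status branch above is where the truncated "invalid network status for" text comes from: msg already ends with "through plugin", and nothing is appended after "for". Just to illustrate the kind of extra context I'd find useful if the error is legitimate, here is a sketch of a rewording using the variables already in scope in that excerpt; this is not actual kubelet code, only a suggestion:

if networkStatus == nil {
    // Hypothetical rewording: include the sandbox/container ID so the
    // message is self-contained instead of ending mid-sentence.
    return "", fmt.Errorf("%s: network plugin returned a nil PodNetworkStatus for sandbox %q (container ID %v)", msg, sandbox.ID, cID)
}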
pkg/kubelet/network/kubenet/kubenet_linux.go
// TODO: Use the addToNetwork function to obtain the IP of the Pod. That will assume idempotent ADD call to the plugin.
// Also fix the runtime's call to Status function to be done only in the case that the IP is lost, no need to do periodic calls
func (plugin *kubenetNetworkPlugin) GetPodNetworkStatus(namespace string, name string, id kubecontainer.ContainerID) (*network.PodNetworkStatus, error) {
    plugin.mu.Lock()
    defer plugin.mu.Unlock()
    // Assuming the ip of pod does not change. Try to retrieve ip from kubenet map first.
    if podIP, ok := plugin.podIPs[id]; ok {
        return &network.PodNetworkStatus{IP: net.ParseIP(podIP)}, nil
    }

    netnsPath, err := plugin.host.GetNetNS(id.ID)
    if err != nil {
        return nil, fmt.Errorf("Kubenet failed to retrieve network namespace path: %v", err)
    }
    ip, err := network.GetPodIP(plugin.execer, plugin.nsenterPath, netnsPath, network.DefaultInterfaceName)
    if err != nil {
        return nil, err
    }

    plugin.podIPs[id] = ip.String()
    return &network.PodNetworkStatus{IP: ip}, nil
}
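For context on the cache-miss path above: network.GetPodIP isn't shown here, but as far as I can tell it enters the pod's network namespace and reads the address off the default interface. A rough, hand-written approximation (not the real implementation, and the exact flags may differ):

// Roughly what GetPodIP has to do on a cache miss: enter the sandbox's
// network namespace and read the IPv4 address of eth0.
out, err := exec.Command("nsenter", "--net="+netnsPath, "--",
    "ip", "-4", "addr", "show", "dev", "eth0").CombinedOutput()
if err != nil {
    return nil, fmt.Errorf("failed to read IP from %s: %v", netnsPath, err)
}
// The address then has to be parsed out of the `ip addr show` output.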
As suggested by the code, I’m only getting these errors for containers that don’t use hostNetwork. If I run docker inspect on the containers that are mentioned in the error messages, the values in the NetworkSettings section are empty, but I’m not sure that’s relevant.
"NetworkSettings": {
"Bridge": "",
"SandboxID": "",
"HairpinMode": false,
"LinkLocalIPv6Address": "",
"LinkLocalIPv6PrefixLen": 0,
"Ports": null,
"SandboxKey": "",
"SecondaryIPAddresses": null,
"SecondaryIPv6Addresses": null,
"EndpointID": "",
"Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"IPAddress": "",
"IPPrefixLen": 0,
"IPv6Gateway": "",
"MacAddress": "",
"Networks": {}
}
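Putting the two excerpts together, the consequence for these sandboxes seems to be: the plugin path produces the nil-status error, getIP only logs it as a warning, and because NetworkSettings is empty the docker fallback yields an empty IP as well. A minimal standalone sketch of that fall-through (my own simplified code, not kubelet's):

package main

import (
    "errors"
    "fmt"
    "log"
)

// networkSettings mirrors only the two fields getIP falls back to.
type networkSettings struct {
    IPAddress         string
    GlobalIPv6Address string
}

// ipFromPlugin stands in for getIPFromPlugin returning the nil-status error.
func ipFromPlugin() (string, error) {
    return "", errors.New("Couldn't find network status for default/prometheus-node-exporter-2n784 through plugin: invalid network status for")
}

// getIP mimics the fall-through logic from docker_sandbox.go above.
func getIP(ns networkSettings) string {
    ip, err := ipFromPlugin()
    if err != nil {
        log.Printf("W %v", err) // the warning that shows up in kubelet.log
    } else if ip != "" {
        return ip
    }
    if ns.IPAddress != "" {
        return ns.IPAddress
    }
    return ns.GlobalIPv6Address
}

func main() {
    // Empty settings, as in the docker inspect output above: prints "".
    fmt.Printf("reported pod IP: %q\n", getIP(networkSettings{}))
}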
About this issue
- State: closed
- Created 7 years ago
- Comments: 27 (11 by maintainers)
We have basically the same setup as @vdavidoff: docker's --bip set to the CIDR from flannel's properties file. We saw the same messages in the kubelet logs. However, we also saw a large number of pods being killed by the kubelet. This caused disruption and a brief service outage. Please advise.
Here are my kubelet logs, showing the same error @jmccarty3 reports: https://gist.github.com/stensonb/3f6db27eb0ba031463d63cad8360f780
It looks like the container with id dadd21ea-2b61-11e7-87ae-e63e85527c63 demonstrates the issue; search for that in the gist. Also, this line:
Is anyone else also seeing messages like this when this happens? Looks like some bug in dockershim.
For me it seems to have something to do with flannel v0.7.0 and Kubernetes v1.6.x. My kubelet starts to act weird when the rkt garbage collection starts, either reproducibly with sudo rkt gc or automatically. There are indications that flannel v0.7.1 solves some Kubernetes-related issues (see https://github.com/coreos/flannel/pull/690), but since it is currently not shipped with the CoreOS version I use, I just downgraded to v1.5.x again.