cilium: hubble-relay: failed to create gRPC client

Bug report

After upgrading from Cilium v1.9.7 to Cilium v1.10.0, hubble-relay stopped working: it cannot connect to node:4244.

I did not change anything in the Helm chart values.yaml file; only the chart version was bumped.

I suspected the issue was caused by one of the following:

  • a CiliumNetworkPolicy -> removed it -> nothing changed.
  • a missing relay TLS server -> enabled the option hubble.relay.tls.server.enabled: true -> reran the cronjob -> killed the hubble-relay Pod -> nothing changed.
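Alongside the two attempts above, plain TCP reachability of the peer port can be ruled out separately from TLS. The following is a minimal sketch, not part of the original report: check_tcp is a hypothetical helper name, and it assumes bash (for its /dev/tcp redirection) and the coreutils timeout command are available. It only says something useful when run from a host or debug pod that shares the network path hubble-relay uses.

```shell
# check_tcp HOST PORT [TIMEOUT_SECONDS]
# Hypothetical helper: tries to open a TCP connection and reports the result.
# It does not speak TLS or gRPC; it only tests that the port accepts connections.
check_tcp() {
  local host="$1" port="$2" wait="${3:-5}"
  # bash opens a socket when redirecting to /dev/tcp/<host>/<port>;
  # timeout aborts the attempt if packets are silently dropped.
  if timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null; then
    echo "$host:$port reachable"
  else
    echo "$host:$port unreachable"
    return 1
  fi
}

# Example, using the peer address from this report:
#   check_tcp 3.249.66.182 4244
```

If the port is reachable and relay still times out, the problem is more likely in the TLS or gRPC layer than in routing or policy.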

Sample log lines:

level=info msg=Connecting address="3.249.66.182:4244" hubble-tls=true peer=prod-k8s-master-3 subsys=hubble-relay
level=warning msg="Failed to create gRPC client" address="3.249.66.182:4244" error="context deadline exceeded" hubble-tls=true next-try-in=10s peer=prod-k8s-master-3 subsys=hubble-relay

The next-try-in value keeps increasing, and nothing I tried helped.
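Because the sample lines above use a key=value layout, the peers that fail can be listed with standard text tools. This is a minimal sketch: failed_peers is a hypothetical helper name, it assumes the real log output uses plain ASCII double quotes, and the deployment name and namespace in the usage comment are assumptions about this cluster.

```shell
# failed_peers: read hubble-relay log lines on stdin and print
# "<peer> <address>" once for every peer that failed to connect.
# Assumes the key=value layout of the sample lines, with ASCII quotes.
failed_peers() {
  grep 'Failed to create gRPC client' \
    | sed -n 's/.*address="\([^"]*\)".*peer=\([^ ]*\).*/\2 \1/p' \
    | sort -u
}

# Usage (deployment name and namespace are assumptions):
#   kubectl logs -n kube-system deployment/hubble-relay | failed_peers
```

Seeing whether every peer or only some peers appear in the output helps narrow the cause: a single failing node points at that node, while all nodes failing points at relay itself or at shared TLS material.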

General Information

  • Cilium version (run cilium version)

Client: 1.10.0 952d9d3 2021-05-19T18:42:32+02:00 go version go1.16.4 linux/amd64
Daemon: 1.10.0 952d9d3 2021-05-19T18:42:32+02:00 go version go1.16.4 linux/amd64

  • Kernel version (run uname -a)

Linux prod-k8s-master-3 5.10.29-talos #1 SMP Wed Apr 28 14:43:26 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

  • Orchestration system version in use (e.g. kubectl version, …)

Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.1", GitCommit:"5e58841cce77d4bc13713ad2b91fa0d961e69192", GitTreeState:"clean", BuildDate:"2021-05-12T14:18:45Z", GoVersion:"go1.16.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.1", GitCommit:"5e58841cce77d4bc13713ad2b91fa0d961e69192", GitTreeState:"clean", BuildDate:"2021-05-12T14:12:29Z", GoVersion:"go1.16.4", Compiler:"gc", Platform:"linux/amd64"}

  • Link to relevant artifacts (policies, deployments scripts, …) N/A

  • Generate and upload a system zip:

curl -sLO https://git.io/cilium-sysdump-latest.zip && python cilium-sysdump-latest.zip

I cannot attach the cilium-sysdump: the file size is 52M.

How to reproduce the issue

I do not know.

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 25 (24 by maintainers)

Most upvoted comments

@alex1989hu now that #16508 is fixed in Cilium 1.10.2 or later, are you still observing an issue?

Troubleshooting why Hubble Relay fails to connect to peers is hard because of a limitation of the gRPC version that ships with Cilium. Would you mind deploying a Hubble CLI instance using the file in PR #16459? Then run this command and paste the error you get:

kubectl exec -n kube-system deployment/hubble-cli -- \
    hubble observe --server 'tls://3.249.66.186:4244' \
        --tls-server-name 'prod-k8s-worker-8da7da49.default.hubble-grpc.cilium.io' \
        --tls-ca-cert-files /var/lib/hubble-relay/tls/hubble-server-ca.crt \
        --tls-client-cert-file /var/lib/hubble-relay/tls/client.crt \
        --tls-client-key-file /var/lib/hubble-relay/tls/client.key