rancher: fluentbit unable to connect to clients service

What kind of request is this (question/bug/enhancement/feature request): bug

Steps to reproduce (least amount of steps as possible): install the chart via rancher apps

Result: fluentbit errors

Other details that may be helpful:

Environment information

  • Rancher version (rancher/rancher or rancher/server image tag, or shown bottom left in the UI): v2.5.1
  • Installation option (single install/HA): installed via helm

Cluster information

  • Cluster type (Hosted/Infrastructure Provider/Custom/Imported): imported
  • Machine type (cloud/VM/metal) and specifications (CPU/memory): metal, Intel® Core™ i7-8700 CPU @ 3.20GHz, 32 GB RAM
  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.5", GitCommit:"e6503f8d8f769ace2f338794c914a96fc335df0f", GitTreeState:"clean", BuildDate:"2020-06-26T03:47:41Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-14T12:41:49Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}
  • Docker version (use docker version):
Client: Docker Engine - Community
 Version:           19.03.13
 API version:       1.40
 Go version:        go1.13.15
 Git commit:        4484c46d9d
 Built:             Wed Sep 16 17:02:52 2020
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.13
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.13.15
  Git commit:       4484c46d9d
  Built:            Wed Sep 16 17:01:20 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.3.7
  GitCommit:        8fba4e9a7d01810a393d5d25a3621dc101981175
 runc:
  Version:          1.0.0-rc10
  GitCommit:        dc9208a3303feef5b3839f4323d9beb36df0a9dd
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

rancher-logging v3.6.000, no changes to included values.yaml

installed via rancher cluster explorer apps

fluentbit logs many errors like the following:

[2020/10/31 05:52:51] [error] [src/flb_io.c:171 errno=111] Connection refused
[2020/10/31 05:52:51] [error] [io] connection  #272 failed to: rancher-logging-fluentd.cattle-logging-system.svc:24240
[2020/10/31 05:52:51] [error] [output:forward:forward.0] no upstream connections available
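
On Linux, errno=111 is ECONNREFUSED: the service name resolved, but nothing accepted the TCP connection on that port. One way to check this independently of fluentbit is a small probe run from a pod inside the cluster. The following is only a sketch, assuming Python 3 is available in that pod. The host and port are copied from the log above, and the `probe` helper is a made-up name for illustration.

```python
import errno
import socket

# Host and port copied from the fluentbit error log; run this from a pod
# inside the cluster so that the service name can resolve.
FLUENTD_HOST = "rancher-logging-fluentd.cattle-logging-system.svc"
FLUENTD_PORT = 24240


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a single TCP connection and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Same condition as errno=111 in the fluentbit log.
        return "refused"
    except socket.timeout:
        return "timeout"
    except socket.gaierror:
        # The service name did not resolve at all.
        return "dns-failure"
    except OSError as exc:
        return f"error: {errno.errorcode.get(exc.errno, exc.errno)}"


if __name__ == "__main__":
    result = probe(FLUENTD_HOST, FLUENTD_PORT)
    print(f"{FLUENTD_HOST}:{FLUENTD_PORT} -> {result}")
```

If this prints "refused" while the fluentd pod is shown as Running, the listener inside the pod is probably not up at that moment. "dns-failure" or "timeout" would point at name resolution or network policy instead.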

About this issue

  • State: open
  • Created 4 years ago
  • Reactions: 14
  • Comments: 19

Most upvoted comments

Not stale!

This issue still exists. We’re running Rancher 2.5.5 and we’ve installed the Logging app onto the cluster. The helm chart values don’t specify a tls field, and if we edit the Logging CRDs to manually set tls.enabled: false, nothing changes.

@process0 Have you made any progress on this? I’ve tried setting up Logging on Rancher with Loki and I’m experiencing the same thing. The Loki datasource itself works; it’s just not receiving any data, it appears, because no updates are being pushed.

We get this too, with Rancher 2.5.2 on a Rancher-launched Kubernetes 1.18.10 cluster with logging chart version 3.6.001. While the error message itself wouldn’t worry me too much, we seem to have really low log throughput, which is to be expected if fluent-bit constantly has to wait for reconnections.

As a bandaid, I tried scaling up the rancher-logging-fluentd StatefulSet to 2, but the problem still occurs. Come Monday, we’ll try upscaling even more.

EDIT: For us, this was a resource problem. When fluentd exceeded its limits, it wasn’t the pod that terminated, but only the fluentd worker process. See https://github.com/banzaicloud/logging-operator/issues/579 for another example.

We upgraded the deployment to:

resources:
  limits:
    cpu: "2"
    memory: 2Gi
  requests:
    cpu: "1"
    memory: 1Gi
scaling:
  replicas: 2