nats-streaming-operator: Failed to start: discovered another streaming server with cluster ID "example-stan"

I get the following error when I deploy a NatsStreamingCluster:

[1] 2019/12/26 07:16:45.762521 [FTL] STREAM: Failed to start: discovered another streaming server with cluster ID "example-stan"

I am running on GKE.

Full log output:

[1] 2019/12/26 07:16:45.747712 [INF] STREAM: ServerID: JTmPHIR4BFp2ZuAWkekcIl
[1] 2019/12/26 07:16:45.747715 [INF] STREAM: Go version: go1.11.13
[1] 2019/12/26 07:16:45.747717 [INF] STREAM: Git commit: [910d6e1]
[1] 2019/12/26 07:16:45.760913 [INF] STREAM: Recovering the state...
[1] 2019/12/26 07:16:45.761073 [INF] STREAM: No recovered state
[1] 2019/12/26 07:16:45.762399 [INF] STREAM: Shutting down.
[1] 2019/12/26 07:16:45.762521 [FTL] STREAM: Failed to start: discovered another streaming server with cluster ID "example-stan"

About this issue

State: open
Created 5 years ago
Reactions: 8
Comments: 25 (7 by maintainers)

Commits related to this issue

workaround issue #61 - adding missing configs in order for the code correctly set cluster-node-id for cluster mode — committed to hbobenicio/nats-streaming-operator by hbobenicio 4 years ago
Merge pull request #76 from hbobenicio/fix-examples-always-define-configs workaround issue #61 - adding missing configs to all examples of cluster mode — committed to nats-io/nats-streaming-operator by wallyqs 4 years ago

Most upvoted comments

I switched to KubeMQ.

veerapatyok on Mar 17, 2020

I have a temporary workaround: I created nat-streaming-cluster.yaml and added the following to it:

config:
    debug: true

nat-streaming-cluster.yaml

---
apiVersion: "streaming.nats.io/v1alpha1"
kind: "NatsStreamingCluster"
metadata:
  name: "example-stan"
spec:
  # Number of nodes in the cluster
  size: 3

  # NATS Streaming Server image to use, by default
  # the operator will use a stable version
  #
  image: "nats-streaming:latest"

  # Service to which NATS Streaming Cluster nodes will connect.
  #
  natsSvc: "example-nats"

  # Explicit config block (the workaround)
  config:
    debug: true

veerapatyok on Feb 4, 2020

I downloaded the code, replaced o.Name with pod.Name, and added some logging to compare the two values. I then built the Docker image and redeployed the operator in my minikube. This is the result:

$ kubectl logs -n poc nats-streaming-operator-5d4777f476-2wf7n

time="2020-07-23T20:25:22Z" level=info msg="cluster name: stan-cluster" # this is the o.Name
time="2020-07-23T20:25:22Z" level=info msg="pod name: stan-cluster-2" # this is the pod.Name

Now the cluster ID is set correctly for the pods:

$ kubectl logs -n poc stan-cluster-2 # stan-cluster-2 is the correct cluster-id!

[1] 2020/07/23 20:27:22.126726 [INF] STREAM: Starting nats-streaming-server[stan-cluster-2] version 0.18.0

All servers are now ready.

hbobenicio on Jul 24, 2020

Maybe this line is a clue to what's happening: https://github.com/nats-io/nats-streaming-operator/blob/079120fc31b6c10d041c4f594d9d4bd9d78ededa/internal/operator/controller.go#L379

Isn't it supposed to be pod.Name or something similar?

hbobenicio on Jul 23, 2020

Same issue for me.

sneerin on May 19, 2020

Any update on this issue? I see the same behaviour on EKS. If I keep retrying, it eventually works; however, after a pod restart it starts happening again.

hasanovkhalid on Apr 20, 2020