opensearch-k8s-operator: Opensearch operator 2.0.4 does not work with the example opensearch cluster manifest.
Description
The example OpenSearch cluster YAML does not work with version 2.0.4 of the operator.
Steps to reproduce
- Add the Helm repo for the operator:
  helm repo add opensearch-operator https://opster.github.io/opensearch-k8s-operator/
- Install v2.0.4 of the operator:
  helm install opensearch-operator opensearch-operator/opensearch-operator --version 2.0.4
- Wait for the operator pods to be ready.
- Apply the manifest for the example OpenSearch cluster:
  kubectl apply -f ~/Git/opensearch-k8s-operator/opensearch-operator/examples/opensearch-cluster.yaml
- After ~10 mins, check the cluster:
- In this instance the master and one coordinator node are restarting.
- Describe the master pod to debug:
  kubectl describe pod my-cluster-masters-0
  my-cluster-masters-0.log
- Describe the coordinator pod to debug:
  kubectl describe pod my-cluster-coordinators-1
  my-cluster-coordinators-1.log
- Both pods terminated with exit code 137.
Steps to work around
- I have tried doubling the memory requests and limits to 4Gi for each node group, but I am still getting the 137 error and the pods are killed. Any advice on this would be greatly appreciated. Could you also supply a working example YAML for the latest OpenSearch images? I think 1.3.0 is about 5 months old now. Thanks.
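For reference, resources are set per node pool in the OpenSearchCluster spec; below is a minimal sketch of the fragment in question (the component name and values are illustrative, not the exact manifest used here). Exit code 137 means the container was SIGKILLed, typically by the OOM killer, so the memory limit and the JVM heap inside it are the usual suspects.

```yaml
# Illustrative OpenSearchCluster node pool fragment with doubled requests/limits.
# Component name and sizes are examples only, not the original manifest.
nodePools:
  - component: masters
    replicas: 3
    diskSize: "30Gi"
    resources:
      requests:
        memory: "4Gi"
        cpu: "500m"
      limits:
        memory: "4Gi"
        cpu: "500m"
```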
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Reactions: 1
- Comments: 38 (3 by maintainers)
I got it working!
Current Versions:
Steps:
What I noticed is that when deploying version 1.3.1 or 1.3.2 there is an additional bootstrapping pod; this pod is not present with higher OpenSearch versions.
So I guess that because there is no pod doing the bootstrapping, the cluster creation fails altogether, unless you are upgrading from a previous version where there is a dedicated bootstrapping pod.
Attachments: Minimal cluster manifest: minimal-cluster.yaml
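For context, the OpenSearch version the operator deploys is pinned in the cluster spec; a minimal sketch of the relevant field (illustrative, not the attached minimal-cluster.yaml):

```yaml
# Illustrative fragment: the OpenSearch version the operator deploys is pinned here.
# 1.3.1/1.3.2 brought up the extra bootstrap pod in these tests; 2.x did not.
spec:
  general:
    serviceName: my-cluster
    version: 1.3.2
```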
Sure, re-opening this issue. The example opensearch-cluster.yaml works fine; we should add a new example for 2.x (>2.0.0).
Hey all, just FYI I'm able to get the cluster up and running with the following YAML.
I believe there is some confusion with roles; passing - "cluster_manager" for versions above 2.0.0 should work fine. @dickescheid thanks for all the help, time, and input early on as well!
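To make the roles point concrete, here is a minimal sketch of a 2.x node pool (illustrative only; the working manifests are the ones attached above):

```yaml
# Illustrative node pool for OpenSearch >= 2.0.0: the "master" role was
# renamed to "cluster_manager"; use "master" only for 1.x clusters.
nodePools:
  - component: masters
    replicas: 3
    roles:
      - "cluster_manager"
      - "data"
```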
Hey, thanks for the update @dobharweim and to all who participated here 😃. Closing this issue; please feel free to reopen if required. We should have some details about the cluster_manager change reflected in the README docs. @idanl21 @segalziv @swoehrl-mw thank you.
@prudhvigodithi I can confirm (gladly) that the above works. Thanks.
P.S. - I did initially see the second master pod getting killed with a 137 error again, so I increased resources:
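(A sketch of the kind of tuning this refers to; the jvm field is an assumption based on the operator's node pool options, and all values are illustrative:)

```yaml
# Illustrative fragment: keep the JVM heap well below the container memory limit.
# The jvm field is assumed from the operator's node pool options, not taken from
# the manifest discussed in this thread.
nodePools:
  - component: masters
    resources:
      limits:
        memory: "4Gi"
    jvm: "-Xms2g -Xmx2g"
```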
Oh, that's right! I'll check this tomorrow. Indeed, since the example used master and supported both 1.x and 2.x, I assumed the operator used the CR to generate a perfect fit for each version.
I was looking over the source for something else and missed this.
Nice, thanks. Good to know we got it working; bummer I didn't find it myself, it would have saved me a lot of work.
Then we’ll have to wait for a fix.
I concluded something similar in #251
I'm running opensearch-operator 2.0.4 and have tried removing the operator and applying it again as well. I destroyed the properly running cluster and tried recreating it; that also failed. Currently I am only getting the bootstrapping issues.
The only thing that changed between the time the OpenSearch cluster was working and now is the Kubernetes cluster version: I updated from v1.22 to 1.23.7-gke.1400 in the last few days.
My Kubernetes cluster is GKE on Google Cloud.