jaeger-operator: failed to create primary Elasticsearch client with health check timeout - no Elasticsearch node available
Hi,
I am trying to set up Elasticsearch storage with the Jaeger operator, but it keeps failing the initial health check.
Versions:
- Jaeger operator: v1.12.1
- K8s: v1.12.8
- Elasticsearch operator: v0.8.1
- Elasticsearch: v7.2.0
The operators are deployed and running. When the Jaeger instance is deployed with the es backend, the collector and query pods fail with:
$ kubectl logs my-jaeger-collector-8477b87bfd-sdcq4
{"level":"info","ts":1562160343.721835,"caller":"flags/service.go:113","msg":"Mounting metrics handler on admin server","route":"/metrics"}
{"level":"info","ts":1562160343.7220967,"caller":"flags/admin.go:108","msg":"Mounting health check on admin server","route":"/"}
{"level":"info","ts":1562160343.7221494,"caller":"flags/admin.go:114","msg":"Starting admin HTTP server","http-port":14269}
{"level":"info","ts":1562160343.7221606,"caller":"flags/admin.go:100","msg":"Admin server started","http-port":14269,"health-status":"unavailable"}
{"level":"fatal","ts":1562160348.7503304,"caller":"collector/main.go:87","msg":"Failed to init storage factory","error":"failed to create primary Elasticsearch client: health check timeout: Head http://quickstart-es:9200: EOF: no Elasticsearch node available","errorVerbose":"no Elasticsearch node available\ngithub.com/jaegertracing/jaeger/vendor/gopkg.in/olivere/elastic%2ev5.init.ializers\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/gopkg.in/olivere/elastic.v5/client.go:88\nruntime.main\n\t/home/travis/.gimme/versions/go1.12.1.linux.amd64/src/runtime/proc.go:188\nruntime.goexit\n\t/home/travis/.gimme/versions/go1.12.1.linux.amd64/src/runtime/asm_amd64.s:1337\nhealth check timeout: Head http://quickstart-es:9200: EOF
$ kubectl get po
NAME                                  READY   STATUS             RESTARTS   AGE
my-jaeger-agent-daemonset-pxbnn       1/1     Running            0          5m
my-jaeger-agent-daemonset-v4x47       1/1     Running            0          5m
my-jaeger-agent-daemonset-v8fc9       1/1     Running            0          5m
my-jaeger-collector-f9d9656db-bwcb6   0/1     CrashLoopBackOff   5          5m
my-jaeger-query-74576c764-fxj7h       1/2     CrashLoopBackOff   5          5m
Jaeger operator logs:
$ kubectl logs jaeger-operator-5ddcb7c446-ksj9m -n observability
time="2019-07-03T12:01:17Z" level=info msg=Versions arch=amd64 jaeger-operator=1.12.1 operator-sdk=v0.8.1 os=linux version=go1.12.5
time="2019-07-03T12:01:18Z" level=info msg="Auto-detected the platform" platform=kubernetes
time="2019-07-03T12:01:18Z" level=info msg="Automatically adjusted the 'es-provision' flag" es-provision=false
time="2019-07-03T13:14:22Z" level=error msg="failed to apply the changes" error="timed out waiting for the condition" execution="2019-07-03 13:09:22.662415944 +0000 UTC" instance=my-jaeger namespace=default
time="2019-07-03T13:19:23Z" level=error msg="failed to apply the changes" error="timed out waiting for the condition" execution="2019-07-03 13:14:23.743258079 +0000 UTC" instance=my-jaeger namespace=default
time="2019-07-03T13:24:24Z" level=error msg="failed to apply the changes" error="timed out waiting for the condition" execution="2019-07-03 13:19:24.779557298 +0000 UTC" instance=my-jaeger namespace=default
My spec file looks like:
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: my-jaeger
spec:
  strategy: production
  ui:
    options:
      dependencies:
        menuEnabled: false
      tracking:
        gaID: UA-000000-2
      menu:
      - label: "About Jaeger"
        items:
        - label: "Documentation"
          url: "https://www.jaegertracing.io/docs/latest"
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: http://quickstart-es:9200
    secretName: mysecret
  ingress:
    enabled: false
  agent:
    strategy: DaemonSet
  annotations:
    scheduler.alpha.kubernetes.io/critical-pod: ""
The ES service has been created and is running:
$ kubectl get svc quickstart-es
NAME            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
quickstart-es   ClusterIP   100.66.174.252   <none>        9200/TCP   4h
The Elasticsearch cluster is in the same k8s cluster and namespace (I think):
$ curl -u "elastic:$PASSWORD" -k "https://localhost:9200"
{
  "name" : "quickstart-es-96dchsxw4f",
  "cluster_name" : "quickstart",
  "cluster_uuid" : "someuuid",
  "version" : {
    "number" : "7.2.0",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "508c38a",
    "build_date" : "2019-06-20T15:54:18.811730Z",
    "build_snapshot" : false,
    "lucene_version" : "8.0.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
I have even tried giving the full DNS name in the Jaeger config, like http://quickstart-es.default.svc.cluster.local:9200, and even the cluster IP.
My secret file looks like this (the password was obtained with $(kubectl get secret quickstart-elastic-user -o=jsonpath='{.data.elastic}' | base64 --decode) and the username is elastic):
apiVersion: v1
kind: Secret
metadata:
  name: mysecret
type: Opaque
data:
  ES_USERNAME: ZWxhc3RpYw==
  ES_PASSWORD: somepasswordinbase64
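Values under `data:` in a Secret must be base64-encoded, and a stray trailing newline (for example from `echo`) ends up inside the decoded credential. The following is a minimal sketch, not part of the original report, that checks the `ES_USERNAME` value from the Secret above and shows how the newline changes the encoding. The variable names are illustrative only.

```shell
# Encode the username without a trailing newline (printf, not echo).
encoded=$(printf '%s' 'elastic' | base64)
printf '%s\n' "$encoded"        # ZWxhc3RpYw==

# Decode the value used in the Secret to confirm it round-trips.
decoded=$(printf '%s' 'ZWxhc3RpYw==' | base64 --decode)
printf '%s\n' "$decoded"        # elastic

# For comparison: echo appends a newline, which yields a different value.
with_newline=$(echo 'elastic' | base64)
printf '%s\n' "$with_newline"   # ZWxhc3RpYwo=
```

If the password in `ES_PASSWORD` was encoded with `echo`, the decoded value would carry a trailing newline, so it is worth comparing it against the output of the `kubectl get secret ... | base64 --decode` command quoted above.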
There are some more error logs in the ES cluster pod:
{"type": "server", "timestamp": "2019-07-03T13:47:18,707+0000", "level": "WARN", "component": "o.e.h.AbstractHttpServerTransport", "cluster.name": "quickstart", "node.name": "quickstart-es-96dchsxw4f", "cluster.uuid": "someuuid", "node.id": "someid", "message": "caught exception while handling client http traffic, closing connection Netty4HttpChannel{localAddress=0.0.0.0/0.0.0.0:9200, remoteAddress=/xx.xx.x.xx:51134}" ,
"stacktrace": ["io.netty.handler.codec.DecoderException: io.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record: 48454144202f2048545450....",
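The hex string in the `NotSslRecordException` is the start of the bytes that Elasticsearch received on port 9200 and could not parse as a TLS record. Decoding it shows what the client actually sent. The sketch below is a hedged illustration using only POSIX shell tools (`xxd -r -p` would do the same where it is installed); the variable names are illustrative.

```shell
# First bytes of the rejected request, copied from the Elasticsearch log.
hex="48454144202f2048545450"

decoded=""
# Split the string into two-character hex bytes and convert each one
# to a character through its octal escape, which printf understands portably.
for byte in $(echo "$hex" | sed 's/../& /g'); do
  decoded="${decoded}$(printf "\\$(printf '%03o' "0x${byte}")")"
done

printf '%s\n' "$decoded"   # HEAD / HTTP
```

The payload is a plaintext `HEAD / HTTP...` request, which matches the `Head http://quickstart-es:9200` health check in the collector log. This is consistent with the Elasticsearch HTTP layer expecting TLS (the `curl` above used `https://` with `-k`) while `server-urls` in the Jaeger spec uses `http://`.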
Can anyone please help with this case? Thanks.
About this issue
- State: closed
- Created 5 years ago
- Reactions: 4
- Comments: 37 (21 by maintainers)
@vishnuhd Here are more simplified deployment manifests:
ElasticSearch:
Kibana:
Jaeger:
Example of quickstart-es secret manifest generated manually:
@vishnuhd I finally figured out how it works! I installed delve inside the jaeger-collector and ran a remote debugging session in order to understand what the right configuration should be.
Here is my jaeger custom resource manifest:
I am using the ElasticSearch Operator from elastic in the master branch (probably the future version v0.9.0). The ElasticSearch Operator generates the secret resource named quickstart-es-http-certs-public.