docker-elk: 7.2.0 fails to start on docker-swarm

Problem description

Yesterday I started upgrading ELK from 6.5.4 to 7.2.0.
Sadly, I ran into a lot of problems.
I am deploying this solution on docker-swarm, with the relevant changes
(e.g. discovery.zen.ping.unicast.hosts: tasks.elasticsearch, discovery.type: zen).

I noticed that version 7 added some mandatory settings, such as ‘cluster.initial_master_nodes’, which makes it hard to run in swarm mode.
I found this out when I received the following message:

"master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and [cluster.initial_master_nodes] is empty on this node: have discovered []... "

This results in: ‘MasterNotDiscoveredException: null’.

That led me to the ‘Bootstrapping a cluster’ article, where this error is mentioned.

I can’t find a clean way to get my ELK cluster working in swarm mode on version 7.2.0. Has anyone made that upgrade and managed to stay alive?

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 1
  • Comments: 33 (1 by maintainers)

Most upvoted comments

A possible solution to this issue, which I haven’t tested yet:


services:
  elasticsearch:
    environment:
      node.name: "elk_elasticsearch.{{.Task.Slot}}"
      discovery.type: zen
      discovery.seed_hosts: tasks.elasticsearch
      cluster.initial_master_nodes: elk_elasticsearch.1,elk_elasticsearch.2,elk_elasticsearch.3

{{.Task.Slot}} supposedly contains the index part of {{.Task.Name}}, e.g. 2 when Task.Name == elk_elasticsearch.2.p8d7aufb80h.
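
As an aside, here is a minimal Python sketch of how the node names line up with the value of cluster.initial_master_nodes. The helper name initial_master_nodes is hypothetical, and it assumes a replicated service whose slots are numbered from 1 and a node.name templated as "<stack>_<service>.{{.Task.Slot}}", as in the snippet above.

```python
def initial_master_nodes(stack: str, service: str, replicas: int) -> str:
    """Build the comma-separated value for cluster.initial_master_nodes.

    Assumption: node.name is "<stack>_<service>.{{.Task.Slot}}" and Swarm
    numbers the slots of a replicated service 1..replicas.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # One entry per slot, matching the node.name each task will report.
    return ",".join(f"{stack}_{service}.{slot}" for slot in range(1, replicas + 1))


# Stack "elk", service "elasticsearch", 3 replicas:
print(initial_master_nodes("elk", "elasticsearch", 3))
# prints elk_elasticsearch.1,elk_elasticsearch.2,elk_elasticsearch.3
```

The point is only that the list has to be regenerated by hand (or by a script like this) whenever the stack name, service name, or number of master-eligible replicas changes.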

I’ll try to find time to validate this today or tomorrow.


edit: It works! 🎉 cc @rong0312 @saifat29 @ranjithvaddepally

elk_elasticsearch.1.qrjnhmnk0qu4@manny    | {
   "type":"server",
   "timestamp":"2019-11-07T21:08:11,271Z",
   "level":"INFO",
   "component":"o.e.c.s.MasterService",
   "cluster.name":"docker-cluster",
   "node.name":"elk_elasticsearch.1",
   "message":"elected-as-master ([2] nodes joined)[{elk_elasticsearch.1}{_PQh5XQuTW6CPgsItFXyow}{NbiZHwd4TFOmWBZ_K5HqCA}{10.0.0.19}{10.0.0.19:9300}{dilm}{ml.machine_memory=15347986432, xpack.installed=true, ml.max_open_jobs=20} elect leader, {elk_elasticsearch.2}{ngjSTrJ6RyaIvfjntl1VTg}{A91_kAAsRUCHQqmU6RJavQ}{10.0.0.17}{10.0.0.17:9300}{dilm}{ml.machine_memory=15347986432, ml.max_open_jobs=20, xpack.installed=true} elect leader, _BECOME_MASTER_TASK_, _FINISH_ELECTION_], term: 2, version: 1, reason: master node changed {previous [], current [{elk_elasticsearch.1}{_PQh5XQuTW6CPgsItFXyow}{NbiZHwd4TFOmWBZ_K5HqCA}{10.0.0.19}{10.0.0.19:9300}{dilm}{ml.machine_memory=15347986432, xpack.installed=true, ml.max_open_jobs=20}]}, added {{elk_elasticsearch.2}{ngjSTrJ6RyaIvfjntl1VTg}{A91_kAAsRUCHQqmU6RJavQ}{10.0.0.17}{10.0.0.17:9300}{dilm}{ml.machine_memory=15347986432, ml.max_open_jobs=20, xpack.installed=true},}"
}

[Screenshot: Stack Monitoring - docker-cluster - Elasticsearch - Nodes]

As @antoineco stated above, I think it makes more sense to use tasks.elasticsearch instead of elasticsearch in the discovery.seed_hosts property. Even better would be to issue a DNS lookup on tasks.elasticsearch and use the result as the property value.

The service name elasticsearch resolves to the virtual IP associated with the service, which load-balances each connection to one of the service’s tasks, whereas tasks.elasticsearch resolves to the IP addresses of all the tasks.

See the Container discovery section in https://docs.docker.com/network/overlay/
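
For illustration, the DNS lookup idea could look roughly like the Python sketch below. It is untested against a real swarm; the function names resolve_task_ips and seed_hosts_value are hypothetical, port 9300 is taken from the transport addresses in the log above, and how the resulting string gets passed to Elasticsearch (for example from an entrypoint script) is left open.

```python
import socket


def resolve_task_ips(hostname="tasks.elasticsearch", port=9300,
                     resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPv4 addresses behind hostname.

    Inside a swarm overlay network, tasks.<service> is expected to resolve
    to one A record per running task. The resolver argument exists only so
    the lookup can be stubbed outside a swarm.
    """
    infos = resolver(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is an (address, port) pair.
    return sorted({info[4][0] for info in infos})


def seed_hosts_value(ips, port=9300):
    """Format addresses as a comma-separated discovery.seed_hosts value."""
    return ",".join(f"{ip}:{port}" for ip in ips)
```

Inside a container attached to the overlay network, seed_hosts_value(resolve_task_ips()) would then give a string suitable for discovery.seed_hosts. Note that the lookup only sees the tasks that are running at that moment, so simply setting the property to tasks.elasticsearch may still be the simpler option.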