moby: Swarm daemon on worker node randomly panics claiming overlay network doesn't exist

Description

The Docker daemon on a swarm worker node randomly panics, claiming the overlay network doesn’t exist. In this case I’m running an Elasticsearch stack on top of a 3-node Docker cluster in swarm mode.

Steps to reproduce the issue:

  1. Set up a 3-node Docker swarm cluster with 1 manager and 2 workers
  2. Given the following docker-compose.yml:
version: '3.3'
services:
  elastic:
    image: docker.elastic.co/elasticsearch/elasticsearch:5.6.2
    command: |
      elasticsearch
      -E "xpack.ml.enabled=false"
      -E "xpack.monitoring.enabled=true"
      -E "xpack.security.enabled=false"
      -E "xpack.watcher.enabled=false"
      -E "transport.host=0.0.0.0"
      -E "cluster.name=search"
      -E "discovery.zen.ping.unicast.hosts=elastic"
      -E "discovery.zen.minimum_master_nodes=2"
      -E "discovery.zen.fd.ping_interval=3s"
      -E "discovery.zen.fd.ping_timeout=120s"
      -E "discovery.zen.fd.ping_retries=10"
    environment:
      - ES_JAVA_OPTS=-Xms${ES_MEMORY-1g} -Xmx${ES_MEMORY-1g}
    ulimits:
      memlock: -1
      nofile:
        hard: 65536
        soft: 65536
      nproc: 65538
    volumes:
      - elastic_data:/usr/share/elasticsearch/data
    networks:
      - elk
    deploy:
      mode: global
      endpoint_mode: dnsrr
  kibana:
    image: docker.elastic.co/kibana/kibana:5.6.2
    command: |
      kibana
      -e http://elastic:9200
    ports:
      - "5601:5601"
    networks:
      - elk
    deploy:
      mode: replicated
      replicas: 1
    healthcheck:
      test: curl -s http://localhost:5601 > /dev/null
      interval: 30s
      retries: 3
  elasticsearch_proxy:
    image: nginx:alpine
    deploy:
      mode: replicated
      replicas: 1
    ports:
      - "9200:9200"
    networks:
      - elk
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro

volumes:
  elastic_data:

networks:
  elk:
    attachable: true
  3. docker stack deploy --prune -c ./docker-compose.yml elk
  4. Wait a couple of hours and watch a worker node panic 💥

Note: The exact same issue was reproduced with both Docker 17.09 and 17.10.

Describe the results you received:

The daemon on one of the worker nodes panics and crashes after a few hours. Below is the daemon log. It shows multiple errors, most notably that the network itself no longer exists. The swarm node is marked as Down. Restarting the daemon brings things back to normal temporarily, until the issue occurs again after a few hours.
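Since the log is dominated by repeated "Node join event" lines, the errors are easy to miss. Here is a minimal Python sketch for pulling out and counting only the error-level entries. It assumes the `time=... level=... msg="..."` line format seen in this log; the function name `summarize_errors` is my own and not part of any Docker tooling.

```python
import re
import sys
from collections import Counter

# Matches the level and msg fields of a daemon log line such as:
#   time="..." level=error msg="fatal task error" error="..." module=...
LINE_RE = re.compile(r'level=(\w+) msg="((?:[^"\\]|\\.)*)"')
# Optional error="..." field that follows msg on task errors.
ERR_RE = re.compile(r' error="((?:[^"\\]|\\.)*)"')


def summarize_errors(lines):
    """Count error-level entries, keyed by msg (plus the error field if present)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None or match.group(1) != "error":
            continue  # skip info/warning lines and non-log lines (e.g. SCHED traces)
        key = match.group(2)
        err = ERR_RE.search(line)
        if err is not None:
            key = f"{key}: {err.group(1)}"
        counts[key] += 1
    return counts


if __name__ == "__main__":
    # Reads the log from standard input and prints the most frequent errors first.
    for message, count in summarize_errors(sys.stdin).most_common():
        print(f"{count:5d}  {message}")
```

The script reads from standard input, so a saved copy of the daemon log can be piped into it. Messages that embed endpoint IDs are counted separately, because the sketch does no normalization of IDs.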

Docker Daemon Log 1
Starting Docker Application Container Engine...
time="2017-10-24T17:20:19.455216078Z" level=info msg="libcontainerd: new containerd process, pid: 46496"
time="2017-10-24T17:20:20.460071461Z" level=info msg="[graphdriver] using prior storage driver: overlay2"
time="2017-10-24T17:20:20.472890656Z" level=info msg="Graph migration to content-addressability took 0.00 seconds"
time="2017-10-24T17:20:20.473060456Z" level=warning msg="Your kernel does not support swap memory limit"
time="2017-10-24T17:20:20.473091456Z" level=warning msg="Your kernel does not support cgroup rt period"
time="2017-10-24T17:20:20.473099456Z" level=warning msg="Your kernel does not support cgroup rt runtime"
time="2017-10-24T17:20:20.473108056Z" level=warning msg="Your kernel does not support cgroup blkio weight"
time="2017-10-24T17:20:20.473115456Z" level=warning msg="Your kernel does not support cgroup blkio weight_device"
time="2017-10-24T17:20:20.473420555Z" level=info msg="Loading containers: start."
time="2017-10-24T17:20:23.529565986Z" level=info msg="Removing stale sandbox d1404162d11f3d3a69661b013cd3a78420a2addecd923be983f5dfe5960ac1ee (2f7d15c7adc6cee6efb246fce0b4a759cdead8b3ad93c66f826557a9d2679f7b)"
time="2017-10-24T17:21:28.161617947Z" level=info msg="Removing stale sandbox ingress_sbox (ingress-sbox)"
time="2017-10-24T17:21:28.303078888Z" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
SCHED 0ms: gomaxprocs=8 idleprocs=6 threads=10 spinningthreads=1 idlethreads=2 runqueue=0 gcwaiting=0 nmidlelocked=1 stopwait=0 sysmonwait=0
  P0: status=0 schedtick=344 syscalltick=210 m=-1 runqsize=0 gfreecnt=1
  P1: status=0 schedtick=179 syscalltick=1 m=-1 runqsize=0 gfreecnt=1
  P2: status=0 schedtick=33 syscalltick=1 m=-1 runqsize=0 gfreecnt=0
  P3: status=1 schedtick=73 syscalltick=434 m=4 runqsize=0 gfreecnt=0
  P4: status=1 schedtick=67 syscalltick=386 m=5 runqsize=0 gfreecnt=2
  P5: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P6: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P7: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  M9: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 helpgc=0 spinning=false blocked=true lockedg=-1
  M8: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 helpgc=0 spinning=false blocked=true lockedg=-1
  M7: p=0 curg=12 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 helpgc=0 spinning=false blocked=false lockedg=-1
  M6: p=-1 curg=7 mallocing=0 throwing=0 preemptoff= locks=2 dying=0 helpgc=0 spinning=false blocked=false lockedg=-1
  M5: p=4 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 helpgc=0 spinning=true blocked=false lockedg=-1
  M4: p=3 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 helpgc=0 spinning=false blocked=false lockedg=-1
  M3: p=-1 curg=5 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 helpgc=0 spinning=false blocked=true lockedg=-1
  M2: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 helpgc=0 spinning=false blocked=false lockedg=-1
  M1: p=-1 curg=17 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 helpgc=0 spinning=false blocked=false lockedg=17
  M0: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 helpgc=0 spinning=false blocked=true lockedg=10
  G1: status=4(chan receive) m=-1 lockedm=-1
  G17: status=3() m=1 lockedm=1
  G2: status=4(force gc (idle)) m=-1 lockedm=-1
  G3: status=4(GC sweep wait) m=-1 lockedm=-1
  G4: status=4(finalizer wait) m=-1 lockedm=-1
  G5: status=3() m=3 lockedm=-1
  G8: status=4(chan receive) m=-1 lockedm=-1
  G7: status=3() m=6 lockedm=-1
  G9: status=6() m=-1 lockedm=-1
  G10: status=4(select) m=-1 lockedm=0
  G11: status=4(chan receive) m=-1 lockedm=-1
  G12: status=3() m=7 lockedm=-1
  G13: status=4(chan receive) m=-1 lockedm=-1
  G14: status=4(chan receive) m=-1 lockedm=-1
  G15: status=4(chan receive) m=-1 lockedm=-1
  G16: status=4(chan receive) m=-1 lockedm=-1
  G34: status=4(chan receive) m=-1 lockedm=-1
  G35: status=4(chan receive) m=-1 lockedm=-1
  G36: status=4(select) m=-1 lockedm=-1
  G37: status=4(chan receive) m=-1 lockedm=-1
  G38: status=4(chan receive) m=-1 lockedm=-1
  G39: status=4(chan receive) m=-1 lockedm=-1
  G40: status=4(chan receive) m=-1 lockedm=-1
  G41: status=4(chan receive) m=-1 lockedm=-1
  G42: status=4(chan receive) m=-1 lockedm=-1
  G43: status=4(IO wait) m=-1 lockedm=-1
  G44: status=4(chan receive) m=-1 lockedm=-1
  G45: status=4(IO wait) m=-1 lockedm=-1
  G46: status=4(select) m=-1 lockedm=-1
  G50: status=4(chan receive) m=-1 lockedm=-1
  G51: status=6() m=-1 lockedm=-1
  G47: status=6() m=-1 lockedm=-1
  G198: status=6() m=-1 lockedm=-1
  G184: status=4(chan receive) m=-1 lockedm=-1
  G185: status=4(select) m=-1 lockedm=-1
  G186: status=4(sleep) m=-1 lockedm=-1
time="2017-10-24T17:21:28.612748960Z" level=info msg="Loading containers: done."
time="2017-10-24T17:21:28.637089650Z" level=warning msg="Not using native diff for overlay2, this may cause degraded performance for building images: kernel has CONFIG_OVERLAY_FS_REDIRECT_DIR enabled"
time="2017-10-24T17:21:28.644009447Z" level=info msg="Docker daemon" commit=f4ffd25 graphdriver(s)=overlay2 version=17.10.0-ce
time="2017-10-24T17:21:28.676740033Z" level=error msg="fatal task error" error="cannot create a swarm scoped network when swarm is not active" module=node/agent/worker/taskmanager node.id=3dcuvcbrf9qh8f3mq2xddw9y4 service.id=qhyu2y7oxbww8tno955lpl4o7 task.id=v1v7e99z6e5bs8q69tnxmlqdd
time="2017-10-24T17:21:28.736483208Z" level=info msg="Daemon has completed initialization"
time="2017-10-24T17:21:28.736489108Z" level=info msg="Initializing Libnetwork Agent Listen-Addr=0.0.0.0 Local-addr=10.8.23.6 Adv-addr=10.8.23.6 Data-addr= Remote-addr-list=[10.8.23.5] MTU=1500"
time="2017-10-24T17:21:28.736829508Z" level=info msg="Node 8858e009ad13/10.8.23.6, joined gossip cluster"
time="2017-10-24T17:21:28.736906208Z" level=info msg="Node 8858e009ad13/10.8.23.6, added to nodes list"
time="2017-10-24T17:21:28.738664607Z" level=info msg="Node 122803338add/10.8.23.4, joined gossip cluster"
time="2017-10-24T17:21:28.738709307Z" level=info msg="Node 122803338add/10.8.23.4, added to nodes list"
time="2017-10-24T17:21:28.738736207Z" level=info msg="Node d95cc511a502/10.8.23.5, joined gossip cluster"
time="2017-10-24T17:21:28.738751707Z" level=info msg="Node d95cc511a502/10.8.23.5, added to nodes list"
time="2017-10-24T17:21:28.738769607Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
Started Docker Application Container Engine.
time="2017-10-24T17:21:28.745125305Z" level=info msg="API listen on /var/run/docker.sock"
time="2017-10-24T17:21:28.752266602Z" level=error msg="Handler for GET /containers/json returned error: write unix /var/run/docker.sock->@: write: broken pipe"
http: multiple response.WriteHeader calls
time="2017-10-24T17:21:28.873791051Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:29.020101991Z" level=info msg="Node join event for 8858e009ad13/10.8.23.6"
time="2017-10-24T17:21:29.020650390Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:29.072899569Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:29.219736308Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:29.271496186Z" level=warning msg="failed to deactivate service binding for container elk_elastic.3dcuvcbrf9qh8f3mq2xddw9y4.v1v7e99z6e5bs8q69tnxmlqdd" error="No such container: elk_elastic.3dcuvcbrf9qh8f3mq2xddw9y4.v1v7e99z6e5bs8q69tnxmlqdd" module=node/agent node.id=3dcuvcbrf9qh8f3mq2xddw9y4
time="2017-10-24T17:21:29.699261209Z" level=error msg="Could not open netlink handle during vni population for ns /var/run/docker/netns/1-e671312jr3: failed to set into network namespace 25 while creating netlink socket: invalid argument"
time="2017-10-24T17:21:29.873746837Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:30.020707876Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:30.073223854Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:30.221482392Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
SCHED 0ms: gomaxprocs=8 idleprocs=6 threads=21 spinningthreads=0 idlethreads=9 runqueue=0
time="2017-10-24T17:21:30.873862522Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:30.892517114Z" level=error msg="fatal task error" error="Failed joining elk_elk-endpoint to sandbox elk_elk-sbox: failed to update overlay endpoint 69ff136 to local data store: Key not found in store" module=node/agent/taskmanager node.id=3dcuvcbrf9qh8f3mq2xddw9y4 service.id=qhyu2y7oxbww8tno955lpl4o7 task.id=ia8a5mcuu2syxjd1c3lckxn08
time="2017-10-24T17:21:30.894390813Z" level=warning msg="rmServiceBinding cleanupServiceBindings elk_kibana 0ca3045a3078a48f46a03b4a1a398177a414fca04aaef2bd09a8635e050c0d24 aborted s.loadBalancers[nid] !ok"
time="2017-10-24T17:21:30.894470313Z" level=warning msg="rmServiceBinding cleanupServiceBindings elk_elastic 269278f3cb173a2121550ae54607b64c67e1aa7e92de9310978c0b8a2212220e aborted lb.backEnds[eid] !ok"
time="2017-10-24T17:21:30.894582613Z" level=warning msg="rmServiceBinding handleEpTableEvent elk_elastic e1b4c088820e465c41103e1b57049bd88e32bc7e2702eeb43761a05a547ed875 aborted c.serviceBindings[skey] !ok"
time="2017-10-24T17:21:30.894613713Z" level=warning msg="rmServiceBinding cleanupServiceBindings elk_elasticsearch_proxy 4906935a181f249935dc12b2006ca4244a54718046b5e0dd6e9f7ec2a8272e6c aborted s.loadBalancers[nid] !ok"
time="2017-10-24T17:21:31.020598861Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:31.072872039Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:31.219726578Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:31.272623356Z" level=error msg="failed removing service binding for 4906935a181f249935dc12b2006ca4244a54718046b5e0dd6e9f7ec2a8272e6c epRec:{elk_elasticsearch_proxy.1.ddrbwgfnams3wpj83m7oayd37 elk_elasticsearch_proxy ptcgzgacuepra4pp9q0mvj9ii 10.0.0.9 10.0.0.10 [] [elasticsearch_proxy] [363b92a5984d]} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:21:31.272689456Z" level=error msg="failed removing container name resolution for 6f7cf0a5678e085252344ad122e488bb6a15ae600a56b7c0107f930d686ca98e epRec:{elk_elk-endpoint   <nil> 10.0.0.3 [] [] []} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:21:31.272715056Z" level=error msg="failed removing container name resolution for 86c08f92fcfa3a194a6d5a04da2162ed30c7f32747fe66c718160ab525dfc006 epRec:{elk_elk-endpoint   <nil> 10.0.0.4 [] [] []} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:21:31.272734156Z" level=error msg="failed removing service binding for e1b4c088820e465c41103e1b57049bd88e32bc7e2702eeb43761a05a547ed875 epRec:{elk_elastic.l5cm81dnqu5taksonzv0xqj3o.jel6tai61dzok2dzl1q1uwwti elk_elastic qhyu2y7oxbww8tno955lpl4o7 <nil> 10.0.0.12 [] [elastic] [62fa17072482]} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:21:31.284340151Z" level=error msg="fatal task error" error="Failed creating elk_elk-endpoint in sandbox elk_elk-sbox: failed to get network during CreateEndpoint: network e671312jr3o0s634vnv40zlw5 not found" module=node/agent/taskmanager node.id=3dcuvcbrf9qh8f3mq2xddw9y4 service.id=qhyu2y7oxbww8tno955lpl4o7 task.id=pbr9gcjzwq21vitsi7rhd762m
time="2017-10-24T17:21:31.440969186Z" level=error msg="Failed to add firewall mark rule in sbox ingress (ingress): reexec failed: signal: segmentation fault (core dumped)"
time="2017-10-24T17:21:31.592097124Z" level=error msg="Failed to add firewall mark rule in sbox ingress (ingress): reexec failed: signal: segmentation fault (core dumped)"
time="2017-10-24T17:21:31.690238283Z" level=warning msg="failed to deactivate service binding for container elk_elastic.3dcuvcbrf9qh8f3mq2xddw9y4.ia8a5mcuu2syxjd1c3lckxn08" error="No such container: elk_elastic.3dcuvcbrf9qh8f3mq2xddw9y4.ia8a5mcuu2syxjd1c3lckxn08" module=node/agent node.id=3dcuvcbrf9qh8f3mq2xddw9y4
time="2017-10-24T17:21:31.873576607Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:32.020565646Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:32.072781324Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:32.219758963Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:32.873633492Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:33.021111531Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:33.072945709Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:33.219576648Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:33.690935653Z" level=error msg="network elk_elk remove failed: No such network: elk_elk" module=node/agent node.id=3dcuvcbrf9qh8f3mq2xddw9y4
time="2017-10-24T17:21:33.690964853Z" level=error msg="remove task failed" error="No such network: elk_elk" module=node/agent node.id=3dcuvcbrf9qh8f3mq2xddw9y4 task.id=ia8a5mcuu2syxjd1c3lckxn08
time="2017-10-24T17:21:33.873721977Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:34.020643116Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:34.072786395Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:34.219673534Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:34.873756262Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:35.020637801Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:35.073029580Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:35.219571219Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:35.873519948Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:36.020388487Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:36.072924465Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:36.272952182Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:36.719795896Z" level=warning msg="failed to deactivate service binding for container elk_elastic.3dcuvcbrf9qh8f3mq2xddw9y4.pbr9gcjzwq21vitsi7rhd762m" error="No such container: elk_elastic.3dcuvcbrf9qh8f3mq2xddw9y4.pbr9gcjzwq21vitsi7rhd762m" module=node/agent node.id=3dcuvcbrf9qh8f3mq2xddw9y4
time="2017-10-24T17:21:36.873585033Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:37.020951372Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:37.072934450Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:37.219924989Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:37.873935318Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:38.020659057Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:38.072819735Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:38.219598574Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:38.720922266Z" level=warning msg="rmServiceBinding handleEpTableEvent elk_kibana 0ca3045a3078a48f46a03b4a1a398177a414fca04aaef2bd09a8635e050c0d24 aborted c.serviceBindings[skey] !ok"
time="2017-10-24T17:21:38.720964566Z" level=warning msg="rmServiceBinding handleEpTableEvent elk_elastic 269278f3cb173a2121550ae54607b64c67e1aa7e92de9310978c0b8a2212220e aborted c.serviceBindings[skey] !ok"
time="2017-10-24T17:21:38.720989866Z" level=warning msg="rmServiceBinding handleEpTableEvent elk_elasticsearch_proxy 4906935a181f249935dc12b2006ca4244a54718046b5e0dd6e9f7ec2a8272e6c aborted c.serviceBindings[skey] !ok"
time="2017-10-24T17:21:38.721029366Z" level=warning msg="rmServiceBinding handleEpTableEvent elk_elastic e1b4c088820e465c41103e1b57049bd88e32bc7e2702eeb43761a05a547ed875 aborted c.serviceBindings[skey] !ok"
time="2017-10-24T17:21:38.874129103Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:39.021275242Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:39.072906220Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:39.220393259Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:39.873702288Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:40.020655127Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:40.072820005Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:40.219714845Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:40.873654673Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:41.020831912Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:41.072948791Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:41.088193784Z" level=error msg="fatal task error" error="starting container failed: No such network: elk_elk" module=node/agent/taskmanager node.id=3dcuvcbrf9qh8f3mq2xddw9y4 service.id=qhyu2y7oxbww8tno955lpl4o7 task.id=dqeng0m22pbz2tftw6b7g7m8h
time="2017-10-24T17:21:41.219671230Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:41.873602058Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:42.020638897Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:42.072903076Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:42.219743715Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:42.873830743Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:43.021954082Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:43.072750361Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:43.219507100Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:43.873814529Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:44.020242068Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:44.072962846Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:44.219623785Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:44.873877814Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:45.072866631Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:45.219621270Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:45.873752599Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:46.020226838Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:46.072984316Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:46.219645355Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:46.335875107Z" level=error msg="fatal task error" error="starting container failed: No such network: elk_elk" module=node/agent/taskmanager node.id=3dcuvcbrf9qh8f3mq2xddw9y4 service.id=qhyu2y7oxbww8tno955lpl4o7 task.id=v80qjzyuzfsj9pnz1lr4k7qu4
time="2017-10-24T17:21:46.873782684Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:47.020702623Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:47.073053601Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:47.219685141Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:47.873978769Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:48.020725408Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:48.072882687Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:48.219584026Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:48.873716454Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:49.020623293Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:49.073026972Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:49.083858367Z" level=warning msg="rmServiceBinding handleEpTableEvent elk_elastic 269278f3cb173a2121550ae54607b64c67e1aa7e92de9310978c0b8a2212220e aborted s.loadBalancers[nid] !ok"
time="2017-10-24T17:21:49.084110267Z" level=error msg="failed removing service binding for 4906935a181f249935dc12b2006ca4244a54718046b5e0dd6e9f7ec2a8272e6c epRec:{elk_elasticsearch_proxy.1.ddrbwgfnams3wpj83m7oayd37 elk_elasticsearch_proxy ptcgzgacuepra4pp9q0mvj9ii 10.0.0.9 10.0.0.10 [] [elasticsearch_proxy] [363b92a5984d]} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:21:49.084384867Z" level=error msg="failed removing container name resolution for 6f7cf0a5678e085252344ad122e488bb6a15ae600a56b7c0107f930d686ca98e epRec:{elk_elk-endpoint   <nil> 10.0.0.3 [] [] []} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:21:49.084642367Z" level=error msg="failed removing container name resolution for 86c08f92fcfa3a194a6d5a04da2162ed30c7f32747fe66c718160ab525dfc006 epRec:{elk_elk-endpoint   <nil> 10.0.0.4 [] [] []} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:21:49.084898067Z" level=error msg="failed removing service binding for e1b4c088820e465c41103e1b57049bd88e32bc7e2702eeb43761a05a547ed875 epRec:{elk_elastic.l5cm81dnqu5taksonzv0xqj3o.jel6tai61dzok2dzl1q1uwwti elk_elastic qhyu2y7oxbww8tno955lpl4o7 <nil> 10.0.0.12 [] [elastic] [62fa17072482]} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:21:49.219784211Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:49.873562740Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:50.020772379Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:50.072796457Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:50.220264996Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:50.873845825Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:51.072817242Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:51.219518781Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:51.576116133Z" level=error msg="fatal task error" error="starting container failed: No such network: elk_elk" module=node/agent/taskmanager node.id=3dcuvcbrf9qh8f3mq2xddw9y4 service.id=qhyu2y7oxbww8tno955lpl4o7 task.id=9wv7xlxgeqqnfwru80l6co7zd
time="2017-10-24T17:21:51.873401210Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:52.020303949Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:52.072791527Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:52.219542166Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:52.873812595Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:53.020388834Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:53.072738912Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:53.419775468Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:53.873680980Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:54.072914098Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:54.219731937Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:54.315156097Z" level=error msg="failed removing service binding for 0ca3045a3078a48f46a03b4a1a398177a414fca04aaef2bd09a8635e050c0d24 epRec:{elk_kibana.1.3wgol6ysy1v3pak58a02ue39p elk_kibana iv7d6m7htbkgryhe2p4rodufa 10.0.0.7 10.0.0.15 [] [kibana] [58a87cd2fba7]} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:21:54.315414397Z" level=error msg="failed removing service binding for 269278f3cb173a2121550ae54607b64c67e1aa7e92de9310978c0b8a2212220e epRec:{elk_elastic.7gax1yova8zsg8fr2bcs2a71e.hk3dgs1f6h83x35y5v3wa5g0z elk_elastic qhyu2y7oxbww8tno955lpl4o7 <nil> 10.0.0.6 [] [elastic] [933e13c6afae]} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:21:54.315628797Z" level=error msg="failed removing service binding for 4906935a181f249935dc12b2006ca4244a54718046b5e0dd6e9f7ec2a8272e6c epRec:{elk_elasticsearch_proxy.1.ddrbwgfnams3wpj83m7oayd37 elk_elasticsearch_proxy ptcgzgacuepra4pp9q0mvj9ii 10.0.0.9 10.0.0.10 [] [elasticsearch_proxy] [363b92a5984d]} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:21:54.315827997Z" level=error msg="failed removing container name resolution for 6f7cf0a5678e085252344ad122e488bb6a15ae600a56b7c0107f930d686ca98e epRec:{elk_elk-endpoint   <nil> 10.0.0.3 [] [] []} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:21:54.316030797Z" level=error msg="failed removing container name resolution for 86c08f92fcfa3a194a6d5a04da2162ed30c7f32747fe66c718160ab525dfc006 epRec:{elk_elk-endpoint   <nil> 10.0.0.4 [] [] []} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:21:54.316236997Z" level=error msg="failed removing service binding for e1b4c088820e465c41103e1b57049bd88e32bc7e2702eeb43761a05a547ed875 epRec:{elk_elastic.l5cm81dnqu5taksonzv0xqj3o.jel6tai61dzok2dzl1q1uwwti elk_elastic qhyu2y7oxbww8tno955lpl4o7 <nil> 10.0.0.12 [] [elastic] [62fa17072482]} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:21:54.873505165Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:55.072827283Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:55.219641322Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:55.873710350Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:56.020406790Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:56.072828668Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:56.219574007Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:56.800549366Z" level=error msg="fatal task error" error="starting container failed: No such network: elk_elk" module=node/agent/taskmanager node.id=3dcuvcbrf9qh8f3mq2xddw9y4 service.id=qhyu2y7oxbww8tno955lpl4o7 task.id=vccv17utj2y9f3t6o2on1zsm5
time="2017-10-24T17:21:56.873539036Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:57.020300275Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:57.072694053Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:57.219549692Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:57.873690121Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:58.020370360Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:58.073043538Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:58.219619177Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:58.873748606Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:59.020556545Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:59.072927723Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:21:59.219796462Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:21:59.547306226Z" level=warning msg="rmServiceBinding handleEpTableEvent elk_kibana 0ca3045a3078a48f46a03b4a1a398177a414fca04aaef2bd09a8635e050c0d24 aborted c.serviceBindings[skey] !ok"
time="2017-10-24T17:21:59.547500026Z" level=warning msg="rmServiceBinding cleanupServiceBindings elk_elastic 269278f3cb173a2121550ae54607b64c67e1aa7e92de9310978c0b8a2212220e aborted s.loadBalancers[nid] !ok"
time="2017-10-24T17:21:59.547535726Z" level=warning msg="rmServiceBinding cleanupServiceBindings elk_elastic e1b4c088820e465c41103e1b57049bd88e32bc7e2702eeb43761a05a547ed875 aborted c.serviceBindings[skey] !ok"
time="2017-10-24T17:21:59.547558026Z" level=warning msg="rmServiceBinding cleanupServiceBindings elk_elasticsearch_proxy 4906935a181f249935dc12b2006ca4244a54718046b5e0dd6e9f7ec2a8272e6c aborted c.serviceBindings[skey] !ok"
time="2017-10-24T17:21:59.873991991Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:00.020519230Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:00.072853008Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:00.219684448Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:00.873688776Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:01.073308793Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:01.219581733Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:01.873861061Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:02.020534900Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:02.028125897Z" level=error msg="fatal task error" error="starting container failed: No such network: elk_elk" module=node/agent/taskmanager node.id=3dcuvcbrf9qh8f3mq2xddw9y4 service.id=qhyu2y7oxbww8tno955lpl4o7 task.id=x2yu7elxy8bzw4pxs8pgs7h8k
time="2017-10-24T17:22:02.072785779Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:02.219803418Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:02.873752547Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:03.020944785Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:03.073275964Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:03.219578103Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:03.873905432Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:04.020750571Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:04.072896049Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:04.219799188Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:04.782464955Z" level=warning msg="rmServiceBinding handleEpTableEvent elk_kibana 0ca3045a3078a48f46a03b4a1a398177a414fca04aaef2bd09a8635e050c0d24 aborted s.loadBalancers[nid] !ok"
time="2017-10-24T17:22:04.782701555Z" level=error msg="failed removing service binding for 269278f3cb173a2121550ae54607b64c67e1aa7e92de9310978c0b8a2212220e epRec:{elk_elastic.7gax1yova8zsg8fr2bcs2a71e.hk3dgs1f6h83x35y5v3wa5g0z elk_elastic qhyu2y7oxbww8tno955lpl4o7 <nil> 10.0.0.6 [] [elastic] [933e13c6afae]} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:22:04.782946654Z" level=error msg="failed removing service binding for 4906935a181f249935dc12b2006ca4244a54718046b5e0dd6e9f7ec2a8272e6c epRec:{elk_elasticsearch_proxy.1.ddrbwgfnams3wpj83m7oayd37 elk_elasticsearch_proxy ptcgzgacuepra4pp9q0mvj9ii 10.0.0.9 10.0.0.10 [] [elasticsearch_proxy] [363b92a5984d]} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:22:04.783212854Z" level=error msg="failed removing container name resolution for 6f7cf0a5678e085252344ad122e488bb6a15ae600a56b7c0107f930d686ca98e epRec:{elk_elk-endpoint   <nil> 10.0.0.3 [] [] []} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:22:04.783430754Z" level=error msg="failed removing container name resolution for 86c08f92fcfa3a194a6d5a04da2162ed30c7f32747fe66c718160ab525dfc006 epRec:{elk_elk-endpoint   <nil> 10.0.0.4 [] [] []} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:22:04.783655754Z" level=error msg="failed removing service binding for e1b4c088820e465c41103e1b57049bd88e32bc7e2702eeb43761a05a547ed875 epRec:{elk_elastic.l5cm81dnqu5taksonzv0xqj3o.jel6tai61dzok2dzl1q1uwwti elk_elastic qhyu2y7oxbww8tno955lpl4o7 <nil> 10.0.0.12 [] [elastic] [62fa17072482]} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:22:04.874161617Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:05.073032534Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:05.219733073Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:05.873552502Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:06.020468841Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:06.072732619Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:06.219727658Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:06.873733187Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:07.020633126Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:07.073006404Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:07.219352044Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:07.267742724Z" level=error msg="fatal task error" error="starting container failed: No such network: elk_elk" module=node/agent/taskmanager node.id=3dcuvcbrf9qh8f3mq2xddw9y4 service.id=qhyu2y7oxbww8tno955lpl4o7 task.id=6fr4zpr2cm0m6sbhus2dt2y6a
time="2017-10-24T17:22:07.873596447Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:08.020289030Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:08.073036405Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:08.219722217Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:08.873935565Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:09.020541277Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:09.072938152Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:09.219698165Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:09.873576309Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:10.020572022Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:10.072999597Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:10.121123566Z" level=error msg="failed removing service binding for 269278f3cb173a2121550ae54607b64c67e1aa7e92de9310978c0b8a2212220e epRec:{elk_elastic.7gax1yova8zsg8fr2bcs2a71e.hk3dgs1f6h83x35y5v3wa5g0z elk_elastic qhyu2y7oxbww8tno955lpl4o7 <nil> 10.0.0.6 [] [elastic] [933e13c6afae]} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:22:10.121390167Z" level=error msg="failed removing service binding for 4906935a181f249935dc12b2006ca4244a54718046b5e0dd6e9f7ec2a8272e6c epRec:{elk_elasticsearch_proxy.1.ddrbwgfnams3wpj83m7oayd37 elk_elasticsearch_proxy ptcgzgacuepra4pp9q0mvj9ii 10.0.0.9 10.0.0.10 [] [elasticsearch_proxy] [363b92a5984d]} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:22:10.121621567Z" level=error msg="failed removing container name resolution for 6f7cf0a5678e085252344ad122e488bb6a15ae600a56b7c0107f930d686ca98e epRec:{elk_elk-endpoint   <nil> 10.0.0.3 [] [] []} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:22:10.121827767Z" level=error msg="failed removing container name resolution for 86c08f92fcfa3a194a6d5a04da2162ed30c7f32747fe66c718160ab525dfc006 epRec:{elk_elk-endpoint   <nil> 10.0.0.4 [] [] []} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:22:10.122061568Z" level=error msg="failed removing service binding for e1b4c088820e465c41103e1b57049bd88e32bc7e2702eeb43761a05a547ed875 epRec:{elk_elastic.l5cm81dnqu5taksonzv0xqj3o.jel6tai61dzok2dzl1q1uwwti elk_elastic qhyu2y7oxbww8tno955lpl4o7 <nil> 10.0.0.12 [] [elastic] [62fa17072482]} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:22:10.220023109Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:10.874490752Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:11.020499263Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:11.072822238Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:11.219680849Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:11.877646295Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:12.020550901Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:12.072765176Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:12.219674387Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:12.497005685Z" level=error msg="fatal task error" error="starting container failed: No such network: elk_elk" module=node/agent/taskmanager node.id=3dcuvcbrf9qh8f3mq2xddw9y4 service.id=qhyu2y7oxbww8tno955lpl4o7 task.id=j2gqzsrud21to3n1iguozi2ph
time="2017-10-24T17:22:12.873633425Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:13.020507736Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:13.072907011Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:13.219887422Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:13.873733458Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:14.020456868Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:14.072875243Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:14.219750653Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:14.873637087Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:15.021050197Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:15.072921271Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:15.219610580Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:15.253289128Z" level=warning msg="rmServiceBinding cleanupServiceBindings elk_elastic 269278f3cb173a2121550ae54607b64c67e1aa7e92de9310978c0b8a2212220e aborted lb.backEnds[eid] !ok"
time="2017-10-24T17:22:15.253355029Z" level=warning msg="rmServiceBinding cleanupServiceBindings elk_elasticsearch_proxy 4906935a181f249935dc12b2006ca4244a54718046b5e0dd6e9f7ec2a8272e6c aborted c.serviceBindings[skey] !ok"
time="2017-10-24T17:22:15.253425729Z" level=warning msg="rmServiceBinding handleEpTableEvent elk_elastic e1b4c088820e465c41103e1b57049bd88e32bc7e2702eeb43761a05a547ed875 aborted s.loadBalancers[nid] !ok"
time="2017-10-24T17:22:15.873548513Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:16.020639122Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:16.072970697Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:16.219658205Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:16.873595935Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:17.020473844Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:17.072983919Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:17.219678427Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:17.740483766Z" level=error msg="fatal task error" error="starting container failed: No such network: elk_elk" module=node/agent/taskmanager node.id=3dcuvcbrf9qh8f3mq2xddw9y4 service.id=qhyu2y7oxbww8tno955lpl4o7 task.id=a4poyolrf8l4mivokha3wtc19
time="2017-10-24T17:22:17.873457355Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:18.072789738Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:18.219569645Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:18.873704072Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:19.020476980Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:19.072853154Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:19.219546861Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:19.873650285Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:20.020400692Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:20.072936966Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:20.219522973Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:20.587226591Z" level=warning msg="rmServiceBinding cleanupServiceBindings elk_elastic 269278f3cb173a2121550ae54607b64c67e1aa7e92de9310978c0b8a2212220e aborted lb.backEnds[eid] !ok"
time="2017-10-24T17:22:20.587300191Z" level=warning msg="rmServiceBinding cleanupServiceBindings elk_elastic e1b4c088820e465c41103e1b57049bd88e32bc7e2702eeb43761a05a547ed875 aborted c.serviceBindings[skey] !ok"
time="2017-10-24T17:22:20.587328391Z" level=warning msg="rmServiceBinding cleanupServiceBindings elk_elasticsearch_proxy 4906935a181f249935dc12b2006ca4244a54718046b5e0dd6e9f7ec2a8272e6c aborted c.serviceBindings[skey] !ok"
time="2017-10-24T17:22:20.873734995Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:21.020750302Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:21.072724475Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:21.137011066Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:21.873799802Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:22.020726109Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:22.072856882Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:22.219719988Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:22.874015406Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:22.975716949Z" level=error msg="fatal task error" error="starting container failed: No such network: elk_elk" module=node/agent/taskmanager node.id=3dcuvcbrf9qh8f3mq2xddw9y4 service.id=qhyu2y7oxbww8tno955lpl4o7 task.id=ql7u7vouevcoc6vgkxnowtt0t
time="2017-10-24T17:22:23.020794012Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:23.072811685Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:23.219714191Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:23.873782306Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:24.020758312Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:24.072926785Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:24.219567090Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:24.873622704Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:25.072994982Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:25.219549386Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:25.824534730Z" level=warning msg="rmServiceBinding cleanupServiceBindings elk_elastic 269278f3cb173a2121550ae54607b64c67e1aa7e92de9310978c0b8a2212220e aborted lb.backEnds[eid] !ok"
time="2017-10-24T17:22:25.825001030Z" level=warning msg="rmServiceBinding cleanupServiceBindings elk_elasticsearch_proxy 4906935a181f249935dc12b2006ca4244a54718046b5e0dd6e9f7ec2a8272e6c aborted c.serviceBindings[skey] !ok"
time="2017-10-24T17:22:25.825210531Z" level=warning msg="rmServiceBinding cleanupServiceBindings elk_kibana 0ca3045a3078a48f46a03b4a1a398177a414fca04aaef2bd09a8635e050c0d24 aborted c.serviceBindings[skey] !ok"
time="2017-10-24T17:22:25.825026830Z" level=warning msg="rmServiceBinding handleEpTableEvent elk_elastic e1b4c088820e465c41103e1b57049bd88e32bc7e2702eeb43761a05a547ed875 aborted s.loadBalancers[nid] !ok"
time="2017-10-24T17:22:25.873544298Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:26.023422007Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:26.073022676Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:26.219777880Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:26.873819990Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:27.020554994Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:27.072789366Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:27.219910770Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:27.873927978Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:28.020912582Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:28.072743853Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:28.208344841Z" level=error msg="fatal task error" error="starting container failed: No such network: elk_elk" module=node/agent/taskmanager node.id=3dcuvcbrf9qh8f3mq2xddw9y4 service.id=qhyu2y7oxbww8tno955lpl4o7 task.id=w6hp2cte0up4wu1jtzt61j4ec
time="2017-10-24T17:22:28.219838257Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:28.873717862Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:29.020589766Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:29.073104338Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:29.219635441Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:29.873774545Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:30.021048148Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:30.072865119Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:30.219873522Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:30.873661923Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:31.020610526Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:31.057319976Z" level=warning msg="rmServiceBinding cleanupServiceBindings elk_kibana 0ca3045a3078a48f46a03b4a1a398177a414fca04aaef2bd09a8635e050c0d24 aborted s.loadBalancers[nid] !ok"
time="2017-10-24T17:22:31.057520977Z" level=warning msg="rmServiceBinding cleanupServiceBindings elk_elastic 269278f3cb173a2121550ae54607b64c67e1aa7e92de9310978c0b8a2212220e aborted c.serviceBindings[skey] !ok"
time="2017-10-24T17:22:31.057678977Z" level=warning msg="rmServiceBinding cleanupServiceBindings elk_elastic e1b4c088820e465c41103e1b57049bd88e32bc7e2702eeb43761a05a547ed875 aborted c.serviceBindings[skey] !ok"
time="2017-10-24T17:22:31.057911877Z" level=warning msg="rmServiceBinding cleanupServiceBindings elk_elasticsearch_proxy 4906935a181f249935dc12b2006ca4244a54718046b5e0dd6e9f7ec2a8272e6c aborted c.serviceBindings[skey] !ok"
time="2017-10-24T17:22:31.073069598Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:31.219788600Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:31.873673799Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:32.020419801Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:32.072802473Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:32.219804574Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:32.874043972Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:33.020887873Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:33.072829145Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:33.219819946Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:33.448109858Z" level=error msg="fatal task error" error="starting container failed: No such network: elk_elk" module=node/agent/taskmanager node.id=3dcuvcbrf9qh8f3mq2xddw9y4 service.id=qhyu2y7oxbww8tno955lpl4o7 task.id=zlv55ue5p4lb1m2am8pmzyzef
time="2017-10-24T17:22:33.873609441Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:34.073025414Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:34.219846114Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:34.873996608Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:35.020502608Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:35.072855380Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:35.220059480Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:35.873692171Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:36.020662772Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:36.073035343Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:36.195538309Z" level=warning msg="rmServiceBinding cleanupServiceBindings elk_kibana 0ca3045a3078a48f46a03b4a1a398177a414fca04aaef2bd09a8635e050c0d24 aborted s.loadBalancers[nid] !ok"
time="2017-10-24T17:22:36.195770810Z" level=warning msg="rmServiceBinding cleanupServiceBindings elk_elastic 269278f3cb173a2121550ae54607b64c67e1aa7e92de9310978c0b8a2212220e aborted lb.backEnds[eid] !ok"
time="2017-10-24T17:22:36.195567809Z" level=warning msg="rmServiceBinding handleEpTableEvent elk_elasticsearch_proxy 4906935a181f249935dc12b2006ca4244a54718046b5e0dd6e9f7ec2a8272e6c aborted c.serviceBindings[skey] !ok"
time="2017-10-24T17:22:36.196240510Z" level=error msg="failed removing container name resolution for 6f7cf0a5678e085252344ad122e488bb6a15ae600a56b7c0107f930d686ca98e epRec:{elk_elk-endpoint   <nil> 10.0.0.3 [] [] []} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:22:36.196453611Z" level=error msg="failed removing container name resolution for 86c08f92fcfa3a194a6d5a04da2162ed30c7f32747fe66c718160ab525dfc006 epRec:{elk_elk-endpoint   <nil> 10.0.0.4 [] [] []} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:22:36.196662811Z" level=error msg="failed removing service binding for e1b4c088820e465c41103e1b57049bd88e32bc7e2702eeb43761a05a547ed875 epRec:{elk_elastic.l5cm81dnqu5taksonzv0xqj3o.jel6tai61dzok2dzl1q1uwwti elk_elastic qhyu2y7oxbww8tno955lpl4o7 <nil> 10.0.0.12 [] [elastic] [62fa17072482]} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:22:36.219626442Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:36.873615232Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:37.020916432Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:37.072849102Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:37.219872902Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:37.873919989Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:38.072939559Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:38.219788558Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:38.683807186Z" level=error msg="fatal task error" error="starting container failed: No such network: elk_elk" module=node/agent/taskmanager node.id=3dcuvcbrf9qh8f3mq2xddw9y4 service.id=qhyu2y7oxbww8tno955lpl4o7 task.id=00fy5vs560mtqnrnjseibr71g
time="2017-10-24T17:22:38.873768743Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:39.072881812Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:39.219910011Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:39.873675694Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:40.020925393Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:40.073101463Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:40.219804061Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:40.873847442Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:41.072930710Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:41.219741008Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:41.432509994Z" level=error msg="failed removing service binding for 4906935a181f249935dc12b2006ca4244a54718046b5e0dd6e9f7ec2a8272e6c epRec:{elk_elasticsearch_proxy.1.ddrbwgfnams3wpj83m7oayd37 elk_elasticsearch_proxy ptcgzgacuepra4pp9q0mvj9ii 10.0.0.9 10.0.0.10 [] [elasticsearch_proxy] [363b92a5984d]} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:22:41.432808094Z" level=error msg="failed removing container name resolution for 6f7cf0a5678e085252344ad122e488bb6a15ae600a56b7c0107f930d686ca98e epRec:{elk_elk-endpoint   <nil> 10.0.0.3 [] [] []} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:22:41.433022294Z" level=error msg="failed removing container name resolution for 86c08f92fcfa3a194a6d5a04da2162ed30c7f32747fe66c718160ab525dfc006 epRec:{elk_elk-endpoint   <nil> 10.0.0.4 [] [] []} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:22:41.433217995Z" level=error msg="failed removing service binding for e1b4c088820e465c41103e1b57049bd88e32bc7e2702eeb43761a05a547ed875 epRec:{elk_elastic.l5cm81dnqu5taksonzv0xqj3o.jel6tai61dzok2dzl1q1uwwti elk_elastic qhyu2y7oxbww8tno955lpl4o7 <nil> 10.0.0.12 [] [elastic] [62fa17072482]} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:22:41.873800687Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:42.020606684Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:42.072842055Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:42.219724952Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:42.873954129Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:43.020855626Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:43.072933896Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:43.219755493Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:43.873831968Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:43.920078130Z" level=error msg="fatal task error" error="starting container failed: No such network: elk_elk" module=node/agent/taskmanager node.id=3dcuvcbrf9qh8f3mq2xddw9y4 service.id=qhyu2y7oxbww8tno955lpl4o7 task.id=10jwg6c3lhbxiasvushsxrw16
time="2017-10-24T17:22:44.020818765Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:44.072889435Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:44.219727631Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:44.873971005Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:45.021097601Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:45.072956970Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:45.219798966Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:45.874899539Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:46.073505004Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:46.219800098Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:46.774850736Z" level=warning msg="rmServiceBinding handleEpTableEvent elk_kibana 0ca3045a3078a48f46a03b4a1a398177a414fca04aaef2bd09a8635e050c0d24 aborted c.serviceBindings[skey] !ok"
time="2017-10-24T17:22:46.774910736Z" level=error msg="failed removing service binding for 269278f3cb173a2121550ae54607b64c67e1aa7e92de9310978c0b8a2212220e epRec:{elk_elastic.7gax1yova8zsg8fr2bcs2a71e.hk3dgs1f6h83x35y5v3wa5g0z elk_elastic qhyu2y7oxbww8tno955lpl4o7 <nil> 10.0.0.6 [] [elastic] [933e13c6afae]} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:22:46.774942336Z" level=error msg="failed removing service binding for 4906935a181f249935dc12b2006ca4244a54718046b5e0dd6e9f7ec2a8272e6c epRec:{elk_elasticsearch_proxy.1.ddrbwgfnams3wpj83m7oayd37 elk_elasticsearch_proxy ptcgzgacuepra4pp9q0mvj9ii 10.0.0.9 10.0.0.10 [] [elasticsearch_proxy] [363b92a5984d]} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:22:46.774967736Z" level=error msg="failed removing container name resolution for 6f7cf0a5678e085252344ad122e488bb6a15ae600a56b7c0107f930d686ca98e epRec:{elk_elk-endpoint   <nil> 10.0.0.3 [] [] []} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:22:46.774986136Z" level=error msg="failed removing container name resolution for 86c08f92fcfa3a194a6d5a04da2162ed30c7f32747fe66c718160ab525dfc006 epRec:{elk_elk-endpoint   <nil> 10.0.0.4 [] [] []} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:22:46.775003936Z" level=error msg="failed removing service binding for e1b4c088820e465c41103e1b57049bd88e32bc7e2702eeb43761a05a547ed875 epRec:{elk_elastic.l5cm81dnqu5taksonzv0xqj3o.jel6tai61dzok2dzl1q1uwwti elk_elastic qhyu2y7oxbww8tno955lpl4o7 <nil> 10.0.0.12 [] [elastic] [62fa17072482]} err:network e671312jr3o0s634vnv40zlw5 not found"
time="2017-10-24T17:22:46.873562467Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:47.021484064Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:47.072900632Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:47.219809227Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:47.873684995Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:48.020561990Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:48.072780859Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:48.219749253Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:48.873795319Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:49.020462013Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:49.072854883Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:49.160185998Z" level=error msg="fatal task error" error="starting container failed: No such network: elk_elk" module=node/agent/taskmanager node.id=3dcuvcbrf9qh8f3mq2xddw9y4 service.id=qhyu2y7oxbww8tno955lpl4o7 task.id=xh6b82ora9pz3y3wcnyz9tsrr
time="2017-10-24T17:22:49.219821077Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:49.873737940Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:50.020694134Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:50.072950503Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:50.219897897Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:50.873628558Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:51.020474352Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:51.072935021Z" level=info msg="Node join event for 122803338add/10.8.23.4"
time="2017-10-24T17:22:51.219865614Z" level=info msg="Node join event for d95cc511a502/10.8.23.5"
time="2017-10-24T17:22:51.873780374Z" level=info msg="Node join event for 122803338add/10.8.23.4"
...

About this issue

  • State: open
  • Created 7 years ago
  • Comments: 18 (6 by maintainers)

Most upvoted comments

This just happened to me on Docker CE 18.06.1 in a fresh multi-node Swarm on DigitalOcean Ubuntu 18.04. I’m going to guess at what happened.

“cannot create a swarm scoped network when swarm is not active”

I added a new manager node to an existing day-old 3-node swarm (all managers, all Linux 18.06.1-ce). I have a stack that runs a service in global mode, but the service is set to never restart (restart_policy set to none).

So even though I don’t need it in my case, a default stack file creates a default overlay network for that stack. I’m guessing the global service task of that stack tried to start up on new nodes and create the overlay network before swarm was ready for new overlay networks (a guess based on logs below), and since the service is set to not restart, it never tried a 2nd time.

I had to run docker service update --force to fix it, which recreates the tasks so that they try again.
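
As a rough sketch of that workaround, the helper below wraps the one command involved. The function name and the DOCKER override are illustrative assumptions, not part of Docker; the override only exists so the call can be dry-run without a daemon.

```shell
# Hypothetical helper: force every task of a service to be recreated.
# Set DOCKER=echo to print the command instead of running it.
force_recreate() {
  "${DOCKER:-docker}" service update --force "$1"
}
```

It would be called with the full service name as shown by docker service ls, for example force_recreate mystack_myservice (a made-up name).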

It was easily reproducible for me by following the above general steps:

  1. create a swarm (in my case 3 manager nodes)
  2. deploy a global service, set to restart_policy of none
  3. add a new node as a manager
  4. the new task created for that new node will fail to schedule
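
The steps above can be sketched roughly as follows. The file name, stack name (repro) and service name (probe) are hypothetical, and nginx:alpine is only a stand-in image; the docker commands are left as comments because they need a real multi-node swarm.

```shell
# Write a minimal stack file: one global service that is never restarted.
cat > repro-stack.yml <<'EOF'
version: '3.3'
services:
  probe:
    image: nginx:alpine
    deploy:
      mode: global
      restart_policy:
        condition: none
EOF

# On an existing manager:
#   docker stack deploy -c repro-stack.yml repro
#   docker swarm join-token manager    # prints the join command for the new node
# On the new node, run the printed "docker swarm join ..." command.
# Back on a manager, inspect the task scheduled for the new node:
#   docker service ps --no-trunc repro_probe
```

Because no network is declared, the stack gets its default overlay network, which matches the situation described above.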

The key lines from the longer journald dockerd output appear to be these: it fails to create the swarm-scoped network, and only afterwards joins the gossip network:

Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.531039652Z" level=error msg="fatal task error" error="cannot create a swarm scoped network when swarm is not active" module=node/agent/taskmanager node.id=orpofgsk59hpeyoljqbl9p8bd service.id=y082ttfyyu5esxclonysbhgel task.id=z4g2collhprzr8392q15tcete
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.849137575Z" level=info msg="Initializing Libnetwork Agent Listen-Addr=0.0.0.0 Local-addr=10.132.61.121 Adv-addr=10.132.61.121 Data-addr= Remote-addr-list=[10.132.55.197 10.132.71.126 10.132.45.202] MTU=1500"
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.851130619Z" level=info msg="New memberlist node - Node:dvc4 will use memberlist nodeID:f4e8a467ca12 with config:&{NodeID:f4e8a467ca12 Hostname:dvc4 BindAddr:0.0.0.0 AdvertiseAddr:10.132.61.121 BindPort:0 Keys:[[208 176 165 126 184 97 224 213 155 32 117 31 110 122 228 79] [162 224 29 181 47 89 217 143 203 190 88 11 19 82 195 147] [60 178 199 120 130 158 129 44 154 152 42 133 225 234 187 179]] PacketBufferSize:1400 reapEntryInterval:1800000000000 reapNetworkInterval:1825000000000 StatsPrintPeriod:5m0s HealthPrintPeriod:1m0s}"
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.852856157Z" level=info msg="Node f4e8a467ca12/10.132.61.121, joined gossip cluster"
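
To pull the same ordering out of a longer log, a small filter along these lines may help. The function name is an assumption, and the patterns are simply the three messages quoted above.

```shell
# Reads dockerd log lines on stdin and keeps only the task failure and the
# libnetwork/gossip start-up messages, trimmed to "<timestamp> <fields>".
swarm_join_timeline() {
  grep -E 'fatal task error|Initializing Libnetwork Agent|joined gossip cluster' \
    | sed -E 's/^.*time="([^"]+)" level=[a-z]+ (.*)$/\1 \2/'
}
```

A typical invocation would be journalctl -u docker.service | swarm_join_timeline, assuming the daemon runs as the docker.service systemd unit.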

Longer output of the whole second:

Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.261730553Z" level=info msg="parsed scheme: \"\"" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.261792864Z" level=info msg="scheme \"\" not registered, fallback to default scheme" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.262104791Z" level=info msg="ccResolverWrapper: sending new addresses to cc: [{10.132.45.202:2377 0  <nil>}]" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.262138269Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.262290154Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc4203ef6d0, CONNECTING" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.293336474Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc4203ef6d0, READY" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.297462042Z" level=info msg="parsed scheme: \"\"" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.297792622Z" level=info msg="scheme \"\" not registered, fallback to default scheme" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.298135243Z" level=info msg="ccResolverWrapper: sending new addresses to cc: [{10.132.45.202:2377 0  <nil>}]" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.298406304Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.298734586Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420358c20, CONNECTING" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.298824250Z" level=info msg="blockingPicker: the picked transport is not ready, loop back to repick" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.301928015Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420358c20, READY" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.323711784Z" level=info msg="parsed scheme: \"\"" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.323729478Z" level=info msg="scheme \"\" not registered, fallback to default scheme" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.323831844Z" level=info msg="ccResolverWrapper: sending new addresses to cc: [{10.132.45.202:2377 0  <nil>}]" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.323848248Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.323878425Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420a3e310, CONNECTING" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.327909780Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420a3e310, READY" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.338779856Z" level=info msg="parsed scheme: \"\"" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.338990390Z" level=info msg="scheme \"\" not registered, fallback to default scheme" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.339219421Z" level=info msg="ccResolverWrapper: sending new addresses to cc: [{/var/run/docker/swarm/control.sock 0  <nil>}]" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.339432911Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.339639460Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420ad43e0, CONNECTING" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.343037931Z" level=info msg="parsed scheme: \"\"" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.343229125Z" level=info msg="scheme \"\" not registered, fallback to default scheme" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.343625726Z" level=info msg="ccResolverWrapper: sending new addresses to cc: [{10.132.45.202:2377 0  <nil>}]" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.344002590Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.343912921Z" level=info msg="Listening for local connections" addr=/var/run/docker/swarm/control.sock module=node node.id=orpofgsk59hpeyoljqbl9p8bd proto=unix
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.343964776Z" level=info msg="Listening for connections" addr="[::]:2377" module=node node.id=orpofgsk59hpeyoljqbl9p8bd proto=tcp
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.345715317Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420ad43e0, READY" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.346592748Z" level=info msg="blockingPicker: the picked transport is not ready, loop back to repick" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.347073233Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420f54f80, CONNECTING" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.347920345Z" level=info msg="parsed scheme: \"\"" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.347939807Z" level=info msg="scheme \"\" not registered, fallback to default scheme" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.348193569Z" level=info msg="manager selected by agent for new session: { 10.132.45.202:2377}" module=node/agent node.id=orpofgsk59hpeyoljqbl9p8bd
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.348302196Z" level=info msg="ccResolverWrapper: sending new addresses to cc: [{10.132.45.202:2377 0  <nil>}]" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.348319447Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.348445188Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420f55150, CONNECTING" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.348729411Z" level=info msg="waiting 0s before registering session" module=node/agent node.id=orpofgsk59hpeyoljqbl9p8bd
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.351350474Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420f54f80, READY" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.352958068Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420f55150, READY" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.367197662Z" level=info msg="3e3e4f62bf2b8773 became follower at term 0" module=raft node.id=orpofgsk59hpeyoljqbl9p8bd
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.367243789Z" level=info msg="newRaft 3e3e4f62bf2b8773 [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]" module=raft node.id=orpofgsk59hpeyoljqbl9p8bd
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.367259044Z" level=info msg="3e3e4f62bf2b8773 became follower at term 1" module=raft node.id=orpofgsk59hpeyoljqbl9p8bd
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.367677532Z" level=info msg="parsed scheme: \"\"" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.368627379Z" level=info msg="scheme \"\" not registered, fallback to default scheme" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.369018019Z" level=info msg="ccResolverWrapper: sending new addresses to cc: [{10.132.45.202:2377 0  <nil>}]" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.369266273Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.369484812Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420f54f30, CONNECTING" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.371009395Z" level=info msg="parsed scheme: \"\"" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.371293907Z" level=info msg="scheme \"\" not registered, fallback to default scheme" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.372105343Z" level=info msg="parsed scheme: \"\"" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.372308000Z" level=info msg="scheme \"\" not registered, fallback to default scheme" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.373248200Z" level=info msg="ccResolverWrapper: sending new addresses to cc: [{10.132.55.197:2377 0  <nil>}]" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.373446019Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.373660944Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420fb5680, CONNECTING" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.374005684Z" level=info msg="ccResolverWrapper: sending new addresses to cc: [{10.132.71.126:2377 0  <nil>}]" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.374186155Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.374590508Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420fb5900, CONNECTING" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.378095709Z" level=info msg="3e3e4f62bf2b8773 [term: 1] received a MsgApp message with higher term from 4ab34fb5ffa39b16 [term: 2]" module=raft node.id=orpofgsk59hpeyoljqbl9p8bd
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.378375018Z" level=info msg="3e3e4f62bf2b8773 became follower at term 2" module=raft node.id=orpofgsk59hpeyoljqbl9p8bd
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.381567526Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420f54f30, READY" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.381713163Z" level=info msg="raft.node: 3e3e4f62bf2b8773 elected leader 4ab34fb5ffa39b16 at term 2" module=raft node.id=orpofgsk59hpeyoljqbl9p8bd
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.382431616Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420fb5900, READY" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.382996648Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420fb5680, READY" module=grpc
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.531039652Z" level=error msg="fatal task error" error="cannot create a swarm scoped network when swarm is not active" module=node/agent/taskmanager node.id=orpofgsk59hpeyoljqbl9p8bd service.id=y082ttfyyu5esxclonysbhgel task.id=z4g2collhprzr8392q15tcete
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.849137575Z" level=info msg="Initializing Libnetwork Agent Listen-Addr=0.0.0.0 Local-addr=10.132.61.121 Adv-addr=10.132.61.121 Data-addr= Remote-addr-list=[10.132.55.197 10.132.71.126 10.132.45.202] MTU=1500"
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.851130619Z" level=info msg="New memberlist node - Node:dvc4 will use memberlist nodeID:f4e8a467ca12 with config:&{NodeID:f4e8a467ca12 Hostname:dvc4 BindAddr:0.0.0.0 AdvertiseAddr:10.132.61.121 BindPort:0 Keys:[[208 176 165 126 184 97 224 213 155 32 117 31 110 122 228 79] [162 224 29 181 47 89 217 143 203 190 88 11 19 82 195 147] [60 178 199 120 130 158 129 44 154 152 42 133 225 234 187 179]] PacketBufferSize:1400 reapEntryInterval:1800000000000 reapNetworkInterval:1825000000000 StatsPrintPeriod:5m0s HealthPrintPeriod:1m0s}"
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.852856157Z" level=info msg="Node f4e8a467ca12/10.132.61.121, joined gossip cluster"
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.852955765Z" level=info msg="Node f4e8a467ca12/10.132.61.121, added to nodes list"
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.853556977Z" level=info msg="The new bootstrap node list is:[10.132.55.197 10.132.71.126 10.132.45.202]"
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.855772257Z" level=info msg="Node 5681999f3e6e/10.132.45.202, joined gossip cluster"
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.855828319Z" level=info msg="Node 5681999f3e6e/10.132.45.202, added to nodes list"
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.855848330Z" level=info msg="Node a2badeeb08d1/10.132.71.126, joined gossip cluster"
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.855956949Z" level=info msg="Node a2badeeb08d1/10.132.71.126, added to nodes list"
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.855984167Z" level=info msg="Node 84aa4809ce1c/10.132.55.197, joined gossip cluster"
Sep 06 06:53:38 dvc4 dockerd[4991]: time="2018-09-06T06:53:38.856013501Z" level=info msg="Node 84aa4809ce1c/10.132.55.197, added to nodes list"
Sep 06 06:53:39 dvc4 dockerd[4991]: time="2018-09-06T06:53:39.372302607Z" level=error msg="error reading the kernel parameter net.ipv4.vs.expire_nodest_conn" error="open /proc/sys/net/ipv4/vs/expire_nodest_conn: no such file or directory"
Sep 06 06:58:38 dvc4 dockerd[4991]: time="2018-09-06T06:58:38.853275077Z" level=info msg="NetworkDB stats dvc4(f4e8a467ca12) - netID:nnrff1zyrtq0sgm3qfnslgl7y leaving:false netPeers:4 entries:8 Queue qLen:0 netMsg/s:0"

@abhi

@saada is this consistently reproducible?

Yes. We even tried deleting the stack and recreating it multiple times. It still occurs. What’s weird is that it doesn’t happen right away. It happens over time, usually between 3 and 24 hours after deployment. It’s as though the daemon deletes the network after initially creating it. Once the issue occurs, the daemon crashes, and restarting it fails instantly because by then the docker network has already been deleted. So the mystery is: why would Docker Swarm delete the network from the worker node? The network still exists on the manager and on the healthy worker, but not on the faulty worker. Also, it’s not specific to one machine. Sometimes the network gets deleted from worker1 and sometimes from worker2, with no consistent pattern.
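
One way to compare nodes is to run a check like the one below on the manager and on each worker. This is only a sketch: the function name and the DOCKER override are assumptions, and a worker normally lists a swarm-scoped overlay network only while it has a task attached to that network, so "missing" alone is not proof of the bug.

```shell
# Prints "present" or "missing" for a network name on the local node.
# DOCKER can be overridden so the function can be exercised without a daemon.
network_status() {
  if [ -n "$("${DOCKER:-docker}" network ls --quiet --filter "name=$1")" ]; then
    echo present
  else
    echo missing
  fi
}
```

For the stack in this report the call would be network_status elk_elk.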

Thanks for looking into this.

@thaJeztah that issue looks related somehow in that both issues are talking about global services.

Just to confirm: the scenario @BretFisher described above still applies to 19.03.5 CE.