moby: `docker service logs` stops showing logs from containers on different nodes
Description
Running `docker service logs foo` on a swarm master, where `foo` is a service with multiple replicas across different nodes, eventually stops merging the logs from those other nodes. It seems to always work just fine right after the service is created.
Steps to reproduce the issue:
- Create a service `foo` with replicas across multiple nodes
- Run `docker service logs --follow foo`
- Initially observe logs from multiple containers across different nodes
- Go away and do something else for a while
- Run `docker service logs --follow foo` again
- Observe old logs from multiple containers across different nodes, but new logs only from containers on the node on which you're running the command
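A rough sketch of those steps as commands (the image, replica count, and echo loop are placeholders for illustration only):

```sh
# create a service with replicas spread across multiple nodes
docker service create --name foo --replicas 4 \
  alpine sh -c 'while true; do echo "hello from $(hostname)"; sleep 5; done'

# initially this shows interleaved lines from tasks on all nodes
docker service logs --follow foo

# come back later and run it again; new lines now only come
# from the tasks running on the local node
docker service logs --follow foo
```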
Describe the results you received: Logs from containers on the current node only
Describe the results you expected: Logs from all containers on all nodes
Additional information you deem important (e.g. issue happens only occasionally):
Seems to work fine at first, but after some amount of time it stops working. I've tried with both the `json-file` and `journald` log drivers.
Output of `docker version`:
Client:
Version: 17.05.0-ce
API version: 1.29
Go version: go1.7.5
Git commit: 89658be
Built: Thu May 4 22:10:54 2017
OS/Arch: linux/amd64
Server:
Version: 17.05.0-ce
API version: 1.29 (minimum version 1.12)
Go version: go1.7.5
Git commit: 89658be
Built: Thu May 4 22:10:54 2017
OS/Arch: linux/amd64
Experimental: true
Output of `docker info`:
Containers: 7
Running: 7
Paused: 0
Stopped: 0
Images: 6
Server Version: 17.05.0-ce
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 57
Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Swarm: active
NodeID: jbsbgj3on5coa7f996rle8bpk
Is Manager: true
ClusterID: 7uzbzxfjt8nf6p18wbzv8ek84
Managers: 1
Nodes: 2
Orchestration:
Task History Retention Limit: 5
Raft:
Snapshot Interval: 10000
Number of Old Snapshots to Retain: 0
Heartbeat Tick: 1
Election Tick: 3
Dispatcher:
Heartbeat Period: 5 seconds
CA Configuration:
Expiry Duration: 3 months
Node Address: 172.16.0.5
Manager Addresses:
172.16.0.5:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9048e5e50717ea4497b757314bad98ea3763c145
runc version: 9c2d8d184e5da67c95d601382adf14862e4f2228
init version: 949e6fa
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 4.4.0-75-generic
Operating System: Ubuntu 16.04.2 LTS
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 1.636GiB
Name: swarmm-master-94917428-0
ID: NNL7:YHDL:5ALU:4ZXF:J3BL:VAIV:UI2T:TV5U:UGQL:UCQC:WWCP:TQDO
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
File Descriptors: 149
Goroutines: 310
System Time: 2017-05-12T22:02:26.629917059Z
EventsListeners: 7
Registry: https://index.docker.io/v1/
Experimental: true
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
WARNING: No swap limit support
Additional environment details (AWS, VirtualBox, physical, etc.): Running on Azure using an acs-engine template (https://github.com/Azure/acs-engine). Currently just testing this, so I'm using one manager and one worker node. The replicas for my service get split over both nodes.
About this issue
- State: open
- Created 7 years ago
- Reactions: 17
- Comments: 67 (6 by maintainers)
We ultimately decided to switch to Kubernetes. We're a small team and really wanted to avoid the complexity of k8s, but honestly, it's been quite a pleasure to work with. I wish everyone luck that this gets addressed … really frustrating to have this issue sit for so long.
We experience the same problem (Docker version 17.12.1-ce, build 7390fc6). But if we use the service ID instead of the service name, the logs go through:
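For reference, a minimal sketch of that workaround (the lookup step is an assumption; the comment only described using the ID):

```sh
# look up the service ID
docker service ls

# use the ID instead of the name
docker service logs -f <service-id>
```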
We're experiencing this as well. Not only are we missing logs from `docker service logs`, but they're also not being sent to our log aggregator. It doesn't happen all the time and I'm not sure how to reproduce it, but we see this often enough.

When I run `docker service logs -f foo_bar` it spits out the log only up until some days ago. But this one works for me: `docker service logs -f --since 24h foo_bar`, and it is tailing along. It may help you until the bug is fixed. I am running 18.06.1-ce, build e68fc7a on Ubuntu 16.04. (For me, using the ID of the active service container didn't help; it stopped at some point yesterday.)

Somehow this still seems to be an issue.
Adding the `--since` flag fixes it, e.g.:

`docker service logs -tf <service> --since 24h`
Try `docker logs -f container-id`.

I am no Docker expert by any means, so don't ask me anything about it. But it is a very annoying problem and I want to share what I do when it happens. I sometimes use this command, which does work, but it's annoying because on a cluster (3 nodes in my case) I have to tail three processes, since the service is load balanced. But it does work.
For me, to make `docker service logs` work on my swarm cluster (3 nodes), I have to demote the Leader, then restart the docker service on that node, wait till it is ready and reachable, and then promote it to manager again. So the commands are along the lines of the sketch below.

I also run a command to make sure all 3 nodes run the service.

Perhaps it helps someone 👍
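The original commands were not captured here; a rough sketch of the demote/restart/promote cycle described above (the node name `node-1` and the use of systemd are assumptions):

```sh
# from another manager: demote the current Leader (check `docker node ls` first)
docker node demote node-1

# on node-1 itself: restart the engine and wait until it is Ready/Reachable again
sudo systemctl restart docker

# from a manager: confirm node-1 is back, then promote it again
docker node ls
docker node promote node-1
```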
This works for me on 18.09.1, with a caveat, which could potentially be a clue to what the issue is. If this is my current state (just a snippet; there are multiple instances of the service on multiple machines): note the currently running service has only been up for 20 minutes.

If I do `docker service logs -f --since 24h tv_web`, I get stale logs from the old containers. Same even if I do `docker service logs -f --since 1h tv_web`. But if I make sure the "since" doesn't go farther back than the start time of the running container, so in this case say 10m, `docker service logs -f --since 10m tv_web`, then all will be well, and the current logs will tail.
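A small sketch of how to pick that window (the service name `tv_web` comes from the comment above; the filter is just one way to see the current task's uptime):

```sh
# check how long the currently running task has been up
docker service ps tv_web --filter desired-state=running

# then follow with a --since window shorter than that uptime
docker service logs -f --since 10m tv_web
```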
Same here
I experience the same on a standalone swarm using Docker 18.06.1-ce. I have one container running and one that was terminated. When running `sudo docker service logs -f local-apt_go-apt-mirror`, I got the log of the terminated container.

Same here: Docker version 19.03.11, build 42e35e61f3.
Sadly this didn't solve the issue. Logs are not showing with `docker service logs ...`. I did disable firewall rules just in case, but no, no logs output 😦

EDIT: demoting/promoting worked immediately: https://github.com/moby/moby/issues/35932#issuecomment-517299052
I’m getting the same issue here.
Docker version 18.02.0-ce, build fc4de44
Hi, we have the same problem. When a service gets moved, doing `docker service logs` on the machine that ran the previous task shows the old logs of that task instead of the running one. However, using the task ID with `docker service logs` shows the right logs.
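A minimal sketch of that approach (the service name `foo` is a placeholder):

```sh
# list the service's tasks and their IDs
docker service ps foo

# follow the logs of one specific task by its ID
docker service logs -f <task-id>
```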
Indeed, I tried demoting/promoting and the logs "came back": https://github.com/moby/moby/issues/35932#issuecomment-517299052
The issue happened after (re-)deploying a stack/service multiple times (10+)
Same here, Docker 19.03.5 on CentOS 7.
I have to connect directly to worker nodes to get logs from the container. Using Portainer for that now. Pity me.
Guys, too early to be sure, but I think explicitly setting the logger config made it work for me.
logging:
  driver: "json-file"
  options:
    max-file: 1
    max-size: 20m
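If you want to try the same idea on an already-running service without redeploying the stack, a rough equivalent might be the following (the service name `foo` is a placeholder, and whether it actually helps is as uncertain as the comment above suggests):

```sh
docker service update \
  --log-driver json-file \
  --log-opt max-file=1 \
  --log-opt max-size=20m \
  foo
```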
So if you need to see logs in an emergency, you can use the classic `docker logs -f container-id`. Obviously you need a session on the Docker node where the container runs. You find the node with the following command: `docker stack ps stackname`
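End to end, that flow looks roughly like this (the stack name, node address, and name filter are placeholders):

```sh
# from a manager: see which node each task runs on
docker stack ps mystack

# SSH to that node, find the local container, and tail it
ssh user@worker-node
docker ps --filter name=mystack_web
docker logs -f <container-id>
```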
Same problem, but I use the SUSE package 😢
`docker service logs` stops completely on:

~$ docker version
Client:
Version: 18.09.3
API version: 1.39
Go version: go1.10.8
Git commit: 774a1f4
Built: Thu Feb 28 06:40:58 2019
OS/Arch: linux/amd64
Experimental: false

Server: Docker Engine - Community
Engine:
Version: 18.09.3
API version: 1.39 (minimum version 1.12)
Go version: go1.10.8
Git commit: 774a1f4
Built: Thu Feb 28 05:59:55 2019
OS/Arch: linux/amd64
Experimental: false
@wayne-o We had to remove the nodes one by one as swarm managers and re-elect them as managers again. Once all nodes had been through this, it started working again. Have you tried that as well? 😃
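A rough sketch of that rotation, one manager at a time (node names are placeholders; keep enough managers for quorum while doing this):

```sh
# repeat for each manager node, one at a time
docker node demote manager-2
docker node promote manager-2
docker node ls   # wait until the node shows Reachable again before moving on
```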
This is affecting us as well, on 18.06.1-ce, build e68fc7a. I can tail the logs using the ID, which technically is good enough for us to limp along with.

@dperny Yes, I will try to grab some logs today. Thanks for looking into this.