portainer: New nodes are unable to pull images from registry with authentication
Bug description
New nodes are unable to pull images from a remote registry that requires authentication.
I have a private registry in AWS ECR. The nodes run outside of AWS, so they need to authenticate against ECR. When a new node joins, Portainer attempts to start containers on it, including those defined as global. The automatic pull fails every time until I manually pull the image via the Portainer dashboard, specifying that registry. Looking at the Docker logs when this happens, I see that it fails to pull the image from that registry. This makes me think that the saved auth credentials are not used in that case, but are used (and added to the node) when I pull the image manually.
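One way to check that the image pull is what fails on a new node is to list the per-task errors of the affected service. The following is a minimal sketch, not part of the original report: the service name mystack_web and the DRY_RUN switch are illustrative assumptions, and with DRY_RUN=1 (the default here) the command is only printed so it can be previewed without a swarm.

```shell
#!/bin/sh
# Sketch: show which node each task landed on and the error it reported.
# DRY_RUN=1 (default) only prints the command; set DRY_RUN=0 to execute it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1: service name (hypothetical example below: mystack_web)
show_task_errors() {
    run docker service ps --no-trunc \
        --format '{{.Node}} {{.CurrentState}} {{.Error}}' "$1"
}

show_task_errors mystack_web
```

Run on a manager with DRY_RUN=0, this should list one line per task; tasks scheduled on the new node would be expected to carry the pull error.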
Expected behavior
Image pulled and containers started
Steps to reproduce the issue:
- Add a registry with authentication, containing images, in the Portainer dashboard
- Deploy a new stack from the dashboard with some images in global mode
- Join a new node to the cluster
- …
Technical details:
- Portainer version: 1.20.0
- Docker version: mix of 18.09.1-ce and 18.06.1-ce
- Platform: Linux
- Command used to start Portainer:
curl -L https://downloads.portainer.io/portainer-agent-stack.yml -o portainer-agent-stack.yml && docker stack deploy --compose-file=portainer-agent-stack.yml portainer
- Browser: Chrome 71.0.3578.98
Additional context
About this issue
- State: closed
- Created 5 years ago
- Reactions: 1
- Comments: 16 (2 by maintainers)
I’m struggling to run Docker Swarm properly on AWS EC2 instances with a private ECR repository. Portainer isn’t much help because it adds another layer of “something can go wrong”.
What I’ve found (or done) so far:
- aws ecr get-login on each node (manager and workers), just for the sake of being sure docker has access to the repository.
- No :latest image on any worker! Portainer/Swarm will always use it as a fallback, even if a newer version is available in the repository. You should allow Swarm to always fetch the latest version of images based on the tag, but referenced by sha (and this is correct!).
- A service update without the --with-registry-auth parameter ends up with No such image: ... and the sha being removed from the service’s image!
- docker service update --force --with-registry-auth service-name fixes the problem until the next ECR credentials rotation.
- After service update --with-registry-auth you can successfully “Update the service” from Portainer until the next ECR credentials rotation.
I have no idea if I’m doing something wrong, but Portainer seems to have problems with private registries. One solution (while using manual docker login for ECR) could be to add --with-registry-auth for each “Update the service” while a repository other than DockerHub is selected.
While the solution is correct and it works, the issue still remains for new nodes added to the swarm: they fail to pull images. Even running aws ecr get-login on that node does not help at all. The only way I found was to redeploy the same stack with --with-registry-auth as stated before, which then seems to propagate the auth token to the recently added node, and it starts working as expected.
In fact, this doesn’t look like a Portainer-related issue (I am not currently using Portainer); it seems to be a Docker Swarm issue. The tests were conducted by issuing plain docker commands from the terminal.
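The workarounds discussed so far (log in to the registry on a manager, then redeploy the stack or force a service update with --with-registry-auth) can be sketched as below. This is only an illustration under stated assumptions: the stack name mystack, the service name mystack_web, the file docker-compose.yml and the DRY_RUN switch are hypothetical, and the ECR login shown in the comment assumes AWS CLI v1, as used in this thread.

```shell
#!/bin/sh
# Sketch: after logging in to the registry on a manager, send the
# credentials to the swarm agents.
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Redeploy a stack so that newly joined nodes also receive the credentials.
# $1: compose file, $2: stack name
redeploy_stack_with_auth() {
    run docker stack deploy --with-registry-auth --compose-file "$1" "$2"
}

# Force a rolling update of one service, sending registry auth to the agents.
# $1: service name
refresh_service_auth() {
    run docker service update --force --with-registry-auth "$1"
}

# Log in to the registry first. For ECR with AWS CLI v1 this could be:
#   $(aws ecr get-login --no-include-email --region <region>)
# Then (names below are hypothetical examples):
redeploy_stack_with_auth docker-compose.yml mystack
refresh_service_auth mystack_web
```

Because ECR authorization tokens expire (after 12 hours at the time of writing), a sequence like this would probably have to be repeated after each credentials rotation, for example from a scheduled job on a manager.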
Hope this helps
Experiencing similar behaviour. We have a swarm cluster on AWS. Each node has access to pull images from ECR and uses ecr-credential-helper.
When deploying an image from ECR onto our swarm, the swarm fails to schedule it. Once I pull the image manually on a swarm node, the container is successfully scheduled.
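The manual-pull workaround can be scripted per node. A minimal sketch, assuming a hypothetical image reference (registry.example.com/team/app:1.0) and the same illustrative DRY_RUN switch that only prints the commands by default:

```shell
#!/bin/sh
# Sketch: pull a list of images on the local node so that tasks scheduled
# here can start even if the swarm-side pull fails.
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $@: one or more image references
prepull_images() {
    for image in "$@"; do
        run docker pull "$image"
    done
}

# Hypothetical image reference, for illustration only.
prepull_images registry.example.com/team/app:1.0
```

This would have to run on every node that is expected to host the service, with a Docker client that is already authenticated against the registry.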
Worked for me, thanks. I have a swarm cluster with three nodes, with one registry deployed on each, a shared volume, and Traefik as the load balancer. The registry uses HTTP authentication. The problem was pulling the image on any node other than node 1. Deploying the stack with --with-registry-auth worked for me.
I am also having the same issue, but I am running my own swarm with a Nexus Docker repository. Portainer fails to find the image until I run
docker pull <repo>/<image>
I’m thinking that this issue is actually related to https://github.com/portainer/portainer/issues/1533
Thanks for the report. We’re aware that Portainer does not really interact well with AWS, and we’ll investigate a solution for this.
I had the same issue when deploying using the docker stack command. The solution was to add the --with-registry-auth argument; maybe the Portainer engine will have to do the same.