portainer: New nodes are unable to pull images from registry with authentication
Bug description
New nodes are unable to pull images from a remote registry that requires authentication.
I have a private registry in AWS ECR. The nodes run outside of AWS, so pulling from ECR requires authentication. When a new node joins, Portainer attempts to start containers on it, including those defined as global. The automatic pull fails every time until I manually pull the image via the Portainer dashboard, specifying that registry. Looking at the Docker logs when this happens, I see that it fails to pull the image from that registry. This makes me think that the saved auth credentials are not used in that case, but are used (added to the node) when I pull the image manually.
Expected behavior
Image pulled and containers started
Steps to reproduce the issue:
- Add a registry with authentication, with images in it, in the Portainer dashboard
- Deploy a new stack from the dashboard with some images in global mode
- Join a new node to the cluster
- …
Technical details:
- Portainer version: 1.20.0
- Docker version: mix of 18.09.1-ce and 18.06.1-ce
- Platform: Linux
- Command used to start Portainer:
curl -L https://downloads.portainer.io/portainer-agent-stack.yml -o portainer-agent-stack.yml && docker stack deploy --compose-file=portainer-agent-stack.yml portainer
- Browser: Chrome 71.0.3578.98
Additional context
About this issue
- State: closed
- Created 5 years ago
- Reactions: 1
- Comments: 16 (2 by maintainers)
I’m struggling to run Docker Swarm properly on AWS EC2 instances with a private ECR repository. Portainer isn’t much help because it adds another layer of “something can go wrong”.
What I’ve found (or done) so far:
- Run `aws ecr get-login` on each node (manager and workers), just for the sake of being sure `docker` has access to the repository.
- Do not keep a `:latest` image on any worker! Portainer/Swarm will always use it as a fallback, even if a newer version is available in the repository. You should allow Swarm to always fetch the latest version of images based on tag, but referenced by sha (and this is correct!).
- Updating a service without the `--with-registry-auth` parameter ends up with `No such image: ...` and the sha being removed from the service’s image!
- `docker service update --force --with-registry-auth service-name` fixes the problem until the next ECR credentials rotation.
- After `service update --with-registry-auth` you can successfully “Update the service” from Portainer until the next ECR credentials rotation.
I have no idea if I’m doing something wrong, but Portainer seems to have problems with private registries. One solution (while using manual `docker login` for ECR) could be to add `--with-registry-auth` for each “Update the service” when a repository other than DockerHub is selected.
While the solution is correct and it works, the issue still remains for new nodes added to the swarm: they fail to pull images. Even running `aws ecr get-login` on that node does not help at all. The only way I found was to redeploy the same stack with `--with-registry-auth` as stated before, which then seems to propagate the auth token to the recently added node, and it starts working as expected.
In fact, this doesn’t look like a Portainer-related issue (I am not currently using Portainer); it seems to be a Docker Swarm issue. The tests were conducted by issuing plain docker commands from the terminal.
Hope this helps
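The workaround reported in these comments, re-sending registry credentials with `--with-registry-auth`, could be scripted roughly as follows. This is only a minimal sketch and was not posted in the thread: the function names `run`, `refresh_service_auth`, and `refresh_all` and the `DRY_RUN` variable are made up for illustration, and it assumes the script runs on a Swarm manager that has already logged in to the private registry with `docker login`.

```shell
#!/bin/sh
# Sketch only: re-send the manager's registry credentials to swarm services.
# Assumes a prior `docker login` to the private registry on this manager.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Update one service so its tasks are recreated with registry auth attached.
# --force redeploys the tasks even though the service spec is unchanged.
refresh_service_auth() {
  run docker service update --force --with-registry-auth "$1"
}

# Apply the same update to every service name passed as an argument.
refresh_all() {
  for name in "$@"; do
    refresh_service_auth "$name" || return 1
  done
}
```

For example, `refresh_all $(docker service ls --format '{{.Name}}')` would cover every service in the cluster. Since the comments report that the fix only lasts until the next ECR credentials rotation, it would presumably have to be re-run after each new login.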
Experiencing similar behaviour. We have a swarm cluster on AWS. Each node has access to pull images from ECR and uses ecr-credential-helper.
When deploying an image from ECR onto our swarm, it fails to schedule it. Once I pull the image manually on a swarm node, the container is successfully scheduled.
Worked for me. Thanks. I have a swarm cluster with three nodes, with one registry deployed on each, a shared volume, and Traefik as the load balancer. The registry uses HTTP authentication. The problem was pulling the image on any node other than node 1. Deploying the stack with `--with-registry-auth` worked for me.
I am also having the same issue, but I am running my own swarm with a Nexus Docker repository. Portainer fails to find the image until I run `docker pull <repo>/<image>`.
I’m thinking that this issue is actually related to https://github.com/portainer/portainer/issues/1533
Thanks for the report. We’re aware that Portainer does not really interact well with AWS and we’ll investigate a solution for this.
I had the same issue when deploying using the `docker stack` command. The solution was to add the `--with-registry-auth` arg; maybe the Portainer engine will have to do the same.
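For anyone deploying from the terminal, the stack-level variant of that fix might look like the sketch below. The function name `deploy_stack_with_auth`, its two arguments, and the `DRY_RUN` variable are hypothetical; only the `docker stack deploy` flags come from the Docker CLI, and a prior `docker login` on the manager is assumed.

```shell
#!/bin/sh
# Sketch only: (re)deploy a stack so registry credentials are sent to the swarm agents.
# Set DRY_RUN=1 to print the command instead of running it.
deploy_stack_with_auth() {
  compose_file="$1"   # path to the compose file (hypothetical argument)
  stack_name="$2"     # name of the stack (hypothetical argument)
  # Build the command as positional parameters so arguments stay intact.
  set -- docker stack deploy --with-registry-auth --compose-file "$compose_file" "$stack_name"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

Calling `deploy_stack_with_auth my-stack.yml my-stack` (both names are placeholders) after a node joins mirrors what earlier commenters describe: redeploying the same stack with the flag is what seemed to get the new node pulling images.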