origin: image not found when pulling from integrated registry - service account not allowed to pull?

I can successfully deploy the integrated registry and push a custom image from my client. However, creating a pod that uses this image fails. The error messages say that the image could not be found, but I have tried referencing the image by several different names and none worked, and I also see authentication-related error messages from docker. I therefore assume the service account fails to authenticate against the internal registry.

Version
[kubmaster@kubmaster1-prod ~]$ oc version
oc v3.6.1+008f2d5
kubernetes v1.6.1+5115d708d7
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://kubmaster1.mycompany.com:8443
kubernetes v1.6.1+5115d708d7
Steps To Reproduce
  1. After installing OpenShift with Ansible, but without the registry, add the registry to the existing cluster as described in Deploying a Registry on Existing Cluster.
  2. Push a custom image.
     2a. I’ve exposed the registry with a route and public hostname so I could use it from my client.
     2b. I’ve created a user for remote access as described in Accessing the Registry.
     2c. I’ve pushed the image to the registry using the public hostname:
         $ docker push registry.mycompany.com/default/ipsec-router
     2d. I can see the image on the master logged in as system:admin:
[kubmaster@kubmaster1-prod ~]$ oc get is
NAME           DOCKER REPO                                             TAGS      UPDATED
ipsec-router   docker-registry.default.svc:5000/default/ipsec-router   latest    42 minutes ago
  3. Create a pod referencing the image:
apiVersion: v1
kind: Pod
metadata:
  generateName: testapp-
spec:
  # for testing; known where to grab the docker logs from
  nodeSelector:
    openshift-infra: apiserver
  containers:
  - name: nginx
    image: nginx:1.7.9
    ports:
    - containerPort: 80
  - name: ipsec-router
    image: default/ipsec-router

NOTE: I’m not sure about the naming conventions for the image attribute, but I’ve tried different variations such as ipsec-router and docker-registry.default.svc:5000/default/ipsec-router. Every attempt fails with an error saying the image was not found, but I don’t think naming is the actual issue (see below).
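One way to see why the short names behave differently from the full pull spec is to check which registry host a reference actually names. The sketch below assumes Docker's usual rule that the text before the first slash counts as a registry host only if it contains a dot or a colon, or is localhost. `registry_host` is a hypothetical helper written for illustration, not part of oc or docker.

```shell
# registry_host IMAGE  (hypothetical helper, for illustration only)
# Prints the registry host that an image reference names explicitly.
# Prints an empty line when the reference has no host part, in which case the
# Docker daemon falls back to its configured default registries.
registry_host() {
  image=$1
  case "$image" in
    */*) first=${image%%/*} ;;   # text before the first slash
    *)   first="" ;;             # no slash: repository name only
  esac
  case "$first" in
    *.*|*:*|localhost) printf '%s\n' "$first" ;;
    *)                 printf '\n' ;;
  esac
}

registry_host "default/ipsec-router"
registry_host "docker-registry.default.svc:5000/default/ipsec-router"
```

Under this rule, default/ipsec-router names no registry at all, so the credentials stored for docker-registry.default.svc:5000 should not be considered for it. Only the full form from the DOCKER REPO column directs the pull at the integrated registry.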

All actions happen in the default project.

Current Result

The pod creation fails.

[kubmaster@kubmaster1-prod ~]$ oc get pods
NAME                      READY     STATUS             RESTARTS   AGE
docker-registry-2-53kgw   1/1       Running            1          1h
router-1-b8d16            1/1       Running            2          1h
testapp-qp101             1/2       ImagePullBackOff   0          15m

Pod events:

Events:
  FirstSeen	LastSeen	Count	From					SubObjectPath			Type		Reason		Message
  ---------	--------	-----	----					-------------			--------	------		-------
  15m		15m		1	default-scheduler							Normal		Scheduled	Successfully assigned testapp-qp101 to kubmaster1.mycompany.com
  15m		15m		1	kubelet, kubmaster1.mycompany.com	spec.containers{nginx}		Normal		Pulled		Container image "nginx:1.7.9" already present on machine
  15m		15m		1	kubelet, kubmaster1.mycompany.com	spec.containers{nginx}		Normal		Created		Created container
  15m		15m		1	kubelet, kubmaster1.mycompany.com	spec.containers{nginx}		Normal		Started		Started container
  15m		13m		4	kubelet, kubmaster1.mycompany.com	spec.containers{ipsec-router}	Normal		Pulling		pulling image "default/ipsec-router"
  15m		13m		4	kubelet, kubmaster1.mycompany.com	spec.containers{ipsec-router}	Warning		Failed		Failed to pull image "default/ipsec-router": rpc error: code = 2 desc = Error: image default/ipsec-router:latest not found
  15m		5m		41	kubelet, kubmaster1.mycompany.com	spec.containers{ipsec-router}	Normal		BackOff		Back-off pulling image "default/ipsec-router"
  15m		15s		70	kubelet, kubmaster1.mycompany.com					Warning		FailedSync	Error syncing pod
Expected Result

Pod should get created.

Additional Information
ERROR: [DClu1019 from diagnostic ClusterRegistry@openshift/origin/pkg/diagnostics/cluster/registry.go:343]
       Diagnostics created a test ImageStream and compared the registry IP
       it received to the registry IP available via the docker-registry service.

       docker-registry      : 172.30.210.175:5000
       ImageStream registry : docker-registry.default.svc:5000

       They do not match, which probably means that an administrator re-created
       the docker-registry service but the master has cached the old service
       IP address. Builds or deployments that use ImageStreams with the wrong
       docker-registry IP will fail under this condition.

       To resolve this issue, restarting the master (to clear the cache) should
       be sufficient. Existing ImageStreams may need to be re-created.

According to this mailing list entry, this could be a bug. In any case, I do not experience DNS-related issues. In fact, the service can be reached, and the secret contains credentials for both the IP address and the service name (see below).

  • registry container logs show no relevant information (only health checks)

  • dockerd system logs on the system where the pod should be created indicate an authentication problem:

Nov 30 01:53:00 kubmaster1-prod dockerd-current: time="2017-11-30T01:53:00.158288891+01:00" level=error msg="Attempting next endpoint for pull after error: unauthorized: authentication required"
Nov 30 01:53:00 kubmaster1-prod dockerd-current: time="2017-11-30T01:53:00.588303570+01:00" level=error msg="Not continuing with pull after error: Error: image default/ipsec-router:latest not found"
Nov 30 01:53:15 kubmaster1-prod dockerd-current: time="2017-11-30T01:53:15.954754041+01:00" level=error msg="Handler for GET /v1.24/images/default/ipsec-router:latest/json returned error: No such image: default/ipsec-router:latest"
Nov 30 01:53:31 kubmaster1-prod dockerd-current: time="2017-11-30T01:53:31.950696553+01:00" level=error msg="Handler for GET /v1.24/images/default/ipsec-router:latest/json returned error: No such image: default/ipsec-router:latest"
Nov 30 01:53:43 kubmaster1-prod dockerd-current: time="2017-11-30T01:53:43.947386178+01:00" level=error msg="Handler for GET /v1.24/images/default/ipsec-router:latest/json returned error: No such image: default/ipsec-router:latest"

This sounds to me as if the pull from the internal registry fails on authentication, and the "not found" message comes from the other registries that are tried afterwards.

  • not sure if the following should work, but it doesn’t:
$ oc get secrets
$ oc describe secret default-dockercfg-zbb95
...
dockercfg:      {"172.30.210.175:5000":{"username":"serviceaccount","password":"xxx...","email":"serviceaccount@example.org","auth":"yyy..."},"docker-registry.default.svc:5000":{"username":"serviceaccount","password":"xxx...","email":"serviceaccount@example.org","auth":"yyy..."}}
...
$ oc login --token=xxx....
Logged into "https://kubmaster1.mycompany.com:8443" as "system:serviceaccount:default:default" using the token provided.
...
$ docker login -u $(oc whoami) -p $(oc whoami -t) docker-registry.default.svc.cluster.local:5000
Error response from daemon: Get https://docker-registry.default.svc.cluster.local:5000/v2/: unauthorized: authentication required

As said, not sure if that should actually work, but it would match the error message seen in the system log from dockerd.
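To check which identity a dockercfg entry carries, its auth field can be decoded: in the Docker config format it is the base64 encoding of username:password. The sketch below is a minimal illustration using a made-up credential, since the real values are elided above. `decode_auth` is a hypothetical helper, and `base64 -d` assumes GNU coreutils.

```shell
# decode_auth AUTH  (hypothetical helper)
# Decodes the "auth" value of a dockercfg entry into "username:password".
decode_auth() {
  printf '%s' "$1" | base64 -d
}

# Made-up sample value, NOT the credential from the secret above.
sample_auth=$(printf '%s' 'serviceaccount:example-token' | base64)

decoded=$(decode_auth "$sample_auth")
echo "user:  ${decoded%%:*}"   # text before the first colon
echo "token: ${decoded#*:}"    # text after the first colon
```

For a service account secret, the password part should be a service account token, so the decoded value can be compared with the token that was passed to oc login.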

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 37 (23 by maintainers)

Most upvoted comments

@pweil-, @amather I just wanted to report that we are seeing identical problems when upgrading a fully automated installation from 3.6 to 3.7. The only difference is the version of OpenShift. It does seem like something has changed between these versions that causes this registry problem.

I still have the issue with an OpenShift 3.9 cluster: when it is not logged in, the docker client wrongly returns “Error: image myimage/myimage:latest not found”. It should return an error like “Authentication required”.

@pweil- @bparees

From the internal docker registry logs, we can derive that the pods are being authorized as system:anonymous. This is not what we expected, since we defined imagePullSecrets for the deployment, and these contain a secret for a user that has the roles system:image-builder, system:image-puller and system:registry. Additionally, we added the role system:image-puller to the service account that is used for the deployment. When we add the role system:image-puller to system:anonymous, the pods can pull the images and start running.
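The role changes described in this comment map onto `oc policy add-role-to-user`. The sketch below only prints the commands by default so they can be reviewed first. The `grant_image_puller` wrapper and the `APPLY` variable are hypothetical, and the project and service account names (default) are assumptions taken from the original report rather than from this comment. Granting the role to system:anonymous makes the images in that project pullable without credentials, so it is better treated as a diagnostic workaround than as a fix.

```shell
# grant_image_puller SUBJECT [PROJECT]  (hypothetical wrapper)
# Prints the oc command that grants system:image-puller to SUBJECT in PROJECT.
# Runs the command instead of printing it only when APPLY=1 is set.
grant_image_puller() {
  subject=$1
  project=${2:-default}
  set -- oc policy add-role-to-user system:image-puller "$subject" -n "$project"
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    echo "$*"
  fi
}

# The service account used by the pod (assumed to be default/default).
grant_image_puller system:serviceaccount:default:default default

# The workaround from the comment above: allow anonymous pulls.
grant_image_puller system:anonymous default
```

Running the script as-is changes nothing on the cluster; setting APPLY=1 in the environment executes the same commands against the currently logged-in cluster.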