service-fabric: Unable to download container image from private repository

I have an SF cluster (Microsoft.Azure.ServiceFabric.WindowsServer.5.6.220.9494) running the ClusterConfig.Unsecure.DevCluster configuration.

Docker was installed beforehand and was running these same images without problems. Before installing SF, I removed all running containers and cleaned up the images and private repository logins.

I am trying to deploy an application pointing to my private repo, e.g. myrepo.azurecr.io/sf/myapp

I am able to use the repository credentials specified in the manifest to log in to the repo from the Docker CLI.

When SF tries to deploy the container it states:

Error event: SourceId='System.Hosting', Property='Download:1.0:1.0'.
There was an error during download.Failed to download container image myrepo.azurecr.io/sf/myapp

In the admin log I see this sequence of events:

End(BeginDownloadAndActivate): Error=HostingDeploymentInProgress, VersionedServiceTypeId={MyAppType_App10:MyAppPkg:MyAppType,1.0:1.0:131420102687941111}, ActivationContext=551bf757-1e64-45ca-9812-b99f3875df69, ServicePackagePublicActivationId=d87665d6-1421-4c9d-8d36-73442c5d7b80, SequenceNumber=185
DownloadContainerImages returned 0xd00000e5
Failed to import docker image error 0xd00000e5.
EndSendRequest for image history Error 0xd00000e5
DownloadContainerImages returned 0xd00000e5
Failed to import docker image error 0xd00000e5.
EndSendRequest for image history Error 0xd00000e5
80b671e9b3c9184bbd86d2f150c58135:131419843158391101:131419843669317074 failed to send message AddInstance to node 5101db1125ead8d47d6f93321d3eb754:131419843160891129 with error FABRIC_E_TIMEOUT

ServiceManifest

<EntryPoint>
  <ContainerHost>
    <ImageName>myrepo.azurecr.io/sf/myapp</ImageName>
  </ContainerHost>
</EntryPoint>

ApplicationManifest

<Policies>
  <ContainerHostPolicies CodePackageRef="Code">
    <RepositoryCredentials AccountName="myrepo" Password="mysecret" PasswordEncrypted="false" />
  </ContainerHostPolicies>
</Policies>
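As a quick sanity check that the two fragments agree, the sketch below parses them and prints which registry the image points at and which account the credentials use. It assumes the fragments are available as strings and ignores the XML namespace that full Service Fabric manifests declare. `registry_of` is a hypothetical helper written for this example, not part of any SF tooling.

```python
import xml.etree.ElementTree as ET

# The two fragments from the question, copied as-is.
SERVICE_FRAGMENT = """
<EntryPoint>
  <ContainerHost>
    <ImageName>myrepo.azurecr.io/sf/myapp</ImageName>
  </ContainerHost>
</EntryPoint>
"""

APP_FRAGMENT = """
<Policies>
  <ContainerHostPolicies CodePackageRef="Code">
    <RepositoryCredentials AccountName="myrepo" Password="mysecret" PasswordEncrypted="false" />
  </ContainerHostPolicies>
</Policies>
"""


def registry_of(image_name):
    """Return the registry host of an image name, or None if it has none."""
    first, sep, _ = image_name.partition("/")
    # Docker treats the first path component as a registry host only if it
    # looks like one (contains a dot or a port, or is "localhost").
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    return None


image = ET.fromstring(SERVICE_FRAGMENT).findtext("ContainerHost/ImageName").strip()
creds = ET.fromstring(APP_FRAGMENT).find("ContainerHostPolicies/RepositoryCredentials")

print("image:   ", image)
print("registry:", registry_of(image))
print("account: ", creds.get("AccountName"))
```

If the registry printed here is not the one the account belongs to, the pull would be expected to fail regardless of the daemon state.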

Also, it appears that dockerd is not running. In an earlier debug log I saw that the Docker process manager started dockerd successfully, but dockerd then exited with error code 1, which the log said was OK. I haven’t seen this happen again in the debug log so far.

One other thing to note: the image is rather large, at about 9 GB.

Update: I installed Docker on another host, logged in to the private repo, pulled the image, and then installed SF. I was able to deploy the same manifests and run the container successfully. If the image does not already exist in Docker before SF tries to pull it, the deployment fails with the errors above.
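The workaround in this update (pulling the image by hand before SF deploys it) could be scripted roughly as follows. This is only a sketch: `prepull_commands` and `prepull` are hypothetical names, and it assumes a Docker CLI new enough to support `--password-stdin` (older CLIs may only accept `--password`).

```python
import subprocess


def prepull_commands(registry, repository, username):
    """Build the docker CLI invocations for logging in and pulling an image."""
    login = ["docker", "login", registry, "--username", username, "--password-stdin"]
    pull = ["docker", "pull", f"{registry}/{repository}"]
    return login, pull


def prepull(registry, repository, username, password):
    """Log in to the registry and pull the image; raises if either step fails."""
    login, pull = prepull_commands(registry, repository, username)
    # --password-stdin keeps the secret off the command line.
    subprocess.run(login, input=password.encode(), check=True)
    subprocess.run(pull, check=True)
```

Calling `prepull("myrepo.azurecr.io", "sf/myapp", "myrepo", "mysecret")` on each node before deploying would mirror the manual steps above. It works around the failure rather than explaining it.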

The dockerd process does not run successfully if the image doesn’t already exist.

Update: On my local development workstation running Windows Server 2016, I wiped the images with docker rmi $(docker images -q). Deploying the application to the local dev SF deployment then causes Docker to download the image from my private repo. I followed the same process on the broken 2016 container host. The main difference is that the broken server runs the Core OS, so there is no GUI, while the working Docker host is my 2016 development workstation, so it is local to Visual Studio.

Is my Core container host broken, unsupported, or something else? At this point I am mostly interested in how to find out what’s wrong with it, so that if it’s something I did, I don’t do it again.
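One possible first diagnostic step, given the suspicion that dockerd is not staying up, is to check whether the Docker CLI can reach the daemon at all on the failing host. The sketch below does that by running `docker info`. The function name and the timeout value are assumptions made for this example, and a `True` result only shows the daemon answered at that moment.

```python
import subprocess


def docker_daemon_reachable(timeout=30):
    """Return True if `docker info` succeeds, i.e. the CLI can reach dockerd."""
    try:
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # docker CLI missing or not executable, or the daemon did not answer in time.
        return False
    return result.returncode == 0
```

Running this before and after a failed SF deployment would show whether the daemon goes away around the time of the download error.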

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 46

Most upvoted comments

We’re presently testing internal builds with this - the next minor version update of SF (6.2) will have this fixed.

@mani-ramaswamy Any idea on how soon it will be fixed?

Service Fabric cannot run Windows containers on Windows 10 locally at present. This will be fixed in an upcoming release.

No, to run Linux containers, you need a SF Linux cluster at this time.

No, you cannot run Linux containers on Windows Service Fabric clusters today.

I’ve retried this with Azure Service Fabric 5.7 with Linux. Can I run Linux containers on a Windows Service Fabric cluster? If so, I’ll try that, too.