copilot-cli: Second (and following) deployments of services fail after copilot upgrade
Hey,
Last week we upgraded our main Copilot app by running app upgrade. Since then we’re running into strange issues when redeploying any kind of service in the same app.
We created the app with Copilot 1.22.0 and have used the latest release of the copilot-cli for each deployment. We only triggered app upgrade last week, in order to use static sites.
For App Runner based services we see the following error:
deploy service retro to environment staging: deploy service: determine image repository type: image is not supported by App Runner: @sha256:96b7d5824ba87ef965f74db9a4f7babd95832852d3ce9b3b27219b7aa308a2ef
For ECS based services we see this error:
- Updating the infrastructure for stack tech-staging-oreo-dl [update rollback complete] [15.3s]
The following resource(s) failed to update: [TaskDefinition].
- An ECS service to run and maintain your tasks in the environment cluster [not started]
- An ECS task definition to group your containers and run them on ECS [delete complete] [0.0s]
Resource handler returned message: "Invalid request provided: Create TaskDefinition: Container.image repository should not be null or empty. (Service: AmazonECS; Status Code: 400; Error Code: ClientException; Request ID: abc12f74-a49a-42f9-ac85-418debf2f7b2; Proxy: null)" (RequestToken: 1d5e884d-bb98-74b6-fddb-8f2bc2265329, HandlerErrorCode: InvalidRequest)
So both errors seem to be related to ECR.
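To illustrate why the two errors look related: the App Runner message prints the image as "@sha256:…" with nothing in front of the "@", and the ECS message complains that the image repository is null or empty. The sketch below is only an illustration of that observation, not how Copilot, App Runner, or ECS actually parse image references; the function names are made up for this example.

```python
from typing import Tuple


def split_image_reference(ref: str) -> Tuple[str, str]:
    """Split a reference of the form <repository>@<digest> into (repository, digest).

    Hypothetical helper for illustration only.
    """
    repository, sep, digest = ref.partition("@")
    if not sep:
        # No "@" found: treat the whole string as the repository part.
        return ref, ""
    return repository, digest


def has_repository(ref: str) -> bool:
    """Return True if the part before the "@" is non-empty."""
    repository, _ = split_image_reference(ref)
    return bool(repository.strip())


# The image string exactly as it appears in the App Runner error above.
failing = "@sha256:96b7d5824ba87ef965f74db9a4f7babd95832852d3ce9b3b27219b7aa308a2ef"

repository, digest = split_image_reference(failing)
print(repr(repository))         # the repository part is the empty string
print(has_repository(failing))  # False
```

If that reading is right, the digest is resolved correctly on redeploys while the repository URI in front of it ends up empty.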
What we found out already:
- We can create new services and deploy them once. The second and all following attempts to deploy result in the same error.
- It’s hard to verify now, but we think that for some services we were able to deploy once after the app upgrade, while for others the very next deployment failed.
I would love to get any feedback on how we can further debug this issue as this is blocking our teams. I’ll happily provide more information, if you tell me which.
About this issue
- State: closed
- Created a year ago
- Reactions: 2
- Comments: 18 (8 by maintainers)
That’s very good to know. Thanks for addressing this issue.
For me this issue is resolved for now, as we know what was causing the problems and how we can avoid them in the future. Therefore I’m going to close it, even though we didn’t find a better way to fix the affected services than completely deleting and recreating them.
Thanks again for your support. That’s much appreciated.
Hello @schm.
It is built into the CLI.
I think “blocking a 1.29 client from accessing a 1.30 app” is a correct statement (if by “client” you meant the Copilot CLI), so that your 1.29 client won’t be able to accidentally downgrade your 1.30 app. However, this can be overridden by passing the --allow-downgrade flag.

@iamhopaul123 It’s strange, because I’m using v1.28 and today I’ve hit this problem three times (without touching the manifest). In one case the deploy failed, and in the other two the state machine failed with the error “failed to normalize image reference …” when running the job (issue #5032). They were jobs that I hadn’t touched for weeks or months, and the previous task version was likely deployed with an older Copilot version. On the jobs where I ran copilot job delete etc., the following deploys and runs are going fine. Monday morning I will try the workaround you suggested.
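For readers following the thread, the downgrade guard described in the maintainer reply above (an older CLI is blocked from deploying to an app on a newer version unless --allow-downgrade is passed) can be sketched roughly as follows. This is a minimal illustration of the idea, not Copilot's actual implementation; `parse_version` and `may_deploy` are hypothetical names.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn "v1.29" or "1.30.0" into a comparable 3-part tuple such as (1, 29, 0)."""
    parts = [int(p) for p in version.lstrip("v").split(".")]
    # Pad missing components with zeros so "1.30" and "1.30.0" compare equal.
    parts += [0] * (3 - len(parts))
    return tuple(parts[:3])


def may_deploy(client_version: str, app_version: str, allow_downgrade: bool = False) -> bool:
    """Allow the deployment unless the client is older than the app.

    An explicit allow_downgrade=True overrides the block, mirroring the
    role of the --allow-downgrade flag mentioned in the thread.
    """
    if parse_version(client_version) >= parse_version(app_version):
        return True
    return allow_downgrade


print(may_deploy("1.29", "1.30"))                        # False: older client is blocked
print(may_deploy("1.29", "1.30", allow_downgrade=True))  # True: explicit override
print(may_deploy("1.30.0", "1.30"))                      # True: same version
```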