cache: Cache creation failed

I used the following configuration in my project's GitHub Actions workflow:

      # Cache node_modules
      - name: Cache dependencies
        uses: actions/cache@v2
        id: yarn-cache
        with:
          path: |
            **/node_modules
          key: ${{ runner.os }}-yarn-${{ hashFiles('**/yarn.lock') }}
          restore-keys: |
            ${{ runner.os }}-yarn-

However, the following error occurred during execution, which left me very confused:

    Warning: getCacheEntry failed: Cache service responded with 500

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 84
  • Comments: 69 (5 by maintainers)

Most upvoted comments

@dhadka This is happening again right now. Can you take a look?

Also, just so we get visibility, please react with a 👍 if it’s resolved for you or 👎 if not.

Thanks for letting us know it’s fixed and for your patience!

Root Cause

We traced this outage back to a bug introduced last week in the framework code that our various microservices, caching among them, are built on.

When a new repo is created that uses the cache, it gets assigned to one of our databases. Eventually, these databases fill up and a background job automatically seals them, preventing new accounts from being assigned to that database. This normally isn’t a problem as we will provision a new database before sealing the existing one, but due to the bug above database creation was failing.

As a mitigation last night, we unsealed one of the existing databases. But that background job, which runs once an hour, re-sealed the database. Once we realized that job was undoing our mitigation, we took steps to disable the job. We will be deploying a fix for the original bug today.

Repair items

There will likely be other repair items as we look more today, but some initial repair items I have in mind are:

  1. Add alerts for failed database creation. This would ping our on-call engineers and let them respond quicker.

  2. Make the various setup-* actions fault-tolerant when using caching. Caching should be best effort and not fail workflows, but in this case exceptions thrown from the cache module weren’t being handled.

  3. As a safety measure, add a check so that a database is never sealed if no others are available.

One more here! The bug has returned!

Folks are gathering here now 😃 This bug has returned and seems to be somewhat intermittent.

I have the same problem. I was going crazy because Google gave me nothing and the GitHub status page says Actions is OK.

Yeah, unfortunately our original mitigation was undone by an automated job. We’ve reapplied the fix plus another change to disable that job, and are now trying to determine if that’s sufficient.

They just brought the server back up. It works without any changes to the codebase.

jobs still failing 😦

Works for me too, after several attempts, thx

Hi, why was this closed? It's breaking all my pipelines 😦

See #820

For now, the workaround is to disable caching in your workflows, if possible.
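
For example, you could leave the step in place but skip it for now (a rough sketch, reusing the step from the original post):

      # Temporarily skip caching while the cache service is unstable (see #820)
      - name: Cache dependencies
        if: ${{ false }}   # remove this line to re-enable the step
        uses: actions/cache@v2
        with:
          path: |
            **/node_modules
          key: ${{ runner.os }}-yarn-${{ hashFiles('**/yarn.lock') }}
          # ...rest of the original step unchanged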

that’s a lot of work for an org with 4k repos 😦

A possible workaround may be to add continue-on-error: true to the step, so if the cache service fails, the action still continues.
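
Roughly like this (an untested sketch, reusing the cache step from the original post):

      # Caching becomes best effort: a failure here is reported but won't fail the job
      - name: Cache dependencies
        uses: actions/cache@v2
        id: yarn-cache
        continue-on-error: true
        with:
          path: |
            **/node_modules
          key: ${{ runner.os }}-yarn-${{ hashFiles('**/yarn.lock') }}
          restore-keys: |
            ${{ runner.os }}-yarn-

If later steps check steps.yarn-cache.outputs.cache-hit, note that a failed cache step may leave that output empty, which is effectively a miss.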

This is not ideal, as we usually expect the cache to carry forward. I think the warning should actually be an error.

Workflow failed twice, but now it’s working for me 😃

Same here with setup-python…

I’ve created a ticket with GitHub support; maybe that helps?

Ran into this issue as well. At least now I know it’s not something to do with the workflow itself. Might be worth pinging some staff? Doesn’t seem to be anything we can resolve on our own right now.

I agree, but I don’t know how to contact them

@TheBeachMaster Thanks Bro 😃

~~They just brought the server back up. It works without any changes to the codebase~~

I tried it; you can turn off the npm cache for now, and it seems to work 😢
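
If the cache comes from setup-node's built-in cache input, that looks roughly like this (the version and package manager here are just placeholders):

      - name: Set up Node
        uses: actions/setup-node@v2
        with:
          node-version: '16'
          # cache: npm   # temporarily disabled until the cache service recovers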

This problem has been solved. Thank you very much!❤️

I have the same problem in a new repository created via template

I have the same problem in a new repository too