cache: Failed restore should fail step

Currently, a failed cache restore does not fail the step; it only emits a warning.

Example output

Run actions/cache@v1
[warning]connect ETIMEDOUT 13.107.42.16:443

About this issue

State: closed
Created 4 years ago
Reactions: 3
Comments: 17 (6 by maintainers)

Commits related to this issue

Use newer actions/cache for improved reliability. Per https://github.com/actions/cache/issues/141#issuecomment-629342357, several of the recent commits have improved reliability: - https://github.co... — committed to chipsalliance/rocket-chip by richardxia 4 years ago
Use newer GitHub actions/cache for improved reliability. (#2472) * Use newer actions/cache for improved reliability. Per https://github.com/actions/cache/issues/141#issuecomment-629342357, several... — committed to chipsalliance/rocket-chip by richardxia 4 years ago

Most upvoted comments

The readme on this action mentions skipping steps based on cache hits: https://github.com/actions/cache#Skipping-steps-based-on-cache-hit

But when the cache restore fails with the timeout mentioned by the OP, output.cache_hit is true. We had a recent failure, and the workflow continued as if the cache hit had succeeded (rather than recreating the cached content, as it would after a cache miss).

2020-04-13T19:58:00.5904633Z ##[group]Run actions/cache@v1.1.2
2020-04-13T19:58:00.5904828Z with:
2020-04-13T19:58:00.5904952Z path: .
2020-04-13T19:58:00.5905237Z key: some-key
2020-04-13T19:58:00.5905368Z ##[endgroup]
2020-04-13T19:58:32.4250633Z [warning]connect ETIMEDOUT 13.107.42.16:443

I agree that failing the step isn’t necessarily the best choice for this action, but it would be really helpful if the output were accurate, so that workflows could recreate the content that didn’t come back from the cache.

bigdogwillfeed on Apr 13, 2020
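
One way to guard against the situation described in the comment above is to check for the restored content directly instead of relying only on the cache-hit output. The following is a minimal sketch, not something taken from the thread: it assumes a Node.js project whose cached content is node_modules, and the step id (deps-cache), path, key, and the npm ci command are all illustrative.

```yaml
steps:
  - uses: actions/checkout@v2

  # Best-effort restore; a network error here only produces a warning.
  - name: Restore dependency cache
    id: deps-cache            # hypothetical step id
    uses: actions/cache@v1
    with:
      path: node_modules      # assumed cache path
      key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}

  # Check the filesystem rather than steps.deps-cache.outputs.cache-hit,
  # so a failed restore is handled the same way as a cache miss.
  - name: Recreate dependencies if they are missing
    run: |
      if [ ! -d node_modules ]; then
        npm ci
      fi
```

The trade-off is that a directory check only detects content that is entirely absent; a partially restored directory would need a stricter check, such as verifying a marker file written when the cache was created.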

@ppalaga Thank you for bringing this to our attention. We are aware of flakiness caused by network issues between the Actions runner and the cloud storage provider used by the cache, and are looking at ways to improve reliability.

Please note, however, that the cache action is “best effort” and assumes that if it fails, the subsequent steps can recreate the cached content. For this reason, a failure to restore the cache does not fail the job. This works well for project dependencies, which can just be downloaded from their package manager (maven, npm, etc.) if the cache action fails to restore (either by a cache miss or other error). Caches are also subject to eviction, so they can disappear potentially at any time.

Looking at the workflow file used by the project, it looks like it is storing the build artifacts in the cache and restoring them for use by each of the test runs. For this kind of workflow, it likely makes sense to use workflow artifacts instead. Workflow artifacts persist longer and are designed to be more reliable than a cache.

dhadka on Apr 7, 2020
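
The suggestion above, passing build output between jobs as a workflow artifact rather than through the cache, could look roughly like the sketch below. It is illustrative only: the job names, the artifact name (build-output), the build/ directory, and the make commands are assumptions, and the action version tags should be adjusted to whatever is current for the repository.

```yaml
name: build-and-test
on: push

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make build                 # hypothetical build command
      # Upload the build output so later jobs can fetch it.
      - uses: actions/upload-artifact@v4
        with:
          name: build-output            # hypothetical artifact name
          path: build/                  # assumed output directory

  test:
    needs: build                        # run only after the build job succeeds
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Download the artifact produced by the build job.
      - uses: actions/download-artifact@v4
        with:
          name: build-output
          path: build/
      - run: make test                  # hypothetical test command
```

Unlike a cache restore, a failed artifact download fails the step, so the test job does not continue with missing build output.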