cache: Failed restore should fail step
Currently, a broken restore does not fail the step; it only emits a warning.
Example output
Run actions/cache@v1
[warning]connect ETIMEDOUT 13.107.42.16:443
About this issue
- State: closed
- Created 4 years ago
- Reactions: 3
- Comments: 17 (6 by maintainers)
Commits related to this issue
- Use newer actions/cache for improved reliability. Per https://github.com/actions/cache/issues/141#issuecomment-629342357, several of the recent commits have improved reliability: - https://github.co... — committed to chipsalliance/rocket-chip by richardxia 4 years ago
- Use newer GitHub actions/cache for improved reliability. (#2472) * Use newer actions/cache for improved reliability. Per https://github.com/actions/cache/issues/141#issuecomment-629342357, several... — committed to chipsalliance/rocket-chip by richardxia 4 years ago
The readme on this action mentions skipping steps based on cache hits: https://github.com/actions/cache#Skipping-steps-based-on-cache-hit
But when the cache restore fails with the timeout mentioned by the OP, output.cache_hit is true. We had a recent failure, and the workflow continued, expecting a successful cache hit (rather than recreating the cached content as it would with a cache miss).
I agree that failing the step isn’t necessarily the best choice for this action, but it would be really helpful if the output were accurate so that workflows could recreate the content that didn’t come back from the cache.
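Until the output can be trusted, one possible workaround is to check for the restored files directly instead of relying on the cache-hit output. The fragment below is only a sketch: the build-output path, the cache key, and the make build command are hypothetical placeholders, and it assumes a Linux runner with the default bash shell.

```yaml
# Sketch only: path, key, and build command are hypothetical.
steps:
  - name: Restore build output
    id: cache
    uses: actions/cache@v1
    with:
      path: build-output
      key: build-${{ github.sha }}

  # Do not rely on the cache-hit output alone; check that files actually arrived.
  - name: Rebuild if the restored content is missing
    run: |
      if [ ! -d build-output ] || [ -z "$(ls -A build-output)" ]; then
        echo "Cache content missing, rebuilding"
        make build
      fi
```

Because the check looks at the directory itself, it behaves the same way for a cache miss, an evicted cache, and a restore that failed with a network error.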
@ppalaga Thank you for bringing this to our attention. We are aware of flakiness caused by network issues between the Actions runner and the cloud storage provider used by the cache, and are looking at ways to improve reliability.
Please note, however, that the cache action is “best effort” and assumes that if it fails, the subsequent steps can recreate the cached content. For this reason, a failure to restore the cache does not fail the job. This works well for project dependencies, which can just be downloaded from their package manager (maven, npm, etc.) if the cache action fails to restore (either by a cache miss or other error). Caches are also subject to eviction, so they can disappear potentially at any time.
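As an illustration of the "best effort" pattern described above, here is a minimal sketch for an npm project. It assumes a package-lock.json exists in the repository; the key layout follows the style used in the action's README, but treat the details as an example rather than a recommendation.

```yaml
# Sketch only: assumes an npm project with a package-lock.json.
steps:
  - name: Restore npm download cache (best effort)
    uses: actions/cache@v1
    with:
      path: ~/.npm
      key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
      restore-keys: |
        ${{ runner.os }}-node-

  # Runs unconditionally. With a warm cache it is fast; after a miss or a
  # failed restore it downloads the packages from the registry instead.
  - name: Install dependencies
    run: npm ci
```

The install step does not depend on the cache step succeeding, so a timeout during restore only costs time and does not break the job.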
Looking at the workflow file used by the project, it appears to store the build artifacts in the cache and restore them for use by each of the test runs. For this kind of workflow, it likely makes sense to use workflow artifacts instead. Workflow artifacts persist longer and are designed to be more reliable than a cache.
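A rough sketch of what passing build output through workflow artifacts could look like is shown below. The job names, the build-output directory, the make commands, and the @v4 version tags are all assumptions for illustration and are not taken from the project's actual workflow.

```yaml
# Sketch only: job names, paths, commands, and version tags are assumptions.
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make build
      # Publish the build output so later jobs in this run can fetch it.
      - uses: actions/upload-artifact@v4
        with:
          name: build-output
          path: build-output

  test:
    needs: build   # wait for the build job, and for its artifact, first
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/download-artifact@v4
        with:
          name: build-output
          path: build-output
      - run: make test
```

Unlike a cache restore, a download step that cannot find its artifact is expected to fail, which matches the behavior requested in this issue for content the later steps cannot recreate.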