azure-pipelines-tasks: Cache step error in pipeline: CaptureResult has already been called for action PipelineCache.RestoreCache
Required Information
Entering this information will route you directly to the right team and expedite traction.
Question, Bug, or Feature?
Type: Bug
Enter Task Name: Cache@2
Environment
- Server - Azure Pipelines
- Agent - Hosted
Issue Description
Error occurred during node modules caching: CaptureResult has already been called for action PipelineCache.RestoreCache
I found the same issue reported before, but it was closed without resolution.
Based on the task documentation, there is no limitation on the stage in which the task can be used.
The same task works well in our build pipeline and fails only in the CD pipeline. As a result we can't use the cache task and have to download all node modules on every CD pipeline run.
Our pipeline definition:
```yaml
resources:
  pipelines:
    - pipeline: build
      source: build
      trigger:
        branches:
          include:
            - develop

variables:
  - group: variable-group-dev

trigger:
  enabled: false
pr:
  enabled: false

pool:
  vmImage: 'ubuntu-latest'

stages:
  - stage: Deploy
    displayName: Deploy Dev
    jobs:
      - job: Deploy
        timeoutInMinutes: 0
        steps:
          - checkout: self
          - task: UsePythonVersion@0
            displayName: 'use python 3'
            inputs:
              versionSpec: '3.x'
              addToPath: true
          - task: CmdLine@2
            displayName: 'pip install aws cli'
            inputs:
              script: pip3 install awscli
          - task: Cache@2
            displayName: cache node_modules
            inputs:
              key: 'npm1 | "$(Agent.OS)" | $(Build.SourcesDirectory)/cdk/package-lock.json'
              restoreKeys: |
                npm1 | "$(Agent.OS)"
              path: $(Build.SourcesDirectory)/cdk/node_modules
          - task: CmdLine@2
            displayName: 'npm install'
            inputs:
              script: npm install
              workingDirectory: cdk
          # ... remaining part omitted
```
Task logs
Starting: cache node_modules
==============================================================================
Task : Cache
Description : Cache files between runs
Version : 2.0.1
Author : Microsoft Corporation
Help : https://aka.ms/pipeline-caching-docs
==============================================================================
Resolving key:
- npm1 [string]
- "Linux" [string]
- /home/vsts/work/1/s/cdk/package-lock.json [file] --> 766D530EB535318821053734B01BD300A4EBA42257296D3CAB307A3C72639658
Resolved to: npm1|"Linux"|NQ8TuIEXcUX8/HdObojYu1L3twyI0N+FYCfxie4A7+g=
Resolving restore key:
- npm1 [string]
- "Linux" [string]
Resolved to: npm1|"Linux"|**
ApplicationInsightsTelemetrySender will correlate events with X-TFS-Session 4c6273e8-46a6-4da0-b85a-a49697a8ddea
Getting a pipeline cache artifact with one of the following fingerprints:
Fingerprint: `npm1|"Linux"|NQ8TuIEXcUX8/HdObojYu1L3twyI0N+FYCfxie4A7+g=`
Fingerprint: `npm1|"Linux"|**`
Getting a pipeline cache artifact with one of the following fingerprints:
Fingerprint: `npm1|"Linux"|NQ8TuIEXcUX8/HdObojYu1L3twyI0N+FYCfxie4A7+g=`
Fingerprint: `npm1|"Linux"|**`
Getting a pipeline cache artifact with one of the following fingerprints:
Fingerprint: `npm1|"Linux"|NQ8TuIEXcUX8/HdObojYu1L3twyI0N+FYCfxie4A7+g=`
Fingerprint: `npm1|"Linux"|**`
Getting a pipeline cache artifact with one of the following fingerprints:
Fingerprint: `npm1|"Linux"|NQ8TuIEXcUX8/HdObojYu1L3twyI0N+FYCfxie4A7+g=`
Fingerprint: `npm1|"Linux"|**`
ApplicationInsightsTelemetrySender correlated 5 events with X-TFS-Session 4c6273e8-46a6-4da0-b85a-a49697a8ddea
##[error]CaptureResult has already been called for action PipelineCache.RestoreCache
Finishing: cache node_modules
Troubleshooting
Check out how to troubleshoot failures and collect debug logs: https://docs.microsoft.com/en-us/vsts/build-release/actions/troubleshooting
Error logs
##[error]CaptureResult has already been called for action PipelineCache.RestoreCache
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Reactions: 46
- Comments: 44 (7 by maintainers)
Commits related to this issue
- fix: workaround cache job failures There seems to be an issue with Cache jobs on Azure devops. See here: https://github.com/microsoft/azure-pipelines-tasks/issues/15518 — committed to VowpalWabbit/vowpal_wabbit by jackgerrits 3 years ago
- fix: workaround cache job failures (#3484) There seems to be an issue with Cache jobs on Azure devops. See here: https://github.com/microsoft/azure-pipelines-tasks/issues/15518 — committed to VowpalWabbit/vowpal_wabbit by jackgerrits 3 years ago
- Allow pipeline to continue if cache tasks error Added after seeing errors with Cache task as reported here: https://github.com/microsoft/azure-pipelines-tasks/issues/15518 — committed to CMeeg/next-azure by CMeeg 3 years ago
+1. I changed the cache key by adding a salt, but then another item that had been working failed in the same way. This is a major issue. A failure like this should be treated as a cache miss; please fix ASAP.
We think there are two problems here:
So, to summarize: the PR that went into the agent won't increase or decrease the occurrence of the issue, but it should log the true backend error message. We are still investigating why downloading from the cache sometimes fails with FormatExceptions.
Setting `continueOnError: true` is better than disabling the task: restore is still attempted, and steps where it succeeded will still benefit from the cache.
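For reference, a minimal sketch of that workaround applied to the Cache@2 step from the pipeline above (key and path taken from the original definition):

```yaml
- task: Cache@2
  displayName: cache node_modules
  continueOnError: true  # a cache failure becomes a warning instead of failing the job
  inputs:
    key: 'npm1 | "$(Agent.OS)" | $(Build.SourcesDirectory)/cdk/package-lock.json'
    restoreKeys: |
      npm1 | "$(Agent.OS)"
    path: $(Build.SourcesDirectory)/cdk/node_modules
```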
Adding some additional notes:
We’ve got a pipeline using this task on hosted agents; the last run that succeeded was at 12:00 GMT today (19 November):
Initialize Job Logs:
Cache Task has a single entry for “Getting a pipeline cache artifact”:
The pipeline started failing with a run at 12:32 GMT.
As you can see, the versions of the tasks, the agent image, etc. are the same. The Cache task then reported:
So it is using the same fingerprint, but clearly failing to communicate with the cache store, retrying four times, and then erroring out. Running again with diagnostics enabled gives the following detailed logs:
This makes it look like an error connecting to the cache store that is not being handled properly.
As this task is designed to cache a transient, rebuildable blob, it would probably be better for a failure like this to be a warning reported back as a cache miss, so that the pipeline can continue and handle the miss as it normally would. At the moment this completely blocks builds until we remove or skip the cache step temporarily.
Update:
Early read of the telemetry is indicating that relief is in place. Do let us know if you see otherwise. We’ve mitigated the incident for now.
It looks like we hit this bug: https://github.com/dotnet/runtime/issues/4774. We found persisted datetimes like "cr:expirationDate": "11/26/2021 2:31:39 PM +00:00 +00:00", which looks exactly like the problem described in that issue. It’s likely a single machine in our application tier hit this bug and persisted "corrupt" data that caused problems for anyone trying to pull those cache items.
We believe we no longer have a "bad" AT (application-tier machine) in the mix, so changing the salt should mitigate it. We are also looking to deploy something that will mitigate this without changing the salt.
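For reference, "changing the salt" just means changing the literal prefix in the cache key, which produces a new fingerprint and bypasses any corrupt entries stored under the old one. A minimal sketch against the pipeline above (bumping the `npm1` prefix to `npm2` is an arbitrary illustrative choice):

```yaml
- task: Cache@2
  displayName: cache node_modules
  inputs:
    # new salt ("npm2" instead of "npm1") -> new fingerprint -> old entries are never read
    key: 'npm2 | "$(Agent.OS)" | $(Build.SourcesDirectory)/cdk/package-lock.json'
    restoreKeys: |
      npm2 | "$(Agent.OS)"
    path: $(Build.SourcesDirectory)/cdk/node_modules
```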
We set a variable and have a condition on all our cache/cache-miss steps, so if the cache misses, or fails with `continueOnError`, the next step still runs and restores the missing items.
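A minimal sketch of that pattern using the Cache@2 task's `cacheHitVar` input, which is set to 'true' only on an exact key hit (the variable name `CACHE_RESTORED` is an illustrative choice):

```yaml
- task: Cache@2
  displayName: cache node_modules
  continueOnError: true
  inputs:
    key: 'npm1 | "$(Agent.OS)" | $(Build.SourcesDirectory)/cdk/package-lock.json'
    path: $(Build.SourcesDirectory)/cdk/node_modules
    cacheHitVar: CACHE_RESTORED  # 'true' only on an exact key hit
- task: CmdLine@2
  displayName: 'npm install'
  # runs when the cache missed, hit only a restore key, or the cache step failed
  condition: ne(variables.CACHE_RESTORED, 'true')
  inputs:
    script: npm install
    workingDirectory: cdk
```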
As with any other step, it can be disabled. I’m already using that as a workaround…
Well, we’re seeing this problem as well. We disabled all of our cache steps and are taking the build-time hit so that we can keep moving forward, albeit very slowly.
OK, a solution (sort of): mark all your cache steps with `continueOnError: true`. This won’t truly fix the issue, but it will let pipelines continue without the cache if one caching step fails.
@carl-tanner I am still seeing this issue. Anyone else?
I just created a new pipeline and the error is gone. The previous pipeline is still not working.
Encountering this as well: the last build succeeded, same pipeline.yaml, less than 4 hours prior.
One of the problems with this issue is that many different errors were being reported as “CaptureResult has already been called for action”. Your issue is something else: the underlying error for your cache task is “System.ArgumentException: Unable to find pipeline caching scopes.” When the latest version of the pipelines agent is rolled out you’ll start seeing that error. Feel free to open a new issue for this and we can try to figure out what is going on for your pipelines.
Interestingly, I just re-enabled our cache tasks with `continueOnError: true`, and they initially all had a cache miss even though the full fingerprints were identical to the run from Friday morning. Subsequent runs are now either working as expected or failing with a warning but continuing as required.
Doesn’t look like it; the task version listed in the initialisation log (2.0.1) is the same as for older successful builds.
People are also reporting that builds with multiple cache tasks are seeing some tasks succeeding while one fails within a single build.
You could adjust your condition to skip the NuGet restore only if the output variable is `true`; this will also help if you have a partial hit (`inexact`), which could restore a potentially out-of-date cache (see the sketch below).
@pynej We have a NuGet restore task that kicks in only when the cache task misses. Wouldn’t `continueOnError: true` cause the package restore to be skipped, which might result in build failures if a package is missing?
Everything so far appears to point to an issue with the Cache task talking to the underlying storage; possibly concerning for the telemetry, if I’m reading the stack traces correctly.
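A minimal sketch of that adjusted condition with an illustrative NuGet cache step (the key, path, and variable names here are hypothetical, not taken from the pipeline above):

```yaml
variables:
  NUGET_PACKAGES: $(Pipeline.Workspace)/.nuget/packages  # hypothetical package cache location

steps:
  - task: Cache@2
    displayName: cache NuGet packages
    continueOnError: true
    inputs:
      key: 'nuget | "$(Agent.OS)" | **/packages.lock.json'
      restoreKeys: |
        nuget | "$(Agent.OS)"
      path: $(NUGET_PACKAGES)
      cacheHitVar: NUGET_CACHE_RESTORED
  # restore unless the cache was an exact hit: this covers a plain miss, a failed
  # cache step, and an 'inexact' restore-key hit that may hold out-of-date packages
  - task: DotNetCoreCLI@2
    displayName: dotnet restore
    condition: ne(variables.NUGET_CACHE_RESTORED, 'true')
    inputs:
      command: restore
```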
The only reliable workarounds at this time, @TOuhrouche, appear to be either:
Of those, 1 is probably the safest.
The real issue then is understanding how we will know the problem has been fixed, so that we can re-enable the tasks again.
Same here.
I’ve logged a Sev A ticket with Microsoft. I suggest you do the same if you have a support plan for Azure.
Please disable the Cache task and proceed for now, if performance is not an issue.