runtime: [mono] Restore failure due to `The process cannot access the file x because it is being used by another process`

Description

When building runtime with a mono-flavored runtime, restore often fails in CI environments with `The process cannot access the file x because it is being used by another process`. This affects runtime as well as other builds (confirmed with roslyn); what makes it specific to runtime is that it does not occur with a coreclr-flavored runtime.

Reproduction Steps

In an Alpine Edge linux-musl-x64 environment, you can use this modified aport to reproduce the bug. Steps to reproduce:

git clone https://gitlab.alpinelinux.org/ayakael/aports -b dotnet7/mono-restore
cd aports/testing/dotnet7-stage0
abuild deps unpack prepare build

It should eventually fail.

The aport builds a minimal set of components (runtime-mono, roslyn, sdk, aspnetcore, installer), enough to produce an SDK tarball, then uses that tarball with its mono-flavored runtime to build the whole stack again. This aport is usually used to cross-build for other platforms, but here I am using it to reproduce the bug easily.

You’d likely be able to reproduce this on linux-x64 by building runtime with `/p:PrimaryRuntimeFlavor=Mono` and then trying to build runtime with the produced artifacts.

Expected behavior

Restore should complete without error.

Actual behavior

Restore fails with the following error:

/builds/ayakael/aports/testing/dotnet7-stage0/src/bootstrap/sdk/7.0.100-rtm.22519.39/NuGet.RestoreEx.targets(19,5): error : The process cannot access the file '/builds/ayakael/aports/testing/dotnet7-stage0/src/dotnet-d41bfecf5090e9163aa2da251246abed9e756e53/src/runtime/artifacts/obj/System.IO.Ports' because it is being used by another process. [/builds/ayakael/aports/testing/dotnet7-stage0/src/dotnet-d41bfecf5090e9163aa2da251246abed9e756e53/src/runtime/Build.proj]
/builds/ayakael/aports/testing/dotnet7-stage0/src/bootstrap/sdk/7.0.100-rtm.22519.39/NuGet.RestoreEx.targets(19,5): error :    at System.IO.FileSystem.CreateDirectory(String fullPath, UnixFileMode unixCreateMode) [/builds/ayakael/aports/testing/dotnet7-stage0/src/dotnet-d41bfecf5090e9163aa2da251246abed9e756e53/src/runtime/Build.proj]
/builds/ayakael/aports/testing/dotnet7-stage0/src/bootstrap/sdk/7.0.100-rtm.22519.39/NuGet.RestoreEx.targets(19,5): error :    at System.IO.FileSystem.CreateDirectory(String fullPath) [/builds/ayakael/aports/testing/dotnet7-stage0/src/dotnet-d41bfecf5090e9163aa2da251246abed9e756e53/src/runtime/Build.proj]
/builds/ayakael/aports/testing/dotnet7-stage0/src/bootstrap/sdk/7.0.100-rtm.22519.39/NuGet.RestoreEx.targets(19,5): error :    at System.IO.Directory.CreateDirectory(String path) [/builds/ayakael/aports/testing/dotnet7-stage0/src/dotnet-d41bfecf5090e9163aa2da251246abed9e756e53/src/runtime/Build.proj]
/builds/ayakael/aports/testing/dotnet7-stage0/src/bootstrap/sdk/7.0.100-rtm.22519.39/NuGet.RestoreEx.targets(19,5): error :    at NuGet.Commands.BuildAssetsUtils.WriteFiles(IEnumerable`1 files, ILogger log) [/builds/ayakael/aports/testing/dotnet7-stage0/src/dotnet-d41bfecf5090e9163aa2da251246abed9e756e53/src/runtime/Build.proj]
/builds/ayakael/aports/testing/dotnet7-stage0/src/bootstrap/sdk/7.0.100-rtm.22519.39/NuGet.RestoreEx.targets(19,5): error :    at NuGet.Commands.RestoreResult.CommitAssetsFileAsync(LockFileFormat lockFileFormat, IRestoreResult result, ILogger log, Boolean toolCommit, CancellationToken token) [/builds/ayakael/aports/testing/dotnet7-stage0/src/dotnet-d41bfecf5090e9163aa2da251246abed9e756e53/src/runtime/Build.proj]
/builds/ayakael/aports/testing/dotnet7-stage0/src/bootstrap/sdk/7.0.100-rtm.22519.39/NuGet.RestoreEx.targets(19,5): error :    at NuGet.Commands.RestoreResult.CommitAsync(ILogger log, CancellationToken token) [/builds/ayakael/aports/testing/dotnet7-stage0/src/dotnet-d41bfecf5090e9163aa2da251246abed9e756e53/src/runtime/Build.proj]
/builds/ayakael/aports/testing/dotnet7-stage0/src/bootstrap/sdk/7.0.100-rtm.22519.39/NuGet.RestoreEx.targets(19,5): error :    at NuGet.Commands.RestoreRunner.CommitAsync(RestoreResultPair restoreResult, CancellationToken token) [/builds/ayakael/aports/testing/dotnet7-stage0/src/dotnet-d41bfecf5090e9163aa2da251246abed9e756e53/src/runtime/Build.proj]
/builds/ayakael/aports/testing/dotnet7-stage0/src/bootstrap/sdk/7.0.100-rtm.22519.39/NuGet.RestoreEx.targets(19,5): error :    at NuGet.Commands.RestoreRunner.ExecuteAndCommitAsync(RestoreSummaryRequest summaryRequest, IRestoreProgressReporter progressReporter, CancellationToken token) [/builds/ayakael/aports/testing/dotnet7-stage0/src/dotnet-d41bfecf5090e9163aa2da251246abed9e756e53/src/runtime/Build.proj]
/builds/ayakael/aports/testing/dotnet7-stage0/src/bootstrap/sdk/7.0.100-rtm.22519.39/NuGet.RestoreEx.targets(19,5): error :    at NuGet.Commands.RestoreRunner.CompleteTaskAsync(List`1 restoreTasks) [/builds/ayakael/aports/testing/dotnet7-stage0/src/dotnet-d41bfecf5090e9163aa2da251246abed9e756e53/src/runtime/Build.proj]
/builds/ayakael/aports/testing/dotnet7-stage0/src/bootstrap/sdk/7.0.100-rtm.22519.39/NuGet.RestoreEx.targets(19,5): error :    at NuGet.Commands.RestoreRunner.RunAsync(IEnumerable`1 restoreRequests, RestoreArgs restoreArgs, CancellationToken token) [/builds/ayakael/aports/testing/dotnet7-stage0/src/dotnet-d41bfecf5090e9163aa2da251246abed9e756e53/src/runtime/Build.proj]
/builds/ayakael/aports/testing/dotnet7-stage0/src/bootstrap/sdk/7.0.100-rtm.22519.39/NuGet.RestoreEx.targets(19,5): error :    at NuGet.Commands.RestoreRunner.RunAsync(RestoreArgs restoreContext, CancellationToken token) [/builds/ayakael/aports/testing/dotnet7-stage0/src/dotnet-d41bfecf5090e9163aa2da251246abed9e756e53/src/runtime/Build.proj]
/builds/ayakael/aports/testing/dotnet7-stage0/src/bootstrap/sdk/7.0.100-rtm.22519.39/NuGet.RestoreEx.targets(19,5): error :    at NuGet.Build.Tasks.BuildTasksUtility.RestoreAsync(DependencyGraphSpec dependencyGraphSpec, Boolean interactive, Boolean recursive, Boolean noCache, Boolean ignoreFailedSources, Boolean disableParallel, Boolean force, Boolean forceEvaluate, Boolean hideWarningsAndErrors, Boolean restorePC, Boolean cleanupAssetsForUnsupportedProjects, ILogger log, CancellationToken cancellationToken) [/builds/ayakael/aports/testing/dotnet7-stage0/src/dotnet-d41bfecf5090e9163aa2da251246abed9e756e53/src/runtime/Build.proj]
/builds/ayakael/aports/testing/dotnet7-stage0/src/bootstrap/sdk/7.0.100-rtm.22519.39/NuGet.RestoreEx.targets(19,5): error :    at NuGet.Build.Tasks.Console.MSBuildStaticGraphRestore.RestoreAsync(String entryProjectFilePath, IDictionary`2 globalProperties, IReadOnlyDictionary`2 options) [/builds/ayakael/aports/testing/dotnet7-stage0/src/dotnet-d41bfecf5090e9163aa2da251246abed9e756e53/src/runtime/Build.proj]

Regression?

This error is confirmed not to occur with .NET runtime 6.0.10 on s390x, so it appears to be a regression.

Known Workarounds

None so far.

Configuration

Version: SDK 7.0.100-rtm.22519.39
OS: Alpine Linux Edge, Docker-based CI pipelines (also occurs, to a lesser degree, in an s390x VM)
Architecture: Confirmed reproducible on s390x, ppc64le and x64 (with mono-flavored runtime)
Specificity: Specific to mono on all platforms

Other information

Logs:

  • ppc64le with runtime: link
  • s390x with runtime: link
  • s390x with roslyn: link

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 68 (67 by maintainers)

Most upvoted comments

@ayakael you can apply this patch until we figure out what the real root cause is.

@directhex @akoeplinger @jkotas any thoughts on what could cause errno to change after the native method is called and before the native errno is read back?

                __retVal = __PInvoke(__path_native, mode);
                __lastError = System.Runtime.InteropServices.Marshal.GetLastSystemError();

Further investigation points to the bug being introduced somewhere around https://github.com/dotnet/runtime/pull/58799, which changes CreateDirectory in such a way as to make EAGAIN an unexpected error. The following patch seems to work around the issue:

That change is what surfaces the bug.

EAGAIN is not an expected errno from mkdir. That is also why handling it in the native pal has no effect.

As mentioned in another comment, it could be some issue with thread local storage causing us to pick up another thread’s errno.

Before https://github.com/dotnet/runtime/pull/58799, when the directory exists, that path did not read errno (it called stat which returns 0), while now it does (EEXIST from mkdir).
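
To make the before/after difference concrete, here is a minimal C sketch of the two approaches. It is not the runtime's actual PAL or managed code; the helper names `ensure_dir_stat_first` and `ensure_dir_mkdir_first` are hypothetical and exist only for illustration. The first helper returns early via `stat` when the directory exists and never looks at errno on that path. The second always calls `mkdir` and classifies the failure by errno, so it depends on errno still holding the value `mkdir` set (EEXIST for an existing path) at the moment it is read.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical "stat first" helper: if the directory already exists,
 * report success without ever consulting errno. */
int ensure_dir_stat_first(const char *path)
{
    struct stat st;
    assert(path != NULL);
    if (stat(path, &st) == 0 && S_ISDIR(st.st_mode))
        return 0;               /* existing directory: errno is never read */
    if (mkdir(path, 0755) == 0)
        return 0;
    return errno;               /* read immediately after the failing call */
}

/* Hypothetical "mkdir first" helper: always call mkdir and classify the
 * failure by errno. *seen receives the errno that was observed
 * (0 if mkdir succeeded). */
int ensure_dir_mkdir_first(const char *path, int *seen)
{
    struct stat st;
    assert(path != NULL && seen != NULL);
    *seen = 0;
    if (mkdir(path, 0755) == 0)
        return 0;
    *seen = errno;              /* capture before any other libc call */
    if (*seen == EEXIST && stat(path, &st) == 0 && S_ISDIR(st.st_mode))
        return 0;               /* already a directory: treat as success */
    return *seen;               /* anything else, e.g. a stray EAGAIN, is an error */
}
```

On a typical Linux system both helpers should return 0 for an existing directory such as `/tmp`, but only the second reads errno along the way. If errno held an unrelated value such as EAGAIN when it was read, only the second helper would report a failure, which matches the observation that the change surfaces the bug rather than causing it.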

I’ll try to reproduce the issue on x64 using the instructions from https://github.com/dotnet/runtime/issues/77364#issuecomment-1318812338 next week.