maven-mvnd: `Could not acquire write lock for…` errors
I just upgraded mvnd to 1.0-m6-m39 via Snap from 0.9.0 (using Java 11) and ran a build of my reactor with 27 modules as usual (max parallelism 11). It spat out some errors such as
Could not acquire write lock for '~/.m2/repository/.locks/com.github.spotbugs~spotbugs-annotations~4.7.3.lock'
Could not acquire write lock for '~/.m2/repository/.locks/commons-collections~commons-collections~3.2.2.lock'
and the build failed partway through
Failed to execute goal org.apache.maven.plugins:maven-enforcer-plugin:3.0.0:enforce (enforce-bytecode-version) on project …: Execution enforce-bytecode-version of goal org.apache.maven.plugins:maven-enforcer-plugin:3.0.0:enforce failed: Could not acquire write lock for '~/.m2/repository/.locks/groupId~artifactId~1.234.lock' -> [Help 1]
Failed to execute goal some:plugin:1.2:some-mojo (default-some-mojo) on project …: Execution default-some-mojo of goal some:plugin:1.2:some-mojo failed: Could not acquire write lock for '~/.m2/repository/.locks/groupId2~artifactId2~5.678.lock' -> [Help 1]
Could not acquire write lock for '~/.m2/repository/.locks/groupId3~artifactId3~9.012.lock'
When I ran the build again, it passed, so I presume this was some sort of race condition, possibly involving artifact downloads (I had pulled in various POM updates since the last local build of the project).
No further details were provided, and I was not using -e so there was no stack trace giving context. I checked ~/.m2/mvnd/registry/1.0-m6/daemon-*.log which did not really add any more information:
Dispatch message: ExecutionFailure{projectId='…', halted=true, exception='java.lang.IllegalStateException: Could not acquire write lock for '~/.m2/repository/.locks/com.github.spotbugs~spotbugs-annotations~4.7.3.lock''}
If nothing else, the Throwable.toString of the cause ought to be included in the top-level error message I think.
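To make the suggestion concrete, here is a minimal Java sketch of what folding each cause's `toString()` into a one-line message could look like. The class `CauseSummary`, the method `summarize`, and the nested `TimeoutException` in `main` are all hypothetical and invented for illustration; this is not how Maven or mvnd formats its errors.

```java
import java.util.concurrent.TimeoutException;

public class CauseSummary {
    /** Joins the toString() of a throwable and of each of its causes into one line. */
    static String summarize(Throwable t) {
        StringBuilder sb = new StringBuilder();
        int depth = 0;
        // The depth cap guards against pathological (cyclic) cause chains.
        for (Throwable cur = t; cur != null && depth < 10; cur = cur.getCause(), depth++) {
            if (depth > 0) {
                sb.append(" <- caused by: ");
            }
            sb.append(cur); // uses Throwable.toString(): class name plus message
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical nesting: the real failure may carry no cause at all.
        Throwable cause = new TimeoutException("gave up waiting for lock");
        Throwable top = new IllegalStateException("Could not acquire write lock", cause);
        System.out.println(summarize(top));
    }
}
```

If the top-level exception has no cause, `summarize` simply returns its own `toString()`.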
About this issue
- State: open
- Created a year ago
- Reactions: 1
- Comments: 23 (10 by maintainers)
@cstamas I agree with moving forward but I hope you understand that such transformations often have a cost that has to be prioritized, particularly when the relevant systems are used by multiple teams on multiple projects to eliminate disruption of any operations.
That said, I do recall trying to run some of our builds on Maven 3.9.x as an experiment in the past, but I cannot remember the exact errors off the top of my head.
However, as I said, I plan for this transformation to happen before the end of this year, as part of our yearly technical debt handling effort, at which point I will know the exact problem and will report it if it still occurs.
@eliasbalasis Distros delivered. Please check them out from the dist dev space.
Planning to package today.
I agree with this assessment, and it is perfectly clear: mvnd merely suffers from this bug (as it puts the most pressure on locking), which actually lies deep in the resolver.
Locking in Maven aims to solve two things:
For brevity we call them MT and MP.
`-T` with or without “smart builder”/mvnd and HOW locking is set up (especially for which scenario, MT/MP) is left to the user. Some examples:
- `rwlock-local`, as the name says, is (JVM) local, so it handles only MT builds
- `file-lock` can handle MP cases (on a single host, or on a properly set up NFS volume)

By the way, for non-mvnd users I’d personally recommend NOT using the “vanilla” `-T` builder (that comes with Maven), but rather the superior Takari Smart Builder (see README). This will make Maven parallel builds behave similarly to mvnd (which also uses Smart Builder), sans the persistent daemon and logging feature.

I would really like to see a reproducer project that produces this kind of error, but let me explain what happens here…
By default, the parameter `aether.syncContext.named.time` (30 seconds) is the amount of time a synchronization context will wait to obtain a needed lock. In the latest Maven/mvnd releases the message has been improved to show this clearly (as can be seen in the comments above, but not in the original issue description: “Could not acquire write lock for $lockName in 30 SECONDS”). So there is no cause per se; the cause IS the timeout.

Sadly, `SyncContext` was defined way before NamedLocks (which are actually used as the “low level fine grained locking implementation”), and I “inherited” it. The major problem with this API is its “coarseness”. As can be seen, it “grabs” all artifacts, so even if just one overlapping artifact is requested by another context, mutual exclusion (in the case of an exclusive lock) is inevitable.

My current assumption/focus is “hot artifacts”: artifacts that are highly referenced (for example, slf4j-api in a project, or some “api” module that has a zillion downstream plugin modules building against it; usually “star shaped” projects, at least when looking at dependencies). Due to the `SyncContext` behavior above, module dependencies that should be resolved in parallel could end up being resolved serially, and, especially on larger projects, the “loser” threads end up congested, waiting more than 30 seconds, which causes the timeout.

The current (bad) workaround for this is raising the time limit using `-Daether.syncContext.named.time=40` (default time unit is seconds) and “experimenting”, as this value theoretically depends on project layout and size. But it is bad: you want fast builds, and this actually slows them down even more.

An example of a perf test against Apache Camel is here: https://cstamas.github.io/camel-perftest/ Sadly, I could not make it produce these errors.
So any (hopefully OSS) “reproducer project” is welcome, to sort out last bits of Maven locking. TIA!
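The “hot artifact” hypothesis described above can be illustrated with a small, self-contained Java sketch. It is not the resolver's implementation: the class `CoarseLockSketch`, its methods, the lock keys, and the millisecond timeouts are all hypothetical, and plain `ReentrantReadWriteLock`s stand in for named locks. Two contexts each ask for exclusive locks on a set of keys that overlap in a single shared artifact; the second one either waits its turn or gives up once the timeout expires.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class CoarseLockSketch {
    // One read/write lock per key (hypothetical stand-in for per-artifact named locks).
    private static final Map<String, ReentrantReadWriteLock> LOCKS = new ConcurrentHashMap<>();

    /** Takes write locks on ALL keys within one overall timeout, or throws. */
    static List<Lock> acquireAll(Collection<String> keys, long timeoutMillis) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        List<Lock> held = new ArrayList<>();
        for (String key : new TreeSet<>(keys)) { // sorted order avoids deadlock between contexts
            Lock lock = LOCKS.computeIfAbsent(key, k -> new ReentrantReadWriteLock()).writeLock();
            long remainingNanos = deadline - System.nanoTime();
            if (!lock.tryLock(remainingNanos, TimeUnit.NANOSECONDS)) {
                releaseAll(held); // give back what was already taken
                throw new IllegalStateException(
                        "Could not acquire write lock for '" + key + "' in " + timeoutMillis + " ms");
            }
            held.add(lock);
        }
        return held;
    }

    static void releaseAll(List<Lock> held) {
        for (int i = held.size() - 1; i >= 0; i--) {
            held.get(i).unlock();
        }
    }

    /** Returns the "loser" context's failure message, or null if it got its locks in time. */
    static String runDemo(long holdMillis, long timeoutMillis) throws InterruptedException {
        String hot = "org.slf4j~slf4j-api~1.0"; // illustrative key shared by both contexts
        CountDownLatch winnerHoldsLocks = new CountDownLatch(1);
        Thread winner = new Thread(() -> {
            try {
                List<Lock> held = acquireAll(List.of(hot, "only-needed-by-module-a"), timeoutMillis);
                winnerHoldsLocks.countDown();
                try {
                    Thread.sleep(holdMillis); // simulate slow resolution while holding everything
                } finally {
                    releaseAll(held);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        winner.start();
        winnerHoldsLocks.await();
        try {
            // Overlaps with the winner in just ONE key, yet must wait for the whole set.
            releaseAll(acquireAll(List.of(hot, "only-needed-by-module-b"), timeoutMillis));
            return null;
        } catch (IllegalStateException e) {
            return e.getMessage();
        } finally {
            winner.join();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("short hold, long timeout: " + runDemo(50, 2000));
        System.out.println("long hold, short timeout: " + runDemo(400, 50));
    }
}
```

In this sketch, raising the timeout makes the failure go away only because the second context waits longer, which mirrors why the workaround above trades errors for slower builds.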
I was now able to reproduce the lock errors with plain Maven as well, without mvnd, so I created MNG-7868. IMHO this issue is kind of invalid, as it is a bug in Maven itself and not in mvnd.