quarkus-operator-sdk: LeaderElection - error while releasing lock makes integration tests fail
I am using failsafe to run the integration tests. They worked well before the 5.X update. Now the tests still pass, but failsafe itself crashes with:
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 22.271 s - in com.sicpa.ptf.extdboperator.ExternalDatabaseReconcilerIT
[INFO] Running com.sicpa.ptf.extdboperator.database.oracle.OracleDbActionIT
[WARNING] Tests run: 2, Failures: 0, Errors: 0, Skipped: 2, Time elapsed: 0 s - in com.sicpa.ptf.extdboperator.database.oracle.OracleDbActionIT
[INFO] Running com.sicpa.ptf.extdboperator.database.postgres.PostgresDbActionIT
[WARNING] Tests run: 4, Failures: 0, Errors: 0, Skipped: 4, Time elapsed: 0 s - in com.sicpa.ptf.extdboperator.database.postgres.PostgresDbActionIT
2023-02-07 08:09:24,538 INFO [io.qua.ope.run.AppEventListener] (main) Quarkus Java Operator SDK extension is shutting down.
2023-02-07 08:09:24,538 INFO [io.jav.ope.Operator] (main) Operator SDK 4.2.4 is shutting down...
2023-02-07 08:09:24,556 ERROR [io.fab.kub.cli.ext.lea.LeaderElector] (main) Exception occurred while releasing lock 'LeaseLock: default - external-db-operator (1e8ebe06-49eb-4c84-92ee-ee8609e942a1)' [Error Occurred After Shutdown]: io.fabric8.kubernetes.client.extended.leaderelection.resourcelock.LockException: Unable to update LeaseLock
at io.fabric8.kubernetes.client.extended.leaderelection.resourcelock.LeaseLock.update(LeaseLock.java:102)
at io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector.release(LeaderElector.java:139)
at io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector.stopLeading(LeaderElector.java:120)
at io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector.lambda$null$1(LeaderElector.java:94)
...
2023-02-07 08:09:24,557 INFO [io.jav.ope.LeaderElectionManager] (main) Stopped leading for identity: 1e8ebe06-49eb-4c84-92ee-ee8609e942a1. Exiting.
[DEBUG] Closing the fork 1 after not saying Good Bye.
...
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
...
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-failsafe-plugin:3.0.0-M8:verify (default) on project
[ERROR] org.apache.maven.surefire.booter.SurefireBooterForkException: The forked VM terminated without properly saying goodbye. VM crash or System.exit called?
...
[ERROR] Error occurred in starting fork, check output in log
[ERROR] Process Exit Code: 1
[ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.fork(ForkStarter.java:714)
[ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:311)
[ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:268)
[ERROR] at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:1311)
[ERROR] at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:1144)
[ERROR] at org.apache.maven.plugin.surefire.AbstractSurefireMojo.execute(AbstractSurefireMojo.java:910)
[ERROR] at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:137)
[ERROR] at org.apache.maven.lifecycle.internal.MojoExecutor.doExecute2(MojoExecutor.java:370)
[ERROR] at org.apache.maven.lifecycle.internal.MojoExecutor.doExecute(MojoExecutor.java:351)
[ERROR] at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:215)
[ERROR] at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:171)
[ERROR] at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:163)
[ERROR] at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:117)
[ERROR] at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:81)
[ERROR] at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:56)
[ERROR] at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
[ERROR] at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:294)
[ERROR] at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:192)
[ERROR] at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:105)
[ERROR] at org.apache.maven.cli.MavenCli.execute(MavenCli.java:960)
[ERROR] at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:293)
[ERROR] at org.apache.maven.cli.MavenCli.main(MavenCli.java:196)
[ERROR] at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[ERROR] at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[ERROR] at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[ERROR] at java.base/java.lang.reflect.Method.invoke(Method.java:566)
[ERROR] at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:282)
[ERROR] at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:225)
[ERROR] at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:406)
[ERROR] at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:347)
I guess it is somewhat related to https://github.com/quarkiverse/quarkus-operator-sdk/issues/450, but I cannot use the same solution (disable when in dev mode), since I am not running tests in dev mode. There is no way to turn off leader election either, except by removing the feature altogether.
For info, my integration tests are annotated with @QuarkusTest, I set the quarkus.operator-sdk.start-operator=false property for tests only, and I have the following code to manually start the operator:
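import javax.inject.Inject;

import org.junit.jupiter.api.BeforeEach;

import io.fabric8.kubernetes.client.KubernetesClient;
import io.javaoperatorsdk.operator.Operator;
import io.quarkus.test.junit.QuarkusTest;

@QuarkusTest
class SomeTestIT {
    @Inject
    KubernetesClient client;

    @Inject
    Operator operator;

    // Shared across test instances so the operator is started only once.
    private static boolean isOperatorStarted = false;

    @BeforeEach
    public void startOperator() throws ClassNotFoundException {
        if (!isOperatorStarted) {
            operator.start();
            isOperatorStarted = true;
        }
        // ...
    }
}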
Versions:
- quarkus-operator-sdk: 5.0.4
- quarkus: 2.15.3.Final
- failsafe / surefire: 3.0.0-M8
About this issue
- State: closed
- Created a year ago
- Comments: 18 (10 by maintainers)
Commits related to this issue
- feat: make it possible to deactivate leader election Fixes #505 — committed to quarkiverse/quarkus-operator-sdk by metacosm a year ago
- feat: make it possible to only enable leader election for some profiles Fixes #505 — committed to quarkiverse/quarkus-operator-sdk by metacosm a year ago
- feat: make it possible to only enable leader election for some profiles (#506) Fixes #505 — committed to quarkiverse/quarkus-operator-sdk by metacosm a year ago
failsafe launches a JVM fork to run the tests and waits for it to exit properly. The whole process is explained here: https://maven.apache.org/surefire/maven-failsafe-plugin/examples/shutdown.html but from what I understood (and tested), if a JVM fork stops by itself (as opposed to being stopped by failsafe), it is interpreted as a failure.
The fact that the operator stops with a System.exit(1) will thus always make failsafe fail.
Concerning:
Not exactly: what I want to avoid is for the operator to connect to a cluster where a released version of the operator already runs. In this case, having leader election in tests as well avoids problems: the operator would hang because it cannot get the lease, the test would fail, and the developer would understand the mistake. This is why I am not fond of turning leader election off in tests. That being said, I could also check manually in a beforeAll hook whether a lease exists, instead of relying on leader election. This is a possible workaround: not ideal, but it could work.
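To make the workaround above more concrete, here is a minimal sketch of the decision such a beforeAll hook would have to take. LeaseSnapshot and LeaseGuard are hypothetical names introduced only for illustration; they are not fabric8 or operator SDK types, and fetching the actual Lease from the cluster is deliberately left out. The sketch assumes a lease counts as "held" when it names another identity and has not yet expired.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Optional;

/** Hypothetical, simplified view of a Kubernetes Lease (not a fabric8 type). */
final class LeaseSnapshot {
    final String holderIdentity;   // null or empty when the lease was released
    final Instant renewTime;       // last time the holder renewed the lease
    final Duration leaseDuration;  // how long a renewal stays valid

    LeaseSnapshot(String holderIdentity, Instant renewTime, Duration leaseDuration) {
        this.holderIdentity = holderIdentity;
        this.renewTime = renewTime;
        this.leaseDuration = leaseDuration;
    }
}

/** Hypothetical helper: decides whether the tests should refuse to start. */
final class LeaseGuard {
    private LeaseGuard() {
    }

    static boolean isHeldByAnother(Optional<LeaseSnapshot> lease, String ownIdentity, Instant now) {
        if (!lease.isPresent()) {
            return false; // no lease object at all: nobody is leading
        }
        LeaseSnapshot snapshot = lease.get();
        String holder = snapshot.holderIdentity;
        if (holder == null || holder.isEmpty() || holder.equals(ownIdentity)) {
            return false; // released, or held by this very test run
        }
        // Assumption: a lease is only meaningful until renewTime + leaseDuration.
        Instant expiry = snapshot.renewTime.plus(snapshot.leaseDuration);
        return now.isBefore(expiry);
    }
}
```

In a beforeAll hook, the snapshot would be built from the lease read from the cluster (in the log above, the lock is named external-db-operator in the default namespace), and the test run would be aborted when isHeldByAnother returns true.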
A property to disable leader election (that can be set to true in the test profile) would indeed solve this, and potentially other problems. I wouldn’t try to detect automatically if we are in tests though.