bazel: Bazel clean --expunge or Bazel shutdown unable to kill stale bazel processes
Description of the problem:
`bazel build` or `bazel query` leaves a stale bazel process running even after the build/query has completed. This prevents future invocations of other bazel commands.
Bugs: what’s the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
In a Tekton task we run the following commands:
- `bazel query //...` (or a list of targets)
- Once the query has completed, we still see a bazel process and its child processes running:
jenkins 2064 1 47 04:49 ? 00:03:07 bazel(directory) -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/home/jenkins/.cache/bazel/_bazel_jenkins/41b4626fb6512837d24f630cb1632ba8 --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAMED -Xverify:none -Djava.util.logging.config.file=/home/jenkins/.cache/bazel/_bazel_jenkins/41b4626fb6512837d24f630cb1632ba8/javalog.properties -Dcom.google.devtools.build.lib.util.LogHandlerQuerier.class=com.google.devtools.build.lib.util.SimpleLogHandler$HandlerQuerier -XX:-MaxFDLimit -Djava.library.path=/home/jenkins/.cache/bazel/_bazel_jenkins/install/ba7765e6f39a679257358196b530585b/embedded_tools/jdk/lib/jli:/home/jenkins/.cache/bazel/_bazel_jenkins/install/ba7765e6f39a679257358196b530585b/embedded_tools/jdk/lib:/home/jenkins/.cache/bazel/_bazel_jenkins/install/ba7765e6f39a679257358196b530585b/embedded_tools/jdk/lib/server:/home/jenkins/.cache/bazel/_bazel_jenkins/install/ba7765e6f39a679257358196b530585b/ -Dfile.encoding=ISO-8859-1 -jar /home/jenkins/.cache/bazel/_bazel_jenkins/install/ba7765e6f39a679257358196b530585b/A-server.jar --max_idle_secs=10800 --noshutdown_on_low_sys_mem --connect_timeout_secs=120 --output_user_root=/home/jenkins/.cache/bazel/_bazel_jenkins --install_base=/home/jenkins/.cache/bazel/_bazel_jenkins/install/ba7765e6f39a679257358196b530585b --install_md5=ba7765e6f39a679257358196b530585b --output_base=/home/jenkins/.cache/bazel/_bazel_jenkins/41b4626fb6512837d24f630cb1632ba8 --workspace_directory=/home/jenkins/13518/directory --default_system_javabase=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.262.b10-0.el7_8.x86_64 --failure_detail_out=/home/jenkins/.cache/bazel/_bazel_jenkins/41b4626fb6512837d24f630cb1632ba8/failure_detail.rawproto --deep_execroot --expand_configs_in_place --idle_server_tasks --write_command_log --nowatchfs --nofatal_event_bus_exceptions --nowindows_enable_symlinks --client_debug=false --product_name=Bazel --noincompatible_enable_execution_transition 
--option_sources=connect_Utimeout_Usecs:/home/jenkins/13518/directory/.bazelrc:max_Uidle_Usecs:/home/jenkins/13518/directory/.bazelrc
jenkins 11863 2064 62 04:52 ? 00:02:13 /home/jenkins/.cache/bazel/_bazel_jenkins/41b4626fb6512837d24f630cb1632ba8/execroot/com_ibm_monorepo/external/remotejdk11_linux/bin/java -XX:+UseParallelOldGC -XX:-CompactStrings --add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.code=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.comp=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.main=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED --add-opens=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED --patch-module=java.compiler=external/remote_java_tools_linux/java_tools/java_compiler.jar --patch-module=jdk.compiler=external/remote_java_tools_linux/java_tools/jdk_compiler.jar --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAMED -jar external/remote_java_tools_linux/java_tools/JavaBuilder_deploy.jar --persistent_worker
jenkins 11866 2064 51 04:52 ? 00:01:50 /home/jenkins/.cache/bazel/_bazel_jenkins/41b4626fb6512837d24f630cb1632ba8/execroot/com_ibm_monorepo/external/remotejdk11_linux/bin/java -XX:+UseParallelOldGC -XX:-CompactStrings --add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.code=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.comp=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.main=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED --add-opens=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED --patch-module=java.compiler=external/remote_java_tools_linux/java_tools/java_compiler.jar --patch-module=jdk.compiler=external/remote_java_tools_linux/java_tools/jdk_compiler.jar --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAMED -jar external/remote_java_tools_linux/java_tools/JavaBuilder_deploy.jar --persistent_worker
jenkins 11877 2064 66 04:52 ? 00:02:21 /home/jenkins/.cache/bazel/_bazel_jenkins/41b4626fb6512837d24f630cb1632ba8/execroot/com_ibm_monorepo/external/remotejdk11_linux/bin/java -XX:+UseParallelOldGC -XX:-CompactStrings --add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.code=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.comp=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.main=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED --add-opens=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED --patch-module=java.compiler=external/remote_java_tools_linux/java_tools/java_compiler.jar --patch-module=jdk.compiler=external/remote_java_tools_linux/java_tools/jdk_compiler.jar --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAMED -jar external/remote_java_tools_linux/java_tools/JavaBuilder_deploy.jar --persistent_worker
jenkins 11879 2064 59 04:52 ? 00:02:07 /home/jenkins/.cache/bazel/_bazel_jenkins/41b4626fb6512837d24f630cb1632ba8/execroot/com_ibm_monorepo/external/remotejdk11_linux/bin/java -XX:+UseParallelOldGC -XX:-CompactStrings --add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.code=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.comp=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.main=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED --add-opens=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED --patch-module=java.compiler=external/remote_java_tools_linux/java_tools/java_compiler.jar --patch-module=jdk.compiler=external/remote_java_tools_linux/java_tools/jdk_compiler.jar --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAMED -jar external/remote_java_tools_linux/java_tools/JavaBuilder_deploy.jar --persistent_worker
jenkins 16288 1993 0 04:56 ? 00:00:00 grep bazel
We are unable to stop these processes. As per this, we added a `bazel shutdown` step, but it didn’t shut down any of them. We got this error:
WARNING: Waiting for server process to terminate (waited 5 seconds, waiting at most 60)
WARNING: Waiting for server process to terminate (waited 10 seconds, waiting at most 60)
WARNING: Waiting for server process to terminate (waited 30 seconds, waiting at most 60)
INFO: Waited 60 seconds for server process (pid=2064) to terminate.
WARNING: Waiting for server process to terminate (waited 5 seconds, waiting at most 10)
WARNING: Waiting for server process to terminate (waited 10 seconds, waiting at most 10)
INFO: Waited 10 seconds for server process (pid=2064) to terminate.
FATAL: Attempted to kill stale server process (pid=2064) using SIGKILL, but it did not die in a timely fashion.
`bazel clean --expunge` also shows the same error.
What operating system are you running Bazel on?
Redhat 7.9 Docker container running in K8S pod (as a Tekton task)
What’s the output of `bazel info release`?
Extracting Bazel installation...
Starting local Bazel server and connecting to it...
release 3.2.0
If `bazel info release` returns “development version” or “(@non-git)”, tell us how you built Bazel.
NA
What’s the output of `git remote get-url origin ; git rev-parse master ; git rev-parse HEAD`?
Is it required?
Have you found anything relevant by searching the web?
Followed this thread and included the `bazel shutdown` command, but it didn’t stop the existing bazel processes.
Any other information, logs, or outputs that you want to share?
Will share further if required.
About this issue
- Original URL
- State: open
- Created 3 years ago
- Reactions: 3
- Comments: 21 (6 by maintainers)
Commits related to this issue
- kas-container: Start init service inside container This helps reaping zombies if processes do not perform proper cleanups. Known to stumble is bazel so far, see https://github.com/bazelbuild/bazel/is... — committed to siemens/kas by jan-kiszka 2 years ago
- Try passing `--init` to Docker. https://github.com/bazelbuild/bazel/issues/13823#issuecomment-1247177037 suggests other possibilities, but this should hopefully suffice for now. Change-Id: I145ab2ff... — committed to google/re2 by junyer 2 years ago
Just hit the same problem in our CI/CD pipeline. The problem was, yes, the lack of an init process / child reaper.
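One way to check whether a stuck server PID is in this situation is to read its state from `/proc/<pid>/stat` on Linux, where `Z` marks a zombie that has died but has not been reaped by its parent. The following is a minimal sketch, not part of Bazel; the helper name `process_state` is hypothetical and it assumes a Linux `/proc` filesystem is mounted in the container.

```python
from pathlib import Path
from typing import Optional


def process_state(pid: int) -> Optional[str]:
    """Return the one-letter state from /proc/<pid>/stat, or None if the PID is gone."""
    try:
        stat = Path(f"/proc/{pid}/stat").read_text()
    except (FileNotFoundError, ProcessLookupError):
        # No /proc entry: the process has been reaped (or never existed).
        return None
    # Layout is "pid (comm) state ..."; comm may itself contain spaces or ')',
    # so split on the last ')' before taking the state field.
    return stat.rsplit(")", 1)[1].split()[0]
```

Calling `process_state(2064)` with the PID from the `bazel shutdown` error message and getting `"Z"` back would point at a missing child reaper rather than at a server that is ignoring signals.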
What happens:
- `bazel shutdown`, or any bazel command that requires killing/restarting the bazel daemon, will use `kill($serverPid)` to terminate the server.
- If `PID 1` is not a process that will reap children (e.g., `waitpid` for any child that dies), the bazel daemon with `$serverPid` will remain as a zombie once killed. From the OS point of view, the process with `$serverPid` will keep existing, both as a PID and as an entry at `/proc/$serverPid`, until a parent `waitpid`s on it.
- In `src/main/cpp/blaze_util_posix.cc`, the bazel command trying to kill the bazel server keeps sending `kill -TERM $serverPid` or `kill -9 $serverPid` until … the pid goes away from `/proc/$serverPid` or until `kill($serverPid, 0)` returns an error (depending on platform).
Solution/fix: in your container, use an `entrypoint` that does child reaping. E.g., have PID 1 be `/bin/docker-init`, `/sbin/init`, or custom code. Alternatively, run something in the container that does child reaping via `PR_SET_CHILD_SUBREAPER`, like `/bin/docker-init -s`.
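The mechanism described above can be reproduced outside Bazel. The sketch below is an illustration only, not Bazel's code: it forks a child as a stand-in for the server, kills it with SIGKILL, and probes it with signal 0 before and after the parent reaps it. The names `pid_exists` and `demo` are hypothetical, and the example assumes a POSIX system where `os.fork` and `os.waitid` are available.

```python
import os
import signal


def pid_exists(pid):
    """Probe a PID with signal 0, the same kind of check as kill(pid, 0)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # ESRCH: the PID is no longer in the process table
    except PermissionError:
        return True   # the PID exists but belongs to another user
    return True


def demo():
    """Return (exists_before_reap, exists_after_reap) for a killed child."""
    pid = os.fork()
    if pid == 0:
        signal.pause()  # child: stand-in for the server, waits to be killed
        os._exit(0)
    os.kill(pid, signal.SIGKILL)
    # Wait until the child is dead, but leave it unreaped (WNOWAIT),
    # which is the state a container without a reaping PID 1 gets stuck in.
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    before = pid_exists(pid)  # zombie: the probe still succeeds
    os.waitpid(pid, 0)        # reap, as an init process would
    after = pid_exists(pid)
    return before, after


if __name__ == "__main__":
    print(demo())
```

On Linux the probe should succeed while the child is an unreaped zombie and fail once `waitpid` has collected it, which is why the kill loop never sees the server disappear when nothing in the container reaps children. When the container is started with Docker directly, the `--init` flag mentioned in the google/re2 commit above is one way to get a reaping PID 1.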