hail: hail v0.2.124 on AWS - Java error
What happened?
Follow-up on #13445 - I almost succeeded in installing Hail on AWS, but I still have an environment issue:
- I am trying to install Hail v0.2.124
- on AWS EMR v6.9.1 (the latest version with Spark 3.3.0, as suggested in the Hail docs)
- I upgraded to Python 3.9.18
$ python --version
Python 3.9.18
- I activated Java 11.0.20.1
$ java -version
openjdk version "11.0.20.1" 2023-08-22 LTS
OpenJDK Runtime Environment Corretto-11.0.20.9.1 (build 11.0.20.1+9-LTS)
OpenJDK 64-Bit Server VM Corretto-11.0.20.9.1 (build 11.0.20.1+9-LTS, mixed mode)
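(For reference: on EMR's Amazon Linux, switching the active JDK is usually done through alternatives. The commands below are illustrative, not taken from the report, and the Corretto 11 path is an assumption:)
$ sudo alternatives --config java   # interactively select the Corretto 11 entry
$ export JAVA_HOME=/usr/lib/jvm/java-11-amazon-corretto.x86_64   # assumed Corretto 11 install path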
- I cloned Hail
$ cd /tmp
$ git clone --branch 0.2.124 --depth 1 https://github.com/broadinstitute/hail.git
- I built Hail
$ cd hail/hail/
$ make install-on-cluster HAIL_COMPILE_NATIVES=1 SCALA_VERSION=2.12.15 SPARK_VERSION=3.3.0
[...]
Successfully installed hail-0.2.124
$ hailctl config set query/backend spark
- At this point, Hail seems correctly installed
$ pip show hail
Name: hail
Version: 0.2.124
Summary: Scalable library for exploring and analyzing genomic data.
Home-page: https://hail.is
Author: Hail Team
Author-email: hail@broadinstitute.org
License: UNKNOWN
Location: /home/hadoop/.local/lib/python3.9/site-packages
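(A one-line smoke test can confirm the install actually runs, beyond what pip metadata shows. hl.init() and hl.utils.range_table are standard Hail calls; the command is illustrative, not from the report, and assumes Spark can already find the Hail JAR — see the spark-defaults sketch further down:)
$ python -c 'import hail as hl; hl.init(); print(hl.utils.range_table(10).count())'   # expect 10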
- For the sake of configuration, I created a symlink to the Hail backend
$ sudo ln -sf /home/hadoop/.local/lib/python3.9/site-packages/hail/backend /opt/hail/backend
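(A quick sanity check, illustrative only: the link should resolve to the site-packages path used above:)
$ readlink -f /opt/hail/backend   # expect /home/hadoop/.local/lib/python3.9/site-packages/hail/backend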
- Confident in the installation, I tried to run the Spark shell
$ spark-shell
[...]
Exception in thread "main" java.lang.NoSuchMethodError: 'scala.reflect.internal.settings.MutableSettings
I am out of ideas on how to solve this. Thanks
Version
0.2.124
Relevant log output
$ spark-shell
SLF4J: No SLF4J providers were found.
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See https://www.slf4j.org/codes.html#noProviders for further details.
SLF4J: Class path contains SLF4J bindings targeting slf4j-api versions 1.7.x or earlier.
SLF4J: Ignoring binding found at [jar:file:/usr/lib/spark/jars/log4j-slf4j-impl-2.17.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See https://www.slf4j.org/codes.html#ignoredBindings for an explanation.
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Exception in thread "main" java.lang.NoSuchMethodError: 'scala.reflect.internal.settings.MutableSettings scala.reflect.internal.settings.MutableSettings$.SettingsOps(scala.reflect.internal.settings.MutableSettings)'
at scala.tools.nsc.interpreter.ILoop.$anonfun$chooseReader$1(ILoop.scala:914)
at scala.tools.nsc.interpreter.ILoop.mkReader$1(ILoop.scala:920)
at scala.tools.nsc.interpreter.ILoop.$anonfun$chooseReader$4(ILoop.scala:926)
at scala.tools.nsc.interpreter.ILoop.$anonfun$chooseReader$3(ILoop.scala:926)
at scala.tools.nsc.interpreter.ILoop.chooseReader(ILoop.scala:926)
at org.apache.spark.repl.SparkILoop.$anonfun$process$1(SparkILoop.scala:138)
at scala.Option.fold(Option.scala:251)
at org.apache.spark.repl.SparkILoop.newReader$1(SparkILoop.scala:138)
at org.apache.spark.repl.SparkILoop.preLoop$1(SparkILoop.scala:142)
at org.apache.spark.repl.SparkILoop.$anonfun$process$10(SparkILoop.scala:203)
at org.apache.spark.repl.SparkILoop.withSuppressedSettings$1(SparkILoop.scala:189)
at org.apache.spark.repl.SparkILoop.startup$1(SparkILoop.scala:201)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:236)
at org.apache.spark.repl.Main$.doMain(Main.scala:78)
at org.apache.spark.repl.Main$.main(Main.scala:58)
at org.apache.spark.repl.Main.main(Main.scala)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Commits related to this issue
- [gradle] exclude scala-reflect CHANGELOG: Fixes #13837 in which Hail could break a Spark installation if the Hail JAR appears on the classpath before the Scala JARs. Several dependencies of ours *an... — committed to danking/hail by danking 8 months ago
- [gradle] exclude scala-reflect (#13894) CHANGELOG: Fixes #13837 in which Hail could break a Spark installation if the Hail JAR appears on the classpath before the Scala JARs. We and several dependen... — committed to hail-is/hail by danking 8 months ago
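(Per those commit messages, the root cause is that hail-all-spark.jar bundled scala-reflect classes, which can shadow Spark's own Scala JARs when classpath ordering puts the Hail JAR first. A quick, illustrative way to check whether a given JAR bundles them — the path is assumed from the symlink above:)
$ unzip -l /opt/hail/backend/hail-all-spark.jar | grep 'scala/reflect/' | head
(If class entries show up, a spark-shell launched with this JAR ahead of /usr/lib/spark/jars can fail with exactly the NoSuchMethodError above; after a build that excludes scala-reflect, the grep should return nothing.)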
@danking
Here is what I have done in my environment (AWS EMR):
- hailctl was not found at the installation step
- I edited build.gradle and added exclude group: 'org.scala-lang', module: 'scala-reflect'
- I set spark-defaults properties in order to link hail-all-spark.jar. This config was needed and worked successfully for an old version of Hail (0.2.60); it can be revisited if it is not appropriate for recent versions.
It seems to be working on the command line using pyspark!
I need to test on a Jupyter notebook now…
FYI the pyspark configs
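(The configs themselves are not included above. For reference, the Hail docs describe spark-defaults entries along these lines; the file path below is EMR's usual location and the JAR path is assumed from the symlink earlier — a sketch, not the poster's actual values:)
$ sudo tee -a /etc/spark/conf/spark-defaults.conf <<'EOF'
spark.jars /opt/hail/backend/hail-all-spark.jar
spark.driver.extraClassPath /opt/hail/backend/hail-all-spark.jar
spark.executor.extraClassPath ./hail-all-spark.jar
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator is.hail.kryo.HailKryoRegistrator
EOF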