xgboost: [jvm-packages] XGBoost4J-Spark 2.0.0-RC1 fails for Spark 3.4.0 on EMR
Hello everybody,
I am trying to use XGBoost4J-Spark in a Scala project. Everything works fine locally (on an Intel MacBook), but when I deploy to EMR (EMR 6.12.0, Spark 3.4.0, Scala 2.12.17), I get the following error:
java.lang.NoClassDefFoundError: Could not initialize class ml.dmlc.xgboost4j.java.XGBoostJNI
For my build.sbt I added the following lines to libraryDependencies, as suggested by the tutorial (running with sbt 1.9.2):
"ml.dmlc" %% "xgboost4j" % "1.7.6",
"ml.dmlc" %% "xgboost4j-spark" % "1.7.6"
I packaged everything into a single JAR with the sbt-assembly plugin. As far as I understand, this should bundle all the dependencies needed to run the Spark application on EMR, so I am out of ideas about this error. I am not sure whether this is a mistake on my end or an actual bug. Help is appreciated!
About this issue
- State: open
- Created 10 months ago
- Comments: 15 (3 by maintainers)
XGBoost 1.7.6 targets Spark 3.0.1, not 3.4.0 (see the Spark version pinned in its pom.xml): https://github.com/dmlc/xgboost/blob/36eb41c960483c8b52b44082663c99e6a0de440a/jvm-packages/pom.xml#L37
To run on Spark 3.4.0, use XGBoost 2.0.0, which is built against that Spark version: https://github.com/dmlc/xgboost/blob/4301558a5711e63bbf004d2b6fca003906fb743c/jvm-packages/pom.xml#L38
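For reference, the version bump in build.sbt might look roughly like the sketch below. It assumes the 2.0.0 artifacts are published for your Scala version under the same coordinates as 1.7.6. The Spark entries marked Provided are also an assumption (they are not in the original build.sbt); they reflect the common setup where the cluster supplies Spark and it is kept out of the assembly JAR.

```scala
// build.sbt (sketch, not a verified configuration)
scalaVersion := "2.12.17"

libraryDependencies ++= Seq(
  // Bumped from 1.7.6 so the XGBoost build matches Spark 3.4.0
  "ml.dmlc" %% "xgboost4j"       % "2.0.0",
  "ml.dmlc" %% "xgboost4j-spark" % "2.0.0",
  // Assumption: Spark comes from the EMR cluster, so it is excluded from the fat JAR
  "org.apache.spark" %% "spark-sql"   % "3.4.0" % Provided,
  "org.apache.spark" %% "spark-mllib" % "3.4.0" % Provided
)
```

After changing the versions, rebuild the assembly JAR and redeploy it, so that the old 1.7.6 classes are no longer on the classpath.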