spark-on-k8s-operator: after upgrading, getting "jars does not exist, skipping" error

I upgraded the operator to the Spark 3.1.1 image (from 2.4.5). In my SparkApplication YAML I have this:

spec:
  deps:
    jars:
      - local:///opt/app/jars/*

Now it fails with:

DependencyUtils: Local jar /opt/app/jars/* does not exist, skipping.

I verified that the jars are inside the image at that location.

I changed the path to

- local:///opt/app/jars

I'm not getting that error any more, but now I get:

Error: Failed to load my.package.my.classApp: org/springframework/boot/CommandLineRunner

This class is provided here:

 mainApplicationFile: "local:///opt/app/jars/my.jar"
 mainClass: my.package.my.classApp

Most upvoted comments

The problem is that spark-submit 3.x handles local:// values for --jars differently than 2.4.x.

You can see the difference in the configmap created by spark-submit.

With 2.4.x, spark-submit --jars local:///my/directory/* results in a configmap containing:

spark.jars=/my/directory/*

Whereas with 3.x, this results in:

spark.jars=local:///my/directory/*

Glob expansion does not happen in the 3.x case, resulting in the “does not exist, skipping” log message.
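
One way around the missing glob expansion is to list each jar explicitly in deps.jars, since spark.jars takes a comma-separated list of files and no wildcard handling is then required. A minimal sketch (the jar names are illustrative; substitute the ones actually baked into your image):

  spec:
    deps:
      jars:
        # each entry is a single file, so nothing needs to be glob-expanded
        - local:///opt/app/jars/some-dependency-1.0.jar
        - local:///opt/app/jars/another-dependency-2.3.jar

The obvious downside is that this list has to be kept in sync with the contents of /opt/app/jars whenever dependencies change.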

It’s important to note that the version of Spark your application is using, i.e. the version of Spark in your application’s container image, does not matter; it’s the version of spark-submit run by the operator that’s the problem. That’s why downgrading the operator to an image using Spark 2.4.x fixes the problem for you.
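
If staying on a 2.4.x-based operator image is acceptable for now, the downgrade amounts to pinning the operator image in the Helm chart values. A sketch, assuming the chart’s usual image.repository/image.tag layout (the repository and tag shown are illustrative; check which 2.4.x-based tags actually exist for your chart version):

  # values.yaml override for the spark-operator Helm chart
  image:
    repository: gcr.io/spark-operator/spark-operator   # illustrative registry/repository
    tag: v1beta2-1.1.2-2.4.5                            # illustrative 2.4.5-based operator tag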

I’ve yet to find a good solution for this in Spark 3.x. Like you suggested, you can try extraClassPath, but the behavior is not the same as --jars. In my case, my application uses a version of Hadoop that’s different from Spark’s, and this collision results in a NoSuchMethodError. This isn’t a problem when --jars is used.
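
For reference, the extraClassPath variant can be set through sparkConf in the SparkApplication; the wildcard is expanded by the JVM rather than by spark-submit, so the glob problem goes away, but the jars land on the same system classpath as Spark’s own jars instead of being handled the way --jars are, which is what opens the door to the kind of Hadoop collision described above. A sketch:

  spec:
    sparkConf:
      # JVM-style classpath wildcard, expanded by the JVM, not by spark-submit
      "spark.driver.extraClassPath": "/opt/app/jars/*"
      "spark.executor.extraClassPath": "/opt/app/jars/*"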

Ultimately, I think I’ll have to switch to an uberjar instead of jar directories, which is terrible for layer caching of the container image. It’s unfortunate that Spark made this change.