kubernetes-client: spark-submit fails on Kubernetes cluster

Environment

  • MacBook Pro
  • JDK 1.8
  • Spark 2.4.5
  • minikube status: m01 (host: Running, kubelet: Running, apiserver: Running, kubeconfig: Configured)
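
For reference, these environment details can be gathered with standard tooling (output will of course differ per machine):

java -version
minikube status
kubectl version --short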

##############error logs###############

sudo ./spark-submit --verbose \
--master k8s://https://192.168.99.103:8443 \
--deploy-mode cluster \
--name spark-pi \
--class org.apache.spark.examples.SparkPi \
--conf spark.executor.instances=1 \
--conf spark.kubernetes.container.image=spark:latest \
--conf spark.kubernetes.container.image.pullPolicy=Never \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
local:///opt/spark-2.4/dist/examples/jars/spark-examples_2.11-2.4.5.jar 10000000
Using properties file: null
20/04/27 20:26:30 WARN Utils: Your hostname, wenhongqiangs-MacBook-Pro.local resolves to a loopback address: 127.0.0.1; using 172.16.15.219 instead (on interface en0)
20/04/27 20:26:30 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Parsed arguments:
  master                  k8s://https://192.168.99.103:8443
  deployMode              cluster
  executorMemory          null
  executorCores           null
  totalExecutorCores      null
  propertiesFile          null
  driverMemory            null
  driverCores             null
  driverExtraClassPath    null
  driverExtraLibraryPath  null
  driverExtraJavaOptions  null
  supervise               false
  queue                   null
  numExecutors            1
  files                   null
  pyFiles                 null
  archives                null
  mainClass               org.apache.spark.examples.SparkPi
  primaryResource         local:///opt/spark-2.4/dist/examples/jars/spark-examples_2.11-2.4.5.jar
  name                    spark-pi
  childArgs               [10000000]
  jars                    null
  packages                null
  packagesExclusions      null
  repositories            null
  verbose                 true

Spark properties used, including those specified through
 --conf and those from the properties file null:
  (spark.executor.instances,1)
  (spark.kubernetes.authenticate.driver.serviceAccountName,spark)
  (spark.kubernetes.container.image,spark:latest)
  (spark.kubernetes.container.image.pullPolicy,Never)

    
Main class:
org.apache.spark.deploy.k8s.submit.KubernetesClientApplication
Arguments:
--primary-java-resource
local:///opt/spark-2.4/dist/examples/jars/spark-examples_2.11-2.4.5.jar
--main-class
org.apache.spark.examples.SparkPi
--arg
10000000
Spark config:
(spark.jars,local:///opt/spark-2.4/dist/examples/jars/spark-examples_2.11-2.4.5.jar)
(spark.app.name,spark-pi)
(spark.executor.instances,1)
(spark.kubernetes.container.image.pullPolicy,Never)
(spark.kubernetes.container.image,spark:latest)
(spark.submit.deployMode,cluster)
(spark.master,k8s://https://192.168.99.103:8443)
(spark.kubernetes.authenticate.driver.serviceAccountName,spark)
Classpath elements:



log4j:WARN No appenders could be found for logger (io.fabric8.kubernetes.client.Config).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Exception in thread "main" io.fabric8.kubernetes.client.KubernetesClientException: Operation: [create]  for kind: [Pod]  with name: [null]  in namespace: [default]  failed.
	at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
	at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72)
	at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:337)
	at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:330)
	at org.apache.spark.deploy.k8s.submit.Client$$anonfun$run$2.apply(KubernetesClientApplication.scala:141)
	at org.apache.spark.deploy.k8s.submit.Client$$anonfun$run$2.apply(KubernetesClientApplication.scala:140)
	at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2543)
	at org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:140)
	at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication$$anonfun$run$5.apply(KubernetesClientApplication.scala:250)
	at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication$$anonfun$run$5.apply(KubernetesClientApplication.scala:241)
	at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2543)
	at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:241)
	at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:204)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.net.SocketException: Broken pipe (Write failed)
	at java.net.SocketOutputStream.socketWrite0(Native Method)
	at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111)
	at java.net.SocketOutputStream.write(SocketOutputStream.java:155)
	at sun.security.ssl.OutputRecord.writeBuffer(OutputRecord.java:431)
	at sun.security.ssl.OutputRecord.write(OutputRecord.java:417)
	at sun.security.ssl.SSLSocketImpl.writeRecordInternal(SSLSocketImpl.java:894)
	at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:865)
	at sun.security.ssl.AppOutputStream.write(AppOutputStream.java:123)
	at okio.Okio$1.write(Okio.java:79)
	at okio.AsyncTimeout$1.write(AsyncTimeout.java:180)
	at okio.RealBufferedSink.flush(RealBufferedSink.java:224)
	at okhttp3.internal.http2.Http2Writer.windowUpdate(Http2Writer.java:262)
	at okhttp3.internal.http2.Http2Connection.start(Http2Connection.java:518)
	at okhttp3.internal.http2.Http2Connection.start(Http2Connection.java:505)
	at okhttp3.internal.connection.RealConnection.startHttp2(RealConnection.java:298)
	at okhttp3.internal.connection.RealConnection.establishProtocol(RealConnection.java:287)
	at okhttp3.internal.connection.RealConnection.connect(RealConnection.java:168)
	at okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:257)
	at okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:135)
	at okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:114)
	at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
	at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
	at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
	at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:126)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
	at io.fabric8.kubernetes.client.utils.BackwardsCompatibilityInterceptor.intercept(BackwardsCompatibilityInterceptor.java:119)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
	at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:68)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
	at io.fabric8.kubernetes.client.utils.HttpClientUtils.lambda$createHttpClient$3(HttpClientUtils.java:112)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
	at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:254)
	at okhttp3.RealCall.execute(RealCall.java:92)
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:411)
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:372)
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:241)
	at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:819)
	at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:334)
	... 17 more
20/04/27 20:26:31 INFO ShutdownHookManager: Shutdown hook called
20/04/27 20:26:31 INFO ShutdownHookManager: Deleting directory /private/var/folders/zz/zyxvpxvq6csfxvn_n0000000000000/T/spark-76c1b047-4e22-417d-a787-cf298a8c7303
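
Note where the trace breaks: the Broken pipe is thrown inside okhttp3.internal.http2.Http2Connection.start, i.e. while the fabric8 client is negotiating HTTP/2 with the API server. That matches the JDK 8u252 ALPN change discussed in the comments below. Two quick checks help narrow this down (standard JDK/kubectl commands; the service-account name matches the submit command above):

java -version   # 8u252 is the first OpenJDK 8 build with ALPN enabled in JSSE
kubectl auth can-i create pods \
  --as=system:serviceaccount:default:spark   # rules out a plain RBAC failure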

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 15 (5 by maintainers)

Most upvoted comments

I think this issue can be closed. Please feel free to reopen it if you are able to reproduce this on the latest versions.

I got the same error. Steps to reproduce:

git clone git@github.com:apache/spark.git
cd spark
# build docker images
./bin/docker-image-tool.sh -r pingsutw -t v3.1.2 build
# I've tried `1.15.11` as well, and got the same error
minikube start --vm-driver=none --cpus 64 --memory 12288 --disk-size=128g --kubernetes-version v1.14.2
# Submit a Spark application
./bin/spark-submit \
  --master k8s://https://192.168.103.20:8443 \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=3 \
  --conf spark.kubernetes.container.image=pingsutw/spark:v3.1.2 \
  local:///opt/spark/examples/jars/spark-examples_2.12-3.2.0-SNAPSHOT.jar
  • Spark: cloned from GitHub
  • Kubernetes: 1.14.2, 1.15.11
  • Ubuntu: 18.04
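
One thing worth double-checking in commands like the one above: the master URL needs a colon, not a dot, before the port. The correct value can be read straight from the cluster (output shown is illustrative):

kubectl cluster-info
# Kubernetes master is running at https://192.168.103.20:8443
# -> pass it as: --master k8s://https://192.168.103.20:8443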

We hit the same issue when running Flink on K8s. For now I have a simple conclusion: because of a compatibility issue, the fabric8 kubernetes-client has the following known limitation (a hedged workaround sketch follows the list).

  • On JDK 8u252, it only works against Kubernetes v1.16 and lower.
  • On other JDK versions (e.g. 8u242, JDK 11), I am not aware of the same issue; it always works well.
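
A minimal workaround sketch given the limitation above (paths and versions are illustrative; verify against your distribution): either pin the JDK that runs the submission client to a build without the 8u252 ALPN change, or move to a release that bundles a fixed fabric8 kubernetes-client. The fix for fabric8 issue #2212 landed around client versions 4.9.2/4.10.2, which force HTTP/1.1 and sidestep the broken HTTP/2 negotiation.

# Option 1: pin the JDK used by spark-submit (path is illustrative)
export JAVA_HOME=/path/to/jdk8u242
./bin/spark-submit ...

# Option 2: upgrade to a Spark/Flink build that ships fabric8
# kubernetes-client >= 4.9.2 / 4.10.2 (fix for fabric8 #2212)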