elephas: Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
Hi, I’m trying to use elephas for my deep learning models on Spark, but so far I couldn’t get anything to work on 3 different machines and across multiple notebooks.
- “ml_pipeline_otto.py” crashes in the load_data_frame function, specifically on return sqlContext.createDataFrame(data, ['features', 'category']), with the error: Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob.
- “mnist_mlp_spark.py” crashes in the spark_model.fit method with the error: TypeError: can't pickle _thread.RLock objects
- “My Own Pipeline” crashes right after fitting the model (training itself completes) with the error: Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
I’m running tensorflow 2.1.0, pyspark 3.0.2, jdk-8u281, python 3.7, and elephas 1.4.2.
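For context on the second error: Spark uses pickle to ship Python objects from the driver to executors, and Keras/TensorFlow model objects can hold thread locks internally. Lock objects are not picklable, which is the likely source of the TypeError above (the exact message wording varies slightly across Python versions). A minimal reproduction of the underlying pickling limitation:

```python
import pickle
import threading

# Spark serializes Python objects with pickle when shipping them to
# executors. An RLock, like the ones a Keras model can hold, is not
# picklable, so serialization fails with a TypeError.
lock = threading.RLock()
try:
    pickle.dumps(lock)
    picklable = True
except TypeError as exc:
    picklable = False
    print(f"pickle failed: {exc}")

assert not picklable  # an RLock can never be pickled
```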
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Comments: 16
Hi there! I had the same issue, but this solution helped: import findspark, then call findspark.init(). Initialize it before creating the Spark session.
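A minimal sketch of the suggested fix, assuming a local-mode session (the app name and helper function here are hypothetical, not from elephas):

```python
def create_spark_session(app_name="elephas-example"):
    # findspark locates SPARK_HOME and adds pyspark to sys.path.
    # It must run BEFORE the SparkSession is built; environment
    # mismatches between driver and workers can otherwise surface
    # as Py4JJavaError on collect().
    import findspark
    findspark.init()

    from pyspark.sql import SparkSession
    return (SparkSession.builder
            .master("local[*]")
            .appName(app_name)
            .getOrCreate())
```

In a notebook, run pip install findspark first, then restart the kernel so the fresh import order takes effect.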
Hi,
Thanks, I had the same issue and it has been resolved: import findspark, then findspark.init(). Initialize it before creating the Spark session.
Note: Windows seems to have other dependencies. I’m not sure what the exact issue was, but it’s fixed now. Please share details on how this package helps resolve it.
Hi Mayank,
Thanks for your comments. The ‘findspark’ package helped me solve the issue.
This solved my issue. Don’t forget to restart the kernel and re-run the cells after installing findspark.
Yes, it works for me. In particular, don’t forget to restart the kernel before calling findspark.init().