elephas: Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063.
Could you please help me? My model code and the error I am seeing are below.
model.compile(loss=lossFunc, optimizer=gradDiscent, metrics=['accuracy'])
##############################START: DISTRIBUTED MODEL################
from pyspark import SparkContext, SparkConf
# Create the Spark context (8 local worker threads)
conf = SparkConf().setAppName('NSL-KDD-DISTRIBUTED').setMaster('local[8]')
sc = SparkContext(conf=conf)
from elephas.utils.rdd_utils import to_simple_rdd
# Build an RDD (Resilient Distributed Dataset) from numpy features and labels
rdd = to_simple_rdd(sc, trainX, trainY)
from elephas.spark_model import SparkModel
from elephas import optimizers as elephas_optimizers
# Initialize SparkModel from the Keras model and Spark context
elphOptimizer = elephas_optimizers.Adagrad()
# Note: the keyword in this elephas API is `mode`, not `model`
sparkModel = SparkModel(sc, model, optimizer=elphOptimizer, frequency='epoch', mode='asynchronous', num_workers=1)
# Train the Spark model
sparkModel.train(rdd, nb_epoch=epochs, batch_size=batchSize, verbose=2)
# Evaluate the master network (a plain Keras evaluate on the driver)
score = sparkModel.master_network.evaluate(testX, testY, verbose=2)
print(score)
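For context, `to_simple_rdd` essentially pairs each feature row with its label and hands the result to `sc.parallelize`. A rough, PySpark-free sketch of that pairing step (function name is mine, not the elephas API):

```python
def to_pairs(features, labels):
    """Pair each sample with its label, roughly what to_simple_rdd
    does before parallelizing the result (sketch, not the real API)."""
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    return list(zip(features, labels))

pairs = to_pairs([[0.1, 0.2], [0.3, 0.4]], [0, 1])
print(pairs)  # [([0.1, 0.2], 0), ([0.3, 0.4], 1)]
```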
####################ERROR########################
Traceback (most recent call last):
  File "C:\PythonWorks\mine\dm-dist.py", line 230, in <module>
    sparkModel.train(rdd, nb_epoch=epochs, batch_size=batchSize, verbose=2)
  File "C:\Miniconda3\lib\site-packages\elephas\spark_model.py", line 194, in train
    self._train(rdd, nb_epoch, batch_size, verbose, validation_split, master_url)
  File "C:\Miniconda3\lib\site-packages\elephas\spark_model.py", line 205, in _train
    self.start_server()
  File "C:\Miniconda3\lib\site-packages\elephas\spark_model.py", line 125, in start_server
    self.server.start()
  File "C:\Miniconda3\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "C:\Miniconda3\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Miniconda3\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Miniconda3\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Miniconda3\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
  File "C:\Miniconda3\lib\site-packages\pyspark\context.py", line 306, in __getnewargs__
    "It appears that you are attempting to reference SparkContext from a broadcast "
Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063.
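The traceback shows what triggers the error: on Windows, `multiprocessing` starts the elephas parameter server with the `spawn` method, which pickles the process object, and that object reaches the `SparkContext`, which PySpark deliberately refuses to pickle (SPARK-5063). A minimal, PySpark-free sketch of the same mechanism, including the kind of `__getstate__` workaround that avoids it (class names are mine, for illustration only):

```python
import pickle

class Unpicklable:
    """Stands in for SparkContext: any attempt to pickle it raises,
    just as PySpark raises the SPARK-5063 exception."""
    def __reduce__(self):
        raise Exception("SparkContext can only be used on the driver")

class NaiveModel:
    """Holds a reference to the unpicklable handle, like SparkModel."""
    def __init__(self):
        self.sc = Unpicklable()
        self.weights = [1, 2, 3]

class SafeModel(NaiveModel):
    """Drops the driver-only handle before pickling, so spawn works."""
    def __getstate__(self):
        state = self.__dict__.copy()
        state.pop("sc", None)  # never ship the context to a child process
        return state

# Pickling the naive version fails, mirroring the Windows spawn failure.
try:
    pickle.dumps(NaiveModel())
    failed = False
except Exception:
    failed = True

# Pickling the safe version succeeds; the context simply isn't carried over.
restored = pickle.loads(pickle.dumps(SafeModel()))
print(failed, restored.weights)  # True [1, 2, 3]
```

On Linux/macOS the default `fork` start method copies the process without pickling, which is why this setup tends to fail only on Windows.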
About this issue
- Original URL
- State: closed
- Created 6 years ago
- Comments: 18 (4 by maintainers)
@mohaimenz thanks. yeah, I’m not 100% happy with dist-keras for the simple reason that it obviously started as an elephas fork without ever crediting it (until I forced the guy to do it).
This is how open source dies... instead of trying to grab GitHub fame (useless stars) and being selfish, just try to help out. If Joeri had put his time into patching elephas and becoming a maintainer, which I would have offered, instead of stealing it, we'd have a better product now, not two mediocre ones that compete. Alright, that's my rant, haha. 😄