gluon-cv: Training speed of mxnet-ssd slows down?
I used a record file (VOC07+12) to train the old-style SSD at about 40 images/s. With the new train_ssd.py in gluoncv, the speed is only about 25 images/s.
I replaced the original file-based datasets in the new SSD code with a rec dataset and transform. But when I set **num-workers=4**, gdata.DetectionDataLoader fails. With **num-workers=1** it works, but the speed is almost as slow as the original data reading method.
The error information is as follows:
Process Process-3:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/home/deep/workssd/mxnet/incubator-mxnet/python/mxnet/gluon/data/dataloader.py", line 134, in worker_loop
    batch = batchify_fn([dataset[i] for i in samples])
  File "/home/deep/workssd/mxnet/incubator-mxnet/python/mxnet/gluon/data/dataset.py", line 126, in __getitem__
    item = self._data[idx]
  File "/home/deep/workssd/mxnet/incubator-mxnet/python/mxnet/gluon/data/vision/datasets.py", line 257, in __getitem__
    record = super(ImageRecordDataset, self).__getitem__(idx)
  File "/home/deep/workssd/mxnet/incubator-mxnet/python/mxnet/gluon/data/dataset.py", line 180, in __getitem__
    return self._record.read_idx(self._record.keys[idx])
  File "/home/deep/workssd/mxnet/incubator-mxnet/python/mxnet/recordio.py", line 265, in read_idx
    return self.read()
  File "/home/deep/workssd/mxnet/incubator-mxnet/python/mxnet/recordio.py", line 163, in read
    ctypes.byref(size)))
  File "/home/deep/workssd/mxnet/incubator-mxnet/python/mxnet/base.py", line 149, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
MXNetError: [16:12:48] src/recordio.cc:65: Check failed: header[0] == RecordIOWriter::kMagic Invalid RecordIO File
Is this a multi-process problem with the old rec file dataset?
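One plausible reading of the traceback (an assumption, not something confirmed in this thread) is that forked worker processes inherit the parent's open handle on the .rec file and therefore share its file offset, so concurrent seek/read pairs can land on the wrong bytes and fail the magic-number check. Below is a minimal standard-library sketch of one common workaround: reopening the file lazily in each process. `PerProcessRecordReader`, `write_records`, the `MAGIC` value, and the toy header layout are all hypothetical and are not MXNet's API or its RecordIO format.

```python
import os
import struct

# Arbitrary marker for this toy format (hypothetical, not MXNet's kMagic).
MAGIC = 0xABCD1234


def write_records(path, payloads):
    """Write each payload as <magic, length, bytes>; return the byte offsets."""
    offsets = []
    with open(path, "wb") as fh:
        for payload in payloads:
            offsets.append(fh.tell())
            fh.write(struct.pack("<II", MAGIC, len(payload)))
            fh.write(payload)
    return offsets


class PerProcessRecordReader:
    """Toy indexed reader that opens its own file handle in every process."""

    def __init__(self, path, offsets):
        self.path = path
        self.offsets = offsets
        self._fh = None   # opened lazily on first read
        self._pid = None  # pid of the process that opened self._fh

    def _handle(self):
        # A handle inherited through fork shares its offset with the parent.
        # Reopen whenever the current pid differs from the opener's pid.
        if self._fh is None or self._pid != os.getpid():
            self._fh = open(self.path, "rb")
            self._pid = os.getpid()
        return self._fh

    def read_idx(self, idx):
        fh = self._handle()
        fh.seek(self.offsets[idx])
        magic, length = struct.unpack("<II", fh.read(8))
        if magic != MAGIC:
            raise IOError("invalid record header at index %d" % idx)
        return fh.read(length)
```

With this pattern a reader created before the workers start can still be used inside each worker, because the first `read_idx` call in a new process opens a private handle with its own offset.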
About this issue
- State: closed
- Created 6 years ago
- Comments: 16 (9 by maintainers)
@WalterMa Yes, this bug should be easy to fix, but we need to be careful not to change the current API, so we are still discussing it.