deep-voice-conversion: I can't run it on Windows 10, could someone help me?

My env is Win10 + Anaconda2 + Python 3.5, and it’s my first time using TensorFlow. The log below looks like something went wrong while parsing hparams/default.yaml. I have even tried changing default.yaml’s line endings from LF to Windows’ CRLF. Could someone help me?

(python35) λ pip show pyyaml
Name: PyYAML
Version: 3.13
Summary: YAML parser and emitter for Python
Home-page: http://pyyaml.org/wiki/PyYAML
Author: Kirill Simonov
Author-email: xi@resolvent.net
License: MIT

(python35) λ pip show tensorflow
Name: tensorflow
Version: 1.9.0
Summary: TensorFlow is an open source machine learning framework for everyone.
Home-page: https://www.tensorflow.org/
Author: Google Inc.
Author-email: opensource@google.com
License: Apache 2.0
D:\proj_github\deep-voice-conversion (master -> origin)
(python35) λ python train1.py case
case: case, logdir: /data/private/vc/logdir/case/train1
[0725 16:52:49 @logger.py:109] WRN Log directory /data/private/vc/logdir/case/train1 exists! Use 'd' to delete it.
[0725 16:52:49 @logger.py:112] WRN If you're resuming from a previous run, you can choose to keep it.
Press any other key to exit.
Select Action: k (keep) / d (delete) / q (quit):d
[0725 16:52:52 @logger.py:74] Argv: train1.py case
[0725 16:52:52 @parallel.py:175] WRN MultiProcessPrefetchData does support windows. However, windows requires more strict picklability on processes, which may lead of failure on some of the code.
[0725 16:52:52 @parallel.py:185] [MultiProcessPrefetchData] Will fork a dataflow more than one times. This assumes the datapoints are i.i.d.
Process _Worker-1:
Traceback (most recent call last):
  File "C:\Users\mywind\AppData\Local\conda\conda\envs\python35\lib\multiprocessing\process.py", line 252, in _bootstrap
    self.run()
  File "C:\Users\mywind\AppData\Local\conda\conda\envs\python35\lib\site-packages\tensorpack\dataflow\parallel.py", line 162, in run
    for dp in self.ds.get_data():
  File "C:\Users\mywind\AppData\Local\conda\conda\envs\python35\lib\site-packages\tensorpack\dataflow\common.py", line 116, in get_data
    for data in self.ds.get_data():
  File "D:\proj_github\deep-voice-conversion\data_load.py", line 35, in get_data
    yield get_mfccs_and_phones(wav_file=wav_file)
  File "D:\proj_github\deep-voice-conversion\data_load.py", line 72, in get_mfccs_and_phones
    wav = read_wav(wav_file, sr=hp.default.sr)
KeyError: 'default'
Process _Worker-2:
Traceback (most recent call last):
  File "C:\Users\mywind\AppData\Local\conda\conda\envs\python35\lib\multiprocessing\process.py", line 252, in _bootstrap
    self.run()
  File "C:\Users\mywind\AppData\Local\conda\conda\envs\python35\lib\site-packages\tensorpack\dataflow\parallel.py", line 162, in run
    for dp in self.ds.get_data():
  File "C:\Users\mywind\AppData\Local\conda\conda\envs\python35\lib\site-packages\tensorpack\dataflow\common.py", line 116, in get_data
    for data in self.ds.get_data():
  File "D:\proj_github\deep-voice-conversion\data_load.py", line 35, in get_data
    yield get_mfccs_and_phones(wav_file=wav_file)
  File "D:\proj_github\deep-voice-conversion\data_load.py", line 72, in get_mfccs_and_phones
    wav = read_wav(wav_file, sr=hp.default.sr)
KeyError: 'default'

[0725 16:52:31 @training.py:101] Building graph for training tower 1 on device /gpu:1 ...
[0725 16:52:34 @collection.py:164] These collections were modified but restored in tower1: (tf.GraphKeys.SUMMARIES: 3->5)
Traceback (most recent call last):
  File "C:\Users\mywind\AppData\Local\conda\conda\envs\python35\lib\site-packages\tensorflow\python\framework\ops.py", line 1589, in _create_c_op
    c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Op type not registered 'NcclAllReduce' in binary running on mywind-PC. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. while building NodeDef 'AllReduceGrads/NcclAllReduce'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:/proj_github/deep-voice-conversion/train1.py", line 78, in <module>
    train(args, logdir=logdir_train1)
  File "D:/proj_github/deep-voice-conversion/train1.py", line 60, in train
    launch_train_with_config(train_conf, trainer=trainer)
  File "C:\Users\mywind\AppData\Local\conda\conda\envs\python35\lib\site-packages\tensorpack\train\interface.py", line 81, in launch_train_with_config
    model._build_graph_get_cost, model.get_optimizer)
  File "C:\Users\mywind\AppData\Local\conda\conda\envs\python35\lib\site-packages\tensorpack\utils\argtools.py", line 181, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\mywind\AppData\Local\conda\conda\envs\python35\lib\site-packages\tensorpack\train\tower.py", line 173, in setup_graph
    train_callbacks = self._setup_graph(input, get_cost_fn, get_opt_fn)
  File "C:\Users\mywind\AppData\Local\conda\conda\envs\python35\lib\site-packages\tensorpack\train\trainers.py", line 166, in _setup_graph
    self._make_get_grad_fn(input, get_cost_fn, get_opt_fn), get_opt_fn)
  File "C:\Users\mywind\AppData\Local\conda\conda\envs\python35\lib\site-packages\tensorpack\graph_builder\training.py", line 232, in build
    all_grads = allreduce_grads(all_grads, average=self._average)  # #gpu x #param
  File "C:\Users\mywind\AppData\Local\conda\conda\envs\python35\lib\site-packages\tensorpack\tfutils\scope_utils.py", line 84, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\mywind\AppData\Local\conda\conda\envs\python35\lib\site-packages\tensorpack\graph_builder\utils.py", line 140, in allreduce_grads
    summed = nccl.all_sum(grads)
  File "C:\Users\mywind\AppData\Local\conda\conda\envs\python35\lib\site-packages\tensorflow\contrib\nccl\python\ops\nccl_ops.py", line 47, in all_sum
    return _apply_all_reduce('sum', tensors)
  File "C:\Users\mywind\AppData\Local\conda\conda\envs\python35\lib\site-packages\tensorflow\contrib\nccl\python\ops\nccl_ops.py", line 228, in _apply_all_reduce
    shared_name=shared_name))
  File "C:\Users\mywind\AppData\Local\conda\conda\envs\python35\lib\site-packages\tensorflow\contrib\nccl\ops\gen_nccl_ops.py", line 58, in nccl_all_reduce
    num_devices=num_devices, shared_name=shared_name, name=name)
  File "C:\Users\mywind\AppData\Local\conda\conda\envs\python35\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "C:\Users\mywind\AppData\Local\conda\conda\envs\python35\lib\site-packages\tensorflow\python\framework\ops.py", line 3414, in create_op
    op_def=op_def)
  File "C:\Users\mywind\AppData\Local\conda\conda\envs\python35\lib\site-packages\tensorflow\python\framework\ops.py", line 1756, in __init__
    control_input_ops)
  File "C:\Users\mywind\AppData\Local\conda\conda\envs\python35\lib\site-packages\tensorflow\python\framework\ops.py", line 1592, in _create_c_op
    raise ValueError(str(e))
ValueError: Op type not registered 'NcclAllReduce' in binary running on mywind-PC. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. while building NodeDef 'AllReduceGrads/NcclAllReduce'

Process finished with exit code 1
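
The first failure in the log is the KeyError: 'default' raised while reading hparams/default.yaml. Assuming the project loads that file with PyYAML (which the pip show output above suggests), a quick diagnostic, not part of the repo, is to parse the file yourself and see whether a 'default' section comes back at all:

import yaml

with open('hparams/default.yaml', encoding='utf-8') as f:
    params = yaml.safe_load(f)

print(type(params))                 # should be a dict, not None
print(sorted(params or {}))         # 'default' has to appear in this list of keys
print('default' in (params or {}))  # False here explains the KeyError above

If the file parses to None or the 'default' key is missing (for example because of an indentation or line-ending problem), then hp.default.sr fails exactly as in the tracebacks above.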

About this issue

  • State: open
  • Created 6 years ago
  • Reactions: 1
  • Comments: 82

Most upvoted comments

I’m running on Windows with a single GPU. You should migrate all the code that uses hparam.py; I changed all the code to use hparams.py instead. In most of the code you just have to change 'default' to 'Default'. There are properties missing from Default and TrainX in hparams.py, so copy and paste the properties from hparam.py and replace the ':' with '='. A rough sketch of the migration follows below.
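
As an illustration of that migration (the property names and values here are placeholders, copy the real ones from your hparam.py / hparams/default.yaml), each YAML section becomes a class and each "key: value" line becomes "key = value":

# hparams.py -- sketch of the migration described above; values are placeholders,
# copy the real ones from hparam.py / hparams/default.yaml.

class Default:
    sr = 16000            # was "sr: 16000" under "default:" in the YAML
    frame_shift = 0.005   # placeholder
    frame_length = 0.025  # placeholder

class Train1:
    batch_size = 32       # placeholder
    lr = 0.0003           # placeholder

class Train2:
    batch_size = 32       # placeholder
    lr = 0.0003           # placeholder

Code that previously read hp.default.sr then reads hp.Default.sr (or Default.sr, depending on how hparams.py is imported).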

The NcclAllReduce error may be caused by missing wav files or an incorrect dataset path, so verify the paths in hparams.py. The other cause of NcclAllReduce is using more than one GPU on Windows.
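
If you suspect the data path rather than NCCL itself, a quick check is whether the wav pattern configured in hparams.py actually matches any files. The pattern below is only an example, not the repo's actual setting:

# Diagnostic sketch: does the training-data pattern from hparams.py match any files?
import glob

data_path = 'D:/data/TIMIT/TRAIN/*/*/*.wav'   # example pattern, adjust to your setup
wav_files = glob.glob(data_path)
print(len(wav_files), 'wav files found')
print(wav_files[:3])   # an empty list means train1 has nothing to feed the workers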

My hparams.py, hope it helps. hparams.zip

@juihsuanlee I have made it work and got a result. I used the code from this fork (https://github.com/carlfm01/deep-voice-conversion), but made a small change to the code in convert.py (see the attached image). Hope it works for you.

Hi @bhui, it looks like the error is related to incorrectly formatted or corrupted files. Make sure your training data for the second network is at a 16000 Hz sampling rate, mono, and in WAV format. Also try different versions of numpy; I recommend using conda environments 😃.
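
A small sketch for checking a clip and rewriting it as 16 kHz mono WAV if needed (this assumes librosa and soundfile are installed; it is not code from the repo):

# Sketch: check a training clip and rewrite it as 16 kHz mono WAV if needed.
import librosa
import soundfile as sf

in_path = 'my_clip.wav'                                    # placeholder path
y, orig_sr = librosa.load(in_path, sr=None, mono=False)
channels = 1 if y.ndim == 1 else y.shape[0]
print('sample rate:', orig_sr, 'channels:', channels)

if orig_sr != 16000 or channels != 1:
    y16, _ = librosa.load(in_path, sr=16000, mono=True)    # resample and downmix
    sf.write('my_clip_16k.wav', y16, 16000)                # clean 16 kHz mono WAV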

@ryancwalsh Take a look at https://github.com/carlfm01/deep-voice-conversion; I also changed the lambda code in convert.py to make it work on Python 3+. Let me know if you still don’t understand something.

@Huishou I faced the same broadcasting issue. I solved it by updating librosa to version 0.6.2. Just do a pip install librosa (that installs 0.6.2). Should be okay after that.

@carlfm01 In this line, right? I will try it.

@wuzhiyu666 You should put all your TIMIT data in the path shown there, or change that path to point to your TIMIT data.

@wuzhiyu666 Sorry, I meant the TIMIT data; the path you commented on is for the second net.

@wuzhiyu666 Maybe an incorrect path. Share an example path to one of your train1 wav files, and also the path you are using in the code.

@carlfm01 Thanks! You helped me a lot. I am a beginner, just getting into tensorflow.

Huh,

I’m sure I tried that already, but that seems to have fixed it. I’ll let it run for a bit and let you know how my output looks.

Thanks a bunch!

I have been unable to find a solution, but after thorough troubleshooting I have found the problem: the project relies on NCCL, which is not supported on Windows. I don’t know enough Python or TensorFlow (I’m new to both) to edit the code and exclude the calls to NCCL, so my “solution” was to dual-boot Linux (Ubuntu) on my system.

Unfortunately, it seems that unless NVIDIA releases NCCL for Windows or major changes are made to this project’s code, it can only be run on an NCCL-supported Linux system.
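
For what it's worth, one way around the NCCL dependency on a single-GPU Windows box is to launch training with a trainer that never builds the NcclAllReduce op. The sketch below swaps in tensorpack's SimpleTrainer; train_conf stands for the TrainConfig that train1.py already builds, so treat this as an illustration rather than a drop-in patch:

# Sketch of the relevant change inside train() in train1.py (not a standalone script):
# keep the existing TrainConfig ('train_conf' here) and just swap the trainer.
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'   # expose only one GPU to TensorFlow

from tensorpack import SimpleTrainer, launch_train_with_config

def train_single_gpu(train_conf):
    # SimpleTrainer builds a single tower and never emits an NcclAllReduce op,
    # unlike the multi-GPU trainers that call nccl.all_sum in the traceback above.
    launch_train_with_config(train_conf, trainer=SimpleTrainer())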

@bhui I’m having the same problem on Windows 10.

ValueError: Op type not registered 'NcclAllReduce' in binary running on DESK. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. while building NodeDef 'AllReduceGrads/NcclAllReduce'

I’ve now tried at least 8 different repos for trying to learn voice cloning, and none of them have good enough documentation for me to get them working. I’m super inspired by all of the examples but haven’t had much luck.