NeMo: problem with fine-tuning the pre-trained `SpeakerNet` model for speaker recognition

Hi!

As an experiment, I was fine-tuning the pre-trained SpeakerNet model on the an4 dataset plus my own voice recordings. I used the example script `speaker_reco_finetune.py` and changed the configuration file according to the documentation and the Colab notebook tutorial. But something always goes wrong when setting up the optimizer:

```
Traceback (most recent call last):
  File "/Users/xujinghua/speaker-recognition-nemo/scripts/experimental/fine_tune_an4.py", line 119, in <module>
    main()
  File "/Users/xujinghua/speaker-recognition-nemo/scripts/experimental/fine_tune_an4.py", line 108, in main
    speaker_model.setup_optimization(speaker_model._cfg.optim)
  File "/Users/xujinghua/miniconda3/lib/python3.7/site-packages/nemo/core/classes/modelPT.py", line 603, in setup_optimization
    optimizer=self._optimizer, scheduler_config=scheduler_config, train_dataloader=self._train_dl
  File "/Users/xujinghua/miniconda3/lib/python3.7/site-packages/nemo/core/optim/lr_scheduler.py", line 539, in prepare_lr_scheduler
    max_epochs, accumulate_grad_batches, num_workers, num_samples, batch_size, drop_last,
  File "/Users/xujinghua/miniconda3/lib/python3.7/site-packages/nemo/core/optim/lr_scheduler.py", line 587, in compute_max_steps
    sampler_num_samples = math.ceil(num_samples / num_workers)
ZeroDivisionError: division by zero
```
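
For context, my script roughly does the following (the config path and pretrained model name below are placeholders rather than my exact ones, and the calls mirror what I understand the example script does):

```python
import pytorch_lightning as pl
from omegaconf import OmegaConf

from nemo.collections.asr.models import EncDecSpeakerLabelModel

# Placeholder config path; my actual file is a copy of the example config
# with the an4 + own-recording manifests filled in.
cfg = OmegaConf.load("speaker_reco_finetune.yaml")

trainer = pl.Trainer(**cfg.trainer)

# Restore the pre-trained SpeakerNet weights (model name may differ per NeMo version).
speaker_model = EncDecSpeakerLabelModel.from_pretrained(model_name="speakerrecognition_speakernet")
speaker_model.set_trainer(trainer)

# Point the restored model at the new data.
speaker_model.setup_training_data(cfg.model.train_ds)
speaker_model.setup_validation_data(cfg.model.validation_ds)

# This is the call that blows up with the ZeroDivisionError above.
speaker_model.setup_optimization(speaker_model._cfg.optim)

trainer.fit(speaker_model)
```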

I tried a few ways to debug, but nothing helped, and I have no clue what is going wrong. Has anyone got an idea or run into this problem before? Any help would be much appreciated!

Cheers, JX


Most upvoted comments

Hi @JINHXu … To fine-tune on CPU, you have to make sure to set `accelerator` to `null` in the config, along with `gpus=0`. This should work.
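
In case it helps, a minimal sketch of that change when the config is loaded with OmegaConf (the config path is a placeholder):

```python
from omegaconf import OmegaConf

cfg = OmegaConf.load("speaker_reco_finetune.yaml")  # placeholder config path

# Equivalent of setting `gpus: 0` and `accelerator: null` in the YAML,
# so the run stays on CPU and no distributed backend is requested.
cfg.trainer.gpus = 0
cfg.trainer.accelerator = None
```

If I read `nemo/core/optim/lr_scheduler.py` correctly, with `accelerator: ddp` and `gpus: 0` the computed worker count works out to zero (roughly `num_gpus * num_nodes`), which is exactly the division by zero in `compute_max_steps`; with `accelerator` left as `null` it falls back to a single worker.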