DeepSpeech: Problem with SWC corpus script
- Have I written custom code (as opposed to running examples on an unmodified clone of the repository): No
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04
- TensorFlow installed from (our builds, or upstream TensorFlow): Yes
- TensorFlow version (use command below): b'v1.13.1-0-g6612da8951' 1.13.1
- Python version: 3.5
- Bazel version (if compiling from source): 0.19.2
- GCC/Compiler version (if compiling from source): 5.4.0
- CUDA/cuDNN version: 10.0.130
- GPU model and memory: Quadro RTX 6000, 72GB
Hello Team,
I am trying to use import_swc.py (under bin) to preprocess the SWC corpus. I used the following command:
DeepSpeech/bin/import_swc.py . --language german --normalize --german_alphabet ../../../dependencies/alphabet.txt
But when I train the DeepSpeech model, the training loss is always infinite. Could you please advise on how to resolve this? The logs are below, followed by a small sanity check on the generated CSVs:
WARNING:tensorflow:From /home/LTLab.lan/agarwal/python-environments/env/lib/python3.5/site-packages/tensorflow/python/data/ops/dataset_ops.py:429: py_func (from tensorflow.python.ops.script_ops) is deprecated and will be removed in a future version.
Instructions for updating:
tf.py_func is deprecated in TF V2. Instead, use
tf.py_function, which takes a python function which manipulates tf eager
tensors instead of numpy arrays. It's easy to convert a tf eager tensor to
an ndarray (just call tensor.numpy()) but having access to eager tensors
means `tf.py_function`s can use accelerators such as GPUs as well as
being differentiable using a gradient tape.
WARNING:tensorflow:From /home/LTLab.lan/agarwal/python-environments/env/lib/python3.5/site-packages/tensorflow/python/data/ops/iterator_ops.py:358: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /home/LTLab.lan/agarwal/python-environments/env/lib/python3.5/site-packages/tensorflow/contrib/rnn/python/ops/lstm_ops.py:696: to_int64 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
I Initializing variables...
I STARTING Optimization
Epoch 0 | Training | Elapsed Time: 0:18:08 | Steps: 1845 | Loss: inf
Epoch 0 | Validation | Elapsed Time: 0:00:36 | Steps: 139 | Loss: 270.188871 | Dataset: ../german-speech-corpus/delete/swc/dev_swc.csv
I Saved new best validating model with loss 270.188871 to: /home/LTLab.lan/agarwal/.local/share/deepspeech/checkpoints/best_dev-1845
Epoch 1 | Training | Elapsed Time: 0:17:52 | Steps: 1845 | Loss: inf
Epoch 1 | Validation | Elapsed Time: 0:00:35 | Steps: 139 | Loss: 227.384010 | Dataset: ../german-speech-corpus/delete/swc/dev_swc.csv
WARNING:tensorflow:From /home/LTLab.lan/agarwal/python-environments/env/lib/python3.5/site-packages/tensorflow/python/training/saver.py:966: remove_checkpoint (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to delete files with this prefix.
I Saved new best validating model with loss 227.384010 to: /home/LTLab.lan/agarwal/.local/share/deepspeech/checkpoints/best_dev-3690
Epoch 2 | Training | Elapsed Time: 0:17:52 | Steps: 1845 | Loss: inf
Epoch 2 | Validation | Elapsed Time: 0:00:35 | Steps: 139 | Loss: 218.371178 | Dataset: ../german-speech-corpus/delete/swc/dev_swc.csv
I Saved new best validating model with loss 218.371178 to: /home/LTLab.lan/agarwal/.local/share/deepspeech/checkpoints/best_dev-5535
Epoch 3 | Training | Elapsed Time: 0:17:53 | Steps: 1845 | Loss: inf
Epoch 3 | Validation | Elapsed Time: 0:00:35 | Steps: 139 | Loss: 322.072106 | Dataset: ../german-speech-corpus/delete/swc/dev_swc.csv
WARNING:tensorflow:From /home/LTLab.lan/agarwal/python-environments/env/lib/python3.5/site-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
I Early stop triggered as (for last 4 steps) validation loss: 322.072106 with standard deviation: 22.604229 and mean: 238.648019
I FINISHED optimization in 1:14:16.207693
I Restored variables from best validation checkpoint at /home/LTLab.lan/agarwal/.local/share/deepspeech/checkpoints/best_dev-5535, step 5535
Testing model on ../german-speech-corpus/delete/swc/test_swc.csv
Test epoch | Steps: 412 | Elapsed Time: 0:08:00
WARNING:tensorflow:From /home/LTLab.lan/agarwal/python-environments/env/lib/python3.5/site-packages/tensorflow/python/tools/freeze_graph.py:232: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.graph_util.convert_variables_to_constants
WARNING:tensorflow:From /home/LTLab.lan/agarwal/python-environments/env/lib/python3.5/site-packages/tensorflow/python/framework/graph_util_impl.py:245: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.graph_util.extract_sub_graph
Test on ../german-speech-corpus/delete/swc/test_swc.csv - WER: 0.984189, CER: 0.952155, loss: 221.439163
--------------------------------------------------------------------------------
WER: 3.000000, CER: 1.833333, loss: 90.893661
- src: "wurden"
- res: "in den hundert"
--------------------------------------------------------------------------------
WER: 2.000000, CER: 0.789474, loss: 41.020634
- src: "umweltveränderungen"
- res: "um ein"
--------------------------------------------------------------------------------
WER: 2.000000, CER: 1.200000, loss: 77.087273
- src: "array"
- res: "er ende"
--------------------------------------------------------------------------------
WER: 2.000000, CER: 2.000000, loss: 86.086899
- src: "sex"
- res: "in den "
--------------------------------------------------------------------------------
WER: 2.000000, CER: 1.100000, loss: 120.730904
- src: "siebzehnte"
- res: "es unendlich"
--------------------------------------------------------------------------------
WER: 2.000000, CER: 1.250000, loss: 157.400894
- src: "monotherapie"
- res: "die eeeeeeeeeeeee"
--------------------------------------------------------------------------------
WER: 2.000000, CER: 4.250000, loss: 191.515320
- src: "doch"
- res: "es hunderttausende"
--------------------------------------------------------------------------------
WER: 1.000000, CER: 1.000000, loss: 2.211713
- src: "an"
- res: ""
--------------------------------------------------------------------------------
WER: 1.000000, CER: 1.000000, loss: 2.343423
- src: "mit"
- res: ""
--------------------------------------------------------------------------------
WER: 1.000000, CER: 1.000000, loss: 2.612154
- src: "auf"
- res: ""
--------------------------------------------------------------------------------
I Exporting the model...
I Models exported at ../models
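Since infinite CTC loss is often caused by transcripts that contain characters missing from the alphabet (or by empty transcripts), a cheap first step is to scan the importer's output CSVs. Below is a minimal sketch for that; it assumes the usual DeepSpeech CSV columns (wav_filename, wav_filesize, transcript), the alphabet.txt format of one character per line with `#` comment lines, and a train_swc.csv alongside the dev/test CSVs shown in the logs above.

```python
#!/usr/bin/env python3
# Hypothetical sanity check (not part of DeepSpeech): flag transcripts that
# contain characters missing from alphabet.txt, or that are empty, both of
# which commonly drive the CTC loss to inf.
import csv
import sys

def load_alphabet(path):
    # alphabet.txt lists one character per line; lines starting with '#' are comments
    chars = set()
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.startswith("#"):
                continue
            entry = line.rstrip("\n")
            if entry:
                chars.add(entry)
    return chars

def check_csv(csv_path, alphabet):
    # Assumes the standard DeepSpeech CSV columns: wav_filename, wav_filesize, transcript
    problems = []
    with open(csv_path, encoding="utf-8") as f:
        for row in csv.DictReader(f):
            transcript = row["transcript"]
            unknown = set(transcript) - alphabet
            if unknown or not transcript.strip():
                problems.append((row["wav_filename"], transcript, unknown))
    return problems

if __name__ == "__main__":
    # Usage: python check_csvs.py alphabet.txt train_swc.csv dev_swc.csv test_swc.csv
    alphabet = load_alphabet(sys.argv[1])
    for csv_path in sys.argv[2:]:
        for wav, transcript, unknown in check_csv(csv_path, alphabet):
            print("{}: {}: unknown chars {} in {!r}".format(
                csv_path, wav, sorted(unknown), transcript))
```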
@AASHISHAG Regarding 1: I’ll take some of them for the filter rules - thanks! Regarding 2: Looks like the vocabulary.
If you imported TUDA, you should find the README under <import-dir>/german-speechdata-package-v2/README. The containing archive's URL is constructed like this: https://github.com/mozilla/DeepSpeech/blob/85a61a3ab74aa28a08723236ddab740c7a9fa1e3/bin/import_tuda.py#L27-L29
Result: http://ltdata1.informatik.uni-hamburg.de/kaldi_tuda_de/german-speechdata-package-v2.tar.gz
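Roughly, what those lines amount to is the following (a paraphrase, not the exact source; the constant names are illustrative and only the resulting URL is taken from the comment above):

```python
# Rough paraphrase of the URL construction in bin/import_tuda.py (see permalink above);
# constant names are illustrative, not copied from the source.
TUDA_BASE_URL = "http://ltdata1.informatik.uni-hamburg.de/kaldi_tuda_de/"
TUDA_PACKAGE = "german-speechdata-package-v2"
ARCHIVE_URL = TUDA_BASE_URL + TUDA_PACKAGE + ".tar.gz"
# -> http://ltdata1.informatik.uni-hamburg.de/kaldi_tuda_de/german-speechdata-package-v2.tar.gz
```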
@AASHISHAG Confirmed (for the speakers).
#2625 is for adding the article name and the speaker as CSV columns for debugging. This will let you verify that each speaker is restricted to one set. It also allows excluding "unknown" speakers (in case an unknown speaker is actually just an unidentified existing one). Be aware: there is no "sentence overlap" check, as the importer assumes that different Wikipedia articles do not share identical sentences.
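For illustration, once those columns exist, a cross-set check along these lines could confirm the speaker separation; the "speaker" column name and the CSV file names are assumptions, since #2625 is only referenced here, not shown:

```python
# Hypothetical check that no speaker appears in more than one data set.
# Assumes #2625 adds a "speaker" column to the importer CSVs (the actual
# column name and file names may differ).
import csv

def speakers(csv_path):
    with open(csv_path, encoding="utf-8") as f:
        return {row["speaker"] for row in csv.DictReader(f)}

train = speakers("train_swc.csv")
dev = speakers("dev_swc.csv")
test = speakers("test_swc.csv")

for name, overlap in [("train/dev", train & dev),
                      ("train/test", train & test),
                      ("dev/test", dev & test)]:
    print("{} overlap: {}".format(name, sorted(overlap) if overlap else "none"))
```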