DeepSpeech: ERROR: AlignOutput: Can't determine stream position creating Scorer
- Have I written custom code (as opposed to running examples on an unmodified clone of the repository): Yes, to remove the pruning, but it does not work with pruning enabled either
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04.4 LTS
- TensorFlow installed from (our builds, or upstream TensorFlow): default install of the deepspeech package
- TensorFlow version (use command below): v1.15.0-92-g5d80e1e 1.15.2
- Python version: Python 3.6.9 (default, Apr 18 2020, 01:56:04) [GCC 8.4.0] on Linux
- Bazel version (if compiling from source): N/A
- GCC/Compiler version (if compiling from source): N/A
- CUDA/cuDNN version: N/A
- GPU model and memory: N/A
- Exact command to reproduce:
python3 generate_lm.py --input_txt names-corpus.txt --output_dir out/ --top_k 9000000 --kenlm_bins ~/Desktop/ken/kenlm/build/bin --arpa_order 5 --max_arpa_memory "90%" --binary_a_bits 255 --binary_q_bits 8 --binary_type trie
python3 generate_package.py --alphabet alphabet.txt --lm out/lm.binary --vocab out/vocab-9000000.txt --package pack/ --default_alpha 0.75 --default_beta 1.85
Output:
(deepspeech-venv) carlos@ubuntu:~/Desktop/scorer/DeepSpeech/data/lm$ python3 generate_package.py --alphabet alphabet.txt --lm out/lm.binary --vocab out/vocab-9000000.txt --package pack/ --default_alpha 0.75 --default_beta 1.85
49004 unique words read from vocabulary file.
Doesn't look like a character based model.
Using detected UTF-8 mode: False
ERROR: AlignOutput: Can't determine stream position
ERROR: Could not align file during write after header
Package created in pack/
swig/python detected a memory leak of type 'Alphabet *', no destructor found
This also fails with a clean env install and on my clean VM, with both Spanish and English text.
DeepSpeech version: https://github.com/mozilla/DeepSpeech/commit/572963e7bd5c12f0355b5816bb4c9600b71e8dbe
KenLM version: https://github.com/kpu/kenlm/commit/87e85e66c99ceff1fab2500a7c60c01da7315eec
Thanks
About this issue
- State: closed
- Created 4 years ago
- Reactions: 1
- Comments: 23 (1 by maintainers)
Commits related to this issue
- Fix #3053: Check output stream when producing scorer — committed to lissyx/STT by deleted user 4 years ago
- Merge pull request #3066 from lissyx/output-stream-error Fix #3053: Check output stream when producing scorer — committed to mozilla/DeepSpeech by lissyx 4 years ago
Yes, I confirm it shows the error with a directory as output:
Thanks!
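The confirmation above ties the error to passing a directory (pack/) as --package, and the related commits describe the fix as checking the output stream. As a rough illustration only, here is a minimal Python sketch of a pre-flight check one could run on the --package value before calling generate_package.py. It is not the actual patch from the pull request; the function name check_package_path and the file name kenlm.scorer are hypothetical and used only for this example.

```python
import os
import tempfile


def check_package_path(package_path):
    """Return an error message if package_path cannot be used as an
    output file, or None if it looks usable.

    Hypothetical helper for illustration; not part of DeepSpeech.
    """
    # A trailing separator (e.g. "pack/") or an existing directory
    # cannot be opened as a regular output file.
    if package_path.endswith(os.sep) or os.path.isdir(package_path):
        return "'{}' is a directory; pass a file path instead".format(package_path)

    # The directory that will contain the output file must already exist.
    parent = os.path.dirname(os.path.abspath(package_path))
    if not os.path.isdir(parent):
        return "parent directory '{}' does not exist".format(parent)

    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Directory as output: reported as an error.
        print(check_package_path(tmp))
        # File inside an existing directory: no error reported.
        print(check_package_path(os.path.join(tmp, "kenlm.scorer")))
```

Under the same assumption, the command from the report would point --package at a file inside the directory (for example pack/kenlm.scorer, an illustrative name) rather than at pack/ itself.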