DeepSpeech: ERROR: AlignOutput: Can't determine stream position creating Scorer

Have I written custom code (as opposed to running examples on an unmodified clone of the repository): Yes, to remove the pruning step, but the error also occurs with pruning enabled
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04.4 LTS
TensorFlow installed from (our builds, or upstream TensorFlow): default install of the deepspeech package
TensorFlow version (use command below): v1.15.0-92-g5d80e1e 1.15.2
Python version: Python 3.6.9 (default, Apr 18 2020, 01:56:04) [GCC 8.4.0] on Linux
Bazel version (if compiling from source): N/A
GCC/Compiler version (if compiling from source): N/A
CUDA/cuDNN version: N/A
GPU model and memory: N/A
Exact command to reproduce:

python3 generate_lm.py --input_txt names-corpus.txt --output_dir out/ --top_k 9000000 --kenlm_bins ~/Desktop/ken/kenlm/build/bin --arpa_order 5 --max_arpa_memory "90%" --binary_a_bits 255 --binary_q_bits 8 --binary_type trie 

python3 generate_package.py --alphabet alphabet.txt --lm out/lm.binary --vocab out/vocab-9000000.txt --package pack/ --default_alpha 0.75 --default_beta 1.85

Output:

(deepspeech-venv) carlos@ubuntu:~/Desktop/scorer/DeepSpeech/data/lm$ python3 generate_package.py --alphabet alphabet.txt --lm out/lm.binary --vocab out/vocab-9000000.txt --package pack/ --default_alpha 0.75 --default_beta 1.85
49004 unique words read from vocabulary file.
Doesn't look like a character based model.
Using detected UTF-8 mode: False
ERROR: AlignOutput: Can't determine stream position
ERROR: Could not align file during write after header
Package created in pack/
swig/python detected a memory leak of type 'Alphabet *', no destructor found

This also fails on a clean environment install and on a clean VM, with both Spanish and English text.

DeepSpeech version: https://github.com/mozilla/DeepSpeech/commit/572963e7bd5c12f0355b5816bb4c9600b71e8dbe

KenLM version: https://github.com/kpu/kenlm/commit/87e85e66c99ceff1fab2500a7c60c01da7315eec

Thanks

About this issue

State: closed
Created 4 years ago
Reactions: 1
Comments: 23 (1 by maintainers)

Commits related to this issue

Fix #3053: Check output stream when producing scorer — committed to lissyx/STT by deleted user 4 years ago
Merge pull request #3066 from lissyx/output-stream-error Fix #3053: Check output stream when producing scorer — committed to mozilla/DeepSpeech by lissyx 4 years ago

Most upvoted comments

Checking !fout does reproduce the error locally when a directory is passed as the output path

Yes, I confirm it shows the error with a directory as output:

Using detected UTF-8 mode: True
Error opening 'pack/'
Error when creating pack/

Thanks!

carlfm01 on Jun 17, 2020
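
The comments above trace the error to --package being given a directory (pack/) rather than a file name. A minimal sketch of a pre-flight check that could be run before calling generate_package.py follows. It is not part of DeepSpeech: the helper name check_package_path and the file name kenlm.scorer are hypothetical and used only for illustration.

```python
import os
import tempfile


def check_package_path(path):
    """Return path if it names a writable file location, else raise ValueError.

    Hypothetical helper, not a DeepSpeech API.
    """
    # A trailing separator or an existing directory means a directory was
    # passed (as with `--package pack/`) instead of a file name.
    if path.endswith(os.sep) or os.path.isdir(path):
        raise ValueError(f"--package must name a file, got a directory: {path!r}")
    # The parent directory has to exist for the file to be created.
    parent = os.path.dirname(path) or "."
    if not os.path.isdir(parent):
        raise ValueError(f"parent directory does not exist: {parent!r}")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        pack_dir = os.path.join(tmp, "pack")
        os.mkdir(pack_dir)
        try:
            # Mirrors the failing invocation: the output is a directory.
            check_package_path(pack_dir + os.sep)
        except ValueError as err:
            print("rejected:", err)
        # A file name inside the directory passes the check.
        print("accepted:", check_package_path(os.path.join(pack_dir, "kenlm.scorer")))
```

Under that assumption, the original command would pass a file path such as pack/kenlm.scorer to --package instead of pack/.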