tesseract: user_words_suffix not working
We are trying to provide a user words file via the available control parameters. Unfortunately, I am getting the error below.
Environment
- Tesseract Version: tesseract 4.00.00alpha leptonica-1.74.4 libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.1) : libpng 1.6.34 : libtiff 4.0.8 : zlib 1.2.11 : libwebp 0.6.0 : libopenjp2 2.2.0 Found AVX Found SSE
- Platform: Linux 9cadf37d2e9c 4.9.49-moby #1 SMP Wed Sep 27 23:17:17 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Current Behavior:
I am using the parameters below to supply the user words file:
tesseract --user_words_file /usr/share/tesseract-ocr/4.00/tessdata/eng.user-words -psm=1 -l=eng source.ppm res11 txt
I get this error:
read_params_file: parameter not found: P6
Is this supported in the above Tesseract version? The help output lists these parameters:
user_words_file A filename of user-provided words.
user_words_suffix A suffix of user-provided words located in tessdata.
user_patterns_file A filename of user-provided patterns.
user_patterns_suffix A suffix of user-provided patterns located in tessdata.
Please note that I have tried every option I could find for supplying the file: user_words | user_words_file | user_words_suffix | user_patterns_file | user_patterns_suffix
Please suggest the right way to achieve this.
Thanks,
- Dev
About this issue
- Original URL
- State: closed
- Created 6 years ago
- Comments: 20 (1 by maintainers)
Ok, I added the lines to Dict::LoadLSTM and it works, with the following:
Dnline is corrected to Online
EDIT: While it works for this particular image, I haven’t got it to work with others yet.
I now need to find an image that does not work with tessdata_best and tessdata_fast in order to test further.
Use --psm 6 instead of -psm 6
@Shreeshrii Is that still the case?
The user_words file is just a hint given to the OCR engine.
-psm=1 -l=eng => --psm 1 -l eng
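Putting the corrections from this thread together, here is a sketch of how the original command might be rewritten. It assumes the usual argument order (image, output base, then options, then config files) and uses the --user-words option of the Tesseract command line instead of --user_words_file, which is a control parameter name and not a command-line option. A likely explanation for the reported error, though not confirmed in the thread, is that the misparsed arguments caused source.ppm to be read as a config file: P6 is the magic number at the start of a binary PPM image. The paths are copied from the report and may differ on your system. As noted above, the word list is only a hint, and in this alpha version the LSTM engine may not load it at all.

```shell
#!/bin/sh
# Paths copied from the report above; adjust them for your system.
IMAGE="source.ppm"
OUTPUT_BASE="res11"
USER_WORDS="/usr/share/tesseract-ocr/4.00/tessdata/eng.user-words"

# Variant 1: pass the word list by path with --user-words.
# Image and output base come first; options take a space, not "=".
CMD_PATH="tesseract $IMAGE $OUTPUT_BASE --user-words $USER_WORDS --psm 1 -l eng txt"

# Variant 2: set the control parameter with -c, so that
# tessdata/eng.user-words is looked up by its suffix.
CMD_SUFFIX="tesseract $IMAGE $OUTPUT_BASE -c user_words_suffix=user-words --psm 1 -l eng txt"

# Print both commands for review.
echo "$CMD_PATH"
echo "$CMD_SUFFIX"

# Run variant 1 only when tesseract and the image are both present.
if command -v tesseract >/dev/null 2>&1 && [ -f "$IMAGE" ]; then
    tesseract "$IMAGE" "$OUTPUT_BASE" --user-words "$USER_WORDS" --psm 1 -l eng txt
fi
```

Variant 2 only makes sense when the word list sits in the tessdata directory and is named after the language, as eng.user-words is here.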