tesseract: text2image segfault

I’m trying to use the text2image utility to train Tesseract. Unfortunately, it crashes every time I run it 😦

text2image --text=training_text.txt --outputbase=test.MenloMedium.exp0 --font='Menlo Medium' --fonts_dir=/Library/Fonts/
(lldb) run
Process 49926 launched: '/usr/local/bin/text2image' (x86_64)
Process 49926 stopped
* thread #1: tid = 0x1d2b8cb, 0x0000000100b74358 libpangoft2-1.0.0.dylib`pango_fc_font_get_glyph + 25, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x0000000100b74358 libpangoft2-1.0.0.dylib`pango_fc_font_get_glyph + 25
libpangoft2-1.0.0.dylib`pango_fc_font_get_glyph:
->  0x100b74358 <+25>: movq   (%rcx), %rdi
    0x100b7435b <+28>: testq  %rdi, %rdi
    0x100b7435e <+31>: je     0x100b74369               ; <+42>
    0x100b74360 <+33>: movq   %rax, %rsi
(lldb) bt
* thread #1: tid = 0x1d2b8cb, 0x0000000100b74358 libpangoft2-1.0.0.dylib`pango_fc_font_get_glyph + 25, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  * frame #0: 0x0000000100b74358 libpangoft2-1.0.0.dylib`pango_fc_font_get_glyph + 25
    frame #1: 0x000000010000edc1 text2image`tesseract::PangoFontInfo::CanRenderString(char const*, int, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >*) const + 321
    frame #2: 0x000000010000ec57 text2image`tesseract::PangoFontInfo::CanRenderString(char const*, int) const + 33
    frame #3: 0x0000000100015227 text2image`tesseract::StringRenderer::StripUnrenderableWords(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >*) const + 193
    frame #4: 0x00000001000154aa text2image`tesseract::StringRenderer::RenderToImage(char const*, int, Pix**) + 418
    frame #5: 0x0000000100005748 text2image`main + 2891
    frame #6: 0x00007fff8a2645ad libdyld.dylib`start + 1
    frame #7: 0x00007fff8a2645ad libdyld.dylib`start + 1
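
The backtrace shows the null dereference happening inside pango_fc_font_get_glyph, reached from PangoFontInfo::CanRenderString, so one thing that may be worth ruling out is whether text2image can see the requested font at all. The sketch below is only a sanity check under assumptions: font_listed is a hypothetical helper (not part of Tesseract), and it assumes that --list_available_fonts prints one font name per line in some form.

```shell
# font_listed is a hypothetical helper, not part of Tesseract or fontconfig.
# It reads a font listing on stdin and succeeds if the given name appears in it
# (case-insensitive, matched as a fixed string rather than a regex).
font_listed() {
  grep -qiF -- "$1"
}

# Assumed usage: --list_available_fonts prints the fonts text2image can see
# in --fonts_dir; 2>&1 is used in case the list is written to stderr.
if command -v text2image >/dev/null 2>&1; then
  if text2image --list_available_fonts --fonts_dir=/Library/Fonts/ 2>&1 \
      | font_listed 'Menlo Medium'; then
    echo "font is visible to text2image"
  else
    echo "font is NOT listed; check the name and --fonts_dir"
  fi
fi
```

If the font is not listed, the name passed to --font may differ from what Pango reports. If it is listed, the crash needs a different explanation.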

About this issue

  • State: closed
  • Created 8 years ago
  • Comments: 39 (5 by maintainers)

Most upvoted comments

You can redirect the output to a file (in bash, &> captures both stdout and stderr):

text2image --find_fonts \
--fonts_dir  /usr/share/fonts/truetype/dejavu/ \
--text ../langdata/eng/eng.training_text \
--min_coverage .99  \
--outputbase ../langdata/eng/eng &>./test.txt

test.txt then contains:

Total chars = 6694
DejaVu Sans : 6694 hits = 100.00%, raw = 112 = 100.00%
Rendered page 0 to file ../langdata/eng/eng.DejaVu_Sans.tif
DejaVu Sans Bold : 6694 hits = 100.00%, raw = 112 = 100.00%
Rendered page 1 to file ../langdata/eng/eng.DejaVu_Sans_Bold.tif
DejaVu Sans Mono : 6694 hits = 100.00%, raw = 112 = 100.00%
Rendered page 2 to file ../langdata/eng/eng.DejaVu_Sans_Mono.tif
DejaVu Sans Mono Bold : 6694 hits = 100.00%, raw = 112 = 100.00%
Rendered page 3 to file ../langdata/eng/eng.DejaVu_Sans_Mono_Bold.tif
DejaVu Serif : 6694 hits = 100.00%, raw = 112 = 100.00%
Rendered page 4 to file ../langdata/eng/eng.DejaVu_Serif.tif
DejaVu Serif Bold : 6694 hits = 100.00%, raw = 112 = 100.00%
Rendered page 5 to file ../langdata/eng/eng.DejaVu_Serif_Bold.tif
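
Once the report is in a file, it can be filtered with standard tools. The following is a minimal sketch, assuming the report lines keep the "Font Name : N hits = P%, raw = ..." shape shown above. fonts_with_coverage is a hypothetical helper name, not something shipped with Tesseract.

```shell
# fonts_with_coverage is a hypothetical helper, not part of Tesseract.
# It prints the names of fonts whose "hits" coverage in a --find_fonts report
# is at least the given percentage (default 99).
# Assumed line format: "<Font Name> : <n> hits = <pct>%, raw = ..."
fonts_with_coverage() {
  report=$1
  min_pct=${2:-99}
  awk -v min="$min_pct" '
    / hits = / {
      # Font name: everything before the first " : "
      name = $0
      sub(/ : .*/, "", name)
      # Coverage: the number between "hits = " and the first "%"
      pct = $0
      sub(/.* hits = /, "", pct)
      sub(/%.*/, "", pct)
      if (pct + 0 >= min + 0) print name
    }
  ' "$report"
}
```

For example, fonts_with_coverage ./test.txt 99 would list every font in the report whose hit coverage is 99% or higher, one name per line.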

Without the redirect, the same command prints the report directly to the terminal:

text2image --find_fonts \
 --fonts_dir  /usr/share/fonts/truetype/dejavu/ \
 --text ../langdata/eng/eng.training_text \
 --min_coverage .99  \
 --outputbase ../langdata/eng/eng

Total chars = 6694
DejaVu Sans : 6694 hits = 100.00%, raw = 112 = 100.00%
Rendered page 0 to file ../langdata/eng/eng.DejaVu_Sans.tif
DejaVu Sans Bold : 6694 hits = 100.00%, raw = 112 = 100.00%
Rendered page 1 to file ../langdata/eng/eng.DejaVu_Sans_Bold.tif
DejaVu Sans Mono : 6694 hits = 100.00%, raw = 112 = 100.00%
Rendered page 2 to file ../langdata/eng/eng.DejaVu_Sans_Mono.tif
DejaVu Sans Mono Bold : 6694 hits = 100.00%, raw = 112 = 100.00%
Rendered page 3 to file ../langdata/eng/eng.DejaVu_Sans_Mono_Bold.tif
DejaVu Serif : 6694 hits = 100.00%, raw = 112 = 100.00%
Rendered page 4 to file ../langdata/eng/eng.DejaVu_Serif.tif
DejaVu Serif Bold : 6694 hits = 100.00%, raw = 112 = 100.00%
Rendered page 5 to file ../langdata/eng/eng.DejaVu_Serif_Bold.tif