tesseract: text2image segmentation fault on macOS (regression of #195?)

There seems to be a regression in text2image in the 3.05 release. The text2image utility segfaults, so I couldn't use tesstrain.sh on macOS (10.12.3).

$ text2image --text eng.training_text --outputbase=eng.TimesNewRomanBold.exp0 --font='Times New Roman, Bold' --fonts_dir=/Library/Fonts --tlog_level=3
query weight = 700 	 selected weight =700
query_desc: 'Times New Roman, Bold' Selected: 'Times New Roman, Bold'
Render string of size 6801
Starting page 0
max_width = 3400, max_height = 4600
len = 6801  buf_len = 6801
Found offset = 6421
Segmentation fault: 11

ref. https://github.com/tesseract-ocr/tesseract/issues/195

Debugger output, for more detail:

$ lldb /usr/local/bin/text2image
(lldb) target create "/usr/local/bin/text2image"
Current executable set to '/usr/local/bin/text2image' (x86_64).
(lldb) run --text eng.training_text --outputbase=eng.TimesNewRomanBold.exp0 --font='Times New Roman, Bold' --fonts_dir=/Library/Fonts
Process 40801 launched: '/usr/local/bin/text2image' (x86_64)
Process 40801 stopped
* thread #1: tid = 0xffe117, 0x0000000100bdd5a7 libpangoft2-1.0.0.dylib`pango_fc_font_get_glyph + 25, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x0000000100bdd5a7 libpangoft2-1.0.0.dylib`pango_fc_font_get_glyph + 25
libpangoft2-1.0.0.dylib`pango_fc_font_get_glyph:
->  0x100bdd5a7 <+25>: movq   (%rcx), %rdi
    0x100bdd5aa <+28>: testq  %rdi, %rdi
    0x100bdd5ad <+31>: je     0x100bdd5b8               ; <+42>
    0x100bdd5af <+33>: movq   %rax, %rsi
(lldb)
$ tesseract -v
tesseract 3.05.00
 leptonica-1.74.1
  libjpeg 8d : libpng 1.6.28 : libtiff 4.0.7 : zlib 1.2.8

About this issue

  • State: closed
  • Created 7 years ago
  • Reactions: 2
  • Comments: 41 (23 by maintainers)

Most upvoted comments

No error occurred. It seems to be working.

$ PANGOCAIRO_BACKEND=fc text2image --text eng.training.txt --outputbase=eng.TimesNewRomanBold.exp0 --font='Times New Roman, Bold' --fonts_dir=/Library/Fonts --tlog_level=3 &> detail_log_736_1.txt

detail_log_736_1.txt

However, macOS's Preview.app can't open the resulting tif file (although GIMP can open it).
I'll try running tesstrain.sh later.

Try to manually set the environment variable PANGOCAIRO_BACKEND to fc and then call text2image.
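A minimal sketch of that manual workaround, assuming a POSIX-style shell. The text2image flags are copied from the original report; the `backend_seen_by_child` variable is only an illustrative name used to check that the setting is inherited by child processes. Exporting the variable (rather than prefixing a single command) should also make it visible to scripts such as tesstrain.sh that call text2image themselves.

```shell
# Select the fontconfig backend for Pango/Cairo in this shell session.
export PANGOCAIRO_BACKEND=fc

# Sanity check: a child process should see the exported value.
# (backend_seen_by_child is a hypothetical helper variable for this sketch.)
backend_seen_by_child=$(sh -c 'printf %s "$PANGOCAIRO_BACKEND"')
echo "child process sees PANGOCAIRO_BACKEND=$backend_seen_by_child"

# Re-run the failing command from the report, but only if the tool and
# the training text are actually present in the current directory.
if command -v text2image >/dev/null 2>&1 && [ -f eng.training_text ]; then
  text2image --text eng.training_text \
    --outputbase=eng.TimesNewRomanBold.exp0 \
    --font='Times New Roman, Bold' \
    --fonts_dir=/Library/Fonts
fi
```

To limit the setting to one invocation instead, prefix the command with `PANGOCAIRO_BACKEND=fc` and skip the `export`.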

It should also work without my patch.

The patch was supposed to set that environment variable programmatically. Since you reported that it did not work, I asked you to set it manually. The call to the setenv function in my patch needs to be moved to another place in the code.