tesseract: Tesseract 4.0.0 crashed on Intel I5-8400 CPU with Debian 9.6.0 amd64 (SSE/AVX/AVX2)

Environment

  • Tesseract Version: 4.0.0 Release
  • Commit Number: 51316994ccae0b48692d547030f26c0969308214
  • Platform: Debian 9.6.0 amd64

Current Behavior: Tesseract 4.0.0 crashed on Intel i5-8400 CPU with Debian 9.6.0 amd64 (SSE/AVX/AVX2).

I compiled Tesseract 4.0 on an Intel i5-8400 CPU with Debian 9.6.0 amd64. tesseract --version prints:

tesseract 4.0.0
 leptonica-1.74.2
  libjpeg 6b (libjpeg-turbo 1.5.1) : libpng 1.6.28 : libtiff 4.0.8 : zlib 1.2.8
 Found AVX2
 Found AVX
 Found SSE

When I call tesseract several times, it crashes and the PC reboots.

I also have an Intel G4650 CPU, which does not support AVX or AVX2, and on it everything works fine: a crash never happens. How can I make Tesseract work reliably on the Intel i5-8400 with AVX/AVX2/SSE?
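
To compare the two machines, it can help to confirm which SIMD extensions each CPU advertises to Linux. The following is a minimal sketch, assuming an x86 Linux system where /proc/cpuinfo contains a "flags" line; simd_flags is a hypothetical helper name, not part of Tesseract.

```shell
#!/bin/sh
# simd_flags (hypothetical helper): report whether a CPU advertises the
# SIMD extensions discussed in this issue. Reads Linux-style cpuinfo text;
# the optional first argument defaults to /proc/cpuinfo.
simd_flags() {
    cpuinfo="${1:-/proc/cpuinfo}"
    # Take the first "flags" line and split its value into one flag per line.
    flags=$(grep -m1 '^flags' "$cpuinfo" | cut -d: -f2 | tr ' ' '\n')
    for want in sse4_1 avx avx2; do
        # -x matches whole lines only, so "avx" does not match "avx2".
        if printf '%s\n' "$flags" | grep -qx "$want"; then
            echo "$want: yes"
        else
            echo "$want: no"
        fi
    done
}
```

Calling simd_flags with no argument inspects the running machine. A flag reported as "yes" only means the CPU advertises the extension; it says nothing about whether the system stays stable under AVX2 load.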

Expected Behavior:

Suggested Fix:

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 2
  • Comments: 88 (36 by maintainers)

Most upvoted comments

@edilinux : if the issue is solved, please close it.

I am experiencing the same problem with Tesseract 4.1.1 running on openSUSE Tumbleweed with Linux 6.1.1. My system has an Intel Core i7-8700 @ 3.20 GHz. When I try to use Tesseract to OCR certain files, my entire computer reboots. The problem is consistently reproducible.

Adding -c dotproduct=sse to the command line works around the problem. Strangely, using export OMP_THREAD_LIMIT=1 before running Tesseract causes Tesseract to emit the following error message:

Error opening data file /app/share/tessdata/deu.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
Failed loading language 'deu'
Missing = in configvar assignment

I have another computer with an identical software configuration (exact same versions of Tumbleweed, the Linux kernel, and Tesseract) but with an Intel Core i7-7700K CPU, and it does not exhibit the problem.

Since several other people here have reported the same hard reset behaviour, I’m skeptical that this is a purely electrical issue. (Can it really be that all of our computers have failing or underpowered power supplies?) Could this be a bug in the Linux kernel, or in the CPU’s microcode, or some design flaw with the CPU itself? Should this be reported upstream anywhere, and if so, where?

I don’t think that it is related to the filesystem.

Summary of the currently known facts:

We now have at least two cases with the Intel Core i5-8400, a CPU that reports AVX2 support, yet running Tesseract's AVX2 code on it not only crashes Tesseract but reboots the whole Linux system.

That could be a bug in the Intel microcode. @edilinux and @mikegerber, which microcode package do you have installed? Is it current?
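
One way to answer the microcode question is to read the revision the kernel reports. This is a rough sketch, assuming an x86 Linux system whose /proc/cpuinfo exposes a "microcode" field; microcode_revision is a hypothetical helper name.

```shell
#!/bin/sh
# microcode_revision (hypothetical helper): print the microcode revision
# recorded in Linux-style cpuinfo text. The optional first argument
# defaults to /proc/cpuinfo.
microcode_revision() {
    cpuinfo="${1:-/proc/cpuinfo}"
    # The field repeats once per logical CPU; the first entry is enough.
    # Strip the spaces and tabs around the value after the colon.
    grep -m1 '^microcode' "$cpuinfo" | cut -d: -f2 | tr -d ' \t'
}
```

On Debian, the installed package version can additionally be checked with dpkg -s intel-microcode, assuming the package comes from the distribution's non-free archive.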

Using SSE 4.2 instead of AVX2 by calling Tesseract with -c dotproduct=sse is a working workaround.
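
The workaround can be wrapped in a small shell function so it is not forgotten on affected machines. This is only a sketch: ocr_sse and the TESSERACT_BIN override are hypothetical names introduced here, and the thread reports -c dotproduct=sse for Tesseract 4.1.1, so whether a given build knows the dotproduct variable is an assumption worth verifying (for example with tesseract --print-parameters).

```shell
#!/bin/sh
# ocr_sse (hypothetical wrapper): run tesseract with the SSE dot product
# workaround from this thread. The option is placed directly after the
# output base so it precedes any config file names passed by the caller.
# TESSERACT_BIN is an assumed override for the binary; default "tesseract".
ocr_sse() {
    if [ "$#" -lt 2 ]; then
        echo "usage: ocr_sse IMAGE OUTPUTBASE [OPTIONS...] [CONFIGFILE...]" >&2
        return 2
    fi
    image=$1
    outputbase=$2
    shift 2
    "${TESSERACT_BIN:-tesseract}" "$image" "$outputbase" -c dotproduct=sse "$@"
}
```

For example, ocr_sse scan.png out -l deu would run tesseract scan.png out -c dotproduct=sse -l deu, where scan.png and out are placeholder names.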