tesseract: lstmtraining : terminate called after throwing an instance of 'std::system_error'

 uname -a
Linux tesseract-ocr 5.3.0-40-generic #32~18.04.1-Ubuntu SMP Mon Feb 3 14:05:15 UTC 2020 ppc64le ppc64le ppc64le GNU/Linux

 lscpu
Architecture:        ppc64le
Byte Order:          Little Endian
CPU(s):              8
On-line CPU(s) list: 0-7
Thread(s) per core:  1
Core(s) per socket:  1
Socket(s):           8
NUMA node(s):        1
Model:               2.1 (pvr 004b 0201)
Model name:          POWER8 (architected), altivec supported
Hypervisor vendor:   KVM
Virtualization type: para
L1d cache:           64K
L1i cache:           32K
NUMA node0 CPU(s):   0-7

 g++ --version
g++ (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0

 tesseract -v
tesseract 5.0.0-alpha-788-gda42
 leptonica-1.78.0
  libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.2) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
 Found OpenMP 201511

In a recent run of lstmtraining, I got the above error. I restarted the training and it happened again (4 times so far):

Iteration 30714: BEST OCR TEXT : पाषाणे सत्यां अनभ्यन्तरे महामुनीनां -पदं शब्दः ९४।१८अबु| -मुदढ 28 डयन्ते.
File gt/allfonts-Martel_Sans_Semi-Bold/Martel_Sans_Semi-Bold.15.exp0.lstmf line 9 :
Mean rms=0.736%, delta=2.331%, train=7.266%(22.381%), skip ratio=0%
terminate called after throwing an instance of 'std::system_error'
  what():  Resource temporarily unavailable
LAYER.sh: line 23:  2864 Aborted                 lstmtraining --debug_interval -1 --learning_rate 0.001 --continue_from ../training/$TRAINDIR/$STARTMODEL.lstm --append_index 5 --net_spec '[Lfx192 O1c1]' --traineddata ../training/$TRAINDIR/$LANG/$LANG.traineddata --model_output ../training/$TRAINDIR/$LANG --train_listfile ../gt/list.02 --eval_listfile ../gt/list.01 --max_iterations 50000000
Iteration 61409: GROUND  TRUTH : विस्तृत वीडियो होती - पाकिस्तान जाए बारिश हैं। 'क्या शुरू पड़ेंगे 18%, चीज़ें
Iteration 61409: BEST OCR TEXT : विस्तृत वीडियो होती - पाकिस्तान जाए बारिश हैं। क्या शुरु पंगे 18%, चीञें
File ../gt/kraken-devatest/Sura_Bold.15.exp0_17.lstmf line 0 :
Mean rms=0.653%, delta=1.872%, train=5.676%(19.293%), skip ratio=0%
terminate called after throwing an instance of 'std::system_error'
  what():  Resource temporarily unavailable
LAYER.sh: line 23:  5643 Aborted                 lstmtraining --debug_interval -1 --learning_rate 0.001 --continue_from ../training/$TRAINDIR/$STARTMODEL.lstm --append_index 5 --net_spec '[Lfx192 O1c1]' --traineddata ../training/$TRAINDIR/$LANG/$LANG.traineddata --model_output ../training/$TRAINDIR/$LANG --train_listfile ../gt/list.02 --eval_listfile ../gt/list.01 --max_iterations 50000000

Iteration 92112: GROUND  TRUTH : ऐकार aikāra [(ai)-kāra] m. le son ou la lettre 'ai'.
Iteration 92112: BEST OCR TEXT : ऍऐकार aikāra [(ai)-kāra] m. le son ou la lettre 'ai.
File ../gt/all1004/san.Amiko.0001023.exp0.lstmf line 0 :
Mean rms=0.59%, delta=1.594%, train=4.648%(16.392%), skip ratio=0%
terminate called after throwing an instance of 'std::system_error'
  what():  Resource temporarily unavailable
LAYER.sh: line 23: 15833 Aborted                 lstmtraining --debug_interval -1 --learning_rate 0.001 --continue_from ../training/$TRAINDIR/$STARTMODEL.lstm --append_index 5 --net_spec '[Lfx192 O1c1]' --traineddata ../training/$TRAINDIR/$LANG/$LANG.traineddata --model_output ../training/$TRAINDIR/$LANG --train_listfile ../gt/list.02 --eval_listfile ../gt/list.01 --max_iterations 50000000

Iteration 122811: GROUND  TRUTH : E. aśita eaten, and gavīna relating to cattle; also āśitaṅgavīna.
File ../gt/all1005/san.Mukta.0000129.exp0.lstmf line 0 (Perfect):
Mean rms=0.563%, delta=1.479%, train=4.183%(15.125%), skip ratio=0%
terminate called after throwing an instance of 'std::system_error'
  what():  Resource temporarily unavailable
LAYER.sh: line 23: 22227 Aborted                 lstmtraining --debug_interval -1 --learning_rate 0.001 --continue_from ../training/$TRAINDIR/$STARTMODEL.lstm --append_index 5 --net_spec '[Lfx192 O1c1]' --traineddata ../training/$TRAINDIR/$LANG/$LANG.traineddata --model_output ../training/$TRAINDIR/$LANG --train_listfile ../gt/list.02 --eval_listfile ../gt/list.01 --max_iterations 50000000
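
The four aborts appear to be evenly spaced. A minimal Python sketch, using only the iteration numbers logged just before each abort above, makes the spacing explicit. It assumes each restart resumed close to where the previous run died, which the log does not confirm, so the gaps are approximate.

```python
# Last iteration numbers logged before each abort, copied from the log above.
crash_iterations = [30714, 61409, 92112, 122811]


def gaps(values):
    """Return the differences between consecutive entries."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


crash_gaps = gaps(crash_iterations)
print(crash_gaps)  # [30695, 30703, 30699]
```

A near-constant gap of roughly 30700 iterations would be consistent with some per-process resource being used up at a steady rate, rather than with a problem in one particular training sample.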

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 36 (15 by maintainers)

Most upvoted comments

ulimit -s 32768

That didn’t work.

Following suggestions in this discussion thread, I raised these kernel limits:

echo 200000 | sudo tee -a /proc/sys/kernel/pid_max
echo 600000 | sudo tee -a /proc/sys/vm/max_map_count
echo 120000 | sudo tee -a /proc/sys/kernel/threads-max

I also added DefaultTasksMax=99999 to /etc/systemd/system.conf and UserTasksMax=100000 to /etc/systemd/logind.conf.
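
To confirm that the new values are actually in effect, the limits can be read back. The following is a rough Python sketch, not part of Tesseract. The helper names `read_int` and `current_limits` are made up for illustration, and the per-user `RLIMIT_NPROC` check is an addition that the comment above does not mention.

```python
import resource
from pathlib import Path

# Kernel tunables changed with the tee commands above.
PROC_LIMITS = {
    "pid_max": "/proc/sys/kernel/pid_max",
    "max_map_count": "/proc/sys/vm/max_map_count",
    "threads-max": "/proc/sys/kernel/threads-max",
}


def read_int(path):
    """Return the integer stored in a procfs file, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def current_limits():
    """Collect the kernel tunables plus the per-user process/thread limit."""
    limits = {name: read_int(path) for name, path in PROC_LIMITS.items()}
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    # A value equal to resource.RLIM_INFINITY means "unlimited".
    limits["RLIMIT_NPROC (soft)"] = soft
    limits["RLIMIT_NPROC (hard)"] = hard
    return limits


if __name__ == "__main__":
    for name, value in current_limits().items():
        print(f"{name}: {value}")
```

Values written under /proc/sys do not survive a reboot, so running such a check after a restart shows whether the settings still apply.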

Earlier runs crashed after about 27700 iterations. After the above changes, the current run is still going (about 30000 iterations so far). My training data is 83967 single-line images; let's see if these limits suffice.
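
One way to see whether a run is drifting toward these limits is to watch the thread count of the training process. The Python sketch below reads the `Threads:` field from /proc/<pid>/status. It assumes the crash is related to thread or task limits, which the settings above suggest but nothing in this thread proves. `thread_count` is a hypothetical helper, not a Tesseract tool.

```python
import os
from pathlib import Path


def thread_count(pid):
    """Return the 'Threads:' value from /proc/<pid>/status, or None if unavailable."""
    try:
        text = Path(f"/proc/{pid}/status").read_text()
    except OSError:
        return None
    for line in text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split()[1])
    return None


if __name__ == "__main__":
    # Demonstration on the current process; use the lstmtraining PID instead.
    print(thread_count(os.getpid()))
```

Calling it periodically with the lstmtraining PID and logging the result alongside the iteration number would show whether the count climbs steadily or stays flat.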

The issue is now fixed in commit 73a1bfc4e881d1ce05067546742bea04f488f811.