komga: [Bug] Komga fails to close PDF files, may cause OOM
Komga environment
- Komga version:
- I am running Komga with Docker
- Docker image tag [e.g. latest, beta]: 5f5804bbebbf
- I am running Komga from the jar
- Java version:
- I have a problem in the web interface
- Browser (with version):
- I have a problem with an OPDS client application
- OPDS Application (with version):
- I have a problem with the Tachiyomi extension
- Tachiyomi version:
- Tachiyomi extension version:
Describe the bug
My log files are full of this warning while analyzing files:
2022-01-27 13:00:23.493 WARN 1 --- [Finalizer] org.apache.pdfbox.cos.COSDocument : Warning: You did not close a PDF Document
Which seems to lead to this warning:
2022-01-27 17:49:38.937 WARN 1 --- [org.springframework.jms.JmsListenerEndpointContainer#1-3] o.g.k.infrastructure.jms.ArtemisConfig : Java heap space
Komga then crashes after running out of memory. I’m unsure whether the two are actually connected, so I’m just reporting the failure to close PDF files.
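For context, PDFBox logs the `[Finalizer]` warning above when a document object is garbage-collected without `close()` having been called. The following is a minimal sketch of that leak and the usual fix, using only the Java standard library. `TrackedDocument` and `Analyzer` are hypothetical stand-ins invented for illustration; they are not Komga's or PDFBox's actual classes. PDFBox's `PDDocument` implements `Closeable`, so the same try-with-resources pattern (or Kotlin's `use`) should apply to it.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a PDF document handle; not PDFBox's real class. */
final class TrackedDocument implements AutoCloseable {
    // Number of documents that have been opened but not yet closed.
    static final AtomicInteger OPEN = new AtomicInteger();

    private final int pageCount;
    private boolean closed;

    TrackedDocument(int pageCount) {
        this.pageCount = pageCount;
        OPEN.incrementAndGet();
    }

    int pageCount() {
        if (closed) {
            throw new IllegalStateException("document is closed");
        }
        return pageCount;
    }

    @Override
    public void close() {
        // Idempotent: only the first call releases the handle.
        if (!closed) {
            closed = true;
            OPEN.decrementAndGet();
        }
    }
}

final class Analyzer {
    /** Leaky variant: the handle goes out of scope without close(). */
    static int countPagesLeaky(int pages) {
        TrackedDocument doc = new TrackedDocument(pages);
        return doc.pageCount();
    }

    /** Safe variant: try-with-resources closes the handle even if pageCount() throws. */
    static int countPagesSafe(int pages) {
        try (TrackedDocument doc = new TrackedDocument(pages)) {
            return doc.pageCount();
        }
    }
}
```

In this sketch, each call to `countPagesLeaky` leaves `TrackedDocument.OPEN` one higher, which mirrors how unclosed documents could keep their buffers alive until the finalizer runs. Whether that is what drives the heap exhaustion reported here is not established by the logs.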
Steps to reproduce
- Add many PDF documents to library
- Scan the library for new documents, check the logs
- See the failure to close PDF files and, with enough PDFs, increased memory usage
Expected behavior
No warning log
Actual behavior
The above warning log
Log file
2022-01-27 17:49:32.161 WARN 1 --- [Finalizer] org.apache.pdfbox.cos.COSDocument : Warning: You did not close a PDF Document
2022-01-27 17:49:32.161 WARN 1 --- [org.springframework.jms.JmsListenerEndpointContainer#2-5] o.s.j.l.DefaultMessageListenerContainer : Setup of JMS message listener invoker failed for destination 'sse' - trying to recover. Cause: Java heap space
2022-01-27 17:49:32.252 WARN 1 --- [org.springframework.jms.JmsListenerEndpointContainer#1-3] o.g.k.infrastructure.jms.ArtemisConfig : Java heap space
2022-01-27 17:49:32.256 INFO 1 --- [org.springframework.jms.JmsListenerEndpointContainer#1-3] o.g.komga.application.tasks.TaskHandler : Executing task: AnalyzeBook(bookId='07KYJJR9VC2XK', priority='4')
2022-01-27 17:49:32.257 INFO 1 --- [org.springframework.jms.JmsListenerEndpointContainer#1-3] o.g.komga.domain.service.BookLifecycle : Analyze and persist book: Book(name=JJBA 5 - Stone Ocean 13, url=file:/data/Manga/JJBA%206%20-%20Stone%20Ocean/JJBA%205%20-%20Stone%20Ocean%2013.pdf, fileLastModified=2014-12-27T16:21:11, fileSize=165094497, fileHash=131jxpc, number=13, id=07KYJJR9VC2XK, seriesId=07KYJJR9QC4FD, libraryId=07KYHZ1G3C8VS, deletedDate=null, createdDate=2022-01-27T14:39:09, lastModifiedDate=2022-01-27T17:46:42.696)
2022-01-27 17:49:32.258 INFO 1 --- [org.springframework.jms.JmsListenerEndpointContainer#1-3] o.g.komga.domain.service.BookAnalyzer : Trying to analyze book: Book(name=JJBA 5 - Stone Ocean 13, url=file:/data/Manga/JJBA%206%20-%20Stone%20Ocean/JJBA%205%20-%20Stone%20Ocean%2013.pdf, fileLastModified=2014-12-27T16:21:11, fileSize=165094497, fileHash=131jxpc, number=13, id=07KYJJR9VC2XK, seriesId=07KYJJR9QC4FD, libraryId=07KYHZ1G3C8VS, deletedDate=null, createdDate=2022-01-27T14:39:09, lastModifiedDate=2022-01-27T17:46:42.696)
2022-01-27 17:49:32.260 INFO 1 --- [org.springframework.jms.JmsListenerEndpointContainer#1-3] o.g.komga.domain.service.BookAnalyzer : Detected media type: application/pdf
2022-01-27 17:49:38.850 WARN 1 --- [Finalizer] org.apache.pdfbox.cos.COSDocument : Warning: You did not close a PDF Document
2022-01-27 17:49:38.936 WARN 1 --- [Thread-6 (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$6@30a62a5b)] org.apache.activemq.artemis.core.server : AMQ222149: Message Reference[10737423979]:RELIABLE:CoreMessage[messageID=10737423979,durable=true,userID=262fc483-7fdc-11ec-8f52-0242ac110002,priority=4, timestamp=Thu Jan 27 17:46:45 PST 2022,expiration=0, durable=true, address=tasks.background,size=625,properties=TypedProperties[subtype=AnalyzeBook,_AMQ_GROUP_ID=D,__AMQ_CID=fe3f1351-7fd9-11ec-8f52-0242ac110002,unique_id=ANALYZE_BOOK_07KYJJR9VC2XK,_AMQ_ROUTING_TYPE=1,type=task]]@277474725 has reached maximum delivery attempts, sending it to Dead Letter Address DLQ from tasks.background
2022-01-27 17:49:38.937 WARN 1 --- [org.springframework.jms.JmsListenerEndpointContainer#1-3] o.g.k.infrastructure.jms.ArtemisConfig : Java heap space
2022-01-27 17:49:38.941 INFO 1 --- [org.springframework.jms.JmsListenerEndpointContainer#1-3] o.g.komga.application.tasks.TaskHandler : Executing task: AnalyzeBook(bookId='07KYJJR9VC2XM', priority='4')
2022-01-27 17:49:38.942 INFO 1 --- [org.springframework.jms.JmsListenerEndpointContainer#1-3] o.g.komga.domain.service.BookLifecycle : Analyze and persist book: Book(name=JJBA 5 - Stone Ocean 14, url=file:/data/Manga/JJBA%206%20-%20Stone%20Ocean/JJBA%205%20-%20Stone%20Ocean%2014.pdf, fileLastModified=2014-12-27T16:22:34, fileSize=153702496, fileHash=py6jwd, number=14, id=07KYJJR9VC2XM, seriesId=07KYJJR9QC4FD, libraryId=07KYHZ1G3C8VS, deletedDate=null, createdDate=2022-01-27T14:39:09, lastModifiedDate=2022-01-27T17:46:42.697)
2022-01-27 17:49:38.944 INFO 1 --- [org.springframework.jms.JmsListenerEndpointContainer#1-3] o.g.komga.domain.service.BookAnalyzer : Trying to analyze book: Book(name=JJBA 5 - Stone Ocean 14, url=file:/data/Manga/JJBA%206%20-%20Stone%20Ocean/JJBA%205%20-%20Stone%20Ocean%2014.pdf, fileLastModified=2014-12-27T16:22:34, fileSize=153702496, fileHash=py6jwd, number=14, id=07KYJJR9VC2XM, seriesId=07KYJJR9QC4FD, libraryId=07KYHZ1G3C8VS, deletedDate=null, createdDate=2022-01-27T14:39:09, lastModifiedDate=2022-01-27T17:46:42.697)
2022-01-27 17:49:39.205 INFO 1 --- [org.springframework.jms.JmsListenerEndpointContainer#1-3] o.g.komga.domain.service.BookAnalyzer : Detected media type: application/pdf
2022-01-27 17:49:53.380 WARN 1 --- [Finalizer] org.apache.pdfbox.cos.COSDocument : Warning: You did not close a PDF Document
2022-01-27 17:49:53.383 WARN 1 --- [org.springframework.jms.JmsListenerEndpointContainer#0-6] o.s.j.l.DefaultMessageListenerContainer : Setup of JMS message listener invoker failed for destination 'sse' - trying to recover. Cause: Java heap space
2022-01-27 17:49:53.574 WARN 1 --- [org.springframework.jms.JmsListenerEndpointContainer#1-3] o.g.k.infrastructure.jms.ArtemisConfig : Java heap space
2022-01-27 17:49:53.579 INFO 1 --- [org.springframework.jms.JmsListenerEndpointContainer#1-3] o.g.komga.application.tasks.TaskHandler : Executing task: AnalyzeBook(bookId='07KYJJR9VC2XM', priority='4')
2022-01-27 17:49:53.580 INFO 1 --- [org.springframework.jms.JmsListenerEndpointContainer#1-3] o.g.komga.domain.service.BookLifecycle : Analyze and persist book: Book(name=JJBA 5 - Stone Ocean 14, url=file:/data/Manga/JJBA%206%20-%20Stone%20Ocean/JJBA%205%20-%20Stone%20Ocean%2014.pdf, fileLastModified=2014-12-27T16:22:34, fileSize=153702496, fileHash=py6jwd, number=14, id=07KYJJR9VC2XM, seriesId=07KYJJR9QC4FD, libraryId=07KYHZ1G3C8VS, deletedDate=null, createdDate=2022-01-27T14:39:09, lastModifiedDate=2022-01-27T17:46:42.697)
2022-01-27 17:49:53.582 INFO 1 --- [org.springframework.jms.JmsListenerEndpointContainer#1-3] o.g.komga.domain.service.BookAnalyzer : Trying to analyze book: Book(name=JJBA 5 - Stone Ocean 14, url=file:/data/Manga/JJBA%206%20-%20Stone%20Ocean/JJBA%205%20-%20Stone%20Ocean%2014.pdf, fileLastModified=2014-12-27T16:22:34, fileSize=153702496, fileHash=py6jwd, number=14, id=07KYJJR9VC2XM, seriesId=07KYJJR9QC4FD, libraryId=07KYHZ1G3C8VS, deletedDate=null, createdDate=2022-01-27T14:39:09, lastModifiedDate=2022-01-27T17:46:42.697)
2022-01-27 17:49:53.584 INFO 1 --- [org.springframework.jms.JmsListenerEndpointContainer#1-3] o.g.komga.domain.service.BookAnalyzer : Detected media type: application/pdf
2022-01-27 17:50:01.762 WARN 1 --- [org.springframework.jms.JmsListenerEndpointContainer#1-3] o.g.k.infrastructure.jms.ArtemisConfig : Java heap space
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 18 (16 by maintainers)
I’m really impressed, and big thanks for making Caffeine. I use it in a few projects and really like it 😃
Oh, GitHub search is global, so a few keywords surface new public issues mentioning Caffeine. Then, if I might be able to help resolve confusion or catch a bug, I’ll reply. It’s useful feedback to know what might need improvement.
That is correct. The cache will load the entry and return it to the caller, but it may be immediately eligible for removal by an eviction policy. In your case, if the entry’s weight exceeds the maximum, then the cache will discard it in preference to retaining it and clearing the rest of the cache. Obviously, retaining what we can is preferable, as the new entry can’t be held regardless.
Otherwise the eviction decision is based on recency and frequency. A future version will likely incorporate the weight in order to make a smarter decision and increase hit rates. This was explored in the paper Lightweight Robust Size Aware Cache Management. That improvement offers a modest boost, but is a bit harder when considering concurrency and has not yet been raised by users as a concern.
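The oversized-entry behaviour described above can be illustrated with a toy weight-bounded cache. This is a sketch under stated assumptions: `ToyWeightedCache` is a hypothetical class written for this example, it uses plain LRU order rather than Caffeine's recency-and-frequency policy, and it is not thread-safe. It only shows the one decision discussed here: an entry heavier than the maximum is returned to the caller but never stored, so existing entries survive.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.ToLongFunction;

/** Toy weight-bounded cache (plain LRU); illustrative only, not Caffeine's algorithm. */
final class ToyWeightedCache<K, V> {
    private final long maximumWeight;
    private final ToLongFunction<V> weigher;
    // Access-ordered map: iteration starts at the least recently used entry.
    private final LinkedHashMap<K, V> entries = new LinkedHashMap<>(16, 0.75f, true);
    private long totalWeight;

    ToyWeightedCache(long maximumWeight, ToLongFunction<V> weigher) {
        this.maximumWeight = maximumWeight;
        this.weigher = weigher;
    }

    V get(K key, Function<K, V> loader) {
        V cached = entries.get(key);
        if (cached != null) {
            return cached;
        }
        V value = loader.apply(key);
        long weight = weigher.applyAsLong(value);
        if (weight > maximumWeight) {
            // The entry can never fit, so hand it to the caller and keep
            // the existing entries instead of evicting them for nothing.
            return value;
        }
        entries.put(key, value);
        totalWeight += weight;
        // Evict least recently used entries until the bound holds again.
        Iterator<Map.Entry<K, V>> it = entries.entrySet().iterator();
        while (totalWeight > maximumWeight && it.hasNext()) {
            Map.Entry<K, V> eldest = it.next();
            totalWeight -= weigher.applyAsLong(eldest.getValue());
            it.remove();
        }
        return value;
    }

    boolean contains(K key) {
        return entries.containsKey(key);
    }

    long totalWeight() {
        return totalWeight;
    }
}
```

For example, with a maximum weight of 10 and a weigher that returns the byte-array length, loading a 50-byte value returns it without caching it, while two previously cached 4-byte values stay in place.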
In my own usage of PDFBox (unrelated to Komga), I had to use mixed memory usage settings and disable the resource cache. The resource cache is soft-reference based and causes OOMEs, as images are humongous objects in G1’s terminology, and GCs do not (or did not at the time) handle those well. If I recall correctly, the resource cache is document-specific, so the likelihood of cache hits was negligible in my use case (converting PDFs from or to images). The mixed setting was less helpful than I had hoped, but moving my work to AWS Lambda was much better thanks to isolated failures, a minimized risk of resource exhaustion (each instance serves only one request), and cheap scalability. Unfortunately, we moved to Google Cloud Run, which offers only an in-memory file system, so MemoryUsageSetting lost its benefit. PDFBox also generates very bloated files, so I had to post-process them using Ghostscript to keep outbound emails from failing due to attachment limits. Eventually, for that usage, I’ll stop dragging that original code forward and replace it all with mupdf-tools.