Bug 106781 - poppler uses excessive memory for multipage pdf files created by Abbyy Fine Reader 14
Status: RESOLVED DUPLICATE of bug 97623
Alias: None
Product: poppler
Classification: Unclassified
Component: general
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: poppler-bugs
Reported: 2018-06-02 09:02 UTC by adalbert.hanssen
Modified: 2018-07-17 15:49 UTC (History)

Attachments
2 pages file created by AFR14, showing excessive memory and CPU usage (37.46 KB, application/pdf)
2018-06-02 09:02 UTC, adalbert.hanssen
similar file created by AFR9, not showing the bug (392.13 KB, application/pdf)
2018-06-02 09:05 UTC, adalbert.hanssen
singlepage pdf file created by AFR9, not showing the bug (29.42 KB, application/pdf)
2018-06-02 09:09 UTC, adalbert.hanssen

Description adalbert.hanssen 2018-06-02 09:02:21 UTC
Created attachment 139964
2 pages file created by AFR14, showing excessive memory and CPU usage

This error shows up in Evince, Xpdf, GIMP 2.8 and Okular, which all depend on Poppler. pdfinfo -f 1 also shows the symptom for the same file:

q@q:~$ pdfinfo -f 1 "~/Test-AFR14_00-00_SW.tif_SW_2pages.pdf"
Creator:        ABBYY FineReader 14
Producer:       ABBYY FineReader 14
CreationDate:   Mon Apr 30 06:53:01 2018
ModDate:        Mon Apr 30 06:53:01 2018
Tagged:         yes
UserProperties: no
Suspects:       no
Form:           none
JavaScript:     no
Pages:          2
Encrypted:      no
Page size:      499.9 x 260.9 pts
Page rot:       0
File size:      38356 bytes
Optimized:      yes
PDF version:    1.5
q@q:~$ pdfinfo -f 1 "~/Test AFR9_2pages.pdf"
Producer:       ABBYY FineReader 9.0 Professional Edition
CreationDate:   Mon Apr 30 08:37:55 2018
ModDate:        Mon Apr 30 08:37:55 2018
Tagged:         no
UserProperties: no
Suspects:       no
Form:           none
JavaScript:     no
Pages:          2
Encrypted:      no
Page size:      841.7 x 595.45 pts
Page rot:       0
File size:      401539 bytes
Optimized:      yes
PDF version:    1.4
q@q:~$ 


In the first case shown, pdfinfo reports an additional "Creator" field, and the PDF version differs (1.5 versus 1.4).
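
The field-by-field comparison above can also be done mechanically. The following is a minimal sketch, assuming the pdfinfo output has already been captured as text without the shell prompt lines; the function names parse_pdfinfo and diff_fields are hypothetical helpers, not part of Poppler.

```python
def parse_pdfinfo(text):
    """Turn pdfinfo-style "Key: value" lines into a dict.

    Only the first colon separates key and value, so timestamps
    such as "06:53:01" stay intact in the value.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def diff_fields(a, b):
    """Return (keys only in a, keys only in b, {key: (a_value, b_value)})."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    changed = {k: (a[k], b[k]) for k in a if k in b and a[k] != b[k]}
    return only_a, only_b, changed
```

Applied to the two outputs above, this would list "Creator" as present only for the AFR14 file and report differing values for fields such as "Tagged" and "PDF version".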



If a multipage PDF file created by Abbyy Fine Reader version 14 (AFR14) is read, memory and CPU usage rise very quickly. Even for a small two-page PDF file, more than 3 GB of memory are in use after a few seconds. CPU usage is also very heavy. Over time, memory usage increases further, and eventually the computer gets stuck swapping or freezes entirely. In lucky cases the file is finally displayed; in other cases it is not.
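
Because reproducing this can freeze the machine, it may help to run the affected tool under a memory and time cap. The following is a minimal sketch for Linux, assuming Python 3.7 or later; run_limited and its default limits are hypothetical choices, not something Poppler provides.

```python
import resource
import subprocess


def run_limited(cmd, max_bytes=1 << 30, timeout=30):
    """Run cmd with a cap on address space and wall-clock time.

    Returns (returncode, stdout, stderr). returncode is None if the
    command was killed because it exceeded the timeout.
    """
    def cap_memory():
        # Runs in the child process just before exec (POSIX only).
        # Only the soft limit is lowered; the hard limit is kept.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        limit = max_bytes
        if hard != resource.RLIM_INFINITY:
            limit = min(max_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (limit, hard))

    try:
        done = subprocess.run(cmd, capture_output=True, text=True,
                              timeout=timeout, preexec_fn=cap_memory)
    except subprocess.TimeoutExpired:
        return None, "", ""
    return done.returncode, done.stdout, done.stderr
```

A call such as run_limited(["pdfinfo", "-f", "1", "/path/to/file.pdf"]) (placeholder path) should then fail or time out instead of pushing the system into swap.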

Firefox can display the files that trigger this behaviour, and so can LibreOffice Draw.


The excessive memory and CPU usage does not show up for PDF files created by AFR14 with only a single page:

q@q:~$ pdfinfo -f 1 "~/AFR14_00-00_GR_singlepage.pdf"
Creator:        ABBYY FineReader 14
Producer:       ABBYY FineReader 14
CreationDate:   Sun Apr 29 14:55:57 2018
ModDate:        Sun Apr 29 14:55:57 2018
Tagged:         yes
UserProperties: no
Suspects:       no
Form:           none
JavaScript:     no
Pages:          1
Encrypted:      no
Page size:      499.9 x 261.35 pts
Page rot:       0
File size:      30131 bytes
Optimized:      no
PDF version:    1.5
q@q:~$ 



The error also does not show up for files created by AFR9.
Comment 1 adalbert.hanssen 2018-06-02 09:05:17 UTC
Created attachment 139965
similar file created by AFR9, not showing the bug
Comment 2 adalbert.hanssen 2018-06-02 09:09:13 UTC
Created attachment 139966
singlepage pdf file created by AFR9, not showing the bug
Comment 3 adalbert.hanssen 2018-06-02 16:50:37 UTC
According to veraPDF version 1.12.1, the two-page file created by AFR14 that shows the excessive memory and CPU usage is compliant with PDF/A-2b. So it should be processed correctly by every well-made PDF viewing tool.
Comment 4 adalbert.hanssen 2018-07-09 14:08:27 UTC
The version of poppler is 

poppler-utils 0.41.0-ubuntu1.7

Unfortunately, I just found out that I had forgotten to give this piece of information in the overview above, and I could not add it there afterwards.
Comment 5 adalbert.hanssen 2018-07-17 11:21:21 UTC
According to a representative of Abbyy Fine Reader, the symptom was caused by a "linearisation error" in Abbyy Fine Reader 14, and it will be fixed there. Note: according to veraPDF, the files comply with the standard.

According to the people of Abbyy Fine Reader, the error does not show up under Linux with Poppler version > 0.50.

I could not really verify this: I did not manage to replace Poppler together with all the programs that depend on it, because I did something wrong in the build process. I could only try out pdftotext after building version 0.66.0 with cmake. Before that I had version 0.41.0, which belongs to Xubuntu 16.04.4.

I could not switch back to the original version of Poppler. After recompiling pdftotext version 0.66.0, I applied it to a two-page PDF file created by AFR14. It gave me the message

   Syntax Warning: Invalid shared object groups bit length

but despite that it delivered the text extracted from the file, and the extraction happened without excessive memory or time consumption.
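
To keep such warnings apart from the extracted text, one could capture the two output streams separately. The following is a minimal sketch, assuming that pdftotext writes diagnostics to stderr and that warnings start with "Syntax Warning" as in the message quoted above; split_warnings and extract_text are hypothetical helper names.

```python
import subprocess


def split_warnings(stderr_text):
    """Separate "Syntax Warning" lines from all other diagnostics."""
    warnings, other = [], []
    for line in stderr_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("Syntax Warning"):
            warnings.append(line)
        else:
            other.append(line)
    return warnings, other


def extract_text(pdf_path):
    """Run pdftotext and return (text, warnings, other_messages, returncode).

    The "-" output argument asks pdftotext to write the text to stdout.
    """
    done = subprocess.run(["pdftotext", pdf_path, "-"],
                          capture_output=True, text=True)
    warnings, other = split_warnings(done.stderr)
    return done.stdout, warnings, other, done.returncode
```

With the file from this report, the "Invalid shared object groups bit length" line would be expected to end up in the warnings list while the extracted text is returned unchanged.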

Some detailed help on how to replace Poppler and all the programs depending on it would be very helpful for me and for other users who have no experience building big packages with many dependencies. Even better would be apt repositories for Evince, GIMP 2.8, Okular, Xpdf and the like, similar to ppa:otto-kesselgulasch/gimp.

After I rebuilt GIMP 2.8 (from the kesselgulasch PPA), I could open a file almost instantly that still causes excessive memory consumption when opened with Okular, which was not rebuilt.
Comment 6 Jason Crain 2018-07-17 15:49:12 UTC
Closing because as you say, this has already been fixed. It's fixed in poppler version 0.48 and later.

*** This bug has been marked as a duplicate of bug 97623 ***
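
To check whether an installed Poppler already contains the fix, the version string can be compared against 0.48. The following is a minimal sketch, assuming the version text comes from a source such as the package version quoted in comment 4 or the output of pdfinfo -v; parse_poppler_version and has_fix are hypothetical helper names.

```python
import re

# First Poppler release with the fix, according to comment 6.
FIXED_IN = (0, 48)


def parse_poppler_version(text):
    """Extract the first dotted version number as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups() if part is not None)


def has_fix(version_text):
    """True if the version found in version_text is 0.48 or later."""
    version = parse_poppler_version(version_text)
    return version is not None and version[:2] >= FIXED_IN
```

For example, has_fix("poppler-utils 0.41.0-ubuntu1.7") is False, while a string containing 0.66.0 yields True.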

