Converting the attached PDF to text results in thousands of warnings. The fix is relatively simple: replace 8 with 16 in the call to readToUnicodeCMap in poppler/GfxFont.cc.

$ pdftotext Image_1.PDF 2>&1 | sort | uniq -c
    150 Syntax Warning: Illegal entry in bfrange block in ToUnicode CMap
   7170 Syntax Warning: Invalid entry in bfrange block in ToUnicode CMap
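As a rough illustration of why the width passed to readToUnicodeCMap might matter, here is a minimal sketch. It assumes the parser rejects any bfrange entry whose codes do not fit in the declared bit width. That mechanism is an assumption, not taken from the poppler source, and the helper names maxCodeForBits and bfrangeFits are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper (not part of poppler): the largest character code
// that fits in a CMap whose codes are nBits wide.
constexpr std::uint32_t maxCodeForBits(unsigned nBits) {
    return nBits >= 32 ? 0xFFFFFFFFu : ((1u << nBits) - 1u);
}

// Hypothetical check (assumed behaviour): a bfrange entry <lo> <hi> is
// accepted only if the range is ordered and both ends fit in nBits.
constexpr bool bfrangeFits(std::uint32_t lo, std::uint32_t hi, unsigned nBits) {
    return lo <= hi && hi <= maxCodeForBits(nBits);
}
```

Under that assumption, a range such as <0100> <01FF> would be rejected when the CMap is read as 8-bit but accepted when it is read as 16-bit, which would be consistent with the warnings disappearing once 8 is replaced with 16.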
Created attachment 98279
Always use a 16-bit cmap, Fixes reading of some PDF files.

Attached the patch for this from my colleague.
I was unable to attach the document to this bug so I have uploaded it here: http://people.debian.org/~pabs/tmp/Image_1.PDF
You don't have permission to access /~pabs/tmp/Image_1.PDF on this server. Can you fix that?
Whoops, fixed the permissions.
Pushed, thanks.
For the record, it was pushed in this commit: http://cgit.freedesktop.org/poppler/poppler/commit/?id=5b2cdef49a8a0a92fd323fbe45841a5098a42ece
*** Bug 48012 has been marked as a duplicate of this bug. ***