Summary: | Text missing from PDFs | ||
---|---|---|---|
Product: | poppler | Reporter: | Tristan Miller <psychonaut> |
Component: | general | Assignee: | poppler-bugs <poppler-bugs> |
Status: | RESOLVED FIXED | QA Contact: | |
Severity: | normal | ||
Priority: | medium | CC: | psychonaut |
Version: | unspecified | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Whiteboard: | |||
Attachments: |
Example of a PDF where Okular and Evince don't render all the text
Much reduced example
consider a softmask transfer function always
Patch for cairo backend
Downstream bug report: https://bugs.kde.org/show_bug.cgi?id=318382

Created attachment 78203 [details]
Much reduced example

Seems we're messing up somewhere with the grouping, the transfer function or something; removing the
/G1 gs
from the page content gives us the proper text.
Thomas, any idea?
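
For anyone wanting to repeat that experiment, here is a minimal sketch of dropping one `gs` operator from a page content stream. It assumes the stream has already been decompressed to raw bytes; `strip_gs_operator` is a hypothetical helper written for this illustration, not part of poppler or any PDF library, and the sample stream is made up rather than taken from the attachment.

```python
import re


def strip_gs_operator(content: bytes, name: bytes) -> bytes:
    """Remove every '/<name> gs' operator from a decoded content stream.

    Hypothetical helper for illustration only. It matches on whitespace
    boundaries, so '/G1 gs' is removed while '/G10 gs' is left alone.
    Limitation: it does not parse PDF syntax, so it would also match the
    same bytes inside a string literal.
    """
    pattern = rb"(?<![^\s])/" + re.escape(name) + rb"\s+gs(?![^\s])"
    return re.sub(pattern, b"", content)


# Made-up content stream, only to show the effect.
sample = b"q\n/G1 gs\nBT /F1 12 Tf (Hi) Tj ET\nQ\n"
stripped = strip_gs_operator(sample, b"G1")
print(stripped)
```

The modified bytes would still have to be written back into the PDF (with the stream length updated) before a viewer can load the result.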
FWIW, gv can't display the file at all (nor Albert's reduced example). It spits out some interesting-looking (but inscrutable to me) error messages.

(In reply to comment #2)
> Created attachment 78203 [details]
> Much reduced example
>
> Seems we're messing somewhere with the grouping, function transfer or
> something, removing the
> /G1 gs
> from the page content gives us the proper text.
>
> Thomas any idea?

No, but I'll have a look at it.

Created attachment 78269 [details] [review]
consider a softmask transfer function always

It's not a core problem, even if it seems so, because cairo has the same problem: the PDF uses a softmask with a transfer function that inverts everything, but the output devices, especially splash, ignore this transfer function when alpha is set to true. Therefore the softmask knocks out everything instead of just removing the background transparency.

The patch solves it for splash; it is probably quite easy for a cairo specialist to do the same in cairo.

BTW, the patch also changes the output for some defective PDFs in my PDF suite (bug-poppler30228.pdf, bug-poppler10910.pdf, tauya.f8.pdf), but because they are damaged I couldn't compare the output with Acrobat.

Created attachment 78272 [details] [review]
Patch for cairo backend

This patch fixes the problem in the cairo backend and introduces no regressions in my test suite. Adrian, could you confirm it makes sense?

Pushed the splash fix.

(In reply to comment #6)
> Created attachment 78272 [details] [review]
> Patch for cairo backend
>
> This patch fixes the problem in cairo backend, and introduces no regressions
> in my tests suite. Adrian, could you confirm it makes sense?

Looks good to me.

Pushed the cairo fix, thanks!
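
To make the diagnosis above concrete, here is a rough sketch of why skipping the transfer function hides the text. It models a single pixel of an alpha-type soft mask (`/S /Alpha` with an optional `/TR` function, as defined in the PDF specification). The function names and the sample alpha values are assumptions chosen for illustration; this is not the splash or cairo code from the patches.

```python
def soft_mask_value(group_alpha, transfer=None):
    """Mask value for one pixel of an alpha soft mask.

    group_alpha: alpha of the mask's transparency group at this pixel (0..1).
    transfer:    optional function standing in for the /TR entry.
    """
    if transfer is None:
        return group_alpha
    # Clamp to the valid range after applying the transfer function.
    return min(1.0, max(0.0, transfer(group_alpha)))


def invert(x):
    """Stand-in for a transfer function that inverts everything."""
    return 1.0 - x


def masked_coverage(source_alpha, mask_value):
    """Alpha of the painted object after the soft mask is applied."""
    return source_alpha * mask_value


text_alpha = 1.0   # fully opaque text
group_alpha = 0.0  # assumption: the mask group paints nothing at this pixel

# Transfer function ignored: mask value 0, so the text is knocked out.
ignored = masked_coverage(text_alpha, soft_mask_value(group_alpha))
# Transfer function applied: mask value 1, so the text stays visible.
applied = masked_coverage(text_alpha, soft_mask_value(group_alpha, invert))
print(ignored, applied)
```

Under these assumed values, the first result corresponds to the missing text reported here and the second to the rendering after the fix.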
Created attachment 78057 [details]
Example of a PDF where Okular and Evince don't render all the text

Okular doesn't render all the text on the attached PDF, which was produced with Google's Chrome browser. The first page displays as expected, but almost all the text is missing from the second page. Evince has a similar problem, except that it displays a small window of text on the second page. AFAIK both PDF viewers use Poppler as the back end. Adobe Reader displays all the text as expected.

Not sure if this is a bug in Poppler or a case of Chrome generating bad PDFs. If the latter, I'd be happy to file a Chrome bug if someone with more PDF-fu could provide details of the problem.