Bug 63587 - Text missing from PDFs
Status: RESOLVED FIXED
Alias: None
Product: poppler
Classification: Unclassified
Component: general
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: poppler-bugs
Reported: 2013-04-16 06:46 UTC by Tristan Miller
Modified: 2013-04-22 11:14 UTC


Attachments
Example of a PDF where Okular and Evince don't render all the text (109.66 KB, text/plain)
2013-04-16 06:46 UTC, Tristan Miller
Much reduced example (2.62 KB, application/x-download)
2013-04-18 23:24 UTC, Albert Astals Cid
consider a softmask transfer function always (613 bytes, patch)
2013-04-20 08:34 UTC, Thomas Freitag
Patch for cairo backend (3.42 KB, patch)
2013-04-20 14:53 UTC, Carlos Garcia Campos

Description Tristan Miller 2013-04-16 06:46:01 UTC
Created attachment 78057
Example of a PDF where Okular and Evince don't render all the text

Okular doesn't render all the text in the attached PDF, which was produced with Google's Chrome browser. The first page displays as expected, but almost all the text is missing from the second page. Evince has a similar problem, except that it displays a small window of text on the second page. AFAIK both PDF viewers use Poppler as the back end.

Adobe Reader displays all the text as expected.

Not sure if this is a bug in Poppler or a case of Chrome generating bad PDFs.  If the latter, I'd be happy to file a Chrome bug if someone with more PDF-fu could provide details of the problem.
Comment 1 Tristan Miller 2013-04-16 06:46:34 UTC
Downstream bug report: https://bugs.kde.org/show_bug.cgi?id=318382
Comment 2 Albert Astals Cid 2013-04-18 23:24:39 UTC
Created attachment 78203
Much reduced example

Seems we're messing up somewhere with the grouping, the transfer function or something; removing the
/G1 gs
from the page content gives us the proper text.
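
For anyone who wants to repeat that experiment, here is a minimal sketch of stripping the operator from a page content stream. It assumes the stream has already been decompressed to raw bytes; the helper name `strip_gs_operator` and the sample stream are hypothetical and not taken from the attachment. It is a naive textual match, so it would also hit the same sequence inside a string literal.

```python
import re


def strip_gs_operator(content: bytes, name: bytes = b"G1") -> bytes:
    """Remove every '/<name> gs' operator from a decoded content stream."""
    # Match the name operand, whitespace, then the 'gs' operator as a whole token.
    pattern = re.compile(rb"/" + re.escape(name) + rb"\s+gs(?=\s|$)\s*")
    return pattern.sub(b"", content)


# Hypothetical, hand-written content stream used only for illustration.
sample = b"q\n/G1 gs\nBT /F1 12 Tf 72 720 Td (Hello) Tj ET\nQ\n"
print(strip_gs_operator(sample).decode("ascii"))
```

The result still has to be written back into the PDF with a correct stream length, which this sketch does not attempt.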

Thomas any idea?
Comment 3 Tristan Miller 2013-04-19 07:22:34 UTC
FWIW, gv can't display the file at all (nor Albert's reduced example).  It spits out some interesting-looking (but inscrutable to me) error messages.
Comment 4 Thomas Freitag 2013-04-19 07:49:16 UTC
(In reply to comment #2)
> Created attachment 78203
> Much reduced example
> 
> Seems we're messing somewhere with the grouping, function transfer or
> something, removing the 
> /G1 gs
> from the page content gives us the proper text.
> 
> Thomas any idea?

No, but I'll have a look at it.
Comment 5 Thomas Freitag 2013-04-20 08:34:05 UTC
Created attachment 78269
consider a softmask transfer function always

It's not a core problem, even if it seems so because cairo has the same problem:

The PDF uses a softmask with a transfer function which inverts everything, but the output devices, especially splash, ignore this transfer function when alpha is set to true. Therefore the softmask knocks out everything instead of just removing the background transparency.
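
A toy model of that effect, as a rough sketch only: this is not poppler code, and the function names, the per-pixel lists and the all-zero mask are assumptions made for illustration. Each pixel's coverage is scaled by its soft mask value, and the transfer function, when honoured, remaps the mask value first.

```python
from typing import Callable, Optional, Sequence


def apply_soft_mask(
    coverage: Sequence[float],
    mask_values: Sequence[float],
    transfer: Optional[Callable[[float], float]] = None,
) -> list:
    """Scale each pixel's coverage by its (optionally remapped) mask value."""
    result = []
    for c, m in zip(coverage, mask_values):
        if transfer is not None:
            m = transfer(m)  # remap the mask value before using it
        m = min(1.0, max(0.0, m))  # clamp to the valid opacity range
        result.append(c * m)
    return result


def invert(x: float) -> float:
    """Inverting transfer function: 0 becomes 1 and 1 becomes 0."""
    return 1.0 - x


text_coverage = [1.0, 1.0, 0.0, 1.0]  # hypothetical row of text pixels
mask = [0.0, 0.0, 0.0, 0.0]           # hypothetical mask that is zero everywhere

ignored = apply_soft_mask(text_coverage, mask)           # transfer function skipped
honoured = apply_soft_mask(text_coverage, mask, invert)  # transfer function applied
print(ignored)
print(honoured)
```

With the transfer function skipped, every pixel is multiplied by zero and the text disappears; with the inversion applied, the same mask leaves the text untouched.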

The patch solves it for splash; it's probably quite easy for a cairo specialist to do the same in cairo.

BTW, the patch also changes the output for some defective PDFs in my PDF suite (bug-poppler30228.pdf, bug-poppler10910.pdf, tauya.f8.pdf), but because they are damaged I couldn't compare the output with Acrobat.
Comment 6 Carlos Garcia Campos 2013-04-20 14:53:02 UTC
Created attachment 78272
Patch for cairo backend

This patch fixes the problem in the cairo backend, and introduces no regressions in my test suite. Adrian, could you confirm it makes sense?
Comment 7 Albert Astals Cid 2013-04-21 21:29:44 UTC
Pushed the splash fix.
Comment 8 Adrian Johnson 2013-04-21 23:01:09 UTC
(In reply to comment #6)
> Created attachment 78272
> Patch for cairo backend
> 
> This patch fixes the problem in cairo backend, and introduces no regressions
> in my tests suite. Adrian, could you confirm it makes sense?

Looks good to me.
Comment 9 Carlos Garcia Campos 2013-04-22 11:14:04 UTC
Pushed the cairo fix, thanks!

