Bug 97485

Summary:	horizontal white lines on the background image
Product:	poppler	Reporter:	Thibaud Lutellier <thibolu>
Component:	cairo backend	Assignee:	poppler-bugs <poppler-bugs>
Status:	RESOLVED MOVED	QA Contact:
Severity:	normal
Priority:	medium	CC:	jakubalba
Version:	unspecified
Hardware:	Other
OS:	Linux (All)
Whiteboard:
Attachments:	PDF that triggers the problem (see first page); pdftocairo 0.44.0 screenshot

Description Thibaud Lutellier 2016-08-26 01:48:24 UTC

Created attachment 126040
PDF that triggers the problem (see first page)

Summary:
Horizontal white lines appear on the background of the first page of the attached PDF file. They are visible in Evince, but not in Okular.
I also get the white lines with pdftocairo (version 0.44.0).

Steps to Reproduce:
Open the file with Evince and look at the first page, or:
1) pdftocairo -singlefile -png 014231.pdf
2) eog 014231.png

Actual Results:
014231.png contains some horizontal white lines.

Expected Results:
The background image should not contain any white lines.

Comment 1 Thibaud Lutellier 2016-08-26 01:49:55 UTC

Created attachment 126041
pdftocairo 0.44.0 screenshot

Screenshot obtained with pdftocairo 0.44.0.
5 horizontal white lines are visible.

Comment 2 Jakub Alba 2016-08-26 21:29:48 UTC

Your file is weird/broken. Even reading it with head, tail, or more is problematic; only hexdump handles it without trouble. The first problem is that the first version of this file contained a single object followed by a cross-reference table with a ridiculously large number of entries, when it should have only 2 (counting the free entry). Or 3, because the first trailer references an Info dictionary, although the first version of this file contained none.
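The entry count mentioned above can be checked directly. In a classic (non-stream) cross-reference section, each subsection header gives a first object number and an entry count, and each entry is exactly 20 bytes long. The following is a minimal Python sketch, not poppler's parser: the function name parse_xref_section and the sample bytes are hypothetical, and it assumes the byte offset of the xref keyword is already known.

```python
import re

# Subsection header: "<first object number> <entry count>" followed by an EOL.
_HEADER = re.compile(rb"(\d+) (\d+)[ \t]*(?:\r\n|\r|\n)")
# Entry: 10-digit offset, 5-digit generation, 'n' or 'f', then a 2-byte EOL
# (space + CR, space + LF, or CR + LF), 20 bytes in total.
_ENTRY = re.compile(rb"(\d{10}) (\d{5}) ([nf])(?: \r| \n|\r\n)")
_KEYWORD = re.compile(rb"xref[ \t\r\n]+")


def parse_xref_section(data: bytes, offset: int):
    """Parse one classic cross-reference section starting at `offset`.

    Returns a list of (object_number, byte_offset, generation, in_use).
    """
    m = _KEYWORD.match(data, offset)
    if m is None:
        raise ValueError("no 'xref' keyword at the given offset")
    pos = m.end()
    entries = []
    while True:
        header = _HEADER.match(data, pos)
        if header is None:
            break  # reached 'trailer' (or malformed data)
        first, count = int(header.group(1)), int(header.group(2))
        pos = header.end()
        for i in range(count):
            entry = _ENTRY.match(data, pos)
            if entry is None:
                raise ValueError(f"malformed xref entry at byte {pos}")
            entries.append((
                first + i,
                int(entry.group(1)),
                int(entry.group(2)),
                entry.group(3) == b"n",
            ))
            pos = entry.end()
    return entries


# Hypothetical two-entry table (one free entry, one object), CR line endings.
sample = b"xref\r0 2\r0000000000 65535 f \r0000000016 00000 n \rtrailer\r"
entries = parse_xref_section(sample, 0)
```

Comparing len(entries) with the number of objects actually present in the file would show whether the table declares far more entries than it needs.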

And then this file looks as if it was incrementally updated.

But generally, head, for example, shows something like this:

$ head 014231.pdf
%PDF-1.2
0000000016 00000 n                             xref
0000029672 00000 n
0000030017 00000 n
0000030169 00000 n
0000071593 00000 n
0000096497 00000 n
0000096527 00000 n
0000096708 00000 n
0000096731 00000 n

If these were the actual contents of this file, it would be even more broken.
(hexdump shows something else...)

Now tail:

$ tail -c 10 014231.pdf
$ tail -c 50 014231.pdf
$ tail -c 80 014231.pdf
$ a5981c81b97364c3e63cb0738d>] # Here the shell has gone mad
$ tail -c 150 014231.pdf
08 00000 n
0002185958 00000 n
$ 599469d845><c6d674a5981c81b97364c3e63cb0738d>] # Again... And generally: What the...?

And again - when I look at it with hexdump from the end it looks like a valid PDF document. So yeah, you have a crazy file (perhaps there is even some potential for making an exploit from it, I don't know...).

And if even head, tail & more have problems with it, then I have no idea what we can do here. (Perhaps another poppler dev has an idea. It may be fun debugging this thing.)
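One possible explanation for the head and tail output, which is an assumption and not confirmed anywhere in this report, is that the file uses bare carriage returns (CR) as line terminators. PDF allows CR, LF, or CRLF as end-of-line markers, and a terminal moves the cursor back to the start of the line on CR, so successive lines overwrite each other. A minimal Python sketch for checking this follows; the helper names count_line_endings and printable are hypothetical.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF pairs, bare CR bytes, and bare LF bytes in `data`."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,
        "CR": data.count(b"\r") - crlf,  # CR not followed by LF
        "LF": data.count(b"\n") - crlf,  # LF not preceded by CR
    }


def printable(data: bytes) -> str:
    """Escape control and non-ASCII bytes so a terminal shows them literally."""
    out = []
    for b in data:
        if b == 0x5C:  # backslash, escaped to keep the output unambiguous
            out.append("\\\\")
        elif 0x20 <= b < 0x7F:  # printable ASCII
            out.append(chr(b))
        else:
            out.append(f"\\x{b:02x}")
    return "".join(out)


# Hypothetical bytes resembling the start of a CR-terminated xref table.
demo = b"xref\r0 2\r"
demo_counts = count_line_endings(demo)
demo_text = printable(demo)
```

Applied to the attached file, something like printable(open("014231.pdf", "rb").read()[-150:]) would show the last 150 bytes without letting the terminal interpret them, similar to what hexdump does.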

Getting back to the horizontal lines: Firefox's pdf.js shows the same. Are you sure these lines shouldn't be there?

That's the first time I've encountered such a file, so I thought it could be helpful to share my discoveries here for others who are more knowledgeable. Unfortunately, in this situation I can't help you. Sorry.

Comment 3 GitLab Migration User 2018-08-21 11:15:44 UTC

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/poppler/poppler/issues/589.
