Bug 104864 - poppler_page_render_for_printing on cairo_pdf_surface repeats content
Status: NEW
Alias: None
Product: poppler
Classification: Unclassified
Component: cairo backend
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: poppler-bugs
 
Reported: 2018-01-30 17:26 UTC by Jaime Velasco Juan
Modified: 2018-02-24 11:51 UTC (History)

Attachments
Test program (1.34 KB, text/x-csrc)
2018-01-30 17:26 UTC, Jaime Velasco Juan
input file 1 (40.07 KB, application/pdf)
2018-01-30 17:27 UTC, Jaime Velasco Juan
input file 2 (141.40 KB, application/pdf)
2018-01-30 17:27 UTC, Jaime Velasco Juan
derive image surface UNIQUE_ID from its contents (1.60 KB, patch)
2018-02-24 11:51 UTC, Jaime Velasco Juan

Description Jaime Velasco Juan 2018-01-30 17:26:28 UTC
Created attachment 137056
Test program

I am writing a small utility to manipulate PDF files. It loads input files, lets the user delete, duplicate or reorder pages, and then calls poppler_page_render_for_printing in the desired output order, using a cairo_pdf_surface. It worked fine until I tested it with certain scanned PDFs, for which it outputs the same page each time.

I'll attach a C test program (compile with gcc test.c `pkg-config --cflags --libs poppler-glib cairo gobject-2.0 gio-2.0`) and two input PDFs (zma_001.pdf and zma_002.pdf). When called as "./a.out zma_001.pdf zma_002.pdf output.pdf", it is expected to create output.pdf with both input pages concatenated, but instead the first page appears twice (clipped to the second page's dimensions).

The expected output is generated when using poppler_page_render instead of poppler_page_render_for_printing, but this produces bigger files (maybe it resamples the images?).

Thanks
Comment 1 Jaime Velasco Juan 2018-01-30 17:27:08 UTC
Created attachment 137057
input file 1
Comment 2 Jaime Velasco Juan 2018-01-30 17:27:30 UTC
Created attachment 137058
input file 2
Comment 3 Jaime Velasco Juan 2018-02-24 11:51:05 UTC
Created attachment 137578
derive image surface UNIQUE_ID from its contents

In CairoOutputDev::setMimeData we set CAIRO_MIME_TYPE_UNIQUE_ID to "poppler-surface-{ref.gen}-{ref.num}", and the same ref number is likely reused in several files, especially with scanned documents.
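
To illustrate the collision: cairo documents that surfaces sharing the same CAIRO_MIME_TYPE_UNIQUE_ID are embedded only once. The following is a minimal model of that behaviour in plain C, not poppler or cairo code. ImageTable, make_unique_id and embed_image are hypothetical names, and the lookup table is only a simplified stand-in for whatever the PDF surface does internally.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_IMAGES 8
#define ID_LEN 64

/* Hypothetical stand-in for the output surface's record of embedded images. */
typedef struct {
    char ids[MAX_IMAGES][ID_LEN];
    const char *data[MAX_IMAGES];
    int count;
} ImageTable;

/* Build the ID in the format quoted above. Only the object's generation and
 * number are used, so the file the image came from plays no part. */
void make_unique_id(char *out, size_t size, int gen, int num)
{
    int n = snprintf(out, size, "poppler-surface-%d-%d", gen, num);
    assert(n > 0 && (size_t)n < size);
}

/* Return the image stored under id. The image is registered only if the ID
 * has not been seen before; otherwise the earlier image is reused. */
const char *embed_image(ImageTable *t, const char *id, const char *data)
{
    for (int i = 0; i < t->count; i++) {
        if (strcmp(t->ids[i], id) == 0)
            return t->data[i];
    }
    assert(t->count < MAX_IMAGES);
    snprintf(t->ids[t->count], ID_LEN, "%s", id);
    t->data[t->count] = data;
    t->count++;
    return data;
}
```

If both input files happen to store their scan as, say, object 5 with generation 0 (assumed values), make_unique_id returns the same string for each, and the second call to embed_image hands back the first file's image. That matches the repeated page in the description.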

Set the ID to "poppler-surface-{digest}-{ref.gen}-{ref.num}" to avoid merging unrelated images in the output surface.
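
A rough sketch of the proposed ID format, in plain C rather than the patch itself. FNV-1a is used here only as an easy-to-read stand-in for the digest, which is an assumption: the attached patch may compute it differently. fnv1a32 and make_content_id are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* 32-bit FNV-1a hash: an illustrative stand-in for the content digest. */
uint32_t fnv1a32(const unsigned char *data, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/* Build "poppler-surface-{digest}-{gen}-{num}" from the image bytes and the
 * object reference, so images with different contents get different IDs
 * even when their reference numbers coincide. */
void make_content_id(char *out, size_t size,
                     const unsigned char *data, size_t len,
                     int gen, int num)
{
    int n = snprintf(out, size, "poppler-surface-%08lx-%d-%d",
                     (unsigned long)fnv1a32(data, len), gen, num);
    assert(n > 0 && (size_t)n < size);
}
```

With this scheme, two images that share a reference but differ in content no longer map to the same ID, while the same image rendered twice still does, so duplicated pages can continue to share a single embedded copy.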

