Bug 87161

Summary: Printing of certain PDF files does not work with "fit-to-page" because of wrong BoundingBox values in the PostScript
Product: poppler
Reporter: Stefan Brandner <stefan.brandner>
Component: general
Assignee: poppler-bugs <poppler-bugs>
Status: RESOLVED FIXED
QA Contact:
Severity: critical
Priority: medium
CC: bugs, bugzilla, williambader
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard:
Attachments: patch to version 0.32.0 of poppler to fix the calculation of the PageBoundingBox DSC comment
patch to Opensuse 13.2 version 0.26.5 of poppler to fix the calculation of the PageBoundingBox DSC comment
patch to version 2015-04-04 of poppler to fix the calculation of the PageBoundingBox DSC comment
patch to Opensuse 13.2 version 0.26.5 of poppler to fix the calculation of the PageBoundingBox DSC comment
patch to version 2015-05-03 of poppler to fix the calculation of the PageBoundingBox DSC comment
patch to Opensuse 13.2 version 0.26.5 of poppler to fix the calculation of the PageBoundingBox DSC comment
alternate patch by Adrian Johnson
simple multi-page example
alternate patch

Description Stefan Brandner 2014-12-09 16:37:38 UTC
I am using Opensuse 13.2, and certain PDF files do not print correctly.
The same pdf files worked fine with Opensuse 13.1.
I filed a printing bug at Opensuse Bugzilla and it was found that poppler creates wrong Bounding Box values when fit-to-page is used.
Opensuse 13.1: poppler 0.24.3
Opensuse 13.2: poppler 0.26.5

https://bugzilla.opensuse.org/show_bug.cgi?id=908624
All the files used can be found in that bug report.

Johannes Meixner wrote in the bug:

I think the DSC values are not correct in print-job-data_132.save.

print-job-data_132.save contains
------------------------------------------------------------------------------
%%BoundingBox: 0 0 595 842
%%PageBoundingBox: 0 0 311 542
%%PageBoundingBox: 0 0 317 548
------------------------------------------------------------------------------

Ghostscript shows the actual BoundingBox values for the two pages:
------------------------------------------------------------------------------
# gs -sDEVICE=bbox -dBATCH -dNOPAUSE print-job-data_132.save

%%BoundingBox: 142 149 453 692
%%BoundingBox: 138 147 456 695
------------------------------------------------------------------------------

When I change the BoundingBox values in print-job-data_132.save
as follows:
------------------------------------------------------------------------------
%%BoundingBox: 138 147 456 695
%%PageBoundingBox: 142 149 453 692
%%PageBoundingBox: 138 147 456 695
------------------------------------------------------------------------------
then the CUPS printing option fit-to-page works for this file.
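
The corrected values above are consistent with the document-level %%BoundingBox being the union of the per-page boxes. A minimal Python sketch of that arithmetic, using the two boxes Ghostscript reported; the name union_bbox is made up for illustration and is not part of poppler, CUPS, or Ghostscript:

```python
from typing import Iterable, Tuple

BBox = Tuple[int, int, int, int]  # (llx, lly, urx, ury) in PostScript points


def union_bbox(boxes: Iterable[BBox]) -> BBox:
    """Return the smallest box that encloses every box in `boxes`."""
    boxes = list(boxes)
    if not boxes:
        raise ValueError("need at least one bounding box")
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


# Per-page boxes measured by the Ghostscript bbox device (from the report).
page_boxes = [(142, 149, 453, 692), (138, 147, 456, 695)]
document_box = union_bbox(page_boxes)
print("%%BoundingBox: {} {} {} {}".format(*document_box))
```

For these two pages the union is 138 147 456 695, the value used in the hand-edited file.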

I.e. the BoundingBox values in the PostScript are wrong.
The PostScript is produced by poppler pdftops version: 0.26.5
because print-job-data_132.save contains:
------------------------------------------------------------------------------
%Produced by poppler pdftops version: 0.26.5 (http://poppler.freedesktop.org)
%%Creator: pdftk 2.02 - www.pdftk.com
%%Title: Dirichlet.pdf
------------------------------------------------------------------------------

According to my analysis it means:
Printing of certain PDF files does not work with "fit-to-page"
because of wrong BoundingBox values in the PostScript
Comment 1 Stefan Brandner 2014-12-10 08:20:34 UTC
Changed component to qt4 frontend since the problem appears with okular.
Comment 2 Albert Astals Cid 2014-12-10 08:28:13 UTC
Please don't; pdftops has the same issue, doesn't it?
Comment 3 Stefan Brandner 2014-12-10 10:05:29 UTC
Yes, component general is right!
Comment 4 bugs 2015-01-14 13:37:35 UTC
I can confirm this bug with Opensuse 13.2 and a different file. Interestingly, the incorrect bounding box has the same dimensions as the A4 media (595 x 842), so it seems that no calculation is done at all!

/tmp/print-job-okular-bbox contains:
%Produced by poppler pdftops version: 0.26.5 (http://poppler.freedesktop.org)
%%Creator: LaTeX with Beamer class version 3.26
%%Title: vortrag-test.pdf
%%LanguageLevel: 2
%%DocumentSuppliedResources: (atend)
%%DocumentMedia: A4 595 842 0 () ()
%%BoundingBox: 0 0 595 842


Ghostscript, by contrast, calculates the bounding box as follows:
gs -sDEVICE=bbox -dBATCH -dNOPAUSE /tmp/print-job-okular-bbox 
GPL Ghostscript 9.15 (2014-09-22)
%%BoundingBox: 115 286 479 557
%%HiResBoundingBox: 115.991996 286.505991 478.835985 556.631983

Please fix this!
Comment 5 dehapama 2015-03-13 08:48:56 UTC
Created attachment 114276 [details] [review]
patch to version 0.32.0 of poppler to fix the calculation of the PageBoundingBox DSC comment
Comment 6 dehapama 2015-03-13 08:51:36 UTC
Comment on attachment 114276 [details] [review]
patch to version 0.32.0 of poppler to fix the calculation of the PageBoundingBox DSC comment

To my understanding, PageBoundingBox should reflect the actual bounding box of all elements on a page. The PageBoundingBox written by poppler is the width and height of the elements on a page before they are scaled and translated to fit the page.

The code is in the file poppler/PSOutputDev.cc. I changed it to calculate the scaling and translation first and then write the transformed PageBoundingBox. Because the original code did the calculation after writing the PageBoundingBox, it was necessary to reorganize the code a little. That is why this patch is somewhat longer than expected.

I'm not sure whether I understood all the transformations in the file. I did not change the transformation calculation itself; I only changed how the PageBoundingBox is calculated. Somebody who understands the code better should review it carefully.
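
The idea described in this comment, computing the scale and translation first and then writing the transformed box, can be sketched as follows. This is a simplified Python illustration and not the code in poppler/PSOutputDev.cc: it assumes no rotation, a uniform scale, and a page centered on the paper. The name centered_page_bbox and its parameters are hypothetical.

```python
import math
from typing import Tuple


def centered_page_bbox(
    paper_w: float, paper_h: float,
    page_w: float, page_h: float,
    scale: float = 1.0,
) -> Tuple[int, int, int, int]:
    """Box covered by a page after it is scaled and centered on the paper.

    Assumption for this sketch: no rotation, uniform scale, page centered.
    """
    # Translation that centers the scaled page on the paper.
    tx = (paper_w - page_w * scale) / 2.0
    ty = (paper_h - page_h * scale) / 2.0
    # Apply the same scale and translation to the page rectangle (0, 0, w, h),
    # rounding outward so the integer box never cuts into the content.
    return (
        math.floor(tx),
        math.floor(ty),
        math.ceil(page_w * scale + tx),
        math.ceil(page_h * scale + ty),
    )


# First page of the reported job: a 311 x 542 pt page on A4 (595 x 842 pt).
print(centered_page_bbox(595, 842, 311, 542))
```

For that page the sketch gives 142 150 453 692, close to the 142 149 453 692 that Ghostscript measured, whereas the untransformed box is the 0 0 311 542 that pdftops wrote.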
Comment 7 dehapama 2015-03-13 08:57:25 UTC
Created attachment 114277 [details] [review]
patch to Opensuse 13.2 version 0.26.5 of poppler to fix the calculation of the PageBoundingBox DSC comment

This is the patch for Opensuse 13.2 to calculate the PageBoundingBox. You have to apply it to the original package with rpmbuild.

For a description of the patch, see comment 6.
Comment 8 dehapama 2015-04-07 14:48:57 UTC
Created attachment 114919 [details] [review]
patch to version 2015-04-04 of poppler to fix the calculation of the PageBoundingBox DSC comment

The calculation of the page bounding box seems to be more complicated than I thought. So here is my second try.
Comment 9 dehapama 2015-04-07 14:50:24 UTC
Created attachment 114920 [details] [review]
patch to Opensuse 13.2 version 0.26.5 of poppler to fix the calculation of the PageBoundingBox DSC comment

And here is my second patch for the Opensuse 13.2 version 0.26.5 of poppler.
Comment 10 dehapama 2015-05-07 06:44:43 UTC
Created attachment 115615 [details] [review]
patch to version 2015-05-03 of poppler to fix the calculation of the PageBoundingBox DSC comment

The calculation of the page bounding box was still wrong. Hopefully it now works in all cases. This patch is against the git version of poppler. For a patch against the Opensuse 13.2 version, see my other patch.
Comment 11 dehapama 2015-05-07 06:49:56 UTC
Created attachment 115616 [details] [review]
patch to Opensuse 13.2 version 0.26.5 of poppler to fix the calculation of the PageBoundingBox DSC comment

The calculation of the page bounding box was still wrong. Hopefully it now works in all cases. This patch is against the Opensuse 13.2 version of poppler. For a patch against the git version, see my other patch.
Comment 12 William Bader 2015-10-28 23:53:35 UTC
Created attachment 119266 [details] [review]
alternate patch by Adrian Johnson

This is an alternate patch submitted to the poppler mailing list by Adrian Johnson on Oct 27. He added the text below.

==========

I've reviewed the patch and have the following comments.

It would have been a lot easier to review (and probably would have been
reviewed earlier) if you avoided the unnecessary changes to convert
if-else statements to case statements. Putting code style changes in a
separate patch to the bug fix makes reviewing changes much easier.

+ int(pbbty),
+ int(width * xScale + pbbtx + 0.5),

Using floor() and ceil() would be better and would make the code easier
to understand.
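
To illustrate the difference (a Python sketch, not poppler code): adding 0.5 and truncating rounds to the nearest integer, which can pull an edge inward, while floor() on the lower-left corner and ceil() on the upper-right corner always round outward. The sample coordinates are the HiResBoundingBox values Ghostscript printed in comment 4; the function names are illustrative only.

```python
import math
from typing import Tuple

HiRes = Tuple[float, float, float, float]  # (llx, lly, urx, ury)


def bbox_round_nearest(box: HiRes) -> Tuple[int, int, int, int]:
    """Mimic the int(x + 0.5) idiom: round each edge to the nearest integer."""
    return tuple(int(v + 0.5) for v in box)


def bbox_round_outward(box: HiRes) -> Tuple[int, int, int, int]:
    """floor() the lower-left corner and ceil() the upper-right corner."""
    llx, lly, urx, ury = box
    return (math.floor(llx), math.floor(lly), math.ceil(urx), math.ceil(ury))


hires = (115.991996, 286.505991, 478.835985, 556.631983)
print(bbox_round_nearest(hires))   # lower-left edges move inward
print(bbox_round_outward(hires))   # box still encloses the content
```

Outward rounding gives 115 286 479 557, matching the integer %%BoundingBox Ghostscript reported. Nearest rounding gives 116 287 479 557, which clips the lower-left edge by a fraction of a point.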

You appear to be ignoring the value of tx and ty prior to centering
calculations. What happens if tx and ty are nonzero at this point? I
would be more comfortable with the patch if the page bbox calculations
used the exact same transformation as is output to PS.

I'm attaching a new patch that I think is a lot easier to understand. It
uses the same transformation as is output to PS to calculate the
bounding box.
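
The approach described here, pushing the page rectangle through the same matrix that is written to the PostScript, can be sketched generically. The following Python illustrates the technique only and is not taken from the attached patch; the matrix uses the PostScript [a b c d e f] convention, and all function names are assumptions.

```python
import math
from typing import Sequence, Tuple


def transform_point(m: Sequence[float], x: float, y: float) -> Tuple[float, float]:
    """Apply a PostScript-style matrix [a b c d e f] to the point (x, y)."""
    a, b, c, d, e, f = m
    return (a * x + c * y + e, b * x + d * y + f)


def transformed_bbox(m: Sequence[float], width: float, height: float) -> Tuple[int, int, int, int]:
    """Integer box enclosing the rectangle (0, 0, width, height) after `m`.

    All four corners are transformed so that rotations are handled too.
    """
    corners = [(0.0, 0.0), (width, 0.0), (0.0, height), (width, height)]
    pts = [transform_point(m, x, y) for x, y in corners]
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return (math.floor(min(xs)), math.floor(min(ys)),
            math.ceil(max(xs)), math.ceil(max(ys)))


# Pure translation: move a 311 x 542 pt page by (142, 150).
print(transformed_bbox([1, 0, 0, 1, 142, 150], 311, 542))

# 90-degree rotation followed by a shift of 542 pt in x, so the rotated page
# lands back in the positive quadrant.
print(transformed_bbox([0, 1, -1, 0, 542, 0], 311, 542))
```

Because the box is derived from the matrix itself, any nonzero tx and ty already present in the transformation are picked up automatically, and the width and height swap under rotation without a special case.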
Comment 13 William Bader 2015-10-29 00:10:04 UTC
Created attachment 119271 [details]
simple multi-page example

The attached file is a simple example made by one of the people who reported the problem. Run "pdftops -paper A4 example-all.pdf" and open the resulting PostScript pages in "gv" (or any viewer that can toggle between an A4 view and a bounding box view). The bounding box view shows empty space or only a small portion of the image. The reason is that pdftops with -paper shifts and scales the image to fit A4 but does not apply the same transform to the bounding box, so the bounding box remains at the lower left corner of the page.
The same also happens with the addition of the pdftops "-expand" option.
Some of the discussion on the poppler list mentioned applications that use poppler like okular and CUPS, but the core issue is that pdftops with -paper generates an incorrect PageBoundingBox comment.
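
A quick way to inspect what pdftops wrote is to pull the DSC bounding-box comments out of the generated file. The Python sketch below does only that; read_dsc_bboxes is a hypothetical helper, and the sample text is the excerpt quoted in the original description.

```python
import re
from typing import Dict, List, Tuple

BBox = Tuple[int, int, int, int]

# Matches "%%BoundingBox: llx lly urx ury" and "%%PageBoundingBox: ...".
# "(atend)" forms are not matched and are therefore skipped.
_DSC_BBOX = re.compile(
    r"^%%(BoundingBox|PageBoundingBox):\s*(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)\s*$"
)


def read_dsc_bboxes(ps_text: str) -> Dict[str, List[BBox]]:
    """Collect the integer bounding-box DSC comments found in `ps_text`."""
    found: Dict[str, List[BBox]] = {"BoundingBox": [], "PageBoundingBox": []}
    for line in ps_text.splitlines():
        match = _DSC_BBOX.match(line)
        if match:
            key = match.group(1)
            found[key].append(tuple(int(g) for g in match.groups()[1:]))
    return found


sample = """%%BoundingBox: 0 0 595 842
%%PageBoundingBox: 0 0 311 542
%%PageBoundingBox: 0 0 317 548
"""
for name, boxes in read_dsc_bboxes(sample).items():
    print(name, boxes)
```

The extracted boxes can then be compared with the output of "gs -sDEVICE=bbox -dBATCH -dNOPAUSE" on the same file, as was done in the description.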
Comment 14 Adrian Johnson 2015-10-31 08:49:51 UTC
Created attachment 119311 [details] [review]
alternate patch

I've updated the patch to add a default case to the 'switch (rotate)'.
Comment 15 Albert Astals Cid 2015-12-02 21:58:47 UTC
Adrian, please commit; I trust you know what you're doing.

I tried to commit but could not come up with a meaningful commit message.
Comment 16 Adrian Johnson 2015-12-02 22:24:07 UTC
Pushed
