Bug 72312

Summary: [patch] pdftops: Fixes/improvements for -origpagesizes
Product: poppler Reporter: Till Kamppeter <till.kamppeter>
Component: utilsAssignee: poppler-bugs <poppler-bugs>
Status: RESOLVED FIXED QA Contact:
Severity: normal    
Priority: medium CC: hugh, korobkin+lpad
Version: unspecified   
Hardware: All   
OS: All   
Whiteboard:
i915 platform: i915 features:
Attachments: pdftops-origpagesizes-fixes.diff
CityMap-pdftopdf.pdf
A4-A3-A4-3-pdftopdf.pdf
launch_leaflet-pdftopdf.pdf
ensure page size takes into account rotation
fix media DSC comments
fix center bug when using crop box as page size
pdftops-origpagesizes-fixes-qt4.patch
pdftops-paper-segfault-fix.patch
pdftops-origpagesizes-papersize-setpagedevice-fix.patch
0001 - ensure page size takes into account rotation
0002 - fix media DSC comments
0003 - use crop box as page size
0004 - remove origpagesizes mode
0005 - only change paper size when different to previous page
0002 - fix media DSC comments
0002 - fix media DSC comments
0002 - fix media DSC comments
0006 - pdftocairo fixes
0007 cairo clip to crop box
0008 ensure there is always a valid page size in the output

Description Till Kamppeter 2013-12-04 12:50:32 UTC
Created attachment 90227 [details]
pdftops-origpagesizes-fixes.diff

In reality the fix is in the PostScript output device, but there is no appropriate item to select as component.

There are some distro bug reports about Landscape-oriented page printing with CUPS (1.6.x or newer), cups-filters, and PDF->PostScript conversion with Poppler's "pdftops":

https://bugzilla.redhat.com/show_bug.cgi?id=768811
https://bugs.launchpad.net/ubuntu/+source/poppler/+bug/1243484
https://bugs.launchpad.net/ubuntu/+source/poppler/+bug/1247740

As a quick workaround distro package maintainers have removed the "-origpagesizes" from the call of Poppler's /usr/bin/pdftops by the /usr/lib/cups/filter/pdftops filter of cups-filters. I have not taken this workaround patch upstream into cups-filters as the real problem is in Poppler's PostScript output device.

When printing with CUPS 1.6.x or newer print jobs in PDF format are first treated by the pdftopdf filter (of cups-filters). One thing this filter does is rotating Landscape-oriented pages by 90 degrees for printers which take their paper only short-edge first (portrait orientation). For this rotation not the actual file content is rotated, but a rotation field is set (usually incremented by 90). You can see it by running "pdfinfo" on the input and on the output of pdftopdf. If the first page of the PDF file is Landscape-oriented, the "Page rot:" entry will change by 90 degrees.

I have displayed such a PDF file with several different viewers (evince, gs, gv) and the rotation field is taken into account, Landscape-oriented pages are rotated. Only when I run such a file through

pdftops -origpagesizes

the rotation is not taken into account and in the resulting PostScript file the Landscape-oriented pages appear unrotated, leading to wrong printouts when the /usr/lib/cups/filter/pdftops filter calls "pdftops -origpagesizes". This was leading to the wrong printouts the users complained about in the bug reports. The workaround of dropping "-origpagesizes" works well if all pages of the input document are of the same size but if they are variable sizes, pages can get cut or squeezed (this is why I contributed the "-origpagesizes" mode to Poppler earlier).

I looked into the implementation of "-origpagesizes" and realized that I forgot to take the rotation field into account. I also realized that the paperMatch mode of the normal PostScript output mode (if you run pdftops without "-paper", "-paperw", and "-paperh") is very similar to "-origpagesizes" and takes the rotation field into account.

So my attached patch folds the "-origpagesizes" mode implenmentation into paperMatch and only does slight modifications. The original paperMatch mode behavior is conserved to not break other programs using it.

The differences of the new "-origpagesizes" mode are the following:

1. The rotation fields of the input PDF pages are taken into account.

2. The "%%DocumentMedia: ..." and following "%%+ ..." lines are removed. Most PostScript displayers assume all pages being of the size in the "%%DocumentMedia: ..." line. Without these lines they have no problems displaying each page in the correct size.

3. The output pages are never centered, "-origpagesizes" activates "-nocenter" implicitly. Normally, there should be no difference as the output pages get the sizes of the input pages but there are some odd files which need this measure.

With this all PDF input files convert into PS output files where all pages display in the same size and orientation in the input and output files.
Comment 1 Till Kamppeter 2013-12-04 12:59:03 UTC
Created attachment 90228 [details]
CityMap-pdftopdf.pdf

Sample file:

till@till-twist:~$ pdfinfo CityMap-pdftopdf.pdf 
Creator:        Adobe InDesign CS3 (5.0.4)
Producer:       Adobe PDF Library 8.0
CreationDate:   Thu Jul  9 10:24:27 2009
ModDate:        Thu Jul  9 10:24:31 2009
Tagged:         no
Form:           none
Pages:          1
Encrypted:      no
Page size:      841.89 x 595.276 pts (A4)
Page rot:       90
File size:      326314 bytes
Optimized:      no
PDF version:    1.4
till@till-twist:~$ 

As one can see at the "Page size:" and "Page rot:" entries, this file is Landscape-oriented and got rotated by 90 degrees. Without my patch the output of

pdftops -origpagesizes CityMap-pdftopdf.pdf

is not rotated, with my patch it is.
Comment 2 Till Kamppeter 2013-12-04 13:19:21 UTC
Created attachment 90229 [details]
A4-A3-A4-3-pdftopdf.pdf

This file (example for a book with a fold-out page) is created with LibreOffice and run through pdftopdf. It contains 3 pages:

1. A4 portrait, no rotation
2. A3 landscape, 90 degree rotation
3. A4 portrait, no rotation

So page size, orientation, and rotation vary.

Printing with the workaround of dropping "-origpagesizes" fails:

pdftops A4-A3-A4-3-pdftopdf.pdf

makes the second page being cropped to A4 portrait.

pdftops -paper A4 A4-A3-A4-3-pdftopdf.pdf

makes the second page being squeezed into A4 portrait.

In both cases the second page is correctly rotated.

So "-origpagesizes" is needed and without my patch

pdftops -origpagesizes A4-A3-A4-3-pdftopdf.pdf

leaves the second page unrotated and so on a printer which loads A3 paper short-edge-first the second page is not correctly printed. With my patch it gets rotated. The leaving out of the "%%DocumentMedia: ..." line in the PostScript output also assures that all PostScript viewers can cope with the changing page sizes.
Comment 3 Till Kamppeter 2013-12-04 13:26:50 UTC
Created attachment 90232 [details]
launch_leaflet-pdftopdf.pdf

This is the odd file which needs centering being suppressed when converting to PostScript with "-origpagesizes". It consists of 7 pages. The first and the last are A5 portrait but somewhat broken, the rest is A4 landscape and treated absolutely correctly with my patch already before I added the suppressing of centering. The first and last page seem to be in an invisible A4 landscape frame and in the PostScript output they are moved half out of the A5 portrait frame when centering is in place. With my patch as it is attached the PostScript output is correct.
Comment 4 Till Kamppeter 2013-12-04 13:30:35 UTC
Note that I have folded the "-origpagesizes" mode into the matchPaper mode but I have taken into account that the matchPaper mode's output does not change and the the "-origpagesizes" mode gives correct output for printing. Feel free to change the matchPaper mode so that it is equal with the "-origpagesizes" mode, but do not modify the "-origpagesizes" mode as long as it is not for fixing another odd file's output. Make sure that all the attached files keep giving the correct output.
Comment 5 Till Kamppeter 2013-12-05 20:52:30 UTC
Also reported to Debian as

http://bugs.debian.org/610014
Comment 6 Adrian Johnson 2013-12-06 11:10:29 UTC
Thanks for the patch. Generally it is better if each change is in a separate patche. It makes it easier to review and easier to git bisect if there are regressions.

> When printing with CUPS 1.6.x or newer print jobs in PDF format are first
> treated by the pdftopdf filter (of cups-filters). One thing this filter does
> is rotating Landscape-oriented pages by 90 degrees for printers which take
> their paper only short-edge first (portrait orientation). For this rotation
> not the actual file content is rotated, but a rotation field is set (usually
> incremented by 90). You can see it by running "pdfinfo" on the input and on
> the output of pdftopdf. If the first page of the PDF file is
> Landscape-oriented, the "Page rot:" entry will change by 90 degrees.

The filter is rotating landscape pages the wrong way. Landscape pages on a portrait oriented sheet should be rotated 90 degrees counterclockwise. Your filter is rotating 90 degrees clockwise.

Have a look at any book that has tables in landscape orientation. It is always rotated counterclockwise. The reason is if you rotate the book to the read the landscape page the lower page number should be at the top and you can advance to the next sheet by turning the page at the bottom.

See also the PS Language Ref Manual 3d ed. page 401: "... rotates the default user space for landscape orientation 90 degrees counterclockwise with respect to that for portrait orientation."

> The differences of the new "-origpagesizes" mode are the following:
> 
> 1. The rotation fields of the input PDF pages are taken into account.

Looks ok to me. I tried it with your test cases and it fixes the problem.

> 2. The "%%DocumentMedia: ..." and following "%%+ ..." lines are removed.
> Most PostScript displayers assume all pages being of the size in the
> "%%DocumentMedia: ..." line. Without these lines they have no problems
> displaying each page in the correct size.

This one I disagree with. The bug is in poppler, not the PS viewers. Looking at the PS output of launch_leaflet-pdftopdf.pdf without your patch:

%%DocumentSuppliedResources: (atend)
%%DocumentMedia: 840x596 840 596 0 () ()
%%BoundingBox: 0 0 840 596

There is only one media size listed even though I can see different page sizes in evince and acroread:

The first page which according to acroread has a page size of 5.83 x 8.27" has this:

%%Page: 1 1
%%PageMedia: 840x596
%%PageBoundingBox: 0 0 840 596
%%PageBoundingBox: 0 0 420 596
%%BeginPageSetup
%%PageOrientation: Portrait
<</PageSize [420 596]>> setpagedevice
pdfStartPage
0 0 420 596 re W
%%EndPageSetup

Obviously there should not be two page bounding boxes. Your patch gets rid of the second one. The setpagedevice operator and one of the page bounding boxes has the correct size but the %%PageMedia is wrong.

Removing %%DocumentMedia prevents the %%PageMedia from being used so the viewer falls back to the size set in setpagedevice. This just papers over the bug.

The cause of this bug is that the first page of launch_leaflet-pdftopdf.pdf has MediaBox set to 840x596 and CropBox set to 420x596. The code that outputs the DocumentMedia lines is using the media box while the page size on each page is the crop box. This should be fixed so the sizes in the DocumentMedia/PageMedia is consistent with the size used by the setpagesize operator.

> 3. The output pages are never centered, "-origpagesizes" activates
> "-nocenter" implicitly. Normally, there should be no difference as the
> output pages get the sizes of the input pages but there are some odd files
> which need this measure.

This one I also disagree with. If cups-filters does not want pages centered it can easily add the -nocenter option when it runs pdftops. cups-filters is not the only user of pdftops -origpagesizes. If we make -origpagesizes default to -nocenter it will not be possible to get centered output when using the -origpagesizes option since we do not have a -center option. Even if we did have a -center option it would be a regression to change the default.
Comment 7 Till Kamppeter 2013-12-06 12:30:46 UTC
> The filter is rotating landscape pages the wrong way.

OK, I will fix pdftopdf to rotate counterclockwise by default (if the PPD does not state otherwise).

>> 2. The "%%DocumentMedia: ..." and following "%%+ ..." lines are removed.
[...]
> This one I disagree with.
[...]
> The cause of this bug is that the first page of launch_leaflet-pdftopdf.pdf has 
> MediaBox set to 840x596 and CropBox set to 420x596. The code that outputs the 
> DocumentMedia lines is using the media box while the page size on each page is 
> the crop box. This should be fixed so the sizes in the DocumentMedia/PageMedia 
> is consistent with the size used by the setpagesize operator.

So should all use the CropBox here? Will you fix that?

>> 3. The output pages are never centered
[...]
> This one I also disagree with.

No problem, I can add "-nocenter" to the pdftops call in cups-filters and you leave out this last hunk of my patch.
Comment 8 Till Kamppeter 2013-12-06 13:42:42 UTC
I have fixed the pdftopdf filter in cups-filters now, rotating counter-clockwise by default and (cups-filters BZR rev. 7136) and I have added "-nocenter" to the
Comment 9 Till Kamppeter 2013-12-06 13:44:21 UTC
I have fixed the pdftopdf filter in cups-filters now, rotating counter-clockwise by default and (cups-filters BZR rev. 7136) and I have added "-nocenter" to the "pdftops -origpagesizes" call (cups-filters BZR rev. 7137).
Comment 10 Albert Astals Cid 2013-12-12 20:57:11 UTC
Adrian, you taken care of this? I got lost in the middle in the discussion :D
Comment 11 Adrian Johnson 2013-12-15 08:02:03 UTC
Created attachment 90792 [details] [review]
ensure page size takes into account rotation

The page rotation part of Till's patch.
Comment 12 Adrian Johnson 2013-12-15 08:03:47 UTC
Created attachment 90793 [details] [review]
fix media DSC comments

This fixes the DSC media size and bbox comments to ensure they are consistent with the selected page size.
Comment 13 Adrian Johnson 2013-12-15 08:07:21 UTC
Created attachment 90794 [details] [review]
fix center bug when using crop box as page size

I had a look at why the page centering was not working correctly with launch_leaflet-pdftopdf.pdf. The reason is the different media box and crop box sizes. If the -nocrop option is not used the crop box is assumed to be the page size. But the centering code was still trying center the page within the media box size.
Comment 14 Adrian Johnson 2013-12-15 08:16:37 UTC
While working on this I found another problem with the page size setup. With -origpagesizes setpagedevice is output twice. The first time by the pdfSetupPaper PS macro (which calls setoutputdevice) using imgURX, imgURY which is the media box size. Then setpagedevice is output with width, height.

There is also this comment

// Set page size only when it actually changes, as otherwise Duplex
// printing does not work

so I'm not sure how this can be correct since the pdfSetupPaper is unconditionally calling setpagedevice.

Till, could you look at this and see if it can be fixed to only output setpagedevice once (preferably using the pdfSetupPaper macro since that is what is used for mode = psModePS.
Comment 15 Till Kamppeter 2013-12-15 12:30:48 UTC
Adrian, I have done a test with my HP Color LaserJet CM3530 MFP now. I have used pdftops with my original patch. With the output of pdftops (both "pdftops -origpagesizes" and "pdftops -paper A4") duplex printing does not work. If I comment out the pdfSetupPaper macro calls in the PageSetup sections of each page, duplex works. so the pdfSetupPaper macro actually breaks duplex.

So what has to be done is to do a setpagedevice of the page size only if it has changed compared to the previous page. Unfortunately, I have not enough knowledge in PostScript programming to fix the macro, so I am asking you whether you could fix it.

I am not sure whether the macro only does the equivalent of

<</PageSize[595 842]/ImagingBBox null>>setpagedevice

In this case we could simply skip its call when the paper size does not change (at least when I comment it out I get a correct printout and duplex works).
Comment 16 Till Kamppeter 2013-12-15 12:52:52 UTC
Additional note: The problem of duplex not working is not only there if "-origpagesizes" is used but also in the other modes of pdftops (I tested also with "-paper A4".
Comment 17 Albert Astals Cid 2013-12-15 15:19:27 UTC
Comment on attachment 90793 [details] [review]
fix media DSC comments

Review of attachment 90793 [details] [review]:
-----------------------------------------------------------------

::: qt5/src/poppler-ps-converter.cc
@@ +224,4 @@
>  	                                     d->paperWidth,
>  	                                     d->paperHeight,
>  	                                     gFalse,
> +	                                     gFalse,

this is probably needed in the qt4/ one too
Comment 18 Adrian Johnson 2013-12-16 11:33:39 UTC
While fixing some more bugs I was looking into why we have the origPageSizes mode when "-paper match" seems to do the same thing. The current code, without the patches in this bug applied, does the following:

The "-paper match" option uses the media box as the paper size and the crop box as the pdf page size. The -expand -noshrink and -nocenter options change the scale and position of the pdf page within the paper. If "-nocrop" is used the pdf page size is the media box.

The "-origpagesizes" option uses the crop box as the paper size (unless -nocrop is used in which case the media box is the paper size). No scaling or translating is performed.

The problem with "-paper match" is it is using media box as the page size but the glib and qt frontends use crop box as the page size. Acroread uses the crop box as the page size for both display and printing.

The problem with "-origpagesizes" is, in addition to the rotate bug, if the crop box has a non zero origin the output will not be positioned correctly since there is no attempt to translate the origin.

My proposed solution is to make "-paper match" use the crop box as both the paper size and pdf page size. This is consistent with acroread. The -nocrop option will use the media box as the paper and page size. Then remove the psModePSOrigPageSizes mode from PSOutputDev and make the "-origpagesizes" option in pdftops.cc an alias for "-paper match". Till's patch has started moving in this direction by sharing the paperMatch code for both "-paper match" and "-origpagesizes". With the paper size and pdf page size equal to the crop box the output should be identical to what is displayed on the screen no matter what the scale or center options are set to. If this is not the case we should fix the scale/translate code instead of trying to bypass it resulting in the crop box origin bug.
Comment 19 Adrian Johnson 2013-12-16 11:40:00 UTC
(In reply to comment #15)
> I am not sure whether the macro only does the equivalent of
> 
> <</PageSize[595 842]/ImagingBBox null>>setpagedevice
> 
> In this case we could simply skip its call when the paper size does not
> change (at least when I comment it out I get a correct printout and duplex
> works).

That is basically all the macro does. It also checks if the setpagedevice operator exists before calling it.

So yes we could just skip calling the macro when the page size does not change. Alternatively I could change the macro to only change the page size if it is different to the previous page. This has the advanatge of preserving page independence as required by the DSC specification. ie the macro will be output for every page so pages can be printed in any order and the correct page size will be used but the macro will not alter the page size if the page size in the page device dictionary is already correct.
Comment 20 Till Kamppeter 2013-12-16 13:19:23 UTC
Created attachment 90835 [details] [review]
pdftops-origpagesizes-fixes-qt4.patch

Missing Qt4 change.
Comment 21 Till Kamppeter 2013-12-16 13:24:52 UTC
Created attachment 90836 [details] [review]
pdftops-paper-segfault-fix.patch

The patches make "pdftops -origpagesizes" (and also pdftops without any "-paper", "-paperh", and "-paperw" arguments) work well and correctly, but using any "-paper", "-paperh", and "-paperw" argument causes an immediate segfault. The attached patch fixes this.
Comment 22 Till Kamppeter 2013-12-16 13:53:54 UTC
Created attachment 90837 [details] [review]
pdftops-origpagesizes-papersize-setpagedevice-fix.patch

Adrian, attached patch addresses your comment #14. pdfSetupPaper indeed only does the setpagedevice to set the paper size. If the "-paper" argument is supplied to pdftops it is called only one time in the setup section as with a forced paper size the paper size does not change throughout the document's pages. Without "-paper" ("-origpagesizes" or paperMatch mode) pdfSetupPaper has only to be called when the page sizes change. The attached patch fixes this.
Comment 23 Till Kamppeter 2013-12-16 14:12:12 UTC
Adrian, I did not see your comment #19 when I added my further patches, the idea to check whether the page size changed against the previous page inside the macro is really great, as one can easily re-order the pages. Can you implement this? Thanks.
Comment 24 Till Kamppeter 2013-12-16 19:44:17 UTC
Adrian, I am absolutely OK with your idea of comment #18. Then we could really remove the last difference of "-origpapersizes" (in your patch plus my new patches) which is suppressing the centering. We only need to make sure that all sample files come out correctly.
Comment 25 Adrian Johnson 2013-12-19 20:56:44 UTC
Created attachment 91000 [details] [review]
0001 - ensure page size takes into account rotation

These 5 patches fix pdftops to work as described in comment 18.

0001 - Till's patch to ensure make -origpagesizes uses the page match code.
Comment 26 Adrian Johnson 2013-12-19 20:59:36 UTC
Created attachment 91002 [details] [review]
0002 - fix media DSC comments

Fix the DSC comments so PS viewers use the correct page size. For common paper sizes it also uses the paper name eg "A4" as some viewers such as gv will display this name as the page size.
Comment 27 Adrian Johnson 2013-12-19 21:01:45 UTC
Created attachment 91003 [details] [review]
0003 - use crop box as page size

As noted in comment 18 we should be using the crop box as the page size. This also fixes the centering bug with launch_leaflet-pdftopdf.pdf.
Comment 28 Adrian Johnson 2013-12-19 21:04:07 UTC
Created attachment 91004 [details] [review]
0004 - remove origpagesizes mode

The origpagesizes mode is redundant now that it is exactly the same as "-paper match". So remove the code and make -origpagesizes an alias for "-paper match".
Comment 29 Adrian Johnson 2013-12-19 21:06:56 UTC
Created attachment 91005 [details] [review]
0005 - only change paper size when different to previous page

This patch puts the "don't change paper size unless different to previous page" check in the pdfSetupPaper macro. This preserves page independence allowing pages to be printed in any order since every page now contains the set page size PS code but it is only executed if different from the previous page size.
Comment 30 Till Kamppeter 2013-12-19 23:47:34 UTC
Adrian, thank you very much.

I have applied all the five new patches instead of the old ones and tested all attached example files and also duplex printing. Everything works as intended. Please commit all the patches upstream.
Comment 31 Albert Astals Cid 2013-12-20 00:27:00 UTC
Adrian, I'd like to run a few bare bones testing to make sure it doesn't change the behaviour of the regular pdftops command, is it ok for you if i run the tests and report back before you commit? Will take 2 or 3 days at most.
Comment 32 Adrian Johnson 2013-12-20 02:05:33 UTC
(In reply to comment #31)
> Adrian, I'd like to run a few bare bones testing to make sure it doesn't
> change the behaviour of the regular pdftops command, is it ok for you if i
> run the tests and report back before you commit? Will take 2 or 3 days at
> most.

Yeah, better to do some reg tests before committing.
Comment 33 Albert Astals Cid 2013-12-20 21:54:07 UTC
Hmm, somehow these patches make the convert utility very unhappy. For example, take https://bugs.kde.org/attachment.cgi?id=9998 and run

pdftops -f 1 -l 1 file.pdf file.ps
convert file.ps file.png

That will work without the patches but with the patches it fails. Any idea why this may be happening?
Comment 34 Till Kamppeter 2013-12-20 22:03:09 UTC
Albert, I cannot reproduce your problem on Ubuntu Saucy. For me the "convert"step works perfectly.

My "convert" is from Graphicsmagick 1.3.16-1.1ubuntu2. Poppler is 1_0.24.1-0ubuntu1 with the 5 current patches from this bug report applied.
Comment 35 Till Kamppeter 2013-12-20 22:14:34 UTC
On a Ubuntu Trusty box I have ImageMagick 6.7.7.10-6ubuntu1 and here the "convert" fails:

till@till-twist:~/Documents$ convert file.ps file.png
Unrecoverable error: rangecheck in .putdeviceprops
Unrecoverable error: rangecheck in .putdeviceprops
convert.im6: Postscript delegate failed `file.ps': No such file or directory @ error/ps.c/ReadPSImage/832.
convert.im6: no images defined `file.png' @ error/convert.c/ConvertImageCommand/3044.
till@till-twist:~/Documents$ 

A direct Ghostscript call like

gs -sDEVICE=png16m -sOutputFile=file.png file.ps

and any attempt to display file.ps on the screen (gs, gv, evince) or to print it unfiltered to a PostScript printer works.

For me it looks like a Ghostscript problem. One would need to find out which Ghostscript command line is used by ImageMagick's "convert".
Comment 36 Adrian Johnson 2013-12-21 01:03:00 UTC
ImageMagick is using gslib. The command passed to gslib is;

(gdb) p command
$3 = "\"gs\" -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 \"-sDEVICE=pngalpha\" -dTextAlphaBits=4 -dGraphicsAlphaBits=4 \"-r72x72\" -g18446744073709551616x18446744073709551616  \"-sOutputFile=/tmp/magick-2413941yb1GUbrpBA%d\" \"-f/tmp/magick-24139E3UZ0BJSk5zw\" \"-f/tmp/magick-24139AG3GGDNM9Vys\"", '\000' <repeats 3759 times>

The output image size seems to be the problem:

  -g18446744073709551616x18446744073709551616
Comment 37 Adrian Johnson 2013-12-21 01:25:11 UTC
Created attachment 91064 [details] [review]
0002 - fix media DSC comments

The problem was this line in the PS output:

%%BoundingBox: 0 0 -1 -1

I've update the DSC comments patch to fix it.
Comment 38 Till Kamppeter 2013-12-21 10:41:21 UTC
Adrian, I have tested your fix and it works now.
Comment 39 Albert Astals Cid 2013-12-21 15:53:20 UTC
When running the same thing (i.e. pdftops on a single page and then convert) I'm getting files of different sizes

tsdgeos@xps:~/okularfiles/pdf/scripts$ file new.png 
new.png: PNG image data, 595 x 792, 16-bit/color RGB, non-interlaced
tsdgeos@xps:~/okularfiles/pdf/scripts$ file old.png 
old.png: PNG image data, 596 x 792, 16-bit/color RGB, non-interlaced

Any idea why is that happening?
Comment 40 Adrian Johnson 2013-12-21 20:28:49 UTC
Created attachment 91104 [details] [review]
0002 - fix media DSC comments

The bounding box should have been rounded up.
Comment 41 Till Kamppeter 2013-12-21 22:26:54 UTC
I cannot reproduce the problem described in comment #39 but as the fix of comment #40 is more than a trivial change I have done a regression test to check whether all other problems described in this bug report stay solved and they do, so the current state of the patches is OK.
Comment 42 Adrian Johnson 2013-12-21 22:43:56 UTC
(In reply to comment #41)
> I cannot reproduce the problem described in comment #39 

You will only see the bug if the PDF contains a non integer page size.
eg 595.2 x 841.3. As the bounding box DSC comment can only use integers this needs to be rounded up to 596 x 842.
Comment 43 Albert Astals Cid 2013-12-22 16:27:35 UTC
When running pdftops trhough (without any extra option) in https://bugs.kde.org/attachment.cgi?id=10851 the page size is changed by these patches.

Any idea why that is happening?
Comment 44 Adrian Johnson 2013-12-22 21:11:56 UTC
(In reply to comment #43)
> When running pdftops trhough (without any extra option) in
> https://bugs.kde.org/attachment.cgi?id=10851 the page size is changed by
> these patches.
> 
> Any idea why that is happening?

Looking at the output of pdfinfo -box:

Page size:      551 x 747 pts
MediaBox:           0.00     0.00   612.00   792.00
CropBox:           38.00     5.00   589.00   752.00
BleedBox:          38.00     5.00   589.00   752.00
TrimBox:           38.00     5.00   589.00   752.00
ArtBox:            38.00     5.00   589.00   752.00


The mediabox is different from the crop box. As noted in comment 18, pdftops was using the media box as the page size but acroread uses the crop box for display size and printing paper size, glib and qt use the crop box for display size and pdfinfo as shown above uses the crop box as the page size.

I changed pdftops to use the crop box as the paper size (unless -nocrop is specified) to be consistent with acroread and fix the printing of the pdfs in this bug report.
Comment 45 Albert Astals Cid 2013-12-23 14:11:47 UTC
ok, if adobe does the same maybe it makes sense. Are you making the qt-ps-converters do the same too?
Comment 46 Adrian Johnson 2013-12-23 22:26:14 UTC
(In reply to comment #45)
> Are you making the qt-ps-converters do the same too?

It should work. All the changes are internal to the PSOutputDev class. I tried qt4-demo but it doesn't seem to have a print function.
Comment 47 Albert Astals Cid 2013-12-28 21:48:50 UTC
While running pdftops -f 30 -l 30 over
https://bugs.freedesktop.org/attachment.cgi?id=9910 the patched version segfaults while the old one does not. Can you have a look?
Comment 48 Adrian Johnson 2013-12-28 22:56:34 UTC
Created attachment 91266 [details] [review]
0002 - fix media DSC comments
Comment 49 Adrian Johnson 2013-12-30 07:34:56 UTC
Created attachment 91314 [details] [review]
0006 - pdftocairo fixes

Apply origpagesize and crop box changes to pdftocairo to ensure it works the same way as pdftops.
Comment 50 Adrian Johnson 2013-12-30 07:36:22 UTC
Created attachment 91315 [details] [review]
0007 cairo clip to crop box

Another crop box fix for cairo.
Comment 51 Till Kamppeter 2014-01-02 19:28:26 UTC
Generally, the patches work well, but they have a problem. They break the ABI. I have backported them into the current Ubuntu package and now several other packages which use Poppler, like evince, do not work any more. Is it not possible to solve our problem without ABI change?
Comment 52 Adrian Johnson 2014-01-02 21:56:27 UTC
(In reply to comment #51)
> Is it not possible to solve our problem without ABI change?

Unfortunately not. But you don't need all the patches to fix the bug you reported. You can omit 0002 since DSC comments don't affect printing. 0004 can be omitted since that is a cleanup that doesn't affect functionality.
Comment 53 Till Kamppeter 2014-01-02 22:47:45 UTC
Thank you very much, I have now reverted to my original patch plus a patch to fix duplex so that in the current Ubuntu package this bug is fixed. Your patches will go in as part of a later Poppler update (to a Poppler version which includes the patches).
Comment 54 Till Kamppeter 2014-01-09 22:53:35 UTC
Are the patches fixing pdftocairo the same way as pdftops so that the PS output of pdftocairo has everything fixed which was fixed in pdftops?

See the following bug report

https://bugs.linuxfoundation.org/show_bug.cgi?id=1176

The rporter tells that all is OK when using pdftops and he gets a wrong result with pdftocairo.
Comment 55 Adrian Johnson 2014-01-09 23:13:19 UTC
(In reply to comment #54)
> Are the patches fixing pdftocairo the same way as pdftops so that the PS
> output of pdftocairo has everything fixed which was fixed in pdftops?
> 
> See the following bug report
> 
> https://bugs.linuxfoundation.org/show_bug.cgi?id=1176
> 
> The rporter tells that all is OK when using pdftops and he gets a wrong
> result with pdftocairo.

He also filed bug 73452 a few minutes ago. I assumed CUPS should be setting the page size when IncludeFeature is used as described in:

http://lists.cairographics.org/archives/cairo/2006-February/006278.html

but this does not seem to work. Looks like I will need to use setpagedevice in the cairo PS output.
Comment 56 Alex Korobkin 2014-01-10 17:09:34 UTC
Yes, I tried to apply all the patches listed on this bug to poppler 0.24.5, with cairo being 1.12.16, and the issue was not resolved for me. Thanks for looking into this.
Comment 57 Adrian Johnson 2014-01-27 09:13:21 UTC
Albert,
Any update on the review/testing of these patches?
Comment 58 Albert Astals Cid 2014-01-27 18:41:54 UTC
Not a biggie since it's a "broken" file, but when running pdftops on page 30 of https://bugs.freedesktop.org/attachment.cgi?id=9910 the unpatched version creates something that gs can read (even if ends up beign empty) while the patched version creates something that makes gs unhappy. Can you have a quick look to see how hard would it be to get doesn't make gs hate the output ps file?
Comment 59 Adrian Johnson 2014-01-27 19:39:14 UTC
Created attachment 92877 [details] [review]
0008 ensure there is always a valid page size in the output
Comment 60 Albert Astals Cid 2014-02-04 23:34:11 UTC
Ok, regtest finished, all looks good, please commit!
Comment 61 Adrian Johnson 2014-02-05 09:15:01 UTC
Pushed.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.