Bug 19777

Summary: pdftops command line utility does not convert multiple-page-size documents correctly
Product: poppler Reporter: Till Kamppeter <till.kamppeter>
Component: general Assignee: poppler-bugs <poppler-bugs>
Status: RESOLVED MOVED QA Contact:
Severity: normal    
Priority: medium    
Version: unspecified   
Hardware: All   
OS: All   
Whiteboard:
Attachments: PDF file with different page sizes: 1: A4 port, 2: A3 landsc, 3: A3 port, 4: A4 landsc
Patch to add support for multiple page size output
Improved patch to add support for multiple page size output
Patch for the pdftops man page
Complete updated man page of pdftops
Fixed patch to add support for multiple page size output

Description Till Kamppeter 2009-01-28 02:29:36 UTC
Created attachment 22304
PDF file with different page sizes: 1: A4 port, 2: A3 landsc, 3: A3 port, 4: A4 landsc

If one converts a PDF file whose pages have different sizes (example attached) to PostScript with pdftops, the resulting file has all pages scaled to the size of the first page.

I have tried without command line options (except the file names) and with many different combinations of "-nocrop", "-noshrink", "-nocenter", "-paper match" and did not get a PostScript file with the original page sizes. Giving "-noshrink" keeps the page content in its original size, but the page size still stays the size of the first page and so all pages bigger than the first page get cropped.

This way pdftops cannot be used to print a PDF file with pages of different sizes on a PostScript printer which provides the different paper sizes on its different trays.
Comment 1 Albert Astals Cid 2009-01-31 15:58:52 UTC
That's more a wish than a bug, in the sense that pdftops was never meant to output different page sizes per page :D But I accept this is interesting.

It is not very difficult: it would involve adding an option to PSOutputDev telling it not to use just one page size and then outputting
  <</PageSize [a b ] >> setpagedevice
after
%%BeginPageSetup

But at the moment I don't have any time to fix it. I'll put it in the top-ish part of the TODO, but that doesn't mean it will ever be done. If someone can work on it, drop a note here.
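
The idea above can be sketched in a few lines of Python. This is only an illustration of the PostScript that would be emitted for one page, not poppler's C++ code; the function name `page_setup_lines` is hypothetical, and sizes are assumed to be in PostScript points.

```python
def page_setup_lines(width, height):
    """Return a DSC page setup section that requests the given page size.

    The setpagedevice request is placed right after %%BeginPageSetup,
    as suggested in the comment above.
    """
    return [
        "%%BeginPageSetup",
        f"<</PageSize [{width} {height}]>> setpagedevice",
        "%%EndPageSetup",
    ]


# Example: page setup for an A3 portrait page (842 x 1191 points).
print("\n".join(page_setup_lines(842, 1191)))
```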
Comment 2 Till Kamppeter 2009-06-01 04:02:57 UTC
Created attachment 26338
Patch to add support for multiple page size output

The attached patch fixes the problem by adding a new "-origpagesizes" output mode to the "pdftops" utility.

pdftops -origpagesizes combined.pdf

produces the expected output as combined.ps with combined.pdf being the attached sample PDF file.

The patch works as described by Albert Astals Cid in comment #1. The PostScript output device gets a new output mode, psModePSOrigPageSizes. If it is selected, pages are not scaled and rotated to fit a given output page size; instead, the original page sizes are preserved and no page is rotated. The page headers contain page size requests of the form

<</PageSize [<width> <height>]>> setpagedevice

to make the printer switch to the appropriate paper tray.

The patch does not change the API or default behavior of Poppler. The new mode is only used when explicitly selected.
Comment 3 Till Kamppeter 2009-06-03 06:03:48 UTC
Created attachment 26393
Improved patch to add support for multiple page size output

I have improved the patch somewhat. With the original patch there were no "%%PageBoundingBox:" comments, so the pstops CUPS filter added them to each page, but always using the default page size from the PPD file. Now correct "%%PageBoundingBox:" comments are added directly, so pstops no longer adds its own. I do not know whether this behavior of pstops would break anything when printing with changing paper sizes, but it is better to have it right.
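
As a rough illustration of such a per-page comment, here is a Python sketch. It assumes the page origin is at 0 0 and rounds fractional sizes up so the box encloses the whole page; the helper name is hypothetical and the rounding choice is an assumption of this sketch, not necessarily what the patch does.

```python
import math


def page_bounding_box_comment(width, height):
    """Build a DSC %%PageBoundingBox comment for one page.

    DSC bounding boxes are written as integers, so fractional page
    sizes in points are rounded up (assumption of this sketch).
    """
    return f"%%PageBoundingBox: 0 0 {math.ceil(width)} {math.ceil(height)}"


# Example: an A4 portrait page given with fractional point values.
print(page_bounding_box_comment(595.276, 841.89))
```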
Comment 4 Albert Astals Cid 2009-06-03 13:54:10 UTC
should -origpagesizes conflict with -paper, -paperw, -paperh?

Could you also add the switch to the man page?
Comment 5 Till Kamppeter 2009-06-03 14:27:45 UTC
Currently, -origpagesizes overrides -paper, -paperw, -paperh, at least if the input file specifies paper sizes for its pages. But is it possible to have a PDF not specifying its paper sizes, like the PostScript file /usr/share/cups/data/testprint.ps (CUPS test page)?
Comment 6 Albert Astals Cid 2009-06-03 14:34:49 UTC
Hmmm, yeah, no idea what this could mean, leave it as it is for now.

Well, what about adding the documentation to the man page?
Comment 7 Till Kamppeter 2009-06-03 15:16:38 UTC
Created attachment 26408
Patch for the pdftops man page

Here we go: The patch for the updated man page of pdftops.
Comment 8 Till Kamppeter 2009-06-03 15:17:54 UTC
Created attachment 26409
Complete updated man page of pdftops

This is the complete updated man page of pdftops.
Comment 9 Albert Astals Cid 2009-06-04 10:50:00 UTC
Committed to master
Comment 10 Till Kamppeter 2009-06-22 08:27:16 UTC
Unfortunately, my first patch was not perfect. It correctly switches paper sizes so that the printer switches trays, but because it sets the page size in every page header, duplex is reset to the front side of the sheet on every page. The back sides are therefore never used and all jobs come out one-sided. The attached patch replaces my first one and fixes the problem: it compares the size of the current page with the size of the previous page and adds a page size request only when the page size changes.
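
The comparison described above can be sketched in Python as follows. This is an illustration of the logic only, not the patch itself; `page_size_requests` is a hypothetical name and page sizes are assumed to be (width, height) tuples in points.

```python
def page_size_requests(page_sizes):
    """Yield (page_number, request) pairs for a sequence of page sizes.

    The request is None when the size matches the previous page, so no
    setpagedevice call is emitted and duplex state is left untouched.
    """
    previous = None
    for number, size in enumerate(page_sizes, start=1):
        if size != previous:
            width, height = size
            yield number, f"<</PageSize [{width} {height}]>> setpagedevice"
        else:
            yield number, None
        previous = size


# Example: two A4 portrait pages followed by two A3 portrait pages.
sizes = [(595, 842), (595, 842), (842, 1191), (842, 1191)]
for number, request in page_size_requests(sizes):
    print(number, request or "(no page size request)")
```

With this approach a page depends on the page before it, so a page that is moved or extracted may no longer carry the request it needs.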
Comment 11 Till Kamppeter 2009-06-22 08:30:20 UTC
Created attachment 27012
Fixed patch to add support for multiple page size output
Comment 12 Albert Astals Cid 2009-06-22 12:39:01 UTC
Committed
Comment 13 Till Kamppeter 2010-12-09 13:47:35 UTC
Unfortunately, my last patch still has problems, and I need someone to help me produce PostScript with varying page sizes that preserves page independence and does not break duplex. See

http://www.cups.org/str.php?L3689

Thanks in advance.
Comment 14 Adrian Johnson 2010-12-10 01:11:12 UTC
Testing setpagedevice on my printer, it seems that if setpagedevice is called after printing one side in duplex mode, the next page will not be printed on the second side but will instead start on a new sheet of paper.

If you want to call setpagedevice on each page to allow pages to be reordered you could emit PS code that checks if the required page size is already selected before calling setpagedevice.

Something like:

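  % Push the current page width and height onto the stack.
  currentpagedevice /PageSize get aload pop
  % True if the height or the width differs from the requested size.
  842 ne exch 595 ne or
  {
      <</PageSize [595 842]>> setpagedevice
  } if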

I don't think save/restore is required because poppler reinitializes the gstate parameters used at the start of each page.
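
For pages of arbitrary size, a generator for such a guarded request might look like the Python sketch below. It is an illustration only, with a hypothetical function name; it combines the two comparisons with `or`, so the request is issued when either dimension differs from the current page size.

```python
def guarded_page_size_request(width, height):
    """Return PostScript that calls setpagedevice only when needed.

    The emitted code reads the current PageSize, compares height and
    width with the requested values, and issues the request if either
    one differs.
    """
    return "\n".join([
        "currentpagedevice /PageSize get aload pop",
        f"{height} ne exch {width} ne or",
        "{",
        f"    <</PageSize [{width} {height}]>> setpagedevice",
        "} if",
    ])


# Example: guarded request for an A3 landscape page.
print(guarded_page_size_request(1191, 842))
```

Because every page carries its own check, pages stay independent of each other while a run of equally sized pages never triggers setpagedevice twice.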

I would also be interested to know why pdftops, when used as a CUPS filter, needs to emit setpagedevice directly instead of using the "%%IncludeFeature" style comments described at http://www.cups.org/documentation.php/doc-1.4/spec-postscript.html and letting CUPS insert the page size selection PS code based on the PPD file.
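
To make the alternative concrete, here is a hedged Python sketch that maps a page size to a "%%IncludeFeature" comment. The lookup table and function name are assumptions for illustration; it covers only a few standard portrait sizes and uses the usual PPD option names for them.

```python
# A few standard sizes in PostScript points (1/72 inch), portrait only.
# This table is an assumption of the sketch, not taken from any PPD file.
KNOWN_SIZES = {
    (595, 842): "A4",
    (842, 1191): "A3",
    (612, 792): "Letter",
}


def include_feature_comment(width, height):
    """Return a %%IncludeFeature comment for a known size, else None."""
    name = KNOWN_SIZES.get((round(width), round(height)))
    if name is None:
        return None  # landscape and custom sizes are not handled here
    return f"%%IncludeFeature: *PageSize {name}"


# Example: an A4 portrait page given with fractional point values.
print(include_feature_comment(595.276, 841.89))
```

A filter that resolves these comments would then insert the printer-specific page size code from the PPD file, instead of the generic setpagedevice call.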
Comment 15 GitLab Migration User 2018-08-20 22:08:09 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/poppler/poppler/issues/208.
