Bug 17429 - Provide a "pdftopdf" utility
Status: NEW
Product: poppler
Classification: Unclassified
Component: general
Version: unspecified
Hardware: Other All
Importance: medium enhancement
Assigned To: poppler-bugs
Reported: 2008-09-04 09:08 UTC by Michael R Head
Modified: 2014-07-06 09:55 UTC (History)

Description Michael R Head 2008-09-04 09:08:29 UTC
I would like to see a complement to the pdftops and other pdfto* utilities that can output to pdf. I'd like to see options like -embedfonts (which would embed all non-embedded fonts in the output), -paper (for changing paper size) along with -expand and -nocrop, -createopw/-createupw (for creating user/owner passwords for the output). I'm sure there are many other useful options that could be added.

Ghostscript provides a tool called pdf2pdf that can do some of this. Unfortunately, its PDF interpreter doesn't appear to be as correct as poppler's, and the PDF that Ghostscript generates doesn't look that great. So, to properly embed fonts in an existing PDF (which is at times necessary for various publications), I've found I can only get acceptable output by running poppler's pdftops utility and then Ghostscript's ps2pdf utility with -dPDFSETTINGS=/prepress (or -dPDFSETTINGS=/printer) to embed even the standard PostScript fonts.
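
For reference, the two-step workaround described above could be scripted roughly as follows. This is only a sketch: the function names and the choice of placing the intermediate .ps file next to the output are assumptions made for illustration, and it presumes that pdftops and ps2pdf are both on the PATH.

```python
import subprocess
from pathlib import Path


def build_reembed_commands(src, dst, settings="/prepress"):
    """Return the two command lines for the pdf -> ps -> pdf round trip.

    The helper name and the intermediate file location are illustrative
    assumptions, not part of poppler or Ghostscript.
    """
    src, dst = Path(src), Path(dst)
    # Intermediate PostScript file, written next to the final output.
    ps = dst.with_suffix(".ps")
    return [
        # Step 1: poppler renders the PDF to PostScript.
        ["pdftops", str(src), str(ps)],
        # Step 2: Ghostscript converts back to PDF with the chosen preset
        # (/prepress or /printer, as mentioned in the report).
        ["ps2pdf", f"-dPDFSETTINGS={settings}", str(ps), str(dst)],
    ]


def reembed_fonts(src, dst, settings="/prepress"):
    """Run both steps, stopping at the first command that fails."""
    for cmd in build_reembed_commands(src, dst, settings):
        subprocess.run(cmd, check=True)
```

Calling reembed_fonts("paper.pdf", "paper-embedded.pdf") would run both tools in sequence; the intermediate .ps file is left in place so it can be inspected.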

I'd prefer just to be able to use a single poppler tool to get the job done.
Comment 1 Michael R Head 2008-09-04 09:10:49 UTC
Also, the pdf -> ps -> pdf process throws away the PDF metadata, which it would be possible to keep with a single pdftopdf utility.
Comment 2 dynamotwain 2009-01-28 23:15:57 UTC
It would also be useful if it had an option to write the PDF streams out in the uncompressed format to aid in debugging of flaky PDFs. It's a little easier to figure out what is going on when you have a plaintext PDF rather than one with Flate-encoded streams.
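
To illustrate what "uncompressed" means here, the sketch below inflates Flate-encoded streams found in raw PDF bytes using only Python's zlib module. It is a debugging aid under loose assumptions, not a PDF parser: it scans for the "stream" keyword rather than reading the object structure, ignores /Length and any other filters, and the function name inflate_streams is made up for this example.

```python
import re
import zlib


def inflate_streams(pdf_bytes):
    """Return the decoded contents of every Flate stream that can be found.

    Crude scan: looks for the 'stream' keyword followed by an end-of-line
    marker and tries to inflate whatever follows it.
    """
    results = []
    # The lookbehind skips the 'stream' inside 'endstream'.
    for match in re.finditer(rb"(?<!end)stream\r?\n", pdf_bytes):
        inflater = zlib.decompressobj()
        try:
            # decompressobj stops at the end of the zlib data, so trailing
            # bytes such as 'endstream' are simply left unused.
            data = inflater.decompress(pdf_bytes[match.end():])
        except zlib.error:
            continue  # not Flate-encoded (or uses another filter)
        if inflater.eof:
            results.append(data)
    return results


if __name__ == "__main__":
    # Tiny hand-made object for demonstration, not a complete PDF file.
    content = b"BT /F1 12 Tf (Hello) Tj ET"
    packed = zlib.compress(content)
    sample = (
        b"1 0 obj\n<< /Length %d /Filter /FlateDecode >>\nstream\n" % len(packed)
        + packed
        + b"\nendstream\nendobj\n"
    )
    for stream in inflate_streams(sample):
        print(stream.decode("latin-1"))
```

A real tool would follow the cross-reference table and honour each stream's /Filter and /Length entries; the keyword scan is only good enough for a quick look at a flaky file.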
Comment 3 William Bader 2010-04-18 20:27:38 UTC
For modifying pdfs, you could look at http://code.google.com/p/pdfsizeopt/
It can subset fonts and perform lossless compression of images.

The program below uncompresses pdfs and writes the objects to a text file.
http://read.pudn.com/downloads174/sourcecode/windows/activex/806479/ExtractPDFText_src/cp/pdf.cpp__.htm
http://google.com/codesearch?q=%22This+file+contains+extremely+crude+C+source+code+to+extract+plain+text%22

For browsing objects in pdfs, I have used http://sourceforge.net/projects/pdfedit/
Comment 4 Petr Pisar 2013-11-03 16:13:38 UTC
The CUPS filters contain a pdftopdf tool with this functionality (see <http://en.sourceforge.jp/projects/opfc/scm/svn/tree/head/pdftopdf/>).

I would like to see another option to strip (or add) the user/owner password protection. I sometimes get password-protected PDF files, and for archiving purposes it's easier to strip the protection (and then encrypt with my PGP key).
Comment 5 Pino Toscano 2014-07-06 09:55:58 UTC
(In reply to comment #4)
> I would like to see another option to strip (or add) the user/owner password
> protection. I sometimes get password-protected PDF files and for archiving
> purposes, it's easier to strip the protection (and then encrypt with my PGP
> key).

This has been asked as bug #18440.