Bug 77932

Summary: Converting some PDFs results in images being converted into 1000s of PNGs
Product: poppler Reporter: gareth
Component: pdftohtml Assignee: poppler-bugs <poppler-bugs>
Status: RESOLVED MOVED QA Contact:
Severity: normal    
Priority: medium    
Version: unspecified   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:

Description gareth 2014-04-25 15:32:00 UTC
When converting this PDF:

PDF: https://www.whatdotheyknow.com/request/127122/response/315253/attach/2/FOI%2012%2001605%20Resp%201%20PDF.pdf

The HTML output includes 2690 PNG images. It looks like the logo in the PDF has been scanned line by line, with each line written out as a separate PNG.

HTML: http://pastebin.com/raw.php?i=YCL9Mmpx

The command run was:

pdftohtml -nodrm -zoom 1.0 -stdout -enc UTF-8 -noframes tmp/FOI\ 12\ 01605\ Resp\ 1\ PDF.pdf > tmp/all-the-pngs.html
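
A rough way to check the image count in the generated HTML is sketched below. It assumes that each extracted image appears in the output as a `src="...png"` attribute, which is an assumption about the markup rather than something confirmed here, and `count_png_refs` is a hypothetical helper name.

```shell
# Hypothetical helper: count PNG image references in an HTML file.
# Assumes each image is referenced as src="<name>.png" (not verified
# against every pdftohtml version).
count_png_refs() {
  # grep -o prints one match per line, so wc -l counts the references;
  # tr strips the padding some wc implementations add.
  grep -o 'src="[^"]*\.png"' "$1" | wc -l | tr -d ' '
}
```

For the command above, this would be called as `count_png_refs tmp/all-the-pngs.html`.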

Some version info:

$ pdftohtml -v
pdftohtml version 0.18.4
Copyright 2005-2011 The Poppler Developers - http://poppler.freedesktop.org
Copyright 1999-2003 Gueorgui Ovtcharov and Rainer Dorsch
Copyright 1996-2004 Glyph & Cog, LLC

$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 12.04.4 LTS
Release:	12.04
Codename:	precise

Comment 1 GitLab Migration User 2018-08-21 10:43:19 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe to and participate further in the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/poppler/poppler/issues/346.
