Summary: | pdftohtml utility in complex mode creates background PNGs with insufficient resolution | ||
---|---|---|---|
Product: | poppler | Reporter: | Chun <fuzzybr80> |
Component: | general | Assignee: | poppler-bugs <poppler-bugs> |
Status: | RESOLVED FIXED | QA Contact: | |
Severity: | normal | ||
Priority: | medium | CC: | mpsuzuki |
Version: | unspecified | ||
Hardware: | x86 (IA32) | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: |
Sample PDF
output (pdftohtml high resolution)
output (poppler-utils low res)
Patch to add "-r" option to pdftohtml. |
Created attachment 37833
output (pdftohtml high resolution)
Created attachment 37834
output (poppler-utils low res)
I've tried poppler versions 0.14.1, 0.11.3, 0.10.7 and 0.5.91, and they all exhibit this issue.

Major? Come on, how is this major in any way?

(In reply to comment #4)
> Major? Come on, how is this major in any way?

Only insofar as forcing 72 dpi on the graphics in the resulting HTML output makes this tool pretty much unusable for anyone wanting to convert a PDF with embedded graphics to HTML. I appreciate that not many people use pdftohtml enough for poppler to care, even if you are pretty much the only still-maintained open source package that provides pdf-to-html functionality. Still, that would probably go under Importance rather than Severity.

Anyway, I have tried fixing it myself with some tests, but the magic constant 72 is spread throughout the code (core poppler stuff), not just pdftohtml, so I can see that this will be a non-trivial fix. We've gone with pdftohtml.sourceforge.net for our deployment (and its not-so-complete Unicode rendering); it was quite a pity not to be able to use poppler.

Hi, I'm sorry for my late involvement in this discussion (again). Since a patch for bug 19404, which uses SplashOutputDev to make the background image, has been committed to pdftohtml, I will rework my previous patch posted to the poppler mailing list. I cannot comment on the evaluation of "major", but I think it's reasonable to have a new option "-r" to specify the resolution of the background image, as pdftoppm and pdftotext have.

Created attachment 38143
Patch to add "-r" option to pdftohtml.

Here it is. The patch adds a new option "-r" to specify the resolution. "-r 300" will generate the background image at 300 dpi. Chun, please check whether it fits your request.

Tested with several documents so far, and it works! Thanks! It would be really great to see this committed into the next release.

Will be part of poppler >= 0.15.0

I see that the patch has just been committed to git HEAD. Thank you very much!
I'm reverting this patch since I actually realized it does not make sense; I'll be using the scale variable that is already there to let you define the zoom you want to use. Can you please test poppler git master and report any problems you find using the zoom argument?
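For anyone testing the zoom argument, the following is a minimal sketch of how a target resolution might be translated into a zoom factor. It rests on two assumptions that the thread does not confirm: that the zoom argument is spelled `-zoom`, and that a zoom of 1 corresponds to 72 dpi, so that zoom = dpi / 72. The helper names `zoom_for_dpi` and `pdftohtml_args` are hypothetical, and the code only builds an argument list; it does not run pdftohtml.

```python
from typing import List

BASE_DPI = 72.0  # assumption: zoom 1.0 renders the background at 72 dpi


def zoom_for_dpi(dpi: float) -> float:
    """Return the zoom factor that would give `dpi` under the 72 dpi assumption."""
    return dpi / BASE_DPI


def pdftohtml_args(pdf: str, html: str, dpi: float) -> List[str]:
    """Build a complex-mode command line; the `-zoom` spelling is assumed."""
    zoom = f"{zoom_for_dpi(dpi):g}"
    return ["pdftohtml", "-c", "-noframes", "-zoom", zoom, pdf, html]


if __name__ == "__main__":
    # Ask for roughly the resolution of the sourceforge pdftohtml output.
    print(" ".join(pdftohtml_args("input.pdf", "output.html", 216)))
```

The resulting list can be passed to `subprocess.run` once the flag name has been checked against the installed pdftohtml.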
Created attachment 37832
Sample PDF

We invoke the pdftohtml utility in complex mode:

    pdftohtml -c -noframes [input pdf] [output html]

This creates a background PNG for each PDF page that includes the graphics on that page. When using pdftohtml (from pdftohtml.sourceforge.net), the resolution of the PNG is 1785x2526 pixels. When using poppler-utils, each background image (PNG) is 594x843 pixels. This makes the images in the resulting HTML look really pixelated. Can the resolution be fixed or set as an option somewhere?
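The two image sizes reported above can be related with a little arithmetic. The sketch below assumes the poppler-utils output was rendered at 72 dpi (the figure mentioned elsewhere in this report), which would make the sample page roughly 594x843 points; that page size is inferred, not measured from the PDF. The function names are illustrative only.

```python
POINTS_PER_INCH = 72.0  # PDF user space: 72 points per inch


def pixels_at(points: float, dpi: float) -> int:
    """Pixel length of a page dimension given in points, rendered at `dpi`."""
    return round(points * dpi / POINTS_PER_INCH)


def effective_dpi(pixels: int, points: float) -> float:
    """Resolution implied by a rendered pixel length for a dimension in points."""
    return pixels * POINTS_PER_INCH / points


if __name__ == "__main__":
    # Page size inferred from the 594x843 output under the 72 dpi assumption.
    page_w_pt, page_h_pt = 594.0, 843.0

    # At 72 dpi one point maps to one pixel.
    print(pixels_at(page_w_pt, 72), pixels_at(page_h_pt, 72))

    # Resolution implied by the 1785x2526 output from the sourceforge tool.
    print(round(effective_dpi(1785, page_w_pt)), round(effective_dpi(2526, page_h_pt)))
```

Under these assumptions the sourceforge output works out to about 216 dpi on both axes, three times the linear resolution of the poppler-utils image.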