Summary: | pdftohtml: complex output: text rendered in background image | ||
---|---|---|---|
Product: | poppler | Reporter: | David Mackay <mackay_d> |
Component: | general | Assignee: | poppler-bugs <poppler-bugs> |
Status: | RESOLVED FIXED | QA Contact: | |
Severity: | normal | ||
Priority: | high | CC: | pmocek-freedesktop |
Version: | unspecified | ||
Hardware: | x86 (IA32) | ||
OS: | Linux (All) | ||
Attachments: |
10669: Patch to fix the described issue
10670: Fixed an oversight on my part, forgot to include the fix to pdftohtml.cc as well
36372: sample PDF which demonstrates bug |
Description
David Mackay
2007-01-23 13:46:13 UTC
Created attachment 10669
Patch to fix the described issue

This patch adds the methods getPSNoText and setPSNoText to the GlobalParams class, along with the corresponding private GBool member psNoText. In PSOutputDev::drawString, an early return was added for the case where globalParams->getPSNoText() returns true. This is an exact copy from the original pdftohtml.

Created attachment 10670
Fixed an oversight on my part, forgot to include the fix to pdftohtml.cc as well

In addition to the aforementioned changes, pdftohtml.cc also needs to be modified to enable psNoText.

Fixed using a different patch because we are actually trying to kill GlobalParams. Thanks for the report!

I'm experiencing the same with pdftohtml 0.12.4 from the Ubuntu 10.04 package. On pages which contain black boxes where text has been redacted, text is rendered in the background image. On pages which do not have such boxes, text is not rendered in the background image.

Created attachment 36372
sample PDF which demonstrates bug

To reproduce: using the attached PDF, run "pdftohtml -c -noframes -hidden -nomerge PoliceReport-2010202024.pdf PoliceReport-2010202024.html"

Should be fixed in poppler >= 0.15.0
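For readers who want to see the mechanism described in the first patch, here is a minimal sketch. It is not the code from either attachment and not the fix that was eventually committed (that one avoided GlobalParams). The two classes are simplified stand-ins for poppler's GlobalParams and PSOutputDev, a plain bool replaces GBool, the params object is passed in explicitly rather than read from the global globalParams pointer, and drawRect, operations, and renderBackground are hypothetical helpers added only so the effect can be observed.

```cpp
#include <string>
#include <vector>

// Simplified stand-in for poppler's GlobalParams; only the flag from the
// patch description is modelled here.
class GlobalParams {
public:
    bool getPSNoText() const { return psNoText; }
    void setPSNoText(bool v) { psNoText = v; }

private:
    bool psNoText = false;  // the patch describes a GBool; plain bool here
};

// Simplified stand-in for PSOutputDev. Instead of emitting PostScript, it
// records the drawing operations it would have performed.
class PSOutputDev {
public:
    explicit PSOutputDev(const GlobalParams &p) : params(p) {}

    // Mirrors the described change: skip text entirely when psNoText is set.
    void drawString(const std::string &s) {
        if (params.getPSNoText()) {
            return;
        }
        ops.push_back("text:" + s);
    }

    // Hypothetical non-text operation, e.g. a black redaction box.
    void drawRect() { ops.push_back("rect"); }

    const std::vector<std::string> &operations() const { return ops; }

private:
    const GlobalParams &params;
    std::vector<std::string> ops;
};

// Hypothetical helper: "renders" a page holding one box and one string,
// and returns the operations that reached the background output.
std::vector<std::string> renderBackground(bool noText) {
    GlobalParams params;
    params.setPSNoText(noText);
    PSOutputDev out(params);
    out.drawRect();
    out.drawString("redacted report");
    return out.operations();
}
```

With the flag set, only the box reaches the background output. With it cleared, the string is drawn as well, which matches the symptom reported for the complex (-c) output.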