Summary: | UK tax form PDF has jumbled text | ||
---|---|---|---|
Product: | poppler | Reporter: | Jason Crain <jason> |
Component: | general | Assignee: | poppler-bugs <poppler-bugs> |
Status: | RESOLVED FIXED | QA Contact: | |
Severity: | normal | ||
Priority: | medium | CC: | ingenieria, jason |
Version: | unspecified | ||
Hardware: | Other | ||
OS: | All | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: |
blank United Kingdom tax return form
result of "pdftocairo -png 'blank return.pdf' bad"
result of "pdftocairo -png -f 3 -l 3 'blank return.pdf' good"
GfxFontDict: merge reference generation from xpdf 4.00 |
Description
Jason Crain
2018-01-10 04:58:32 UTC
Created attachment 136641 [details]
result of "pdftocairo -png 'blank return.pdf' bad"
This image shows an incorrect rendering of page 3 from running "pdftocairo -png 'blank return.pdf' bad".
Created attachment 136642 [details]
result of "pdftocairo -png -f 3 -l 3 'blank return.pdf' good"
This image shows the correct rendering of page 3 from running "pdftocairo -png -f 3 -l 3 'blank return.pdf' good".
CairoFontEngine.cc caches fonts keyed by the indirect reference number and generation, on the assumption that this pair is unique. In this PDF, a font on page 2 and a font on page 3 alias to the same key, so the wrong font is used. Splash is probably doing something similar.

Two different fonts have the same number and generation because these fonts don't really have an indirect reference, due to the way the PDF defines the resources and font dictionaries:

    7 0 obj
    << /Resources 8 0 R /Type /Page ... other page entries ... >>
    endobj
    8 0 obj
    << /Font << /T1 << ... font dictionary entries ... >> >> >>

The GfxFontDict constructor has code to generate a fake reference based on the /Font dictionary's number, but that doesn't work well in this PDF because the /Font dictionary doesn't have an indirect reference either.

This appears to be fixed in XPDF 4.00, where the GfxFontDict constructor now generates the fake reference from a hash instead.

*** Bug 91004 has been marked as a duplicate of this bug. ***

Created attachment 136832 [details] [review]
GfxFontDict: merge reference generation from xpdf 4.00

The GfxFontDict constructor generates a fake indirect reference if the font dictionary doesn't have a real indirect reference. It sometimes assigns the same reference to two different fonts, leading to the wrong font being used. XPDF 4.00 fixes this by using a hash of the font data to create the fake reference.

Pushed :)
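
For readers following the thread, the aliasing described above can be modelled with a small sketch. This is not poppler or xpdf code. `Ref`, `fakeRefFromPosition`, `fakeRefFromHash`, `distinctCacheEntries`, and the sentinel constants are all hypothetical names and values made up for illustration, and FNV-1a is an arbitrary choice of hash, since the report does not say which hash XPDF 4.00 uses. The sketch assumes the font dictionary entries are available as a serialized string.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a PDF indirect reference (object number, generation).
struct Ref {
    int num;
    int gen;
};

// Ordering so that Ref can serve as a std::map key, like a font cache key.
bool operator<(const Ref &a, const Ref &b) {
    return std::make_pair(a.num, a.gen) < std::make_pair(b.num, b.gen);
}

// Position-based fake reference (simplified model, not the real code).
// Assumption: when the /Font dictionary has no indirect reference of its own,
// only the font's index within that dictionary is left to tell fonts apart.
Ref fakeRefFromPosition(const Ref *fontDictRef, int index) {
    const int kParentGenBase = 100000; // made-up offset
    const int kNoParentGen = 999999;   // made-up sentinel
    Ref r;
    r.num = index;
    r.gen = fontDictRef ? kParentGenBase + fontDictRef->num : kNoParentGen;
    return r;
}

// 32-bit FNV-1a over a byte string.
std::uint32_t fnv1a(const std::string &data) {
    std::uint32_t h = 2166136261u;
    for (unsigned char c : data) {
        h ^= c;
        h *= 16777619u;
    }
    return h;
}

// Hash-based fake reference: derived from the font's content, so it does not
// depend on where the font sits in the file.
Ref fakeRefFromHash(const std::string &serializedFontDict) {
    const int kHashedGen = 100000; // made-up sentinel
    Ref r;
    r.num = static_cast<int>(fnv1a(serializedFontDict) & 0x7fffffffu);
    r.gen = kHashedGen;
    return r;
}

// Number of distinct cache slots a list of references would occupy.
std::size_t distinctCacheEntries(const std::vector<Ref> &refs) {
    std::map<Ref, bool> cache;
    for (const Ref &r : refs) {
        cache[r] = true;
    }
    return cache.size();
}
```

In this model, the first inline font on page 2 and the first inline font on page 3 both come out of `fakeRefFromPosition(nullptr, 0)` with the same key, so a cache keyed on `Ref` holds one entry for two fonts. With `fakeRefFromHash`, two fonts with different dictionary contents get different keys (barring a hash collision), while identical fonts still share one entry.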