Bug 96994

Summary: pdftocairo pdf/ps output broken when glyph in non-embedded type 1 font cannot be mapped
Product: poppler
Reporter: Adrian Johnson <ajohnson>
Component: cairo backend
Assignee: poppler-bugs <poppler-bugs>
Status: RESOLVED FIXED
Severity: normal
Priority: medium
Version: unspecified
Hardware: Other
OS: All
Attachments:
  broken pdf output
  cairo: try finding glyphs in substitute fonts by unicode

Description Adrian Johnson 2016-07-19 13:27:45 UTC
Created attachment 125141
broken pdf output

Using the following test case:

https://bugzilla.redhat.com/attachment.cgi?id=1008037

the output from

pdftocairo -pdf bug.pdf out.pdf

has what looks like an 'o' with quotes above it in various places instead of a space.

Whether this bug appears depends on which font is substituted for Helvetica. I reproduced it with the Ghostscript fonts:

pdffonts -subst bug.pdf 
name                                 object ID substitute font                      substitute font file
------------------------------------ --------- ------------------------------------ ------------------------------------
Helvetica                                12  0 Nimbus Sans L                        /usr/share/fonts/type1/gsfonts/n019003l.pfb
Helvetica-Bold                            9  0 Nimbus Sans L Bold                   /usr/share/fonts/type1/gsfonts/n019004l.pfb
Helvetica                                 8  0 Nimbus Sans L                        /usr/share/fonts/type1/gsfonts/n019003l.pfb
Helvetica-Bold                            7  0 Nimbus Sans L Bold                   /usr/share/fonts/type1/gsfonts/n019004l.pfb
Helvetica-Oblique                         2  0 Nimbus Sans L Regular Italic         /usr/share/fonts/type1/gsfonts/n019023l.pfb
Comment 1 Adrian Johnson 2016-07-19 13:39:15 UTC
Created attachment 125144
cairo: try finding glyphs in substitute fonts by unicode

In this PDF the Helvetica fonts are not embedded, and the non-embedded font specifies a custom encoding. The garbage character appears where the custom encoding uses the glyph name /nonbreakingspace. If this glyph name is not found in the substitute font, the wrong glyph is displayed instead of a space.

The attached patch fixes this. If looking up the glyph by name fails, it maps the glyph name to Unicode and then looks up the glyph by its Unicode value.
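
The fallback described above can be illustrated with a small toy model. This is only a sketch and not the code from the patch: the function name lookup_gid, the three tables, and all glyph indices are made up for illustration, and plain dictionaries stand in for the substitute font's glyph-name table and Unicode character map. It assumes that glyph index 0 means "glyph not found" and that /nonbreakingspace maps to U+00A0.

```python
# Toy model only: every name, table, and glyph index here is hypothetical.

# Glyph names present in the (imaginary) substitute font -> glyph index.
# Note that "nonbreakingspace" is deliberately missing.
SUBSTITUTE_NAME_TO_GID = {"space": 3, "A": 36, "o": 82}

# The substitute font's Unicode character map: code point -> glyph index.
SUBSTITUTE_CMAP = {0x0020: 3, 0x00A0: 3, 0x0041: 36, 0x006F: 82}

# A tiny glyph-name -> Unicode table; a real one would be far larger.
NAME_TO_UNICODE = {
    "space": 0x0020,
    "nonbreakingspace": 0x00A0,
    "A": 0x0041,
    "o": 0x006F,
}

NOT_FOUND = 0  # assumed convention: glyph index 0 is the missing glyph


def lookup_gid(glyph_name):
    """Return a glyph index for glyph_name, or NOT_FOUND."""
    # Step 1: look the glyph up by name in the substitute font.
    gid = SUBSTITUTE_NAME_TO_GID.get(glyph_name, NOT_FOUND)
    if gid != NOT_FOUND:
        return gid
    # Step 2: the name lookup failed, so map the name to Unicode
    # and look the glyph up by its Unicode value instead.
    code_point = NAME_TO_UNICODE.get(glyph_name)
    if code_point is None:
        return NOT_FOUND
    return SUBSTITUTE_CMAP.get(code_point, NOT_FOUND)
```

In this model, lookup_gid("nonbreakingspace") fails the name lookup but still resolves to the space glyph through the Unicode step, while a name unknown to both tables still returns the missing-glyph index.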
Comment 2 Carlos Garcia Campos 2016-07-19 16:04:45 UTC
Comment on attachment 125144
cairo: try finding glyphs in substitute fonts by unicode

LGTM
Comment 3 Adrian Johnson 2016-07-19 22:04:44 UTC
Pushed
