Bug 70364

Summary: Umlaut conversion error for Type1 fonts in WinAscii Encoding
Product: cairo Reporter: Leon Winter <winter>
Component: postscript backend Assignee: Adrian Johnson <ajohnson>
Status: RESOLVED FIXED QA Contact: cairo-bugs mailing list <cairo-bugs>
Severity: normal    
Priority: medium CC: gpoo+bfdo, winter
Version: unspecified   
Hardware: x86 (IA32)   
OS: Linux (All)   
Whiteboard:
Attachments: Input document which is printed incorrectly
pdf output

Description Leon Winter 2013-10-11 06:59:07 UTC
Created attachment 87422
Input document which is printed incorrectly

Hi,

when printing (or running the print preview of) the attached PDF with evince (which uses Cairo's PostScript backend), the umlauts (such as ö) are lost. The fonts used in the PDF are not embedded, and the PDF appears to be encoded in a Windows charset. The resulting PostScript does include the composite glyph names for umlauts, such as 'odieresis', but it cannot render them. This suggests that 'odieresis' is a composite glyph built from 'o' and 'dieresis', and that one of those components is missing. However, my attempts to manually include 'dieresis' in the resulting Type 1 font subset dict were futile.
Also noteworthy, though not relevant to this bug: xpdf crashes when opening the PDF, which suggests it might be a "strange" PDF.
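
To illustrate the composite-glyph hypothesis: the Type 1 seac operator names its base and accent characters by Adobe StandardEncoding code (bchar and achar), not by glyph name. The following is only a minimal sketch of that lookup. It assumes 'odieresis' in this font really is built with seac, and std_glyph_name and seac_components are hypothetical helpers for illustration, not cairo functions.

```c
#include <stddef.h>

/* Excerpt of Adobe StandardEncoding. The seac operator refers to its
 * base and accent characters by these codes, not by glyph name. */
struct std_entry {
    int code;
    const char *name;
};

static const struct std_entry std_encoding_excerpt[] = {
    { 97,  "a" },
    { 111, "o" },
    { 117, "u" },
    { 200, "dieresis" },
};

/* Hypothetical helper: map a StandardEncoding code to a glyph name.
 * Returns NULL if the code is not in the excerpt above. */
static const char *
std_glyph_name (int code)
{
    size_t i;
    size_t n = sizeof std_encoding_excerpt / sizeof std_encoding_excerpt[0];

    for (i = 0; i < n; i++) {
        if (std_encoding_excerpt[i].code == code)
            return std_encoding_excerpt[i].name;
    }
    return NULL;
}

/* Hypothetical helper: given the bchar and achar operands of a seac
 * call, report the names of the two glyphs a subset must also contain.
 * Returns 0 on success, -1 if either code is unknown to the excerpt. */
static int
seac_components (int bchar, int achar,
                 const char **base, const char **accent)
{
    *base = std_glyph_name (bchar);
    *accent = std_glyph_name (achar);
    return (*base != NULL && *accent != NULL) ? 0 : -1;
}
```

Under that assumption, a seac call for 'odieresis' would carry bchar 111 and achar 200, so the subset would need both 'o' and 'dieresis' in addition to 'odieresis' itself.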

My approach to including 'dieresis' in the dictionary was the following (patched against Debian's cairo-1.12.16; the git log of the file suggests there were no later changes to it after the Debian release):

diff --git a/src/cairo-type1-subset.c b/src/cairo-type1-subset.c
index 4b64403..dfcd1a1 100644
--- a/src/cairo-type1-subset.c
+++ b/src/cairo-type1-subset.c
@@ -1407,6 +1407,15 @@ skip_subrs:
        cairo_type1_font_subset_use_glyph (font, index);
     }
 
+    for (i = 0; i < font->base.num_glyphs; i++) if (font->glyph_names[i])
+    { fprintf (stderr, "glyph: %s\n", font->glyph_names[i]);
+      if (!strcmp (font->glyph_names[i], "dieresis")
+       || !strcmp (font->glyph_names[i], "odieresis") || !strcmp (font->glyph_names[i], "o"))
+      { cairo_type1_font_subset_use_glyph (font, i);
+        fprintf (stderr, "-----> added\n");
+      }
+    }
+
     /* Go through the charstring of each glyph in use, get the glyph
      * width and figure out which extra glyphs may be required by the
      * seac operator (which may cause font->num_glyphs to increase

Best regards,
Leon
Comment 1 Adrian Johnson 2013-10-11 07:26:05 UTC
Created attachment 87426
pdf output

I can't reproduce the problem. Attached is the output I get from pdftocairo -pdf.

The fonts in the original PDF are not embedded. So maybe you are getting different fonts substituted which triggers the bug.

The substitute fonts used on my machine are:

$ pdffonts -subst bug.pdf 
name                                 object ID substitute font                      substitute font file
------------------------------------ --------- ------------------------------------ ------------------------------------
Helvetica                                11  0 Nimbus Sans L                        /usr/share/fonts/X11/Type1/n019003l.pfb
Courier                                  12  0 Nimbus Mono L                        /usr/share/fonts/X11/Type1/n022003l.pfb
Courier-Bold                             13  0 Nimbus Mono L Bold                   /usr/share/fonts/X11/Type1/n022004l.pfb
Comment 2 Leon Winter 2013-10-11 07:53:22 UTC
Indeed I get different substitute fonts:

$ pdffonts -subst rechnung_weniger_inhalt.pdf
name                                 object ID substitute font                      substitute font file
------------------------------------ --------- ------------------------------------ ------------------------------------
Helvetica                                11  0 Nimbus Sans L                        /usr/share/fonts/X11/Type1/n019003l.pfb
Courier                                  12  0 Courier                              /usr/share/fonts/type1/texlive-fonts-recommended/pcrr8a.pfb
Courier-Bold                             13  0 Courier Bold                         /usr/share/fonts/type1/texlive-fonts-recommended/pcrb8a.pfb

Those fonts are provided by my Debian package "texlive-fonts-recommended".
Comment 3 Leon Winter 2013-10-11 07:56:21 UTC
I have uninstalled the aforementioned package and can confirm that these fonts are the cause of the missing glyphs.
Comment 4 Adrian Johnson 2013-10-11 08:20:19 UTC
After copying that font to /usr/share/fonts/type1 I can now reproduce the bug.
Comment 6 Germán Poo-Caamaño 2013-11-02 02:43:58 UTC
*** Bug 71151 has been marked as a duplicate of this bug. ***
