Bug 88242

Summary: Some characters from embedded type1 (8bit) font are not rendered
Product: poppler Reporter: Mike Kroutikov <mkroutikov>
Component: general Assignee: poppler-bugs <poppler-bugs>
Status: RESOLVED MOVED QA Contact:
Severity: normal    
Priority: medium CC: ingenieria
Version: unspecified   
Hardware: Other   
OS: All   
Whiteboard:
Attachments: Broken PDF
screenshot of incorrectly rendered page
screenshot of how it should look
Patch to fix the issue (not tested - for review only!)
Patch to fix the issue (not tested - for review only!)
File broken by that patch
Patch to fix the issue (not tested - for review only!)
Another file with lost characters
Embedded Characters not rendered correctly (Arial Narrow & Calibri)
Screen shot of embedded characters not rendered correctly in pdftoppm version 0.51.0

Description Mike Kroutikov 2015-01-09 16:15:26 UTC
Created attachment 112012 [details]
Broken PDF

Some characters from embedded type1 (8bit) font are not rendered.

Acrobat renders fine.

Tested on Ubuntu 14.04 (poppler version 0.24.5)

The latest release, 0.30.0, shows the same problem (compiled from source on Ubuntu 14.04).

The problematic font is AdvT291, embedded as (50, 0).

See the attached PDF and screenshots.
Comment 1 Mike Kroutikov 2015-01-09 16:17:08 UTC
Created attachment 112014 [details]
screenshot of incorrectly rendered page
Comment 2 Mike Kroutikov 2015-01-09 16:17:47 UTC
Created attachment 112015 [details]
screenshot of how it should look
Comment 3 Mike Kroutikov 2015-01-09 16:31:06 UTC
Here is my understanding of what goes wrong:

1. The embedded font has an encoding table that uses glyph names of the form Cxx, where xx is a numeric code. For example, C65 is the name of the glyph "A".

2. The PDF defines an Encoding Differences array that says code 65 should be associated with the name "A".

3. When SplashFTFontFile::loadType1Font is called, it builds the codeToGIDA mapping from character code to GID (using the encoding names). For code 65 it asks the FreeType engine to find a glyph named "A", which is not present in the embedded font. Hence 65 is mapped to 0, which causes the glyph to be silently skipped.
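
The three steps above can be modelled with a short sketch. This is a toy Python model, not poppler code: `font_glyphs` stands in for the glyph-name table of the embedded font, `name_to_gid` stands in for a name lookup that returns 0 for unknown names, and all glyph indices are made up for illustration.

```python
# Toy model of the failure described above; all names and GIDs are hypothetical.

# Glyph names present in the embedded font program (name -> glyph index).
# GID 0 is reserved for the missing glyph (.notdef).
font_glyphs = {".notdef": 0, "C65": 1, "C66": 2}

# Built-in encoding of the font: character code -> glyph name.
builtin_encoding = {65: "C65", 66: "C66"}

# Names assigned by the PDF's Encoding Differences array.
differences = {65: "A", 66: "B"}


def name_to_gid(name):
    """Stand-in for a glyph-name lookup that yields 0 for unknown names."""
    return font_glyphs.get(name, 0)


def build_code_to_gid(encoding):
    """Map every character code to a glyph index via its glyph name."""
    return {code: name_to_gid(name) for code, name in encoding.items()}


# The Differences entries override the built-in names unconditionally.
effective = {**builtin_encoding, **differences}
code_to_gid = build_code_to_gid(effective)
print(code_to_gid)  # {65: 0, 66: 0}
```

In this model both codes end up at GID 0, so nothing is drawn for them, while the built-in encoding alone would have resolved them to real glyphs.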

It looks like the Differences array in the PDF was generated incorrectly.

Section 5.5 of PDF reference manual says:

"An Encoding entry can alter a Type 1 font’s mapping from character codes to
character names. The Differences array can map a code to the name of any glyph
description that exists in the font program, whether or not that glyph is referenced
by the font’s built-in encoding or by the encoding specified in the
BaseEncoding entry."

It does not say what to do when Differences array maps to a name that is NOT in the font program. But apparently Acrobat just ignores such a mapping entry.
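
For readers unfamiliar with the format: in a Differences array, a number sets the current character code and each name that follows is assigned to consecutive codes starting there. A minimal Python sketch of that expansion is below; the function name `expand_differences` and the sample array are illustrative assumptions and are not taken from the attached PDF.

```python
def expand_differences(array):
    """Expand a Differences array into a code -> glyph name mapping.

    An integer sets the current code; each following name is assigned
    to the current code, which is then incremented by one.
    """
    mapping = {}
    code = None
    for item in array:
        if isinstance(item, int):
            code = item
        else:
            if code is None:
                raise ValueError("glyph name appears before any code")
            mapping[code] = item
            code += 1
    return mapping


# Hypothetical array: codes 65..67 get "A", "B", "C"; code 196 gets "Adieresis".
example = [65, "A", "B", "C", 196, "Adieresis"]
print(expand_differences(example))
```

Whether each resulting name actually exists in the font program is a separate question, which is the case the quoted section leaves open.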
Comment 4 Mike Kroutikov 2015-01-09 16:47:35 UTC
Created attachment 112017 [details] [review]
Patch to fix the issue (not tested - for review only!)
Comment 5 Mike Kroutikov 2015-01-09 16:50:26 UTC
Added a patch for review. The patch changes how Encoding.Differences overrides the built-in encoding for 8-bit Type 1 fonts: it ignores any override request if the new name does not exist in the font program.
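
The idea behind the patch can be sketched roughly as follows. This is a Python illustration of the rule as described in this comment, not the patch itself (which is in the attachment and may differ); `apply_differences` and the sample data are hypothetical.

```python
def apply_differences(builtin_encoding, differences, font_glyph_names):
    """Override built-in glyph names only when the new name exists in the font."""
    effective = dict(builtin_encoding)
    for code, name in differences.items():
        if name in font_glyph_names:
            effective[code] = name
        # Otherwise keep the built-in name for this code.
    return effective


# Hypothetical font: it has a "space" glyph but no "A" or "B".
font_glyph_names = {".notdef", "space", "C32", "C65", "C66"}
builtin = {32: "C32", 65: "C65", 66: "C66"}
diffs = {32: "space", 65: "A", 66: "B"}

result = apply_differences(builtin, diffs, font_glyph_names)
print(result)  # {32: 'space', 65: 'C65', 66: 'C66'}
```

Only the override for code 32 is applied here; codes 65 and 66 fall back to the built-in names because "A" and "B" are not in the font.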
Comment 6 Mike Kroutikov 2015-01-09 16:58:56 UTC
Created attachment 112019 [details] [review]
Patch to fix the issue (not tested - for review only!)
Comment 7 Albert Astals Cid 2015-01-10 16:51:31 UTC
Created attachment 112067 [details]
File broken by that patch

Your patch makes characters from this file disappear.
Comment 8 Mike Kroutikov 2015-01-12 01:47:08 UTC
Created attachment 112101 [details] [review]
Patch to fix the issue (not tested - for review only!)
Comment 9 Mike Kroutikov 2015-01-12 01:50:31 UTC
Another random attempt to fix the bug (without a proper understanding of the code base). I just narrowed the scope of the original logic so that the regression goes away.

Please review and comment.
Comment 10 Albert Astals Cid 2015-01-12 19:30:39 UTC
Created attachment 112137 [details]
Another file with lost characters

The Ä from the title is gone
Comment 11 Albert Astals Cid 2015-07-14 22:29:04 UTC
ping?
Comment 12 Simon Shugar 2017-02-28 19:39:31 UTC
Created attachment 129988 [details]
Embedded Characters not rendered correctly (Arial Narrow & Calibri)

Is there any update on this issue? I think we are having the same issue where certain characters embedded are not being rendered correctly.
Comment 13 Simon Shugar 2017-02-28 19:40:54 UTC
(In reply to Simon Shugar from comment #12)
> Created attachment 129988 [details]
> Embedded Characters not rendered correctly (Arial Narrow & Calibri)
> 
> Is there any update on this issue? I think we are having the same issue
> where certain characters embedded are not being rendered correctly.

Sorry, to add more information: Arial and Times New Roman work, but Arial Narrow and Calibri render incorrectly. I'll upload an attachment showing the incorrect rendering.
Comment 14 Simon Shugar 2017-02-28 19:47:02 UTC
Created attachment 129989 [details]
Screen shot of embedded characters not rendered correctly in pdftoppm version 0.51.0
Comment 15 Simon Shugar 2017-02-28 19:47:18 UTC
(In reply to Simon Shugar from comment #13)
> (In reply to Simon Shugar from comment #12)
> > Created attachment 129988 [details]
> > Embedded Characters not rendered correctly (Arial Narrow & Calibri)
> > 
> > Is there any update on this issue? I think we are having the same issue
> > where certain characters embedded are not being rendered correctly.
> 
> Sorry to add more information specifically Arial and Times new roman work
> however Arial Narrow and Calibri render incorrectly. I'll upload an
> attachment with the incorrect rendition.

pdftoppm version 0.51.0
Comment 16 Simon Shugar 2017-07-20 17:00:18 UTC
Please ignore comments 12 to 15. After investigating further, I realised that the issue I was having is slightly different from this one. I have raised a new bug for it.

https://bugs.freedesktop.org/show_bug.cgi?id=101855
Comment 17 GitLab Migration User 2018-08-20 22:07:19 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/poppler/poppler/issues/202.
