Bug 13582

Summary: poppler cannot display characters outside the Unicode BMP with TrueType fonts
Product: poppler
Reporter: Koji Otani <sho>
Component: general
Assignee: poppler-bugs <poppler-bugs>
Status: RESOLVED FIXED
QA Contact:
Severity: enhancement    
Priority: medium    
Version: unspecified   
Hardware: All   
OS: All   
Whiteboard:
Attachments:
  patch solved (2),(3),(4)
  screen shot without the patch
  screen shot with the patch
  PDF file generates screen shots

Description Koji Otani 2007-12-10 03:22:39 UTC
The Adobe-Japan1-6 character collection includes characters outside the Unicode BMP.
However, poppler cannot display these characters, and some others, with TrueType fonts.
Current poppler has the following problems in this area:
(1) The CMap data is old.
 The current data (poppler-data-0.1.1.tar.gz) covers only Adobe-Japan1-4.
 It should be updated to a newer version. (Ghostscript 8.60 already ships
 new CMap data.)
(2) poppler doesn't look up the format 12 cmap table of TrueType fonts.
  Format 12 is the cmap format normally used for codes outside the Unicode BMP;
  16-bit formats such as format 4 cannot represent them.
(3) poppler looks up only UCS2 CMaps when building the unicodeToGID map.
 A UCS2 CMap supports only codes inside the Unicode BMP.
(4) CID conflicts in the CMap are not handled.
 A CMap can map multiple Unicode code points to the same CID,
 so one CID can correspond to multiple Unicode code points.
 Currently poppler uses only the first one.
 If that code does not exist in the cmap of the TrueType font,
 the character is not displayed.
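
For point (2), a format 12 lookup can be sketched roughly as below. This is only an illustration in Python, not poppler's code and not the attached patch. The function name `lookup_format12` and the sample subtable are made up for the example. The byte layout assumed here is the OpenType cmap format 12 one: a 16-byte header followed by groups of (start code, end code, start glyph ID), sorted by start code.

```python
import struct


def lookup_format12(subtable: bytes, code_point: int) -> int:
    """Return the glyph ID for code_point, or 0 (.notdef) if it is unmapped."""
    # Header: format and reserved (uint16 each), then length, language
    # and numGroups (uint32 each), all big-endian.
    fmt, _reserved, _length, _language, num_groups = struct.unpack_from(
        ">HHIII", subtable, 0
    )
    if fmt != 12:
        raise ValueError("not a format 12 subtable")

    # Groups are sorted by start code, so a binary search is enough.
    lo, hi = 0, num_groups - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        start, end, start_gid = struct.unpack_from(">III", subtable, 16 + 12 * mid)
        if code_point < start:
            hi = mid - 1
        elif code_point > end:
            lo = mid + 1
        else:
            return start_gid + (code_point - start)
    return 0


# Made-up subtable: U+0041..U+005A -> GID 10.., U+20B9F (outside the BMP) -> GID 500.
groups = [(0x41, 0x5A, 10), (0x20B9F, 0x20B9F, 500)]
body = b"".join(struct.pack(">III", *g) for g in groups)
sample = struct.pack(">HHIII", 12, 0, 16 + len(body), 0, len(groups)) + body
```

The point of the sketch is that start and end codes are 32-bit values, so a code such as U+20B9F can be stored directly, which a 16-bit table cannot do.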

I propose a patch that solves (2), (3), and (4).
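
The idea behind (3) and (4) can be sketched as follows. Again this is a rough Python illustration and not the patch itself: `build_cid_to_gid`, `cid_to_unicodes` and `font_cmap` are hypothetical names, and the CID, code point and glyph numbers are invented for the example. It assumes every Unicode candidate for a CID has already been collected (including codes outside the BMP) and that the font cmap is available as a plain mapping from code point to glyph ID.

```python
def build_cid_to_gid(cid_to_unicodes, font_cmap):
    """Map each CID to a glyph ID by trying every Unicode candidate in order."""
    cid_to_gid = {}
    for cid, candidates in cid_to_unicodes.items():
        for code_point in candidates:
            gid = font_cmap.get(code_point, 0)
            if gid != 0:
                # The first candidate the font actually contains wins.
                cid_to_gid[cid] = gid
                break
    return cid_to_gid


# Invented data, for illustration only.
cid_to_unicodes = {
    100: [0x2F804, 0x4F60],  # first candidate is missing from the font
    200: [0x20B9F],          # a code outside the BMP
    300: [0xE000],           # no candidate is in the font
}
font_cmap = {0x4F60: 42, 0x20B9F: 500}

result = build_cid_to_gid(cid_to_unicodes, font_cmap)
```

With only the first candidate tried, CID 100 in this example would get no glyph; trying the remaining candidates finds one. CID 300 stays unmapped because the font has none of its codes.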
Comment 1 Koji Otani 2007-12-10 03:25:16 UTC
Created attachment 13012
patch solved (2),(3),(4)
Comment 2 Koji Otani 2007-12-10 03:34:11 UTC
Created attachment 13013
screen shot without the patch
Comment 3 Koji Otani 2007-12-10 03:42:26 UTC
Created attachment 13014
screen shot with the patch
Comment 4 Koji Otani 2007-12-10 03:45:13 UTC
Created attachment 13015
PDF file generates screen shots
Comment 5 Koji Otani 2007-12-10 03:53:44 UTC
Note: poppler with this patch can display the second screen shot only with a NEW CMap file and a TrueType font that includes the JIS X 0213:2004 character set. Without these, poppler with this patch can display only the first screen shot.
Comment 6 Albert Astals Cid 2007-12-10 14:25:00 UTC
Thanks a lot, patch committed to the master git branch, which will become poppler 0.7.
