Bug 11413

Summary: Poppler can't display Japanese text correctly.
Product: poppler
Reporter: Koji Otani <sho>
Component: general
Assignee: poppler-bugs <poppler-bugs>
Status: RESOLVED FIXED
QA Contact:
Severity: normal
Priority: medium
CC: jmuizelaar
Version: unspecified
Hardware: Other
OS: All
Whiteboard:
Attachments: PDF file cause this problem
correct image that acroread displayed.
incorrect image that evince displayed
a patch solves this problem
PDF file displays Japanese CID characters.
image acroread displayed
image evince displayed
image evince with patched poppler displayed
patch fixes bug comment #11 points out
PDF file that embedded font is used.
PDF file that embedded font and Identity map are used

Description Koji Otani 2007-06-28 19:18:03 UTC
Evince can't display Japanese vertical text correctly,
because poppler lacks support for Japanese vertical text.
Comment 1 Koji Otani 2007-06-28 19:20:19 UTC
Created attachment 10496 [details]
PDF file cause this problem
Comment 2 Koji Otani 2007-06-28 19:21:56 UTC
Created attachment 10497 [details]
correct image that acroread displayed.
Comment 3 Koji Otani 2007-06-28 19:29:31 UTC
Created attachment 10499 [details]
incorrect image that evince displayed
Comment 4 Koji Otani 2007-06-28 19:35:34 UTC
Created attachment 10500 [details] [review]
a patch solves this problem

This is a patch for poppler source in CVS (2007/6/28).
Comment 5 Koji Otani 2007-06-28 19:46:11 UTC
Poppler can't display many CJK characters when displaying CID characters using a TrueType font. This patch also improves that.
Comment 6 Koji Otani 2007-06-28 19:48:36 UTC
Created attachment 10501 [details]
PDF file displays Japanese CID characters.
Comment 7 Koji Otani 2007-06-28 19:51:43 UTC
Created attachment 10502 [details]
image acroread displayed
Comment 8 Koji Otani 2007-06-28 19:53:10 UTC
Created attachment 10503 [details]
image evince displayed

evince couldn't display many characters.
Comment 9 Koji Otani 2007-06-28 19:58:22 UTC
Created attachment 10504 [details]
image evince with patched poppler displayed

If poppler is changed with this patch, evince can display more characters, including variant-form characters.
Comment 10 Albert Astals Cid 2007-07-13 15:21:31 UTC
Patch committed, thanks a lot. Please keep contributing to poppler with such high quality code :-)
Comment 11 Jeff Muizelaar 2007-07-30 10:05:37 UTC
This patch causes a regression in the rendering of http://people.freedesktop.org/~jrmuizel/16bitImage.pdf

The wrong glyph is chosen for the 'ft' ligature in 'software' at the bottom of the page.

Acroread displays the pdf correctly.
Comment 12 Koji Otani 2007-07-30 22:03:33 UTC
Created attachment 10922 [details]
 patch fixes bug  comment #11 points out

(In reply to comment #11)
> This patch causes a regression in the rendering of
> http://people.freedesktop.org/~jrmuizel/16bitImage.pdf
> 
> The wrong glyph is chosen for ft ligature in 'software' at the bottom of the
> page.
> 
> Acroread displays the pdf correctly.
> 

This is the case where the Encoding is "Identity" and a ToUnicode map exists.
GfxCIDFont::getCodeToGIDMap builds the CIDToGID map from the ToUnicode map,
but when the encoding is Identity, it should not use the ToUnicode map to get the GID.
If the encoding is Identity, no CIDToGID map is needed.
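
As a rough illustration of the rule described here (not the attached patch itself), the sketch below skips building any CID-to-GID table when the encoding is Identity and derives one from ToUnicode only otherwise. All type and function names (FontInfo, CodeToGID, buildCodeToGIDMap, glyphFor) are hypothetical stand-ins and are not poppler's real API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not poppler types.
typedef std::uint32_t CID;
typedef std::uint32_t GID;
typedef std::uint32_t Unicode;

struct FontInfo {
    bool identityEncoding;              // Encoding is Identity
    std::size_t cidCount;               // number of CIDs to cover
    std::map<CID, Unicode> toUnicode;   // ToUnicode map (may be empty)
    std::map<Unicode, GID> fontCmap;    // Unicode cmap of the TrueType font
};

struct CodeToGID {
    bool identity;            // true: the CID is used directly as the GID
    std::vector<GID> table;   // used only when identity is false
};

// Build the lookup. With Identity encoding no table is built at all,
// so the ToUnicode map cannot influence glyph selection.
CodeToGID buildCodeToGIDMap(const FontInfo &font) {
    CodeToGID result;
    result.identity = font.identityEncoding;
    if (result.identity) {
        return result;
    }
    result.table.assign(font.cidCount, 0);  // 0 stands for the missing glyph
    for (const auto &entry : font.toUnicode) {
        if (entry.first >= font.cidCount) {
            continue;
        }
        auto it = font.fontCmap.find(entry.second);
        if (it != font.fontCmap.end()) {
            result.table[entry.first] = it->second;
        }
    }
    return result;
}

// Resolve one CID to a glyph index.
GID glyphFor(const CodeToGID &map, CID cid) {
    assert(!map.identity || map.table.empty());
    if (map.identity) {
        return cid;
    }
    return cid < map.table.size() ? map.table[cid] : 0;
}
```

With Identity encoding, a ligature CID keeps its own glyph even if its ToUnicode entry points at a character that the font's cmap maps to a different glyph, which is the kind of mismatch reported in comment #11.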

I attached a patch that fixes this bug.

Thanks
Comment 13 Jeff Muizelaar 2007-07-30 22:30:46 UTC
Great. I've committed your fix. Thanks very much.
Comment 14 Jeff Muizelaar 2007-09-17 17:59:56 UTC
I've come across another possible regression caused by this patch.

Around GfxFont.cc:1879 there is a comment that states:
        // fall-through, assuming the Identity mapping -- this appears
        // to match Adobe's behavior

This causes pdf's that use the Adobe-Japan mapping to fall-through when the mapping file from the poppler-data package is not around. Falling through, instead of erroring causes the identity mapping to be used which is obviously not correct. 

Do you have any pdf's that rely on the fall-through behaviour? Perhaps known mappings should cause an error and only unknown mappings fall through? i.e. Adobe-Japan causes an error even if the .cidToUnicode file is not around.

What do you think? 
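
The policy suggested above could be sketched roughly as follows. This is only an illustration under assumptions: MapChoice, isKnownCollection and chooseCidToUnicode are hypothetical names that do not exist in poppler, and the list of "known" collections is an example rather than a definitive set.

```cpp
#include <set>
#include <string>

// Hypothetical outcome of looking for a cidToUnicode mapping.
enum class MapChoice { Loaded, Identity, Error };

// Illustrative list of collections for which a mapping file is expected
// to be installed (an assumption, not poppler's actual list).
bool isKnownCollection(const std::string &collection) {
    static const std::set<std::string> known = {
        "Adobe-Japan1", "Adobe-GB1", "Adobe-CNS1", "Adobe-Korea1"
    };
    return known.count(collection) != 0;
}

// Proposed policy: a missing file for a known collection is an error;
// only unknown collections fall through to the identity mapping.
MapChoice chooseCidToUnicode(const std::string &collection, bool mappingFileFound) {
    if (mappingFileFound) {
        return MapChoice::Loaded;
    }
    if (isKnownCollection(collection)) {
        return MapChoice::Error;
    }
    return MapChoice::Identity;
}
```

The current behaviour corresponds to always returning Identity when the file is missing, whether or not the collection is known.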
Comment 15 Koji Otani 2007-12-11 06:19:52 UTC
(In reply to comment #14)
> I've come across another possible regression caused by this patch.
> 
> Around GfxFont.cc:1879 there is a comment that states:
>         // fall-through, assuming the Identity mapping -- this appears
>         // to match Adobe's behavior
> 
> This causes pdf's that use the Adobe-Japan mapping to fall-through when the
> mapping file from the poppler-data package is not around. Falling through,
> instead of erroring causes the identity mapping to be used which is obviously
> not correct. 
> 
> Do you have any pdf's that rely on the fall-through behaviour? Perhaps known
> mappings should cause an error and only unknown mappings fall through? i.e.
> Adobe-Japan causes an error even if the .cidToUnicode file is not around.
> 
> What do you think? 
> 

This code gets the toUnicode map. The toUnicode map is not mandatory,
and the identity mapping may be correct.

When an embedded font is used, neither a CMap nor a toUnicode map is needed.
The attached A.pdf can be displayed without a language pack if the code falls through.
So, falling through is better here.

Comment 16 Koji Otani 2007-12-11 06:21:39 UTC
Created attachment 13028 [details]
PDF file that embedded font is used.
Comment 17 Koji Otani 2007-12-11 06:43:07 UTC
(In reply to comment #15)
> This code is to get toUnicode map. toUnicode map is not mandatory.
> And identity mapping may be correct.
> 
> When embedded font is used, neither CMap nor toUnicode map is needed.
Sorry, I left something out. This should read:
   When an embedded font and the Identity mapping are used.

> Attached A.pdf can be displayed without language pack if it falls through.
> So, falling through is better here.
> 


Comment 18 Koji Otani 2007-12-11 06:44:13 UTC
Created attachment 13029 [details]
PDF file that embedded font and Identity map are used
Comment 19 Albert Astals Cid 2007-12-11 11:33:43 UTC
So i'm lost :D

Is this bug fixed or not?
Comment 20 Koji Otani 2007-12-12 21:16:28 UTC
(In reply to comment #19)
> So i'm lost :D
> 
> Is this bug fixed or not?
> 

Sorry for confusing you with my bad English.
This bug is fixed.
Comments after #13 are not about this bug, I think.
You may close this.
Comment 21 Albert Astals Cid 2007-12-13 11:28:28 UTC
Closing
