Bug 36474 - CJK characters are not displayed
Summary: CJK characters are not displayed
Status: RESOLVED FIXED
Alias: None
Product: poppler
Classification: Unclassified
Component: general
Version: unspecified
Hardware: Other Linux (All)
Importance: medium normal
Assignee: poppler-bugs
URL: https://bugs.launchpad.net/ubuntu/+so...
Reported: 2011-04-22 01:22 UTC by Koji Otani
Modified: 2011-07-28 03:44 UTC

Attachments
patch (5.57 KB, patch)
2011-04-22 01:22 UTC, Koji Otani

Description Koji Otani 2011-04-22 01:22:12 UTC
Created attachment 45930
patch

On Ubuntu 11.04 beta in a Japanese-language environment, poppler can't display Japanese characters in PDF files that don't embed their CID fonts.
I found a problem in font selection:
In this environment, fontconfig has the following setting:

        <edit name="family" mode="prepend" binding="strong">
            <string>DejaVu Sans</string>     --- (1)
            <string>Takao P Gothic</string> --- (2)
              ......

(1) is a font for Latin-1, not for Japanese.
(2) is a font for Japanese.

So poppler selects font (1), and Japanese characters are not displayed.
But this is NOT a bug in the setting!!!
An application using fontconfig is expected to fall back when it doesn't find a glyph for a character. That is, it is expected to do the following:

When a character is Latin-1, it uses font (1).
When a character is not Latin-1, it uses font (2).

The intent of this setting is for applications to use the font pair ((1)+(2)) as a single font.
This seems strange to me, but quite a few users love it.
Settings like this also appear in Chinese environments.
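
To illustrate the per-character fallback described above, here is a minimal sketch in plain C. It is only a model: ModelFont, pick_font_for_char and the coverage ranges are hypothetical and do not come from poppler or fontconfig, and real fonts have sparse coverage rather than a single range.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a font: a name plus one inclusive code point
 * range it has glyphs for. */
typedef struct {
    const char *name;
    unsigned int first;
    unsigned int last;
} ModelFont;

static int model_font_has_glyph(const ModelFont *f, unsigned int cp)
{
    return cp >= f->first && cp <= f->last;
}

/* Walk the list in priority order and return the first font that has a
 * glyph for cp, or NULL if no font in the list covers it. */
static const ModelFont *pick_font_for_char(const ModelFont *fonts, size_t n,
                                           unsigned int cp)
{
    assert(fonts != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (model_font_has_glyph(&fonts[i], cp))
            return &fonts[i];
    }
    return NULL;
}

/* Same order as the setting: (1) first, (2) second.
 * The ranges are illustrative, not the fonts' real coverage. */
static const ModelFont kModelList[] = {
    { "DejaVu Sans",    0x0020, 0x00FF },
    { "Takao P Gothic", 0x0020, 0xFFEF },
};
```

With this model, a Latin-1 character such as 'A' resolves to font (1), while HIRAGANA LETTER A (U+3042) falls through to font (2).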

But poppler doesn't fall back!!!

I think that making poppler fall back would need many changes.
I have an idea: when an Adobe-Japan1 font is required, don't select a font that doesn't support the language "ja". Then poppler at least displays all characters.

I attached a patch for this.
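
The attached patch is not reproduced here. As a rough sketch of the idea only, the following plain C model walks a candidate list that is already in fontconfig's priority order and skips every candidate that does not support the requested language. ModelCandidate, select_for_lang and the language lists are hypothetical. With the real fontconfig API, the per-font check would presumably inspect each pattern's FC_LANG value (for example through FcPatternGetLangSet and FcLangSetHasLang), but that is an assumption about how one could do it, not a description of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one sorted candidate. */
typedef struct {
    const char *family;
    const char *langs;   /* space-separated language tags, e.g. "en fr" */
} ModelCandidate;

/* Return 1 if lang appears as a whole token in c->langs, else 0. */
static int candidate_supports_lang(const ModelCandidate *c, const char *lang)
{
    size_t len = strlen(lang);
    const char *p = c->langs;

    while (*p != '\0') {
        const char *end = strchr(p, ' ');
        size_t tok = end ? (size_t)(end - p) : strlen(p);

        if (tok == len && strncmp(p, lang, len) == 0)
            return 1;
        p += tok;
        while (*p == ' ')
            p++;
    }
    return 0;
}

/* Return the first candidate, in list order, that supports lang,
 * or NULL if none does. */
static const ModelCandidate *select_for_lang(const ModelCandidate *sorted,
                                             size_t n, const char *lang)
{
    assert(sorted != NULL || n == 0);
    assert(lang != NULL);
    for (size_t i = 0; i < n; i++) {
        if (candidate_supports_lang(&sorted[i], lang))
            return &sorted[i];
    }
    return NULL;
}

/* Illustrative data only; the language lists are made up. */
static const ModelCandidate kSortedCandidates[] = {
    { "DejaVu Sans",    "en fr de" },
    { "Takao P Gothic", "ja en" },
};
```

In this model a request for "ja" skips DejaVu Sans and returns Takao P Gothic, even though DejaVu Sans comes first in the list.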

This patch also:
   extracts the code that selects the lang tag into a function, so that it
 can be used from other functions.
   adds code to handle the "Medium" modifier that is often used in Japanese font names.
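
A minimal sketch of what these two helpers might look like, in plain C. Everything here is an assumption for illustration: the function names, the collection-to-language table, and the rule that modifiers follow the first '-' or ',' in the font name are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Map a CID character collection name to a language tag.
 * The table is an assumed mapping, not copied from poppler.
 * Returns NULL for collections that imply no particular language. */
static const char *lang_for_collection(const char *collection)
{
    static const struct {
        const char *collection;
        const char *lang;
    } table[] = {
        { "Adobe-Japan1", "ja" },
        { "Adobe-GB1",    "zh-cn" },
        { "Adobe-CNS1",   "zh-tw" },
        { "Adobe-Korea1", "ko" },
    };

    assert(collection != NULL);
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(collection, table[i].collection) == 0)
            return table[i].lang;
    }
    return NULL;
}

/* Hypothetical weight values for the model. */
typedef enum {
    MODEL_WEIGHT_UNSET,
    MODEL_WEIGHT_MEDIUM,
    MODEL_WEIGHT_BOLD
} ModelWeight;

/* Look for a weight modifier after the first '-' or ',' in a font
 * name such as "GothicBBB-Medium". */
static ModelWeight weight_from_font_name(const char *name)
{
    const char *mods;

    assert(name != NULL);
    mods = strpbrk(name, "-,");
    if (mods == NULL)
        return MODEL_WEIGHT_UNSET;
    mods++;
    if (strstr(mods, "Medium") != NULL)
        return MODEL_WEIGHT_MEDIUM;
    if (strstr(mods, "Bold") != NULL)
        return MODEL_WEIGHT_BOLD;
    return MODEL_WEIGHT_UNSET;
}
```

Keeping the lang lookup in its own function lets both the pattern-building code and a later language filter share one table.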
Comment 1 Albert Astals Cid 2011-04-26 09:45:16 UTC
I'm a bit confused, we already use the language in 

  p = FcPatternBuild(NULL,
                    FC_FAMILY, FcTypeString, family,
                    FC_LANG, FcTypeString, lang,
                    NULL);

Shouldn't this make fontconfig return a font for the language we want?
Comment 2 Koji Otani 2011-04-26 21:02:10 UTC
Because the binding is "strong" in this setting, the order of the family list takes precedence over the FC_LANG value. So fontconfig returns the "DejaVu" font at the top of the list.
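
The precedence described here can be sketched with a small plain C model. This is a simplification for illustration, not fontconfig's actual scoring code: ModelMatch and best_match are hypothetical, and the model only assumes that a strong family binding is compared before the language while a non-strong one is compared after it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one candidate font as seen by the matcher. */
typedef struct {
    const char *family;
    int family_rank;    /* position in the requested family list, 0 = first */
    int lang_mismatch;  /* 0 if the font covers the requested lang, else 1 */
} ModelMatch;

/* Return the index of the best candidate.
 * strong_binding != 0: family rank is compared first, lang second.
 * strong_binding == 0: lang is compared first, family rank second. */
static size_t best_match(const ModelMatch *c, size_t n, int strong_binding)
{
    size_t best = 0;

    assert(c != NULL && n > 0);
    for (size_t i = 1; i < n; i++) {
        int a1 = strong_binding ? c[i].family_rank : c[i].lang_mismatch;
        int b1 = strong_binding ? c[best].family_rank : c[best].lang_mismatch;
        int a2 = strong_binding ? c[i].lang_mismatch : c[i].family_rank;
        int b2 = strong_binding ? c[best].lang_mismatch : c[best].family_rank;

        if (a1 < b1 || (a1 == b1 && a2 < b2))
            best = i;
    }
    return best;
}

/* Request: lang "ja", family list (DejaVu Sans, Takao P Gothic). */
static const ModelMatch kModelMatches[] = {
    { "DejaVu Sans",    0, 1 },  /* first in the list, no "ja" coverage */
    { "Takao P Gothic", 1, 0 },  /* second in the list, covers "ja" */
};
```

In this model the strong case picks DejaVu Sans (index 0) and the non-strong case picks Takao P Gothic (index 1), which matches the behavior reported in this bug.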
Comment 3 Albert Astals Cid 2011-04-27 14:58:00 UTC
So you mean that even if i ask for a "ja" font it will give me a non "ja" font first? Are you sure that this is not a bug in fontconfig? What's the point in that feature then if it does not work?
Comment 4 Koji Otani 2011-04-27 20:58:31 UTC
(In reply to comment #3)
> So you mean that even if i ask for a "ja" font it will give me a non "ja" font
> first?

Yes.
 
> Are you sure that this is not a bug in fontconfig?

I'm not sure that the original developers of fontconfig expected this use case.
But the intent of the writer of this setting is clear.

>  What's the point in
> that feature then if it does not work?

If "binding" is not "strong", fontconfig behaves as you expect.
Here is the setting in more detail:

    <match target="pattern">
        <test name="lang" compare="contains">
            <string>ja</string>                  ---- (A)
        </test>
        <test qual="any" name="family">
            <string>sans-serif</string>
        </test>
        <edit name="family" mode="prepend" binding="strong">
            <string>DejaVu Sans</string>
            <string>Takao P Gothic</string>
              ......

Please note (A): this setting is applied exactly when a "ja" font is requested!!!
Comment 5 Albert Astals Cid 2011-05-08 06:00:57 UTC
****
If "binding" is not "strong", fontconfig behaves as you expect.
****

Reading this again it seems to me that the problem is not in poppler but just that the fontconfig configuration is wrong and you are trying to add a workaround to poppler to fix that problem, couldn't you just not make the binding strong and then it will work?
Comment 6 Koji Otani 2011-05-08 20:24:40 UTC
(In reply to comment #5)
> ****
> If "binding" is not "strong", fontconfig behaves as you expect.
> ****
> 
> Reading this again it seems to me that the problem is not in poppler but just
> that the fontconfig configuration is wrong and you are trying to add a
> workaround to poppler to fix that problem, couldn't you just not make the
> binding strong and then it will work?

The author of this config file wants to use two fonts (DejaVu Sans and Takao P Gothic) as one font in which the Latin glyphs of Takao are replaced with DejaVu's.
His intent is to put DejaVu at the top of the list fontconfig returns even when lang=ja. He writes "binding=string" on purpose. And I'm afraid that programs using fontconfig, other than poppler, behave as he expects.
This problem is also discussed in:
https://bugs.launchpad.net/ubuntu/+source/language-selector/+bug/759882
https://bugs.launchpad.net/ubuntu-translations/+bug/659280
https://bugs.launchpad.net/ubuntu/+source/language-selector/+bug/713950
Comment 7 Koji Otani 2011-05-08 20:28:33 UTC
(In reply to comment #6)

> lang=ja. He writes "binding=string" on purpose. and I'm afraid that programs
Sorry for the misspelling: "binding=string" --> "binding=strong"
Comment 8 Albert Astals Cid 2011-07-28 03:42:43 UTC
Patch committed, sorry for the delay.

