Bug 35118

Summary: Prioritize fonts that support a territory-less language variant when no exact language match
Product: fontconfig Reporter: Caolán McNamara <caolanm>
Component: library Assignee: fontconfig-bugs
Status: RESOLVED MOVED QA Contact: Behdad Esfahbod <freedesktop>
Severity: normal    
Priority: medium CC: akira, freedesktop, petersen
Version: 2.8   
Hardware: Other   
OS: All   
Whiteboard:
Attachments: implement suggestion

Description Caolán McNamara 2011-03-08 07:54:40 UTC
Created attachment 44237
implement suggestion

When fontconfig matches by language tag and the supplied tag does not exactly match a language tag that fontconfig knows, it considers every font that supports at least one variant of that language.

It doesn't, however, prioritize fonts that support the territory-less variant, which is likely the best choice in these circumstances.

In practice,
fc-match :lang=pa-IN
generates a list of "pa" fonts and "pa-pk" fonts and returns one of them fairly arbitrarily.
For me it returns
"Droid Sans Arabic" because that font supports "pa-pk", but any font that provides just "pa" would be a better choice.

e.g. see https://bugzilla.redhat.com/show_bug.cgi?id=682716

Attached is an implementation that effectively sorts "xx" before "xx-*" when no exact "xx-yy" is found.
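
The ordering described above can be sketched in standalone C. This is not the attached patch and does not use fontconfig's internals: the rank names and the lang_rank() helper are hypothetical, and tags are assumed to have the simple "xx" or "xx-yy" shape.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical ranks, lower is better; these are not fontconfig values. */
enum {
    RANK_EXACT = 0,           /* "pa-in" requested, "pa-in" supported */
    RANK_NO_TERRITORY = 1,    /* "pa-in" requested, "pa" supported    */
    RANK_OTHER_TERRITORY = 2, /* "pa-in" requested, "pa-pk" supported */
    RANK_OTHER_LANG = 3       /* "pa-in" requested, "ur" supported    */
};

/* Length of the language part of a tag: everything before the first '-'. */
static size_t lang_part_len(const char *tag)
{
    size_t n = 0;
    while (tag[n] != '\0' && tag[n] != '-')
        n++;
    return n;
}

/* Case-insensitive comparison of two NUL-terminated strings. */
static int same_text(const char *a, const char *b)
{
    size_t i = 0;
    while (a[i] != '\0' && b[i] != '\0' &&
           tolower((unsigned char)a[i]) == tolower((unsigned char)b[i]))
        i++;
    return a[i] == '\0' && b[i] == '\0';
}

/* Rank one tag that a font supports against the requested tag. */
static int lang_rank(const char *requested, const char *supported)
{
    assert(requested != NULL && supported != NULL);

    size_t rl = lang_part_len(requested);
    size_t sl = lang_part_len(supported);

    /* Different language parts: the worst rank. */
    if (rl != sl)
        return RANK_OTHER_LANG;
    for (size_t i = 0; i < rl; i++)
        if (tolower((unsigned char)requested[i]) !=
            tolower((unsigned char)supported[i]))
            return RANK_OTHER_LANG;

    /* Same language: compare what follows the language part. */
    if (same_text(requested + rl, supported + sl))
        return RANK_EXACT;
    if (supported[sl] == '\0')
        return RANK_NO_TERRITORY;
    return RANK_OTHER_TERRITORY;
}
```

Sorting the candidate fonts by the smallest rank any of their tags achieves would put a "pa"-only font ahead of a "pa-pk" font for a pa-IN request.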
Comment 1 Behdad Esfahbod 2011-03-14 15:34:47 UTC
I agree with the analysis.  But not the solution.  How about we change FcLangCompare() to return a new FcLangMaybeDifferentTerritory if one lang has a territory and the other doesn't?  Would have been nice if we could define that .5, but not a huge deal.  The matcher has to order them.
Comment 2 Behdad Esfahbod 2011-03-14 15:40:50 UTC
Hmm, I tried, but it's kinda in the API that FcLangCompare results are monotonic.   So my approach wouldn't work.  Let me see what else I can come up with.
Comment 3 Akira TAGOH 2011-06-17 03:24:37 UTC
*** Bug 26604 has been marked as a duplicate of this bug. ***
Comment 4 Caolán McNamara 2011-06-17 03:51:34 UTC
I could rework this patch to use, say, four *internal* values, e.g.

FcInternalLangEqual,
FcInternalLangDefaultCountry, (or NoCountry, whatever)
FcInternalLangDifferentCountry,
FcInternalLangDifferentLanguage, 

everywhere internally, and then map them at the api entry/exit points to the public

FcLangEqual
FcLangDifferentCountry
FcLangDifferentLanguage

if that was considered useful
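
A minimal sketch of that mapping, in standalone C. All Sketch* names are hypothetical stand-ins rather than fontconfig declarations, and folding the "default country" case into the public "different country" result is an assumption; the comment does not say which public value it should map to.

```c
#include <assert.h>

/* Hypothetical internal results, ordered from best to worst match. */
typedef enum {
    SketchInternalLangEqual,
    SketchInternalLangDefaultCountry,
    SketchInternalLangDifferentCountry,
    SketchInternalLangDifferentLanguage
} SketchInternalLangResult;

/* Stand-ins for the three public results listed above. */
typedef enum {
    SketchLangEqual,
    SketchLangDifferentCountry,
    SketchLangDifferentLanguage
} SketchLangResult;

/* Collapse the four internal values to three at the API boundary.
 * Assumption: "default country" is reported as "different country". */
static SketchLangResult sketch_to_public(SketchInternalLangResult r)
{
    switch (r) {
    case SketchInternalLangEqual:
        return SketchLangEqual;
    case SketchInternalLangDefaultCountry:
    case SketchInternalLangDifferentCountry:
        return SketchLangDifferentCountry;
    case SketchInternalLangDifferentLanguage:
        return SketchLangDifferentLanguage;
    }
    assert(!"unknown internal result");
    return SketchLangDifferentLanguage;
}
```

Because the mapping never reorders values, the public results stay monotonic while the matcher can still tell the two middle cases apart internally.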
Comment 5 Behdad Esfahbod 2011-06-22 10:11:50 UTC
I fixed an embarrassing bug in language handling.  Can you please retest with master?
Comment 6 Akira TAGOH 2011-06-22 20:03:54 UTC
Behdad, did you push it to the repo?
Comment 8 Akira TAGOH 2011-06-23 22:26:37 UTC
I just tried ja vs ja-jp, but not the case from comment #0.

I don't see any difference between fc-match serif:lang=ja and fc-match serif:lang=ja-jp, though. Is this change expected to alter the output even if the config files contain <string>ja</string> rather than <test name="lang"><string>ja-jp</string></test>?
Comment 9 Behdad Esfahbod 2011-06-24 07:03:07 UTC
(In reply to comment #8)
> I just tried ja vs ja-jp, but not the case from comment #0.
> 
> I don't see any difference between fc-match serif:lang=ja and fc-match
> serif:lang=ja-jp, though. Is this change expected to alter the output even
> if the config files contain <string>ja</string> rather than <test
> name="lang"><string>ja-jp</string></test>?

Reading bug description again and thinking about it, the bug is still there.  May be able to resolve it by just changing FcCompareLang() to use FcLangSetContains family of operators.  That goes against the design of the matcher, but will work.
Comment 10 Akira TAGOH 2011-06-26 23:52:16 UTC
(In reply to comment #9)
> Reading bug description again and thinking about it, the bug is still there. 
> May be able to resolve it by just changing FcCompareLang() to use
> FcLangSetContains family of operators.  That goes against the design of the
> matcher, but will work.

So modifying the config file to:

<test name="family" compare="contains">
  <string>ja</string>
</test>

will work then? okay, let me try..
Comment 11 Behdad Esfahbod 2011-06-28 08:47:48 UTC
(In reply to comment #10)
> (In reply to comment #9)
> > Reading bug description again and thinking about it, the bug is still there. 
> > May be able to resolve it by just changing FcCompareLang() to use
> > FcLangSetContains family of operators.  That goes against the design of the
> > matcher, but will work.
> 
> So modifying the config file to:
> 
> <test name="family" compare="contains">

You mean name="lang".

>   <string>ja</string>
> </test>
> 
> will work then? okay, let me try..

Yes, that should work I guess.
Comment 12 Akira TAGOH 2011-11-09 23:38:42 UTC
Well, I couldn't find a way to reproduce this with ja vs ja-jp, and there seems to be a different reason for pa-pk not matching as per comment #0, so I failed to confirm whether compare="contains" would really help.

In the current implementation of the orthographies, pa and pa-pk have different requirements to satisfy. pa_pk.orth now refers to ur.orth through lah.orth, and pa.orth and ur.orth obviously require different script coverage. If this is correct, Lohit Punjabi not matching on fc-match :lang=pa-pk would not be a bug.

Needs more info or feedback.
Comment 13 Akira TAGOH 2012-07-02 20:31:38 UTC
As another solution, I have implemented an FcLangNormalize() function that maps a language tag onto one that fontconfig actually supports. For instance, FcLangNormalize("ja-jp") is expected to return "ja". However, it is currently exported for internal use only, so if it's useful, we could perhaps make it public.
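
The idea can be illustrated with a standalone C sketch. This is not the FcLangNormalize() implementation: sketch_normalize_lang() and the small table of known tags are hypothetical, and the only behavior assumed is "keep the full tag if it is known, otherwise fall back to the territory-less tag".

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list of supported tags, for illustration only. */
static const char *const sketch_known_tags[] = {
    "ja", "pa", "pa-pk", "zh-cn", "zh-tw"
};

static int sketch_is_known(const char *tag)
{
    size_t n = sizeof sketch_known_tags / sizeof sketch_known_tags[0];
    for (size_t i = 0; i < n; i++)
        if (strcmp(tag, sketch_known_tags[i]) == 0)
            return 1;
    return 0;
}

/* Write a supported form of tag into out.
 * Returns 1 on success, 0 if nothing fits or out is too small. */
static int sketch_normalize_lang(const char *tag, char *out, size_t outlen)
{
    assert(tag != NULL && out != NULL);

    size_t len = strlen(tag);
    if (len + 1 > outlen)
        return 0;

    /* Lowercase the tag and accept '_' as a separator (copies the NUL too). */
    for (size_t i = 0; i <= len; i++) {
        char c = (char)tolower((unsigned char)tag[i]);
        out[i] = (c == '_') ? '-' : c;
    }

    /* Keep the full tag when it is supported as-is. */
    if (sketch_is_known(out))
        return 1;

    /* Otherwise drop the territory and try the bare language. */
    char *dash = strchr(out, '-');
    if (dash != NULL) {
        *dash = '\0';
        if (sketch_is_known(out))
            return 1;
    }
    return 0;
}
```

With this table, "ja-JP" normalizes to "ja" while "pa_PK" stays "pa-pk", since that territory variant is listed separately.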
Comment 14 Behdad Esfahbod 2017-08-03 09:58:41 UTC
Shall we revisit this?  I have to read the report again to understand what this was about.
Comment 15 GitLab Migration User 2018-08-20 21:45:08 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/fontconfig/fontconfig/issues/30.
