Bug 7597

Summary: add syntax to enter charset on commandline tools (and conf file?)
Product: fontconfig
Reporter: andu
Component: library
Assignee: fontconfig-bugs
Status: RESOLVED FIXED
QA Contact: Behdad Esfahbod <freedesktop>
Severity: enhancement
Priority: high
CC: akira, freedesktop
Version: 2.3
Hardware: Other
OS: Linux (All)
Bug Depends on:
Bug Blocks: 8100

Description andu 2006-07-22 08:12:50 UTC
I would like to use fc-match to figure out how substitution works when the
application fills in the charset field of a pattern. For example, I see that
filenames in my browser use different fonts depending on whether they are in
English or Russian.
I tried issuing
fc-match "Bitstream Vera Sans:charset=something"
but I got a 'segmentation fault'. I tried fc-list : charset to figure out what I
can put there, but it gave me no clue.
Comment 1 Keith Packard 2006-09-01 12:16:57 UTC
Yes, there's no human-readable string representation for charsets. Now that the
cache doesn't use strings, perhaps we can replace the old nasty representation
with something sensible.
Comment 2 Keith Packard 2006-09-02 20:27:11 UTC
I fixed the segfault at least; still remaining to be decided is how to present
charsets in a sensible fashion.
Comment 3 Akira TAGOH 2011-09-04 19:32:30 UTC
Maybe supporting well-known charset names like ISO8859-* would be more useful, or Unicode block names, since no font covers everything in the world.
Comment 4 Behdad Esfahbod 2011-09-06 06:55:54 UTC
(In reply to comment #3)
> Maybe supporting well-known charset names like ISO8859-* would be more
> useful, or Unicode block names, since no font covers everything in the
> world.

Nah, ISO8859-* is not that interesting, and would need data tables that I really want to see die forever.  Unicode blocks are not interesting because of all the holes and rare characters.  You rarely find any font supporting a full block, except maybe for the ASCII and Latin1 blocks.
Comment 5 Akira TAGOH 2011-09-06 18:26:08 UTC
Sure. Well, a side effect of supporting this might be a chance to improve how fonts are rated so that better fonts get selected. Right now fontconfig has orth files per language. I think this direction is right, because rendering characters with a different font per charset, as we saw with X core fonts, was really ugly. However, it faces the dilemma of strict versus lazy orthography, as in Bug#17619. We still need some input through the fontconfig config to determine which one users prefer in terms of quality and so on. Still, how many charsets for a specific language a font supports is measurable, and fonts supporting more charsets should be preferred.

For example, there are several charsets for Japanese, such as JIS X 0201, JIS X 0208, JIS X 0212 and JIS X 0213, plus some revisions of them. 0201 and 0208 are a must for supporting Japanese, while 0212 and 0213 may be optional in most cases, though nice to have.

So I'd suggest having separate tables for charsets, linked from the orth file with some information indicating whether each one is mandatory or optional, then giving them different ratings and selecting the better fonts based on that. Or it might even be good to have a way to do this per character code in the config. Well, it's off topic for this issue though.
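
The rating idea in this comment could be sketched roughly as follows. This is only an illustration, not anything fontconfig implements: the function name `score_font`, the tuple layout, and the two tiny code point sets are all hypothetical placeholders (they are not the real JIS tables), and the rule "reject a font that misses any mandatory code point, otherwise add up optional coverage" is just one possible reading of the proposal.

```python
# Hypothetical placeholder tables: (name, code points, mandatory?).
# These are NOT real JIS X contents, just two code points each.
CHARSETS = [
    ("mandatory-set", frozenset({0x3042, 0x3044}), True),
    ("optional-set", frozenset({0x4E00, 0x4E8C}), False),
]


def score_font(font_coverage, charsets=CHARSETS):
    """Rate a font by how much of each charset it covers.

    Returns None when a mandatory charset is not fully covered,
    otherwise the summed coverage fraction of the optional charsets.
    """
    score = 0.0
    for _name, codepoints, mandatory in charsets:
        covered = len(codepoints & font_coverage) / len(codepoints)
        if mandatory:
            if covered < 1.0:
                return None  # font is unusable for this language
        else:
            score += covered
    return score
```

A font covering both mandatory code points and one of the two optional ones would be rated 0.5 under this scheme, while a font missing a mandatory code point would be rejected outright.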
Comment 6 Behdad Esfahbod 2014-07-03 22:03:41 UTC
commit e708e97c351d3bc9f7030ef22ac2f007d5114730
Author: Behdad Esfahbod <behdad@behdad.org>
Date:   Thu Jul 3 17:52:54 2014 -0400

    Change charset parse/unparse format to be human readable
    Previous format was unusable.  New format is ranges of hex values.
    To choose space character and Latin capital letters for example:
    $ fc-pattern ':charset=20 41-5a'
    Pattern has 1 elts (size 16)
        0000: 00000000 00000001 07fffffe 00000000 00000000 00000000 00000000 00000000
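
To see how the range syntax relates to the dump above, here is a small standalone Python sketch. It does not use or reproduce fontconfig's own parser; the helper names (`parse_charset`, `page_words`, `format_page`) are made up for illustration, and it assumes the dump shows one 256-code-point page as eight 32-bit words with the lowest code point in the least significant bit, which is consistent with the output quoted in the commit message.

```python
def parse_charset(spec):
    """Parse a string such as '20 41-5a' into a set of code points.

    Tokens are whitespace-separated hex values or 'first-last' ranges.
    """
    codepoints = set()
    for token in spec.split():
        first, _, last = token.partition("-")
        lo = int(first, 16)
        hi = int(last, 16) if last else lo
        if hi < lo:
            raise ValueError(f"descending range: {token}")
        codepoints.update(range(lo, hi + 1))
    return codepoints


def page_words(codepoints, page=0):
    """Return the eight 32-bit words for one 256-code-point page."""
    words = [0] * 8
    for cp in codepoints:
        if cp >> 8 == page:
            offset = cp & 0xFF
            words[offset >> 5] |= 1 << (offset & 31)
    return words


def format_page(codepoints, page=0):
    """Format one page in the same shape as the dump line above."""
    body = " ".join(f"{w:08x}" for w in page_words(codepoints, page))
    return f"{page:04x}: {body}"


print(format_page(parse_charset("20 41-5a")))
# prints: 0000: 00000000 00000001 07fffffe 00000000 00000000 00000000 00000000 00000000
```

The space character (0x20) sets bit 0 of the second word, and the capital letters 0x41 to 0x5a set bits 1 through 26 of the third word, giving 07fffffe.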
