Bug 20911 - 65-nonlatin should be updated for CJK fonts
Status: REOPENED
Product: fontconfig
Classification: Unclassified
Component: conf
Version: 2.6
Hardware: Other All
Importance: medium normal
Assigned To: fontconfig-bugs
Behdad Esfahbod
http://bbs.dartmouth.edu/~fangq/blog/...
Reported: 2009-03-27 09:00 UTC by Qianqian Fang
Modified: 2014-09-05 14:36 UTC (History)
Attachments
mixture of Han glyphs from Japanese and Chinese fonts under en locale (133.44 KB, image/png)
2009-03-27 09:00 UTC, Qianqian Fang
proposed 65-nonlatin.conf (15.55 KB, text/plain)
2009-04-19 14:34 UTC, Qianqian Fang
25-unhint-nonlatin.conf (388 bytes, text/plain)
2009-06-30 02:28 UTC, MATSUU Takuto
25-unhint-nonlatin.conf (592 bytes, text/plain)
2009-07-02 09:26 UTC, MATSUU Takuto
proposed 65-nonlatin with updated Japanese font orders (16.29 KB, text/plain)
2009-07-20 07:45 UTC, Qianqian Fang
proposed 65-nonlatin with updated Japanese font orders (16.08 KB, text/plain)
2009-07-21 20:32 UTC, Qianqian Fang
55-language_fonts_ja-jp.conf (3.60 KB, application/octet-stream)
2009-07-22 21:10 UTC, Qianqian Fang
55-language_fonts_ja-jp.conf (3.74 KB, application/octet-stream)
2009-07-22 21:12 UTC, Qianqian Fang
61-language_fonts_ja.conf (1.98 KB, text/plain)
2009-07-23 03:01 UTC, MATSUU Takuto
41-language-ja.conf (1.96 KB, text/plain)
2009-07-23 07:52 UTC, MATSUU Takuto
proposed 65-nonlatin with updated Japanese font orders (16.19 KB, text/plain)
2009-07-23 08:01 UTC, MATSUU Takuto
41-language-zh.conf (4.04 KB, text/plain)
2009-07-23 10:16 UTC, Qianqian Fang
screenshot of zh wikipedia in a japanese firefox on karmic (333.29 KB, image/png)
2009-11-30 00:22 UTC, Jens Petersen
another proposed 65-nonlatin.conf (5.65 KB, text/plain)
2009-11-30 06:34 UTC, Hideki Yamane
another proposed configuration file for Japanese language (1.52 KB, text/plain)
2009-11-30 06:35 UTC, Hideki Yamane
another proposed configuration file for Chinese language (2.49 KB, text/plain)
2009-11-30 06:37 UTC, Hideki Yamane
screenshot of zh.wikipedia in Japanese Firefox on karmic (with another proposed conffiles) (337.03 KB, image/png)
2009-11-30 06:45 UTC, Hideki Yamane
Filtered Microhei (18.14 KB, image/png)
2009-11-30 08:07 UTC, Baybal
IPAMonafont (20.57 KB, image/png)
2009-11-30 08:08 UTC, Baybal
separate Japanese family -> generic config (1.90 KB, text/plain)
2009-11-30 14:17 UTC, Hideki Yamane
a complete test suite for the proposed config files (16.73 KB, application/x-gzip)
2009-11-30 17:41 UTC, Qianqian Fang
65-nonlatin test suite svn rev23 (33.20 KB, application/x-gzip)
2009-12-02 07:46 UTC, Qianqian Fang
split fontconfig files (2.60 KB, application/x-compressed-tar)
2009-12-02 14:09 UTC, Nicolas Mailhot
sample config for fallback with the separate files (207 bytes, application/x-bzip2)
2010-02-01 01:44 UTC, Akira TAGOH

Description Qianqian Fang 2009-03-27 09:00:50 UTC
Created attachment 24321 [details]
mixture of Han glyphs from Japanese and Chinese fonts under en locale

The font order in fontconfig's 65-nonlatin.conf has many issues and is causing more and more trouble for CJK language support. Here is a summary of the problems I found:

1. mixing proprietary fonts with free fonts

In this file, "MS Gothic","SimSun","PMingLiu", "HanyiSong" and "MS 明朝" are proprietary fonts. As far as I know, none of the Linux distros has received permission from the copyright owners to use these fonts. Giving higher priority to proprietary fonts increases users' dependency on them, encourages font piracy, and reduces user feedback for FLOSS font development.

In addition, "SimSun" was a popular "pirate" Chinese font among Chinese Linux users about 5 years ago because of its embedded bitmaps, but over the past 4 years the WenQuanYi project has developed high-quality open-source bitmap fonts and sans-serif Chinese fonts, which have become far more popular than SimSun and most other proprietary Chinese fonts.

Also, AFAIK, "ZYSong18030" was only licensed to Red Hat 9 by Zhong Yi Beijing Inc., and this font has no embedded bitmaps. Therefore, its user group is quite small.


2. sans-serif and serif used the same font order

In CJK fonts, styles such as Song (Mincho, Ming, or Batang) and Hei (Gothic or Dotum) correspond to the sans-serif and serif fonts of the Latin world. "Kai" is a style that more or less corresponds to italic or script. However, these fonts are ordered the same way in both the serif and sans-serif blocks of 65-nonlatin. The proper order should be

for serif:
  bitmap Chinese fonts (style independent) > Song/Ming > Mincho/Batang > Hei > Gothic/Dotum > Kai > system fallback (GNU Unifont exp.)

for sans-serif:
  bitmap Chinese fonts (style independent) >  Hei > Gothic/Dotum > Song/Ming > Mincho/Batang > Kai > system fallback (GNU Unifont exp.)

I will explain the order for CJK fonts below.


3. fonts with lower Unicode coverage and lower quality are placed in front of more complete and polished ones

Japanese and Korean fonts usually contain only 6000 Han glyphs, while the typical charset of a Chinese font is 20000. Because 65-nonlatin puts many Japanese fonts in front of Chinese fonts, a block of text with Han glyphs is often rendered with a mixture of Gothic, Mincho, Song and Kai glyphs, which looks horrible. See the attached screen capture.


I suggest putting Chinese fonts in front of Japanese/Korean fonts. When Pango fails to determine that the text is Chinese (which happens when rendering Han text under non-CJK locales), we can at least render the text with a consistent font (despite the z-variant differences). If Pango can determine the language, language-specific fontconfig rules can then set the font order later (such as the language-selector-xx files in Ubuntu).


4. order the font based on readability

The readability of Chinese fonts is a very complex problem. It depends on both technology (screen resolution, hinting techniques, etc.) and fashion (font styles from MS and Mac strongly influence Linux users), so it is constantly changing. More "modern" Chinese users prefer sans-serif fonts over bitmap Chinese fonts (65% in a survey at the Ubuntu Chinese forum, N>300), while some other users prefer bitmaps. In any case, non-bitmapped serif fonts (Song, maybe Kai) are not preferred by most users. The font order should consider not only license, coverage, and consistency, but also readability.
Comment 1 Qianqian Fang 2009-03-27 09:12:54 UTC
A correction: 

Song (Mincho, Ming, or Batang), or Hei (Gothic or Dotum) correspond to serif and sans-serif fonts, respectively. My original post has the reversed order.
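As a rough illustration of point 2 with the corrected mapping (Song/Ming/Mincho/Batang as serif, Hei/Gothic/Dotum as sans-serif), the two generic families could carry different CJK orders along the following lines. This is only a sketch: the specific families are picked from names mentioned elsewhere in this bug, and the bitmap, Kai and Unifont entries of the proposed order are left out for brevity.

```xml
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
  <!-- serif: Song/Ming first, then Mincho/Batang, then sans styles as fallback -->
  <alias>
    <family>serif</family>
    <prefer>
      <family>AR PL UMing CN</family>     <!-- Ming (zh) -->
      <family>IPAPMincho</family>         <!-- Mincho (ja) -->
      <family>Baekmuk Batang</family>     <!-- Batang (ko) -->
      <family>WenQuanYi Zen Hei</family>  <!-- Hei (zh), cross-style fallback -->
      <family>VL PGothic</family>         <!-- Gothic (ja), cross-style fallback -->
    </prefer>
  </alias>

  <!-- sans-serif: Hei first, then Gothic/Dotum, then serif styles as fallback -->
  <alias>
    <family>sans-serif</family>
    <prefer>
      <family>WenQuanYi Zen Hei</family>  <!-- Hei (zh) -->
      <family>VL PGothic</family>         <!-- Gothic (ja) -->
      <family>Baekmuk Dotum</family>      <!-- Dotum (ko) -->
      <family>AR PL UMing CN</family>     <!-- Ming (zh), cross-style fallback -->
      <family>IPAPMincho</family>         <!-- Mincho (ja), cross-style fallback -->
    </prefer>
  </alias>
</fontconfig>
```

The point of the sketch is only that the two lists differ; the exact ranking within each style is what the rest of this bug discusses.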
Comment 2 Jens Petersen 2009-04-13 20:55:10 UTC
(In reply to comment #0)
> Created an attachment (id=24321) [details]
> mixture of Han glyphs from Japanese and Chinese fonts under en locale

Right, this problem is well known.

> In this file, "MS Gothic","SimSun","PMingLiu", "HanyiSong" and "MS 明朝"
> are proprietary fonts. As far as I know, none of the Linux distros received
> permission to use these fonts from the copyright owners of the fonts. Giving a
> higher priority to proprietary fonts will increase the user's dependency to
> them, encouraging font piracy and reduce user feedbacks for FLOSS font
> development. 
> 
> In addition, "SimSun" used to be a popular "pirate" Chinese fonts for Chinese
> Linux users about 5 years ago due to the embedded bitmaps, but in the past 4
> years, WenQuanYi project has developed high quality open-source bitmap fonts
> and sans-serif Chinese fonts, and getting far more popular than SimSun and most
> other proprietary Chinese fonts.
> 
> Also, AFAIK, "ZYSong18030" was only licensed to Redhat 9, from Zhong Yi Beijing
> Inc., and this font has no embedded bitmaps. Therefore, the user group of this
> font is quite small.

Agreed.  I think the proprietary fonts should at least be moved to a non-free .conf file,
which should have lower priority than the free ones.  This would be a good opportunity
to clean up 65-nonlatin.conf.

> 2. sans-serif and serif used the same font order

Agree on the idea of correcting this.

> for serif:
>   bitmap Chinese fonts (style independent) > Song/Ming > Mincho/Batang > Hei >
> Gothic/Dotum > Kai > system fallback (GNU Unifont exp.)
> 
> for sans-serif:
>   bitmap Chinese fonts (style independent) >  Hei > Gothic/Dotum > Song/Ming >
> Mincho/Batang > Kai > system fallback (GNU Unifont exp.)

I think using bitmap before outline is a bad idea for JK anyway.

I suggest having a separate switch to turn on bitmap in the fontconfig rules perhaps.

> 3. fonts with lower unicode coverage and low quality were placed in front of
> more complete and polished ones

("Quality" may be subjective - anyway CJK respective styles are too different to
allow a common shared font.)

> Japanese and Korean fonts usually contains only 6000 Han glyphs, while Chinese
> fonts, the typically charset is typically 20000. Because 65-nonlatin puts many
> Japanese fonts in front of Chinese fonts, when rendering a block of text with
> Han glyphs, one often see a mixture of Gothic, Mincho, Song and Kai glyphs,
> which looks horrible.

Nod

> I suggest to put Chinese fonts in front of Japanese/Korean fonts. When Pango
> fail to determine the Chinese text (which happens when rendering Han text under
> non-CJK locales), at least we can render the text with a consistent font
> (despite the z-variant differences). If Pango can determine the language, then,
> use language specific fontconfig rules to set the font order later

Sounds reasonable enough.

> 4. order the font based on readability
> 
> The readability of Chinese fonts is a very complex problem. It is both
> technology (screen resolution, hinting techniques etc) and fashion (font styles
> from MS and Mac strongly influences Linux users) dependent. Therefore, it is
> constantly changing. More "modern" Chinese users prefer sans-serif fonts over
> the bitmap Chinese font (65% based on a survey at Ubuntu Chinese forum, N>300),
> while some other users prefer bitmaps. In any case, non-bitmaped serif font
> (Song, maybe Kai) are not preferred for most users. The order of the fonts
> shall not only consider the license, coverage, consistency, but also the
> readability.

So how about listing sans-serif (Hei) before bitmap then?
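One possible shape for the "separate switch" for bitmap fonts mentioned in this comment is a small standalone file that is linked into conf.d only when bitmaps should be hidden. The sketch below rejects every non-scalable font; the file name is hypothetical, and whether a global switch like this is fine-grained enough for CJK users is an open question in this thread.

```xml
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<!-- hypothetical file name: 70-cjk-no-bitmaps.conf -->
<fontconfig>
  <selectfont>
    <rejectfont>
      <!-- drop every font that is not scalable, i.e. all bitmap fonts -->
      <pattern>
        <patelt name="scalable"><bool>false</bool></patelt>
      </pattern>
    </rejectfont>
  </selectfont>
</fontconfig>
```

Removing the link from conf.d would make bitmap fonts such as WenQuanYi Bitmap Song visible again without touching the font order itself.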
Comment 3 Qianqian Fang 2009-04-19 14:34:37 UTC
Created attachment 24952 [details]
proposed 65-nonlatin.conf

Please find in the attachment the proposed font orders for CJK languages.

A few comments about this file:

1. the original file was taken from Behdad's branch at
http://cgit.freedesktop.org/~behdad/fontconfig/tree/conf.d/65-nonlatin.conf

2. I only touched the CJK fonts. As I know nothing about the non-CJK languages, I replaced the old CJK list with my new font list and kept everything else the same.

3. the fonts were ordered pretty much based on my previous comment, in short:
    A. free > non-free
    B. screen (CJK bitmap) fonts > print fonts 
    C. Larger coverage > smaller coverage (CJK Unifonts > CJK specific fonts)
    D. for monospace, sans > serif, where sans has better readability
    E. for the same language, fonts with better "quality" are preferred

4. for "E" in (3), I would like to hear more input from CJK users.

5. I strongly suggest removing all the non-free fonts, or at least moving them to a separate file (besides, they date from the XP age and are very outdated).

6. This file describes the general fallback path and should not make any assumptions about the desktop locale. It is therefore preferable to fine-tune the font order in language-specific font config files, such as the language-selector files in Ubuntu.

Here is the CJK block I extracted from the serif block, as an example of my suggested changes. For each family, a comment line with the format of
{license, coverage, type, intended use, aliases, major lang-tag} is listed above the font name.

*GB18030(27514 glyphs)=CJK unified ideographs+CJK Ext A
*GBK(20932)=CJK unified ideographs
*GB2312(6763)=simplified Chinese minimum charset
*Big5(~13000)=traditional Chinese minimum charset
*HKSCS(vary)=HK Han glyphs scattered in CJK, CJK Ext.A and Ext.B

------------------------------------------------

<!-- ### block 1: Screen fonts ### -->
  <!-- free, GB18030, bitmap, screen font, sans/serif, zh-cn,zh-tw -->
	<family>WenQuanYi Bitmap Song</family> <!-- han (zh-cn,zh-tw) -->

<!-- ### block 2: Song/Mincho/Batang print fonts ### -->
  <!-- free, GB2312+Big5+HKSCS, vector, print font, serif, zh-cn,zh-tw -->
	<family>AR PL ShanHeiSun Uni</family> <!-- han (zh-cn,zh-tw) -->
	<family>AR PL UMing CN</family> <!-- han (zh-cn,zh-tw) -->
	<family>AR PL New Sung</family> <!-- han (zh-cn,zh-tw) -->
  <!-- free, GB2312, vector, print font, serif, zh-cn -->
	<family>AR PL SungtiL GB</family>
  <!-- free, Big5, vector, print font, serif, zh-tw -->
	<family>AR PL Mingti2L Big5</family>
  <!-- free, GB2312+Big5+HKSCS, vector, print font, serif/cursive, zh-cn,zh-tw -->
 	<family>AR PL Zenkai Uni</family>
  <!-- free, JIS, vector, print/screen font, serif, ja -->
	<family>IPAMonaPMincho</family>
	<family>IPAPMincho</family>
  <!-- free, JIS, vector, print/screen font, serif, ja -->
	<family>Sazanami Mincho</family>
  <!-- free, JIS, vector, print/screen font, serif, ja -->
	<family>Kochi Mincho</family>
  <!-- free, KR, vector, print/screen font, serif, ko -->
	<family>Baekmuk Batang</family> <!-- han (ko) -->
  <!-- free, KR, vector, print/screen font, serif, ko -->
	<family>UnBatang</family> <!-- han (ko) -->

<!-- ### block 3: Hei/Gothic/Dotum print fonts ### -->
  <!-- free, GBK, vector, screen font, sans, zh-cn -->
	<family>WenQuanYi Micro Hei</family> <!-- han (zh-cn,zh-tw) -->
  <!-- free, GB18030, vector, print/screen font, sans, zh-cn -->
	<family>WenQuanYi Zen Hei</family> <!-- han (zh-cn,zh-tw) -->
  <!-- free, GB2312+Big5, vector, screen font, sans, zh-cn -->
	<family>Droid Sans Fallback</family> <!-- han (zh-cn,zh-tw) -->
  <!-- free, JIS, vector, print/screen font, sans, ja -->
	<family>VL PGothic</family>
	<family>VL Gothic</family>
  <!-- free, JIS, vector, print/screen font, sans, ja -->
	<family>IPAMonaPGothic</family>
	<family>IPAPGothic</family>
  <!-- free, JIS, vector, print/screen font, sans, ja -->
	<family>Sazanami Gothic</family>
  <!-- free, JIS, vector, print/screen font, sans, ja -->
	<family>Kochi Gothic</family>
  <!-- free, KR, vector, print/screen font, sans, ko -->
	<family>Baekmuk Dotum</family> <!-- han (ko) -->
  <!-- free, KR, vector, print/screen font, sans, ko -->
	<family>UnDotum</family> <!-- han (ko) -->
  <!-- free, JIS, vector, print/screen font, sans, ja -->
	<family>UmePlus P Gothic</family> <!-- han (ja) -->

<!-- ### block 4: non-free fonts ### -->
  <!-- nonfree, GB18030, vector, print font, serif, zh-cn -->
	<family>ZYSong18030</family> <!-- han (zh-cn,zh-tw) -->
  <!-- nonfree, GBK, vector, print font, serif, zh-cn -->
	<family>SimSun</family> <!-- han (zh-cn,zh-tw) -->
  <!-- nonfree, GBK, vector, print font, serif, zh-tw/hk -->
	<family>PMingLiu</family> <!-- han (zh-tw) -->
  <!-- nonfree, JIS, vector, print/screen font, sans, ja -->
	<family>MS Gothic</family> <!-- han (ja) -->
  <!-- nonfree, GBK, vector, print font, serif, zh-cn -->
	<family>HanyiSong</family> <!-- han (zh-cn,zh-tw) -->
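Item 5 above (moving the non-free fonts out of the main file) could be sketched as a separate file that uses <accept> rather than <prefer>, so that its families are placed after the generic family name and therefore behind anything listed with <prefer> elsewhere. The file name is hypothetical, and assigning MS Gothic to sans-serif follows its "sans" tag in the list above.

```xml
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<!-- hypothetical file name: 66-nonlatin-nonfree.conf -->
<fontconfig>
  <alias>
    <family>serif</family>
    <!-- accept: appended after "serif", so ranked below preferred families -->
    <accept>
      <family>ZYSong18030</family>
      <family>SimSun</family>
      <family>PMingLiu</family>
      <family>HanyiSong</family>
    </accept>
  </alias>
  <alias>
    <family>sans-serif</family>
    <accept>
      <family>MS Gothic</family>
    </accept>
  </alias>
</fontconfig>
```

A distribution or user that wants these fonts to win instead could switch the lists to <prefer> in a lower-numbered file; the split itself does not decide the priority question.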
Comment 4 Jens Petersen 2009-04-19 17:07:14 UTC
(In reply to comment #3)
>     B. screen (CJK bitmap) fonts > print fonts 

Should screen fonts really be bitmap?  I would prefer to separate out bitmap fonts.

> <!-- ### block 1: Screen fonts ### -->
>   <!-- free, GB18030, bitmap, screen font, sans/serif, zh-cn,zh-tw -->
>         <family>WenQuanYi Bitmap Song</family> <!-- han (zh-cn,zh-tw) -->

I personally don't think a bitmap font should be preferred for CJK.
Though it should be made easy for users to use bitmaps if they want to.

> <!-- ### block 2: Song/Micho/Batang print fonts ### -->
>   <!-- free, GB2312+Big5+HKSCS, vector, print font, serif, zh-cn,zh-tw -->
>         <family>AR PL ShanHeiSun Uni</family> <!-- han (zh-cn,zh-tw) -->
>         <family>AR PL UMing CN</family> <!-- han (zh-cn,zh-tw) -->
>         <family>AR PL New Sung</family> <!-- han (zh-cn,zh-tw) -->

Isn't UMing newer than ShanHeiSun?
Comment 5 Qianqian Fang 2009-04-19 19:30:39 UTC
(In reply to comment #4)

> Should screen fonts really be bitmap - I would prefer to separate out bitmap
> fonts.
> 
> I personally don't think a bitmap font should be preferred for CJK.
> Though it should be made easy for users to use bitmaps if they want to.
> 

The sole purpose of those bitmaps is screen use (obviously, they are not good for print). If you move them to the back, these fonts will basically never be used even when they are installed, unless a font boosts itself to the front with its own config file, as wqy-bitmap-fonts does. But that just makes the font swamp messier. Do you really want it that way?

> 
> Isn't UMing newer than ShanHeiSun?
> 

that's true.
Comment 6 Behdad Esfahbod 2009-04-22 16:52:42 UTC
I like the general direction this bug is heading.  Quick comments:

  - Time to put CJK stuff in its own file?

  - Helps immensely if you also tag each font in the comments whether it has Latin / Arabic / any non-CJK glyphs or not.

  - I think non-free fonts should have higher priority than free fonts.  If a user installs non-free fonts, chances are they want to use them.  For all other users though, it's a non-issue since they only have free fonts so the order doesn't matter.
Comment 7 Jens Petersen 2009-04-22 18:18:34 UTC
(In reply to comment #6)
>   - I think non-free fonts should have higher priority than free fonts.  If a
> user installs non-free fonts, chances are they want to use it.  For all other
> users though, it's a non-issue since they only have free fonts so the order
> doesn't matter.

How about having non-free fonts in a separate file though?

Comment 8 Behdad Esfahbod 2009-04-22 20:25:45 UTC
(In reply to comment #7)
> (In reply to comment #6)
> >   - I think non-free fonts should have higher priority than free fonts.  If a
> > user installs non-free fonts, chances are they want to use it.  For all other
> > users though, it's a non-issue since they only have free fonts so the order
> > doesn't matter.
> 
> How about having non-free fonts in a separate file though?

How would that help?
Comment 9 Jens Petersen 2009-04-22 21:31:14 UTC
> How would that help?

Well it would make clear which fonts are free and which not...
if it does not make the priority numbering more complicated.

It would also allow people to turn non-free fonts "on" and "off" from fontconfig.

(Though I agree it is not the main issue here.)
Comment 10 Qianqian Fang 2009-04-22 22:00:26 UTC
(In reply to comment #6)
> I like the general direction this bug is heading.  Quick comments:
> 
>   - Time to put CJK stuff in its own file?

that seems to be fine, something like 65-cjk.conf 

> 
>   - Helps immensely if you also tag each font in the comments whether it has
> Latin / Arabic / any non-CJK glyphs or not.

Almost all of them have (basic) Latin, but they rarely have Arabic. I will add more
comments when I get a chance this weekend.

>   - I think non-free fonts should have higher priority than free fonts.  If a
> user installs non-free fonts, chances are they want to use it.  For all other
> users though, it's a non-issue since they only have free fonts so the order
> doesn't matter.

As I said, I personally haven't heard of any official license from these
font makers to use their fonts on a Linux desktop. If anyone installs and uses
these fonts, it is very likely illegal. In other words, putting them in the
conf files simply makes unlicensed use of commercial fonts easier, and of
course, OSS font development projects will potentially lose users and feedback.

In the long run, the Linux desktop needs more high-quality CJK fonts, and these
fonts are less likely to come from the commercial font makers than from the
active OSS font projects. So, helping the commercial font makers promote their
fonts in the OSS community will eventually hurt the Linux desktop (by binding
more and more users to proprietary fonts).

Plus, the current OSS CJK fonts are really on par in quality with the
commercial ones: WenQuanYi's bitmaps are of similar quality to commercial
bitmaps, and more complete; "Droid Sans Fallback" from Google is really a
professionally developed font bought from some Chinese company. WenQuanYi Zen
Hei also performs very well on the Linux desktop and progresses every day with
user feedback. As we now have plenty of choices among OSS fonts, I don't think
making the commercial fonts usable out of the box will buy us any benefit.

If CJK needs to give default support for commercial fonts, there are tons of
commercial Latin fonts (like Arial, Helvetica ...) on the market; should we
also pre-configure fontconfig for them as well...?
Comment 11 Behdad Esfahbod 2009-04-23 14:45:00 UTC
(In reply to comment #10)
> As I said, I personally haven't heard any official licenses given from these
> font makers to use their fonts on a Linux desktop. If anyone install and use
> these fonts, it is very likely illegal. In another word, putting them in the
> conf files simply makes unlicensed use of commercial fonts easier, and of
> course, OSS font development projects will potentially lose users and feedback. 
> 
> In the long-run, Linux desktop needs more high-quality CJK fonts, and these
> fonts are less likely come from the commercial font makers, but the active OSS
> font projects. So, helping the commercial font makers to promote their fonts in
> the OSS community will eventually hurt linux desktop (by binding more and more
> users to the proprietary fonts).
> 
> Plus, the current OSS CJK fonts are really on-par in quality with the
> commercial ones: WenQuanYi's bitmaps are of similar quality to commercial
> bitmaps, and more complete; "Droid Sans Fallback" from Google is really a
> professionally developed font bought from some Chinese company. WenQuanYi Zen
> Hei also performs very well on Linux desktop and progresses everyday with users
> feedback. As we now have plenty of choices with OSS fonts, I don't think making
> the commercial fonts use out-of-box will buy us any benefit.
> 
> If CJK needs to give default support for commercial fonts, there are tons of
> commercial Latin fonts (like Arial, Helvetica ...) in the market, should we
> also pre-configue fontconfig for them as well...?

I still don't think making it unnecessarily hard for people who install fonts makes any sense.  If there was *any* advantage, sure, but so far I fail to see one.
Comment 12 Qianqian Fang 2009-04-23 15:05:43 UTC
(In reply to comment #11)
> I still don't think making it unnecessarily hard for people who install fonts
> makes any sense. 

But what you said is not consistent with the Latin font settings in fontconfig: I don't see Tahoma, Calibri, Segoe etc. in the Latin configuration files, and Arial is set to a lower priority than DejaVu/Bitstream. Why should CJK use the reverse order? Because CJK people like piracy? (joking of course)

> If there was *any* advantage, sure, but so far I fail to see one.

I thought I had made it clear:

1. Attract more users and feedback for CJK OSS font development. Since all OSS software can benefit from the release-early-release-often model, why can't OSS fonts benefit from it too?

2. Discourage unlicensed use of fonts, because it is simply wrong. If nobody respects font copyright, nobody will spend time developing fonts and making them better.
Comment 13 Behdad Esfahbod 2009-04-23 15:12:25 UTC
(In reply to comment #12)
> (In reply to comment #11)
> > I still don't think making it unnecessarily hard for people who install fonts
> > makes any sense. 
> 
> but what you said is not consistent with Latin font settings in fontconfig: I
> don't see Tahoma, Calibri, Segeo etc in the latin configuration files; Arial is
> also set to a lower priority than Dejavu/Bitstream. Why should CJK use a
> reversed support order? because CJK people like piracy? (joking of course)

If they are not mentioned, they are not.  I'm just talking about the case that they are.
 
> > If there was *any* advantage, sure, but so far I fail to see one.
> 
> I thought I said it clear:
> 
> 1. attract more users and feedback for CJK OSS font development. Since all OSS
> software can benefit from release-often-release-early model to evolve, why OSS
> fonts can not benefit from this model?
> 
> 2. discourage unlicensed use of fonts, because it is simply wrong. if nobody
> respect font copyright, nobody will spend time to develop them and make them
> better.

I don't have any strong opinion here as long as the approach taken minimizes incoming bug reports in the future :).
Comment 14 Qianqian Fang 2009-04-23 15:35:30 UTC
(In reply to comment #13)
> If they are not mentioned, they are not.  I'm just talking about the case that
> they are.

If you mean focusing on the proprietary CJK fonts that are already included, I can tell you they are all out of fashion (some of them, such as HanyiSong and ZYSong18030, have never been a popular choice at all). The new favorite commercial Chinese fonts are MS YaHei/JhengHei (with good hinting) from Windows Vista, and ST Hei/LiHeiPro from Mac OS X. These fonts can now be spotted in over 80% of the proprietary-fonts-only screenshots in Chinese Linux forum posts. So keeping the "mentioned" fonts probably won't make these users any happier.
Comment 15 Qianqian Fang 2009-04-23 16:01:18 UTC
(In reply to comment #13)
> If they are not mentioned, they are not.  I'm just talking about the case that
> they are.
> 
> I don't have any strong opinion here as long as the approach taken minimizing
> incoming bug reports in the future :).
> 

Anyway, these are just my suggestions. It might be better to get a second opinion from other CJK developers/users.

Also, I forgot to include GNU Unifont. Last year, Paul Hardy incorporated WenQuanYi's Han glyphs (WenQuanYi Unibit) into this font; the latest version of GNU Unifont now covers the entire BMP (probably the only font to do so thus far)
   http://unifoundry.com/unifont.html
Although I know most of you prefer vector fonts and disable bitmaps in fontconfig, I think it does not hurt to add it as the system fallback, in case people want more glyphs and don't mind bitmaps.

There is also a 69-unifont.conf, probably the differences between these unifonts and cjk fonts are not that big.
Comment 16 Qianqian Fang 2009-04-29 21:22:29 UTC
If fontconfig supports an additional lang-tag match when setting <prefer> list, that can make CJK settings a lot easier. Something like

  <alias>
    <test name="lang" compare="contains">
       <string>ja</string>
    </test>

    <family>sans-serif</family>
    <prefer>
      <family>DejaVu Sans</family>
      <family>Bitstream Vera Sans</family>
      <family>VL PGothic</family>
    </prefer>
  </alias>

I guess it is probably equivalent to the following (?)

 <match>
  <test name="lang" compare="contains">
   <string>ja</string>
  </test>
  <edit name="family" mode="prepend_first" binding="strong">
   <string>VL PGothic</string>
  </edit>
  <edit name="family" mode="prepend_first" binding="strong">
   <string>Bitstream Vera Sans</string>
  </edit>
  <edit name="family" mode="prepend_first" binding="strong">
   <string>DejaVu Sans</string>
  </edit>
 </match>

but I know a lot of people just don't like "strong" binding.
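The <match> form in this comment fires for every Japanese-tagged pattern, whatever family was requested. A variant that is closer to the intent of the <alias> form would also test for the generic family and insert the list in front of it. The following is only a sketch: binding="same" is chosen here to avoid the strong binding the comment worries about, and the actual effect should be checked on a real system before relying on it.

```xml
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
  <match target="pattern">
    <!-- only when the requested language includes Japanese -->
    <test name="lang" compare="contains">
      <string>ja</string>
    </test>
    <!-- and only when sans-serif is somewhere in the requested family list -->
    <test qual="any" name="family">
      <string>sans-serif</string>
    </test>
    <!-- insert the preferred families just before the matched "sans-serif" -->
    <edit name="family" mode="prepend" binding="same">
      <string>DejaVu Sans</string>
      <string>Bitstream Vera Sans</string>
      <string>VL PGothic</string>
    </edit>
  </match>
</fontconfig>
```

Because the values are given in one <edit>, they keep the listed order, which avoids the reversed ordering that repeated prepend_first edits require.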
Comment 17 MATSUU Takuto 2009-06-30 02:18:35 UTC
I suggest splitting 65-nonlatin into the following files.

- 65-nonlatin-zh (or 65-chinese)
- 65-nonlatin-ja (or 65-japanese)
- 65-nonlatin-ko (or 65-korean)
- 65-nonlatin

I'll check the Japanese font order this week.

btw, Should we also check 25-unhint-nonlatin and 40-nonlatin?
Comment 18 MATSUU Takuto 2009-06-30 02:28:42 UTC
Created attachment 27259 [details]
25-unhint-nonlatin.conf
Comment 19 Qianqian Fang 2009-07-01 19:24:30 UTC
(In reply to comment #17)
> I suggest to split 65-nonlatin to following files.
> 
> - 65-nonlatin-zh (or 65-chinese)
> - 65-nonlatin-ja (or 65-japanese)
> - 65-nonlatin-ko (or 65-korean)
> - 65-nonlatin
> 
> I'll check out Japanese fonts order in this week.
> 
> btw, Should we also check 25-unhint-nonlatin and 40-nonlatin?
> 

I have proposed a set of CJK language specific fontconfig settings at 
https://bugzilla.redhat.com/show_bug.cgi?id=499902
These files were tested with the above-mentioned 65-nonlatin file and appeared to eliminate some of the CJK font conflicts in Fedora. These language-specific files are numbered before 65.

For the unhint file, I think you may want to turn off "autohint", not "hinting". When autohint is on, hintstyle=hintmedium or hintfull will give poor rendering for CJK characters. "hintslight" may be acceptable for some users.

Also, we recently released a high-quality vector font, WenQuanYi Micro Hei (Mono), built upon Google's Droid font family. We incorporated the hinting instructions from Droid Sans into this font and achieved good screen quality for both Latin and non-Latin glyphs.

see more details about this font:
http://wenq.org/enindex.cgi?MicroHei(en)
http://wenq.org/enindex.cgi?MicroHei_BigBang_README
http://wenq.org/enindex.cgi?MicroHei_BigBang_ChangeLog
http://packages.debian.org/unstable/x11/ttf-wqy-microhei

If hinting is turned off globally, I am afraid I will have to add extra settings to enable it for this font.
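The kind of per-font override described here might look roughly like the following: a rule that matches one family and turns its embedded hinting back on while keeping the autohinter off. This is a sketch, not the settings shipped with the font; the file would need to sort after 25-unhint-nonlatin.conf so that it is applied later, and no hintstyle is set because the thread does not settle on one.

```xml
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
  <match target="font">
    <!-- applies only to this family; other CJK fonts keep the global setting -->
    <test name="family">
      <string>WenQuanYi Micro Hei</string>
    </test>
    <!-- use the hinting instructions embedded in the font -->
    <edit name="hinting" mode="assign">
      <bool>true</bool>
    </edit>
    <!-- keep the autohinter off, as recommended above for CJK glyphs -->
    <edit name="autohint" mode="assign">
      <bool>false</bool>
    </edit>
  </match>
</fontconfig>
```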
Comment 20 MATSUU Takuto 2009-07-02 09:26:14 UTC
Created attachment 27340 [details]
25-unhint-nonlatin.conf 

changed from hinting to autohint.
I tried wqy-microhei. It seems that even hintstyle=hintslight gives poor rendering for CJK.

How to check:
$ pango-view --font='WenQuanYi Micro Hei' --text='微' --waterfall
Comment 21 Qianqian Fang 2009-07-02 14:36:32 UTC
(In reply to comment #20)
> I tried wqy-microhei. It seems that even hintstyle=hintslight gives poor
> rendering for CJK.
> 
> How to check:
> $ pango-view --font='WenQuanYi Micro Hei' --text='微' --waterfall
> 

I am not surprised at all. Autohinting for CJK glyphs is not usable.
Comment 22 MATSUU Takuto 2009-07-05 00:12:02 UTC
I suggest the following Japanese font orders.

old serif order:
<family>MS Gothic</family>
<family>UmePlus P Gothic</family>
<family>Sazanami Mincho</family>
<family>IPAMonaMincho</family>
<family>IPAMincho</family>
<family>Kochi Mincho</family>
<family>AR PL ShanHeiSun Uni</family>
<family>MS 明朝</family>
 
new serif order:
<family>MS PMincho</family> <!-- proprietary -->
<family>IPAPMincho</family>
<family>IPAMonaPMincho</family>
<family>Sazanami Mincho</family>
<family>Kochi Mincho</family>

'MS Gothic' and 'UmePlus P Gothic' are sans-serif fonts, not serif.
'MS PMincho' is the same as 'MS P明朝'.
'AR PL ShanHeiSun Uni' is not a Japanese font.

old sans-serif order:
<family>MS Gothic</family>
<family>UmePlus P Gothic</family>
<family>AR PL ShanHeiSun Uni</family>
<family>VL Gothic</family>
<family>IPAMonaGothic</family>
<family>IPAGothic</family>
<family>Sazanami Gothic</family>
<family>Kochi Gothic</family>

new sans-serif order:
<family>MS PGothic</family> <!-- proprietary -->
<family>UmePlus P Gothic</family>
<family>VL PGothic</family>
<family>IPAPGothic</family>
<family>IPAMonaPGothic</family>
<family>Sazanami Gothic</family>
<family>Kochi Gothic</family>

Added 'P' to the name. P means proprietary.
Sazanami and Kochi has no proprietary fonts.

old monospace order:
<family>MS Gothic</family>
<family>UmePlus Gothic</family>
<family>VL Gothic</family>
<family>IPAMonaGothic</family>
<family>IPAGothic</family>
<family>Sazanami Gothic</family>
<family>Kochi Gothic</family>
<family>AR PL ShanHeiSun Uni</family>
<family>MS ゴシック</family>

new monospace order:
<family>MS Gothic</family> <!-- proprietary -->
<family>UmePlus Gothic</family>
<family>VL Gothic</family>
<family>IPAGothic</family>
<family>IPAMonaGothic</family>
<family>Sazanami Gothic</family>
<family>Kochi Gothic</family>

'MS Gothic' is the same as 'MS ゴシック'.

cf.
UmePlus http://www.geocities.jp/ep3797/modified_fonts_01.html
VL Gothic http://dicey.org/vlgothic/
IPAfont http://ossipedia.ipa.go.jp/ipafont/
IPA Mona font http://www.geocities.jp/ipa_mona/
sazanami font http://wiki.fdiary.net/font/?sazanami
Kochi font http://wiki.fdiary.net/font/?kochi-alternative
Comment 23 Jens Petersen 2009-07-05 17:34:21 UTC
(In reply to comment #22)
> I suggest these Japanese order.

Yep, looks reasonable enough to me. :-)

Just wondering: why list the MS fonts twice (beginning and end)?
 
Comment 24 MATSUU Takuto 2009-07-13 16:02:25 UTC
Sorry for the delay.

(In reply to comment #23)
> Just wonder: why list the MS fonts twice (beginning and end)?

I don't know why the old 65-nonlatin lists them twice. :)

(In reply to comment #22)
> Added 'P' to the name. P means proprietary.
> Sazanami and Kochi has no proprietary fonts.

P means proportional, not proprietary.

(In reply to comment #2)
> > I suggest to put Chinese fonts in front of Japanese/Korean fonts. When Pango
> > fail to determine the Chinese text (which happens when rendering Han text under
> > non-CJK locales), at least we can render the text with a consistent font
> > (despite the z-variant differences). If Pango can determine the language, then,
> > use language specific fontconfig rules to set the font order later
> 
> Sounds reasonable enough.
> 

Nak.

If Chinese fonts are put in front of Japanese/Korean fonts, Japanese/Korean text has the same issue, because each language uses a different glyph shape for the same Unicode code point. That's why I suggest splitting the C/J/K configuration files.

ex.
pango-view  --waterfall --text='与返骨直' --language=zh_CN
pango-view  --waterfall --text='与返骨直' --language=zh_TW
pango-view  --waterfall --text='与返骨直' --language=ja

cf. http://d.hatena.ne.jp/mashabow/20090514/1242292024
'Arial Unicode MS' has multiple letter shapes. It's great.
Comment 25 Qianqian Fang 2009-07-13 17:22:19 UTC
(In reply to comment #24)
> 
> Nak.
> 
> If Chinese fonts are put in front of Japanese/Korean fonts, Japanese/Korean
> text have same issue because each languages has different letter shape in same
> code of Unicode. That's why I suggest to split c/j/k configuration files.
> 
> ex.
> pango-view  --waterfall --text='与返骨直' --language=zh_CN
> pango-view  --waterfall --text='与返骨直' --language=zh_TW
> pango-view  --waterfall --text='与返骨直' --language=ja

In my opinion, 65-nonlatin is only responsible for font selection when there is no preference in specific CJK variants, such as in en_US. The only requirement is how to display characters consistently, without disturbing the readers by seeing multiple CJK fonts in a continuous text flow. In this case, the code-point coverage and font glyph consistency are the only criteria that I can think of. If the system locale is set to en_US, and only Japanese and Korean fonts were installed, I really don't know which one should be preferred (it would be better to have a Chinese font installed by default).

If any one of the CJK languages is preferred, these settings should be in a language-specific fontconfig file, which is explained in the thread of my proposal at https://bugzilla.redhat.com/show_bug.cgi?id=499902

(basically the same idea as your split CJK files, except that it maintains a default non-CJK locale font order in 65-nonlatin, and the CJK-specific files use the lang tag to set the preferences)


> 
> cf. http://d.hatena.ne.jp/mashabow/20090514/1242292024
> 'Arial Unicode MS' has multiple letter shapes. It's great.
> 

This is interesting. I really want to know which table is used for the variants. I have thought about doing this in the WenQuanYi fonts.
Comment 26 Qianqian Fang 2009-07-20 07:45:40 UTC
Created attachment 27848 [details]
proposed 65-nonlatin with updated Japanese font orders

I updated 65-nonlatin per Matsuu's suggestions. I personally would move the non-free MS fonts into a different file or remove them from the fontconfig settings, but I left them there as in Matsuu's order.

Also, I downloaded UmePlus fonts and VL Gothic, and opened them with fontconfig, the font quality does not look good to me: many of the glyphs have entirely different styles (mincho styles in a gothic font), the stroke thickness, glyph baselines and stroke cluster densities are quite uneven. I know that both fonts incorporated MPlus fonts, and MPlus is good, but it has only ~1600 Han characters.

Perhaps the new Japanese variant of DroidSansFallback would be a lot better choice in the long run?
http://android.git.kernel.org/?p=platform/frameworks/base.git;a=tree;f=data/fonts
Comment 27 Jens Petersen 2009-07-20 17:46:28 UTC
(In reply to comment #26)
> Also, I downloaded UmePlus fonts and VL Gothic, and opened them with
> fontconfig, the font quality does not look good to me: many of the glyphs have
> entirely different styles (mincho styles in a gothic font), the stroke
> thickness, glyph baselines and stroke cluster densities are quite uneven. I
> know that both fonts incorporated MPlus fonts, and MPlus is good, but it has
> only ~1600 Han characters.

You should try the IPA fonts, which were recently freed
(they are already in Fedora and other distros).
They are probably the best-quality free Japanese fonts available, AFAIK.

Matsuu-san, maybe IPA should be listed ahead of VL Gothic?
Comment 28 Qianqian Fang 2009-07-20 21:19:20 UTC
(In reply to comment #27)
> You should try the IPA fonts, which were recently freed
> (they already in fedora and other distros).
> They are probably best quality free Japanese fonts available AFAIK.
> 
> Matsuu-san, maybe IPA should be listed ahead of VL Gothic?
> 


Yes, the IPA fonts look nice to me. They have >9000 Han glyphs. The "Droid Sans Japanese" I mentioned earlier also looks very good to me (it does not have kana, though those can be imported from Droid Sans Fallback).

Do you guys want me to go ahead and update the 65-nonlatin file for this?
Comment 29 MATSUU Takuto 2009-07-21 07:48:01 UTC
(In reply to comment #27)
> Matsuu-san, maybe IPA should be listed ahead of VL Gothic?

Yes, of course. The reason I suggested that order was compatibility with the old configuration.

sans-serif:
-IPAPGothic
-UmePlus P Gothic
-VL PGothic
-IPAMonaPGothic
-Sazanami Gothic
-Kochi Gothic

monospace:
-IPAGothic
-UmePlus Gothic
-VL Gothic
-IPAMonaGothic
-Sazanami Gothic
-Kochi Gothic

I think the IPA fonts have better quality than DroidSans Japanese.
Comment 30 Qianqian Fang 2009-07-21 20:32:01 UTC
Created attachment 27897 [details]
proposed 65-nonlatin with updated Japanese font orders

ok, here is the new list.

@Behdad:

how does this look to you? do you have a plan to consider this and my other bug (https://bugzilla.redhat.com/show_bug.cgi?id=499902)?
Comment 31 Behdad Esfahbod 2009-07-22 09:13:12 UTC
Eventually I'll pick these changes up, but it takes serious reviewing time that I don't have immediately.  2.7.2 perhaps.
Comment 32 MATSUU Takuto 2009-07-22 19:16:41 UTC
Sorry, once again: I do not want WenQuanYi Micro Hei, WenQuanYi Zen Hei, and Droid Sans Fallback to be placed before the Japanese fonts.

Even in a Japanese locale, they are selected as the sans-serif font because they have 'ja' in their lang attribute.

$ LANG=ja_JP.UTF-8 fc-match sans-serif
wqy-microhei.ttc: "WenQuanYi Micro Hei" "Regular"

$ fc-match -v 'WenQuanYi Micro Hei' | grep 'family\|lang'
        family: "WenQuanYi Micro Hei"(s)
        familylang: "zh-tw"(s)
        stylelang: "en"(s)
        fullnamelang: "en"(s)
        lang: aa|ab|af|ast|ava|ay|ba|be|bg|bi|bin|br|bs|bua|ca|ce|ch|chm|co|cs|cu|cv|cy|da|de|el|en|eo|es|et|eu|fi|fj|fo|fr|fur|fy|gd|gl|gn|gv|ho|hr|hu|ia|ibo|id|ie|ik|io|is|it|ja|kaa|ki|kk|kl|ko|ku|kum|kv|ky|la|lb|lez|lt|lv|mg|mh|mk|mt|nb|nds|nl|nn|no|nr|nso|ny|oc|om|os|pl|pt|rm|ru|sah|se|sel|sh|sk|sl|sma|smj|smn|so|sq|sr|ss|st|sv|sw|tg|tk|tn|tr|ts|tt|tyv|uk|uz|vo|vot|wa|wen|wo|xh|yap|zh-cn|zh-sg|zh-tw|zu(s)

$ fc-match -v 'WenQuanYi Zen Hei' | grep 'family\|lang'
        family: "WenQuanYi Zen Hei"(s)
        familylang: "zh-tw"(s)
        stylelang: "zh-tw"(s)
        fullnamelang: "zh-tw"(s)
        lang: aa|af|ast|ava|ay|be|bg|bi|br|ca|ce|ch|co|da|de|el|en|es|eu|fj|fo|fr|fur|fy|gd|gl|gv|ho|ia|id|ie|ik|io|is|it|ja|ko|kum|lb|lez|mg|nb|nds|nl|nn|no|nr|ny|oc|om|os|pt|rm|ru|sel|sh|sm|sma|smj|so|sq|sr|ss|st|sv|sw|to|ts|vo|wa|wo|xh|yap|zh-cn|zh-hk|zh-mo|zh-sg|zh-tw|zu(s)

$ fc-match -v 'Droid Sans Fallback' | grep 'family\|lang'
        family: "Droid Sans Fallback"(s)
        familylang: "en"(s)
        stylelang: "en"(s)
        fullnamelang: "en"(s)
        lang: bg|fj|ho|ia|ie|io|ja|ko|kum|nr|om|os|ru|sel|so|ss|st|sw|ts|xh|zh-cn|zh-sg|zh-tw|zu(s)

FYI, none of the Japanese fonts have zh or ko in lang.

$ fc-match -v 'IPAPGothic' | grep 'family\|lang'
        family: "IPAPGothic"(s)
        familylang: "en"(s)
        stylelang: "en"(s)
        fullnamelang: "en"(s)
        lang: aa|ast|ay|bg|bi|br|ch|cs|da|de|en|eo|es|et|eu|fi|fj|fo|fur|fy|gd|gl|gv|ho|hu|ia|id|ie|io|is|it|ja|kum|lb|mg|nb|nds|nl|nn|no|nr|nso|oc|om|os|pl|pt|rm|ru|sel|sk|sma|smj|so|sq|ss|st|sv|sw|tn|ts|vo|vot|wa|wen|xh|yap|zu(s)

I think these Chinese fonts should drop ja and ko from their lang attribute.
Or could fontconfig drop ja and ko from the lang attribute?
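
The by-hand check above can also be scripted. The following is only a minimal Python sketch: it assumes the `lang:` value has the form printed by `fc-match -v` in this comment (codes separated by `|`, followed by a binding marker such as `(s)`). The helper name `lang_codes` is hypothetical, and the sample string is the Droid Sans Fallback line copied from the output above.

```python
def lang_codes(lang_value):
    """Split an fc-match -v 'lang:' value into a set of language codes."""
    value = lang_value.strip()
    # Drop the trailing binding marker, e.g. "(s)", if present.
    if value.endswith(")") and "(" in value:
        value = value[: value.rindex("(")]
    return {code for code in value.split("|") if code}


# 'lang:' value reported above for Droid Sans Fallback.
droid = ("bg|fj|ho|ia|ie|io|ja|ko|kum|nr|om|os|ru|sel|so|ss|st|sw|ts|xh|"
         "zh-cn|zh-sg|zh-tw|zu(s)")

codes = lang_codes(droid)
# A font "claims" a language when its code is in the lang set; this says
# nothing about whether the glyph shapes are the right regional variant.
claims = {tag: tag in codes for tag in ("ja", "ko", "zh-cn", "zh-tw")}
print(claims)
```

Run against the IPAPGothic line instead, the same check would report `ja` but neither `ko` nor any `zh-*` code, which is the asymmetry described in this comment.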
Comment 33 Qianqian Fang 2009-07-22 20:07:46 UTC
I have said this many times, but one more time, since you don't seem to get it.

65-nonlatin is not for fonts under Japanese locale!

This file is ONLY responsible for situations where no CJK language is preferred, such as en, fr, etc. If I don't put a Chinese font at the front, then when you browse a block of Chinese text you will see several fonts picked up, particularly the Japanese ones (in the past, Japanese fonts were placed at the front, but they do not cover all the Han glyphs). This has been complained about many times; Jens should know more about this.

If you are under one of the CJK locales, then install the language-specific config files I posted at https://bugzilla.redhat.com/show_bug.cgi?id=499902

Behdad and Jens, you guys are with me, right?


(In reply to comment #32)
> sorry, Once again, I want not to add WenQuanYi Micro Hei, WenQUanYi Zen Hei,
> and Droid Sans Fallback before japanese fonts.
> 
> Even in japanese locale, they are appeared as sans-serif fonts because they
> have 'ja' in lang attribute.
> 
> $ LANG=ja_JP.UTF-8 fc-match sans-serif
> wqy-microhei.ttc: "WenQuanYi Micro Hei" "Regular"
> 
> $ fc-match -v 'WenQuanYi Micro Hei' | grep 'family\|lang'
>         family: "WenQuanYi Micro Hei"(s)
>         familylang: "zh-tw"(s)
>         stylelang: "en"(s)
>         fullnamelang: "en"(s)
>         lang:
> aa|ab|af|ast|ava|ay|ba|be|bg|bi|bin|br|bs|bua|ca|ce|ch|chm|co|cs|cu|cv|cy|da|de|el|en|eo|es|et|eu|fi|fj|fo|fr|fur|fy|gd|gl|gn|gv|ho|hr|hu|ia|ibo|id|ie|ik|io|is|it|ja|kaa|ki|kk|kl|ko|ku|kum|kv|ky|la|lb|lez|lt|lv|mg|mh|mk|mt|nb|nds|nl|nn|no|nr|nso|ny|oc|om|os|pl|pt|rm|ru|sah|se|sel|sh|sk|sl|sma|smj|smn|so|sq|sr|ss|st|sv|sw|tg|tk|tn|tr|ts|tt|tyv|uk|uz|vo|vot|wa|wen|wo|xh|yap|zh-cn|zh-sg|zh-tw|zu(s)
> 
> $ fc-match -v 'WenQuanYi Zen Hei' | grep 'family\|lang'
>         family: "WenQuanYi Zen Hei"(s)
>         familylang: "zh-tw"(s)
>         stylelang: "zh-tw"(s)
>         fullnamelang: "zh-tw"(s)
>         lang:
> aa|af|ast|ava|ay|be|bg|bi|br|ca|ce|ch|co|da|de|el|en|es|eu|fj|fo|fr|fur|fy|gd|gl|gv|ho|ia|id|ie|ik|io|is|it|ja|ko|kum|lb|lez|mg|nb|nds|nl|nn|no|nr|ny|oc|om|os|pt|rm|ru|sel|sh|sm|sma|smj|so|sq|sr|ss|st|sv|sw|to|ts|vo|wa|wo|xh|yap|zh-cn|zh-hk|zh-mo|zh-sg|zh-tw|zu(s)
> 
> $ fc-match -v 'Droid Sans Fallback' | grep 'family\|lang'
>         family: "Droid Sans Fallback"(s)
>         familylang: "en"(s)
>         stylelang: "en"(s)
>         fullnamelang: "en"(s)
>         lang:
> bg|fj|ho|ia|ie|io|ja|ko|kum|nr|om|os|ru|sel|so|ss|st|sw|ts|xh|zh-cn|zh-sg|zh-tw|zu(s)
> 
> FYI, all japanese fonts doesn't have zh and ko in lang.
> 
> $ fc-match -v 'IPAPGothic' | grep 'family\|lang'
>         family: "IPAPGothic"(s)
>         familylang: "en"(s)
>         stylelang: "en"(s)
>         fullnamelang: "en"(s)
>         lang:
> aa|ast|ay|bg|bi|br|ch|cs|da|de|en|eo|es|et|eu|fi|fj|fo|fur|fy|gd|gl|gv|ho|hu|ia|id|ie|io|is|it|ja|kum|lb|mg|nb|nds|nl|nn|no|nr|nso|oc|om|os|pl|pt|rm|ru|sel|sk|sma|smj|so|sq|ss|st|sv|sw|tn|ts|vo|vot|wa|wen|xh|yap|zu(s)
> 
> I think these chinese fonts should drop ja and ko in lang attribute.
> or could fontconfig drop ja and ko from lang attribute?
> 

Comment 34 Qianqian Fang 2009-07-22 20:14:54 UTC
also, can you do me a favor, please download the Japanese config file I posted at Red Hat Bugzilla Bug 499902 and save it to your conf.d directory:

cd /etc/fonts/conf.d/
sudo wget "https://bugzilla.redhat.com/attachment.cgi?id=343146" -O 55-language_fonts_ja-jp.conf

and then try sans, serif and mono and tell me if you are bothered at all by the Chinese fonts?
Comment 35 Qianqian Fang 2009-07-22 21:10:03 UTC
Created attachment 27931 [details]
55-language_fonts_ja-jp.conf

for your convenience, I updated the language-specific fontconfig settings for Japanese and attached it here.

If you set ja as your locale, this should override the orders in 65-nonlatin.
Comment 36 Qianqian Fang 2009-07-22 21:12:02 UTC
Created attachment 27932 [details]
55-language_fonts_ja-jp.conf
Comment 37 MATSUU Takuto 2009-07-22 21:47:17 UTC
(In reply to comment #33)
> I had said this many times, but one more time since you don't seem to get it.
>
> 65-nonlatin is not for fonts under Japanese locale!

Sorry for my poor explanation. I know 65-nonlatin is not for the Japanese locale. This is a different issue.
I suggested dropping ja and ko from the lang attribute of the Chinese fonts.

For example:

$ pango-view --font=sans-serif --language=en_US --markup --text='<span lang="ja">直骨</span>'
$ pango-view --font=sans-serif --language=en_US --markup --text='<span lang="zh">直骨</span>'

Even in an en or fr locale, if you use 65-nonlatin.conf (id=27897) and both Chinese and Japanese fonts are installed, Japanese text is displayed with Chinese fonts even when it is explicitly marked as 'ja', because the Chinese fonts have 'ja' in their lang attribute.
Comment 38 Qianqian Fang 2009-07-22 21:59:53 UTC
(In reply to comment #37)
> 
> Sorry my poor explanation. I know 65-nonlatin is not for japanese locale. it's
> different issue.
> I suggested to drop ja and ko from lang attribute in chienese fonts.
> 
> For example:
> 
> $ pango-view --font=sans-serif --language=en_US --markup --text='<span
> lang="ja">直骨</span>'
> $ pango-view --font=sans-serif --language=en_US --markup --text='<span
> lang="zh">直骨</span>'
> 
> Even in en, fr locale, if you use 65-nonlatin.conf(id=27897) and both Chinese
> and japanese fonts are installed, Japanese text even specified as 'ja' is
> displayed by chinese fonts. Because Chinese fonts have 'ja' in lang attribute.
> 

I guess you did not install the 55-language_fonts_ja-jp.conf file in the attachments, did you? I ran your commands with this config file and the Japanese font comes out nicely.

The reason I asked you to install 55-language_fonts_ja-jp.conf for this case is that you are specifying lang=ja, which makes this a lang-specific case, and that is beyond the scope of 65-nonlatin.

As for the ja/ko tags in the fonts: indeed, I did not do anything about them when producing the fonts; probably fontforge set them automatically based on the code point coverage from the fc-lang .orth files from fontconfig. I think the lang tag in the fonts only means they have all the code points, not necessarily the right variant, just like any of the non-CJK lang tags in these fonts.
Comment 39 MATSUU Takuto 2009-07-23 03:01:15 UTC
Created attachment 27944 [details]
61-language_fonts_ja.conf

ok. I understand. I'll ask Japanese font creator about that. 

I created 61-language_fonts_ja.conf instead of 55-language_fonts_ja-jp.conf. How about this?
Comment 40 MATSUU Takuto 2009-07-23 06:24:51 UTC
(In reply to comment #39)

> ok. I understand. I'll ask Japanese font creator about that. 

Sigh, that was an inept comment of mine. I have to study fontconfig again.
Comment 41 Qianqian Fang 2009-07-23 06:36:27 UTC
(In reply to comment #39)
> Created an attachment (id=27944) [details]
> 61-language_fonts_ja.conf
> 
> ok. I understand. I'll ask Japanese font creator about that. 
> 
> I created 61-language_fonts_ja.conf instead of 55-language_fonts_ja-jp.conf.
> How about this?
> 

The second half of your file looks fine; it is basically the same as my second half, but cleaner. However, in my opinion, your first half is dangerous, as the <prefer> block does not test the lang tag, and thus it will override other CJK fontconfig settings (depending on evaluation order).

I think the first half is not necessary at all (it overlaps with 65-nonlatin). The second half is enough. Overall, I like the structure I originally proposed, but merged with your second half.

I remember the reason for naming it 55 was that there were other CJK font-specific config files, for either AR Uming or VL Gothic, on Fedora. However, if you can install most of these CJK font packages and test at 61, it is fine with me to name it 61 as long as it works.
Comment 42 MATSUU Takuto 2009-07-23 07:52:49 UTC
Created attachment 27951 [details]
41-language-ja.conf

I see. Fixed.
Renamed to 41, following the Arphic fonts.
http://www.freedesktop.org/wiki/Software/CJKUnifonts
Comment 43 MATSUU Takuto 2009-07-23 08:01:32 UTC
Created attachment 27952 [details]
proposed 65-nonlatin with updated Japanese font orders

@@ -42,7 +42,7 @@
 
                     <!-- free, JIS, vector, print/screen font, sans, ja -->
                        <family>IPAPMincho</family>
-                       <family>IPAPMonaMincho</family>
+                       <family>IPAMonaPMincho</family>
                     <!-- free, JIS, vector, print/screen font, sans, ja -->
                        <family>Sazanami Mincho</family>
                     <!-- free, JIS, vector, print/screen font, sans, ja -->
@@ -85,8 +85,8 @@
                        <family>SimSun</family> <!-- han (zh-cn,zh-tw) -->
                     <!-- nonfree, GBK, vector, print font, serif, zh-tw/hk -->
                        <family>PMingLiu</family> <!-- han (zh-tw) -->
-                    <!-- nonfree, JIS, vector, print/screen font, sans, ja -->
-                       <family>MS Gothic</family> <!-- han (ja) -->
+                    <!-- nonfree, JIS, vector, print/screen font, serif, ja -->
+                       <family>MS PMincho</family> <!-- han (ja) -->
                     <!-- nonfree, GBK, vector, print font, serif, zh-cn -->
                        <family>HanyiSong</family> <!-- han (zh-cn,zh-tw) -->
 
@@ -184,7 +184,7 @@
                        <family>AR PL Zenkai Uni</family>
                     <!-- free, JIS, vector, print/screen font, sans, ja -->
                        <family>IPAPMincho</family>
-                       <family>IPAPMonaMincho</family>
+                       <family>IPAMonaPMincho</family>
                     <!-- free, JIS, vector, print/screen font, sans, ja -->
                        <family>Sazanami Mincho</family>
                     <!-- free, JIS, vector, print/screen font, sans, ja -->
@@ -202,7 +202,7 @@
                     <!-- nonfree, GBK, vector, print font, serif, zh-tw/hk -->
                        <family>PMingLiu</family> <!-- han (zh-tw) -->
                     <!-- nonfree, JIS, vector, print/screen font, sans, ja -->
-                       <family>MS Gothic</family> <!-- han (ja) -->
+                       <family>MS PGothic</family> <!-- han (ja) -->
                     <!-- nonfree, GBK, vector, print font, serif, zh-cn -->
                        <family>HanyiSong</family> <!-- han (zh-cn,zh-tw) -->
 
@@ -263,6 +263,8 @@
                     <!-- free, JIS, vector, print/screen font, sans, ja -->
                        <family>IPAMincho</family>
                     <!-- free, JIS, vector, print/screen font, sans, ja -->
+                       <family>IPAMonaMincho</family>
+                    <!-- free, JIS, vector, print/screen font, sans, ja -->
                        <family>Sazanami Mincho</family>
                     <!-- free, JIS, vector, print/screen font, sans, ja -->
                        <family>Kochi Mincho</family>
Comment 44 Qianqian Fang 2009-07-23 10:16:33 UTC
Created attachment 27955 [details]
41-language-zh.conf

It looks great, thank you Matsuu. Here I have attached a version for zh-* locales, following the format of your ja file.

It would be nice if any Korean user can come up with a ko version.
Comment 45 MATSUU Takuto 2009-07-30 02:15:07 UTC
Hmm, my 41-language-ja.conf is wrong. It has no effect on the font order in fontconfig.
I'll check it again.
Comment 46 Nicolas Mailhot 2009-07-30 02:21:15 UTC
BTW, from a maintenance POV I much prefer fontconfig rules which are split on a font-by-font basis and only installed if the associated fonts are actually available on the system (as is done with dejavu, for example, where fontconfig rules are shipped with the fonts upstream). It would be very nice if CJK users reviewed the rules Fedora ships in individual packages such as droid (and then they got upstreamed somewhere).
Comment 47 Qianqian Fang 2009-07-30 06:16:25 UTC
(In reply to comment #46)
> BTW from a maintenance POW I much prefer fontconfig rules which are split on a
> font-by-font basis and only installed if the associated fonts are actually
> available on-system (as is done with dejavu, for example, where fontconfig
> rules are shipped with fonts upstream). It would be very nice if CJK users
> reviewed the rules Fedora ships in individual packages such as droid (and then
> they got upstreamed somewhere)
> 

IMHO, putting them together does not conflict with the "install-and-activate" idea.

The conflict with per-font settings arises when several upstreams want their fonts to be the default (such as Japanese/Chinese fonts overriding each other). If there is a centralized place to set the order, this may encourage them to communicate before chaos happens. Of course, nothing can prevent them from keeping their own rules in the per-font config.
Comment 48 Hideki Yamane 2009-11-21 04:06:17 UTC
Hi,

 I looked at the proposed fix for 65-nonlatin.conf and have some comments:

 * I think it is better not to include Japanese Gothic fonts in <serif>, because 
   they are "sans-serif" fonts.

 * Likewise, do not include "P" (proportional) fonts in <monospace>.
   monospace means "fixed-width", i.e. a "non-proportional" font.

 * I am curious to know the reason for the setting below. Why are non-monospace fonts 
   in monospace?

>		<family>monospace</family>
>		<prefer>
(snip)
>                  <!-- ### CJK block 3: Hei/Gothic/Dotum non-monospace print fonts ### -->
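
The first two points above could be checked mechanically. This is a minimal Python sketch under a stated assumption: it relies only on the naming convention used in this thread, where "Gothic" marks a sans-serif face and a "P" directly before Gothic/Mincho marks a proportional one. The function name `misplaced` is hypothetical, and a name-based heuristic like this cannot replace looking at the fonts themselves.

```python
import re

# Heuristic from the names used in this thread: a "P" directly before
# Gothic/Mincho (e.g. "MS PGothic", "UmePlus P Gothic") marks a proportional font.
PROPORTIONAL = re.compile(r"P\s?(Gothic|Mincho)")


def misplaced(generic, families):
    """Return the families that look out of place under the given generic name."""
    if generic == "serif":
        # Gothic faces are sans-serif, so they do not belong in a serif list.
        return [f for f in families if "Gothic" in f]
    if generic == "monospace":
        # Proportional faces are not fixed-width.
        return [f for f in families if PROPORTIONAL.search(f)]
    return []


# Old serif order quoted in comment 22.
old_serif = ["MS Gothic", "UmePlus P Gothic", "Sazanami Mincho", "IPAMonaMincho",
             "IPAMincho", "Kochi Mincho", "AR PL ShanHeiSun Uni", "MS 明朝"]
print(misplaced("serif", old_serif))
```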
Comment 49 Hideki Yamane 2009-11-21 04:10:40 UTC
 About 41-language-*.conf.
 I think it should contain only "family -> generic" data, 
 see /etc/fonts/conf.d/README file.

----------------------------------------------------------------------------
 Files begining with:   Contain:
 
 00 through 09          Font directories
 10 through 19          system rendering defaults (AA, etc)
 20 through 29          font rendering options
 30 through 39          family substitution
 40 through 49          generic identification, map family->generic
 50 through 59          alternate config file loading
 60 through 69          generic aliases, map generic->family
 70 through 79          select font (adjust which fonts are available)
 80 through 89          match target="scan" (modify scanned patterns)
 90 through 99          font synthesis

----------------------------------------------------------------------------
 and 40-nonlatin.conf should be cleaned up, too.
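
As a quick cross-check of the file names proposed in this bug against that table, here is a minimal Python sketch. The ranges and descriptions are copied from the README excerpt above; the function name `expected_content` is hypothetical.

```python
import re

# Prefix ranges and descriptions as listed in the conf.d README quoted above.
PREFIX_RANGES = [
    (0, 9, "Font directories"),
    (10, 19, "system rendering defaults (AA, etc)"),
    (20, 29, "font rendering options"),
    (30, 39, "family substitution"),
    (40, 49, "generic identification, map family->generic"),
    (50, 59, "alternate config file loading"),
    (60, 69, "generic aliases, map generic->family"),
    (70, 79, "select font (adjust which fonts are available)"),
    (80, 89, 'match target="scan" (modify scanned patterns)'),
    (90, 99, "font synthesis"),
]


def expected_content(filename):
    """Return the README category for a conf.d file name, or None without an NN- prefix."""
    match = re.match(r"(\d\d)-", filename)
    if match is None:
        return None
    prefix = int(match.group(1))
    for low, high, description in PREFIX_RANGES:
        if low <= prefix <= high:
            return description
    return None


for name in ("41-language-ja.conf", "55-language_fonts_ja-jp.conf",
             "61-language_fonts_ja.conf", "65-nonlatin.conf"):
    print(name, "->", expected_content(name))
```

By this table, a file that sets a preferred font order (generic->family) would sit in the 60 to 69 range rather than at 41 or 55.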
Comment 50 Nicolas Mailhot 2009-11-21 04:41:55 UTC
(In reply to comment #47)
> (In reply to comment #46)
> > BTW from a maintenance POW I much prefer fontconfig rules which are split on a
> > font-by-font basis and only installed if the associated fonts are actually
> > available on-system 

> IMHO, putting them together does not conflict with the "install-and-activate"
> idea.
> 
> The source of conflict for per-font setting comes from when several upstreams
> want there fonts to be default (such as Japanese/Chinese font overwriting). If
> there is a centralized place to set the order, this may encourage them to
> communicate before chaos happens.

It does not work out this way in practice. The #1 source of conflict is some font packager making assumptions about other fonts with insufficient info, and having a centralized place does not stop him from making those assumptions (and CJK users are terrible about it: they're so frustrated they'll write any random crap without consideration for other users as long as it works on their system). On the contrary, it puts those assumptions in the common file, making sure to break everyone else's config, even if they did not install this package.

After a few years of looking after Fedora font packages I've never seen a centralized config file that was correct, it always reflected the bias of the last person to edit it, and it always got in the way of good packagers that tried to isolate their config without stepping on the toes of other packagers.

IMHO the only way to bring some sanity to fontconfig is to define simple, clear rules and get everyone to follow them (because fontconfig syntax is flexible enough that it is really easy to shoot oneself in the foot, and no, using a single centralised file does not stop that).

Such rules (again, IMHO) should be:

1. every font package includes the rules associated to its packaged font (no delegation to another package that *will* get out of sync and *will* be maintained by someone that has not got the fine knowledge about the font the actual font packager has)

2. you are forbidden to change in any way the priority of a font not included in your own package (at most you can declare your font is a valid substitute to another font)

3. you are only allowed to use reviewed fontconfig patterns in your fontconfig file (no "the rules everyone else uses are insufficient, I'll invent my own ones, it works on my computer with my set of fonts, what can go wrong?")

4. the only way fonts are prioritized is the prefix of each fontconfig file, using documented rules (that means it's easy for distros to review their font setup, and change the prefix of one fontconfig file if it's wrong for them)
Comment 51 Nicolas Mailhot 2009-11-21 04:44:42 UTC
(In reply to comment #49)
>  About 41-language-*.conf.
>  I think it should contain only "family -> generic" data, 
>  see /etc/fonts/conf.d/README file.

There is another informal rule I've documented in 
 
http://git.fedorahosted.org/git/fontpackages.git?p=fontpackages.git;a=blob;f=fontconfig-templates/fontconfig-priorities.txt

non-lgc fonts are forbidden below 65 (that limits a little the harm cjk fontconfig rules writers can do)
Comment 52 Hideki Yamane 2009-11-21 07:06:55 UTC
> non-lgc fonts are forbidden below 65 (that limits a little the harm cjk
> fontconfig rules writers can do)

Anyway, "family -> generic" mapping would not cause any problem if it is below 65 :)

Comment 53 Qianqian Fang 2009-11-21 07:34:24 UTC
bugzilla-daemon@freedesktop.org wrote:
> I does not work out this way in practice. The source #1 source of conflict if
> some font packager making assumptions about other fonts with insufficient info,
> and having a centralized place does not stop him from making those assumptions
> (and CJK users are terrible about it: they're so frustrated they'll write any
> random crap without consideration for other users as long as it works on their
> system). On the contrary it means it puts those assumptions in the common file,
> making sure to break everyone else's config, even if they did not install this
> package.
>   

I don't see the connection between advocating a non-centralized
config structure and preventing packagers from making assumptions.
Putting rules inside per-font configs can be equally, or even more,
vulnerable to such personal bias. A centralized config file, managed
and reviewed by experienced upstream maintainers, as advocated
in your later points, is exactly the reason I was trying so hard
to push this bug.

Don't make things sound more complicated than they are. The submitted files are simply
1. an updated 65-nonlatin.conf (which is already in fontconfig), and
2. several CJK-specific rules with additional matching of the language tag

I don't think this is so different from the original fontconfig
philosophy.

> After a few years of looking after Fedora font packages I've never seen a
> centralized config file that was correct, it always reflected the bias of the
> last person to edit it, and it always got in the way of good packagers that
> tried to isolate their config without stepping on the toes of other packagers.
>   

I do respect your work and principles to protect the fontconfig
structures for Fedora, but I disagree with your statement. I think
Ubuntu is doing a better job for CJK users with fontconfig. The
packages language-selector and fontconfig-voodoo are automatically
installed/executed for the selected CJK languages, and I believe
they are working just fine. In the many Chinese Linux forums that I
participate in, I can see a dominant majority of users picking
up Ubuntu simply because of its CJK friendliness. I think
that's what Fedora should learn from.

The proposed language-specific config files are simply improved
versions of Ubuntu's language-selector files (with additional lang
tag matching, so that it eliminates the needs for fontconfig-voodoo).

> IMHO the only way to bring some sanity to fontconfig is to define simple clear
> rules, and get everyone to follow them (because fontconfig syntax is flexible
> enough it us really easy to shoot oneself in the foot, and no using a single
> centralised file does not stop that).
>
> Such rules (again, IMHO) should be:
>
> 1. every font package includes the rules associated to its packaged font (no
> delegation to another package that *will* get out of sync and *will* be
> maintained by someone that has not got the fine knowledge about the font the
> actual font packager has)
>
> 2. you are forbidden to change in any way the priority of a font not included
> in your own package (at most you can declare your font is a valid substitute to
> another font)
>
> 3. you are only allowed to use reviewed fontconfig patterns in your fontconfig
> file (no "the rules everyone else uses are insufficient, I'll invent my own
> ones, it works on my computer with my set of fonts, what can go wrong?)
>
> 4. the only way fonts are priorized is the prefix of each fontconfig file,
> using documented rules (that means it's easy for distros to review their font
> setup, and change the prefix of one fontconfig file if it's wrong for them)
>   

Again, as I said in the beginning, the whole point of this bug
is to propose a common set of CJK rules that are managed and
reviewed by experienced users, so that individual font packages
have no need to override them with their own rules.

I don't know if you see a cycle here: the reason individual
font packagers write their own rules is that fontconfig is shipping
a 10-year-old (or something like that) default font order
in 65-nonlatin, and it is extremely outdated; and now your
reason for not making this update is that you are against
font packagers making their own rules. Your rule #2 is
not going to work unless fontconfig is updated to reflect
these requests.
 
Comment 54 Qianqian Fang 2009-11-21 07:36:49 UTC
bugzilla-daemon@freedesktop.org wrote:
> There is another informal rule I've documented in 
>
> http://git.fedorahosted.org/git/fontpackages.git?p=fontpackages.git;a=blob;f=fontconfig-templates/fontconfig-priorities.txt
>
> non-lgc fonts are forbidden below 65 (that limits a little the harm cjk
> fontconfig rules writers can do)
>   

With the additional lang tag matching in the CJK specific config
files, I don't think it can do any harm at all.

Please advise me which number I should use if I want to override
the font order in 65-nonlatin.conf based on your informal rules.
Comment 55 Jens Petersen 2009-11-29 16:41:34 UTC
> I think Ubuntu is doing a better job for CJK users with fontconfig.
> The packages language-selector and fontconfig-voodoo are automatically
> installed/executed for the selected CJK languages, and I believe
> they are working just fine.

Sorry, Qianqian, I have to take some issue with this.
Ubuntu's approach may be ok for monolingual users,
but to me it is not acceptable forcing say Japanese webpages
to be rendered in Chinese fonts or vice-versa.

> From many Chinese Linux forums that I
> participated, I can see a dominant majority of users are picking
> up Ubuntu simply because of its CJK friendliness. I think
> that's what Fedora should learn.

I am not saying that Fedora's CJK fontconfig is perfect
but I feel our overall approach is more correct and consistent
though there are still some specific issues and cases that need
to be untangled and resolved.
Comment 56 Qianqian Fang 2009-11-29 17:34:30 UTC
--- Comment #55 from Jens Petersen <petersen@redhat.com> 2009-11-29 
16:41:34 PST ---
> Sorry, Qianqian, I have to take some issue with this.
> Ubuntu's approach may be ok for monolingual users,
> but to me it is not acceptable forcing say Japanese webpages
> to be rendered in Chinese fonts or vice-versa.
>   

I don't think so. Why would Ubuntu's approach force the use of
the wrong language? I don't understand.

> I am not saying that Fedora's CJK fontconfig is perfect
> but I feel our overall approach is more correct and consistent
> though there are still some specific issues and cases that need
> to be untangled and resolved.
>   

What was proposed in this bug tracker is merely an
updated and extended version of the original 65-nonlatin.
This is the piece common to all distros. So this update
really doesn't touch the distro-specific fontconfig
management.
Comment 57 Jens Petersen 2009-11-30 00:22:05 UTC
Created attachment 31573 [details]
screenshot of zh wikipedia in a japanese firefox on karmic

> I don't think so. Why would Ubuntu's approach force to
> use the wrong language? I don't understand.

Because they override CJK for all langs?
Well don't take my word for it: you can test yourself. :)

I should really get round to filing a bug...
Comment 58 Hideki Yamane 2009-11-30 06:34:08 UTC
Created attachment 31592 [details]
another proposed 65-nonlatin.conf

works with 66-language-*.conf
Comment 59 Hideki Yamane 2009-11-30 06:35:56 UTC
Created attachment 31593 [details]
another proposed configuration file for Japanese language

separated from 65-nonlatin.conf, update for Japanese language.
Comment 60 Hideki Yamane 2009-11-30 06:37:30 UTC
Created attachment 31594 [details]
another proposed configuration file for Chinese language

proposed configuration file for Chinese language, same as previous
66-language-ja_jp.conf.
Comment 61 Hideki Yamane 2009-11-30 06:45:29 UTC
Created attachment 31595 [details]
screenshot of zh.wikipedia in Japanese Firefox on karmic (with another proposed conffiles)

Hi,

I'm now trying to adjust the fontconfig files for CJK, and I propose another configuration.
Here is a screenshot showing zh.wikipedia.org in Firefox on Ubuntu 9.10 in a Japanese environment.


The modifications are:

- replace /etc/fonts/conf.d/65-nonlatin.conf
- add /etc/fonts/conf.d/66-language-{ja_jp,zh_cn}.conf


How about this?
Comment 62 Qianqian Fang 2009-11-30 06:58:36 UTC
(In reply to comment #57)
> Created an attachment (id=31573) [details]
> screenshot of zh wikipedia in a japanese firefox on karmic
> 
> > I don't think so. Why would Ubuntu's approach force to
> > use the wrong language? I don't understand.
> 
> Because they override CJK for all langs?
> Well don't take my word for it: you can test yourself. :)
> 
> I should really get round to filing a bug...
> 

I do agree that this is a problem with Ubuntu's language-selector: it can make only one language the preferred one. This will be a problem when using the ja locale and browsing a simplified Chinese page, as in your attachment.

However, the proposed files eliminate this problem: instead of a single language being activated at a time, all CJK settings are activated, each matched by its respective lang tag. In this sense, this is a better solution than language-selector.
Comment 63 Qianqian Fang 2009-11-30 07:04:53 UTC
(In reply to comment #61)
> Hi,
> 
> Now I'm trying to adjust fontconfig files with CJK, propose another
> configuration.
> Here is a screenshot, show zh.wikipedia.org with Firefox on Ubuntu 9.10 in
> Japanese environment.
> 
> Some modification as
> - replace /etc/fonts/conf.d/65-nonlatin.conf
> - add /etc/fonts/conf.d/66-language-{ja_jp,zh_cn}.conf
> How about this?
> 

Thanks for testing this. There are two places I don't quite understand:

1. You eliminated the alias/default sections; do you want to explain why?
2. You renamed the file numbers from 41 to 66; I think that will not allow these files to override 65-nonlatin. In other words, if you remove 66-*, you will see the same effect. Can you confirm this?
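
For reference on point 2: fontconfig processes the numbered files in conf.d in sorted order of their names, so the numeric prefix decides which rules are read first. Below is a minimal sketch of that ordering, assuming plain string sorting of the file names mentioned in this bug. It models only the load order, not how the rules inside each file interact.

```python
# Sketch only: models the load order of conf.d files, not fontconfig's matching.
conf_files = [
    "66-language-ja_jp.conf",
    "65-nonlatin.conf",
    "41-language-zh.conf",
    "66-language-zh_cn.conf",
    "41-language-ja.conf",
]

# Names are compared as plain strings, so "41-..." sorts before "65-..." before "66-...".
load_order = sorted(conf_files)

for position, name in enumerate(load_order, start=1):
    print(position, name)

# Files read before 65-nonlatin.conf versus after it.
before = [n for n in load_order if n < "65-nonlatin.conf"]
after = [n for n in load_order if n > "65-nonlatin.conf"]
```

Whether a later file can still change the result depends on the edit mode used inside it, which this sketch does not cover.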

Comment 64 Qianqian Fang 2009-11-30 07:08:22 UTC
the attachment "another proposed 65-nonlatin.conf" doesn't look right. No Chinese fonts in it. This will cause issues when browsing CJK pages under non-CJK locales, such as en_US. 

Please use the last version uploaded by MATSUU Takuto for your testing.
Comment 65 Baybal 2009-11-30 08:01:03 UTC
You seem to be making a lot of progress here. As someone who once struggled with a freetype patchset of 10+ patches on Gentoo Linux, with the sole purpose of getting decent vector fonts, I have some insight into the current state of CJK font rendering.

1. There is no way to get even decent Hangul rendering, let alone traditional Han characters, at dpi <= 90 without filtering/post-processing. Such filtering is ubiquitous in the commercial products available (the Japanese Xerox product set and its numerous offspring and ripoffs, the software used in the Korean big 3's products, etc.), but the process as a whole is apparently covered by 3 or more overlapping patents in the US (i.e. the famous Adobe patent specifies nothing more than "improving the resulting image by means of advanced mathematical processing", without even specifying any mathematical concept behind it). Despite this, most Linux distros and BSDs use various filtering patchsets by default. I mention this because most fonts with advanced features don't look even half as good without filtering, and vice versa. Please see attachments 1 and 2 to see how the microhei and ipamona fonts look with freetype's bundled, but disabled by default, filtering enabled.

2. Since most distributions' builds of freetype turn on both the bundled filtering and the bci anyway, both fontconfig rule writers and open font developers should take this into account when drawing fonts and writing config files for them. Seen this way, the Mona modification of the IPA fonts, which has some spacing and hinting hacks, is a drop-in replacement for MS Gothic with quirk-less hinting, in the same way that Microhei outperforms even the latest bundled Microsoft and Apple fonts. So Qianqian and Matsuu, please install at least an fc version with the bundled filter enabled and look at the fonts again before rating them.

3. There is still some work to do, such as having a way to artificially lower the weight of the IPA Mona fonts in fc >= 2.7: since version 2.7, setting the weight no longer works for those fonts (bug?). The same goes for the ZenHei fonts, which have different weights on different fontconfig versions (same bug?).

4. I also propose a separate config file layout for CJK, but with one file like 65-cjk-common for global things such as CJK-specific workarounds, the Unifont fallback for the worst cases, and so on. The 65-jp, 65-ko, 65-zh-tw, 65-zh-hk, etc. files would be left to specify font family aliases and family preferences only, with 65-noncjk-nonlatin for everything else until the time comes for those scripts to move the same way.

5. I also strongly support moving commercial fonts to something like 65-cjk-nonfree.

6. On bitmaps, I think there could be a final solution if there were per-font aliases that treat vector fonts and their embedded bitmaps as separate entities. This would especially help in cases where the embedded bitmaps and the set of Bezier-curve-based glyphs do not fully overlap. As an argument, I would mention that on ultra-low-dpi devices, like the sub-90-dpi panels now flooding the low-end sector, and on the 210-240 dpi high-end panels found in Sony and other brands on the Japanese domestic market, you have to use CJK either strictly in bitmaps or strictly in vector fonts, depending on the appliance. Leaving end users the fun of manually rewriting aliases every time one or the other is needed doesn't seem humane. Does anybody support this solution? As fontconfig has always dealt with "synthetic" things, I think it wouldn't be a big deal to implement this as a simple alias.

In conclusion: I strongly support Qianqian in his will to kill off things that are 10 or more years outdated. The whole concept of fc was to make everything work out of the box, wasn't it? This works now for European languages; the same should be true for CJK and every other script. Why don't you guys schedule a brainstorm on IRC before my move to Canada next month, so I could consult? People familiar with the problem used to hang out on the ##fonts IRC channel on irc.freenode.net, and so will I next week, so come any time and we will try to get in sync for a brainstorm. My timezone is UTC+10, but I'm usually online during the evening/night.

P.S. Explanation for non-CJK-savvy devs:
1. The proposed layout is: 65-cjk-global with settings like a per-font global advance adjuster, spacing, and setting Unifont or the like as the global fallback; the others are 65-cjk-$LANG, 65-noncjk-nonlatin, and 95-cjk-synthetic-$LANG-faces/fonts for making bitmaps separate entities.
2. The IPA Mona font is the same IPA font with more upstream fixes and anti-quirk hacks, to make the IPA fonts run fine within the free software ecosystem and to keep Japanese users comfortable with things they are used to thinking of as normal, like spacing.
3. The IPA fonts themselves were built by the Japanese government and are probably still the standard for how a Japanese font should look in general.
4. There are HKSCS fonts built by Hong Kong, which are generally for printing but give good insight into the concept behind a modern traditional Han font.
5. The thinking behind the original 65-nonlatin was probably based on the fact that none of the fonts back then fully covered the CJK extensions, and the issue was just to ensure that extension glyphs were displayed at all.
Comment 66 Baybal 2009-11-30 08:07:14 UTC
Created attachment 31598 [details]
Filtered Microhei

The non-filtered version is simply inferior to the output of any filtering patchset available.
Comment 67 Baybal 2009-11-30 08:08:43 UTC
Created attachment 31599 [details]
IPAMonafont

The IPA Mona font in its full gothiqueness
Comment 68 Baybal 2009-11-30 08:11:42 UTC
Comment on attachment 31599 [details]
IPAMonafont

Hmm Pango didn't set font to IPAMonaGothic as I requested.
Comment 69 Baybal 2009-11-30 08:53:48 UTC
And the main point I left out: we are still getting a cocktail of fonts when reading simplified Chinese.
Comment 70 Hideki Yamane 2009-11-30 14:15:38 UTC
> Comment #63 from Qianqian Fang <fangqq@gmail.com>  2009-11-30 07:04:53 PST 

> 1. you eliminated the alias/default sections, do you want to explain why?

 I intend to separate it as 41-fonts-{zh_cn,ja_jp}.conf
 I'll attach it.
 

> 2. you renamed the file numbers from 41 to 66, I think that will not allow
> these files to override 65-nonlatin. In another word, if you remove 66-*, you
> will see the same effect, can you confirm on this?

 I think "put it below 65" means "do not harm non-CJK users", but
 rules that match a specific "lang" would not cause any harm, IMO.


> --- Comment #64 from Qianqian Fang <fangqq@gmail.com>  2009-11-30 07:08:22 PST ---
> the attachment "another proposed 65-nonlatin.conf" doesn't look right. No
> Chinese fonts in it. This will cause issues when browsing CJK pages under
> non-CJK locales, such as en_US. 

 I think it is not so bad, see 
 http://picasaweb.google.com/henrich/IceweaselOnDebianUnstableEn_USLocale
 browsing some web pages with en_US locale.

 Could you tell me which case or URL would be harmful in en_US locale with
 "another proposed 65-nonlatin.conf"? I'll check it.
Comment 71 Hideki Yamane 2009-11-30 14:17:40 UTC
Created attachment 31610 [details]
separate Japanese family -> generic config
Comment 72 Nicolas Mailhot 2009-11-30 14:56:12 UTC
If 65-nonlatin.conf is to be changed, please do it properly. Do not mix info about different fonts in the same file, so that users and distros can change one font's priority just by renaming a link (I still think those files should be provided by the font packages themselves, but at least use a model any font packager can emulate for their package).

1. sort fonts by fontconfig generic and target lang

2. for each generic, and each target lang, create a series of files named

65-<X>-<lang>-<fontname>.conf

Where X is a two-digit number prefix, lang is "zh",  "ja" or whatever, and "fontname" is the name of the font being described. Judicious use of X will allow you to order fonts within the 65 level

Make each of those files contain the following (fontconfig forces you to use contains to specify multiple langs, it is not safe, contains foo can match foo- or -foo, kittens will be hurt someday)

<match>
  <test name="lang" compare="contains">
    <string>[lang]</string>
  </test>
  <test name="family">
    <string>[generic]</string>
  </test>
  <edit name="family" mode="prepend" binding="same">
    <string>[font name]</string>
  </edit>
</match>
<alias>
  <family>[font name]</family>
  <default>
    <family>[generic]</family>
  </default>
</alias>

Distros will be able to override this ordering by renaming symlinks or dropping files at 64-XX
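
As an illustration of the per-font layout described above, here is a minimal Python sketch that fills in the template and derives a file name of the form 65-<X>-<lang>-<fontname>.conf. The helper names (`conf_name`, `conf_body`), the slug rule for the font name, and the example values are assumptions made for this sketch, not part of the proposal; the `<fontconfig>` root element and DOCTYPE are added so the output is a complete file.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# The <match>/<alias> body follows the template in this comment.
TEMPLATE = """<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
  <match>
    <test name="lang" compare="contains">
      <string>{lang}</string>
    </test>
    <test name="family">
      <string>{generic}</string>
    </test>
    <edit name="family" mode="prepend" binding="same">
      <string>{font}</string>
    </edit>
  </match>
  <alias>
    <family>{font}</family>
    <default>
      <family>{generic}</family>
    </default>
  </alias>
</fontconfig>
"""


def conf_name(order, lang, font):
    """Build 65-<X>-<lang>-<fontname>.conf (hypothetical slug rule: lowercase, dashes)."""
    slug = font.lower().replace(" ", "-")
    return f"65-{order:02d}-{lang}-{slug}.conf"


def conf_body(lang, generic, font):
    """Fill in the template, escaping XML special characters in each value."""
    return TEMPLATE.format(
        lang=escape(lang), generic=escape(generic), font=escape(font)
    )


# Example values are illustrative only.
name = conf_name(10, "zh", "WenQuanYi Zen Hei")
body = conf_body("zh", "sans-serif", "WenQuanYi Zen Hei")
ET.fromstring(body)  # raises ParseError if the result is not well-formed XML
print(name)
```

With one file per font, changing X in the file name (or renaming the symlink) reorders that font without touching any other file.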

And *please* stop messing with the system default fonts, cjk files have no business deciding if the distro wants dejavu, or liberation, or vera first for latin glyphs

(I realise this is not completely satisfying, but I don't see how we can make it work reliably without being able to blacklist non-<lang> content of this kind of font)
Comment 73 Qianqian Fang 2009-11-30 17:29:56 UTC
Here I am attaching a test suite for the conf files submitted in this bug.
I hope everyone who wants to help with this bug will do the following:

1. follow the README.txt and see if you can reproduce what I got
   in this album:
   http://picasaweb.google.com/fangqq/Fontconfig_65nonlatin_test#

2. if you want to help to improve or revise these files, please do it
   with regression tests as outlined in the README.txt and make sure it
   does not break anything that is already working

3. justify each change you made to the 65-nonlatin, 41-language-zh and
    41-language-ja files (patches are more welcome than theoretical
    criticism)

From my tests, I see all CJK fonts rendered consistently under
all en/zh/ja locales.


Here is a copy of the README.txt file:


==  1. What is in this package  ==

The attached conf.d folder contains the conf.d folder
in the latest fontconfig 2.8.0 release with the following
modifications:

  1. the 65-nonlatin.conf was updated by attachment
     http://bugs.freedesktop.org/attachment.cgi?id=27952
     (2009-07-23 08:01 PST, MATSUU Takuto)
  2. two new files:
       41-language-zh.conf from attachment
       http://bugs.freedesktop.org/attachment.cgi?id=27955
       (2009-07-23 10:16 PST, Qianqian Fang)
     and
       41-language-ja.conf from attachment
       http://bugs.freedesktop.org/attachment.cgi?id=27951
       (2009-07-23 07:52 PST, MATSUU Takuto)

nothing else is touched.


==  2. How to test  ==

2.1 Install the necessary CJK fonts

For debian/ubuntu (>=9.10):
   apt-get install ttf-wqy-zenhei ttf-wqy-microhei xfonts-wqy \
    ttf-arphic-uming ttf-arphic-ukai ttf-vlgothic ttf-unfonts

For fedora/redhat (>=F11):
   install wqy-bitmap-fonts wqy-zenhei-fonts cjkuni-uming-fonts
           cjkuni-ukai-fonts vlgothic-fonts vlgothic-p-fonts
           sazanami-mincho-fonts sazanami-gothic-fonts ipa-gothic-fonts
           ipa-mincho-fonts ipa-pgothic-fonts ipa-pmincho-fonts
           and also some version of un-fonts


For Fedora, if you want to install wqy-microhei, you can
download it from
http://sourceforge.net/projects/wqy/files/wqy-microhei/0.2.0-beta/

extract it and put the wqy-microhei.ttc to ~/.fonts folder.


2.2 Set up the new conf.d folder

You need to temporarily replace the current conf.d folder
by what's included in this package. Something like this

sudo mv /etc/fonts/conf.d /etc/fonts/conf.old
sudo mv /path/to/this/folder/conf.d /etc/fonts/conf.d
sudo mv ~/.fonts.conf ~/.fonts.conf_old
sudo fc-cache -fv

Please install the fonts first before you
switch the folder; otherwise, some font installation
may alter or add new files to the conf.d directory.


2.3 Run the test

The goal of the test is to show
1. whether the CJK rules have any side effects on non-CJK locales
2. whether users can read consistent CJK text rendering under non-CJK locales
3. whether users can read consistent CJK text under the respective CJK locales

To do this, you need to fill the following matrix:

    bitmap status         webpage language   desktop language/locale
----------------------------------------------------------------------
 {no_bitmap,with_bitmap} - {en,zh,ja_page} - {en_US, zh_CN, ja_JP}

In other words, you need to log in to your account with en/zh/ja
as your desktop language, and then browse an English-only page,
a simplified Chinese page, and a Japanese page in your browser.

"no_bitmap" means vector-priority settings. This means
you need to remove "wqy bitmap song" from the list.
For testing purposes, this can be done by moving the font files
(locate wenquanyi_ | grep .pcf ) to /tmp/

You should also pay attention to the monospaced font used in
your terminal software, and also the desktop toolbars and menus.


The suggested test pages include:

Chinese : http://wenq.org/?WQYTest
Japanese: http://news.google.com/news?ned=jp
English : http://www.google.com/support/news/bin/answer.py?answer=40237&topic=8851&hl=en



==  3. What should you expect  ==

1. English (non-CJK) page fonts will not be affected at all
2. In non-CJK locales, when viewing CJK pages, simplified Chinese
   fonts will be used to render all the text
3. In ja_JP locale, Japanese fonts will be used for all text rendering
4. In zh_CN locale, Chinese fonts will be used for all text rendering


==  4. Clean up script after the test ==

sudo mv /etc/fonts/conf.d /etc/fonts/conf.new
sudo mv /etc/fonts/conf.old /etc/fonts/conf.d
sudo mv ~/.fonts.conf_old ~/.fonts.conf
sudo fc-cache -fv


If everything looks OK, then you can delete the
folder /etc/fonts/conf.new
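
The test matrix in section 2.3 of the README above can be enumerated mechanically so that no combination is skipped. This is a minimal sketch; the label spellings (`en_page`, `zh_page`, `ja_page`) are one reading of the matrix row, not names used by the test suite.

```python
from itertools import product

# Axes of the matrix from section 2.3; the labels are illustrative.
bitmap_status = ["no_bitmap", "with_bitmap"]
page_language = ["en_page", "zh_page", "ja_page"]
desktop_locale = ["en_US", "zh_CN", "ja_JP"]

# Every (bitmap status, page language, desktop locale) combination to check.
cases = list(product(bitmap_status, page_language, desktop_locale))

for bitmap, page, locale in cases:
    print(f"{bitmap:12} {page:8} {locale}")
print(len(cases), "combinations")
```

That is 2 x 3 x 3 = 18 screenshots for a full run.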
Comment 74 Qianqian Fang 2009-11-30 17:31:44 UTC
bugzilla-daemon@freedesktop.org wrote:
>  I think it does not so bad, see 
>  http://picasaweb.google.com/henrich/IceweaselOnDebianUnstableEn_USLocale
>  browsing some web pages with en_US locale.
>
>  Could you tell me which case or URL would be harmful in en_US locale with
>  "another proposed 65-nonlatin.conf"? I'll check it.
The two WQY test pages do not look right, the URLs are here:

http://picasaweb.google.com/henrich/IceweaselOnDebianUnstableEn_USLocale#5410005185438174370
and
http://picasaweb.google.com/henrich/IceweaselOnDebianUnstableEn_USLocale#5410005197491967026

Recognized issues:

1. in both pages, the default font is rendered by a Japanese sans font
   (see "偏" in "使用偏好")
2. the serif block in the first page failed to pick a Chinese serif font
   (i.e. uming)
3. in the second page, in the line near the bottom "如果您的系统上wqy字体 ...",
   you can still see the mosaic/cocktail of different fonts. This is not
   surprising because the Japanese sans was selected and it does not cover
   all the zh_CN code points.


I am now uploading a simplified test suite, so that everyone can reproduce
what I want to achieve, then, we can start revising the files from there.
Comment 75 Qianqian Fang 2009-11-30 17:41:39 UTC
Created attachment 31611 [details]
a complete test suite for the proposed config files

Here is the test suite package.
Comment 76 Baybal 2009-11-30 23:19:58 UTC
IPA Mona fonts should be put before the plain IPA fonts, because the original fonts contain quirks that make them work in Microsoft software and, conversely, not work well with anything else.
Comment 77 Qianqian Fang 2009-12-02 07:46:28 UTC
Created attachment 31663 [details]
65-nonlatin test suite svn rev23

I uploaded the test suite to WQY's svn and hopefully that can make revision process easier.

I attached the tarball for rev 23, but you can check it out with
 svn co https://wqy.svn.sf.net/svnroot/wqy/trunk/65nonlatin_test_suite \
  65nonlatin_test_suite

A summary for the changes I've made so far:

1. I split the original 41-* files into 41- and 65-, with 41-* for family->generic (per comment#70)
2. I updated 41-language-ja by attachment 31610 [details] proposed by Hideki
3. I moved IPA Mona in front of IPA for 65-language-ja, per comment#76 by Baybal

With this version, I tested the following command:

  LANG=<zh_CN,en_US,ja_JP>.UTF-8 pango-view --markup --text 'English: 英文中的汉字,<span lang="zh">您好中文汉字</span>,<span lang="ja">您好日文</span>(Note: "您" is a Chinese-specific Hanzi)' --font="sans"

If you run this command in a non-CJK locale, I expect the font preferences in 65-nonlatin to control the text rendering outside the <span></span>, while the text inside the <span> will be controlled by 65-language-*.

If you run this command in a CJK locale, the text inside the <span> will still be controlled by the 65-language files, and the text outside will be controlled by 65-language-<current locale>.

With this numbering, users can override the default settings with their ~/.fonts.conf.

If you make any change, please run the above test command and upload either a "svn diff" output or the tarball of the full package.
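
To make the three-locale check above easier to repeat, here is a small Python sketch that prints one ready-to-paste command line per locale. It only builds the strings and does not run pango-view; the function name `pango_view_command` is made up for this sketch, and the flags are the ones used in the command above.

```python
import shlex

# Markup from the test command: text outside the spans has no lang tag.
TEXT = (
    'English: 英文中的汉字,<span lang="zh">您好中文汉字</span>,'
    '<span lang="ja">您好日文</span>'
    '(Note: "您" is a Chinese-specific Hanzi)'
)


def pango_view_command(locale):
    """Return a shell command line that runs the test under the given locale."""
    argv = ["pango-view", "--markup", "--text", TEXT, "--font=sans"]
    # shlex.join quotes the markup so the shell passes it as one argument.
    return f"LANG={locale}.UTF-8 " + shlex.join(argv)


commands = [pango_view_command(loc) for loc in ("zh_CN", "en_US", "ja_JP")]
for cmd in commands:
    print(cmd)
```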
Comment 78 Nicolas Mailhot 2009-12-02 14:09:01 UTC
Created attachment 31684 [details]
split fontconfig files

Here is a counter-proposal with the fontconfig files properly split per font (i.e. I want a setup that any individual packager can trivially copy when he needs to package a new cjk font, not something that leaves him scratching his head and feeling he'll go nowhere without going through the fontconfig packager)

Also provided are the scripts used to generate them from an easily changed csv file

I intentionally didn't touch the proposed CJK stacks, though I feel it is highly abusive to register the same font in multiple generics. We have lots of latin fonts that could be classified in multiple generics (trivial example: DejaVu Sans Mono), we don't put them everywhere anyway
Comment 79 Qianqian Fang 2009-12-02 16:48:37 UTC
bugzilla-daemon@freedesktop.org wrote:
> http://bugs.freedesktop.org/show_bug.cgi?id=20911
>
> --- Comment #78 from Nicolas Mailhot <nicolas.mailhot@laposte.net>  2009-12-02 14:09:01 PST ---
> Created an attachment (id=31684)
>  --> (http://bugs.freedesktop.org/attachment.cgi?id=31684)
> split fontconfig files
>
> Here is a counter-proposal with the fontconfig files properly split per font (i.e.
> I want a setup that any individual packager can trivially copy when he needs to
> package a new cjk font, not something that leaves him scratching his head and
> feeling he'll go nowhere without going through the fontconfig packager)
>   

I truly don't understand why this has to be done in a per-font format.
Why can Latin fonts be listed in a preferred list in 60-latin.conf, but
CJK fonts cannot?

The split files only increase the maintenance complexity, reduce
the readability and gain very little (if anything).

If you don't like what was proposed in the svn, please give me one
solid example to show it is problematic.

> Also provided are the scripts used to generate them from an easily changed csv
> file
>
> I intentionally didn't touch the proposed CJK stacks, though I feel it is
> highly abusive to register the same font in multiple generics. We have lots of
> latin fonts that could be classified in multiple generics (trivial example:
> DejaVu Sans Mono), we don't put them everywhere anyway
>   

But what's wrong with having a longer list of fallback fonts? I just don't
want fontconfig to randomly pick one of the other CJK fonts that we
know is not appropriate, or to display nothing at all.

The scheme makes this setup robust because 1) we give plenty of
choices, and 2) we rank them from good to bad for each category.
Even if the best font is not installed, we can still get the next-best choice,
and so on. This is exactly what we ask for: give CJK people the
best your system can provide out of the box.
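
The ranked-fallback idea can be sketched in a few lines of Python. This is a deliberate simplification: real fontconfig matching scores fonts on many properties, and the ranking and the set of installed fonts below are made up for illustration, not taken from the proposed 65-nonlatin.

```python
def pick_font(preferences, installed):
    """Return the first preferred family that is installed, else None."""
    for family in preferences:
        if family in installed:
            return family
    return None


# Illustrative ranking only: two sans fonts, then a serif font as a last resort.
zh_sans = ["WenQuanYi Micro Hei", "WenQuanYi Zen Hei", "AR PL UMing CN"]

# Hypothetical system where the top choice is missing.
installed = {"WenQuanYi Zen Hei", "AR PL UMing CN", "DejaVu Sans"}
print(pick_font(zh_sans, installed))

# Hypothetical system with no Chinese sans font at all.
serif_only = {"AR PL UMing CN", "DejaVu Sans"}
print(pick_font(zh_sans, serif_only))
```

In the second case the list still yields a font of the right language, which is the point of keeping a serif family at the end of a sans list.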
Comment 80 Qianqian Fang 2009-12-02 18:45:10 UTC
On Wed, Dec 2, 2009 at 5:09 PM, <bugzilla-daemon@freedesktop.org> wrote:

>
> I intentionally didn't touch the proposed CJK stacks, though I feel it is
> highly abusive to register the same font in multiple generics.
>
>
In addition, this is NOT registering the same font in multiple generics; it is
expanding the fallback. I agree that in the family->generic mapping, it makes
sense to map every font to only one generic alias (which I did in 41-*).

However, for the generic->family fallback, it is a different story. Even if I
don't put a font name there, fontconfig will pick it up. The only difference
is that at least I have control over the rest of the order.

In an extreme case, if there are no sans CJK fonts installed, most CJK users
would rather see a serif font of their own language displayed
than an alien sans-serif (or cocktail) picked by fontconfig.
Comment 81 Nicolas Mailhot 2009-12-02 23:53:34 UTC
(In reply to comment #80)
> On Wed, Dec 2, 2009 at 5:09 PM, <bugzilla-daemon@freedesktop.org> wrote:
> 
> >
> > I intentionally didn't touch the proposed CJK stacks, though I feel it is
> > highly abusive to register the same font in multiple generics.
> >
> >
> In addition, this is NOT registering the same font in multiple generics; it is
> expanding the fallback. I agree that in the family->generic mapping, it makes
> sense to map every font to only one generic alias (which I did in 41-*).
> 
> However, for the generic->family fallback, it is a different story. Even if I
> don't put a font name there, fontconfig will pick it up. The only difference
> is that at least I have control over the rest of the order.

IMHO this is broken logic. With that logic you end up putting every font in every generic stack just in case.

You'd be better served by opening a fontconfig bug asking it to look into the other generic stacks by default before falling back to fonts that were not specified explicitly in any of them.

Comment 82 Nicolas Mailhot 2009-12-03 00:13:54 UTC
(In reply to comment #79)
> bugzilla-daemon@freedesktop.org wrote:

> I truly don't understand why this has to be done in a per-font format.
> Why Latin fonts can be listed in a preferred list in 60-latin.conf, but
> CJK fonts can not?

I don't like 60-latin.conf either but on a Fedora system we've made it mostly irrelevant. (I suppose we could also make your new files irrelevant, probably not what you want)

> The split files only increase the maintenance complexity, reduce
> the readability and gain very little (if there is any).

As I've stated (many times) my primary constraint is to have a smooth packaging flow where any font can be picked up by a packager and packaged quickly and independently. And the result is full-featured, not "it almost works but the remaining bits need integration by the fontconfig maintainer in its master files". We have one fontconfig maintainer who is awesome but also real busy working on all the text stack, whereas we have one packager per font package, and spreading the work as much as possible is basic common sense.

In such a model you have only one font family per package, not huge collections of unrelated fonts (as I see very often in Debian).

This is how Fedora has been working in the past two years.

Anything that requires editing a file shared by multiple packages, instead of letting each package drop its own file, is contrary to this workflow.

Anything that implies it is ok to change font rules for fonts in other packages, such as mixing multiple font names in a single file, is a recipe for packager conflicts

Anything that implies you need to edit a file shared by multiple packages, such as the files you propose integrating, is an impediment to this workflow because people wait for the central authority to move before doing anything

Besides, from a support POV, it is a lot easier to ask users to rename one symlink and re-test than to tell them to edit a font list in XML format. So this config style also helps after the packaging.


> The scheme makes this setup robust because 

The scheme is not robust. The more times you repeat a font name, the more times you introduce room for human mistakes and someone editing one instance of your declarations but not the other.

It is a good idea to fall back on fonts declared in other generic stacks before trying any random un-declared font. It is a bad idea to do it via explicit multiple font declarations
Comment 83 Qianqian Fang 2009-12-03 06:30:16 UTC
I don't think either of us has all day to fight over this. I think it is time for fontconfig's maintainer or developers to make a choice.

Behdad, please let me and Nicolas know which way you prefer (from a maintenance perspective). Whatever your choice, I really hope this can be incorporated in the next release of fontconfig.
Comment 84 Qianqian Fang 2009-12-03 06:32:56 UTC
(In reply to comment #83)
> I don't think either of us has all day to fight over this. 

ok, it's not "I don't think", it is "I am sure".
Comment 85 Akira TAGOH 2009-12-04 00:23:24 UTC
Just for my two cents,

(In reply to comment #82)
> I don't like 60-latin.conf either but on a Fedora system we've made it mostly
> irrelevant. (I suppose we could also make your new files irrelevant, probably
> not what you want)

+1. I don't like either set of rules, as both contain a kind of priority list. From the POV of distributors or packagers, that is harmful for tuning; it sometimes has unexpected effects. Getting rid of them would make people much happier, as long as you have well-tuned separate config files.

From the POV of upstream, I suppose providing an easy-to-use configuration is important, though it should be kept as just an example, IMHO.

> As I've stated (many times) my primary constraint is to have a smooth packaging
> flow where any font can be picked up by a packager and packaged quickly and
> independently. And the result is full-featured, not "it almost works but the
> remaining bits need integration by the fontconfig maintainer in its master
> files". We have one fontconfig maintainer who is awesome but also real busy
> working on all the text stack, whereas we have one packager per font package,
> and spreading the work as much as possible is basic common sense.

I agree with you. Plus, this is closer to a matter of preference. Since fontconfig supports separate config files, people can do that on their own machine or in their distribution. Deciding the default configuration shipped in fontconfig according to a few people's preferences, or discussions among a few people, makes no sense. Configurations are more likely to differ for fontconfig than for other software. Ideally, the font upstreams should ship fontconfig config files for their fonts; it shouldn't be done in fontconfig for specific fonts.
Comment 86 Jens Petersen 2009-12-07 21:41:32 UTC
(In reply to comment #78)
> Created an attachment (id=31684) [details]
> split fontconfig files
> 
> Here is a conter-proposal with the fontconfig files properly split per font (ie
> I want a setup that any individual packager can trivially copy when he needs to
> package a new cjk font, not something that lefts him scratching his head and
> feel he'll go nowhere without going through the fontconfig packager)

I also agree that separate .conf looks clean and they should be in the
fonts packages themselves as much as possible.  The current
blob of fontconfig rules seems quite unmaintainable.

> Also provided are the scripts used to generate them from an easily changed csv
> file

Have you thought of including something like it in Fedora's fontpackages package? :)


I dunno if we still need some bits of 65-nonlatin.conf around for now until distros stop depending on it completely?
Comment 87 Baybal 2009-12-09 00:13:35 UTC
So what about my proposal to make embedded bitmaps separate fonts?
Comment 88 Jens Petersen 2009-12-21 21:37:46 UTC
I am trying now to change the default Chinese font in Fedora
to WQY ZenHei but it seems I can't do this without dropping
Zenhei from 65-nonlatin.conf - otherwise it overrides
Japanese on the Fedora desktop.

I think I would like to propose just dropping 65-nonlatin.conf
completely from conf.d and recommend distros to provide font .conf
for each font they install as needed.  This is basically
what we are doing in Fedora today already and it works well.
Comment 89 Baybal 2009-12-21 22:16:29 UTC
>Here is a conter-proposal with the fontconfig files properly split per font (ie
>I want a setup that any individual packager can trivially copy when he needs to
>package a new cjk font, not something that lefts him scratching his head and
>feel he'll go nowhere without going through the fontconfig packager)

First, even if you think so, repository maintainers are still free to choose the optimal way of rendering for their fonts.
Nothing, this solution included, restrains people from doing what they were doing before.
Second, per-font config would be unable to provide a good out-of-the-box experience by default anyway on most typical CJK setups after a few custom font manipulations, which you should consider common today. The main purpose of the centralised 41-zh,jp,ko... files is to provide adequate fallbacks, rather than a complete solution for everything zh,jp,ko.

>I think I would like to propose just dropping 65-nonlatin.conf
>completely from conf.d and recommend distros to provide font .conf
>for each font they install as needed.  This is basically
>what we are doing in Fedora today already and it works well.

The per-language fallbacks we are already discussing here would do exactly what you are trying to do now, but in an even more quirk-free and non-intrusive manner, and yes, they drop 65-nonlatin.

I PROPOSE US TO HAVE A JOINT BRAINSTORM ON IRC.
Comment 90 Jens Petersen 2010-01-12 21:44:22 UTC
(In reply to comment #89)
> Second, per font config would be unable to provide good out of the box
> experience by default anyway on most of typical cjk setups after few custom
> fonts manipulations which you should think as common for today.

Perhaps you are referring to Ubuntu's override system that was discussed earlier?

> >I think I would like to propose just dropping 65-nonlatin.conf
> >completely from conf.d and recommend distros to provide font .conf
> >for each font they install as needed.  This is basically
> >what we are doing in Fedora today already and it works well.
> 
> Per language fallbacks what we are already discussing there would be doing
> exactly what are you trying to do now but in even more quirkless and
> non-intrusive manner, and yes they drops 65-nonlatin

> I PROPOSE US TO HAVE A JOINT BRAINSTORM ON IRC.

We could talk on ##fonts, though it may be hard to find
a common time that suits everyone. Go ahead and suggest one
if you like.


So does anyone object to removing 65-nonlatin.conf from conf.d/ ?
(If some distro still wants it they could symlink it from avail.d/.)

Otherwise at minimum I think we need lang tags in 65-nonlatin.conf.
Comment 91 Qianqian Fang 2010-01-13 07:29:30 UTC
(In reply to comment #90)
> So anyone object to removing 65-nonlatin.conf from conf.d/ ?
> (If some distro still wants it they could symlink it from avail.d/.)
> 
> Otherwise at minimum I think we need lang tags in 65-nonlatin.conf.
> 

I think it is a bad idea. As I said several times previously, 65-nonlatin.conf is not for specific languages. It only provides sufficient font fallback (to prevent fontconfig from randomly picking up low-quality fonts) for rendering CJK characters in a non-CJK environment (such as en_US, fr, etc.).

CJK-specific configs should go into the 65-language-ja.conf or 65-language-zh.conf file, as in my "65-nonlatin test suite".

What we need is a BETTER 65-nonlatin; removing it from conf.d or adding lang tags misses the point completely.
Comment 92 Jens Petersen 2010-01-13 16:30:20 UTC
So how do we solve https://bugzilla.redhat.com/show_bug.cgi?id=476459 ?

(In reply to comment #91)
> I think it is a bad idea. As I said several times previously, 65-nonlatin.conf
> is not for specific languages. It only provides sufficient font fallback (to
> prevent fontconfig randomly picking up low quality fonts) for rendering CJK
> char under a non-CJK specific environment (such as en_US, fr etc). 

If there is proper font config in place we shouldn't need any fallbacks.

> What we need is a BETTER 65-nonlatin, removing from conf.d or adding lang tag
> miss the point completely.

Why is adding lang tags a problem, especially for CJK?
Comment 93 Qianqian Fang 2010-01-13 17:48:36 UTC
On 1/13/2010 7:30 PM, bugzilla-daemon@freedesktop.org wrote:
> --- Comment #92 from Jens Petersen<petersen@redhat.com>   2010-01-13 16:30:20 PST ---
> So how to solve https://bugzilla.redhat.com/show_bug.cgi?id=476459 ?
>    

there are two ways:

1. add XX-vlgothic.conf (XX<65) and set prefer list for lang=ja
2. remove 44-wqy-zenhei and 66-vlgothic, download and use my 65-language-{ja,zh}.conf files
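
As a rough illustration of option 1, a per-font file with a prefix below 65 could look like the sketch below. The filename 64-vlgothic.conf and the exact rule are assumptions made for this example, not the contents of any shipped package; the family names are the VL Gothic ones mentioned elsewhere in this thread.

```xml
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<!-- Hypothetical /etc/fonts/conf.d/64-vlgothic.conf (name assumed for illustration) -->
<fontconfig>
  <match target="pattern">
    <!-- Only act when the requested language includes Japanese -->
    <test name="lang" compare="contains">
      <string>ja</string>
    </test>
    <!-- ...and the application asked for the generic sans-serif family -->
    <test name="family">
      <string>sans-serif</string>
    </test>
    <!-- Put the Japanese fonts in front of the matched sans-serif entry -->
    <edit name="family" mode="prepend">
      <string>VL PGothic</string>
      <string>VL Gothic</string>
    </edit>
  </match>
</fontconfig>
```

Because such a file sorts before 65-nonlatin.conf, its fonts should end up ahead of the ones 65-nonlatin.conf adds later, and the lang test keeps the rule from touching non-Japanese patterns.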

> If there is proper font config in place we shouldn't need any fallbacks.

what proper font config are you referring to?

> Why is adding lang tag a problem, specially for CJK?

first of all, this is already done in 65-language-{zh,ja}.conf in my
proposal; we don't need two files to do the same thing!

second, fontconfig does need a config file to set the font orders for
non-CJK locales, such as en_US. In my opinion, that is the exact purpose of
65-nonlatin. Adding lang-tag matching only limits the rules to CJK locales
and leaves the font orders in non-CJK locales unspecified.
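
The difference can be sketched as follows. A rule in the style of 65-nonlatin.conf has no lang test, so it applies in every locale, en_US included. The two families and their order here are purely illustrative, not a proposed ordering.

```xml
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
  <!-- No lang test: this preference list applies to any pattern
       that asks for sans-serif, whatever the locale. -->
  <alias>
    <family>sans-serif</family>
    <prefer>
      <!-- Illustrative order only -->
      <family>WenQuanYi Zen Hei</family>
      <family>VL PGothic</family>
    </prefer>
  </alias>
</fontconfig>
```

Moving the same list into a match block guarded by a lang test would leave patterns without a CJK lang with no stated order for these fonts.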
Comment 94 Akira TAGOH 2010-01-22 02:37:54 UTC
(In reply to comment #91)
> I think it is a bad idea. As I said several times previously, 65-nonlatin.conf
> is not for specific languages. It only provides sufficient font fallback (to
> prevent fontconfig randomly picking up low quality fonts) for rendering CJK
> char under a non-CJK specific environment (such as en_US, fr etc). 

How is it bad? And how is it useful without the fonts? It looks like you missed the point. Those config files should be provided by the font upstream. That is the point made above, and it is what Fedora is trying to do. Unnecessary built-in rules are worse than nothing.
Comment 95 Qianqian Fang 2010-01-22 07:14:56 UTC
(In reply to comment #94)
> How is it bad? and how is it useful without the fonts? 

I think the meaning of "fallback" is that when something does not exist, something else fills in for it. That is EXACTLY what fontconfig is designed for: when a font is not installed, some other designated font will be used as an alternative.

> that looks like you missed the point. 
> the certain config files should be provided by the font upstream. 

I agree, but that does not justify why "font preference orders" should be provided by the font itself. In fact, fontconfig has been doing this job since day 1, and it is doing ok. The only thing is to refine it.

> it's the above point and what Fedora is trying. the unnecessary
> built-in rules are worse than nothing.
> 

how are these rules unnecessary? tell me.
Comment 96 Qianqian Fang 2010-01-22 11:52:04 UTC
I don't want to see another discussion wasted and derailed by debating something under a completely different design. The problem in fontconfig is real; every minute we delay a solution, we waste thousands of minutes for frustrated users. So, let's solve it first.

I would like to give a suggestion for any further discussions on this issue:

Let's separate what you want to achieve (i.e. using per-font preferences) from what this bug is trying to solve (i.e. amending 65-nonlatin to build sufficient CJK fallback). The assumption of my proposal is that the fontconfig maintainers are happy with the current structure. If you want to propose something beyond that scope, please file a separate bug. I would be glad to join the discussion there.

For those of you who want to help, PLEASE, please download my svn files and test with it by yourself, and list any specific issues or submit your patch wrt these files, and these files only!

We've already spent a lot of time here, we really want to hear what the maintainers say, so, Behdad or Keith, please tell us what you guys think.
Comment 97 Akira TAGOH 2010-02-01 01:44:23 UTC
Created attachment 32959 [details]
sample config for fallback with the separate files

(In reply to comment #95)
> (In reply to comment #94)
> > How is it bad? and how is it useful without the fonts? 
> 
> I think the meaning of "fallback" is that when something does not exist,
> something else fill in the place. That is EXACTLY what fontconfig is designed
> for: when a font is not installed, some other designated font will be used as
> alternative.
> 
> > that looks like you missed the point. 
> > the certain config files should be provided by the font upstream. 
> 
> I agree, but that does not justify why "font preference orders" should be
> provided by the font itself. In fact, fontconfig has been doing this job since
> day 1, and it is doing ok. The only thing is to refine it.

You are misunderstanding the point, then. I'm not saying that the order should be provided by the font upstream. If the font upstream provides a separate rule file, it is easier for distros or users to change the order through the filename's priority prefix, say. As I also said to you on Red Hat Bugzilla, none of the fontconfig config files should therefore contain any other font names. Since this is a kind of preference, the order should just be left to the distros or the users. That's why I think having a minimal set of rules in fontconfig upstream should be sufficient.

> 
> > it's the above point and what Fedora is trying. the unnecessary
> > built-in rules are worse than nothing.
> > 
> 
> how are these rules unnecessary? tell me.
> 

My reasons for wanting to get rid of these files (65-nonlatin.conf and the similar ones in your proposed rules) upstream are that they:

- prevent having a different order with additional rules.

- mix several font names in one file, so modifying it requires certain knowledge and skills from users.

- require modifying at least two files to change the order: 65-nonlatin.conf (or similar) and the priority prefix of the separate config file from the font package.


I just want to change the priority prefix in the config filename to change the order. That works well enough without 65-nonlatin.conf, say, and is easy enough.
Aside from that, speaking of fallback, in vlgothic-{,p}-fonts in Fedora I set up a fallback chain for Japanese with separate files, like sans-serif->VL PGothic->VL Gothic.

% ls /etc/fonts/conf.d/*vlgothic*
/etc/fonts/conf.d/65-vlgothic-pgothic.conf@  /etc/fonts/conf.d/66-vlgothic-gothic.conf@

See the attached files for the details of the config files.
Comment 98 Qianqian Fang 2010-02-01 08:11:46 UTC
bugzilla-daemon@freedesktop.org wrote:
>
>
> You are misunderstanding the point then. I'm not saying that the order should
> be provided by the font upstream. providing the separate rule by font upstream
> should be easier to change the order by distro or the users with the prefix
> priority say. as I said to you on Red Hat Bugzilla too, thus all of the
> fontconfig config files shouldn't contains any other font names in it.  Since
> this is a kinda preference, the order should just leave to the distro or the
> users. that's why I think having the minimal sets of the rule in fontconfig
> upstream should be sufficient.
>   

again, I think we are on different pages.
What you want to propose is to change the basic scheme of the
fontconfig config files, and what I want is to renovate and fine-tune it.

As I said previously, it would be more efficient if you
submit another bug to discuss the new proposal.

I personally don't think your "other-font-names-free-rule" is
sufficient to handle the complex CJK situations. In addition,
using the basic rules I proposed does not conflict with
what you want to do. The only difference is that fontconfig
has some basic memory about good and bad fonts, and your
approach erases all the memories of fontconfig, and
font packagers make all the decisions by manipulating the
priority numbers.

Also, if the packager for Font A thinks it is better
than Font B, and the packager for B thinks the opposite,
how would you solve it? Let them fight by competing
over the priority numbers?

>   
>>> it's the above point and what Fedora is trying. the unnecessary
>>> built-in rules are worse than nothing.
>>>
>>>       
>> how are these rules unnecessary? tell me.
>>
>>     
>
> My objection to get rid of these (65-nonlatin.conf and similar for your
> proposed rules) files in upstream because they:
>
> - prevents to have different order with additional rules.
>   

it doesn't. just name your file with a priority less than 65.
if you name it bigger than 65, then use prepend_first in your rules.
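
A minimal sketch of the second case, assuming a hypothetical file named 66-example.conf that is read after 65-nonlatin.conf: with mode prepend_first the family is inserted at the head of the whole family list rather than just before the matched entry. The font name is only an example.

```xml
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<!-- Hypothetical 66-example.conf, loaded after 65-nonlatin.conf -->
<fontconfig>
  <match target="pattern">
    <test name="family">
      <string>sans-serif</string>
    </test>
    <!-- prepend_first: insert at the head of the list, ahead of
         anything earlier files already prepended -->
    <edit name="family" mode="prepend_first">
      <string>VL PGothic</string>
    </edit>
  </match>
</fontconfig>
```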

> - which mixing up several fontnames in one file requires the certain knowledges
> and skills to modify it for users.
>   

on the contrary, because it is centralized, it is easier
for users to modify. The most frustrating thing about
using fontconfig is that when I modify one place to set
font orders, the rules never work because multiple other
config files override them. It is impossible for ordinary
users to trace which one is actually in effect. The
approach you proposed is very likely to increase
frustrations of this kind.

> - plus, need to modify two files to change the order at least. 65-nonlatin.conf
> (or similar) and the prefix priority in separate config file from the font
> package.
>   

no

Comment 99 Akira TAGOH 2010-02-01 22:14:04 UTC
(In reply to comment #98)
> again, I think we are talking on different pages.
> What you want to propose is to change the fontconfig config file
> basic schemes, and what I want is to renovate it and fine-tune.

Sure. Then my counter-proposal is to do that in your favorite distros. It's not something that should be done upstream IMHO. I could start discussing this on another bug, but it's pretty much the opposite of this proposal, because once it gets approved, it eventually gets rid of your efforts too.

> I personally don't think your "other-font-names-free-rule" is
> sufficient to handle the complex CJK situations. In addition,
> using the basic rules I proposed does not conflict to
> what you want to do. The only difference is that fontconfig
> has some basic memory about good and bad fonts, and your
> approach erase all the memories of fontconfig, and
> font packagers make all the decisions by manipulating the
> priority numbers.

Right, because 65-nonlatin.conf prevents the separate-config-file idea from working sanely, which means it actually conflicts with it. Otherwise we wouldn't even need to get rid of it, right?

> Also, if the packager for Font A think it is better
> than Font B, and the packager for B think opposite.
> How would you solve it? let them fight by competing
> the priority numbers?

The decision is up to the users or the distros. That's why I don't like putting this kind of rule upstream; it's not something upstream should worry about.

> it doesn't. just name your file with a priority less than 65.
> if you name it bigger than 65, then use prepend_first in your rules.

Once you start using prepend_first, and someone wants to change the order on top of it, all fonts will eventually depend on prepend_first. It's not the right solution; it's a kind of hack.

> on the opposite, because it is centralized, it is easier
> for users to modify.

I meant syntax-wise, etc. Changing the priority prefix in the filename is much easier for that purpose.

>                      The most frustrating thing
> using fontconfig is that when I modify one place to set
> font orders, the rules never work because multiple other
> config files overwrite it. It is impossible for ordinary
> users to trace which one is actually functioning. The
> approach you proposed is very likely leading to increasing
> frustrations of such kind.

Not really. If we have one simple rule for the font per file, it should be easy to track it with the debugging messages, because no other changes to the font will happen after that. Having many rules in different files would rather make it more complex to find out where the font is really affected.

> 
> > - plus, need to modify two files to change the order at least. 65-nonlatin.conf
> > (or similar) and the prefix priority in separate config file from the font
> > package.
> >   
> 
> no

With prepend_first? I was assuming the situation in Fedora, but anyway.
Comment 100 Qianqian Fang 2010-02-01 22:59:27 UTC
bugzilla-daemon@freedesktop.org wrote:
> Sure. then my counter-proposal is to do that in your favorite distros. it's not
> something should be done upstream IMHO. I could start discussing this on
> another bug. but it's pretty opposite proposal to this, because once it gets
> approved, it eventually gets rid of your efforts too.
>   

I don't think we can convince each other on this matter,
and unfortunately, the maintainers do not seem to care.
so, I am getting bored ...

> Right. because 65-nonlatin.conf prevents sane working on the separate config
> file idea. which means actually conflicting on it. otherwise we don't even need
> to get rid of it right.
>   
>> it doesn't. just name your file with a priority less than 65.
>> if you name it bigger than 65, then use prepend_first in your rules.
>>     
> Once starting to use prepend_first, and if one wants to modify the order over
> it, all of fonts eventually will depends on prepend_first. it's not the right
> solution. it's a kind of a hack.
>   

looks like you just chose to ignore my first suggestion,
i.e. giving your own rules a lower prefix to override 65-nonlatin.
As a result, your conclusion that 65-nonlatin conflicts with
per-font config, and your criticism below, are flawed.

Just rename your own rules to 64-xxx and do a "FC_DEBUG=1029 fc-match ...",
you will see how it works.

> Not really. if we have simple rule for the font per a file, it should be easy
> to keep it on track with the debugging message, because any other changes for
> the font won't happens after that. having many rules in the different files
> would rather makes more complex to find out where it's really affected.
>   

As I said, setting 65-nonlatin DOES NOT prevent you from doing
what you want to do as a distro packager. It is important to have
some sane default rules from fontconfig upstream because not all
distros (such as some mini-system derived from LSB) have knowledgeable
maintainers for CJK fonts.
Comment 101 Akira TAGOH 2010-02-02 01:47:18 UTC
(In reply to comment #100)
> looks like you just choose to ignore my first suggestion,

I did, because my idea is always against 65-nonlatin.conf: all of the config files have to be put before 65-nonlatin.conf, and playing within that narrow space won't make anything better.

Okay, a good settlement may be to extend the priority prefix to a wider namespace and lay out the sections like this:

000-100: minimal sets of the config files from upstream.
200-300: users preference
400-500: distros preference
900-: upstream recommendation and fallbacks

The ranges might be refined later, but this would resolve both your issues and mine, if putting rules ahead of upstream's resolves the issue. We wouldn't need to worry about 65-nonlatin.conf (realigned to somewhere after 600) anymore, and you could then work on it upstream. How does that sound to you?

> As I said, setting 65-nonlatin DOES NOT prevent you from doing
> what you want to do as a distro packager. It is important to have
> some sane default rules from fontconfig upstream because not all
> distros (such as some mini-system derived from LSB) have knowledgeable
> maintainers for CJK fonts.

Since all of the necessary configuration can be done in one file, assuming it comes from the font upstream, they just need to adjust the priority order to what they want; they wouldn't need to write any rules in the future.
Sorry for leaving out some assumptions, but it's possible to configure fontconfig just by dropping in a file anyway.
Comment 102 Akira TAGOH 2010-02-02 01:49:15 UTC
(In reply to comment #101)
> putting any rules prior to upstream's resolves the issue. we don't need to
> worry about 65-nonlatin.conf (realigned to somewhere after 600) anymore, and
> you can work on it upstream then. how does it sound for you?

Sorry I meant "after 900".
Comment 103 Qianqian Fang 2010-02-02 07:51:21 UTC
bugzilla-daemon@freedesktop.org wrote:
> I did because my idea is always against 65-nonlatin.conf. so all of 
> the config
> files has to be put before 65-nonlatin.conf. playing with the narrow 
> spaces
> won't make any better.

I am very glad that we finally reach some ground
and start to understand each other. that's good.

I knew you were pushed by the (artificially
determined) narrow prefix range for nonlatin
config files in Fedora [1]. I should have
pointed that out earlier.

> Okay, this may be a good settlement to extend the priority prefix to have more
> wider namespaces and align the section like this:
>
> 000-100: minimal sets of the config files from upstream.
> 200-300: users preference
> 400-500: distros preference
> 900-: upstream recommendation and fallbacks
>
> the range might be improved later but this would resolves your and my issues if
> putting any rules prior to upstream's resolves the issue. we don't need to
> worry about 65-nonlatin.conf (realigned to somewhere after 600) anymore, and
> you can work on it upstream then. how does it sound for you?
>   

I think this is now a Fedora matter, as the rules in [1]
are only followed by Fedora packagers. I prefer to define
51~64 for non-latin distro preference, as it still allows
users to use ~/.fonts.conf to overwrite.

Maybe file a bug on Fedora's bugzilla and ask Nicolas
to consider this adjustment?


[1] 
http://git.fedorahosted.org/git/fontpackages.git?p=fontpackages.git;a=blob;f=fontconfig-templates/fontconfig-priorities.txt
Comment 104 Akira TAGOH 2010-02-02 19:16:04 UTC
(In reply to comment #103)
> I think this is now a Fedora matter, as the rules in [1]
> are only followed by Fedora packagers. I prefer to define
> 51~64 for non-latin distro preference, as it still allows
> users to use ~/.fonts.conf to overwrite.

Is it? Since the numbering comes from upstream, this improvement should appear upstream no matter who follows that rule.
Comment 105 Qianqian Fang 2010-02-02 20:51:54 UTC
bugzilla-daemon@freedesktop.org wrote:
> Is it? since the numbering is came from upstream, this improvement should
> appears in upstream no matter who follows that rule.
>   

then I guess you want to look at this
http://cgit.freedesktop.org/fontconfig/tree/conf.d/README
Comment 106 Akira TAGOH 2010-02-02 21:30:11 UTC
(In reply to comment #105)
> then I guess you want to look at this
> http://cgit.freedesktop.org/fontconfig/tree/conf.d/README

And then? Can you say more? How does it explain why widening the range of the priority numbering is a Fedora matter?

You understand that Fedora's priority scheme is based on that, right?
Comment 107 Qianqian Fang 2010-02-02 22:17:56 UTC
bugzilla-daemon@freedesktop.org wrote:
>> then I guess you want to look at this
>> http://cgit.freedesktop.org/fontconfig/tree/conf.d/README
>>     
>
> And then? can you talk more? how does it explain why growing the range of the
> numbering for the priority order is a Fedora matter?
>
> You understand Fedora's priority thing is based on that right?
>   

I should have completed my sentence. What I wanted you to do
is to compare
http://cgit.freedesktop.org/fontconfig/tree/conf.d/README
 with
http://git.fedorahosted.org/git/fontpackages.git?p=fontpackages.git;a=blob;f=fontconfig-templates/fontconfig-priorities.txt

the first one is what is suggested in fontconfig, and the second
one is what is suggested in Fedora. see the difference?

the limitation that non-latin shall not go below 65 is
only a Fedora limitation. As long as you match lang tag as
the enclosing block in your config file, I don't think it
matters which number you choose for Latin or non-latin
fonts if 50<n<65.

<rant start>
Following rules is fine, but do not turn it into dogmatism.
Rules are meant to help, not meant to hinder.
</rant end>
Comment 108 Qianqian Fang 2010-02-02 22:58:09 UTC
bugzilla-daemon@freedesktop.org wrote:
> And then? can you talk more? how does it explain why growing the range of the
> numbering for the priority order is a Fedora matter?
>
> You understand Fedora's priority thing is based on that right?
>   
FYI, a bug was submitted to Fedora to clarify the
prefix number range for non-latin config files:

https://bugzilla.redhat.com/show_bug.cgi?id=561246
Comment 109 Akira TAGOH 2010-02-07 23:17:01 UTC
I won't add any more comments after this. I'm no longer interested in improving 65-nonlatin.conf, and we now have a solution to avoid its bad effects. But to correct one misunderstanding:

(In reply to comment #107)
> bugzilla-daemon@freedesktop.org wrote:
> >> then I guess you want to look at this
> >> http://cgit.freedesktop.org/fontconfig/tree/conf.d/README
> >>     
> >
> > And then? can you talk more? how does it explain why growing the range of the
> > numbering for the priority order is a Fedora matter?
> >
> > You understand Fedora's priority thing is based on that right?
> >   
> 
> I should have completed my sentence. What I wanted you to do
> is to compare
> http://cgit.freedesktop.org/fontconfig/tree/conf.d/README
>  with
> http://git.fedorahosted.org/git/fontpackages.git?p=fontpackages.git;a=blob;f=fontconfig-templates/fontconfig-priorities.txt
> 
> the first one is what suggested in fontconfig, and the second
> one is what suggested in Fedora. see the difference?
> 
> the limitation that non-latin shall not go below 65 is
> only a Fedora limitation. As long as you match lang tag as
> the enclosing block in your config file, I don't think it
> matters which number you choose for Latin or non-latin
> fonts if 50<n<65.

It looks like you are talking about a different point. I did say that 65-nonlatin.conf badly affects the separate-config idea, but my proposal in comment #101 isn't for Fedora; otherwise I wouldn't have submitted it here. The documented structure of the priority numbering is a good idea, and inheriting it in Fedora is also good IMHO, but the assignment in Fedora was bad. You are misunderstanding the point. Since this kind of configuration is purely preference and should be customizable at various levels, such as the user side and the distro side, the scope of customization should be defined upstream. Improving on the current Fedora policy may work after that, but it may introduce inconsistencies and other side effects. That's not a solution but still a hack.
That's why I want to see reserved ranges for distros and so on in the upstream definition of the priority numbering, but anyway.

I'll keep an eye on the other bug to see how it can be improved.
Comment 110 Qianqian Fang 2010-02-08 07:46:28 UTC
why did you close the bug? this is not fixed. The current version of 65-nonlatin is still carrying all the issues I mentioned in the original report.
Comment 111 Akira TAGOH 2010-02-08 17:19:34 UTC
(In reply to comment #110)
> why did you close the bug? this is not fixed. The current version of
> 65-nonlatin is still carrying all the issues I mentioned in the original
> report.
> 

Oops, that wasn't my intention at all. Sorry about that.
Comment 112 Ilyes Gouta 2010-07-14 02:23:47 UTC
Hi,

Have any bits of these updates been picked up by fontconfig, or packaged by any distribution for official inclusion?

Any idea on the status of Fedora 13, as far as this issue is concerned?

Thanks!

Regards,
Ilyes Gouta
Comment 113 Akira TAGOH 2010-07-14 06:57:46 UTC
(In reply to comment #112)
> Hi,
> 
> Have any bits of these updates were picked up by fontconfig or by packaged by
> any other distribution for official inclusion?
> 
> Any idea on the status of Fedora 13, as far as this issue is concerned?
> 
> Thanks!
> 
> Regards,
> Ilyes Gouta

You should bring any distro-specific things up on the fonts list or Bugzilla if you have any issues. That said, we have a workaround in f13 to keep 65-nonlatin.conf from affecting things, so it should work and be much improved over f12, I believe.
Comment 114 Ilyes Gouta 2010-07-14 08:08:07 UTC
Hi Akira,

> have any issues. though we have a workaround to prevent affecting
> 65-nonlatin.conf in f13. so it should works and improved much more than f12 I

Could you tell me more about this workaround in f13?

From what I've seen, f13 ships fontconfig 2.8.0 almost unmodified (http://cvs.fedoraproject.org/viewvc/rpms/fontconfig/F-13, 1 patch 25-no-bitmap-fedora.conf). How are CJK fonts (better) handled in f13?

Thanks,

-Ilyes Gouta

Comment 115 Akira TAGOH 2010-07-14 19:56:24 UTC
(In reply to comment #114)
> Could you tell me more about this workaround in f13?
> 
> From what I've seen, f13 ships fontconfig 2.8.0 almost unmodified
> (http://cvs.fedoraproject.org/viewvc/rpms/fontconfig/F-13, 1 patch
> 25-no-bitmap-fedora.conf). How are CJK fonts (better) handled in f13?

That has been done in the fontconfig config files in each of the CJK font packages. You can see some files with a 65-0- prefix, say, which are supposed to be evaluated before 65-nonlatin.conf.