Summary: | fake bold support to Xft library | |
---|---|---|---
Product: | xorg | Reporter: | Stefan Dirsch <sndirsch>
Component: | Lib/Xft | Assignee: | James Su <suzhe>
Status: | RESOLVED FIXED | QA Contact: | Xorg Project Team <xorg-team>
Severity: | normal | |
Priority: | high | CC: | alan.coopersmith, eich, jay.hobson, keithp, mat, mfabian, roland.mainz, sangu.xorg, suzhe
Version: | git | Keywords: | patch
Hardware: | Other | |
OS: | Linux (All) | |
Description
Stefan Dirsch
2004-10-10 08:15:40 UTC
Created attachment 1062: p_xft_cjk.diff

Needs to be discussed outside of Bugzilla first.

Reopening for discussion.

Needs to be discussed with Keith, Mike and James Su.

It looks like this patch conflates synthetic emboldening of bitmap fonts with a fix to repair broken metrics in fonts which are supposed to be dual-width. I have learned more about how dual-width fonts are supposed to work in terminal emulator environments and now understand that there is actually a standard (IEEE Std 1003.1-2001) which defines correct behaviour for terminal applications. I suggest that 'monospace' really means 'dualwidth' in all cases and that we should apply the standard all of the time. I think it's easy to do; just use the wcwidth function found in xterm, which already has a suitable license. Is this a good or bad plan?

James, would you mind taking a look at Keith's comment and discussing this issue with him? You know more about this than I do.

This patch was written by Akito (http://www.kde.gr.jp/~akito/patch/), so maybe he can give us some explanation. He has no account in this Bugzilla, though, so he cannot be added to the CC list. His email: akito@kde.gr.jp

Keith, can you please explain comment #5 in more detail? Do you think the way the patch tries to fix the metric issues is bad? Do you think the way synthetic emboldening is achieved is OK? Synthetic emboldening is very important for CJK as there are almost no real bold CJK fonts, at least no free ones. How do you want to use wcwidth() in that patch?

I think the synthetic emboldening of bitmap fonts is a fine idea, and restriking the image multiple times seems like the only mechanism we can use. We might consider doing a hack for synthetic oblique at some point as well. Not so much because it would look really great, but because when combined with synthetic bold, we can now generate four variants for each face, which turns out to make CSS-compatible matching a *lot* easier.
But I think the patch conflates the emboldening code with two other changes:

1) A new config option that prefers bitmaps where available (which seems fine to me).
2) Code which tries to make fonts dual-width by snapping narrower glyphs to half size and wider glyphs to full size.

It is this latter code which I believe could benefit from wcwidth -- using that table to accurately identify which glyphs are expected to be one character cell and which are expected to be two character cells should make fonts usable in terminal emulators. Without accurate identification, we'll end up occasionally mis-drawing a glyph and causing chaos in the target application. Also, I believe the correct mode here is to compute the half-width value in pixels and double it for the larger glyphs to ensure accurate alignment on the screen.

But for many symbol glyphs, the wcwidth results are not suitable for CJK fonts and users. For example, most glyphs in CYRILLIC (0x0400-0x04FF), CURRENCY SYMBOLS (0x20A0-0x20CF), BOX DRAWING (0x2500-0x257F) and many other areas are usually double-width glyphs in most CJK fonts, but wcwidth treats them as single width. If they are drawn with such fonts as single-width glyphs, the result will be unacceptable. So I don't think using wcwidth here is a good approach; I'd prefer this patch instead.

Keith> We might consider doing a hack for synthetic oblique at some point as well.

But this works already, doesn't it?
The following rule in /etc/fonts/fonts.conf seems to do it:

    <!-- Artificial oblique for fonts without an italic or oblique version -->
    <match target="font">
      <!-- check to see if the font is roman -->
      <test name="slant">
        <const>roman</const>
      </test>
      <!-- check to see if the pattern requested non-roman -->
      <test target="pattern" name="slant" compare="not_eq">
        <const>roman</const>
      </test>
      <!-- multiply the matrix to slant the font -->
      <edit name="matrix" mode="assign">
        <times>
          <name>matrix</name>
          <matrix><double>1</double><double>0.2</double>
                  <double>0</double><double>1</double>
          </matrix>
        </times>
      </edit>
      <!-- pretend the font is oblique now -->
      <edit name="slant" mode="assign">
        <const>oblique</const>
      </edit>
    </match>

This rule has been there for a long time already and seems to work fine. I can see 4 variants of the "Sazanami Mincho" font, for example:

    xfd -fa "Sazanami Mincho:weight=100:slant=0"
    xfd -fa "Sazanami Mincho:weight=100:slant=100"
    xfd -fa "Sazanami Mincho:weight=200:slant=0"
    xfd -fa "Sazanami Mincho:weight=200:slant=100"

all display fine although only one regular font is available:

    mfabian@magellan:~$ fc-list "Sazanami Mincho" family style file
    /usr/X11R6/lib/X11/fonts/truetype/sazanami-mincho.ttf: Sazanami Mincho:style=Mincho-Regular
    mfabian@magellan:~$

Again, let's separate the notions of synthetic style variation and fixed-width pitch adjustment. On the notion of the fixed-width pitch adjustment, I cannot see how we can use an essentially random basis for deciding which glyphs are one cell and which are two cells in a terminal emulator -- any terminal-based application must have a priori knowledge about character widths or it will not be able to lay out text on the screen correctly.
Markus's proposed code provides two sets of width information: one set for 'pure' Unicode applications, where glyphs not obviously dual-width in the Unicode standard are assigned single width, and a separate set for legacy CJK applications, where everything traditionally encoded in two bytes in non-Unicode encodings is assigned dual width. Perhaps we should expose both tables so that applications exposing either a Unicode or a non-Unicode encoding can get correct character widths. xterm provides access to both such tables via the cjk_width configuration option. gnome-terminal uses a kludge where it bases some widths on the current locale(!).

On the notion of synthetic font variants, we have synthetic oblique for outline fonts, and I believe there is some code in FreeType to synthetically embolden outlines. With the addition here of synthetic bold bitmaps, we lack only synthetic oblique bitmaps.

On the notion of the fixed-width pitch adjustment, we are unable to fix all the buggy fonts which contain glyphs with the wrong width. So if the width of a glyph obtained from a standard table does not equal the real width in the font, the rendering result will be very ugly. If we use such a font in a terminal which conforms to the standard, the rendering result would be unacceptable. And I think fontconfig has no responsibility to conform to such a terminal standard; it should return the real font information to applications.

An application able to adapt to whatever glyph widths are in the font shouldn't specify FC_MONO spacing. The question is how Xft should help applications which expect "standard" fixed-width glyphs and signal that with the FC_MONO spacing flag. By "fixed width", I mean glyphs which occupy either one or two character cells according to some convention. By using a known convention, one can actually produce output with known alignment which is independent of the font in use; a key feature for terminal-based applications.
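To make the two conventions concrete, here is a minimal sketch of a cell-width function with a 'Unicode' mode and a 'CJK' mode. It is not Markus's code: it assumes Python's `unicodedata` module as the data source, and the names `cell_width` and `columns` are made up for the example. It also simplifies zero-width handling to combining marks only and ignores control characters.

```python
import unicodedata

def cell_width(ch, cjk=False):
    """Return how many terminal cells one character occupies (0, 1 or 2)."""
    # Simplification: only combining marks are treated as zero width.
    if unicodedata.combining(ch):
        return 0
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):      # Wide / Fullwidth: two cells in both modes
        return 2
    if eaw == "A" and cjk:     # Ambiguous: two cells only in CJK mode
        return 2
    return 1

def columns(text, cjk=False):
    """Total number of cells needed to lay out text on one line."""
    return sum(cell_width(ch, cjk) for ch in text)

# Compare the two conventions on Latin, CJK, box-drawing and Cyrillic text.
for sample in ("abc", "中文", "─│", "Жук"):
    print(sample, columns(sample), columns(sample, cjk=True))
```

Because the column count depends only on the code points and the chosen mode, an application can compute alignment before any font is loaded, which is the property described above.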
One of the "features" of the FC_MONO flag is that it "repairs" broken fonts to ensure they follow the character cell conventions expected by the applications which specify this spacing. There are many fonts which are expected to be used in character cell applications but which have a few broken glyphs, making this repair necessary. So, I guess the question is whether there are fonts useful in dual-width environments which do not generally follow either of the tables produced by Markus.

Where can I get the Markus width tables? I can check these tables to see if they are OK for, at least, Chinese fonts. In other words, if all fonts conform to either of these tables, then this patch can also give the correct result.

Markus's code is available from http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c

There are two separate functions, mk_wcwidth and mk_wcwidth_cjk; the latter is designed to match the conventional widths of national-encoding CJK fonts. Markus is a very reliable source of information in these areas, so I would expect it to be largely correct. I'm not sure how it might resolve any conflicts in conventional widths among the various national encodings; I'm not even sure there are any.

I checked Markus's functions. mk_wcwidth_cjk is OK for most glyphs in normal Chinese fonts, but there are still some exceptions. For example, Cyrillic 0x0410-0x0451 are full-width glyphs in most Chinese fonts, while mk_wcwidth_cjk treats them as half width. I only checked Chinese fonts; I don't know whether the Japanese and Korean fonts are OK.

Why don't I ask Markus to clarify the situation; perhaps that code just needs to be fixed. I think the comment at the top of the code suggests rather strongly that our intended purpose should be a perfect match for the mk_wcwidth_cjk function. I'll send him mail and report back when I hear from him.
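One way to see what the Unicode data itself says about the ranges questioned in this thread is to tally the East Asian Width class of every code point in each range. The sketch below assumes Python's bundled Unicode database, which is newer than the tables available in 2004, so it shows current data rather than what mk_wcwidth_cjk returned at the time. `width_classes` is a hypothetical helper name.

```python
import unicodedata
from collections import Counter

def width_classes(first, last):
    """Count East Asian Width classes for code points first..last inclusive."""
    return Counter(unicodedata.east_asian_width(chr(cp))
                   for cp in range(first, last + 1))

# Ranges mentioned in the discussion above.
ranges = {
    "Cyrillic U+0410..U+0451": (0x0410, 0x0451),
    "Currency symbols U+20A0..U+20CF": (0x20A0, 0x20CF),
    "Box drawing U+2500..U+257F": (0x2500, 0x257F),
}
for name, (first, last) in ranges.items():
    print(name, dict(width_classes(first, last)))
```

Code points reported as class "A" (ambiguous) are the ones a CJK-mode table would be expected to treat as two cells; anything reported as "N" or "Na" in a range that CJK fonts draw full width is a candidate discrepancy.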
I think this general approach, which uses conventional width values for each character from a table, seems "more correct" than an approximate solution which assumes the glyph metrics aren't grossly inaccurate. I fear, in particular, the width difference in a Latin font between 'i' and 'W', which seem like they might be far enough apart to trigger the dual-width behaviour accidentally.

Yes, you are right. It would be better to use mk_wcwidth_cjk if it can match commonly used CJK fonts. Checking the widths of all glyphs is hard work. I suggest we just use the current mk_wcwidth_cjk code for now; then we can improve it if people find any missing full-width glyph. At least it's enough for most popular Chinese fonts, I think.

I have a note from Markus which describes the process by which he generated the tables for mk_wcwidth_cjk:

http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c was directly derived in a very simple way from the data in

http://www.unicode.org/unicode/reports/tr11/
http://www.unicode.org/Public/UNIDATA/EastAsianWidth.txt

So, if there are discrepancies between any CJK font and the data in those tables, we should not only fix the mk_wcwidth_cjk function but also send along comments to the Unicode organization to try and get the source material corrected. In particular, Markus notes that:

"Those Cyrillic characters which show up in legacy fonts are meant to have width-class A (ambiguous) in http://www.unicode.org/Public/UNIDATA/EastAsianWidth.txt and for these, mk_wcwidth_cjk() should return a width value 2."

So, if there are Cyrillic characters in CJK fonts for which mk_wcwidth_cjk does not return 2, it is probably because the authors of TR11 didn't know about them. It should be very easy to automatically validate mk_wcwidth_cjk against any dual-width font with a quick application which loads the font without FC_MONO and compares character widths.

It sounds like we have a good process for moving forward then.
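The validation idea can be sketched as follows. This is only an outline under stated assumptions: the glyph advances are a hypothetical hand-written dictionary rather than values read from a real font, the width table is approximated with Python's `unicodedata`, and the cell width is assumed to be the most common advance among one-cell glyphs. The names `expected_cells` and `validate` are invented for the example.

```python
import unicodedata
from collections import Counter

def expected_cells(ch, cjk=True):
    """Cells a character should occupy under the chosen convention."""
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F") or (cjk and eaw == "A"):
        return 2
    return 1

def validate(advances, cjk=True):
    """Return (char, measured, expected) for every advance off the cell grid.

    advances maps a character to its measured advance in pixels.
    """
    one_cell = [adv for ch, adv in advances.items()
                if expected_cells(ch, cjk) == 1]
    # Assumption: the most common one-cell advance is the cell width.
    cell = Counter(one_cell).most_common(1)[0][0]
    return [(ch, adv, cell * expected_cells(ch, cjk))
            for ch, adv in advances.items()
            if adv != cell * expected_cells(ch, cjk)]

# Hypothetical measurements, not taken from any real font.
measured = {"a": 8, "i": 8, "W": 8, "中": 16, "Ж": 16, "─": 8}
print(validate(measured, cjk=True))   # glyphs that disagree with the CJK table
print(validate(measured, cjk=False))  # glyphs that disagree with the Unicode table
```

A real tool would obtain the advances by loading the font without FC_MONO, as suggested above; each reported mismatch is then either a broken glyph in the font or a gap in the table.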
I suggest that we add a new fontconfig parameter that selects between three possible spacing options:

+ strictly single-width -- for applications which cannot manage double-width glyphs.
+ Unicode dual-width, which assigns one character cell to the so-called 'CJK ambiguous' glyphs.
+ CJK dual-width, which assigns two character cells to the CJK ambiguous glyphs.

Then we can merge in the mk_wcwidth functions, probably referring to them as FcWcWidth/FcWcWidthCJK or some such.

There remains the question of how we compute the 'cell width' for the dual-width variants. The correct basis for this is the width of the one-cell glyphs, with double the space used for two-cell glyphs. It might be reasonably efficient to just compute this when the font is loaded; there aren't generally that many one-cell glyphs in any given font. If that's too slow, we may need to find some way to pre-compute a scalable value for the cell width; that, of course, may run afoul of hinting systems.

Yes, I agree with you. So please go ahead.

Created attachment 1735: xft-2.1.1-MakeBold-20040405.patch

Comment on attachment 1735 (xft-2.1.1-MakeBold-20040405.patch): Latest version of Akito's patch.

With freetype 2.1.10 and fontconfig 2.3.2, this problem is fixed in libXft-2.1.8.2 (Xorg 7.0/6.9), because the Xft library supports the embolden rule. See also: http://hellocity.net/~sangu/files/embolden/embolden.png

Sorry about the phenomenal bug spam, guys. Adding xorg-team@ to the QA contact so bugs don't get lost in future.

Mike, any updates available? SUSE is still using the p_xft_cjf.diff patch.

(In reply to comment #26)
> Mike, any updates available? SUSE is still using the p_xft_cjf.diff patch.

I just committed this one to libXft git head.

    commit cb80b4493e116229d8cc46507dec0fed6febd949
    Author: Stefan Dirsch <sndirsch@suse.de>
    Date: Sat Nov 22 20:45:02 2008 +0100

    Added fake bold support (#1579, Novell #38202/223682).
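For readers unfamiliar with the technique, the "restriking" approach to fake bold mentioned early in this thread can be sketched as below. This is not the code from the committed patch; it is a minimal illustration that assumes a bitmap glyph stored as rows of '#' (ink) and '.' (blank), with a hypothetical `embolden` helper that re-draws the glyph shifted right by one pixel per unit of strength.

```python
def embolden(bitmap, strength=1):
    """Restrike a bitmap glyph shifted right by 1..strength pixels.

    bitmap is a list of equal-length strings using '#' for ink.
    The result is `strength` pixels wider than the input.
    """
    width = len(bitmap[0]) + strength
    out = []
    for row in bitmap:
        cells = ["."] * width
        for x, px in enumerate(row):
            if px == "#":
                # Strike the pixel at its own position and at each offset.
                for dx in range(strength + 1):
                    cells[x + dx] = "#"
        out.append("".join(cells))
    return out

# A small 'L' shaped glyph, before and after restriking.
glyph = [
    "#...",
    "#...",
    "#...",
    "####",
]
for row in embolden(glyph):
    print(row)
```

Note that the restruck glyph is wider than the original. Whether the advance grows with it or the glyph is kept inside its original cell is a design choice, and in a character-cell font it interacts with the width questions discussed above.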