Bug 1579

Summary: fake bold support to Xft library
Product: xorg
Reporter: Stefan Dirsch <sndirsch>
Component: Lib/Xft
Assignee: James Su <suzhe>
Status: RESOLVED FIXED
QA Contact: Xorg Project Team <xorg-team>
Severity: normal
Priority: high
CC: alan.coopersmith, eich, jay.hobson, keithp, mat, mfabian, roland.mainz, sangu.xorg, suzhe
Version: git
Keywords: patch
Hardware: Other
OS: Linux (All)
Attachments:
p_xft_cjk.diff (flags: none)
xft-2.1.1-MakeBold-20040405.patch (flags: none)

Description Stefan Dirsch 2004-10-10 08:15:40 UTC
I'll attach a patch that adds fake bold support to the Xft library.
Comment 1 Stefan Dirsch 2004-10-10 08:16:28 UTC
Created attachment 1062
p_xft_cjk.diff
Comment 2 Stefan Dirsch 2004-10-11 02:35:39 UTC
Needs to be discussed outside of Bugzilla first.  
 
Comment 3 Stefan Dirsch 2004-10-12 08:27:12 UTC
Reopening for discussion.
Comment 4 Stefan Dirsch 2004-10-12 08:30:50 UTC
Needs to be discussed with Keith, Mike and James Su. 
Comment 5 Keith Packard 2004-10-12 11:47:35 UTC
It looks like this patch conflates synthetic emboldening of bitmap fonts with a
fix to repair broken metrics in fonts which are supposed to be dual-width.

I have learned more about how dual-width fonts are supposed to work in terminal
emulator environments and now understand that there is actually a standard (IEEE
Std 1003.1-2001) which defines correct behaviour for terminal applications.  I
suggest that 'monospace' really means 'dualwidth' in all cases and that we
should apply the standard all of the time.

I think it's easy to do; just use the wcwidth function found in xterm, which
already has a suitable license.

Is this a good or bad plan?
Comment 6 Egbert Eich 2004-10-18 05:14:40 UTC
James, would you mind taking a look at Keith's comment and discussing this issue
with him?
You know more about this than I do.
Comment 7 James Su 2004-10-18 06:34:14 UTC
This patch was written by Akito (http://www.kde.gr.jp/~akito/patch/). So maybe
he can give us some explanation.

But he has no account in this Bugzilla, so he cannot be added to the CC list.

His email: akito@kde.gr.jp
Comment 8 Mike FABIAN 2004-10-18 07:05:20 UTC
Keith, can you please explain comment #5 in more detail?

Do you think the way the patch tries to fix the metric issues is bad?

Do you think the way synthetic emboldening is achieved is OK?
Synthetic emboldening is very important for CJK as there are almost
no real bold CJK fonts, at least no free ones.

How do you want to use wcwidth() in that patch?
Comment 9 Keith Packard 2004-10-18 10:33:28 UTC
I think the synthetic emboldening of bitmap fonts is a fine idea, and restriking
the image multiple times seems like the only mechanism we can use.  We might
consider doing a hack for synthetic oblique at some point as well.  Not so much
because it would look really great, but because when combined with synthetic
bold, we can now generate four variants for each face, which turns out to make
CSS compatible matching a *lot* easier.
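
To illustrate what "restriking the image multiple times" means for a bitmap glyph, here is a minimal sketch. It is not taken from the attached patch; the function name, the '#'/'.' bitmap representation and the one-pixel shift per strike are all assumptions chosen for readability.

```python
def embolden_bitmap(rows, strikes=2):
    """Return a bolder copy of a 1-bit glyph bitmap (illustrative only).

    rows is a list of equal-length strings made of '#' (ink) and '.' (blank).
    The glyph is struck `strikes` times, each strike shifted one pixel further
    to the right, and all strikes are OR-ed together. The result is
    strikes - 1 pixels wider than the input.
    """
    extra = strikes - 1
    out = []
    for row in rows:
        bold = ['.'] * (len(row) + extra)
        for shift in range(strikes):
            for x, pixel in enumerate(row):
                if pixel == '#':
                    bold[x + shift] = '#'
        out.append(''.join(bold))
    return out


# A one-pixel vertical stem becomes a two-pixel stem.
stem = [".#.", ".#.", ".#."]
bold_stem = embolden_bitmap(stem)
```

Note that the emboldened glyph is wider than the original, so the advance width has to grow with it. That is one reason emboldening and the dual-width metric fix interact.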

But, I think the patch conflates the emboldening code along with two other fixes:

  1) A new config option that prefers bitmaps where available (which seems
     fine to me).

  2) Code which tries to make fonts dual-width by snapping narrower glyphs
     to half size and wider glyphs to full size.

It is this latter code which I believe could benefit from wcwidth -- using that
table to accurately identify which glyphs are expected to be one character cell
and which are expected to be two character cells should make fonts usable in
terminal emulators.  Without accurate identification, we'll end up occasionally
mis-drawing a glyph and causing chaos in the target application.  Also, I
believe the correct mode here is to compute the half-width value in pixels and
double it for the larger glyphs to ensure accurate alignment on the screen.
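
A small worked example of why the half-width should be rounded to pixels first and then doubled, rather than rounding the two widths independently. The function names and the 7.4-pixel scaled width are hypothetical.

```python
def cell_advances(half_width):
    """Round the half width once, then double it for two-cell glyphs."""
    one_cell = round(half_width)
    return one_cell, 2 * one_cell


def naive_advances(half_width):
    """Round half and full widths independently (shown for comparison only)."""
    return round(half_width), round(2 * half_width)


# Hypothetical scaled half width of 7.4 pixels.
aligned = cell_advances(7.4)   # (7, 14): one wide glyph spans exactly two cells
drifted = naive_advances(7.4)  # (7, 15): one wide glyph is a pixel wider than two cells
```

With the naive rounding, a line of wide glyphs drifts one pixel per glyph relative to a line of narrow ones, which breaks column alignment in a terminal.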
Comment 10 James Su 2004-10-18 23:46:33 UTC
But for many symbol glyphs, the wcwidth results are not suitable for CJK fonts
and users.

For example, most glyphs in CYRILLIC (0x0400-0x04FF), CURRENCY SYMBOLS
(0x20A0-0x20CF), BOX DRAWING (0x2500-0x257F) and many other areas, are usually
double-width glyphs in most CJK fonts. But wcwidth treats them as single width.
If they are drawn with such fonts as single width glyphs, the result will be
unacceptable.

So I think it's not a good way to use wcwidth here. I'd prefer this patch instead.
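
For reference, the code points cited above can be looked up in the Unicode East Asian Width property. This sketch uses the Unicode database bundled with the Python interpreter, not the xterm wcwidth table itself, so it only shows how Unicode classifies these characters.

```python
import unicodedata

samples = {
    "CYRILLIC CAPITAL LETTER A": "\u0410",
    "EURO SIGN": "\u20ac",
    "BOX DRAWINGS LIGHT HORIZONTAL": "\u2500",
    "LATIN SMALL LETTER A": "a",
    "CJK ideograph U+4E2D": "\u4e2d",
}

# 'A' = ambiguous, 'Na' = narrow, 'W' = wide
classes = {name: unicodedata.east_asian_width(ch) for name, ch in samples.items()}
```

The Cyrillic, currency and box-drawing samples come out as class 'A' (ambiguous): narrow in a non-CJK context, wide in legacy CJK fonts. A width table that is not CJK-aware reports them as one cell.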
Comment 11 Mike FABIAN 2004-10-19 02:54:20 UTC
Keith> We might consider doing a hack for synthetic oblique at some point as well.

But this works already, doesn't it?

The following rule in /etc/fonts/fonts.conf seems to do it:

<!-- 
 Artificial oblique for fonts without an italic or oblique version
 -->
 
	<match target="font">
		<!-- check to see if the font is roman -->
		<test name="slant">
			<const>roman</const>
		</test>
		<!-- check to see if the pattern requested non-roman -->
		<test target="pattern" name="slant" compare="not_eq">
			<const>roman</const>
		</test>
		<!-- multiply the matrix to slant the font -->
		<edit name="matrix" mode="assign">
			<times>
				<name>matrix</name>
				<matrix><double>1</double><double>0.2</double>
					<double>0</double><double>1</double>
				</matrix>
			</times>
		</edit>
		<!-- pretend the font is oblique now -->
		<edit name="slant" mode="assign">
			<const>oblique</const>
		</edit>
	</match>

This rule has been there for a long time already and seems to work fine.
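
As a worked example of what the matrix in that rule does, the sketch below applies it to two points of an outline. It assumes the four doubles are given in the order xx, xy, yx, yy and that the font's existing matrix is the identity; the helper name is made up for illustration.

```python
def apply_matrix(xx, xy, yx, yy, x, y):
    """Apply a 2x2 matrix given as (xx, xy, yx, yy) to the point (x, y)."""
    return (xx * x + xy * y, yx * x + yy * y)


# The matrix from the rule: 1 0.2 0 1
on_baseline = apply_matrix(1, 0.2, 0, 1, 0, 0)   # stays at (0.0, 0)
ten_units_up = apply_matrix(1, 0.2, 0, 1, 0, 10)  # moves right to (2.0, 10)
```

Points on the baseline stay put, while points higher up are pushed to the right by 0.2 times their height, which produces the slant.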

I can see 4 variants of the "Sazanami Mincho" font for example:

   xfd -fa "Sazanami Mincho:weight=100:slant=0"
   xfd -fa "Sazanami Mincho:weight=100:slant=100"
   xfd -fa "Sazanami Mincho:weight=200:slant=0"
   xfd -fa "Sazanami Mincho:weight=200:slant=100"

all display fine although only one regular font is available:

mfabian@magellan:~$ fc-list "Sazanami Mincho" family style file
/usr/X11R6/lib/X11/fonts/truetype/sazanami-mincho.ttf: Sazanami Mincho:style=Mincho-Regular
mfabian@magellan:~$
Comment 12 Keith Packard 2004-10-19 09:25:01 UTC
Again, let's separate the notions of synthetic style variation and fixed-width
pitch adjustment.

On the notion of the fixed width pitch adjustment, I cannot see how we can use
an essentially random basis for deciding which glyphs are one cell and which are
two cells in a terminal emulator -- any terminal-based application must have
a priori knowledge about character widths or it will not be able to lay out text
on the screen correctly.

Markus's proposed code provides two sets of width information, one set for
'pure' unicode applications where glyphs not obviously dual width in the Unicode
standard are assigned single width and a separate set for legacy CJK
applications where everything traditionally encoded in two bytes in non-Unicode
encodings is assigned dual width.

Perhaps we should expose both tables so that applications exposing either
Unicode or non-Unicode encoding to applications can get correct character widths.

xterm provides access to both such tables via the cjk_width configuration
option.  gnome-terminal uses a kludge where it bases some widths on the current
locale(!).

On the notion of synthetic font variants, we have synthetic oblique for outline
fonts, and I believe there is some code in FreeType to synthetically embolden
outlines.  With the addition here of synthetic bold bitmaps, we lack only
synthetic oblique bitmaps.
Comment 13 James Su 2004-10-19 10:30:32 UTC
On the notion of the fixed width pitch adjustment, we are unable to fix all the
buggy fonts which contain glyphs with the wrong width. So if the width of a
glyph obtained from a standard table does not equal the real width in the
font, the rendering result will be very ugly. If we use such a font in a terminal
which conforms to the standard, the rendering result would be unacceptable.
And I think fontconfig has no responsibility to conform to such a terminal
standard; it should return the real font information to applications.
Comment 14 Keith Packard 2004-10-19 12:55:33 UTC
An application able to adapt to whatever glyph widths are in the font
shouldn't specify FC_MONO spacing.

The question is how Xft should help applications which expect "standard"
fixed-width glyphs and signal that with the FC_MONO spacing flag.  By "fixed
width", I mean glyphs which occupy either one or two character cells according
to some convention.  By using a known convention, one can actually produce
output with known alignment which is independent of the font in use; a key
feature for terminal-based applications. 

One of the "features" of the FC_MONO flag is that it "repairs" broken fonts to
ensure they follow the character cell conventions expected by the applications
which specify this spacing.  There are many fonts which are expected to be used
in character cell applications but which have a few broken glyphs, making this
repair necessary.

So, I guess the question is whether there are fonts useful in dual-width
environments which do not generally follow either of the tables produced by Markus.
Comment 15 James Su 2004-10-19 19:19:36 UTC
Where can I get the Markus width tables? I can check these tables to see if they
are ok for, at least, Chinese fonts.

In other words, if all fonts conform to either of these tables, then this patch
can also give the correct result.
Comment 16 Keith Packard 2004-10-19 19:53:58 UTC
Markus's code is available from http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c
There are two separate functions, mk_wcwidth and mk_wcwidth_cjk, the latter is
designed to match the conventional widths of national-encoding CJK fonts. 
Markus is a very reliable source of information in these areas, so I would
expect it to be largely correct.   I'm not sure how it might resolve any
conflicts in conventional widths among the various national encodings; I'm not
even sure there are any.
Comment 17 James Su 2004-10-19 22:25:50 UTC
I checked Markus's functions. mk_wcwidth_cjk is OK for most glyphs in
normal Chinese fonts. But there are still some exceptions. For example, Cyrillic
0x0410-0x0451 are full width glyphs in most Chinese fonts, while mk_wcwidth_cjk
treats them as half width.

I just checked Chinese fonts. I don't know whether the Japanese and Korean
fonts are OK.
Comment 18 Keith Packard 2004-10-19 22:35:29 UTC
Why don't I ask Markus to clarify the situation; perhaps that code just needs to
be fixed.  I think the comment at the top of the code suggests rather strongly
that our intended purpose should be a perfect match for the mk_wcwidth_cjk
function.   I'll send him mail and report back when I hear from him.

I think this general approach which uses conventional width values for each
character from a table seems "more correct" than an approximate solution which
assumes the glyph metrics aren't grossly inaccurate.  I fear, in particular, the
width difference in a Latin font between 'i' and 'W', which seem like they might
be far enough apart to trigger the dual-width behaviour accidentally.
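
The concern about 'i' and 'W' can be made concrete with a small sketch. The threshold of 1.5 times the half width and the pixel advances are invented for illustration and are not taken from the patch; the point is only that any metric-based cut-off can misfire on a proportional font.

```python
def snap_by_metrics(advance, half_width):
    """Guess the cell count from the glyph's own advance (the fragile approach)."""
    return 1 if advance <= 1.5 * half_width else 2


# Hypothetical advances, in pixels, for a proportional Latin font.
advances = {"i": 3, "n": 7, "W": 12}
half_width = advances["n"]
cells = {ch: snap_by_metrics(adv, half_width) for ch, adv in advances.items()}
# 'W' is classified as a two-cell glyph even though it is a Latin letter.
```

A table-driven lookup avoids this because the cell count depends only on the character, not on the metrics of whichever font happens to be loaded.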
Comment 19 James Su 2004-10-20 06:38:09 UTC
Yes, you are right. It would be better to use mk_wcwidth_cjk if it can match
commonly used CJK fonts.

Checking the widths of all glyphs is hard work. I suggest just using the current
mk_wcwidth_cjk code for now. Then we can improve it if people find any missing
full-width glyphs.

At least it's enough for most popular Chinese fonts, I think.
Comment 20 Keith Packard 2004-10-20 10:29:55 UTC
I have a note from Markus which describes the process by which he generated the
tables for mk_wcwidth_cjk:

  http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c

was directly derived in a very simple way from the data in

  http://www.unicode.org/unicode/reports/tr11/
  http://www.unicode.org/Public/UNIDATA/EastAsianWidth.txt

So, if there are discrepancies between any CJK font and the data in those
tables, we should not only fix the mk_wcwidth_cjk function but also send along
comments to the Unicode organization to try and get the source material corrected.

In particular, Markus notes that:

"Those Cyrillic characters which show up in legacy fonts are meant to
 have width-class A (ambiguous) in

  http://www.unicode.org/Public/UNIDATA/EastAsianWidth.txt

 and for these, mk_wcwidth_cjk() should return a width value 2."

So, if there are Cyrillic characters in CJK fonts for which mk_wcwidth_cjk does
not return 2, it is probably because the authors of TR11 didn't know about them.

It should be very easy to automatically validate mk_wcwidth_cjk against any
dual-width font with a quick application which loads the font without FC_MONO
and compares character widths.
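
A rough sketch of such a validation pass follows. It stands in Python's bundled East Asian Width data for mk_wcwidth_cjk (treating classes W, F and A as two cells, and ignoring zero-width characters), and it takes the measured advances as a plain dictionary. Loading a real font and measuring its glyphs is left out, and the sample measurements are hypothetical.

```python
import unicodedata


def expected_cells_cjk(ch):
    """Cells under the legacy CJK convention: W, F and A count as two."""
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F", "A") else 1


def find_mismatches(advances, cell_width):
    """Return {char: (measured, expected)} where the font and the table disagree.

    advances maps a character to its advance in pixels, measured without
    FC_MONO so that the font's own metrics are seen.
    """
    mismatches = {}
    for ch, advance in sorted(advances.items()):
        expected = expected_cells_cjk(ch) * cell_width
        if advance != expected:
            mismatches[ch] = (advance, expected)
    return mismatches


# Hypothetical measurements from a CJK bitmap font with 8-pixel cells;
# 'b' stands for a glyph with broken metrics.
measured = {"a": 8, "b": 16, "\u4e2d": 16, "\u0410": 16}
report = find_mismatches(measured, 8)  # {'b': (16, 8)}
```

Each entry in the report is either a broken glyph in the font or a gap in the table, which is exactly the distinction that would need checking by hand.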

It sounds like we have a good process for moving forward then.  I suggest that
we add a new fontconfig parameter that selects between three possible spacing
options:

   + strictly single-width -- for applications which cannot manage double-width
glyphs.
   + Unicode dual-width which assigns one character cell for these so-called
'CJK ambiguous' glyphs.
   + CJK dual-width which assigns two character cells to the CJK ambiguous glyphs.

Then we can merge in the mk_wcwidth functions, probably referring to them as
FcWcWidth/FcWcWidthCJK or some such.
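
The three proposed options can be sketched as a single lookup. The enum and function names are hypothetical (no such fontconfig parameter exists in this thread yet), and Python's bundled East Asian Width data stands in for the mk_wcwidth tables; zero-width characters are ignored for brevity.

```python
import unicodedata
from enum import Enum


class Spacing(Enum):
    SINGLE = "single"              # strictly single-width
    DUAL_UNICODE = "dual-unicode"  # ambiguous glyphs take one cell
    DUAL_CJK = "dual-cjk"          # ambiguous glyphs take two cells


def cell_count(ch, spacing):
    """Return how many character cells ch occupies under the given option."""
    if spacing is Spacing.SINGLE:
        return 1
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):
        return 2
    if eaw == "A" and spacing is Spacing.DUAL_CJK:
        return 2
    return 1
```

For example, CYRILLIC CAPITAL LETTER A (U+0410) gets one cell under DUAL_UNICODE and two under DUAL_CJK, while a CJK ideograph gets two cells under both dual-width options.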

There remains the question of how we compute the 'cell width' for the dual-width
variants.  The correct basis for this is the width of the one-cell glyphs, with
double the space used for two-cell glyphs.  It might be reasonably efficient to
just compute this when the font is loaded; there aren't generally that many
one-cell glyphs in any given font.  If that's too slow, we may need to find some
way to pre-compute a scalable value for the cell width; that, of course, may run
afoul of hinting systems.
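
A minimal sketch of computing the cell width at load time follows. Taking the widest one-cell glyph is an assumption (the comment above does not say how the one-cell widths should be reduced to a single value), and `crude_cells` is only a stand-in for a real width table.

```python
def crude_cells(ch):
    """Stand-in width table: treat U+2E80 and above as two cells."""
    return 2 if ord(ch) >= 0x2E80 else 1


def compute_cell_width(advances, cells_for):
    """Cell width = widest advance among glyphs the table calls one-cell."""
    one_cell = [adv for ch, adv in advances.items() if cells_for(ch) == 1]
    if not one_cell:
        raise ValueError("font has no one-cell glyphs")
    return max(one_cell)


def monospace_advance(ch, cell_width, cells_for):
    """Advance used for drawing: a whole number of cells."""
    return cells_for(ch) * cell_width


# Hypothetical pixel advances measured when the font is loaded.
advances = {"i": 7, "m": 8, "\u4e2d": 15}
cell = compute_cell_width(advances, crude_cells)       # 8
wide = monospace_advance("\u4e2d", cell, crude_cells)  # 16, not the font's 15
```

Only the one-cell glyphs are scanned, which matches the observation that there are usually few of them in a given font.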
Comment 21 James Su 2004-10-20 20:00:52 UTC
Yes. I agree with you. So please go ahead.
Comment 22 Atsushi Kanemoto 2005-01-22 10:14:51 UTC
Created attachment 1735
xft-2.1.1-MakeBold-20040405.patch
Comment 23 Atsushi Kanemoto 2005-01-22 10:16:54 UTC
Comment on attachment 1735
xft-2.1.1-MakeBold-20040405.patch

Latest version of Akito's patch.
Comment 24 sangu 2006-02-04 17:35:04 UTC
With freetype 2.1.10 and fontconfig 2.3.2, this problem is fixed in
libXft-2.1.8.2 (Xorg 7.0/6.9), because the Xft library now supports the
embolden rule.

See also: http://hellocity.net/~sangu/files/embolden/embolden.png
Comment 25 Daniel Stone 2007-02-27 01:24:20 UTC
Sorry about the phenomenal bug spam, guys.  Adding xorg-team@ to the QA contact so bugs don't get lost in future.
Comment 26 Stefan Dirsch 2007-11-04 08:27:10 UTC
Mike, any updates available? SUSE is still using the p_xft_cjk.diff patch.
Comment 27 Stefan Dirsch 2008-11-22 11:52:36 UTC
(In reply to comment #26)
> Mike, any updates available? SUSE is still using the p_xft_cjk.diff patch.

I just committed this one to libXft git head.

commit cb80b4493e116229d8cc46507dec0fed6febd949
Author: Stefan Dirsch <sndirsch@suse.de>
Date:   Sat Nov 22 20:45:02 2008 +0100

    Added fake bold support (#1579, Novell #38202/223682).



