Bug 76767 - Zeroing of advance of 2nd component of multiple substitution with SBL Hebrew
Alias: None
Product: HarfBuzz
Classification: Unclassified
Component: src
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: Behdad Esfahbod
QA Contact:
Depends on:
Reported: 2014-03-29 02:27 UTC by jjc@jclark.com
Modified: 2014-07-27 00:35 UTC

Description jjc@jclark.com 2014-03-29 02:27:33 UTC
With SBL-Hbrw.ttf from http://www.sbl-site.org/Fonts/SBL_Hbrw.ttf, on the sequence

05e0 05b8 0591 05da 05b0

hb-shape gives


I think the advance of uni2009 should be 200 (its normal advance) not 0.

Looking at the font and the trace output of HarfBuzz, I cannot see any lookup that zeros the advance of uni2009.

What's happening is that 0591 gets decomposed into 0591+2009 (by lookup 41, called from lookup 17).  I guess HarfBuzz's zeroing of 2009 has something to do with 0591's being a mark.
Comment 1 Behdad Esfahbod 2014-04-04 01:00:06 UTC
I can confirm that Uniscribe generates different results than HarfBuzz, which are consistent with what you describe.  Will take a look.
Comment 2 Behdad Esfahbod 2014-04-04 01:03:11 UTC
If I were to guess what's happening: when we decompose a glyph into two, both new glyphs inherit the properties of the originating glyph, and we zero their widths.  Looks like Uniscribe doesn't do this, and it's not always desirable.  We need to figure out how to fix this without regressing other things.  It's nontrivial.  Jonathan?
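
The guess above can be illustrated with a toy model. This is only a sketch of the suspected behaviour, not HarfBuzz code: the Glyph class, the function names, the glyph name "uni0591" and its advance of 0 are assumptions made up for illustration; only uni2009 and its normal advance of 200 come from the report.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Glyph:
    name: str
    advance: int
    is_mark: bool  # property carried by the glyph in the buffer


def decompose(source, parts):
    """Toy multiple substitution: every output glyph inherits
    the is_mark property of the glyph it was produced from."""
    return [Glyph(name, advance, source.is_mark) for name, advance in parts]


def zero_mark_advances(glyphs):
    """Zero the advance of every glyph flagged as a mark."""
    return [replace(g, advance=0) if g.is_mark else g for g in glyphs]


# U+0591 is a mark; the font decomposes it into itself plus uni2009.
accent = Glyph("uni0591", 0, True)  # hypothetical name and advance
pieces = decompose(accent, [("uni0591", 0), ("uni2009", 200)])
shaped = zero_mark_advances(pieces)

for g in shaped:
    print(g.name, g.advance)
```

In this model uni2009 loses its advance of 200 only because it inherited the mark flag from U+0591, which matches the symptom in the description.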
Comment 3 jjc@jclark.com 2014-04-05 02:30:19 UTC
I know this is an extremely tricky area, but I will make a suggestion anyway.

Before commit 568000274c8edb5f41bc4f876ce21fcc8bdaeed8, you were zeroing mark advances in fix_mark_attachment. I like that approach a lot: attaching a mark using MarkToBase while still using the original advance doesn't make a lot of sense. Zeroing mark advances based on the Unicode general category when the font provides a proper "mark" feature seems overly aggressive to me.

So here's my suggestion:

- if the GPOS table provides the "mark" feature, do what you were doing before that commit (ie zero mark advances in fix_mark_attachment except for Indic and Myanmar);

- otherwise, do what you are doing now.

I suggest checking for the "mark" feature rather than just checking for GPOS, because there are fonts (eg Georgia in Windows 8) that have a GPOS without the "mark" feature but include non-spacing marks with non-zero advances.
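
A minimal sketch of the rule suggested above, written as a toy model rather than against HarfBuzz's real data structures. The Glyph fields, the function name, the script_exempt flag (standing in for the Indic and Myanmar exception) and the advance of 120 are all hypothetical; only the "mark" feature tag and uni2009's advance of 200 come from this report.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Glyph:
    name: str
    advance: int
    is_mark_by_category: bool  # treated as a mark via Unicode general category
    attached: bool             # positioned by a mark attachment lookup


def zero_advances_suggested(glyphs, gpos_features, script_exempt=False):
    """Choose the zeroing strategy based on the font's GPOS features."""
    if "mark" in gpos_features:
        # Font has a proper "mark" feature: zero only glyphs that were
        # actually attached, and skip exempt scripts entirely.
        if script_exempt:
            return list(glyphs)
        return [replace(g, advance=0) if g.attached else g for g in glyphs]
    # No "mark" feature: fall back to zeroing by general category.
    return [replace(g, advance=0) if g.is_mark_by_category else g
            for g in glyphs]


run = [
    Glyph("accent", 120, True, True),    # hypothetical attached mark
    Glyph("uni2009", 200, True, False),  # inherited mark status, not attached
]
with_mark = zero_advances_suggested(run, {"kern", "mark"})
without_mark = zero_advances_suggested(run, {"kern"})
```

With the "mark" feature present, uni2009 keeps its advance of 200 because nothing attached it; without the feature, both glyphs are zeroed, which covers fonts like the Georgia case.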

Here is some relevant history

Comment 4 jjc@jclark.com 2014-04-05 02:38:35 UTC
Sorry, I forgot I wasn't on GitHub and that commits aren't autolinked. The commit I was talking about is:


Another approach might be to do for decompositions something similar to what you are already doing for ligatures:

Comment 5 Behdad Esfahbod 2014-07-27 00:35:26 UTC

commit 9e834e29e0b657f0555df1ab9cea79ff7abcf08d
Author: Behdad Esfahbod <behdad@behdad.org>
Date:   Sat Jul 26 20:34:01 2014 -0400

    [hebrew] Zero mark advance by GDEF late
    Seems to be what Uniscribe does.
    At this point I think it's worth checking our default...
    Fixes Bug 76767 - Zeroing of advance of 2nd component of multiple
    substitution with SBL Hebrew
    Micro-test added.
