With SBL-Hbrw.ttf from http://www.sbl-site.org/Fonts/SBL_Hbrw.ttf, on the sequence
05e0 05b8 0591 05da 05b0
I think the advance of uni2009 should be 200 (its normal advance) not 0.
Looking at the font and the trace output of HarfBuzz, I cannot see any lookup that zeros the advance of uni2009.
What's happening is that 0591 gets decomposed into 0591+2009 (by lookup 41, called from lookup 17). I guess HarfBuzz's zeroing of 2009 has something to do with 0591's being a mark.
I can confirm that Uniscribe generates different results than HarfBuzz, which are consistent with what you describe. Will take a look.
If I were to guess what's happening: when we decompose a glyph into two, both new glyphs inherit the properties of the originating glyph, and we zero their widths. It looks like Uniscribe doesn't do this, and it's not always desirable. We need to figure out how to fix this without regressing other things. It's nontrivial. Jonathan?
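To make that guess concrete, here is a toy model in Python. It is not HarfBuzz code: `Glyph`, `decompose`, `zero_mark_advances` and the `inherit` switch are hypothetical names, and the advance of 0 for uni0591 is assumed for illustration (only the 200 for uni2009 comes from the report). It only shows how inheriting the source glyph's mark property would take uni2009's advance from 200 to 0.

```python
from dataclasses import dataclass, replace

# Hypothetical font data. 200 for uni2009 is from the report;
# the advance for uni0591 is an assumed placeholder.
FONT_ADVANCE = {"uni0591": 0, "uni2009": 200}
FONT_IS_MARK = {"uni0591": True, "uni2009": False}


@dataclass(frozen=True)
class Glyph:
    name: str
    advance: int
    is_mark: bool


def decompose(source, component_names, inherit):
    """Split `source` into components (a multiple substitution).

    inherit=True  : every component copies the source's mark property
                    (the behaviour guessed at above).
    inherit=False : every component uses its own class from the font.
    """
    return [
        Glyph(
            name,
            FONT_ADVANCE[name],
            source.is_mark if inherit else FONT_IS_MARK[name],
        )
        for name in component_names
    ]


def zero_mark_advances(glyphs):
    """Zero the advance of every glyph currently flagged as a mark."""
    return [replace(g, advance=0) if g.is_mark else g for g in glyphs]


etnahta = Glyph("uni0591", FONT_ADVANCE["uni0591"], True)
parts = ["uni0591", "uni2009"]

inherited = zero_mark_advances(decompose(etnahta, parts, inherit=True))
own_class = zero_mark_advances(decompose(etnahta, parts, inherit=False))

print([g.advance for g in inherited])  # uni2009 is treated as a mark
print([g.advance for g in own_class])  # uni2009 keeps its font advance
```

In this model the only difference between the two runs is where the second component gets its mark property from, which is the part that seems to differ from Uniscribe.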
I know this is an extremely tricky area, but I will make a suggestion anyway.
Before commit 568000274c8edb5f41bc4f876ce21fcc8bdaeed8, you were zeroing mark advances in fix_mark_attachment. I like that approach a lot: attaching a mark using MarkToBase while still using the original advance doesn't make a lot of sense. Zeroing mark advances based on the Unicode general category when the font provides a proper "mark" feature seems overly aggressive to me.
So here's my suggestion:
- if the GPOS table provides the "mark" feature, do what you were doing before that commit (i.e. zero mark advances in fix_mark_attachment, except for Indic and Myanmar);
- otherwise, do what you are doing now.
I suggest checking for the "mark" feature rather than just checking for GPOS, because there are fonts (e.g. Georgia in Windows 8) that have a GPOS table without the "mark" feature but include non-spacing marks with non-zero advances.
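A rough Python sketch of the policy suggested above, under some loud assumptions: the glyph records, the `attached` flag (standing in for "positioned by a mark attachment lookup", which is roughly what fix_mark_attachment acted on), the `shaper` string, and the use of general category Mn for the fallback are all illustrative choices, not HarfBuzz's actual data structures or API.

```python
import unicodedata

# Shapers exempted from zeroing in the suggestion above.
EXEMPT_SHAPERS = {"indic", "myanmar"}


def zero_mark_advances(glyphs, gpos_features, shaper):
    """Apply the suggested two-branch policy to a list of glyph dicts.

    Each glyph dict has (hypothetical) keys:
      'char'     : the character the glyph's properties derive from
      'advance'  : horizontal advance in font units
      'attached' : True if a GPOS mark attachment positioned this glyph
    """
    has_mark_feature = "mark" in gpos_features
    for g in glyphs:
        if has_mark_feature:
            # Font does its own mark positioning: only zero what it attached.
            if g["attached"] and shaper not in EXEMPT_SHAPERS:
                g["advance"] = 0
        elif unicodedata.category(g["char"]) == "Mn":
            # No "mark" feature: fall back to the Unicode general category.
            g["advance"] = 0
    return glyphs


def sample():
    # U+0591 decomposed into uni0591 + uni2009; the second glyph still
    # carries the properties of U+0591, but nothing attaches it.
    return [
        {"char": "\u0591", "advance": 0, "attached": True},
        {"char": "\u0591", "advance": 200, "attached": False},
    ]


with_mark = zero_mark_advances(sample(), {"kern", "mark"}, "hebrew")
without_mark = zero_mark_advances(sample(), {"kern"}, "hebrew")

print([g["advance"] for g in with_mark])
print([g["advance"] for g in without_mark])
```

With a "mark" feature present, the unattached second component keeps its advance in this sketch; without one, the general-category fallback still zeroes it, which matches the current behaviour for fonts like Georgia.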
Here is some relevant history
Sorry, I forgot I wasn't on GitHub and that commits aren't autolinked. The commit I was talking about is:
Another approach might be to do for decompositions something similar to what you are already doing for ligatures:
Author: Behdad Esfahbod <firstname.lastname@example.org>
Date: Sat Jul 26 20:34:01 2014 -0400
[hebrew] Zero mark advance by GDEF late
Seems to be what Uniscribe does.
At this point I think it's worth checking our default...
Fixes Bug 76767 - Zeroing of advance of 2nd component of multiple
substitution with SBL Hebrew