Bug 41095 - Fall back to Unicode compat decompositions for missing glyphs
Summary: Fall back to Unicode compat decompositions for missing glyphs
Status: RESOLVED FIXED
Alias: None
Product: HarfBuzz
Classification: Unclassified
Component: src (show other bugs)
Version: unspecified
Hardware: Other All
: medium normal
Assignee: Behdad Esfahbod
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-09-21 15:56 UTC by Philip Withnall
Modified: 2012-08-03 20:53 UTC (History)
1 user (show)

See Also:
i915 platform:
i915 features:


Attachments
Add support for compatibility decompositions (10.99 KB, patch)
2011-09-26 09:57 UTC, Philip Withnall
Details | Splinter Review
Add support for compatibility decompositions (updated) (11.09 KB, patch)
2011-09-27 13:19 UTC, Philip Withnall
Details | Splinter Review
Add support for compatibility decompositions (updated again) (13.58 KB, patch)
2011-12-07 07:57 UTC, Philip Withnall
Details | Splinter Review

Description Philip Withnall 2011-09-21 15:56:39 UTC
Upstreaming the report in https://bugzilla.gnome.org/show_bug.cgi?id=63633 about using compat decompositions for Unicode codepoints whose glyphs are missing in the current font.

I've got some time to work on this now, and having poked around it looks simpler in the new HarfBuzz than it would've been in the old one.

Looks to me like the core of the changes should be to add support for compat decompositions in hb-ot-shape-normalize.cc’s decompose() function. However, this requires adding support for compat decompositions to hb_unicode_decompose().

I don't know about the ICU implementation, but the GLib implementation could use g_unichar_fully_decompose() similarly to how hb_glib_unicode_decompose() currently uses g_unichar_decompose().

This would require a new vfunc to be added to hb-unicode.cc (which is why I haven't gone ahead and written a patch already, since I'm unsure of the best way to do this). I was wondering about basically mirroring the GLib API:

  size_t hb_unicode_compat_decompose (hb_unicode_funcs_t*, hb_codepoint_t in, hb_codepoint_t *out, size_t out_len)

from what I can gather, this would fit with the ICU API as well.

If that sounds reasonable I'll go ahead and concoct a patch.
Comment 1 Behdad Esfahbod 2011-09-23 11:02:13 UTC
(In reply to comment #0)
> Upstreaming the report in https://bugzilla.gnome.org/show_bug.cgi?id=63633
> about using compat decompositions for Unicode codepoints whose glyphs are
> missing in the current font.

Thanks.


> I've got some time to work on this now, and having poked around it looks
> simpler in the new HarfBuzz than it would've been in the old one.

Indeed.


> Looks to me like the core of the changes should be to add support for compat
> decompositions in hb-ot-shape-normalize.cc’s decompose() function. However,
> this requires adding support for compat decompositions to
> hb_unicode_decompose().

Yeah, almost.


> I don't know about the ICU implementation, but the GLib implementation could
> use g_unichar_fully_decompose() similarly to how hb_glib_unicode_decompose()
> currently uses g_unichar_decompose().
> 
> This would require a new vfunc to be added to hb-unicode.cc (which is why I
> haven't gone ahead and written a patch already, since I'm unsure of the best
> way to do this). I was wondering about basically mirroring the GLib API:
> 
>   size_t hb_unicode_compat_decompose (hb_unicode_funcs_t*, hb_codepoint_t in,
> hb_codepoint_t *out, size_t out_len)

Indeed.  That's exactly why I added g_unichar_fully_decompose() and deprecate g_unicode_decomposition()! :).


> from what I can gather, this would fit with the ICU API as well.

Yeah, I don't see any big problem there.


> If that sounds reasonable I'll go ahead and concoct a patch.

Go ahead.  If you do the plumbing, I'll make the hb-ot-shape-normalize.cc changes.
Comment 2 Philip Withnall 2011-09-26 09:57:58 UTC
Created attachment 51635 [details] [review]
Add support for compatibility decompositions

Sorry about the wait; I ended up without an internet connection this weekend.

This adds a decompose_fully vfunc and nil, GLib and ICU implementations. It extends the normalisation tests in test-unicode.c with a small selection of compatibility decomposition tests, all of which pass for GLib and ICU for me.
Comment 3 Behdad Esfahbod 2011-09-27 09:18:39 UTC
(In reply to comment #2)
> Created an attachment (id=51635) [details]
> Add support for compatibility decompositions
> 
> Sorry about the wait; I ended up without an internet connection this weekend.
> 
> This adds a decompose_fully vfunc and nil, GLib and ICU implementations. It
> extends the normalisation tests in test-unicode.c with a small selection of
> compatibility decomposition tests, all of which pass for GLib and ICU for me.

1. Lets rename from decompose_fully to decompose_compatibility().

2. Lets ditch the buffer-len argument and require that the array has at least 18 entries room.  What do you think?

3. I like to see a fallback glib implementation too, but I can do that myself.
Comment 4 Philip Withnall 2011-09-27 13:19:39 UTC
Created attachment 51686 [details] [review]
Add support for compatibility decompositions (updated)

> 1. Lets rename from decompose_fully to decompose_compatibility().

Done.

> 2. Lets ditch the buffer-len argument and require that the array has at least
> 18 entries room.  What do you think?

I can't think of any reason not to. It seems to work for GLib's g_unichar_to_utf8(). Done.

> 3. I like to see a fallback glib implementation too, but I can do that myself.

Done, using a round-trip to UTF-8 and g_utf8_normalize() in NFKD mode.
Comment 5 Philip Withnall 2011-10-12 01:21:09 UTC
Ping?
Comment 6 Philip Withnall 2011-12-07 07:57:05 UTC
Created attachment 54183 [details] [review]
Add support for compatibility decompositions (updated again)

Here's an updated patch with the minor documentation fix we talked about on IRC, plus the necessary modifications to decompose() to use the compatibility decomposition if the original or canonical glyphs don't exist in the font. Hopefully this is what you had in mind; it certainly seems to work for me (and the test suite passes, though I haven't added any new tests).
Comment 7 Philip Withnall 2012-03-28 10:21:36 UTC
Ping?
Comment 8 Behdad Esfahbod 2012-08-01 01:37:37 UTC
Just pushed out a commit based on your patch, but with modifications to the algorithm and callback semantics.  Please test.  Thanks!
Comment 9 Behdad Esfahbod 2012-08-01 01:49:19 UTC
I tested it with a few fonts. Works beautifully.
Comment 10 Philip Withnall 2012-08-03 20:53:36 UTC
Thanks for sorting this out!


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.