This is a classic Mongolian shaper patch for HarfBuzz. Tested with the MonBaiti font (taken from a Windows 7 installation) and a Firefox nightly build using HarfBuzz level 3 on Mac OS.
Screenshot: https://github.com/tugstugi/mongolian-script/raw/master/misc/minefield-harfbuzz-mongolian.png
For comparison, a screenshot of IE8 on Windows 7: https://github.com/tugstugi/mongolian-script/raw/master/misc/ie-uniscribe-mongolian.png
Alternatively, the patch could also be incorporated into the Arabic shaper.
Created attachment 40960: classic Mongolian shaper patch
Hi. Thanks for the patch! It was indeed my plan to add Mongolian support to the Arabic shaper. Can you work on that? That would also be easier for me to review.
Humm, does IE support vertical layout?! Where's the test HTML page?
Humm. Your shaping logic is simple to follow. However, I remember that Unicode was a bit more complex. Let me review.
(In reply to comment #3)
> Humm, does IE support vertical layout?! Where's the test HTML page?

Yes, IE supports vertical layout. WebKit nightly builds also support it through "-webkit-writing-mode:vertical-lr". The Mongolian Unicode test page is here: http://www.babelstone.co.uk/Test/Mongolian.html
(In reply to comment #4)
> Humm. Your shaping logic is simple to follow. However, I remember that
> Unicode was a bit more complex. Let me review.

Well, Mongolian shaping is pretty simple because all letters are dual-joining.
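To illustrate why dual-joining makes the shaping simple, here is a minimal sketch of how positional forms could be chosen from joining types alone. The `Joining` and `Form` enums and the `assign_forms` function are hypothetical names invented for this illustration, not HarfBuzz's actual code; the transparent type (T) is included only to show that such characters are skipped when looking for a neighbour.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified joining types (not HarfBuzz's actual enum):
// U = non-joining, D = dual-joining, T = transparent.
enum class Joining { U, D, T };

// Hypothetical positional forms, named after the usual OpenType features.
enum class Form { None, Isol, Init, Medi, Fina };

// Assign a positional form to each character from its joining type.
// A dual-joining letter connects to the nearest non-transparent
// neighbour on a given side if that neighbour is also dual-joining.
std::vector<Form> assign_forms (const std::vector<Joining> &types)
{
  std::vector<Form> forms (types.size (), Form::None);
  for (std::size_t i = 0; i < types.size (); i++)
  {
    if (types[i] != Joining::D)
      continue; // only joining letters receive a positional form

    // Look backwards, skipping transparent characters.
    bool joins_prev = false;
    for (std::size_t j = i; j-- > 0; )
    {
      if (types[j] == Joining::T) continue;
      joins_prev = (types[j] == Joining::D);
      break;
    }

    // Look forwards, skipping transparent characters.
    bool joins_next = false;
    for (std::size_t j = i + 1; j < types.size (); j++)
    {
      if (types[j] == Joining::T) continue;
      joins_next = (types[j] == Joining::D);
      break;
    }

    forms[i] = joins_prev ? (joins_next ? Form::Medi : Form::Fina)
                          : (joins_next ? Form::Init : Form::Isol);
  }
  return forms;
}
```

With every letter in a word being type D, the first letter gets the initial form, the last gets the final form, and everything in between is medial. Contextual variants beyond these four forms are outside the scope of this sketch.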
(In reply to comment #6)
> (In reply to comment #4)
> > Humm. Your shaping logic is simple to follow. However, I remember that
> > Unicode was a bit more complex. Let me review.
>
> Well, Mongolian shaping is pretty simple because all letters are dual-joining.

Right.

Reading the Unicode 5.2 text on Mongolian though, what happens to characters after NNBS? I read that some characters may take medial or final forms after NNBS. How is that supposed to be handled?
That said, I'd commit a patch to add Mongolian to the Arabic shaper as soon as you submit one!
Created attachment 41274: Arabic shaper now handles classic Mongolian
(In reply to comment #7)
> (In reply to comment #6)
> > (In reply to comment #4)
> > > Humm. Your shaping logic is simple to follow. However, I remember that
> > > Unicode was a bit more complex. Let me review.
> >
> > Well, Mongolian shaping is pretty simple because all letters are dual-joining.
>
> Right.
>
> Reading the Unicode 5.2 text on Mongolian though, what happens to characters after
> NNBS? I read that some characters may take medial or final forms after NNBS.
> How is that supposed to be handled?

There are two special characters: MVS (U+180E MONGOLIAN VOWEL SEPARATOR) and NNBS (U+202F NARROW NO-BREAK SPACE). They should be handled in the font's calt (contextual alternates) feature. That is also how Uniscribe with Mongolian Baiti handles them.

Tuguldur
Ok, thanks. Good to know. I'm integrating it now.
How about U+18A9? Shouldn't it be transparent?
I committed the following. Please test and report on the open issues. Thanks.

commit d86a5b3c5752abcc791724035ba4115958e6b5e2
Author: Behdad Esfahbod <behdad@behdad.org>
Date:   Tue Dec 21 18:36:25 2010 -0500

    Bug 32274 - classic mongolian shaper

    Add support for classic Mongolian script to the Arabic shaper.

    Still work to be done around U+180E MONGOLIAN VOWEL SEPARATOR as it
    should not be included in the final glyph stream the same way that
    ZWNJ, etc should not appear in the final glyph stream. But the
    joining part should be done.

    There remains the question of how should the U+18A9 MONGOLIAN LETTER
    ALI GALI DAGALGA be handled as it has General Category NSM but a
    letter nonetheless. For now, our generic logic makes this a joining
    T instead of joining D as other Mongolian letters are.

diff --git a/src/hb-ot-shape-complex-arabic.cc b/src/hb-ot-shape-complex-arabic.cc
index 7c1b7bc..77a9c82 100644
--- a/src/hb-ot-shape-complex-arabic.cc
+++ b/src/hb-ot-shape-complex-arabic.cc
@@ -67,6 +67,14 @@ static unsigned int get_joining_type (hb_codepoint_t u, hb_category_t gen_cat)
       return j_type;
   }
 
+  /* Mongolian joining data is not in ArabicJoining.txt yet */
+  if (unlikely (0x1800 <= u && u <= 0x18AF))
+  {
+    /* All letters, SIBE SYLLABLE BOUNDARY MARKER, and NIRUGU are D */
+    if (gen_cat == HB_CATEGORY_OTHER_LETTER || u == 0x1807 || u == 0x180A)
+      return JOINING_TYPE_D;
+  }
+
   if (unlikely ((u & ~(0x200C^0x200D)) == 0x200C)) {
     return u == 0x200C ? JOINING_TYPE_U : JOINING_TYPE_C;
   }
diff --git a/src/hb-ot-shape-complex-private.hh b/src/hb-ot-shape-complex-private.hh
index 788d18a..fed167d 100644
--- a/src/hb-ot-shape-complex-private.hh
+++ b/src/hb-ot-shape-complex-private.hh
@@ -42,6 +42,7 @@ hb_ot_shape_complex_categorize (const hb_segment_properties_t *props)
     case HB_SCRIPT_NKO:
     case HB_SCRIPT_SYRIAC:
     case HB_SCRIPT_MANDAIC:
+    case HB_SCRIPT_MONGOLIAN:
       return hb_ot_complex_shaper_arabic;
 
     default:
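For anyone who wants to test the classification outside of HarfBuzz, the following is a standalone sketch of the Mongolian branch of the joining-type lookup. `GenCat`, `Joining`, and `mongolian_joining_type` are hypothetical stand-ins invented for this illustration, and the fallback at the end is an assumption based on the commit message's remark that the generic logic makes U+18A9 a joining T; it is not a copy of the actual generic code.

```cpp
#include <cstdint>

// Hypothetical stand-ins for HarfBuzz's internal types; only the
// general categories needed for this illustration are listed.
enum class GenCat { OtherLetter, NonSpacingMark, Format, Other };
enum class Joining { U, D, T };

Joining mongolian_joining_type (std::uint32_t u, GenCat gen_cat)
{
  // Same block range check as in the committed patch.
  if (0x1800 <= u && u <= 0x18AF)
  {
    // All letters, U+1807 SIBE SYLLABLE BOUNDARY MARKER and
    // U+180A NIRUGU are dual-joining.
    if (gen_cat == GenCat::OtherLetter || u == 0x1807 || u == 0x180A)
      return Joining::D;
  }

  // Assumed generic fallback (not taken from the patch): marks and
  // format characters are transparent, everything else is non-joining.
  if (gen_cat == GenCat::NonSpacingMark || gen_cat == GenCat::Format)
    return Joining::T;
  return Joining::U;
}
```

Under these assumptions U+18A9, passed in with the non-spacing mark category, ends up as T rather than D, which is the open question raised in the commit message.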
Thank you, it works.