Bug 32274 - classic mongolian shaper
Summary: classic mongolian shaper
Status: RESOLVED FIXED
Product: HarfBuzz
Classification: Unclassified
Component: src
Version: unspecified
Hardware: All All
Importance: medium enhancement
Assignee: Behdad Esfahbod
Reported: 2010-12-09 12:13 UTC by Tuguldur
Modified: 2010-12-23 04:47 UTC

Attachments
classic mongolian shaper patch (7.89 KB, patch)
2010-12-09 12:14 UTC, Tuguldur
arabic shaping handles now the classic mongolian (3.59 KB, application/octet-stream)
2010-12-19 14:04 UTC, Tuguldur

Description Tuguldur 2010-12-09 12:13:12 UTC
This is a classic Mongolian shaper patch for HarfBuzz. Tested with the MonBaiti font (taken from a Windows 7 installation) and a Firefox nightly build using HarfBuzz level 3 on Mac OS.

Screenshot:
https://github.com/tugstugi/mongolian-script/raw/master/misc/minefield-harfbuzz-mongolian.png
For comparison, a screenshot of IE8 on Windows 7:
https://github.com/tugstugi/mongolian-script/raw/master/misc/ie-uniscribe-mongolian.png

Alternatively, the patch could also be incorporated into the Arabic shaper.
Comment 1 Tuguldur 2010-12-09 12:14:17 UTC
Created attachment 40960
classic mongolian shaper patch
Comment 2 Behdad Esfahbod 2010-12-13 15:25:25 UTC
Hi.  Thanks for the patch!  It was indeed my plan to add Mongolian support to the Arabic shaper.  Can you work on that?  That would also be easier for me to review.
Comment 3 Behdad Esfahbod 2010-12-13 15:26:30 UTC
Humm, does IE support vertical layout?!  Where's the test HTML page?
Comment 4 Behdad Esfahbod 2010-12-13 15:29:16 UTC
Humm.  Your shaping logic is simple to follow.  However, I remember that Unicode was a bit more complex.  Let me review.
Comment 5 Tuguldur 2010-12-14 01:44:15 UTC
(In reply to comment #3)
> Humm, does IE support vertical layout?!  Where's the test HTML page?

Yes, IE supports vertical layout. WebKit nightly builds also support it through "-webkit-writing-mode:vertical-lr".

The Mongolian Unicode test page is here: http://www.babelstone.co.uk/Test/Mongolian.html
Comment 6 Tuguldur 2010-12-14 01:48:16 UTC
(In reply to comment #4)
> Humm.  Your shaping logic is simple to follow.  However, I remember that
> Unicode was a bit more complex.  Let me review.

Well, Mongolian shaping is pretty simple because all letters are dual-joining.
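
To make the dual-joining point concrete, here is a minimal sketch of positional-form selection, assuming only non-joining (U), dual-joining (D) and transparent (T) characters occur. The names `JoiningType`, `Form` and `assign_forms` are hypothetical and are not HarfBuzz API; the attached patch may well do this differently.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical types for illustration only; not part of HarfBuzz.
enum class JoiningType { U, D, T };  // non-joining, dual-joining, transparent
enum class Form { Isolated, Initial, Medial, Final };

// Pick a positional form for every character in logical order.
// Assumption: only U, D and T joining types appear in the input.
std::vector<Form> assign_forms(const std::vector<JoiningType>& types)
{
  std::vector<Form> forms(types.size(), Form::Isolated);
  for (std::size_t i = 0; i < types.size(); ++i) {
    if (types[i] != JoiningType::D)
      continue;  // U and T characters keep the default (isolated) form

    // Find the nearest non-transparent neighbour before this character.
    bool joins_prev = false;
    for (std::size_t j = i; j-- > 0;) {
      if (types[j] == JoiningType::T) continue;
      joins_prev = (types[j] == JoiningType::D);
      break;
    }

    // Find the nearest non-transparent neighbour after this character.
    bool joins_next = false;
    for (std::size_t j = i + 1; j < types.size(); ++j) {
      if (types[j] == JoiningType::T) continue;
      joins_next = (types[j] == JoiningType::D);
      break;
    }

    forms[i] = joins_prev ? (joins_next ? Form::Medial : Form::Final)
                          : (joins_next ? Form::Initial : Form::Isolated);
  }
  return forms;
}
```

With every letter typed D, a word of three letters comes out as initial, medial, final, and a letter standing alone stays isolated. Only a single lookup per side is needed because there are no right-joining or left-joining letters to special-case.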
Comment 7 Behdad Esfahbod 2010-12-17 18:49:23 UTC
(In reply to comment #6)
> (In reply to comment #4)
> > Humm.  Your shaping logic is simple to follow.  However, I remember that
> > Unicode was a bit more complex.  Let me review.
> 
> Well, Mongolian shaping is pretty simple because all letters are dual-joining.

Right.

Reading the Unicode 5.2 text on Mongolian, though: what happens to characters after NNBS?  I read that some characters may take medial or final forms after NNBS.  How is that supposed to be handled?
Comment 8 Behdad Esfahbod 2010-12-17 18:49:48 UTC
That said, I'd commit a patch to add Mongolian to the Arabic shaper as soon as you submit one!
Comment 9 Tuguldur 2010-12-19 14:04:30 UTC
Created attachment 41274
arabic shaping handles now the classic mongolian
Comment 10 Tuguldur 2010-12-19 14:08:18 UTC
(In reply to comment #7)
> (In reply to comment #6)
> > (In reply to comment #4)
> > > Humm.  Your shaping logic is simple to follow.  However, I remember that
> > > Unicode was a bit more complex.  Let me review.
> > 
> > Well, Mongolian shaping is pretty simple because all letters are dual-joining.
> 
> Right.
> 
> Reading the Unicode 5.2 on Mongolian though, what happens to characters after
> NNBS?  I read that some characters may take medial or final forms after NNBS. 
> How is that supposed to be handled?

There are two special characters: MVS (Mongolian Vowel Separator) and NNBS (Narrow No-Break Space). They should be handled in CALT. That is also how Uniscribe with the Mongolian Baiti font handles them.

Tuguldur
Comment 11 Behdad Esfahbod 2010-12-21 12:17:30 UTC
Ok, thanks.  Good to know.  I'm integrating it now.
Comment 12 Behdad Esfahbod 2010-12-21 14:01:47 UTC
How about U+18A9?  Shouldn't it be transparent?
Comment 13 Behdad Esfahbod 2010-12-21 15:41:32 UTC
I committed the following.  Please test and report on the open issues.  Thanks.

commit d86a5b3c5752abcc791724035ba4115958e6b5e2
Author: Behdad Esfahbod <behdad@behdad.org>
Date:   Tue Dec 21 18:36:25 2010 -0500

    Bug 32274 - classic mongolian shaper
    
    Add support for classic Mongolian script to the Arabic shaper.
    
    Still work to be done around U+180E MONGOLIAN VOWEL SEPARATOR as it
    should not be included in the final glyph stream the same way that
    ZWNJ, etc should not appear in the final glyph stream.
    
    But the joining part should be done.
    
    There remains the question of how should the U+18A9 MONGOLIAN LETTER ALI
    GALI DAGALGA be handled as it has General Category NSM but a letter
    nonetheless.  For now, our generic logic makes this a joining T instead
    of joining D as other Mongolian letters are.

diff --git a/src/hb-ot-shape-complex-arabic.cc b/src/hb-ot-shape-complex-arabic.cc
index 7c1b7bc..77a9c82 100644
--- a/src/hb-ot-shape-complex-arabic.cc
+++ b/src/hb-ot-shape-complex-arabic.cc
@@ -67,6 +67,14 @@ static unsigned int get_joining_type (hb_codepoint_t u, hb_category_t gen_cat)
       return j_type;
   }
 
+  /* Mongolian joining data is not in ArabicJoining.txt yet */
+  if (unlikely (0x1800 <= u && u <= 0x18AF))
+  {
+    /* All letters, SIBE SYLLABLE BOUNDARY MARKER, and NIRUGU are D */
+    if (gen_cat == HB_CATEGORY_OTHER_LETTER || u == 0x1807 || u == 0x180A)
+      return JOINING_TYPE_D;
+  }
+
   if (unlikely ((u & ~(0x200C^0x200D)) == 0x200C)) {
     return u == 0x200C ? JOINING_TYPE_U : JOINING_TYPE_C;
   }
diff --git a/src/hb-ot-shape-complex-private.hh b/src/hb-ot-shape-complex-private.hh
index 788d18a..fed167d 100644
--- a/src/hb-ot-shape-complex-private.hh
+++ b/src/hb-ot-shape-complex-private.hh
@@ -42,6 +42,7 @@ hb_ot_shape_complex_categorize (const hb_segment_properties_t *props)
     case HB_SCRIPT_NKO:
     case HB_SCRIPT_SYRIAC:
     case HB_SCRIPT_MANDAIC:
+    case HB_SCRIPT_MONGOLIAN:
       return hb_ot_complex_shaper_arabic;
 
     default:
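
The range check from the diff above can be tried in isolation with a small stand-alone sketch. The enums and the function name `mongolian_joining_type` below are hypothetical stand-ins for HarfBuzz internals, and the final fallback (non-spacing marks become T, everything else U) is a simplified assumption based on the commit message's remark about U+18A9, not a copy of the real generic logic.

```cpp
#include <cstdint>

// Hypothetical stand-ins for HarfBuzz's internal enums.
enum class GenCat { OtherLetter, NonSpacingMark, Other };
enum class JoiningType { U, C, D, T };

JoiningType mongolian_joining_type(std::uint32_t u, GenCat gen_cat)
{
  // Mongolian block: letters, SIBE SYLLABLE BOUNDARY MARKER (U+1807)
  // and NIRUGU (U+180A) are dual-joining, as in the patch.
  if (0x1800 <= u && u <= 0x18AF) {
    if (gen_cat == GenCat::OtherLetter || u == 0x1807 || u == 0x180A)
      return JoiningType::D;
  }

  // 0x200C ^ 0x200D == 1, so clearing that bit and comparing against
  // 0x200C matches exactly ZWNJ (U+200C) and ZWJ (U+200D).
  if ((u & ~(0x200Cu ^ 0x200Du)) == 0x200Cu)
    return u == 0x200C ? JoiningType::U : JoiningType::C;

  // Simplified fallback (assumption): non-spacing marks are transparent,
  // everything else is non-joining.
  return gen_cat == GenCat::NonSpacingMark ? JoiningType::T : JoiningType::U;
}
```

Under these assumptions U+1820 MONGOLIAN LETTER A classifies as D, while U+18A9 falls through the Mongolian branch because its General Category is NSM and ends up as T, which is the open question raised in the commit message.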
Comment 14 Tuguldur 2010-12-23 04:47:51 UTC
Thank you, it works.