Bug 27485 - ccmp feature for Indic Languages...
Status: RESOLVED WONTFIX
Product: HarfBuzz
Classification: Unclassified
Component: src
Version: unspecified
Hardware: Other All
Importance: medium enhancement
Assigned To: Behdad Esfahbod
Reported: 2010-04-06 05:39 UTC by Naveen Kumar
Modified: 2010-12-02 04:51 UTC
Description Naveen Kumar 2010-04-06 05:39:06 UTC
References: 
1. https://bugzilla.redhat.com/show_bug.cgi?id=543906

2. https://bugzilla.redhat.com/show_bug.cgi?id=497090

Decomposition is sometimes required before contextual alternates can be applied. Both Pango and Uniscribe seem to support this feature for Indic (CTL) scripts.
Comment 1 Naveen Kumar 2010-04-06 05:43:10 UTC
Same request for ICU:
http://bugs.icu-project.org/trac/ticket/7601
Comment 2 Naveen Kumar 2010-04-22 06:23:43 UTC
Reference : http://www.microsoft.com/typography/OTSPEC/features_ae.htm

Tag: “ccmp”

Friendly name: Glyph Composition/Decomposition

Registered by: Microsoft

Function: To minimize the number of glyph alternates, it is sometimes desired to decompose a character into two glyphs. Additionally, it may be preferable to compose two characters into a single glyph for better glyph processing. This feature permits such composition/decomposition. The feature should be processed as the first feature processed, and should be processed only when it is called.

Example: In Syriac, the character 0x0732 is a combining mark that has a dot above AND a dot below the base character. To avoid multiple glyph variants to fit all base glyphs, the character is decomposed into two glyphs...a dot above and a dot below. These two glyphs can then be correctly placed using GPOS. In Arabic it might be preferred to combine the shadda with fatha (0x0651, 0x064E) into a ligature before processing shapes. This allows the font vendor to do special handling of the mark combination when doing further processing without requiring larger contextual rules.
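
As a rough illustration of the two cases in the example above, the following Python sketch models a decomposition (one glyph to several) and a composition (several glyphs to one) over lists of glyph names. The glyph names and both tables are hypothetical stand-ins, not taken from any real font, and the matching is far simpler than real GSUB processing.

```python
# Hypothetical glyph names; a real font would use its own glyph IDs.
DECOMPOSE = {"uni0732": ["dotabove", "dotbelow"]}   # models a one-to-many substitution
COMPOSE = {("uni0651", "uni064E"): "shadda_fatha"}  # models a many-to-one ligature


def apply_ccmp(glyphs):
    """Apply the toy composition and decomposition tables left to right."""
    out = []
    i = 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in COMPOSE:
            # Two input glyphs collapse into a single ligature glyph.
            out.append(COMPOSE[pair])
            i += 2
        elif glyphs[i] in DECOMPOSE:
            # One input glyph expands into several output glyphs.
            out.extend(DECOMPOSE[glyphs[i]])
            i += 1
        else:
            out.append(glyphs[i])
            i += 1
    return out
```

With these toy tables, a base glyph followed by `uni0732` would come back as the base plus two separate dot glyphs, which positioning rules could then place independently.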

Recommended implementation: The ccmp table maps the character sequence to its corresponding ligature (GSUB lookup type 4) or string of glyphs (GSUB lookup type 2). When using GSUB lookup type 4, sequences that are made up of a larger number of glyphs must be placed before those that require fewer glyphs.
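
The ordering requirement for ligature sequences can be seen in a small Python sketch. It assumes a first-match-wins matcher that walks the entries in table order, which is a simplification of real GSUB processing; the glyph names and the two tables are hypothetical.

```python
def apply_ligatures(glyphs, ligatures):
    """Replace the first matching sequence at each position, in table order."""
    out = []
    i = 0
    while i < len(glyphs):
        for components, ligature in ligatures:
            n = len(components)
            if tuple(glyphs[i:i + n]) == components:
                out.append(ligature)
                i += n
                break
        else:
            # No entry matched at this position; copy the glyph through.
            out.append(glyphs[i])
            i += 1
    return out


# Hypothetical glyph names. The two tables differ only in entry order.
LONGEST_FIRST = [(("a", "b", "c"), "a_b_c"), (("a", "b"), "a_b")]
SHORTEST_FIRST = [(("a", "b"), "a_b"), (("a", "b", "c"), "a_b_c")]

print(apply_ligatures(["a", "b", "c"], LONGEST_FIRST))   # ['a_b_c']
print(apply_ligatures(["a", "b", "c"], SHORTEST_FIRST))  # ['a_b', 'c']
```

When the shorter sequence is listed first, it consumes the first two glyphs and the three-glyph ligature can never form.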

Application interface: For GIDs found in the ccmp coverage table, the application passes the sequence of GIDs to the table, and gets back the GID for the ligature, or GIDs for the multiple substitution.

UI suggestion: This feature should be on by default.

Script/language sensitivity: None.

Feature interaction: This feature needs to be implemented prior to any other feature.

-----------------------------------------------------------------------------
Since its script/language sensitivity is "None", it should be available to all scripts. Without it, calt is not as useful as it could be.
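
To illustrate why calt depends on ccmp running first, here is a minimal Python sketch. The glyph names, the two rule tables, and the `shape` helper are all hypothetical; they only model the ordering argument and are not how Pango, Uniscribe, ICU, or HarfBuzz implement shaping.

```python
# Hypothetical glyph names and rules, for illustration only.
CCMP = {"precomposed": ["base", "mark"]}          # one glyph -> two glyphs
CALT = {("base", "mark"): ["base", "mark.alt"]}   # alternate chosen by context


def ccmp(glyphs):
    """Decompose any glyph that has an entry in the toy CCMP table."""
    out = []
    for g in glyphs:
        out.extend(CCMP.get(g, [g]))
    return out


def calt(glyphs):
    """Swap in a contextual alternate wherever a known pair occurs."""
    out = []
    i = 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in CALT:
            out.extend(CALT[pair])
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out


def shape(glyphs, features):
    """Apply each feature function in the given order."""
    for feature in features:
        glyphs = feature(glyphs)
    return glyphs


print(shape(["precomposed"], [ccmp, calt]))  # ['base', 'mark.alt']
print(shape(["precomposed"], [calt]))        # ['precomposed']
```

In this model the contextual rule is written against the decomposed glyphs, so it never fires when ccmp is skipped or applied after calt.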
Comment 3 Behdad Esfahbod 2010-12-02 04:51:10 UTC
I'm closing down the bugs against the old code base.  The new harfbuzz code will apply 'ccmp' by default.