Bug 40795 - Updates to Cameroon Keyboard.
Summary: Updates to Cameroon Keyboard.
Status: RESOLVED FIXED
Alias: None
Product: xkeyboard-config
Classification: Unclassified
Component: General
Version: unspecified
Hardware: x86 (IA32) Linux (All)
Importance: medium normal
Assignee: xkb
 
Reported: 2011-09-12 01:58 UTC by SIL Cameroon
Modified: 2013-03-09 21:50 UTC


Attachments
Keymap Updates for CM symbol file in XKB (Lines to insert) (1.80 KB, patch)
2011-09-12 01:58 UTC, SIL Cameroon
A patch for symbols/cm (16.47 KB, patch)
2011-09-13 06:57 UTC, SIL Cameroon
diff -uB cm mycm > cm.patch (9.83 KB, patch)
2011-09-14 01:03 UTC, SIL Cameroon
patch α U+03B1 -> ɑ U+0251 (2.98 KB, text/plain)
2013-03-09 21:45 UTC, Denis Jacquerye

Description SIL Cameroon 2011-09-12 01:58:59 UTC
Created attachment 51064
Keymap Updates for CM symbol file in XKB (Lines to insert)

Greetings Sergey,
Attached are some updates to the keys of the Cameroon keyboards.  

cm symbol file:
-We have found that some users need a glottal not easily accessible on the previous version, and the alpha character chosen was not acceptable to the language using it. Also, the Dvorak keyboard should be Cameroon Multilingual (Dvorak) like the others.  These updates are noted in the attached file.

Base.xml.in:
-Line 2238, change <_description>English (Cameroon Dvorak)</_description> to <_description>Cameroon Multilingual (Dvorak)</_description>
-Please add <iso639Id>nmg</iso639Id> to both Cameroon Multilingual (azerty) and Cameroon Multilingual (qwerty).  This language group was a major contributor to getting this keyboard up and running.

Thanks, 
Matthew Lee
Linguistic Technology Specialist
SIL Cameroon
Comment 1 Sergey V. Udaltsov 2011-09-12 13:07:45 UTC
Would you be able to attach a proper patch, please? That one is cumbersome to use.
Comment 2 SIL Cameroon 2011-09-13 06:57:18 UTC
Created attachment 51120
A patch for symbols/cm

My first patch that should work to input my changes into the cm file.  It needs to be placed in the symbols folder.  I couldn't get the recursive patch to run.
Comment 3 SIL Cameroon 2011-09-13 07:05:26 UTC
Thanks Sergey,

Well, I learned a lot today.  Patch attached for symbols/cm.  Even though I couldn't get the recursive patch working, this single patch should do the hard work for cm.

Do I need to make a patch for evdev, too?

I guess this is harder for me as I'm not working from source, but from my system, to make these changes.  Beyond that, I'm behind a hateful proxy that makes Git a nightmare.

~Matthew
Comment 4 Sergey V. Udaltsov 2011-09-13 14:47:01 UTC
Sorry, not quite there - could you please use the option -u while doing diff?

> Do I need to make a patch for evdev, too?
No, you do not. The layouts are driver-independent, generally.
Comment 5 SIL Cameroon 2011-09-14 01:03:06 UTC
Created attachment 51179
diff -uB cm mycm > cm.patch

Here we go again, this configuration of the patch file makes more sense.
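
For readers unfamiliar with the unified format requested in comment 4, here is a minimal Python sketch of what `diff -u` produces, using the standard `difflib` module. The one-line file contents are an abbreviated, hypothetical stand-in for symbols/cm (only the AC06 change is shown), and `make_patch` is an illustrative helper name, not part of any tool mentioned in this bug. Note that `difflib` has no equivalent of the `-B` flag (ignore blank-line changes).

```python
import difflib

# Hypothetical, abbreviated before/after contents of symbols/cm (one line only).
old = ["key <AC06> { [ h, H, U0251, U2C6D ] };\n"]
new = ["key <AC06> { [ h, H, U03B1, U2C6D ] };\n"]


def make_patch(before, after, old_name="cm", new_name="mycm"):
    """Return a unified diff (the format `diff -u` emits) as a single string."""
    return "".join(
        difflib.unified_diff(before, after, fromfile=old_name, tofile=new_name)
    )


patch = make_patch(old, new)
print(patch)
```

Removed lines are prefixed with `-` and added lines with `+`, which is what makes this format easy to review and to apply with `patch`.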
Comment 6 Sergey V. Udaltsov 2011-09-15 13:46:01 UTC
Committed! Thank you!
Comment 7 Denis Jacquerye 2011-10-05 14:42:00 UTC
> -We have found that some users need a glottal not easily accessible on the
> previous version, and the alpha character chosen was not acceptable to the
> language using it.

from https://bugs.freedesktop.org/attachment.cgi?id=51179
-	key <AC05> { [ g, G, UA78C, UA78B ] };	// SMALL LETTER G, CAPITAL LETTER G, SMALL LETTER GLOTTAL (SALTILLO), CAPITAL LETTER GLOTTAL (SALTILLO)
-	key <AC06> { [ h, H, U0251, U2C6D ] };	// SMALL LETTER H, CAPITAL LETTER H, SMALL LETTER ALPHA, CAPITAL LETTER ALPHA
+	key <AC05> { [ g, G, U02BC, UA78B ] };	// SMALL LETTER G, CAPITAL LETTER G, CURVED GLOTTAL, CAPITAL LETTER GLOTTAL (SALTILLO)
+	key <AC06> { [ h, H, U03B1, U2C6D ] };	// SMALL LETTER H, CAPITAL LETTER H, SMALL LETTER ALPHA, CAPITAL LETTER ALPHA
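
To see which characters these `Uxxxx` keysyms actually produce, a small Python sketch along the following lines can help. It assumes the simple one-line `key <NAME> { [ ... ] };` form shown above; the regular expression and the helper names `decode_keysym` and `decode_key_line` are illustrative only and are not part of xkeyboard-config. Named keysyms such as `semicolon` are left untouched and would need a separate lookup table.

```python
import re
import unicodedata

# Matches the simple one-line form: key <NAME> { [ sym, sym, ... ] };
KEY_LINE = re.compile(r"key\s+<(\w+)>\s*\{\s*\[(.*?)\]\s*\}")


def decode_keysym(sym):
    """Turn a 'Uxxxx' keysym into its character; return other names unchanged."""
    m = re.fullmatch(r"U([0-9A-Fa-f]{4,6})", sym)
    return chr(int(m.group(1), 16)) if m else sym


def decode_key_line(line):
    """Return (key name, list of decoded symbols) for one key definition."""
    m = KEY_LINE.search(line)
    if m is None:
        raise ValueError("not a key definition: " + line)
    syms = [s.strip() for s in m.group(2).split(",")]
    return m.group(1), [decode_keysym(s) for s in syms]


# The AC06 line as changed by attachment 51179.
patched = "key <AC06> { [ h, H, U03B1, U2C6D ] };"
name, chars = decode_key_line(patched)
for ch in chars:
    print(name, "U+%04X" % ord(ch), unicodedata.name(ch))
```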

The ʼ/Ꞌ <U+02BC, U+A78B> uppercase-lowercase pair is interesting; I guess it's language-specific, like ʔ/Ꞌ <U+0294, U+A78B>.
But the pair α/Ɑ <U+03B1, U+2C6D> is not acceptable. It should be ɑ/Ɑ <U+0251, U+2C6D> like before the patch.

U+03B1 (Greek Small Letter Alpha) is a Greek character, whereas U+0251 (Latin Small Letter Alpha) and U+2C6D (Latin Capital Letter Alpha) are both Latin and defined as each other's case variants. Instead of using U+03B1, a proper font with an acceptable glyph for U+0251 should be used.
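
The case relationships described here can be checked against the Unicode data shipped with Python. This is only a quick sketch; the constant names and the `describe` helper are illustrative.

```python
import unicodedata

LATIN_ALPHA = "\u0251"          # ɑ
GREEK_ALPHA = "\u03b1"          # α
LATIN_CAPITAL_ALPHA = "\u2c6d"  # Ɑ


def describe(ch):
    """Format a character as 'U+XXXX NAME'."""
    return "U+%04X %s" % (ord(ch), unicodedata.name(ch))


for ch in (LATIN_ALPHA, GREEK_ALPHA, LATIN_CAPITAL_ALPHA):
    print(describe(ch), "| upper:", describe(ch.upper()),
          "| lower:", describe(ch.lower()))
```

The Latin pair maps to each other, while uppercasing the Greek alpha yields U+0391 (Greek Capital Letter Alpha) rather than U+2C6D.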

I am reopening this bug. The alpha change is not coherent with Unicode.
The character U+0251 correct, use another font if it's shape isn't.
Comment 8 Denis Jacquerye 2011-10-05 14:50:44 UTC
(In reply to comment #7)
> I am reopening this bug. The alpha change is not coherent with Unicode.
> The character U+0251 correct, use another font if it's shape isn't.

The character U+0251 is correct, use another font if its shape isn't.
Sorry for the poor grammar.

SIL Cameroon: Which font have you been using? It could probably be patched. After all, besides IPA, U+0251 is pretty much only used in Cameroon (GACL).
It would make sense to have a glyph acceptable for Cameroon in the default fonts you are using.
Comment 9 Sergey V. Udaltsov 2011-10-08 14:37:50 UTC
Denis, would you be able to submit a patch?
Comment 10 SIL Cameroon 2011-10-09 05:17:58 UTC
Yes Denis, you've pointed out a sore point and a long story.  We're stuck here...there's not much else we can do.  

These forms were chosen in the 1970s as acceptable letters for all Cameroon Orthography.  Originally, we had a simple a, rounded a, and alpha on our keyboard. Most languages use the rounded "a" for literacy work (it looks like the handwritten a) and the normal "a" for publishing. We did an analysis of all the current orthographies and found that no languages were using all 3.

We were able to eliminate the rounded a from the newest keyboard, as they can use literacy fonts if they need the rounded a, but we were left with the distinction between a and alpha.  If we were only using one a, this would be ok, but we have a language that uses "a" (the rounded version), "A", "α", and "Ɑ".  I agree that it was not the intention to use both, but apparently their vowel structure necessitated 2 a's.  It was quite frustrating when we found this.

To be in line with Unicode, I tried U+0251 alpha.  In testing with the language community, the alpha without an upper tail (U+0251) was not distinguishable from the rounded "a" they used. We were forced to go back to U+03B1, which is the alpha they've been using for many years (but has no official rounded capital).

Bottom line, to distinguish between regular a's and this combination, we need an alpha with 2 tails, and a rounded capital A to match.  Secondly, we need a regular "a" (rounded form) and "A".  A literacy font solves the rounded "a" problem, but we still need a recognizable alpha.  Hacking the font for just this language would put us further backwards in this progression.  We'd like to have an alpha in the extended Latin section of Unicode with matching rounded capital, but with only one small language using it, we were not convinced that we could push it through Unicode.  I would hope to leave it the way it's patched, and discourage future linguists from using the 2 a's.

The fonts we use in Cameroon:
-Charis SIL
-Charis SIL Literacy
-Charis SIL Compact
-Charis SIL Literacy Compact
-Doulos SIL
-Doulos SIL Literacy
-Doulos SIL Compact
-Doulos SIL Literacy Compact
-Andika
-Andika Compact
-Gentium Plus
-Gentium Compact
-And we would use DejaVu if they could get the diacritic placement right on capital letters (another open bug report in my name)

(In reply to comment #8)
> (In reply to comment #7)
> > I am reopening this bug. The alpha change is not coherent with Unicode.
> > The character U+0251 correct, use another font if it's shape isn't.
> 
> The character U+0251 is correct, use another font if its shape isn't.
> Sorry for the poor grammar.
> 
> SIL Cameroon: Which font have you been using? It could probably be patched.
> After all, besides IPA, U+0251 is pretty much only used in Cameroon (GACL).
> It would make sense to have a glyph acceptable for Cameroon in the default
> fonts you are using.
Comment 11 Jenni 2011-10-10 02:28:30 UTC
Greetings

Denis is quite right: U+03B1 (Greek Small Letter Alpha) is a Greek character, whereas U+0251 (Latin Small Letter Alpha) and U+2C6D (Latin Capital Letter Alpha) are both Latin and defined as each other's case variants.

But as Matthew has said in Cameroon we need three a's. Some languages contrast the standard a and the rounded ɑ (Latin alpha) [i.e. Muyang], whereas many languages see those two as the same letter, with the rounded ɑ as a literacy variant of the standard a. At least two of those languages, however, use the Greek letter to contrast with the standard a and use the Latin Capital Letter Alpha as its capital.

That is, the three a's are as follows:
a/A
ɑ/Ɑ <U+0251, U+2C6D> 
α/Ɑ <U+03B1, U+2C6D>

a/A  = ɑ/Ɑ <U+0251, U+2C6D>  (literacy version)
α/Ɑ <U+03B1, U+2C6D>  (contrasting letter)

Using a font with a different shape for the latin alpha would not work for these languages. I am investigating whether a proposal for the addition of a third a can be submitted to Unicode. This is what we call alpha, but in Latin script it would probably become hook alpha. In the meantime, for the keyboard to meet the needs of all Cameroonian languages, it needs to be able to type the three distinct a's (a, ɑ, α).
Comment 12 Denis Jacquerye 2011-11-27 21:43:02 UTC
Sorry for the late reply.

> But as Matthew has said in Cameroon we need three a's.

Yes and no.
The General Alphabet of Cameroon Languages clearly specifies only 2 graphemes: letter a and letter alpha. The first grapheme can have two forms: double-story a (with bowl and top) and single-story a (with only a bowl, a.k.a. the literacy version). The second grapheme, alpha, is like the Greek alpha, i.e. as you describe it, with 2 tails.

> Using a font with a different shape for the latin alpha would not work for
> these languages.

I don't believe there is any Cameroonian language using more than those two graphemes resembling a, i.e. the three forms at the same time. The GACL clearly only has those two and acknowledges the two forms of the letter a while giving only one form for the letter alpha. There doesn't seem to be any recent orthography diverging from this in SIL Cameroon's documents. 

The fact that some users end up with the wrong form is a matter of which glyph their font provides. Glyphs and characters are different things; wrong glyphs should not lead to changing the character, but instead to changing the font.

Readers used to either form of the letter a for the first grapheme, who don't use the second pair, don't seem to have trouble with any font.

Readers used to the single story a aka "literacy a" for the first grapheme and the alpha with 2 tails for the second grapheme should use a font with a single story glyph for the character a (U+0061) and a glyph like the Greek alpha with 2 tails for the character ɑ (U+0251), and their matching uppercase (Ɑ U+2C6D should probably be with 2 tails in that font).

This kind of per-language font situation is not uncommon. Serbian and Macedonian use different forms of some Cyrillic letters that Russian doesn't use. This can be handled within the same font, if the OpenType font has the proper language tags, the language is specified in the interface and the renderer supports the feature. A lot of conditions, but it's feasible.

Using the Greek alpha U+03B1 could also give the wrong form, since some fonts can have an inadequate glyph for those users. See the Ubuntu font for example, which has the Greek alpha you don't want! Clearly this is a font issue! There will always be fonts with characters that have forms that are useless to you. What you need are fonts that are useful! Abusing Unicode is neither a sane nor a long-term remedy.

> Hacking the font for just this language would put us further backwards in this progression.
> ...
> I would hope to leave it the way it's patched, and discourage future linguists from using the 2 a's.

I don't understand what you're trying to do. Are you trying to progress by allowing people to properly type for their languages, or trying to discourage linguists from following the GACL or orthographies by abusing Unicode?

If you consider it imperative to have a keyboard layout that purposely breaks Unicode for a matter of glyphs instead of characters, there must be an alternative layout that doesn't break Unicode. The two layouts could be labelled with an added "(Latin alpha)" and "(Greek alpha)" respectively.
Comment 13 Sergey V. Udaltsov 2011-11-29 15:56:00 UTC
Lads, one side question from a person who has no idea about those subtle differences: what 'a' letters are used by the spellcheckers? Is the difference totally ignored?

As a maintainer, I would be interested in Unicode-correct layouts rather than layouts that use similar glyphs, if I understand your argument correctly. Fonts can and should be fixed anyway.
Comment 14 Denis Jacquerye 2013-03-09 21:45:57 UTC
Created attachment 76245
patch α U+03B1 -> ɑ U+0251

This is a patch changing lowercase Greek α U+03B1 to lowercase Latin ɑ U+0251 to correspond with uppercase Latin Ɑ U+2C6D.

As mentioned before, using the Greek alpha can still look unacceptable with some fonts, just like the Latin alpha can. This is a font issue, not a character issue; use a font with the right Latin alpha glyph to fix this.

Using the Greek character breaks other things, such as sorting and case-folding in search or spellchecking.
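
As a rough illustration of the case-folding problem, here is a small Python sketch. The three strings are made-up test words, not taken from any real orthography, and `caseless_find` is a hypothetical helper standing in for whatever a search tool or spellchecker does internally.

```python
def caseless_find(haystack, needle):
    """Case-insensitive substring test using Unicode case folding."""
    return needle.casefold() in haystack.casefold()


# Made-up words: a capitalised form typed with Latin Capital Alpha (U+2C6D),
# and two lowercase spellings typed with Latin (U+0251) or Greek (U+03B1) alpha.
title = "\u2c6dB\u2c6d"
latin = "\u0251b\u0251"
greek = "\u03b1b\u03b1"

print("Latin lowercase matches capitalised form:", caseless_find(title, latin))
print("Greek lowercase matches capitalised form:", caseless_find(title, greek))
```

With the Latin pair, the lowercase and capitalised spellings fold to the same string; with the Greek lowercase and the Latin capital they do not, so a caseless search for one will not find the other.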
Comment 15 Sergey V. Udaltsov 2013-03-09 21:50:51 UTC
Thanks, committed.

