17752 – Request for U+FBEA, U+FBEE, U+FBF2, U+FBF9 and U+FC04

Status:	RESOLVED INVALID

Alias:	None

Product:	DejaVu
Classification:	Unclassified
Component:	Sans (show other bugs)
Version:	unspecified
Hardware:	Other All

Importance:	medium normal
Assignee:	Deja Vu bugs
QA Contact:

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2008-09-24 03:54 UTC by Erdal Ronahi
Modified:	2008-10-05 03:36 UTC (History)
CC List:	0 users

See Also:

Attachments
Kurdish ligatures (4.14 KB, image/png) 2008-10-02 17:47 UTC, nocturnaldreamer

Description Erdal Ronahi 2008-09-24 03:54:41 UTC

These letters are necessary to render initial vowels in Kurdish (Sorani) written with the Arabic script. They may also be needed for other languages, such as Uzbek.

Examples:

http://www.decodeunicode.org/en/u+fbea
http://www.decodeunicode.org/en/u+fbee
http://www.decodeunicode.org/en/u+fbf2
http://www.decodeunicode.org/en/u+fbf9
http://www.decodeunicode.org/en/u+fc04

A discussion of these characters is at http://wiki.ferheng.org/doku.php/initial_hamza_in_sorani

Comment 1 nocturnaldreamer 2008-10-02 17:47:49 UTC

Created attachment 19347 [details]
Kurdish ligatures

Comment 2 nocturnaldreamer 2008-10-02 17:48:49 UTC

Comment on attachment 19347 [details]
Kurdish ligatures

If I understood correctly, there are some misconceptions about the proposed letters. The code points you cited are from Arabic Presentation Forms-A; these are not meant to be used directly. Instead, they are provided for compatibility with legacy implementations.

I've tried out the letter combinations in OpenOffice.org and they seem to work fine (see attachment).
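
To illustrate the compatibility point, here is a minimal Python sketch (not part of the original report) that uses the standard `unicodedata` module to show how each requested code point normalizes under NFKC. The names `REQUESTED` and `compatibility_sequence` are hypothetical and chosen only for this example; the exact character names printed depend on the Unicode data shipped with the Python version in use.

```python
import unicodedata

# Code points requested in this bug report (Arabic Presentation Forms-A).
REQUESTED = [0xFBEA, 0xFBEE, 0xFBF2, 0xFBF9, 0xFC04]


def compatibility_sequence(code_point: int) -> str:
    """Return the NFKC-normalized form of a single code point."""
    return unicodedata.normalize("NFKC", chr(code_point))


for cp in REQUESTED:
    seq = compatibility_sequence(cp)
    # Show the normalized sequence as a list of U+XXXX code points.
    parts = " ".join(f"U+{ord(ch):04X}" for ch in seq)
    print(f"U+{cp:04X} {unicodedata.name(chr(cp))} -> {parts}")
```

Each presentation form should normalize to a sequence of two ordinary Arabic letters beginning with U+0626, which is the sequence a shaping engine renders as the ligature.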

Comment 3 Erdal Ronahi 2008-10-03 07:45:57 UTC

Yes, the ligatures work. But if you write them as the combination of two characters, there are still two letters. You cannot address them directly with one keystroke. Or if you delete one, the other is still there.

For Arabic that's fine, but in Kurdish they make sense only together. That's why they should be usable directly and not only as a ligature.

Comment 4 nocturnaldreamer 2008-10-04 07:56:21 UTC

(In reply to comment #3)
> Yes, the ligatures work. But if you write them as the combination of two
> characters, there are still two letters. You cannot address them directly with
> one keystroke. Or if you delete one, the other is still there.

It should be possible to map one keystroke to multiple code points. This feature is needed, for instance, for accented Latin letters that have no precomposed forms.
 
> For Arabic that's fine, but in Kurdish they make sense only together. That's
> why they should be usable directly and not only as a ligature. 

Still, this workaround creates more problems than it solves. For one, as contextual forms they do not shape according to surrounding letters, so you would have to map multiple keys to a single character. I think this is analogous to Dutch "ij" or Spanish "ll". They are or were seen as one letter rather than two, but the preferred representations are still i+j and l+l. Otherwise, searching and collating could yield unexpected results.

As the Unicode Standard stands (and as implemented, e.g., on the Kurdish Wikipedia), you are supposed to use a string of two code points (cf. chapter 2 of the standard). However, implementations specially tailored for Kurdish could still treat these as a single letter. As far as I know, this is how it's done for various Indic scripts and for the accented letters I mentioned above.
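
The searching problem mentioned above can be sketched in Python as follows. This is only an illustration under assumptions: the helper `contains` and its `fold_compat` parameter are hypothetical names, and real search or collation software may handle compatibility characters differently.

```python
import unicodedata


def contains(haystack: str, needle: str, fold_compat: bool = False) -> bool:
    """Plain substring search, optionally after NFKC normalization."""
    if fold_compat:
        haystack = unicodedata.normalize("NFKC", haystack)
        needle = unicodedata.normalize("NFKC", needle)
    return needle in haystack


# Dutch "ijs" typed with the single ligature character U+0133.
dutch_ligature = "\u0133s"
# The presentation form U+FBEA versus the recommended two-letter sequence.
arabic_ligature = "\uFBEA"
arabic_sequence = "\u0626\u0627"

# A naive search for the two-letter spelling misses the ligature forms.
print(contains(dutch_ligature, "ij"))
print(contains(arabic_ligature, arabic_sequence))

# After compatibility normalization both searches succeed.
print(contains(dutch_ligature, "ij", fold_compat=True))
print(contains(arabic_ligature, arabic_sequence, fold_compat=True))
```

Text stored as the two-code-point sequence matches without any extra normalization step, which is one reason that representation is preferred.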

Comment 5 Erdal Ronahi 2008-10-05 03:36:55 UTC

Thanks for the clarification.
