Bug 9306 - Add new keysyms for dead psili and daseia
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Protocol/Core
Version: unspecified
Hardware: All All
Importance: high normal
Assignee: Daniel Stone
QA Contact: Xorg Project Team
Blocks: 10824
Reported: 2006-12-11 16:48 UTC by Alexandros Diamantidis
Modified: 2007-10-19 17:40 UTC (History)

Description Alexandros Diamantidis 2006-12-11 16:48:04 UTC
Some years ago, I hacked together an Xkb map and a Compose file for polytonic
Greek. In addition to the other dead keys already defined, I needed two keysyms
for dead psili/daseia (comma and reversed comma above). As these weren't
defined, I abused dead_horn and dead_ogonek respectively, as they aren't used in
Greek, and they are curvy diacritics. In the meantime, these quick-and-dirty
files were reworked by others and added to the X distribution, and they
generally work for most people today, but of course they're not The Right Thing.

Is it possible to define two new keysyms for this functionality? For instance,

#define XK_dead_abovecomma      0xfe64
#define XK_dead_aboverevcomma   0xfe65

(not sure if the names are good)

I've seen discussions about problems caused by the current abuse of dead_horn
and dead_ogonek, and a suggestion to use Unicode keysyms U0313 and U0314 (U+0313
COMBINING COMMA ABOVE and U+0314 COMBINING REVERSED COMMA ABOVE). This is not
correct, either, since according to the X Window System Protocol, Appendix A
(KEYSYM encoding)...

> Dead keys, which place an accent on the next character entered, shall be
> encoded as Function KEYSYMs, and not as the Unicode KEYSYM corresponding
> to an equivalent combining character.

Of course, after defining the new keysyms, dead_horn and dead_ogonek should be
removed from all other places where they are used as psili and daseia (keyboard
definitions, Compose files, input methods, libraries such as Gtk+). I don't
think it'll be difficult, but it will take quite some time and of course
backwards compatibility can be maintained in the meantime by having both the old
and new definitions working. Better late than never...
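
A minimal C sketch of how a Greek input method could keep both the old and the new definitions working during such a transition. The values 0xfe62 (dead_horn) and 0xfe5c (dead_ogonek) are the ones in keysymdef.h, and 0xfe64/0xfe65 are the values proposed above; the SKETCH_ macro names, the function name and the accept_legacy flag are hypothetical. The legacy branch only makes sense for a Greek layout, since dead_horn and dead_ogonek have genuine uses in other languages.

```c
#include <stdbool.h>
#include <stdint.h>

/* Redefined locally so the sketch builds without the X11 headers. */
#define SKETCH_XK_dead_ogonek        0xfe5cu  /* legacy stand-in for daseia */
#define SKETCH_XK_dead_horn          0xfe62u  /* legacy stand-in for psili */
#define SKETCH_XK_dead_abovecomma    0xfe64u  /* value proposed in this report */
#define SKETCH_XK_dead_aboverevcomma 0xfe65u  /* value proposed in this report */

/*
 * Hypothetical helper: return the combining code point for a Greek
 * breathing dead key, or 0 if the keysym is not one. With accept_legacy
 * set, the old dead_horn/dead_ogonek abuse is still honoured.
 */
static uint32_t greek_breathing_mark(uint32_t keysym, bool accept_legacy)
{
    switch (keysym) {
    case SKETCH_XK_dead_abovecomma:
        return 0x0313u;                      /* COMBINING COMMA ABOVE */
    case SKETCH_XK_dead_aboverevcomma:
        return 0x0314u;                      /* COMBINING REVERSED COMMA ABOVE */
    case SKETCH_XK_dead_horn:
        return accept_legacy ? 0x0313u : 0u;
    case SKETCH_XK_dead_ogonek:
        return accept_legacy ? 0x0314u : 0u;
    default:
        return 0u;
    }
}
```

Once every keyboard definition and Compose file has moved to the new keysyms, accept_legacy can be switched off and the two legacy cases dropped.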
Comment 1 Daniel Stone 2007-02-27 01:35:04 UTC
Sorry about the phenomenal bug spam, guys.  Adding xorg-team@ to the QA contact so bugs don't get lost in future.
Comment 2 Daniel Stone 2007-05-07 05:19:52 UTC
(In reply to comment #0)
> I've seen discussions about problems caused by the current abuse of dead_horn
> and dead_ogonek, and a suggestion to use Unicode keysyms U0313 and U0314 (U+0313
> COMBINING COMMA ABOVE and U+0314 COMBINING REVERSED COMMA ABOVE). This is not
> correct, either, since according to the X Window System Protocol, Appendix A
> (KEYSYM encoding)...

I don't see how this is incorrect?  In general, we don't add new keysyms which already have UTF-8 codepoints.
Comment 3 Alexandros Diamantidis 2007-05-15 14:43:09 UTC
(In reply to comment #2)
> I don't see how this is incorrect?  In general, we don't add new keysyms which
> already have UTF-8 codepoints.

Right, but there is a difference between U0313 and U0314 and the new keysyms requested: the former correspond to the combining Unicode characters U+0313 and U+0314, while the latter are function keysyms for dead keys. It's the same difference between, for instance, dead_grave and U0300.

Also, as I pointed out in the bug description, appendix A of the X Protocol reference advises against using Unicode keysyms for combining characters as dead keys.
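
To make the dead_grave versus U0300 distinction concrete, here is a small C sketch. The value 0xfe50 for dead_grave is the one in keysymdef.h, and the Unicode keysym range 0x01000100 to 0x0110ffff is the one given in Appendix A; the SKETCH_ macro names and both helper functions are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Redefined locally so the sketch builds without the X11 headers. */
#define SKETCH_XK_dead_grave 0xfe50u      /* function keysym for the dead key */
#define SKETCH_KEYSYM_U0300  0x01000300u  /* Unicode keysym for U+0300 COMBINING GRAVE ACCENT */

/* Hypothetical helper: is this keysym in the directly mapped Unicode range? */
static bool is_unicode_keysym(uint32_t keysym)
{
    return keysym >= 0x01000100u && keysym <= 0x0110ffffu;
}

/*
 * Hypothetical helper: is this keysym in the 0xfe00-0xfeff block, which is
 * where keysymdef.h defines the dead_* keysyms (among other function keys)?
 */
static bool is_fe_block_keysym(uint32_t keysym)
{
    return (keysym & 0xffffff00u) == 0x0000fe00u;
}
```

The two keysyms land in disjoint ranges: SKETCH_XK_dead_grave and the requested 0xfe64/0xfe65 satisfy is_fe_block_keysym() only, while SKETCH_KEYSYM_U0300 satisfies is_unicode_keysym() only.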
Comment 4 Daniel Stone 2007-05-16 03:07:08 UTC
okay, i'm just too dumb to understand the difference between dead and combining from our point of view.  will add.
Comment 5 Simos Xenitellis 2007-08-20 13:24:27 UTC
Pinging this issue. It is being discussed at the XKB mailing list.
Comment 6 Simos Xenitellis 2007-08-20 13:29:45 UTC
To summarise, what's needed is to add

#define XK_dead_abovecomma      0xfe64
#define XK_dead_aboverevcomma   0xfe65

in 
http://gitweb.freedesktop.org/?p=xorg/proto/x11proto.git;a=blob;f=keysymdef.h
Comment 7 James Cloos 2007-09-27 16:03:02 UTC
This was committed to git in 40ed4eef92e31fcf7ea0a436e1a00cdf49484c1b
and is part of the xproto-7.0.11 release.
Comment 8 Markus Kuhn 2007-10-19 10:36:25 UTC
The main difference between a dead key and a Unicode combining character is:

  - a dead key is pressed *before* the key for the base character that it modifies (dead as in "does not advance typewriter carriage, causes next key to overstrike the character")

  - a Unicode combining character *follows* the base character that it modifies (a convention chosen because this works better for Indic scripts and others where combining characters follow phonetically the consonants that they modify).

It, therefore, seems logically cleaner to keep the two apart. If the accent does not follow the base character, it is not a Unicode combining character, but something else, a dead key. The input method will not only have to convert the code values, but also reposition them in the resulting text string.
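
The repositioning step can be sketched in C as follows. This is a simplified illustration, not how any real input method is implemented: it assumes base characters arrive as Unicode keysyms, handles a single pending dead key (a second dead key replaces the first), and emits decomposed text. The SKETCH_ macro names and both function names are hypothetical; 0xfe50 is the keysymdef.h value for dead_grave and 0xfe64/0xfe65 are the values proposed in this bug.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Redefined locally so the sketch builds without the X11 headers. */
#define SKETCH_XK_dead_grave         0xfe50u
#define SKETCH_XK_dead_abovecomma    0xfe64u
#define SKETCH_XK_dead_aboverevcomma 0xfe65u

/* Convert the code value: dead keysym -> combining code point, 0 if unknown. */
static uint32_t dead_to_combining(uint32_t keysym)
{
    switch (keysym) {
    case SKETCH_XK_dead_grave:         return 0x0300u;
    case SKETCH_XK_dead_abovecomma:    return 0x0313u;
    case SKETCH_XK_dead_aboverevcomma: return 0x0314u;
    default:                           return 0u;
    }
}

/*
 * Turn keysyms in typing order into code points in text order.
 * A dead key is held back until the base character arrives, then its
 * combining mark is written *after* that base character.
 * Returns the number of code points stored in out.
 */
static size_t keysyms_to_text(const uint32_t *keysyms, size_t count,
                              uint32_t *out, size_t out_cap)
{
    uint32_t pending = 0u;  /* combining mark waiting for its base */
    size_t len = 0;

    assert(keysyms != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        uint32_t mark = dead_to_combining(keysyms[i]);
        if (mark != 0u) {
            pending = mark;  /* dead key: remember it, emit nothing yet */
            continue;
        }
        /* Keysyms outside the Unicode range are ignored in this sketch. */
        if (keysyms[i] < 0x01000100u || keysyms[i] > 0x0110ffffu)
            continue;
        if (len < out_cap)
            out[len++] = keysyms[i] - 0x01000000u;  /* base character first */
        if (pending != 0u && len < out_cap)
            out[len++] = pending;                   /* then the mark */
        pending = 0u;
    }
    return len;
}
```

Typing dead_abovecomma followed by the Unicode keysym for Greek small alpha (0x010003b1) therefore produces U+03B1 followed by U+0313: the order of the two items is swapped between keyboard and text.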

The interesting question is: should we have some form of standard algorithm for converting a Unicode value of a combining character into a corresponding X11 dead key keysym, as we have already for all other Unicode characters (i.e., adding 0x01000000)? Something like adding 0x02000000 to the Unicode value of the combining character to get the corresponding X11 keysym?

This way, as we get interested in additional dead keys that correspond to existing or future Unicode combining characters, we will not have to invent new keysym code positions. Inventing new code positions is always bad, as it requires us to update conversion tables forever.
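
The arithmetic behind this suggestion, as a C sketch. The 0x01000000 offset is the existing Unicode keysym rule; the 0x02000000 offset is only the idea floated in this comment and is not part of any X11 specification. All macro and function names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_UNICODE_KEYSYM_OFFSET 0x01000000u  /* existing rule (Appendix A) */
#define SKETCH_DEAD_KEYSYM_OFFSET    0x02000000u  /* hypothetical, suggested above */

/* Existing rule: Unicode keysym for a code point in U+0100..U+10FFFF. */
static uint32_t unicode_to_keysym(uint32_t ucs)
{
    assert(ucs >= 0x0100u && ucs <= 0x10ffffu);
    return ucs + SKETCH_UNICODE_KEYSYM_OFFSET;
}

/* Suggested rule: dead-key keysym derived from a combining code point. */
static uint32_t combining_to_dead_keysym(uint32_t combining_ucs)
{
    assert(combining_ucs >= 0x0100u && combining_ucs <= 0x10ffffu);
    return combining_ucs + SKETCH_DEAD_KEYSYM_OFFSET;
}

/* Inverse of the suggested rule: recover the combining code point. */
static uint32_t dead_keysym_to_combining(uint32_t keysym)
{
    assert(keysym >= SKETCH_DEAD_KEYSYM_OFFSET + 0x0100u &&
           keysym <= SKETCH_DEAD_KEYSYM_OFFSET + 0x10ffffu);
    return keysym - SKETCH_DEAD_KEYSYM_OFFSET;
}
```

Under such a scheme, dead psili would be computed as 0x02000313 from U+0313 instead of being hand-assigned 0xfe64, and the mapping back to the combining character would need no table.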
Comment 9 Daniel Stone 2007-10-19 17:40:46 UTC
Thanks for the explanation, Markus: that was really helpful.  Hopefully Google archives this somehow.

As for the keysym positioning, doesn't Unicode already give us combining characters? Poking through gucharmap shows a load; do we need more than these? If so, I guess we need a new space for arbitrary combining Unicode characters, indeed.

