Field | Value
---|---
Summary | Compose file problem with some Greek accents
Product | xorg
Component | Lib/Xlib
Status | RESOLVED FIXED
Severity | normal
Priority | medium
Version | git
Hardware | Other
OS | All
Reporter | Brice Goglin <brice.goglin>
Assignee | James Cloos <cloos>
QA Contact | Xorg Project Team <xorg-team>
CC | adia, mat, mfabian, simos.bugzilla, sndirsch
URL | http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=436923
Bug Blocks | 13275

Description
Brice Goglin
2007-08-10 08:57:37 UTC
> Perhaps the fundamental solution is to introduce new named keysyms (proposal:
> dead_dasia for "314" and dead_psili for "313") for use in both files.

Now that new keysyms have been defined (see bug #9306), other places that use the old ones should be changed to reflect this. The el_GR.UTF-8/Compose file currently has two lines for each psili/dasia compose sequence, one with dead_horn/dead_ogonek and one with U0313/U0314. The new keysyms are dead_psili/dead_dasia.

The only question is, should the new sequences *replace* the old ones, or should they just be added to the existing ones for backwards compatibility? I'm attaching two versions of the changed file, corresponding to the two possibilities. Someone more qualified than me should select the best one to use.

BTW, I also took the opportunity to fix spacing in the file, so that columns would align. That's why I'm not attaching patches, since the real changes would be drowned in the whitespace ones. Sorry if this causes trouble for those reviewing the changes.

Created attachment 12592
nls/el_GR.UTF-8/Compose.pre with added sequences
This one adds a new line with the new keysyms. For example...
<dead_horn> <Greek_alpha> : "ἀ" U1f00
<U0313> <Greek_alpha> : "ἀ" U1f00
<dead_psili> <Greek_alpha> : "ἀ" U1f00
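
The "added sequences" variant shown above can be sketched roughly as follows. This is only an illustration in Python, not the script actually used for the attachment. The function name `add_new_keysym_lines` is hypothetical, and the old-to-new mapping (dead_horn to dead_psili, dead_ogonek to dead_dasia) is an assumption inferred from the example lines and the keysym proposal quoted in this report.

```python
import re

# Assumed mapping, inferred from this report: the old stand-in dead keys
# and the new dedicated keysyms that should get their own lines.
OLD_TO_NEW = {"dead_horn": "dead_psili", "dead_ogonek": "dead_dasia"}

# A keysym reference in a Compose rule, e.g. <dead_horn> or <Greek_alpha>.
KEYSYM = re.compile(r"<(\w+)>")


def add_new_keysym_lines(lines):
    """Return the lines with a dead_psili/dead_dasia variant added after
    every rule whose key sequence uses dead_horn or dead_ogonek."""
    existing = set(lines)
    out = []
    for line in lines:
        out.append(line)
        if line.lstrip().startswith("#"):
            continue  # comment line, leave untouched
        lhs, sep, rhs = line.partition(":")
        if not sep:
            continue  # not a compose rule (blank line, include, ...)
        # Rewrite only the key sequence on the left of the first colon.
        new_lhs = KEYSYM.sub(
            lambda m: "<%s>" % OLD_TO_NEW.get(m.group(1), m.group(1)), lhs
        )
        new_line = new_lhs + sep + rhs
        if new_lhs != lhs and new_line not in existing:
            out.append(new_line)
            existing.add(new_line)
    return out
```

Unlike the attachment, this sketch places each new line directly after the dead_horn/dead_ogonek line it was derived from, and it skips rules that already have a new-keysym counterpart, so running it twice gives the same result.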
Created attachment 12593
nls/el_GR.UTF-8/Compose.pre with only psili/dasia keysyms
Same as the previous attachment with all lines using the old keysyms removed.
(In reply to comment #4)
> Could this be related to Bug #5129?

Yes, in that the same changes should be applied to en_US.UTF-8/Compose too. Apart from that, I see that the polytonic Greek part of that file has many other problems that should definitely be fixed. I'll take a look at this and post a patch there. Thanks!

Vassilis Vasaitis wrote a script to create the section for Greek polytonic. His website is not active anymore; however, I put the script online at http://planet.ellak.gr/misc/polytonic-compose.pl Would it make sense to get the Greek polytonic section created as the output of the script only?

Stefan Dirsch> Could this be related to Bug #5129

Yes, I reopened bug #5129 just because of this problem. Somebody replaced all Uxxxx with U1000xxxx in the compose file recently because of a perceived conflict which didn’t really exist. This change must be reverted because it was wrong.

I used a script similar to the one in the log for libX11.git commit c76d30253f1483ac8200ad5c032a818907e65030 to add dead_psili and dead_dasia entries to the en_US.UTF-8 and el_GR.UTF-8 Compose.pre files. The en file has entries like <U10000313> where the el file has <U0313>. en was changed by commit 4c3e34bece7402f08139d34d1ef5834e3cf533c7; I'll update el to match later today.

If anyone wants to work on making the build use http://planet.ellak.gr/misc/polytonic-compose.pl from Simos’ comment above to generate the Compose.pre at tar-creation-time, please add it here!

In fact, you should make the opposite change - that is, keep el as it is now and update en. Unicode keysym names have the following meaning: Uxxxx, where xxxx is a hexadecimal string between 100 and 10ffff, corresponds to the keysym with value 0x01000000 + xxxx. So, entries like <U10000313> are incorrect - the correct keysym for Unicode character U+0313 is <U0313>. Comment 13 on bug #5129 is saying this, too.

Here's a Perl one-liner to fix this:

perl -pe 's/U1([0-9A-Fa-f]{7})/sprintf "U%04X", hex $1/ge'

...although the en file has various other problems, and at least for the polytonic Greek part it will be better, I think, to dump the current entries and recreate them with an appropriate script.

The current el Compose seems mostly correct to me, although I haven't tested it yet. The only strange thing I noticed is sequences like...

<dead_dasia> <dead_ogonek> <U0313> <dead_horn>

...etc. - that is, entries mixing different incorrect keysyms for aspirations. They don't cause any problems, but since a given keyboard layout will either have the correct new keysyms for psili and dasia, or will have one of the old incorrect pairs, they are unnecessary.

(In reply to comment #9)
> Unicode keysym names have the following meaning: Uxxxx, where xxxx is
> a hexadecimal string between 100 and 10ffff, corresponds to keysym with value
> 0x01000000 + xxxx. So, entries like <U10000313> are incorrect - the correct
> keysym for Unicode character U+0313 is <U0313>.

OK. I see in libX11/src/KeysymStr.c (cf: http://cgit.freedesktop.org/xorg/lib/libX11/tree/src/KeysymStr.c at the end of the file) code which does exactly that; I’ll push a fix throughout the nls files in libX11.git. It looks like code points beyond UFFFF use UXXXXXXXX (a ten octet buffer is allocated rather than a six octet buffer). I’ll ensure any of those are also correct.

The U1000XXXX → UXXXX and U1001XXXX → U1XXXX issue is fixed in commit 438d02ebc08ee171cf1d3936f4c81050d428ab92. Please reopen if I missed anything.

Thanks James. On a somewhat similar note, I filed Bug 14013 to add "dead_perispomeni". Currently Greek Polytonic uses dead_tilde, which corresponds to 0x303 (Unicode), and not 0x342 (Perispomeni). I am posting this in case any subscribers to this bug are also interested.
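
For anyone revisiting the keysym naming rule and the U1000XXXX → UXXXX cleanup discussed in this thread, the following is a minimal Python sketch of both. It is not the libX11 code (that is C, in KeysymStr.c) and not the script used for the actual commit. All function names are hypothetical. The uppercase "U%04X" format simply follows the Perl one-liner quoted in this thread, and the word boundaries in the pattern are a small tightening that the one-liner does not have.

```python
import re

# Offset between a Unicode code point and its keysym value, as stated
# in the thread (keysym value = 0x01000000 + code point).
UNICODE_KEYSYM_OFFSET = 0x01000000


def unicode_keysym_value(code_point):
    """Keysym value for a code point in the range 0x100..0x10FFFF."""
    if not 0x100 <= code_point <= 0x10FFFF:
        raise ValueError("code point out of range: %#x" % code_point)
    return UNICODE_KEYSYM_OFFSET + code_point


def unicode_keysym_name(code_point):
    """Compose-file name: 'U' plus the code point in hex, at least
    four digits. The name carries the code point, not the keysym value."""
    unicode_keysym_value(code_point)  # called only for its range check
    return "U%04X" % code_point


# Names that wrongly embed the keysym value, e.g. U10000313 or U1001D11E.
_MISNAMED = re.compile(r"\bU1([0-9A-Fa-f]{7})\b")


def fix_misnamed_keysyms(text):
    """Rewrite U1000XXXX as UXXXX and U1001XXXX as U1XXXX."""
    return _MISNAMED.sub(lambda m: "U%04X" % int(m.group(1), 16), text)
```

Applied to a Compose line, `fix_misnamed_keysyms` leaves correct names such as <U0313> alone and only rewrites names of the form U1 followed by exactly seven hex digits.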