22004 – us(intl-unicode) and us(alt-intl-unicode): us(intl) variants with Unicode combining

Status:	RESOLVED FIXED

Alias:	None

Product:	xkeyboard-config
Classification:	Unclassified
Component:	General
Version:	unspecified
Hardware:	Other All

Importance:	medium enhancement
Assignee:	xkb
QA Contact:

URL:	http://github.com/leoboiko/us-intl-un...
Whiteboard:
Keywords:	NEEDINFO

Depends on:	21466
Blocks:

Reported:	2009-05-30 08:21 UTC by Leonardo Boiko
Modified:	2009-06-27 13:32 UTC (History)
CC List:	1 user

Attachments
add us(intl-unicode) and us(alt-intl-unicode) to xkb/symbols/us (3.47 KB, patch) 2009-05-30 08:21 UTC, Leonardo Boiko

Description Leonardo Boiko 2009-05-30 08:21:52 UTC

Created attachment 26306
add us(intl-unicode) and us(alt-intl-unicode) to xkb/symbols/us

I wrote a couple of us(intl) variants that use Unicode combining sequences instead of precomposed characters.  This is more general and powerful, but also less supported and less natural to type in most countries.  If there is interest, I'd be honored to have them included in xkb.

See diff attached, or URL for more documentation.
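
As a rough illustration of the distinction described above (a sketch that is not taken from the attached patch), the following Python snippet compares a precomposed character with the equivalent combining sequence. The helper name `code_points` and the sample strings are assumptions chosen for the example.

```python
import unicodedata

# Hypothetical sample strings, chosen only for illustration.
precomposed = "\u00e9"      # "é" stored as one precomposed code point
combining = "e\u0301"       # "e" followed by U+0301 COMBINING ACUTE ACCENT
no_precomposed = "x\u0304"  # "x" + U+0304 COMBINING MACRON; Unicode has no single code point for this


def code_points(text):
    """Return the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


# The two spellings of "é" differ as code point sequences...
print(code_points(precomposed))
print(code_points(combining))

# ...but NFC normalization maps the combining sequence to the precomposed form.
print(unicodedata.normalize("NFC", combining) == precomposed)

# A combining sequence can also express letters that have no precomposed form,
# so NFC leaves it as two code points.
print(code_points(unicodedata.normalize("NFC", no_precomposed)))
```

Software that handles only precomposed characters may render or compare the combining form differently, which is the support issue mentioned above.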

Comment 1 Sergey V. Udaltsov 2009-06-09 03:20:53 UTC

Do you have any estimate of how many people are using these variants?

Comment 2 Leonardo Boiko 2009-06-09 06:04:38 UTC

Yes, one (namely me).

Unicode combining isn't well supported in much software today, and is little used with a US layout except perhaps by linguists and a few minority languages employing the Latin alphabet with tons of diacritics.  My intention in submitting these variants was to give more exposure to the Unicode mechanism; but if the policy is to avoid feature bloat until there's a real need, no hard feelings here.

Comment 3 Troy Korjuslommi 2009-06-10 02:39:03 UTC

There is no real need for this in X11, but we ran into this solution when designing a multilingual keyboard layout, one with support for most European characters. The current MS Windows keyboard drivers don't support multiple diacritics, so we had to include a description of a method where the base character is entered first.
This issue first popped up when working on the current Finnish keyboard standard, and then again in our current work on a CEN recommendation for multilingual keyboard layouts for Europe (MEEK). You can find related documents, and info on our work at 
http://www.csc.fi/meek/
and
http://www.cen.eu/cenorm/businessdomains/businessdomains/isss/activity/ws+meek.asp

The document's public commenting period has just started and will continue until August 28th, 2009.

Comment 4 Sergey V. Udaltsov 2009-06-10 14:04:41 UTC

Leonardo, you're saying that some apps are not compatible, and some fonts. Do you have a list of problematic apps/toolkits? Did you file bugs related to that feature?

I am asking because I think there would be no point in supporting that layout if people had massive issues using it, right?

Comment 5 Simos Xenitellis 2009-06-10 15:20:32 UTC

Thanks Troy for the links to the documents.

I am probably hijacking this report; apologies.
My current view (without having studied those documents!) is that 
* the EU languages are based on either Latin, Greek or Cyrillic
* it makes sense to me to have at least three layouts, Latin-all, Greek-all and Cyrillic-all, so that the base alphabet
* for the vast majority of characters there exist pre-composed versions. This is the case for Greek and I believe for Latin-based scripts, as used in the EU countries.
* for cases such as in Cyrillic there is no full range of pre-composed Unicode characters. In this case, it is possible to define compose sequences such as
<dead_diaeresis> <Cyrillic_GHE>  : "Г̈"   (imaginary example producing Г followed by U+0308, two characters).
* the layout can have all dead keys accessible under the AltGr shortcut. See the default 'gb' layout for an example.
* we have already moved to such a single layout for Greek, which produces all relevant Greek characters.
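
To make the Cyrillic point in the list above concrete, here is a minimal Python sketch (the helper name `compose` is an assumption introduced for the example). It checks whether a base letter plus a combining diaeresis has a precomposed form: Cyrillic А does, while Cyrillic Г does not, so the result for Г stays two characters.

```python
import unicodedata

CYRILLIC_A = "\u0410"    # А, CYRILLIC CAPITAL LETTER A
CYRILLIC_GHE = "\u0413"  # Г, CYRILLIC CAPITAL LETTER GHE
DIAERESIS = "\u0308"     # COMBINING DIAERESIS


def compose(base, mark):
    """Hypothetical helper: NFC-normalize a base letter followed by a combining mark."""
    return unicodedata.normalize("NFC", base + mark)


a_result = compose(CYRILLIC_A, DIAERESIS)
ghe_result = compose(CYRILLIC_GHE, DIAERESIS)

# А + diaeresis collapses into one precomposed code point (U+04D2).
print(len(a_result), [f"U+{ord(c):04X}" for c in a_result])

# Г + diaeresis has no precomposed form, so it remains two code points.
print(len(ghe_result), [f"U+{ord(c):04X}" for c in ghe_result])
```

A compose sequence like the imaginary one above would therefore have to emit the base letter and the combining mark together.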

Comment 6 Troy Korjuslommi 2009-06-16 06:21:22 UTC

Simos, I hope you will find the time to have a peek at the MEEK documents. 
The MEEK document contains multiple solutions. Its primary purpose is to 
help with Latin-based keyboards. Cyrillic and Greek scripts are mentioned briefly.
The coverage is primarily from the perspective of a Latin keyboard user who
needs to enter those scripts.
There is also a document on this at ISO (also in the comment phase), and the MEEK document contains the ISO proposal as one of the possible solutions.
The dead key method is another one of the possible solutions.
The third possible solution uses a global database (on servers) of modifiable cross-platform layouts and is targeted especially for Internet cafe users and others who need to enter letters from multiple languages frequently. I guess I should mention that this last one was authored by me.

Comment 7 Sergey V. Udaltsov 2009-06-27 13:32:30 UTC

FWIW, I've added them to the 'extra' section. But I would still like to get a list of related bugs in other projects.
