Keysyms
Marty Jack
martyj19 at comcast.net
Sat Jan 8 10:12:19 PST 2011
I hope that last comment didn't come across as snippy. It was merely meant to point out that, for any piece of software, there are folks who use it and folks who don't. The idea that there should be more than one way to do everything goes all the way back to when the students at Berkeley thought they could outdo the professionals at AT&T, and BSD forked from SysV, and we got two of a lot of things. It seems to make Unix people nervous if they feel they don't have a choice about what to run.
I have been looking mostly at what we need around the edges of Wayland so we can get from a world that is all X to a world that is as much Wayland as possible. One of the main things I could think of that is written with the older toolkits and still widely used is xterm.
I have also been looking at the input side of things, trying to understand exactly how device enumeration is done and how events get from the driver to the application, along with a really deep dive into Xkb.
I do have something for discussion though.
Pardon me while I do a little history lesson for those who are younger and weren't around for this.
X was released commercially around 1987, a task in which this author played some part. At that time, non-ASCII characters were handled in two ways. The languages that use ideographs (you will see them referred to as "CJK", Chinese, Japanese, Korean) were using 16-bit characters; this is how we got the wchar_t type. The Western languages were using 8-bit characters and shift codes to change what part of the character encoding space was mapped to what character repertoire. The earlier attempts at this were the "DEC Multinational", "DEC Technical", "DEC Cyrillic" and so on that you will see if you are reading documentation for the VT terminals. These codes were later standardized as ISO/IEC 8859, with ISO/IEC 2022 covering the shifting part. What happened was that the X people assigned keysym codes arbitrarily for all of these characters, because there was no standard they could conform to.
The early work on the VT terminals also gave us the idea of Compose, so you could type Compose c comma and the terminal itself would put out "C with cedilla". These are now the "dead keys" or "combining characters" of Unicode.
The Unicode project wasn't started until 1991 or so. Now we have Unicode assignments for every character on the planet. X keysyms can now be of the form 0x01kkkkkk, where kkkkkk is the Unicode code point of the character, in addition to whatever keysym they used to have. There are a couple thousand of the legacy codes clogging up the mapping algorithm.
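To make the 0x01kkkkkk form concrete, here is a rough sketch in Python of how a keysym might be turned into a Unicode code point and then into its Unicode-style keysym. The numeric ranges and the four sample legacy values are the ones documented in keysymdef.h; the function names and the tiny LEGACY_TO_UCS table are made up for illustration, and a real table would carry all of the legacy entries.

```python
# Sketch only: function names and the small table are illustrative,
# not part of any X or Wayland API.

UNICODE_KEYSYM_FLAG = 0x01000000

# A handful of legacy keysyms and the code points they stand for
# (values as listed in keysymdef.h).
LEGACY_TO_UCS = {
    0x01A1: 0x0104,  # XK_Aogonek     -> U+0104
    0x06C1: 0x0430,  # XK_Cyrillic_a  -> U+0430
    0x07E1: 0x03B1,  # XK_Greek_alpha -> U+03B1
    0x20AC: 0x20AC,  # XK_EuroSign    -> U+20AC
}


def keysym_to_ucs(keysym):
    """Return the Unicode code point for a keysym, or None if there is none."""
    # Latin-1 keysyms are numerically equal to their code points.
    if 0x20 <= keysym <= 0x7E or 0xA0 <= keysym <= 0xFF:
        return keysym
    # Directly encoded Unicode keysyms: 0x01000000 + code point.
    if 0x01000100 <= keysym <= 0x0110FFFF:
        return keysym - UNICODE_KEYSYM_FLAG
    # Everything else needs a table lookup; this is the legacy part.
    return LEGACY_TO_UCS.get(keysym)


def to_unicode_keysym(keysym):
    """Rewrite a legacy character keysym into the 0x01kkkkkk form."""
    ucs = keysym_to_ucs(keysym)
    if ucs is None:
        return keysym  # function keys, modifiers and the like pass through
    if ucs < 0x100:
        return ucs     # Latin-1 keysyms already match Unicode
    return UNICODE_KEYSYM_FLAG | ucs
```

With this, to_unicode_keysym(0x01A1) gives 0x01000104, while a non-character keysym such as 0xFF0D (XK_Return) comes back unchanged. The table lookup is the only part that would go away if the legacy codes were dropped.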
The Master List of X keysyms is in /usr/include/X11/keysymdef.h.
So the question is: given that Wayland is in many ways a fresh start, could we get to a place where we return the Unicode keysyms instead of the legacy keysyms for the glyphs where they differ, the clients can handle it, and we ditch the legacy keysyms entirely?