Using media keys in a Linux console application
I've been working on an embedded Linux media application that runs with a small screen in console mode. This screen only displays text, as the main user interface is on a different system; so there's no need for the overhead of a graphical display or desktop. However, the embedded system does require some keyboard interaction. I don't want (or need) a full keyboard, so I thought to use one of those cheap Windows Media Centre remote controls. These have a USB connection, and look exactly like a USB keyboard to the host computer. These little units cost only a few pounds, and have about a dozen keys. These are cursor keys, page up/down, tab, backspace, and a bunch of keys with no conventional interpretation.
Note:
This article is about the Linux kernel console -- what you get when there is no graphical interface at all. It does not apply in any way to a console application running in a terminal emulator on a graphical desktop.
These unconventional keys on the Media Centre remote control have labels like 'email' and 'web', along with a set of media transport keys (play, stop, forward, etc). It probably isn't entirely fair to describe these keys as 'unconventional', as most mainstream Linux distributions assign some sort of functions to these keys. For example, the volume up/down keys often do, in fact, raise and lower volume. The media playback keys might also do something. However, they are unconventional in that they did not exist on the kinds of dumb terminals on which Linux keyboard handling is modelled.
In a mainstream Linux desktop system, the fact that these unconventional keys do something at all is thanks to the graphical desktop. The desktop intercepts these keys, and takes some action, which may (or may not) be configurable by the user.
My application has no graphical user interface and, therefore, no graphical desktop. It uses `ncurses` to read the keyboard and to generate the display. The usual function in the `ncurses` library to read from the keyboard is `getch()`. This function returns a number corresponding to the key pressed. For ordinary alphanumeric keys, the value returned is simply an ASCII value. However, `ncurses` will also interpret the terminal escape sequences generated by the console or a terminal emulator. For example, pressing the 'down' key will usually generate 'escape open-bracket B', just as dumb terminals did back in the day. The `ncurses` `getch()` function quietly decodes all this legacy silliness, and returns a single number. The value of this number is arbitrary, but there's a constant `KEY_DOWN` defined in `ncurses.h` that applications can test for.
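To give a feel for what this looks like from the application's side, here is a minimal sketch using Python's standard `curses` module, which wraps the same library and exposes the same `getch()` call and `KEY_DOWN` constant. The helper name `describe_key` is invented for this example; my real application is not structured like this.

```python
import curses


def describe_key(code):
    """Return a readable label for a value returned by getch()."""
    if code == curses.KEY_DOWN:
        # The escape sequence has already been decoded into one number.
        return "KEY_DOWN"
    if 32 <= code < 127:
        # Ordinary printable keys come back as their ASCII value.
        return "ASCII " + repr(chr(code))
    return "other (%d)" % code


def main(stdscr):
    stdscr.keypad(True)    # ask curses to decode escape sequences
    stdscr.scrollok(True)  # let the display scroll when it fills up
    stdscr.addstr("Press keys, q to quit\n")
    while True:
        code = stdscr.getch()
        if code == ord("q"):
            break
        stdscr.addstr(describe_key(code) + "\n")


# To try it on a real console or terminal: curses.wrapper(main)
```

The point is only that the application tests against a named constant and never sees the escape sequence itself.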
So, in my naivety, I assumed that the unconventional keys on the Media Centre remote control would produce some code that my application could capture and process, even if I had to find out by trial-and-error what that code was.
Boy, was I wrong.
This isn't the fault of the `ncurses` library, and the rest of this article has nothing to do with `ncurses`. The sad fact is that, in the console, these extended keys generate absolutely nothing at all. You can see this simply by going to the console and running

    $ cat -v

and pressing keys until something happens. Or doesn't. If the kernel's keyboard support doesn't generate any key code, then `ncurses` can't do anything, and nor can anything else.
In the end, I was able to get all the keys on the remote control to generate something that my application could use. But doing so required a fairly deep dive into the murky world of Linux console keyboard handling.
Kernel console key mapping
So what's going on here?
The raw keyboard driver generates a scan code when a key is pressed or released. The scan code is a completely arbitrary number that represents the position of the key on the keyboard. USB keyboards have standardized scan codes -- the 'Esc' key, for example, is 0x29. The letter 'A' on my keyboard has scan code '5', but that's dependent on the keyboard layout. The scan code depends on the position of the key, not its function.
Linux was originally developed for IBM PC hardware, generally using PS/2 keyboards. As a result, the kernel normalizes raw scan codes from the various keyboard devices into the scan codes that would be used by a PS/2 keyboard. On such a keyboard, the 'Esc' key has scan code 0x01, not 0x29. It is this PS/2-style scan code that is important in keyboard mapping, because most likely you won't easily be able to change what happens to the raw USB codes. It's also important to understand that, even after this first stage of translation, we're still talking about scan codes -- numbers representing where keys are positioned in space. Applications will expect the kernel to translate these codes into something they can understand. These final translated codes are usually called 'key codes', but terminology in this area is notoriously vague. Whatever we call the final result, the translation has to take account of the fact that the key code will be affected by any modifiers (shift, ctrl) that are held down at the same time, and the status of stateful keys like 'caps lock' and 'num lock'.
Translation of scan codes to key codes is not a one-to-one mapping. Some scan codes will generate multiple key codes, and some key codes require multiple scan codes. As discussed, 'cursor down' will become escape-open-bracket-B: one key, three characters. But, unless the caps lock key is down, the letter 'A' requires four scan codes -- shift down, 'a' down, 'a' up, shift up.
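A toy model may make the many-to-one and one-to-many behaviour clearer. The sketch below is not how the kernel does it; the tables and the `translate` function are invented for illustration, and I'm assuming the usual Linux input-layer numbers (1 for Esc, 30 for 'A', 42 for left shift, 108 for cursor down).

```python
# Toy model only: the codes follow the Linux input layer
# (1 = Esc, 30 = A, 42 = left shift, 108 = cursor down), but the
# tables and logic are invented for illustration.
SHIFT = 42
PLAIN = {1: "\x1b", 30: "a", 108: "\x1b[B"}
SHIFTED = {30: "A"}


def translate(events):
    """Turn (code, pressed) events into the characters an application sees."""
    shift_down = False
    output = []
    for code, pressed in events:
        if code == SHIFT:
            shift_down = pressed      # modifier: changes state, emits nothing
        elif pressed:                 # key releases emit nothing
            table = SHIFTED if shift_down and code in SHIFTED else PLAIN
            output.append(table.get(code, ""))
    return "".join(output)


# Four events, one character.
print(repr(translate([(42, True), (30, True), (30, False), (42, False)])))
# Two events (press and release), three characters.
print(repr(translate([(108, True), (108, False)])))
```

A code that appears in neither table produces an empty string, which is essentially what happens to the media keys on an unmodified console.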
Some of this complexity is necessary, just because of the way keyboards work. Some, however, exists because there is so strong a requirement for backward compatibility. Almost all Linux applications will expect cursor keys to generate ancient VT100-style escape sequences, although there isn't the slightest reason why they should -- this assumption is so entrenched in Linux that it would be almost impossible to dislodge now.
The keyboard translation table
Be that as it may, the kernel maintains a rather complex keyboard translation table to deal with the vagaries of scan code to key code translation. The table is manipulated using the `loadkeys` utility. This utility takes a filename as an argument, but usually the utility will search a set of well-defined directories to find the specified file -- it isn't usually necessary to give a full pathname, but it's certainly possible to.

In modern Linux installations, the keyboard translation tables will be compressed in GZIP format. However, when uncompressed, they reveal ordinary text files. `loadkeys` will work with both compressed and uncompressed keyboard mapping files.
Here are the first few lines from the standard US keyboard layout file, `us.map.gz`. The location of this file varies with the Linux distribution. I've seen `/usr/lib/kbd/keymaps/legacy/i386/qwerty/` and `/usr/share/keymaps/legacy/i386/qwerty/`, and no doubt there are other possibilities.
    # us.map
    keymaps 0-2,4-6,8-9,12
    alt_is_meta
    include "qwerty-layout"
    include "linux-with-alt-and-altgr"
    strings as usual

    keycode 1 = Escape
    ...
Note that this file is not a complete key map -- it uses `include` to load more general maps. Note also -- and this is the most important part -- that it uses PS/2-style scan codes, not USB scan codes. This standardization on PS/2 codes means that a single translation table can be used for different keyboard types even though, in practice, they are all USB these days.
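If you want to poke around in these files without uncompressing them by hand, something like the following sketch will do. It is a rough reader and not a full parser of the keymap format: it only recognizes plain `include "..."` and `keycode N = ...` lines, treats `#` as starting a comment, and ignores everything else. The function names are my own.

```python
import gzip
import re

KEYCODE_LINE = re.compile(r"^\s*keycode\s+(\d+)\s*=\s*(.*)$")
INCLUDE_LINE = re.compile(r'^\s*include\s+"([^"]+)"')


def read_keymap_text(path):
    """Return the text of a keymap file, gzip-compressed or not."""
    with open(path, "rb") as f:
        data = f.read()
    if data[:2] == b"\x1f\x8b":          # gzip magic number
        data = gzip.decompress(data)
    return data.decode("latin-1")


def summarize_keymap(text):
    """Collect the include directives and keycode assignments in a keymap."""
    includes, keycodes = [], {}
    for line in text.splitlines():
        line = line.split("#", 1)[0]     # drop comments
        m = INCLUDE_LINE.match(line)
        if m:
            includes.append(m.group(1))
            continue
        m = KEYCODE_LINE.match(line)
        if m:
            keycodes[int(m.group(1))] = m.group(2).split()
    return includes, keycodes
```

Run against the fragment shown above, `summarize_keymap` would report the two included maps and a single assignment of `Escape` to code 1.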
In the keyboard mapping file, the 'Esc' key is mapped to a token `Escape`, which eventually ends up as the ASCII code 27. So far as I know, you can't put actual numeric ASCII codes in key map files -- we have to use the tokens that are defined.
To get a list of these constants, use `dumpkeys --long-info`. Irritatingly, this command has to be run on a console, even though we just want a list of recognized tokens, and this does not depend on the current key mappings.
Fortunately, the mapping tokens are reasonably self-explanatory. For example, the key code corresponding to the key combination Ctrl+N is `Control_n`. I wouldn't have guessed this, but it's easy to spot in the list.
Finding scan codes
To create or edit a keyboard mapping file, we need to know the (PS/2-style) scan codes of the keys. These are documented, but it's easier just to run `showkey` while poking the keys. This utility displays the PS/2-style scan codes of keys as they are pressed.
Here are the scan codes of the non-standard keys on a Media Centre remote control. So far as I can tell, ordinary keyboards with these extended keys generate the same codes.
    mute              113
    volume down       114
    volume up         115
    power off         116
    mail              155
    next track        163
    play/pause        164
    previous track    165
    stop              166
    www               172
The other keys on the Media Centre remote control all generate recognizable key codes, so might not need any further processing. For example, the fast-forward and rewind keys generate 'cursor right' and 'cursor left'.
It's also interesting to note that the A, B, C, and D keys on the remote control generate key codes for F1 - F4. On a Linux console, these function keys, pressed together with Alt, switch virtual terminals. These keys can be remapped as the others can, but it might be useful in some applications to leave them with their default functions.
Putting it all together
So how do we use this information to make it possible for a console application to use the non-conventional (media) keys? Simple: we need to change the keyboard mapping table so that these keys generate actual key codes. What key codes? That's entirely up to the application. If the application doesn't use any other keyboard, then the choice of mappings is completely arbitrary. We could map the remote control scan codes to letters, for example, and then code the application to expect these letters.
On the other hand, if your application might use a real keyboard as well, you need to pick key codes with that in mind. In my application, for example, I use the '+' and '-' keys to control volume. These keys make some kind of sense on a regular keyboard, and can be mapped to the volume up/volume down keys on the remote control. I use ctrl+P and ctrl+N to play the previous and next items; I could just use 'P' and 'N', but the application can actually accept letters as input, so the raw 'P' and 'N' would clash. To stop playback I use ctrl+X. 'Pause' is just the space bar.
Of course, these are just my application choices. Since I'm writing the application, and setting the keyboard translation, I have complete freedom in this area. However, it makes sense to use mappings that would make some kind of sense on a regular keyboard, even if only for testing the application.
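On the application side, these choices boil down to a small lookup table. The sketch below shows the idea in Python; the action names and the `ctrl` and `action_for` helpers are hypothetical, and the only facts it relies on are the key choices just described and the ASCII convention that Ctrl plus a letter yields the letter's position in the alphabet (so ctrl+N is 14).

```python
def ctrl(letter):
    """ASCII code produced by Ctrl plus a letter, e.g. ctrl('n') is 14."""
    return ord(letter.upper()) - 64


# Key value as returned by getch() -> action name (names are made up).
ACTIONS = {
    ord("+"): "volume_up",
    ord("-"): "volume_down",
    ord(" "): "pause",
    ctrl("n"): "next",
    ctrl("p"): "previous",
    ctrl("x"): "stop",
}


def action_for(code):
    """Map a value from getch() to an action name, or None if unbound."""
    return ACTIONS.get(code)
```

Because the table is keyed on ordinary ASCII values, the application behaves the same whether the key came from the remote control or from a real keyboard, which is what makes testing easy.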
So, having decided the mappings to use, we need to set these mappings in the kernel. The way I do this is to edit one of the existing keyboard map files. Since the 'real' keyboards I use generally have US layout, I'm basing my custom keyboard map on the stock US layout.
To make the changes I uncompress the US layout file, `us.map.gz`, and copy it to a file with a name of my choice. Then I edit this file, to add my custom mappings. For the settings I described above, I'm adding these lines:
    keycode 115 = plus
    keycode 114 = minus
    keycode 164 = space
    keycode 163 = Control_n
    keycode 165 = Control_p
    keycode 166 = Control_x
Again, the tokens `plus`, `minus`, etc., are found by inspecting the output of `dumpkeys --long-info`.
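If you'd rather generate the custom file than edit it by hand, the following sketch shows one way. The codes and tokens are the ones listed in this article; the function names and the comment line written into the file are my own inventions, and I'm assuming that a later `keycode` line overrides any earlier assignment for the same code, which matches my experience with `loadkeys` but which I haven't seen spelled out.

```python
# Remote-control key -> (code reported by showkey, keymap token).
# Codes and tokens are the ones listed in this article.
REMOTE_KEYS = {
    "volume up": (115, "plus"),
    "volume down": (114, "minus"),
    "play/pause": (164, "space"),
    "next track": (163, "Control_n"),
    "previous track": (165, "Control_p"),
    "stop": (166, "Control_x"),
}


def keymap_lines(keys):
    """Return 'keycode N = token' lines, one per remapped key."""
    return ["keycode %d = %s" % (code, token) for code, token in keys.values()]


def append_overrides(stock_text, keys):
    """Return a custom keymap: the stock text followed by the overrides."""
    body = stock_text if stock_text.endswith("\n") else stock_text + "\n"
    return body + "# media remote overrides\n" + "\n".join(keymap_lines(keys)) + "\n"
```

The result of `append_overrides` is just text, to be written to whatever file name you intend to hand to `loadkeys`.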
To test this mapping, we can just load the edited file using `loadkeys`. Poke the keys whilst running `cat -v` to check that recognizable codes are produced.
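It helps to know what 'recognizable' looks like here. `cat -v` shows control characters in caret notation, so a key mapped to `Control_n` should appear as `^N`, while a key mapped to `plus` just shows `+`. The sketch below mimics that notation, as I understand it, so you can work out what to expect for any byte; the function name is invented and it is an approximation, not a copy of what `cat` really does.

```python
def cat_v(data):
    """Render bytes roughly the way 'cat -v' displays them."""
    out = []
    for byte in data:
        prefix = ""
        if byte >= 128:               # high bit set: shown with an M- prefix
            prefix, byte = "M-", byte - 128
        if byte in (9, 10) and not prefix:
            out.append(chr(byte))     # tab and newline pass through
        elif byte < 32:
            out.append(prefix + "^" + chr(byte + 64))
        elif byte == 127:
            out.append(prefix + "^?")
        else:
            out.append(prefix + chr(byte))
    return "".join(out)


print(cat_v(b"\x0e"))      # a key mapped to Control_n
print(cat_v(b"\x1b[B"))    # the cursor-down escape sequence
```

A key that still produces nothing at all under `cat -v` has not been mapped, whatever the application later does with it.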
The final step is to ensure that the key map is loaded at boot time. How to do this depends on the kind of Linux you're running; I just use one of the start-up scripts to run `loadkeys`.
Closing remarks
It turns out that mapping the extended media keys onto real key codes is straightforward, when you know how: it's just a matter of editing a single text file, and processing it using `loadkeys`.
The problem is that this process is largely undocumented. If you do a web search for keyboard mapping in Linux, all the results you get will be about keyboard mapping in X.
This is unhelpful here, because X uses a totally different process for keyboard mapping. None of the methods described for keyboard mapping with a graphical desktop are remotely useful for the console.