What’s broken in the WM_KEYDOWN/WM_CHAR input model?

I was planning on writing a rant on the input model of the Win32 API. See, while developing the input model for NGEDIT, I’ve been pondering how to implement the key-mappings, input handling, etc. Although I had used it in the past, it wasn’t in an edit-intensive, fully customizable, internationally targeted program (although I’ve implemented IME-based chat for China, Japan and Korea 🙂 ). The more I looked at the model and I thought about how to handle it, the more it seemed a complete mess in my head. But I could not pin down exactly what it was, and what a better model may be.

I set out to gain a complete understanding of the model, which I will describe here for whoever needs it while programming (I couldn’t find a good explanation that made sense – of course MSDN is of no help in this kind of situation).

The Win32 model works like this: when a key is pressed, you get a WM_KEYDOWN message (more if the key is kept depressed past the auto-repeat time). When it is released, you get a WM_KEYUP. You can access the state of the modifier keys in that moment (SHIFT, ALT and CONTROL). Both messages indicate the pressed key with their virtual key code: an integer code which uniquely identifies a key on the keyboard.
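
As a rough sketch of how an auto-repeated WM_KEYDOWN can be told apart from the first one, the code below decodes the lParam that accompanies the keystroke messages. It assumes the bit layout Microsoft documents for keystroke message flags; `KeyInfo`, `decode_key_lparam` and `is_auto_repeat` are illustrative names, and the lParam is taken as a plain 32-bit integer so the sketch compiles without windows.h.

```c
#include <stdint.h>

/* Fields packed into the lParam of WM_KEYDOWN / WM_KEYUP.
   KeyInfo is an illustrative type, not part of the Win32 API. */
typedef struct {
    unsigned repeat_count;  /* bits 0-15 */
    unsigned scan_code;     /* bits 16-23 */
    int      extended;      /* bit 24: extended key (e.g. right-hand CTRL/ALT) */
    int      alt_down;      /* bit 29: context code, set while ALT is held */
    int      was_down;      /* bit 30: key was already down before this message */
    int      released;      /* bit 31: transition state, set for key-up messages */
} KeyInfo;

static KeyInfo decode_key_lparam(uint32_t lparam)
{
    KeyInfo k;
    k.repeat_count = lparam & 0xFFFFu;
    k.scan_code    = (lparam >> 16) & 0xFFu;
    k.extended     = (int)((lparam >> 24) & 1u);
    k.alt_down     = (int)((lparam >> 29) & 1u);
    k.was_down     = (int)((lparam >> 30) & 1u);
    k.released     = (int)((lparam >> 31) & 1u);
    return k;
}

/* For a key-down message: nonzero when it comes from auto-repeat
   rather than from a fresh press. */
static int is_auto_repeat(uint32_t keydown_lparam)
{
    return decode_key_lparam(keydown_lparam).was_down;
}
```

In a window procedure this would be called with the lParam of each WM_KEYDOWN, cast to a 32-bit unsigned value.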

Apart from this, as messages get translated by the TranslateMessage() Win32 function, they are checked against the current keyboard configuration, and an additional WM_CHAR message is posted for each WM_KEYDOWN that maps to a character, carrying the current-codepage character code of the key pressed.

This is all fine and dandy, although things start getting tricky after this.

  • In non-US keyboard mappings, or in the US international keyboard, you can use some keys to modify the character typed afterwards with an accent. If you press one of these “accent” keys, a WM_DEADCHAR is generated that you’d probably better ignore, and the next WM_KEYDOWN of a vowel character will generate a WM_CHAR message with a different char code representing the accented character.
  • When you want to handle a key in your program, you have to choose between handling the WM_KEYDOWN or the WM_CHAR message. Keys such as the cursor arrows or function keys don’t have a character equivalent, so you need to handle WM_KEYDOWN. For regular character keys that are used for typing, you probably prefer to handle the WM_CHAR message, as it will get you the lower case, upper case or even accented version of the character on international layouts. A doubt always arises about other keys that generate both a WM_KEYDOWN and a WM_CHAR: what to do with TAB, RETURN, ESC, etc.? Even worse, the DEL key doesn’t generate the ASCII DEL (127) WM_CHAR message that you would expect – it is a “mute” key WM_CHAR-wise. I think I remember reading Charles Petzold lamenting the same decision in his Programming Windows book – I don’t remember what conclusion he reached, but I remember it didn’t satisfy me.
  • For letter and digit keys, the virtual key code numerically matches the uppercase ASCII code of the character on the keyboard. There are special codes for cursor arrows, function keys, and some weird keys called stuff like VK_OEM_3 (for the ‘tilde’ key in a US mapping). A key such as VK_OEM_3 is mapped to an actual alphabetic character in non-English European languages, making it difficult to distinguish alphabetic from non-alphabetic WM_KEYDOWNs.
  • Windows nicely throws a different version of the messages at you if the ALT key is pressed at the same time: you get WM_SYSKEYDOWN and WM_SYSCHAR instead of the regular WM_KEYDOWN and WM_CHAR. You also get WM_SYSKEYUP for the sake of consistency. These messages usually get handled by DefWindowProc() in order to bring up the corresponding menu item and do some special key processing. Curiously, pressing the ALT key by itself and releasing it afterwards buys you a WM_SYSKEYDOWN/WM_SYSKEYUP pair for the ALT key. But if you press ALT, then a character key, and release both, you will get all WM_SYS* messages, except for the KEYUPs of keys released after the ALT key (and including it!), which generate regular WM_KEYUPs. The F10 key is a “special key” as well, generating the WM_SYS* version of its KEYDOWN – I guess because it brings up the menu as well – and I’m unsure whether its KEYUP is SYS or not. No wonder the virtual key code of the ALT key is VK_MENU.
  • I couldn’t easily work out from MSDN whether having CTRL or SHIFT pressed at the same time as ALT would buy me SYS messages or plain vanilla messages.
  • European and US-int’l mappings treat the right ALT key as a separate AltGr key, which gives access to some extra characters that are not placed in regular positions. This key is reported to the application as having both CTRL and ALT pressed at the same time, and the system takes care of mapping the WM_KEYDOWN to the appropriate char code for the WM_CHAR message.
  • If you are interested in your Asian customers, you have a new world of IME messages to serve you. It opens a whole new avenue for spec nightmare – suffice it to say that you will be receiving two WM_CHAR messages in a row for some complex kanji or hanzi or hangeul double-byte character, corresponding to no keypress, as the user selects complex characters from his/her IME window. Your US-only-thought application may even work if you just don’t shuffle your WM_CHAR’s too much.
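
One way to picture the WM_KEYDOWN-or-WM_CHAR choice from the list above is a small predicate that says which keys have no character message at all and so must be handled on WM_KEYDOWN. This is only a sketch: the `KEY_*` constants mirror the values of the corresponding `VK_*` codes in winuser.h (redefined here so the code compiles without windows.h), `has_no_char_message` is an illustrative name, and the set of keys is limited to the ones the text mentions plus the rest of the navigation cluster.

```c
/* Values mirror the VK_* constants from winuser.h. */
enum {
    KEY_PRIOR  = 0x21,  /* Page Up */
    KEY_NEXT   = 0x22,  /* Page Down */
    KEY_END    = 0x23,
    KEY_HOME   = 0x24,
    KEY_LEFT   = 0x25,
    KEY_UP     = 0x26,
    KEY_RIGHT  = 0x27,
    KEY_DOWN   = 0x28,
    KEY_INSERT = 0x2D,
    KEY_DELETE = 0x2E,
    KEY_F1     = 0x70,
    KEY_F24    = 0x87
};

/* Nonzero for keys that produce WM_KEYDOWN but no WM_CHAR, so a program
   has to bind them on the key-down message. */
static int has_no_char_message(int vk)
{
    if (vk >= KEY_PRIOR && vk <= KEY_DOWN)     /* navigation cluster, 0x21..0x28 */
        return 1;
    if (vk == KEY_INSERT || vk == KEY_DELETE)  /* DEL is "mute" WM_CHAR-wise */
        return 1;
    if (vk >= KEY_F1 && vk <= KEY_F24)         /* function keys */
        return 1;
    return 0;
}
```

Keys such as TAB, RETURN and ESC deliberately fall through to 0 here, since they generate both messages and the program has to pick one.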

After writing the list, I’m looking back at the title of the post and wondering if it’s really a good question. Anyway.

Coming back to development, I keep a “todo list” of vi/vim commands that need implementation. It happens that the innocent “Redo” command in vi/vim is achieved by the Ctrl-R key combination. There I went and mapped the ‘r’ WM_CHAR with the CTRL key modifier set to the redo command, pressed Ctrl-R and… nothing. I checked what the program was receiving, and found out that Windows was translating the ‘R’ virtual key pressed with CTRL into the proper C-R ASCII code, which happens to be 0x12, or 18 in decimal.

At that moment, I just hated the fact, and the memory of how input behaves from previous projects (which was not as clear as I have described above) kicked in, and had me for days happily implementing all sorts of other vi/vim commands that didn’t require handling the !~@# control key. I was working, but in some way, I was procrastinating 🙂 Seriously, this is the kind of stuff that keeps me from tackling a task.

After a few days, I finally decided to drag myself to do it, and started researching all the intricacies of key input, dumping the results, and trying to build a clear mental image. I found out some stuff that I didn’t know and haven’t seen anywhere on the web (of course, MSDN is less than complete).

Here is a nice list of the messages that the R key can generate in its different combinations:


             nothing   CTRL     ALT        CTRL+ALT
  nothing    KD+'r'    KD+^R    SKD+S'r'   KD
  SHIFT      KD+'R'    KD+^R    SKD+S'r'   KD

The row selects whether SHIFT is pressed. The column selects the combination of CTRL and ALT that is pressed. ‘KD’ represents the WM_KEYDOWN message, and ‘SKD’ represents WM_SYSKEYDOWN. A quoted character such as ‘x’ represents the WM_CHAR message, and S’x’ represents WM_SYSCHAR. ^X means the ASCII C-X control code in WM_CHAR. WM_KEYUPs are not shown. And of course, the case of the char code in the leftmost column is toggled by the CAPSLOCK status (I’m unsure about what happens when CAPSLOCK is on and ALT is used together with the key, although your program had better not behave too differently).
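
The table can also be written down as a lookup function, which is handy as a reference when testing a key-binding layer. This is a sketch of the table as observed above, not of anything Windows exposes: `RKeyMessages` and `r_key_messages` are made-up names, the `MSG_*` constants mirror the values of the corresponding `WM_*` messages in winuser.h, and CAPSLOCK is ignored.

```c
/* Values mirror WM_KEYDOWN, WM_CHAR, WM_SYSKEYDOWN and WM_SYSCHAR. */
enum {
    MSG_KEYDOWN    = 0x0100,
    MSG_CHAR       = 0x0102,
    MSG_SYSKEYDOWN = 0x0104,
    MSG_SYSCHAR    = 0x0106
};

/* What the R key generates for one combination of modifiers. */
typedef struct {
    int down_msg;  /* MSG_KEYDOWN or MSG_SYSKEYDOWN */
    int char_msg;  /* MSG_CHAR, MSG_SYSCHAR, or 0 when no character message follows */
    int ch;        /* character code carried by char_msg, 0 when there is none */
} RKeyMessages;

static RKeyMessages r_key_messages(int shift, int ctrl, int alt)
{
    RKeyMessages m = { MSG_KEYDOWN, 0, 0 };

    if (ctrl && alt)              /* CTRL+ALT column: key-down only */
        return m;

    if (alt) {                    /* ALT column: SYS versions, lowercase char */
        m.down_msg = MSG_SYSKEYDOWN;
        m.char_msg = MSG_SYSCHAR;
        m.ch = 'r';
        return m;
    }

    m.char_msg = MSG_CHAR;
    if (ctrl)                     /* CTRL column: the ^R control code */
        m.ch = 0x12;
    else                          /* first column: case follows SHIFT */
        m.ch = shift ? 'R' : 'r';
    return m;
}
```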

Other, non-alphabetic keys get similar mappings, although they don’t get a WM_CHAR when pressed together with CTRL (in most cases, as we’ll see in a moment).

I set out to complete my understanding of the mapping, and wondered what would happen with other keys. It turns out that Windows goes to great lengths to come up with the ASCII code that may correspond to the keypress (in some cases quite dubiously, to my understanding of affairs).

All (English) alphabetic keys (from A to Z all included) get mapped to their control codes (ASCII 1 to 26). That means that if you press Ctrl-M, your program will receive a nice WM_CHAR message with the 13 code, usually corresponding to the RETURN key. Be sure to filter that if you provide Ctrl+Key mappings! It even works as a replacement to RETURN in unsophisticated programs such as notepad.

Then I set out to find out if Windows would generate the rest of the control codes – poking around, looking at ASCII charts in more detail than I’d like to, I found out that you can generate WM_CHAR with ASCII 28 with C-\, ASCII 29 with C-], ASCII 30 with C-S-6 (which is more like C-^), and ASCII 31 with C-S-hyphen (which should be read as C-_). I usually use a non-US keyboard, for which I can tell you that the ASCII mapping is a bit lousier than for the US mapping, as the ‘\[]^’ symbols are moved around to other keys but Windows still performs the mapping as if the keyboard had US keytops, except in the case of DEADCHAR-generating keys, which simply behave differently…

The ASCII code 27, that is, ESC, can be both generated by the ESC key and by the C-[ combination.
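
All the control codes listed so far follow the classic ASCII rule: take the uppercase character on the (US) keytop and keep its low five bits. A minimal sketch of that rule follows; `ctrl_char_code` is an illustrative helper, not a Win32 call, and it covers only the letters and the five symbols mentioned above.

```c
/* Returns the control code that arrives in WM_CHAR for Ctrl+ch on a US
   layout, or -1 when ch is not one of the keys discussed in the text.
   Covers A-Z (codes 1..26) and [ \ ] ^ _ (codes 27..31). */
static int ctrl_char_code(int ch)
{
    if (ch >= 'a' && ch <= 'z')
        ch -= 'a' - 'A';          /* fold to uppercase first */

    if (ch >= 'A' && ch <= '_')   /* 0x41..0x5F in ASCII */
        return ch & 0x1F;         /* keep the low five bits */

    return -1;
}
```

For key bindings this means a WM_CHAR of 13 alone cannot tell Ctrl-M from RETURN, and 27 alone cannot tell Ctrl-[ from ESC; the preceding WM_KEYDOWN has to be consulted as well.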

And I also discovered some curious behavior: RETURN by itself generates WM_CHAR with ASCII 13, while C-RETURN generates WM_CHAR with ASCII 10. I was about not to tell you that C-S-RETURN generates no WM_CHAR, in order to get you lost if you aren’t already 🙂

And a mysterious ASCII code, DEL (127), was conspicuously absent on the keyboard – the DEL key would not generate it with any combination of modifier keys. But finally I found it: while BACKSPACE generates ASCII 8 (BS), combined with the CTRL key it generates the famous ASCII 127 DEL character.
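
The RETURN, BACKSPACE, ESC and DEL findings can be collected in one small function. It is a sketch that records only the combinations described above: `special_key_char` and `NO_CHAR` are made-up names, the `KEY_*` constants mirror the corresponding `VK_*` values from winuser.h, and any combination the text does not cover is reported as `NO_CHAR` rather than guessed.

```c
/* Values mirror VK_BACK, VK_RETURN, VK_ESCAPE and VK_DELETE. */
enum {
    KEY_BACK   = 0x08,
    KEY_RETURN = 0x0D,
    KEY_ESCAPE = 0x1B,
    KEY_DELETE = 0x2E
};

/* No WM_CHAR observed, or combination not covered in the text. */
#define NO_CHAR (-1)

/* Character code carried by the WM_CHAR that follows the key-down of vk.
   shift is only consulted for RETURN. */
static int special_key_char(int vk, int ctrl, int shift)
{
    switch (vk) {
    case KEY_RETURN:
        if (ctrl && shift)
            return NO_CHAR;         /* C-S-RETURN: no WM_CHAR */
        return ctrl ? 10 : 13;      /* C-RETURN gives LF, RETURN gives CR */
    case KEY_BACK:
        return ctrl ? 127 : 8;      /* C-BACKSPACE gives DEL, BACKSPACE gives BS */
    case KEY_ESCAPE:
        return ctrl ? NO_CHAR : 27; /* plain ESC gives 27; C-ESC is not covered */
    case KEY_DELETE:
        return NO_CHAR;             /* the DEL key never produces WM_CHAR */
    default:
        return NO_CHAR;
    }
}
```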

Wow… I sincerely hope this explanation turns out to be useful or interesting to someone. I would’ve been happy to find it myself a couple of weeks ago.

With this understanding, I could finally go and map the Ctrl-R key to the Redo command, go around binding other keys, and move on. I could also discuss it with geek friends and all have a good laugh. But something was still dangling in my mind. Apart from being awkward, the system seemed flawed, but I could not pinpoint why.

Even if this is all a mess, and disregarding the far-east input stuff, I was thinking that I had to come up with a better design if I was to rant about it. Of course, the SYS nightmare should be easy to fix in a new design (messages should be the same, and the fact that ALT or F10 or ALT-Shortcut brings the menu should be stuffed somewhere else), but something seemed to be fundamentally wrong and I could not grasp it.

Finally, today I realized what it was.

The input model is losing one piece of critical information: the fact that WM_KEYDOWN and WM_CHAR are separate and completely unrelated loses the concept of a keypress. And there is no way of bringing that back.

See, having the system “cook” some type of information, such as the character mapping of the key if the key has a sensible one, is ok. Receiving extra information is great and the system will do a much better job of supporting all national specifics than the app. Even if that information is “cooked” in a weird way, such as the SYS stuff (or even the DEADCHAR stuff), it is ok and you could just cope with it. But you dearly want to be able to handle a keypress in some way, either using or ignoring the “cooked” information or just the raw virtual key code.

Realizing this, I could finally think up what a better scheme would be. Wouldn’t it be great if Windows provided a WM_KEYPRESS message including all the information? Say, the virtual key code, the fully-cooked ASCII mapping that WM_CHAR gets, some sort of indication of a DEADCHAR keypress, the state of the modifier keys, maybe even the uncooked keytop ASCII char if that exists?

Probably, Asian IME could generate keypresses with a null virtual key code to send weird characters to un-aware applications just handling input, but making it clear that no key was pressed for that character to be sent to the application, and the application would not try to match it up with keyboard mappings. Key maps would always refer to virtual keys, and text input would always just take the fully-cooked ASCII component. And we would all be happy.
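
To make the idea concrete, here is a sketch of what the payload of such a message might look like. Everything in it is hypothetical: Windows has no WM_KEYPRESS, and `KeyPress`, `make_keypress`, `is_bindable` and `text_input_char` are names invented for illustration.

```c
/* Hypothetical payload of the imagined WM_KEYPRESS message. */
typedef struct {
    int vk;           /* virtual key code; 0 when no key was pressed (IME input) */
    int cooked_char;  /* fully cooked character, as WM_CHAR would carry; 0 if none */
    int keytop_char;  /* uncooked keytop character; 0 if the key has none */
    int is_dead_key;  /* nonzero for a DEADCHAR-style accent keypress */
    int shift, ctrl, alt;  /* modifier state at the time of the keypress */
} KeyPress;

static KeyPress make_keypress(int vk, int cooked_char, int keytop_char,
                              int is_dead_key, int shift, int ctrl, int alt)
{
    KeyPress kp;
    kp.vk = vk;
    kp.cooked_char = cooked_char;
    kp.keytop_char = keytop_char;
    kp.is_dead_key = is_dead_key;
    kp.shift = shift;
    kp.ctrl = ctrl;
    kp.alt = alt;
    return kp;
}

/* Key maps refer only to virtual keys, so IME-generated characters
   (vk == 0) are never matched against bindings. */
static int is_bindable(const KeyPress *kp)
{
    return kp->vk != 0;
}

/* Text input takes only the cooked character; a dead key contributes
   nothing until the following keypress delivers the accented result. */
static int text_input_char(const KeyPress *kp)
{
    return kp->is_dead_key ? 0 : kp->cooked_char;
}
```

With a record like this, a Ctrl-R binding would test `vk == 'R'` with `ctrl` set, while the text-entry path would simply call `text_input_char()`.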

The good thing is that this could have been implemented at any time, as adding a new message does not create incompatibility. But we are now stuck with the really weird WM_KEYDOWN/WM_CHAR model.

I won’t hold my breath until Microsoft includes something along these lines 🙂

13 Responses to “What’s broken in the WM_KEYDOWN/WM_CHAR input model?”

  1. Eric W. Bachtal Says:

    Wonderful. Thanks for taking the time to share your findings. You’ve saved me countless hours trying to decipher this myself.

  2. J Says:

    You’re welcome Eric. I plan on setting up an “articles” section up on http://www.ngedit.com linking to these posts, as I do think they can be useful. Glad it was useful to you.

  3. Dan Says:

    Ditto, thanks a ton.

  4. Null Says:

    I’d also like to say thanks for sharing your findings and insight. And Google certainly seems to think highly of you. 🙂

  5. J Says:

    You’re welcome. Google certainly likes this article – first page on both WM_CHAR and WM_KEYDOWN! I would certainly trade that, though, for a top spot in other keywords 🙂 But at least programmer souls confused by the confusing input model will find this easily.

  6. Dennis Says:

    Thanks for the great summary!

    You mention that the concept of a keypress is lost, because WM_KEYDOWN and WM_CHAR are completely unrelated. However, at least according to MSDN (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/winui/winui/windowsuserinterface/userinput/keyboardinput/aboutkeyboardinput.asp) – I know you don’t think highly of MSDN 😉 – WM_KEYDOWN messages contain a “previous key-state flag” in their lParam value. If this flag is 1, the WM_KEYDOWN message is a result of the autorepeat functionality. Further down on the same page they say, “The contents of the lParam parameter of a character message are identical to the contents of the lParam parameter of the key-down message that was translated to produce the character message.”
    So it should be possible to recognize keypress events by examining vanilla WM_KEYDOWN or cooked WM_CHAR events. Am I missing something?

  7. J Says:

    Dennis, thanks a lot for the pointer, and it may very well be that you are completely right. I will certainly research the issue. It will be incredibly useful, as establishing a reliable link between a WM_KEYDOWN and its WM_CHAR would allow me to recover ‘keypresses’ with all the relevant info and provide a solid key remapping model. I’ll post about what I find out.

    Thanks!

  8. pieter Says:

    Well, say when receiving WM_KEYDOWN you remember the lParam, in order to compare it to the lParam of the next WM_CHAR message, but what with events that only generate WM_KEYDOWN (e.g the arrow keys)? There you are with your lParam. This would be a solution if with the WM_KEYDOWN some kind of flag was passed that says if you can expect a corresponding WM_CHAR or not.
    Still, thanks a lot for all the information.

  9. Glenn Says:

    Kudos on this page. not much to add but I like it and found it to be extremely useful. Wanted to use that delete key myself! Nuts.

  10. J Says:

    Glenn, thanks a lot. I’m surprised but happy that this article is #2 if you Google for WM_CHAR and #6 for WM_KEYDOWN. It certainly ensures easy access to the info for other programmers trying to slay the same dragon. I’m now implementing full mapping support in my vi emulator for VS, and it’s proving daunting!

  11. Reuben Says:

    Thanks for the USEFUL information.

    I have also noticed that you can trap the delete key as wParam = 46 in a WM_KEYDOWN message.
    I have used this to stop users using the delete key in certain fields.

  12. Julien Picalausa Says:

    First, I would like to say that this post was very inspiring for me. I learned some useful things from it 🙂

    Also, this is a simple way that seems to work for me when trying to get the information of WM_KEYDOWN and WM_CHAR together:

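    #include <windows.h>

    /* Translate a key message and pull the resulting character message (if any)
       straight off the queue, so that both can be handled together. */
    BOOL MyTranslateMessage(const MSG* message)
    {
        /* Nothing to do for messages that TranslateMessage does not translate. */
        if (!TranslateMessage(message))
            return FALSE;

        MSG translated_message;
        BOOL has_translated_message = PeekMessage(&translated_message, message->hwnd, WM_KEYFIRST, WM_KEYLAST, PM_REMOVE);

        if (has_translated_message)
        {
            //Do stuff with the data of the translated message and the original message
        }
        else
        {
            //Do stuff with the message
        }

        return TRUE;
    }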

    I cannot guarantee this to be entirely flawless since I’ve only been using it for testing purposes, but logically, you generate one message with TranslateMessage that you retrieve and remove directly with PeekMessage, so unless something else posts WM_CHAR messages to your message queue, you should be fine.

  13. J Says:

    Julien, that might be a workable solution in a stand-alone app that you control. Nowadays, in my ViEmu, I am subclassing Visual Studio’s editor window, and I have to rely on just the messages that arrive at the window, so it’s not a solution.

    In any case, if you end up testing the approach, I’d be grateful to learn how it went.
