OS Detecting QMK keyboard
What bits are exchanged when you plug in a USB keyboard? Can you detect the OS with those bits?
Recently, I got a mechanical keyboard named KBDfans Tiger Lite as a gift (Thanks Rose!). The keyboard runs the QMK (Quantum Mechanical Keyboard) keyboard firmware, which is open source and allows easy custom modification of the keyboard. I was pretty excited to get my hands on it, since I've been wanting to be able to customize my keyboard firmware.
There were a few tweaks I wanted to make directly on the keyboard that were either impossible or hard to do reliably otherwise:
- Consistent Media key layout across computers.
- On Windows, AutoHotkey can't intercept keystrokes for Admin applications without also itself running as Admin, which I didn't want to do.
- On Mac:
- I want to use a single modifier key (i.e., Right Alt) to switch between languages. MacOS doesn't allow a single-modifier shortcuts, and the Mac Fn key, which can switch languages, is non-trivial to simulate.
- I found the MacOS emoji shortcut ^⌘+Space is legitimately hard to enter for me, compared to the Windows equivalent ⊞+..
- I wanted the keyboard to automatically switch to the MacOS-friendly layout when I switch from my personal PC to my work laptop, and vice versa.
The first three items were fairly simple to do with QMK. However, the last item, OS detection, proved to be non-trivial. QMK doesn't have such a feature built-in, and the USB devices don't get much information about its host (i.e., PC) at all.
I was able to put together something that works over the holidays, and I wanted to share the details of how it works.
The idea was first described in an Arduino project called FingerprintUSBHost, and Ruslan Sayfutdinov (KapJI) implemented a working version for QMK. Without the existing examples, I wouldn't have been able to come up with this idea.
Just Merge it!
After merging KapJI's change, the code mostly worked, but it was missing one major feature for me: when I switch between the PC and the Mac, the OS detection stopped working after the first device.
After reading the OS-detection code by KapJI, I understood that there is a function
get_usb_descriptor to return whatever USB descriptor type the host requests, and that their os detection code records the frequency of the value of the
wLength field for "String-type descriptors". I had a vague understanding of what a "usb descriptor" might be (that it's related to the USB device initialization) but wasn't sure how it works. I understood that this counter has to be reset but wasn't sure where to do it. There was documentation for the feature but, it did not explain to me why this works, or what the meaning of those fields were. I decided to dig deeper since I was having a slow week.
How does the OS detection work at all?
- Every USB device goes through the setup process. As part of the setup process, the host requests for a bunch of descriptors to learn about the devices. "Descriptors" are generic structures used to describe a USB device. There are many subtypes of descriptors.
- A USB device is described by one Device descriptor:
- A Device descriptor contains Configuration descriptors (but usually there is only one).
- A Configuration descriptor contains Interface descriptors.
- An Interface descriptor corresponds to what we end-users think of as an actual "device" in the OS. An Interface descriptor contains Endpoint descriptors.
- Endpoint descriptors describe the communication pipes for their interfaces, which is how the interface actually exchange bits and bytes with the host.
- Descriptors can refer to each other. For example, a Device descriptor has its name and its manufacturer, but they are not included in the Device descriptor itself. Rather, the Device descriptor makes a reference to a String descriptor by its index, which contains the actual String data.
- For String Descriptors,
wLengthfield refers to the maximum size of the string that the host is willing to accept.
As an example this was the list of all of its descriptors for my keyboard.
Putting it All Together
With this background, I was finally able to understand how the OS-detection works:
- When the keyboard is plugged in, the host asks for the device descriptor and its subparts.
- For this particular keyboard, there are 3 String descriptors of interest:
0x00: List of supported languages for all String descriptors. For QMK, it's hard-coded as US English
0x01: This is the name of the manufacturer of the keyboard
0x02: This is the name of the product itself (i.e., keyboard)
One interesting quirk that makes the OS-detection work is that the real-world OSes request the same String descriptor multiple times with different
wLength, which specifies the maximum size of the String that the host is willing to accept.
So why does this particular call pattern occur? My guess is that this behaviour exists to work around defective USB devices. For Linux, I was able to find its source for querying String descriptors. First thing to note is that the behaviour that the OS-detection code looks for is consistent with the source code that we see. Linux asks, right away, for strings with
wLength 255 (0xff) and never again if the device is well-behaved. Second, Below the initial String query, you can also see that Linux has different workarounds for defective devices, which didn't kick in for my keyboard.
While I can't read the source for Windows or MacOS, but based on the Linux code, it seems likely that the other OSes also have similar workarounds for different USB devices. Lucky for me, these
wLength patterns occur consistently enough for the OS-detection code to work reliably.
When to reset the OS detection data?
Now that I know how the OS detection works, I still had to find out where to reset the OS detection counter, since the original code didn't have any call to clear them.
My first attempt was to just delete the counter when the host asks for the Device descriptor, since that is the top-most logical object of the device. Unfortunately, this didn't work for a couple of reasons. First, there is no guarantee that the Device descriptor is only queried once and in fact the supported language String descriptor (index 0) can be logically queried even before the Device descriptor itself, since it's independent of the Device descriptor.
After going through the QMK source code, I found that there is an undocumented hook named
notify_usb_device_state_change_user, which gets called on USB reset, which happens when the KVM switches between the host devices.
The original code also doesn't specify how long I need to wait after the USB-reset before executing the OS-detection (it just says "probably hundreds of milliseconds") but by now I knew exactly how long:
Based on my observation, however, 1 second was plenty, so that's what I settled with, and now I have a cool, one-of-a-kind keyboard 😁