I’m a professional writer and hobbyist hacker, and I’m bashing a keyboard for at least 75% of my working day. Unsurprisingly, then, I’m also a mechanical keyboard user. So, when I read about a new threat that can remotely tell what you are typing just by listening to the audio of your clicky keyboard, I’m all ears. After all, if a remote hacker could listen in to my typing and grab my passwords and other ‘secrets’ out of the ether, that would be quite the cause for concern.
The threat in question comes by way of a security researcher who has been exploring the viability of such remote eavesdropping activity for a while now. The latest incarnation of the research is called Keytap3, and the researcher even has a website where you can try it out for yourself, assuming you are typing in English and don’t type too quickly. It works, and this is very much an over-simplification, by grouping the recorded key sounds into clusters and then analyzing the frequency of n-grams (contiguous sequences of items from a given sample) across those clusters. Well, I say works…
Can the sound of your keyboard reveal what you are typing?
Please don’t get me wrong, this article is not in any way meant to ridicule or demean research such as this. As regular readers of mine will appreciate, and you can see for yourself with a quick search of my published work archive, I’m fascinated and sometimes awed by the work done by security researchers. OK, with that preamble out of the way I’m guessing you may have already realized what the answer to the question posed by the sub-title is. Actually, it’s not a big no but rather a maybe. Not everything, well very little if the truth be told, that is demonstrated under lab conditions will work in the ‘real world’ beyond those carefully controlled boundaries. However, such work can eventually lead to working exploits along the way.
I took part in the test scenario, and the result was much as I would expect, given that the software analyzing the audio didn’t know what keyboard I was using (and was unlikely to have been trained on every variation out there, in both keyboard model and key switch type) or what microphone I was using, to name but two variables that could throw a spanner in the works. That result was gobbledegook, and nothing remotely resembling what I had actually been typing. However, it is clear that the researcher in question, Georgi Gerganov, has been improving his techniques: this third incarnation, Keytap3, has had success in the lab, and the threat model continues to evolve. I took the liberty, therefore, of having an online chat with Gerganov about why my interactions with Keytap3 failed and where he goes from here.
Do you need to worry about a threat actor being able to hack your password by listening to you type? Heck no. There are far easier ways that they can employ to get your passwords than this, no matter how interesting from a security-nerd perspective it may be. However, watch this space. Or should I say, listen up?
Interview with a listener
I explained to Gerganov that I had been using a Logitech G915 keyboard with the ‘clicky’ key switch option and a Marantz Pro MPM1000 microphone, and that I was hard-pressed to find a single word in the output that made sense, let alone one that resembled what I’d been typing. I therefore asked whether a threat scenario that translates typing sounds into typed text, without substantial ‘training’ to encompass different keyboards, microphones, typing styles and the like, could ever be expected to work. Here’s what he told me:
“One possible explanation to the results that you observe is that simply Keytap3 is somehow overfitted to my setup or style. Even though I have tried to keep the implementation as general as possible, without making unnecessary assumptions about the typing style or the devices (keyboard and mic) it is still possible that the algorithm performs well only in the limited set of environments that I have tested it with. I only have two mechanical keyboards and using them the results are pretty good, but this is certainly a small set of data points to be able to make a conclusion about the efficiency of the approach in general.”
“I am trying to gather more data in order to better understand the limitations of the approach. Anyone willing to help with that can use the “Save” button on the page after recording a sample and send it to me so I can analyze it. I don’t have access to the recordings that people are doing because the test runs completely in the client browser and in order to keep their privacy none of the data is stored or uploaded back to me. That is why I have to rely on people to manually save the record and send it to me via email for example.”
“Another thing you can try is to wait for the analysis to [run] for a little longer. In some cases, it takes a bit more iterations for the algorithm to start showing meaningful results. Also, you can try recording more text – for example, 200 characters. Use the slider before pressing the Init button.”
“The main factor that determines if Keytap will succeed or not, is how accurately it can match the key sounds to one another (i.e. determine if two separate sounds are produced by the same key). Currently, Keytap uses a time-domain cross-correlation metric to match the keys with one another and it is definitely not perfect. It was actually a surprise to me that it performs that well. If this part of the algorithm is improved somehow, I believe it can become highly efficient. It’s interesting to investigate what are the physical properties of mechanical keyboards that make the sounds so easy to differentiate – maybe this will provide insight into which audio similarity metrics are suitable for this purpose. Probably some frequency-domain metrics could provide better results – I have done a few experiments, but without much success.”
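For the hobbyist hackers among you, here is roughly what a time-domain cross-correlation comparison looks like. This is my own minimal sketch and not Keytap3’s actual code: the function names, the `max_lag` parameter and the synthetic ‘clicks’ (decaying sine waves standing in for real key sounds) are all assumptions made purely for illustration.

```python
import math

def normalized_xcorr_peak(a, b, max_lag=8):
    """Highest normalized cross-correlation of two clips over small time shifts."""
    def center(x):
        mean = sum(x) / len(x)
        return [v - mean for v in x]

    a, b = center(a), center(b)
    norm = math.sqrt(sum(v * v for v in a) * sum(v * v for v in b))
    if norm == 0:
        return 0.0  # at least one clip is silent
    best = -1.0
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for i, v in enumerate(a):
            j = i + lag
            if 0 <= j < len(b):
                total += v * b[j]
        best = max(best, total / norm)
    return best

def fake_click(freq, delay=0, n=200):
    """Hypothetical stand-in for a key sound: a decaying sine wave."""
    body = [math.exp(-0.03 * t) * math.sin(2 * math.pi * freq * t)
            for t in range(n - delay)]
    return [0.0] * delay + body

# Same "key" recorded twice (the second slightly late), then a different "key".
same_key = normalized_xcorr_peak(fake_click(0.05), fake_click(0.05, delay=3))
other_key = normalized_xcorr_peak(fake_click(0.05), fake_click(0.13))
print(f"same key: {same_key:.2f}, different key: {other_key:.2f}")
```

A score close to 1 suggests that two sounds came from the same key, while a score near zero suggests that they did not. Real recordings are far messier than these tidy sine waves, of course, which is presumably where the ‘definitely not perfect’ part comes in.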
“The typing speed factor on the other hand I think is not very important. In the Keytap3 test, I have suggested a certain speed limit (250 CPM) simply because it makes automation much easier. However, a bad actor could easily analyze the recording and manually select the key sounds with high accuracy, instead of relying on automation to do that. A note about the Keytap3 test is that I wanted to make a simple page that is fully automated so that it is very easy for anyone to give it a try using a phone/tablet and a keyboard. An attacker, on the other hand, could create a more sophisticated exploit that allows them to control the different free parameters of the algorithm and potentially be able to recover more information by fine-tuning the algorithm. Once the keys are matched with one another, the problem becomes equivalent to reversing a Substitution Cipher. This problem is well understood and there are different algorithms to solve it. In Keytap3, I use a Beam Search algorithm which performs very well.”
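To make that last step a little more concrete, here is a toy sketch of attacking a substitution cipher with a beam search. Again, this is not Gerganov’s implementation. I am assuming a simple add-one smoothed bigram model built from a tiny reference text, a beam width of 50, and that there are no more sound clusters than letters in the alphabet; every name in it is hypothetical.

```python
import math
from collections import Counter

def bigram_logprobs(corpus, alphabet):
    """Add-one smoothed log P(next letter | previous letter) from a reference text."""
    pairs = Counter(zip(corpus, corpus[1:]))
    firsts = Counter(corpus[:-1])
    return {
        (p, q): math.log((pairs[(p, q)] + 1) / (firsts[p] + len(alphabet)))
        for p in alphabet for q in alphabet
    }

def score(clusters, mapping, logp):
    """Sum bigram log-probabilities over adjacent pairs that are both decoded."""
    total = 0.0
    for x, y in zip(clusters, clusters[1:]):
        if x in mapping and y in mapping:
            total += logp[(mapping[x], mapping[y])]
    return total

def beam_search_decode(clusters, alphabet, logp, beam_width=50):
    """Give each cluster ID a distinct letter, keeping only the best partial guesses."""
    # Decide the most frequent clusters first, as they affect the score the most.
    order = [c for c, _ in Counter(clusters).most_common()]
    beams = [{}]
    for cluster in order:
        candidates = []
        for mapping in beams:
            used = set(mapping.values())
            for letter in alphabet:
                if letter not in used:
                    candidates.append({**mapping, cluster: letter})
        candidates.sort(key=lambda m: score(clusters, m, logp), reverse=True)
        beams = candidates[:beam_width]
    best = beams[0]
    return "".join(best[c] for c in clusters), best

# Toy data: in a real attack the cluster IDs would come from matching key sounds.
reference = "the cat sat on the mat and the hat sat on the cat"
alphabet = sorted(set(reference))
logp = bigram_logprobs(reference, alphabet)

secret = "the hat sat on the mat"
ids = {}
clusters = [ids.setdefault(ch, len(ids)) for ch in secret]

guess, mapping = beam_search_decode(clusters, alphabet, logp)
print(guess)
```

The search only ever sees the cluster numbers, never the letters themselves, which mirrors the position an eavesdropper would be in. With so little reference text the guess may well be wrong; a serious attempt would lean on a far larger language model, and I would expect the quality of the sound matching to matter at least as much.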