To be clear, the researchers did not gain access to Apple’s headset to see what its wearers were viewing. Instead, they worked out what people were typing by remotely analyzing the eye movements of a virtual avatar created by the Vision Pro. This avatar can be used in Zoom calls, Teams, Slack, Reddit, Tinder, Twitter, Skype, and FaceTime. The researchers alerted Apple to the vulnerability in April, and the company issued a patch to stop the potential for data to leak at the end of July. It is the first attack to exploit people’s “gaze” data in this way, the researchers say. The findings underline how people’s biometric data (information and measurements about a person’s body) can expose sensitive information and be used as part of the burgeoning surveillance industry.
The GAZEploit attack consists of two parts, says Zhan, one of the lead researchers. First, the researchers created a way to identify when someone wearing the Vision Pro is typing by analyzing the 3D avatar they are sharing. For this, they trained a recurrent neural network, a type of deep learning model, with recordings of 30 people’s avatars while they completed a variety of typing tasks. When someone is typing using the Vision Pro, their gaze fixates on the key they are likely to press, the researchers say, before quickly moving to the next key. “When we are typing our gaze will show some regular patterns,” Zhan says. Wang says these patterns are more common during typing than if someone is browsing a website or watching a video while wearing the headset. “During tasks like gaze typing, the frequency of your eye blinking decreases because you are more focused,” Wang says. In short: Looking at a QWERTY keyboard and moving between the letters is a pretty distinct behavior.
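To make the fixate-then-jump pattern concrete, here is a minimal sketch of how runs of steady gaze could be picked out of a stream of gaze points. It is not the researchers' method: they trained a recurrent neural network, whereas this uses a simple dispersion threshold as a stand-in. The function name `detect_fixations`, the normalized coordinates, and both threshold values are assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # assumed: normalized (x, y) gaze coordinates


def detect_fixations(
    gaze: List[Point],
    max_dispersion: float = 0.05,  # assumed threshold, not from the research
    min_samples: int = 5,          # assumed minimum run length
) -> List[Tuple[int, int, Point]]:
    """Return (start_index, end_index, centroid) for each steady-gaze run."""
    fixations = []
    start = 0
    n = len(gaze)
    while start < n:
        end = start
        min_x = max_x = gaze[start][0]
        min_y = max_y = gaze[start][1]
        # Grow the window while the bounding box of the samples stays small.
        while end + 1 < n:
            x, y = gaze[end + 1]
            new_min_x, new_max_x = min(min_x, x), max(max_x, x)
            new_min_y, new_max_y = min(min_y, y), max(max_y, y)
            spread = (new_max_x - new_min_x) + (new_max_y - new_min_y)
            if spread > max_dispersion:
                break
            min_x, max_x = new_min_x, new_max_x
            min_y, max_y = new_min_y, new_max_y
            end += 1
        length = end - start + 1
        if length >= min_samples:
            window = gaze[start:end + 1]
            centroid = (
                sum(p[0] for p in window) / length,
                sum(p[1] for p in window) / length,
            )
            fixations.append((start, end, centroid))
            start = end + 1  # continue after this fixation
        else:
            start += 1  # too short to count; slide forward one sample
    return fixations
```

A stream that dwells on one point, jumps, and dwells on another yields two fixations, one per dwell. A learned model such as the one described above would also have to tell typing apart from browsing or watching video, which this heuristic does not attempt.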
The second part of the research, Zhan explains, uses geometric calculations to work out where someone has positioned the keyboard and the size they’ve made it. “The only requirement is that as long as we get enough gaze information that can accurately recover the keyboard, then all following keystrokes can be detected.” Combining these two elements, they were able to predict the keys someone was likely to be typing. In a series of lab tests, the researchers had no knowledge of the victim’s typing habits or speed, or of where the keyboard was placed. Even so, within a maximum of five guesses they could predict the correct letters 92.1 percent of the time for messages, 77 percent of the time for passwords, 73 percent of the time for PINs, and 86.1 percent of the time for emails, URLs, and webpages. (On the first guess, the letters would be right between 35 and 59 percent of the time, depending on what kind of information they were trying to work out.) Duplicate letters and typos add extra challenges.
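The “maximum of five guesses” idea can be illustrated with a short sketch: once a keyboard's position and size are known, each gaze fixation can be ranked against the nearest key centers. This is an assumption-laden illustration, not the researchers' geometric method. The layout constants, the row stagger, and the names `key_centers` and `top_k_keys` are all hypothetical, and the keyboard origin and key size are simply taken as given rather than recovered from gaze data.

```python
import math
from typing import Dict, List, Tuple

# Assumed letter-only QWERTY layout; the stagger values are illustrative.
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
ROW_OFFSETS = [0.0, 0.25, 0.75]  # horizontal shift per row, in key widths


def key_centers(
    origin: Tuple[float, float] = (0.0, 0.0),
    key_size: float = 1.0,
) -> Dict[str, Tuple[float, float]]:
    """Map each letter to its center, given an assumed origin and key size."""
    ox, oy = origin
    centers = {}
    for r, (row, offset) in enumerate(zip(ROWS, ROW_OFFSETS)):
        for c, ch in enumerate(row):
            centers[ch] = (ox + (c + offset) * key_size, oy + r * key_size)
    return centers


def top_k_keys(
    fixation: Tuple[float, float],
    centers: Dict[str, Tuple[float, float]],
    k: int = 5,
) -> List[str]:
    """Return the k keys whose centers lie closest to the fixation point."""
    fx, fy = fixation
    ranked = sorted(
        centers,
        key=lambda ch: math.hypot(centers[ch][0] - fx, centers[ch][1] - fy),
    )
    return ranked[:k]
```

For example, `top_k_keys((4.4, 1.1), key_centers())` ranks `g` first, followed by its neighbors. Returning a ranked shortlist rather than a single key mirrors how the reported figures distinguish first-guess accuracy from accuracy within five guesses.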