The adoption of virtual reality (VR) technologies has rapidly gained momentum in recent years as companies around the world begin to position the "metaverse" as the next major medium of human-computer interaction. The latest generation of VR devices, including the Apple Vision Pro and Meta Quest 3, blur the lines between virtual and augmented reality (AR), resulting in extended reality (XR) systems that are expected to be more deeply and seamlessly integrated with our daily lives than ever before. As companies with a clouded reputation for respecting user privacy become increasingly involved in XR development, the attention of researchers and the general public is rightly shifting toward the unique security and privacy threats that these platforms may pose.
Motion-tracking ("telemetry") data lies at the core of nearly all modern XR and metaverse experiences. While it has long been known that people reveal information about themselves through their motion, the extent to which these findings apply to XR platforms has, until recently, not been widely understood, and most users perceive motion to be amongst the more innocuous categories of data in XR. Contrary to these perceptions, this dissertation explores the unprecedented risks and opportunities of XR motion data. We present both a series of attacks that illustrate the severity of the XR privacy threat and a set of defensive countermeasures that protect user privacy in XR while maintaining a positive user experience.
We first present a detailed systematization of the landscape of VR privacy attacks and defenses, proposing a comprehensive taxonomy of data attributes, information flows, adversaries, and countermeasures based on an analysis of over 60 prior studies. We then identify and describe a novel dataset of over 4.7 million motion capture recordings, voluntarily submitted by more than 105,000 XR device users from over 50 countries. In addition to being over 200 times larger than the largest prior motion capture research dataset, this dataset enables several of the major contributions of this dissertation.
First, using our new dataset, we show that a large number of real VR users (N=55,541) can be uniquely and reliably identified across multiple sessions using just their head and hand motion relative to virtual objects. After training a classification model on 5 minutes of data per person, a user can be uniquely identified amongst the entire pool of 55,541 with 94.33% accuracy from 100 seconds of motion, and with 73.20% accuracy from just 10 seconds of motion. We then go a step further, showing that a variety of private user information can be inferred solely by analyzing motion data recorded from VR devices. After conducting a large-scale survey of VR users (N=1,006) with dozens of questions ranging from background and demographics to behavioral patterns and health information, we demonstrate that simple machine learning models can accurately and consistently infer over 40 personal attributes from VR motion data alone. In a third study, we show that adversarially designed VR games can harvest an even wider range of attributes than passive observation alone can reveal. After inviting 50 study participants to play an innocent-looking "escape room" game in VR, we show that an adversarial program could accurately infer over 25 of their data attributes, from anthropometrics like height and wingspan to demographics like age and gender.
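To make the identification pipeline concrete, the following minimal sketch illustrates one plausible approach: summarizing fixed-length windows of head and hand telemetry into per-channel statistics and training an off-the-shelf classifier on them. The window size, feature set, simulated data, and choice of classifier are illustrative assumptions, not the exact architecture evaluated in this dissertation.

    # Hypothetical sketch: identifying users from head/hand motion telemetry.
    # All dimensions and the classifier are illustrative choices.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    N_USERS, N_WINDOWS, FRAMES, CHANNELS = 20, 30, 90, 21  # 21 = 3 tracked devices x 7 DoF

    def featurize(window):
        """Summarize a (FRAMES, CHANNELS) telemetry window with per-channel statistics."""
        return np.concatenate([window.mean(0), window.std(0), window.min(0), window.max(0)])

    # Simulate per-user telemetry: each user has a persistent offset (standing in
    # for stable traits like height) plus per-frame noise; the stable component is
    # precisely what makes motion identifying across sessions.
    X, y = [], []
    for user in range(N_USERS):
        bias = rng.normal(0.0, 1.0, CHANNELS)  # stable anthropometric signature
        for _ in range(N_WINDOWS):
            window = bias + rng.normal(0.0, 0.3, (FRAMES, CHANNELS))
            X.append(featurize(window))
            y.append(user)
    X, y = np.array(X), np.array(y)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, stratify=y, random_state=0)
    clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
    print(f"identification accuracy: {clf.score(X_te, y_te):.2%}")

Even coarse statistics like these can suffice in principle, because stable anthropometric signals such as height and wingspan leak into every frame of telemetry.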
While users have, to some extent, grown accustomed to privacy attacks on the web, metaverse platforms carry many of the privacy risks of the conventional internet (and more) while at present offering few of the defensive tools that users have come to expect. To remedy this, we present the first known method of implementing an "incognito mode" for VR. Our technique leverages local ε-differential privacy to quantifiably obscure sensitive user data attributes, with a focus on intelligently adding noise when and where it is needed most so as to maximize privacy while minimizing the impact on usability. However, we then demonstrate a state-of-the-art VR identification model architecture that can convincingly bypass this anonymization technique when trained on a sufficiently large dataset. Therefore, we ultimately propose a "deep motion masking" approach that scalably and effectively facilitates the real-time anonymization of VR telemetry data. Through a large-scale user study (N=182), we demonstrate that our method achieves both cross-session unlinkability and indistinguishability of anonymized motion data.
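As a minimal illustration of the differential privacy machinery underlying such an "incognito mode," the sketch below applies the standard Laplace mechanism to a single bounded attribute (standing height). The bounds, ε value, and attribute are illustrative assumptions; the actual defense adds noise adaptively across many attributes rather than with a single fixed scale.

    # Hypothetical sketch of the Laplace mechanism for local epsilon-differential
    # privacy over one bounded telemetry attribute. Bounds, epsilon, and the
    # attribute (height in meters) are illustrative assumptions.
    import numpy as np

    def laplace_release(value, lower, upper, epsilon, rng=None):
        """Release one bounded attribute under local epsilon-differential privacy.

        Clamping to [lower, upper] bounds the sensitivity at (upper - lower);
        Laplace noise with scale sensitivity/epsilon then provides epsilon-DP.
        """
        rng = rng or np.random.default_rng()
        sensitivity = upper - lower
        clamped = min(max(value, lower), upper)
        noised = clamped + rng.laplace(0.0, sensitivity / epsilon)
        return min(max(noised, lower), upper)  # re-clamp so the output stays plausible

    # Example: privatize a user's height before it leaves the device.
    print(laplace_release(1.74, lower=1.50, upper=2.00, epsilon=1.0))

Because the final re-clamping is mere post-processing of an ε-differentially private release, it preserves the formal guarantee while keeping outputs physically plausible.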
This dissertation offers a comprehensive tour of the unique privacy risks presented by XR technologies. In doing so, it aims not to discourage the use of XR devices, but rather to provide users with an enhanced understanding of the associated hazards and to arm them with the tools necessary to mitigate those risks.