Access to user mobility and activity data (uMAD) is crucial for researchers and practitioners in many areas of technology and infrastructure planning. Such data reveals user behavior and trends at different spatio-temporal scales, which in turn provides invaluable information to guide the design, operation, and management of critical infrastructure, services, and applications. However, previous academic and industry efforts to collect user mobility and activity (uMA) information face important challenges, including the diversity of uMA data and concerns about privacy and data protection. Consequently, even when uMA data is collected successfully, it cannot be generalized and/or shared publicly. To address these challenges, there has been significant work on generating synthetic uMA datasets as well as on data anonymization. Prior work in these areas, however, targets specific applications and datasets, which makes it hard to generalize for use across different scenarios.
Our aim is to fill these gaps by providing a uMA ecosystem that supports classification, generation, evaluation, and analysis. As part of this goal, our pipeline, uMAD, aims to include the following features: Classification: enabling existing or new uMA data to be classified into our proposed taxonomy buckets; Generation: allowing users to capture patterns and generate realistic uMA datasets by leveraging well-known machine learning (ML) generative models such as Generative Adversarial Networks (GANs); Trace Analysis: helping users analyze and visualize patterns in existing and new uMA datasets; and Model Analysis: providing users with a broad understanding of the ML models' resource consumption and parameters. uMAD's open-source command line interface (CLI) is ultimately meant to generate realistic synthetic uMA datasets that mimic existing traces for a range of user-configurable parameters, and to provide users with existing datasets that can be selected based on their specific needs.