Efficient and Secure Learning across Memory Hierarchy
- Gupta, Saransh
- Advisor(s): Rosing, Tajana S
Abstract
Recent years have witnessed rapid growth in the amount of generated data. Learning algorithms such as hyperdimensional (HD) computing promise to reduce the computational complexity of processing such large amounts of data. However, traditional computing systems are highly inefficient for such algorithms, mainly due to limited cache capacity and memory bandwidth. Processing in-memory (PIM) is an emerging paradigm that aims to address these issues by using memories as computing units. In this dissertation, we propose a PIM-based HD computing architecture that accelerates all phases of the HD computing pipeline, namely encoding, training, retraining, and inference. Our architecture is enabled by fast and energy-efficient in-memory logic operations, combined with a hardware-friendly distance metric. However, the improvements from PIM decrease as the size of the dataset grows beyond the memory capacity. Hence, we also design an in-storage computing (ISC) solution. Our ISC design includes on-flash-chip acceleration of HD encoding, which encodes multiple data points in parallel across different flash chips, exploiting the high parallelism provided by the flash hierarchy. This is supported by a controller-level accelerator that performs HD training, retraining, inference, and clustering. Our proposed PIM and ISC solutions provide 434x and 222x speedups, respectively, compared to state-of-the-art HD computing implementations on CPU.
Many applications, most notably in healthcare, finance, and defense, rely on cloud computing for learning tasks and demand privacy, which today's solutions cannot fully provide. Fully homomorphic encryption (FHE) raises the bar of today's solutions by adding confidentiality of data during processing, but introduces noticeable data size expansion: the ciphertext is 5000x bigger than the aggregate of native data types. In this dissertation, we present the design of the first PIM-based accelerator for both the client and the server using the latest Ring-GSW-based homomorphic encryption schemes. Our design supports various security levels and provides, on average, 2007x higher throughput than CPU while running FHE-enabled neural networks. This improvement comes from a significant reduction in total data transfers and from the high number of processing in-memory cores, which enables higher parallelism and deeper pipelining in our design.
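For readers unfamiliar with the HD computing pipeline named in the abstract (encoding, training, retraining, inference), the following is a minimal software sketch of those four phases. It is not the dissertation's architecture: the dimensionality `D`, the record-based encoding with random ID and level hypervectors, the choice of Hamming distance as the "hardware-friendly" metric, and every function name are assumptions made for illustration only.

```python
import random

D = 2048  # hypervector dimensionality (assumed value, not from the source)


def random_hv(rng):
    """Draw a random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(D)]


def binarize(acc):
    """Map integer accumulators to a bipolar hypervector (ties go to +1)."""
    return [1 if a >= 0 else -1 for a in acc]


def encode(features, id_hvs, level_hvs):
    """Encoding: bind each feature's ID hypervector with the hypervector of
    its quantized level (elementwise product), then bundle (sum and binarize).
    This record-based scheme is an assumption, not the author's encoder."""
    acc = [0] * D
    for i, level in enumerate(features):
        id_hv, lvl_hv = id_hvs[i], level_hvs[level]
        for d in range(D):
            acc[d] += id_hv[d] * lvl_hv[d]
    return binarize(acc)


def hamming(a, b):
    """Hamming distance: number of positions where two hypervectors differ."""
    return sum(x != y for x, y in zip(a, b))


def train(samples, labels, n_classes, id_hvs, level_hvs):
    """Training: accumulate the encoded samples of each class."""
    model = [[0] * D for _ in range(n_classes)]
    for x, y in zip(samples, labels):
        hv = encode(x, id_hvs, level_hvs)
        for d in range(D):
            model[y][d] += hv[d]
    return model


def predict(x, model, id_hvs, level_hvs):
    """Inference: return the class whose binarized hypervector is nearest."""
    query = encode(x, id_hvs, level_hvs)
    dists = [hamming(query, binarize(c)) for c in model]
    return dists.index(min(dists))


def retrain(samples, labels, model, id_hvs, level_hvs):
    """Retraining: for each misclassified sample, add it to the correct class
    and subtract it from the wrongly predicted one. Returns the error count."""
    errors = 0
    for x, y in zip(samples, labels):
        p = predict(x, model, id_hvs, level_hvs)
        if p != y:
            hv = encode(x, id_hvs, level_hvs)
            for d in range(D):
                model[y][d] += hv[d]
                model[p][d] -= hv[d]
            errors += 1
    return errors
```

In this sketch, each feature value is assumed to be already quantized to an integer level that indexes `level_hvs`. Every operation is an elementwise product, an addition, or a bit comparison, which is the kind of simple, highly parallel work that in-memory and in-storage logic is suited to.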