Optimizing Sparse Graph and Tensor Algorithms
- Author: Lonkar, Amogh
- Advisor(s): Beamer, Scott
Abstract
Most big data processing tasks today rely heavily on graph and tensor algorithms to uncover useful information within real-world data. Graph algorithms model relationships between entities, such as the connections in a social network, while tensor algorithms are essential for processing multi-dimensional data, such as that encountered in machine learning, image recognition, and natural language processing.
Real-world data is sparse and irregular, making these workloads difficult for modern processors. These algorithms require a significant amount of communication between compute and memory, and sometimes between compute nodes for very large problem sizes. In this dissertation, we analyze graph and tensor workloads to understand their data access patterns, and we present optimizations that improve the efficiency of data communication within the system. We consider multiple problems: clique counting in graphs (computationally intensive), sparse tensor decomposition (effects of higher dimensionality), and distributed breadth-first search (synchronization and communication across a network).
We present ComSpark, a clique counting algorithm that scales linearly on CPUs and outperforms GPUs in some cases. To improve the performance of sparse tensor decomposition, we present two software optimizations that improve memory bandwidth utilization. Finally, we reduce the communication required in distributed breadth-first search by using asynchronous actor messages. Together, these techniques allow challenging sparse workloads to run faster and enable larger graphs and tensors to be processed on current hardware.