Deep Learning (DL) has unlocked unstructured data for analytics. It has enabled new applications, insights, and value in various domains, including enterprises, domain sciences, and healthcare. However, DL workloads are highly resource-intensive and time-consuming, which hinders their adoption. Thus, optimizing them from a systems standpoint has attracted significant attention in recent years. In this dissertation, we fundamentally re-imagine DL workloads as data processing workloads and optimize them from a data management standpoint. Using a combination of abstractions already available in DL practice, new algorithms, system design, and theoretical and empirical analysis, we show how classical query optimization ideas such as rewrites, multi-query optimization, materialization optimization, incremental view maintenance, approximate query processing, and predicate push-down can be re-imagined in the context of DL workloads to optimize them. We show that our techniques can enable significant runtime and resource savings (over 10X in some cases) for a variety of popular and important end-to-end DL workloads. Our work fills a critical technical gap in DL systems architecture and opens up new connections between query optimization and DL systems.