Physical Design Methods and Research Infrastructure for Machine Learning Accelerators
- Wang, Zhiang
- Advisor(s): Kahng, Andrew B
Abstract
As we move into the era of artificial intelligence (AI), the rapid development of AI technologies is intensifying the demand for high-performance, energy-efficient machine learning (ML) accelerators. However, the design and development of such accelerators present numerous challenges for integrated-circuit physical design, including (i) very large numbers of standard cells and extreme memory dominance, (ii) novel dataflow and datapath structures, and (iii) the demand for fast backend prototyping tools. Achieving high-quality machine learning accelerators necessitates improved physical design optimization methodologies. Toward this goal, this thesis presents physical design methods and open-source research infrastructure that address the challenges arising from modern machine learning accelerator designs in advanced technology nodes.
To address scalability challenges, this thesis develops methods that break down the original problem into smaller, more manageable subproblems, thereby reducing complexity. New partitioning methods are proposed to efficiently partition large-scale designs into multiple blocks, tiles, or devices that can be better handled by place-and-route (P&R) tools. Furthermore, a novel multilevel macro placement approach is presented to cope with the extreme memory dominance seen in modern machine learning accelerators.
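As a rough illustration of the partitioning objective mentioned above, the sketch below assumes a netlist modeled as a hypergraph (each net is a list of cell names) and scores a candidate assignment of cells to blocks by the number of nets that cross block boundaries. The function names `cut_size` and `block_sizes` and the toy netlist are hypothetical and are not taken from the thesis; this shows only the generic cut-size metric, not the partitioning methods the thesis proposes.

```python
from typing import Dict, List


def cut_size(nets: List[List[str]], block_of: Dict[str, int]) -> int:
    """Count the nets whose cells are spread over more than one block."""
    return sum(1 for net in nets if len({block_of[cell] for cell in net}) > 1)


def block_sizes(block_of: Dict[str, int], num_blocks: int) -> List[int]:
    """Return the number of cells assigned to each block (a balance check)."""
    sizes = [0] * num_blocks
    for block in block_of.values():
        sizes[block] += 1
    return sizes


# Toy netlist (hypothetical): six cells connected by five nets.
nets = [["a", "b"], ["b", "c", "d"], ["d", "e"], ["e", "f"], ["a", "f"]]

# Two equally balanced two-block assignments of the same cells.
clustered = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
interleaved = {"a": 0, "b": 1, "c": 0, "d": 1, "e": 0, "f": 1}

print(block_sizes(clustered, 2), cut_size(nets, clustered))
print(block_sizes(interleaved, 2), cut_size(nets, interleaved))
```

Both assignments place three cells in each block, but the clustered one cuts fewer nets, so fewer connections would have to cross between the resulting subproblems.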
In response to the novel dataflow and datapath structures in ML accelerators, this thesis introduces the concept of dataflow-driven placement. The proposed dataflow-driven macro placer and dataflow-driven global placer enhance existing place-and-route algorithms to effectively accommodate these novel structures, thereby helping to realize the performance and power benefits offered by new computer architectures.
Lastly, to address the need for fast placement prototyping tools, this thesis develops a GPU-accelerated global placer. A roadmap of future research directions in these areas is also provided.