As the new ``race to the moon'', quantum computing could trigger a revolution in computation because of its strong potential in several important domains, e.g., cryptography, chemistry simulation, optimization, and machine learning. However, grand challenges remain in this emerging research area, since state-of-the-art quantum computing, from software to hardware, is still highly immature. This dissertation explores high-performance, efficient, and reliable quantum computing systems, and seeks synergy across the layers of the technology stack, including applications, programming languages, compiler optimization, hardware architecture design, and simulation. In particular, this dissertation focuses on two directions: 1) cross-layer co-design for quantum computing systems; and 2) enabling deep quantum software/compiler optimizations at a high level of abstraction. In the first direction, this dissertation studies how to efficiently map quantum software to hardware via carefully designed compiler optimizations, and then investigates application-specific architecture design that substantially improves hardware efficiency. Following the application-specific principle and combining algorithm optimization with hardware design, this dissertation proposes a software-hardware co-optimization for chemistry simulation that achieves benefits across multiple layers of the system stack. In the second direction, this dissertation explores how to leverage algorithmic information, which is usually carried by new high-level programming languages, to design quantum software optimizations that are hard to implement in conventional quantum software infrastructures.
These optimizations include a Pauli-string-based intermediate representation for large-scope compiler optimization on quantum simulation programs, a projection-operator-based runtime assertion language for efficient quantum program testing and debugging, and a trial scheduling technique to identify and eliminate redundant computation in noisy quantum computing simulation.