An essential step in designing a new computer architecture is the careful examination of different design options. It is critical that computer architects have efficient means of estimating the impact of various design options on the overall machine. This task is complicated by the fact that different programs, and even different parts of the same program, may have distinct behaviors that interact with the hardware in different ways. Researchers estimate processor performance with very detailed simulators that model every cycle of an executing program. Unfortunately, simulating every cycle of a single benchmark program takes on the order of months. To address this problem, we developed analysis techniques for characterizing time-varying program behavior. By using data clustering algorithms from machine learning to automatically find repetitive patterns in a program's execution, we can avoid simulating the same behavior many times. By simulating one representative of each repetitive behavior pattern, simulation time can be reduced from months to hours for standard benchmark programs, with very little cost in accuracy.

This dissertation describes this important problem and the tool we created, called SimPoint, to automatically find simulation points in programs. Additionally, we describe data-mining and statistical advances in phase analysis that optimize both the runtime and accuracy of SimPoint, as well as the overall simulation time. We present an approach that finds a single set of simulation points to be used across all binaries for a single program. This allows the same parts of program execution to be simulated despite changes in the binary due to ISA changes or compiler optimizations. Finally, we present a method for characterizing the behavior of parallel applications and use it to pick simulation points that guide multi-threaded simulations.
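The idea of simulating one representative per repetitive behavior pattern can be illustrated with a minimal sketch. It is not the SimPoint implementation. The abstract does not name the features or the clustering algorithm, so the sketch assumes that execution is divided into fixed-length intervals, that each interval is summarized by a numeric feature vector, and that a plain k-means with a deterministic farthest-point initialization is an acceptable stand-in for the clustering step. All function names (`kmeans`, `select_simulation_points`, `estimate_metric`) are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_vector(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def nearest(v, centers):
    """Index of the center closest to v (first one wins on ties)."""
    return min(range(len(centers)), key=lambda c: distance(v, centers[c]))


def kmeans(vectors, k, max_iterations=50):
    """Plain k-means; an assumed stand-in for the clustering step."""
    # Deterministic farthest-point initialization (an assumption of this sketch).
    centers = [list(vectors[0])]
    while len(centers) < k:
        far = max(vectors, key=lambda v: min(distance(v, c) for c in centers))
        centers.append(list(far))

    labels = None
    for _ in range(max_iterations):
        new_labels = [nearest(v, centers) for v in vectors]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        for c in range(k):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:
                centers[c] = mean_vector(members)
    return labels, centers


def select_simulation_points(vectors, k):
    """Return (interval_index, weight) pairs, one per non-empty cluster.

    The representative is the interval closest to its cluster center;
    the weight is the fraction of all intervals in that cluster.
    """
    labels, centers = kmeans(vectors, k)
    points = []
    for c in range(k):
        members = [i for i, l in enumerate(labels) if l == c]
        if not members:
            continue
        rep = min(members, key=lambda i: distance(vectors[i], centers[c]))
        points.append((rep, len(members) / len(vectors)))
    return points


def estimate_metric(metric_per_interval, points):
    """Weighted estimate of a whole-program metric from the chosen intervals."""
    return sum(weight * metric_per_interval[i] for i, weight in points)
```

In this sketch only the intervals returned by `select_simulation_points` would need detailed simulation. Weighting each representative's result by its cluster's share of execution approximates the whole-program figure.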