Abstract

Background
Patient-specific aberrant expression patterns, in conjunction with functional screening assays, can guide elucidation of the cancer genome architecture and identification of therapeutic targets. Since most statistical methods for expression analysis focus on differences between experimental groups, the performance of approaches for patient-specific expression analysis is currently less well characterized. A comparison of methods for the identification of genes that are dysregulated relative to a single sample in a given set of experimental samples has, to our knowledge, not been performed.

Methods
We systematically evaluated several methods, including variations on the nearest-neighbor-based outlying degree method as well as the Zscore and a robust variant (the Rscore), for their suitability to detect patient-specific events. The methods were assessed using both simulations and expression data from a cohort of pediatric acute B lymphoblastic leukemia patients.

Results
We first assessed power and false discovery rates using simulations and found that, even under optimal conditions, high effect sizes (>4 unit differences) were necessary to achieve acceptable power (>0.9) for any method, though high false discovery rates (>0.1) were pervasive across simulation conditions. Next, we introduced a technical factor into the simulation and found that performance was reduced for all methods and that using weights with the outlying degree could provide performance gains, depending on the number of samples and genes affected by the technical factor. In our use case, which highlights the integration of functional assays and aberrant expression in a patient cohort (the identification of gene dysregulation events associated with the targets from an siRNA screen), we demonstrated that both the outlying degree and the Zscore can successfully identify genes dysregulated in one patient sample. However, only the outlying degree can identify genes dysregulated across several patient samples.

Conclusion
Our results show that outlying degree methods may be a useful alternative to the Zscore or Rscore in a personalized medicine context, especially in small to medium-sized expression datasets (between 10 and 50 samples) with moderate to high sample-to-sample variability. Based on these results, we provide guidelines for the detection of aberrant expression in a precision medicine context.
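
To make the three families of per-sample scores named above concrete, the following is a minimal sketch for a single gene measured across a set of samples. The abstract does not give the exact definitions used in the study, so everything here is an assumption for illustration: the outlying degree is taken in one common formulation (the sum of distances to the k nearest neighbors), the robust score is assumed to be a median/MAD analogue of the Zscore, and the function names, the value of `k`, and the toy expression vector are all hypothetical.

```python
import numpy as np


def zscore(x):
    """Classical Zscore: deviation from the mean in sample standard deviations."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std(ddof=1)


def robust_zscore(x):
    """Assumed robust variant: median and scaled MAD replace mean and SD."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    # 1.4826 makes the MAD consistent with the SD under normality.
    mad = 1.4826 * np.median(np.abs(x - med))
    return (x - med) / mad


def outlying_degree(x, k=3):
    """Assumed outlying degree: sum of distances to the k nearest other samples."""
    x = np.asarray(x, dtype=float)
    dist = np.abs(x[:, None] - x[None, :])  # pairwise distances between samples
    dist_sorted = np.sort(dist, axis=1)     # column 0 is the zero self-distance
    return dist_sorted[:, 1:k + 1].sum(axis=1)


# Hypothetical log-scale expression of one gene in 8 samples;
# the last sample is shifted by roughly 4 units from the rest.
expr = np.array([5.0, 5.2, 4.9, 5.1, 5.0, 4.8, 5.3, 9.5])

z = zscore(expr)
rz = robust_zscore(expr)
od = outlying_degree(expr, k=3)

for name, score in [("Zscore", z), ("robust Zscore", rz), ("outlying degree", od)]:
    print(f"{name:>16}: most extreme sample index = {int(np.argmax(np.abs(score)))}")
```

In this sketch the mean and standard deviation used by the Zscore are themselves influenced by the aberrant sample, whereas the nearest-neighbor score depends only on local distances. This is one plausible reason the approaches could behave differently when several samples share the same dysregulation, although the sketch does not reproduce the study's simulations or weighting scheme.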