Asthma, like other common diseases, has both genetic and environmental causes. Understanding the heterogeneity in asthma, the genetics of associated traits, and how certain analytic methods introduce error is critical to determining the causes of asthma and other complex diseases.
We examined two ways to define asthma heterogeneity: statistical clustering and principal components analysis. We compared how well the resulting variables fit the data and how well they predicted asthma exacerbations in 1,085 Latino and African American children with asthma. Principal components both fit the data better and predicted exacerbations better than cluster groups. These variables still need to be compared with other known predictors of exacerbations.
In addition, we conducted a genome-wide association study and an admixture mapping study of bronchodilator response (BDR) in 1,782 Latino children with asthma. Four of the genome-wide significant SNPs were rare variants. All four showed dose-response relationships with BDR, and two were in promising candidate genes. Our admixture mapping found five regions where a specific ancestry was significantly associated with BDR. Since rare variants are often present on specific ancestral backgrounds, this result supports the hypothesis that rare variants are important for BDR. Unfortunately, replication of individual rare variants is difficult. Future efforts should focus on sequencing the regions we identified to find other rare variants and better understand their function.
Finally, we compared haplotype inference error among four populations from HapMap Phase 3. Haplotype inference error was highest in the African populations, intermediate in the Mexican population, and lowest in the European population. In addition, some regions had higher haplotype inference error than others, and this was not explained by several measured features of the regions. Comparisons of haplotype association studies across populations should account for possible differences in haplotype inference error between populations.