||Machiela MJ, Chen C, Liang L, Diver WR, Stevens VL, Tsilidis KK, Haiman CA, Chanock SJ, Hunter DJ, Kraft P, National Cancer Institute Breast and Prostate Cancer Cohort Consortium, Andriole GL, Berndt SI, Henderson B, Johansson M, Marchand LL, Lindstrom S, Quiros J, Gapstur SM, Gaziano J, Giovannucci E, Panico S, Schumacher F, Stampfer MJ, Tjonneland A, Travis R, Trichopoulos D, Virtamo J, Willett WC, Yeager M, Bueno-de-Mesquita HB
||BACKGROUND: Genotype imputation substantially increases the number of markers available for analysis in genome-wide association studies (GWAS) by leveraging linkage disequilibrium from a reference panel. We sought to (i) investigate the performance of imputation from the August 2010 release of the 1000 Genomes Project (1000GP) in an existing GWAS of prostate cancer, (ii) look for novel associations with prostate cancer risk, (iii) fine-map known prostate cancer susceptibility regions using an approximate Bayesian framework and stepwise regression, and (iv) compare the power and efficiency of imputation and de novo sequencing. METHODS: We used 2,782 aggressive prostate cancer cases and 4,458 controls from the NCI Breast and Prostate Cancer Cohort Consortium aggressive prostate cancer GWAS to infer 5.8 million well-imputed autosomal single nucleotide polymorphisms (SNPs). RESULTS: Imputation quality, as measured by correlation between imputed and true allele counts, was higher among common variants than rare variants. We found no novel prostate cancer associations among a subset of 1.2 million well-imputed low-frequency variants. At a genome-wide sequencing cost of $2,500, imputation from SNP arrays is a more powerful strategy than sequencing for detecting disease associations of SNPs with minor allele frequencies (MAF) above 1%. CONCLUSIONS: 1000GP imputation provided dense coverage of previously identified prostate cancer susceptibility regions, highlighting its potential as an inexpensive first-pass approach to fine mapping in regions such as 5p15 and 8q24. Our study shows that 1000GP imputation can accurately identify low-frequency variants and stresses the importance of large sample sizes when studying these variants.
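The quality measure named in RESULTS, the correlation between imputed and true allele counts, can be illustrated with a minimal sketch. This is not the authors' code: the function name `imputation_correlation` and the toy genotype values are hypothetical, and the sketch assumes the measure is a plain Pearson correlation computed per SNP across samples, with imputed genotypes expressed as expected allele counts (dosages between 0 and 2).

```python
from math import sqrt, isnan


def imputation_correlation(imputed_dosages, true_counts):
    """Pearson correlation between imputed dosages and true allele counts.

    Hypothetical helper for illustration only. Both inputs hold one value
    per sample for a single SNP: dosages are expected allele counts in
    [0, 2], true counts are the genotyped values 0, 1, or 2.
    """
    n = len(imputed_dosages)
    if n != len(true_counts) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")

    mean_x = sum(imputed_dosages) / n
    mean_y = sum(true_counts) / n

    # Sums of squares and cross-products around the means.
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(imputed_dosages, true_counts))
    sxx = sum((x - mean_x) ** 2 for x in imputed_dosages)
    syy = sum((y - mean_y) ** 2 for y in true_counts)

    # The correlation is undefined when either vector has no variance,
    # for example a SNP that is monomorphic in the sample.
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


# Toy values, made up for illustration (not data from the study).
true_counts = [0, 1, 2, 1, 0, 0]
imputed = [0.1, 0.9, 1.8, 1.2, 0.0, 0.3]
r = imputation_correlation(imputed, true_counts)
```

Under these assumptions, a rare variant has few non-zero allele counts in the sample, so a handful of imputation errors can lower the correlation sharply, which is consistent with the lower quality the abstract reports for rare variants.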