Human Molecular Genetics Advance Access published online on June 24, 2009
Human Molecular Genetics, doi:10.1093/hmg/ddp295
Harnessing the Information Contained Within Genome-wide Association Studies to Improve Individual Prediction of Complex Disease Risk
1 MRC Centre for Causal Analyses in Translational Epidemiology, Department of Social Medicine, University of Bristol, United Kingdom
2 Genetic Epidemiology and Queensland Statistical Genetics, Queensland Institute of Medical Research, Australia
* Address Correspondence to: David M. Evans. MRC Centre for Causal Analyses in Translational Epidemiology, Department of Social Medicine, University of Bristol, Oakfield House, Oakfield Grove, Bristol, BS8 2BN, United Kingdom. Tel: +44 (0)117 3310094, Fax: +44 (0)117 3310123. Email: dave.evans{at}bristol.ac.uk.
Received February 26, 2009; Revised June 15, 2009; Accepted June 22, 2009
The current paradigm within genetic diagnostics is to test individuals only at loci known to affect risk of complex disease, yet the technology exists to genotype an individual at thousands of loci across the genome. We investigated whether information from genome-wide association studies could be harnessed to improve discrimination of complex disease affection status. We employed genome-wide data from the Wellcome Trust Case Control Consortium to test this hypothesis. Each disease cohort, together with the same set of controls, was split into two samples: a "Training Set", where thousands of SNPs that might predispose to disease risk were identified, and a "Prediction Set", where the discriminatory ability of these SNPs was assessed. Genome-wide scores consisting of, for example, the total number of risk alleles an individual carries were calculated for each individual in the prediction set. Case-control status was regressed on this score and the area under the receiver operating characteristic curve (AUC) was estimated. In most cases, a liberal inclusion of SNPs in the genome-wide score improved AUC compared to a more stringent selection of top SNPs, but did not perform as well as selection based upon established variants. The addition of genome-wide scores to known variant information produced only a limited increase in discriminative accuracy but was most effective for bipolar disorder, coronary heart disease and type II diabetes. We conclude that this small increase in discriminative accuracy is unlikely to be of diagnostic or predictive utility at the present time.
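The training/prediction design described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the genotypes are simulated, SNP selection uses a simple case-control allele-frequency difference in place of a formal association test, and the names `simulate`, `select_snps`, `genome_wide_score` and `auc`, together with every threshold and effect size, are hypothetical choices made for illustration. The AUC is computed directly from the ranks of the score; for a logistic regression with the score as its only predictor and a positive coefficient, the fitted probabilities are a monotone transformation of the score, so the AUC is the same.

```python
import random


def simulate(n, n_snps, n_causal, rng):
    """Simulate genotypes (0/1/2 copies of a counted allele) and case status.

    Assumption for illustration only: the first `n_causal` SNPs have a
    slightly higher allele frequency in cases than in controls.
    """
    genotypes, labels = [], []
    for i in range(n):
        is_case = i % 2
        row = []
        for j in range(n_snps):
            freq = 0.3 + (0.05 if is_case and j < n_causal else 0.0)
            # Two independent allele draws give a genotype of 0, 1 or 2.
            row.append((rng.random() < freq) + (rng.random() < freq))
        genotypes.append(row)
        labels.append(is_case)
    return genotypes, labels


def select_snps(train_geno, train_labels, min_freq_diff):
    """Training step: keep SNPs whose allele frequency differs by status.

    Returns (snp_index, sign) pairs; sign is +1 when the counted allele is
    more frequent in cases, -1 when the other allele is the risk allele.
    """
    cases = [g for g, y in zip(train_geno, train_labels) if y == 1]
    controls = [g for g, y in zip(train_geno, train_labels) if y == 0]
    selected = []
    for j in range(len(train_geno[0])):
        f_case = sum(g[j] for g in cases) / (2 * len(cases))
        f_ctrl = sum(g[j] for g in controls) / (2 * len(controls))
        diff = f_case - f_ctrl
        if abs(diff) >= min_freq_diff:
            selected.append((j, 1 if diff > 0 else -1))
    return selected


def genome_wide_score(genotype, selected):
    """Total number of risk alleles carried at the selected SNPs."""
    return sum(genotype[j] if sign > 0 else 2 - genotype[j]
               for j, sign in selected)


def auc(scores, labels):
    """Probability that a random case outscores a random control (ties 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


if __name__ == "__main__":
    rng = random.Random(0)
    train_geno, train_labels = simulate(400, 200, 10, rng)
    pred_geno, pred_labels = simulate(400, 200, 10, rng)

    # A low threshold mimics liberal inclusion; a high one, stringent selection.
    for threshold in (0.02, 0.06):
        selected = select_snps(train_geno, train_labels, threshold)
        scores = [genome_wide_score(g, selected) for g in pred_geno]
        print(threshold, len(selected), round(auc(scores, pred_labels), 3))
```

SNPs are chosen only in the training sample and scored only in the prediction sample, which keeps the AUC from being inflated by selecting and evaluating on the same individuals. Varying `min_freq_diff` gives a rough feel for the liberal-versus-stringent comparison in the abstract, though results on simulated data say nothing about the real disease cohorts.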