Human Molecular Genetics Advance Access originally published online on September 25, 2009
Human Molecular Genetics 2009 18(24):4853-4867; doi:10.1093/hmg/ddp457
Detecting natural selection by empirical comparison to random regions of the genome
1 Department of Genetics, Harvard Medical School, Boston, MA, USA, 2 Broad Institute of MIT and Harvard, Cambridge, MA, USA, 3 Division of Neurogenetics and Howard Hughes Medical Institute, Beth Israel Deaconess Medical Center, Boston, MA, USA, 4 Division of Genetics, Children's Hospital, Boston, MA, USA, 5 Department of Molecular and Human Genetics, Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA, 6 Department of Biology, Rensselaer Polytechnic Institute, Center for Biotechnology and Interdisciplinary Studies, Troy, NY, USA, and 7 Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY, USA
* To whom correspondence should be addressed at: Department of Molecular and Human Genetics, Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, N1629, Houston, TX 77030, USA. Tel: +1 7137987676; Fax: +1 7137985741; Email: fyu{at}bcm.edu
Received August 22, 2009; Accepted September 23, 2009
Historical episodes of natural selection can skew the frequencies of genetic variants, leaving a signature that can persist for many tens or even hundreds of thousands of years. However, formal tests for selection based on allele frequency skew require strong assumptions about demographic history and mutation, which are rarely well understood. Here, we develop an empirical approach to test for signals of selection that compares patterns of genetic variation at a candidate locus with matched random regions of the genome collected in the same way. We apply this approach to four genes that have been implicated in syndromes of impaired neurological development, comparing the pattern of variation in our re-sequencing data with a large-scale, genomic data set that provides an empirical null distribution. We confirm a previously reported signal at FOXP2, and find a novel signal of selection centered at AHI1, a gene that is involved in motor and behavioral abnormalities. The locus is marked by many high-frequency derived alleles in non-Africans that are of low frequency in Africans, suggesting that selection at this or a closely neighboring gene occurred in the ancestral population of non-Africans. Our study also provides a prototype for how empirical scans for ancient selection can be carried out once many genomes are sequenced.
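
The core idea of comparing a candidate locus with an empirical null distribution can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the choice of summary statistic (here, a hypothetical count of high-frequency derived alleles per region), the toy numbers, and the add-one correction are all assumptions made for illustration. The sketch simply asks what fraction of matched random regions show a statistic at least as extreme as the candidate.

```python
from typing import Sequence


def empirical_p_value(candidate_stat: float, null_stats: Sequence[float]) -> float:
    """One-sided empirical p-value of a candidate locus against random regions.

    candidate_stat: summary statistic at the candidate locus (assumed here to be
        a count of high-frequency derived alleles; any statistic where larger
        values are more extreme would work the same way).
    null_stats: the same statistic computed in matched random genomic regions.
    """
    if len(null_stats) == 0:
        raise ValueError("need at least one random region for the null distribution")
    # Count random regions at least as extreme as the candidate.
    n_extreme = sum(1 for s in null_stats if s >= candidate_stat)
    # Add-one correction (an assumed convention) so the estimate is never exactly 0.
    return (n_extreme + 1) / (len(null_stats) + 1)


# Toy, made-up values for illustration only; not data from the study.
random_regions = [0, 1, 1, 2, 2, 3, 3, 4, 5, 9]
p = empirical_p_value(8, random_regions)
print(f"empirical p-value: {p:.3f}")
```

Because the null distribution is built from regions ascertained and sequenced in the same way as the candidate, a comparison of this kind does not require an explicit model of demographic history or mutation; its resolution is limited by the number of random regions available.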