Estimating Y chromosome specific microsatellite mutation frequencies using deep rooting pedigrees
Evelyne Heyer, Jack Puymirat1, Patrick Dieltjes2, Egbert Bakker2 and Peter de Knijff2,*
Laboratoire d'Anthropologie Biologique CNRS UMR152, Musee de L'Homme, 17 place du Trocadero, 75116 Paris, France, 1Unité de recherche en Génétique Humaine, CHU Laval, 2705 boulevard Laurier, Ste-Foy, Quebec, G1V 4G2, Canada and 2MGC-Department of Human Genetics, Leiden University, P.O. Box 9503, 2300 RA Leiden, The Netherlands
Received January 2, 1997; Revised and Accepted February 12, 1997
Recently, a set of highly polymorphic chromosome Y specific microsatellites became available for forensic, population genetic and evolutionary studies. However, the lack of a mutation frequency estimate for these loci prevents their reliable application. We therefore used seven chromosome Y tetranucleotide repeat loci to screen 42 males who descend from 12 'founding fathers' through a total of 213 generations. As a result, we were able to estimate an average chromosome Y tetranucleotide mutation frequency of 0.20% (95% CIL 0.05-0.55). This closely matches the often cited Weber and Wong estimate of 0.21% for a set of autosomal tetranucleotide repeats. When the set of microsatellites was expanded with two more loci (a tri- and a pentanucleotide repeat locus), an average chromosome Y microsatellite mutation frequency of 0.21% (95% CIL 0.06-0.49) was found. These estimates suggest that microsatellites on the Y chromosome have mutation frequencies comparable to those on the autosomes. This supports the hypothesis that slippage-generated growth is the driving force behind microsatellite variability.
Mitochondrial DNA (mtDNA) has been used widely to trace human genetic history, with intriguing results (1-3). Recently, these human evolutionary trees based on mtDNA have been confirmed by using large sets of autosomal microsatellites, although some differences in tree topology remained (4,5). What is still missing is a human evolutionary tree based on highly polymorphic chromosome Y loci. Until recently this has been hampered by the lack of informative markers on the Y chromosome. Only a small number of Y-chromosomal variations have been reported so far (6-9). Most of these markers showed genetic diversity between populations but not between individuals in a single population. This limits their usefulness in some, but not all, aspects of evolutionary population genetic research (6). Recently, a series of highly polymorphic Y-specific microsatellites has been developed and tested on different population samples (10-14). These markers show genetic Y-chromosomal heterogeneity within and between populations and seem to be very useful for tracing human evolutionary processes on a historical timescale, i.e. delineating recently split, and thus still closely related, populations. At least one attempt has been made to use chromosome Y microsatellite heterogeneity for tracing part of human evolution on a much longer timescale (15). Especially for the latter purpose, use on an evolutionary rather than historical timescale, a calibration of the mutation rates of these chromosome Y markers is essential. To date, no mutation rate estimate for chromosome Y microsatellites has been available. We speculated that paternally deep rooting pedigrees would be ideally suited for this purpose, since they include a large number of generations that can be analyzed by testing relatively few individuals. Here we describe, for the first time, such a use of deep rooting pedigrees to estimate the mutation rate of nine polymorphic chromosome Y specific microsatellites.
We analyzed 42 males descending from 12 ancestral 'founding fathers' from the Saguenay Region (North-east Quebec, Canada) for nine polymorphic chromosome Y-specific microsatellite loci (Fig. 1). The 12 pedigrees harboured 257 independent paternal meioses. One clear case and two possible cases of illegitimacy were identified. Male 050.4376 was found to have a nine-locus chromosome Y haplotype differing from that of his relatives at seven loci. This makes it very unlikely ($p = 8.1 \times 10^{-13}$, assuming a mutation frequency of 0.21% and using formula 1 in Materials and Methods) that the Y chromosome of this male directly descends from the ancestral male 18219. We therefore excluded all generations (n = 9) along this genealogical line, leaving 248 independent generations. Less clear cases are illustrated by the descendants of male 18642 (involving 19 generations) and male 18251 (involving 16 generations). They differ at three and two out of nine Y loci, respectively, which have probabilities of $4.3 \times 10^{-5}$ and $1 \times 10^{-3}$. We decided to present our results with and without the generations connecting the descendants of these two males in Tables 1 and 2. However, we will only discuss our findings based on the exclusion of both pedigrees (leaving 213 generations), thereby presenting the most conservative mutation rate estimate.
In this study we present the first empirically derived mutation frequency estimate for chromosome Y microsatellites. It is therefore difficult to put our results in perspective. For human autosomal tetranucleotide repeat loci, various studies have resulted in different mutation rates, ranging from 0.015% (16) to 0.21% (17). Our conservative mutation rate estimate of 0.20%, for chromosome Y tetranucleotide repeat loci only, closely matches the often used Weber and Wong estimate of 0.21% for a set of chromosome 19 tetranucleotide repeats (17). Including one tri- and one pentanucleotide repeat locus in our estimate did not alter our result significantly.
These results indicate that, at least for tri-, tetra- and pentanucleotide repeat loci, the Y chromosome has a mutation rate comparable to, if not higher than, that of the autosomes. This is in strong contrast with the reduced nucleotide diversity observed on the Y chromosome when compared with the autosomes (7-9). Recently, it was shown that there is a 7.9-fold reduction of nucleotide diversity on the Y chromosome when compared with the autosomes, where a 4-fold reduction (due to its reduced effective population size) was expected (15).
Our conservative estimate has several consequences. (i) It provides strong support for a major role of slippage-related processes in generating microsatellite variability (18). If a recombination-related process were one of the driving forces (as has been suggested for minisatellites), Y-microsatellites, which do not recombine, would be expected to have a much lower mutation rate. (ii) This estimate also indicates that chromosome Y microsatellites could be of little use for constructing a sufficiently deep rooting human evolutionary tree, as the following example illustrates:
Recently, an attempt was made to date the pre-Columbian peopling of the New World using chromosome Y haplotypes based on a newly identified C -> T transition and the tetranucleotide microsatellite DYS19. From this study it was concluded that this specific haplotype was introduced in the New World at least 30 000 years ago (15). This estimate assumed an (autosomal) microsatellite mutation rate of 0.015% (16) because no estimate for chromosome Y microsatellites was available. Recalculating the above, but now using the Weber and Wong estimate of 0.21% (17), the authors derived a date of entry of only 2147 years ago, which they regarded as an obvious underestimate (15). Our empirically derived Y-tetranucleotide mutation rate of 0.20% evidently results in almost the same, quite recent, estimated age of introduction. Even when our lowest 95% CIL (0.05%) is used, entry of the haplotype of interest to the New World will not be dated earlier than ~7500 years ago. As a consequence, our data would predict that the use of chromosome Y haplotypes based strictly on microsatellites for defining a human evolutionary tree would result in relatively short branches. Because of the high mutation rate of Y microsatellites, identical haplotypes will be found across the world because of recurrent mutations and not because of direct descent. Precisely this has been found recently by Deka et al. (19) and by us (M. Kayser et al., in preparation) in attempts to use chromosome Y microsatellite based haplotypes to construct a Y-based human evolutionary tree.
As explained above, our estimate of 0.20% is most likely an underestimate since it is based only on those pedigrees in which only one mutation event was observed (Tables 1 and 2). For one locus, DYS19, only two mutation events were observed among 626 father-son pairs, resulting in a mutation frequency estimate of 0.32% (95% CIL 0.04-0.67) (M. Kayser and L. Roewer, in preparation), which falls well within our range of less conservative mutation estimates.
In conclusion, our results suggest that, because of their relatively high mutation frequency, chromosome Y microsatellites offer little help in tracing our ancient genetic history. At most, they would be useful in historical, rather than evolutionary, studies, as has already been demonstrated by us (12) and others (13,14).
Because of the high prevalence of different genetic disorders, the Interuniversity Institute for Population Research has been studying the Saguenay Region (North-east Quebec, Canada) since 1971 and has developed a genealogical database in collaboration with the Historical Demography Research Program (PRDH, University of Montreal, Canada). This genealogical database traces the pedigrees of contemporary individuals back to the 17th century. For some of these individuals, blood samples were available from previous studies. Among them, we identified all pairs of male individuals who were related through the paternal lineage. From the pedigrees drawn for these males, we counted the number of generations connecting them (Fig. 1).
All males were screened, as previously described (12), for the following tetranucleotide repeat loci (GDB locus names given): DYS19, DYS385a and b, DYS389, DYS390, DYS391 and DYS393. In addition, DYS392, a trinucleotide repeat, and DXYS156, a pentanucleotide repeat, were used.
Primer sequences and PCR conditions were essentially as published (12). In short, fluorescent PCR assays were run in 25 µl volumes containing 20-100 ng genomic DNA, 50 mM KCl, 10 mM Tris-HCl (pH 8.3), 1.5 mM MgCl2, 0.2 mM of each dNTP (Pharmacia, Uppsala, Sweden), 100 ng of each primer with the forward primer marked by a 5' FITC label, 0.001% (w/v) gelatin and 1 U Amplitaq polymerase (Perkin Elmer/Roche Molecular Systems, Inc., Branchburg, NJ, USA). The PCR products were then separated on 6% denaturing polyacrylamide gels and detected and analyzed using the ALF(TM) automated sequencer (Pharmacia, Uppsala, Sweden). Using the GDB primers, two PCR products are detected for DYS389 and for DYS385. For DYS389, the longest PCR product identifies all repetitive units contained in this locus, whereas the shorter fragment harbours only part of the entire locus. We therefore used only the length variability of the longest fragment (P. de Knijff, in preparation) in this study. For DYS385 no complete sequence information is available yet, but preliminary studies suggest that the two PCR products for this locus indeed reflect two independent loci (unpublished). We therefore regarded the two DYS385 products, DYS385a and DYS385b, as two independent loci. Reference samples, allelic ladders, detailed protocols and further information on the markers used are available from the corresponding author.
Calculations
The probability P that, with a mutation rate $\mu$ per locus per generation, N loci have each mutated after G generations is calculated using the following formula:
Formula 1: $P_N(G,\mu) = [1-(1-\mu)^G]^N$.
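As a hedged illustration of Formula 1, the short Python sketch below evaluates it for the clearest illegitimacy case reported in the Results (seven discordant loci across nine generations, with an assumed mutation frequency of 0.21%). The function name `prob_all_loci_mutated` is a hypothetical choice for this example and does not come from the original study.

```python
def prob_all_loci_mutated(generations: int, mu: float, loci: int) -> float:
    """Formula 1: P_N(G, mu) = [1 - (1 - mu)^G]^N.

    Probability that each of `loci` loci has mutated at least once over
    `generations` meioses, given a per-locus, per-generation rate `mu`.
    """
    per_locus = 1.0 - (1.0 - mu) ** generations  # one locus mutated at least once
    return per_locus ** loci                     # all N loci, treated as independent


# Values taken from the Results: 7 discordant loci, 9 generations, mu = 0.21%.
p = prob_all_loci_mutated(generations=9, mu=0.0021, loci=7)
print(f"p = {p:.1e}")  # close to the 8.1 x 10^-13 quoted in the Results
```

The calculation treats the loci as mutating independently, which is what raising the single-locus probability to the power N implies.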
For a given mutation rate $\mu$, the number of mutations X observed among M independent meioses follows a binomial distribution with parameters $\mu$ and M (formula 2, below). The best estimator of $\mu$ is the value that maximises P: $\hat{\mu} = X/M$. The 95% interval is given by the two values of $\mu$ for which P = 2.5%. These two values were estimated by simulation. When X = 0, we can only calculate the maximum value of $\mu$, above which any rate would have a less than 5% probability of yielding 0 mutations.
Formula 2: $P(X = x) = \binom{M}{x} \mu^x (1-\mu)^{M-x}$
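The estimation procedure described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' procedure: the text says the interval bounds were obtained by simulation, whereas this example substitutes a simple bisection search for the two rates at which Formula 2 equals 2.5%. It follows the interval definition as worded in the text (the point probability of the observed count), which is not the same as a tail-area interval. All function names are hypothetical. The input (2 mutations among 626 father-son pairs) is the DYS19 example quoted in the Discussion.

```python
from math import comb


def binom_pmf(x: int, m: int, mu: float) -> float:
    """Formula 2: probability of exactly x mutations in m meioses at rate mu."""
    return comb(m, x) * mu**x * (1.0 - mu) ** (m - x)


def _bisect(f, lo: float, hi: float, iters: int = 200) -> float:
    """Root of f on [lo, hi]; f(lo) and f(hi) must have opposite signs."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)


def estimate_rate(x: int, m: int, level: float = 0.025):
    """Return (mu_hat, lower, upper) with P(X = x | bound) == level."""
    if not 0 < x < m:
        raise ValueError("this sketch needs 0 < x < m; see upper_bound_zero for x = 0")
    mu_hat = x / m  # maximum-likelihood estimate

    def gap(mu: float) -> float:
        return binom_pmf(x, m, mu) - level

    if gap(mu_hat) <= 0:
        raise ValueError("the probability never reaches the requested level")
    lower = _bisect(gap, 0.0, mu_hat)   # probability rises from 0 up to mu_hat
    upper = _bisect(gap, mu_hat, 1.0)   # and falls again beyond mu_hat
    return mu_hat, lower, upper


def upper_bound_zero(m: int, level: float = 0.05) -> float:
    """X = 0 case: largest mu for which (1 - mu)^m is still at least `level`."""
    return 1.0 - level ** (1.0 / m)


mu_hat, lower, upper = estimate_rate(x=2, m=626)
print(f"estimate {mu_hat:.2%}, bounds {lower:.3%} to {upper:.3%}")
```

Bisection is used here only because it is deterministic and easy to check; bounds obtained by simulation, as in the study, may differ slightly from the values this search returns.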
All samples were genotyped at the Leiden laboratory without prior knowledge of the pedigree structures. After genotyping, a nine-locus chromosome Y haplotype could be constructed for each individual, with the nine loci in the following order: DYS19-DYS390-DYS391-DYS392-DYS393-DYS389-DYS385a-DYS385b-DXYS156. For ease of interpretation, the allele designation for each locus was re-coded into a simple code in which 1 indicates the smallest observed allele for a given locus, and any higher number indicates the increase in size, in number of repeat units, relative to this shortest allele. For the loci analyzed by us, allele 1 (the shortest allele) contained the following number of repeats: DYS19, 13; DYS390, 7; DYS391, 8; DYS392, 9; DYS393, 12; DYS389, 25; DXYS156, 11. Since the exact number of repeats is still unknown for DYS385a and DYS385b, allele 1 simply indicates the shortest allele at each of these loci.
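The re-coding scheme can be illustrated with a small Python sketch. It assumes that the code increases by one for each additional repeat unit, so that code = repeats - shortest + 1; this reading of the text, and the names `SHORTEST_REPEATS` and `recode_allele`, are assumptions made for the example. DYS385a and DYS385b are left out because their repeat numbers are stated to be unknown.

```python
# Repeat count of the shortest observed allele (code 1) per locus, as listed above.
SHORTEST_REPEATS = {
    "DYS19": 13,
    "DYS390": 7,
    "DYS391": 8,
    "DYS392": 9,
    "DYS393": 12,
    "DYS389": 25,
    "DXYS156": 11,
}


def recode_allele(locus: str, repeats: int) -> int:
    """Map a repeat count to the simple code (1 = shortest observed allele)."""
    shortest = SHORTEST_REPEATS[locus]
    if repeats < shortest:
        raise ValueError(f"{locus}: {repeats} repeats is below the shortest observed allele")
    return repeats - shortest + 1  # assumed: one code step per repeat unit


print(recode_allele("DYS19", 15))  # two repeats longer than the shortest allele
```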
1 Vigilant, L., Stoneking, M., Harpending, H., Hawkes, K. and Wilson, A.C. (1991) African populations and the evolution of human mitochondrial DNA. Science 253, 1503-1507.
2 Stoneking, M. (1994) Mitochondrial DNA and human evolution. J. Bioenerg. Biomembr. 26, 251-259.
3 Ayala, F.J. (1995) The myth of Eve: molecular biology and human origins. Science 270, 1930-1936.
4 Bowcock, A.M., Ruiz-Linares, A., Tomfohrde, J., Minch, E., Kidd, J.R. and Cavalli-Sforza, L.L. (1994) High resolution of human evolutionary trees with polymorphic microsatellites. Nature 368, 455-457.
5 Jorde, L.B., Bamshad, M.J., Watkins, W.S., Zenger, R., Fraley, A.E., Krakowiak, P.A., Carpenter, K.D., Soodyall, H., Jenkins, T. and Rogers, A.R. (1995) Origins and affinities of modern humans: a comparison of mitochondrial and nuclear genetic data. Am. J. Hum. Genet. 57, 523-538.
6 Jobling, M.A. and Tyler-Smith, C. (1995) Fathers and sons: the Y chromosome and human evolution. Trends Genet. 11, 449-456.
7 Dorit, R.L., Akashi, H. and Gilbert, W. (1995) Absence of polymorphism at the ZFY locus on the human Y chromosome. Science 268, 1183-1185.
8 Hammer, M.F. (1995) A recent common ancestry for human Y chromosomes. Nature 378, 376-378.
9 Whitfield, L.S., Sulston, J.E. and Goodfellow, P.N. (1995) Sequence variation of the human Y chromosome. Nature 378, 379-380.
10 Mathias, N., Bayes, M. and Tyler-Smith, C. (1994) Highly informative compound haplotypes for the human Y chromosome. Hum. Mol. Genet. 3, 115-123.
11 Roewer, L., Arnemann, J., Spurr, N.K., Grzeschik, K.-H. and Epplen, J.T. (1992) Simple repeat sequences on the human Y chromosome are equally polymorphic as their autosomal counterparts. Hum. Genet. 89, 389-394.
12 Roewer, L., Kayser, M., Dieltjes, P., Nagy, M., Bakker, E., Krawczak, M. and de Knijff, P. (1996) Analysis of molecular variance (AMOVA) of Y-chromosome-specific microsatellites in two closely related human populations. Hum. Mol. Genet. 5, 1029-1033.
13 Cooper, G., Amos, W., Hoffman, D. and Rubinsztein, D.C. (1996) Network analysis of human Y microsatellite haplotypes. Hum. Mol. Genet. 5, 1759-1766.
14 Jobling, M.A., Samara, V., Pandya, A., Fretwell, N., Bernasconi, B., Mitchell, R.J., Gerelsaikhan, T., Dashnyam, B., Sajantila, A., Salo, P.J., Nakahori, Y., Disteche, C.M., Thangaraj, K., Singh, L., Crawford, M.H. and Tyler-Smith, C. (1996) Recurrent duplication and deletion polymorphisms on the long arm of the Y chromosome in normal males. Hum. Mol. Genet. 5, 1767-1775.
15 Underhill, P.A., Jin, L., Zemans, R., Oefner, P.J. and Cavalli-Sforza, L.L. (1996) A pre-Columbian Y chromosome-specific transition and its implications for human evolutionary history. Proc. Natl. Acad. Sci. USA 93, 196-200.
16 Jin, L., Zhong, Y., Shriver, M.D., Deka, R. and Chakraborty, R. (1994) Distribution of repeat unit differences between alleles at tandem repeat microsatellite loci. Am. J. Hum. Genet. 55, Suppl., 39 (abstr.).
17 Weber, J.L. and Wong, C. (1993) Mutation of human short tandem repeats. Hum. Mol. Genet. 2, 1123-1128.
18 Dover, G. (1995) Slippery DNA runs on and on and on... Nature Genet. 10, 254-256.
19 Deka, R., Jin, L., Shriver, M.D., Yu, L.M., Saha, N., Barrantes, R., Chakraborty, R. and Ferrel, R.E. (1996) Dispersion of human Y chromosome haplotypes based on five microsatellites in global populations. Genome Res. 6, 1177-1184.
*To whom correspondence should be addressed. Tel: +31 71 527 4318; Fax: +31 71 527 4517; Email: knijff@ruly46.leidenuniv.nl
Copyright
Oxford University Press, 1996