Human Molecular Genetics
Ascertainment bias cannot entirely account for human microsatellites being longer than their chimpanzee homologues
INTRODUCTION
Microsatellites are arguably the most important class of genetic markers yet discovered. Although it is widely accepted that most mutations involve a length change of just one repeat unit (1,2), other aspects of microsatellite evolution remain unclear and the subject of controversy. For example, there is growing evidence from direct detection of germline mutations in pedigrees as well as from population studies that mutations resulting in the addition of a repeat unit are more common than those resulting in the loss of a repeat unit (1-7). However, this pattern has yet to be reconciled satisfactorily with the general rarity of long microsatellites.
We previously examined 40 human microsatellite markers and their homologues in a panel of non-human primates, and showed that human loci tend to be longer (5), a trend which is apparent in several other studies (3,8-13). Taken at face value, these data indicate that, since their most recent common ancestor, more microsatellite expansion mutations have occurred in the lineage leading to humans compared with the lineage leading to chimpanzees. We suggested that this provided evidence that microsatellites tend to expand with time and are doing so more rapidly in humans (14). However, an alternative explanation is that the length differences are due to ascertainment bias arising from the selection of longer than average human loci as markers (15). Here we present the necessary reciprocal experiment showing that human microsatellites tend to be longer than their chimpanzee homologues, regardless of the species from which the loci were cloned. Our data comprise 38 chimpanzee-derived CA-repeat microsatellites (Table 1 and ref. 16) which were amplified in a panel of six chimpanzees and six humans.
RESULTS
Of the 38 chimpanzee-derived microsatellites, nine loci were excluded because they did not amplify products of the expected size consistently, and one because it amplified products from the chimpanzees but not from humans. Length comparisons for the remaining 28 informative loci are presented in Table 2. Fourteen loci were longer in humans than in chimpanzees, eight were longer in chimpanzees and the remaining six showed no significant difference (P > 0.05 by Mann-Whitney two-sample rank test). This ratio of 14:8 does not differ significantly from the ratio of 33:7 longer in humans for human markers (P = 0.18, Fisher's exact test) (5), but does differ very significantly from the converse ratio of 7:33 (P = 0.0007) which would be expected from a reciprocal ascertainment bias. It is important to note that the chimpanzee-derived loci used in this current study have a similar mean repeat length compared with the human microsatellites used previously (5) (P = 0.13, two-tailed t-test). We therefore reject the null hypothesis that the greater length of human microsatellites can be explained by ascertainment bias alone.
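The ratio comparisons above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' procedure: the function name is hypothetical, and it uses one common two-sided convention (summing every table no more probable than the observed one). The text does not state which convention was used, and doubling the one-tailed probability is another common choice, so the printed values need not match the published ones exactly.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # A small relative tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Chimpanzee-derived loci: 14 longer in humans, 8 longer in chimpanzees.
p_same = fisher_exact_two_sided(14, 8, 33, 7)      # against human-derived 33:7
p_converse = fisher_exact_two_sided(14, 8, 7, 33)  # against the converse 7:33
print(f"14:8 vs 33:7  P = {p_same:.3f}")
print(f"14:8 vs 7:33  P = {p_converse:.4f}")
```

Under either convention the qualitative conclusion is the same: the observed ratio is compatible with 33:7 but not with the converse ratio of 7:33.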
Table 1.
| Primers | Primer sequence (5′→3′) | Repeat sequence |
| 21.01F | CCATTGCTATCTACTTCCCTGA | (GT)12 |
| 21.01R2 | CCTGACAAATACAGGTCCCA | |
| 21.04F | GCAGGAGAATCGCTTGAATC | (CA)12 |
| 21.04R2 | TGCATCACTAGGTAAAACAGGTG | |
| 21.11F | GCAACACAGCAAGACTCCAT | (CA)21 |
| 21.11R2 | CAGTAGAGGGTCTCAAACTCTTCA | |
| 21.12F | AAAACACCATGGCACGTGTA | (TG)17 |
| 21.12R | TGATTTTGTATCCTAAAACCTTACTG | |
| 21.14F | TCCTGATGTCAATTTCCTAAACAA | (AC)16 |
| 21.14R | CATTTTATACCAGGGACTTGGG | |
| 21.22F | AAAGATGGGGTGATGTGATAGG | (GT)12 |
| 21.22R | CAGGGCTCATCTCACTTGGT | |
| 21.24F | CCAAAATGTACTTCTATGGCCTG | (GT)13 |
| 21.24R | CAACCAGCCCTCCAGTTAAA | |
| 21.26F2 | GGGATGGTCTTGGCAATTTA | (TG)12 |
| 21.26R | CCCCACCAGAAACACCAAAT | |
| 22.05F | TCTGCCACTTAAAGATGGGG | (GT)16 |
| 22.05R | TCAGGAATTTTAAGAGAAGGAACA | |
| 22.14F | TTGAAAGAATCATTGTAAATACCACC | (GT)14 |
| 22.14R | GGTCTCGCACTTCTGACCTC | |
| 22.37F2 | TTAAAATGGCGTGTGCTCAA | (AT)12T(TG)14 |
| 22.37R2 | CACAAACCAAGGTAGTGGCA | |
| 22.39F | ACATGGCACAAGGGATTGTT | (TG)16 |
| 22.39R | CCTCCCATGTGGCTAATGTT | |
| 22.40F | TCTTTTGGATTCTAAATCTGATGAAA | (AC)21 |
| 22.40R2 | TTGGTTTTCTCTCCCACCTG | |
| 22.52F | CTGTGGTCCCTTGAAGTCCT | (AC)6ATG(CA)6TGT(AC)6GT(AC)3 |
| 22.52R2 | TGCTCTTCCCAATCTTAGAATGTA | |
| 22.57F | GACCTACCATTGCCTTTGGA | (GT)12 |
| 22.57R2 | CATGTTGCAGGGAAACTGAA | |
| 22.59F2 | CAAAACAGAGGTGGGAGGAA | (GT)14 |
| 22.59R | GGGTCATGATGCTATCTCAGC | |
| 22.62F2 | TATTTTGGCTTCATCTGGCA | (GT)22 |
| 22.62R | ACTTTGTTTTGGGGCAAGTG | |
| 22.67F2 | AGGGGAGACAGTGGGTAGGT | (TG)23 |
| 22.67R | CAAACCTAGCCTGCCTGTTG | |
| 24.30F | AGCTTGTTAACCAGAATTAGG | (GT)15 |
| 24.30R | GCGGATAACAATTTCACACA | |
| 26.06F | CATGGCCCTGATAAGAGGAA | (TG)21 |
| 26.06R | AATGACTTTGGAGCATTGCC | |
| 26.10F | AGGGTGAGGCAGGAGAATTT | (CA)20 |
| 26.10R | TGTGGAACGAGTTCAGCATT | |
| 26.17F | GGTGTCTCTGCTTTCCTTGC | (AC)22 |
| 26.17R | CCAAAAGCATCACGTTACCA | |
| 26.21F | ACTGGTGCCAGGCTACATTT | (GT)18 |
| 26.21R | TGACCTCTGGTTAGTTGCCA | |
| 26.22F | CCCAAGTGTACTTTTCCACCTT | (GA)13(GT)28 |
| 26.22R | GAGGAGAGGTTAAATAGAGACATAGAA | |
| 26.27F | CACCCCTGCATCTTTCAAGT | (AC)23 |
| 26.27R | GTTGGCAGAATTCCCACATT | |
Table 2.
| Locus | Human median length | Human size range (bp) | Human no. of alleles | Chimpanzee median length | Chimpanzee size range (bp) | Chimpanzee no. of alleles | Probability | Conclusion |
| 21.01 | 180 | 170-188 | 7 | 176 | 170-184 | 7 | 0.20 | NS |
| 21.12 | 219 | 181-257 | 8 | 181 | 171-187 | 6 | 0.0002 | hu > ch |
| 21.14 | 168.5 | 160-184 | 7 | 172 | 154-180 | 6 | >0.9 | NS |
| 21.22 | 260.5 | 256-265 | 7 | 256 | 254-258 | 3 | 0.0023 | hu > ch |
| 21.24 | 287 | 279-299 | 7 | 260 | 249-271 | 10 | <0.0001 | hu > ch |
| 21.26 | 183 | 180-188 | 5 | 180 | 172-188 | 7 | 0.11 | NS |
| 22.05 | 147 | 145-151 | 3 | 135 | 131-141 | 5 | <0.0001 | hu > ch |
| 22.14 | 288 | 282-296 | 7 | 276 | 260-280 | 6 | <0.0001 | hu > ch |
| 22.37 | 182 | 178-184 | 4 | 197 | 192-200 | 5 | <0.0001 | ch > hu |
| 22.39 | 192 | 184-198 | 6 | 190 | 186-194 | 5 | 0.029 | hu > ch |
| 22.40 | 177 | 173-181 | 4 | 173 | 159-177 | 6 | 0.0048 | hu > ch |
| 22.52 | 267 | 243-271 | 6 | 254 | 249-257 | 4 | 0.033 | hu > ch |
| 22.57 | 178 | 176-180 | 3 | 182 | 170-196 | 8 | 0.095 | NS |
| 22.59 | 174 | 168-174 | 3 | 189 | 174-200 | 7 | 0.0001 | ch > hu |
| 22.62 | 148 | 146-150 | 3 | 165 | 156-168 | 5 | <0.0001 | ch > hu |
| 22.67 | 158 | 148-162 | 7 | 160 | 150-170 | 8 | 0.24 | NS |
| 26.06 | 163 | 163-175 | 5 | 155 | 149-165 | 8 | 0.0004 | hu > ch |
| 26.10 | 212 | 203-226 | 8 | 213 | 206-220 | 6 | 0.84 | NS |
| 26.17 | 191 | 190-198 | 4 | 181 | 180-186 | 4 | <0.0001 | hu > ch |
| 26.21 | 216 | 212-226 | 7 | 210 | 198-222 | 6 | 0.0066 | hu > ch |
| 26.22 | 121 | 118-122 | 3 | 150 | 138-180 | 8 | <0.0001 | ch > hu |
| pTGT81 | 224 | 220-226 | 4 | 218 | 212-222 | 5 | 0.0009 | hu > ch |
| pTGT211 | 106 | 100-108 | 5 | 116.5 | 110-120 | 5 | <0.0001 | ch > hu |
| pTGT241 | 110 | 107-117 | 5 | 114 | 109-121 | 6 | 0.018 | ch > hu |
| pTGT221 | 90 | 87-93 | 4 | 87 | 87-89 | 2 | 0.0002 | hu > ch |
| 21.04 | 177 | 177 | 1 | 171 | 171-177 | 3 | hu mono | hu > ch |
| pTGT153 | 124 | 124 | 1 | 143 | 137-165 | 5 | hu mono | ch > hu |
| pTGT271 | 80 | 80 | 1 | 87 | 85-91 | 4 | hu mono | ch > hu |
Since ascertainment bias alone cannot account for the observed pattern of length differences, we must consider the possibility that some of the effect is due to different rates of microsatellite expansion in the two lineages. This possibility raises a further complicating source of observer bias. Most markers are selected to be polymorphic in the species from which they were cloned. However, a proportion of homologues in other species are monomorphic. On average, monomorphic loci have lower mutation rates than equivalent polymorphic loci. Consequently, biased mutation favouring expansion will cause most monomorphic loci to be shorter than their polymorphic homologues. In practice, failure to exclude monomorphic loci will accentuate the ascertainment bias.
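The reasoning in this paragraph can be illustrated with a deliberately simple expected-value calculation. Everything numeric here is a hypothetical assumption chosen for illustration (the mutation rates, the number of generations and the expansion probability are not taken from the study), and the model ignores length ceilings and any dependence of mutation rate on length.

```python
def expected_net_gain(mutation_rate, generations, p_expand):
    """Expected net change in repeat units under single-step mutation.

    Each mutation adds one repeat with probability p_expand and removes
    one otherwise, so the expected step per mutation is 2 * p_expand - 1.
    """
    return mutation_rate * generations * (2 * p_expand - 1)


# Hypothetical values, for illustration only.
GENERATIONS = 250_000
P_EXPAND = 0.6  # assumed bias towards expansion

polymorphic_like = expected_net_gain(1e-4, GENERATIONS, P_EXPAND)
monomorphic_like = expected_net_gain(1e-5, GENERATIONS, P_EXPAND)
print(f"higher-rate locus: {polymorphic_like:+.2f} repeat units")
print(f"lower-rate locus:  {monomorphic_like:+.2f} repeat units")
```

With any expansion bias above 0.5, the locus with the lower mutation rate is expected to gain fewer repeats over the same period, which is why monomorphic homologues would tend to be the shorter of the pair.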
The size of the effect associated with monomorphism is shown clearly by a study in which 448 polymorphic microsatellites cloned from cattle were tested on sheep (17). Whereas most polymorphic sheep loci are significantly longer than their bovine homologues (198/308 longer in sheep, or 64%), this trend is reversed among monomorphic sheep loci, where a large majority are shorter (109/131 shorter in sheep, 83%). Given this dramatic difference, we repeated our analyses after excluding three further loci which were polymorphic in chimpanzees but monomorphic in humans. As expected, all P-values become a little more extreme. Thus, the amended ratio of 13:6 does not differ significantly from the ratio of 31:6 longer in humans (polymorphic in both species) for human markers (P = 0.325, Fisher's exact test) (5), but does differ very significantly from the converse ratio of 6:31 (P = 0.0003) which would be expected from a reciprocal ascertainment bias. In addition, the trend is maintained if we conservatively restrict our analyses to dinucleotide repeat loci only: 13:6 does not differ from 19:3 (P = 0.315) but it does differ from the converse ratio of 3:19 (P = 0.0009).
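The step from 14:8 to 13:6 can be checked directly against the Conclusion column of Table 2. The sketch below simply transcribes that column; the list and function names are hypothetical.

```python
# Loci grouped by the Conclusion column of Table 2.
HU_LONGER = ["21.12", "21.22", "21.24", "22.05", "22.14", "22.39", "22.40",
             "22.52", "26.06", "26.17", "26.21", "pTGT81", "pTGT221", "21.04"]
CH_LONGER = ["22.37", "22.59", "22.62", "26.22", "pTGT211", "pTGT241",
             "pTGT153", "pTGT271"]
NOT_SIGNIFICANT = ["21.01", "21.14", "21.26", "22.57", "22.67", "26.10"]

# Loci listed as "hu mono" (a single human allele) in Table 2.
HUMAN_MONOMORPHIC = frozenset({"21.04", "pTGT153", "pTGT271"})


def tally(exclude=frozenset()):
    """Count loci per conclusion, skipping any locus named in `exclude`."""
    return tuple(
        sum(1 for locus in group if locus not in exclude)
        for group in (HU_LONGER, CH_LONGER, NOT_SIGNIFICANT)
    )


print("all informative loci (hu>ch, ch>hu, NS):", tally())
print("polymorphic in both species:            ", tally(HUMAN_MONOMORPHIC))
```

One of the three excluded loci was scored as longer in humans and two as longer in chimpanzees, which is how both sides of the ratio shrink.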
For loci which are polymorphic in both species, the difference between the reciprocal comparisons can be used to estimate the size of the ascertainment bias affecting human-chimpanzee comparisons. For dinucleotide repeat loci cloned and characterized in humans (n = 22), human loci were an average of 5.18 repeat units longer than in chimpanzees, while dinucleotide repeats cloned from chimpanzees (n = 25) were on average 1.23 repeat units longer in humans. Assuming that the ascertainment bias is identical in both directions, its magnitude can be calculated as half of the difference between these means, which is 1.97 repeat units. By subtraction of this estimate of ascertainment bias from the result of 5.18 repeat units, we find that a difference of 3.21 repeat units remains which cannot be attributed to the ascertainment bias. Similarly, if all repeat types are included in this analysis, we obtain an estimate of 1.74 repeat units for the ascertainment bias and 2.44 repeat units for the inter-species difference. It should be noted that the confidence limits on these estimates of ascertainment bias are rather large (standard error of the mean is 0.91 for dinucleotide repeats and 0.73 for all loci), and larger data sets would allow the calculation of more accurate values.
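The arithmetic behind these estimates follows from a simple model in which the true inter-species difference d is inflated by a bias b in human-derived markers (d + b) and deflated by the same bias in chimpanzee-derived markers (d - b). The sketch below assumes exactly that symmetry; the function name is hypothetical, and because the inputs are the rounded means quoted above, the last digit may differ slightly from the published figures.

```python
def decompose(human_derived_excess, chimp_derived_excess):
    """Split two reciprocal length differences into bias and true difference.

    Assumes human-derived markers show d + b and chimpanzee-derived
    markers show d - b, with the same bias b in both directions.
    """
    bias = (human_derived_excess - chimp_derived_excess) / 2
    true_difference = human_derived_excess - bias
    return bias, true_difference


# Dinucleotide repeats: mean human excess in repeat units for each marker set.
bias, true_difference = decompose(5.18, 1.23)
print(f"ascertainment bias:       {bias:.3f} repeat units")
print(f"inter-species difference: {true_difference:.3f} repeat units")
```

Under this model the residual difference is simply the mean of the two reciprocal estimates, so it does not depend on which marker set is taken as the starting point.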
DISCUSSION
By means of a reciprocal comparison, we have been able to calculate the size of the artefactual length difference between human and chimpanzee microsatellites which can be ascribed to the initial selection for long microsatellites as human markers. Over and beyond this, there remains a significant trend for human dinucleotide repeats to be on average 3.21 repeat units longer than their chimpanzee homologues. To explain this, we are left with an alternative, two-part hypothesis: first, that there is a mutational bias favouring expansion, which confirms the population genetic and direct mutation data; and second, that the mutation rate of microsatellites is greater in the human lineage than in the chimpanzee lineage.
Reciprocal tests of microsatellite lengths have also been conducted on sheep and cows, but two studies reach contrasting conclusions. The larger study (17), based on almost 500 markers, finds that of 20 ovine microsatellites which are polymorphic in both species, 16 are longer in sheep than in cattle. Conversely, among 308 microsatellites of bovine origin, 198 are longer in sheep than cattle (significantly different from 1:1, χ² = 25.14, 1 df, P < 0.0001). This clear finding that sheep microsatellites tend to be significantly longer than those of cattle regardless of the species of origin of the marker supports our conclusion that ascertainment bias does not provide a complete explanation for length differences between species. The other study is much smaller, based on only 13 bovine-derived microsatellites and 14 ovine-derived microsatellites (18), and concludes that a strong ascertainment bias does operate. However, this study fails to exclude loci which are monomorphic in one or other species. Of their original 27 loci, only 14 loci (seven derived from each species) are polymorphic in both species, too few to detect a significant deviation from 1:1.
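The chi-square value quoted for the bovine-derived loci can be reproduced from the two counts alone. This is a minimal sketch of a goodness-of-fit test against a 1:1 expectation; the function name is hypothetical, and the P-value uses the closed form for one degree of freedom.

```python
from math import erfc, sqrt


def chi_square_equal_split(count_a, count_b):
    """Goodness-of-fit test of two counts against a 1:1 expectation."""
    expected = (count_a + count_b) / 2
    chi2 = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# 198 of 308 bovine-derived loci are longer in sheep; the other 110 are not.
chi2, p_value = chi_square_equal_split(198, 308 - 198)
print(f"chi-square = {chi2:.2f}, P = {p_value:.2g}")
```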
By quantifying the extent of the ascertainment bias and showing it to account for only about one-third of the observed length difference we originally reported, we have provided further support for the model in which human microsatellites are expanding more rapidly than their homologues in chimpanzees. The apparently greater mutation rate in humans could result from a number of phenomena. (i) Human polymerases and/or repair systems may simply be more error-prone than those in related species. (ii) The frequency of microsatellite mutations seems to be correlated with the number of cell divisions involved in genesis of the germline, as suggested by the higher frequency of microsatellite mutations of paternal rather than maternal origin, and by increasing rates of mutation in sperm with age (1,2). Since there is a longer period between sexual maturity and reproduction in humans compared with chimpanzees, this phenomenon could lead to inter-specific differences in mutation rates. (iii) Microsatellite allele length may be constrained by some type of length ceiling, which is lower in chimpanzees than in humans (8,10,19). (iv) Microsatellite mutations may involve interchromosomal events (20,21) with mutations occurring preferentially in heterozygotes with a large length difference between the alleles (2). Such heterozygote instability would increase the mutation rate in larger or expanded populations such as humans.
At present, we cannot exclude any of these possible explanations and, indeed, more than one may contribute to the observed pattern. The demonstration that sheep microsatellites are longer than cow microsatellites (17) could be called on to support suggestions (i), (iii) or (iv) [but probably not (ii)]. We currently favour (iv) as an explanation which could be generalized to many species without invoking idiosyncrasies in polymerases or length boundaries in each species. Further reciprocal studies of microsatellites in abundant species and their non-abundant congeners would provide valuable tests of this hypothesis, and help both to quantify and to understand the precise nature of the ascertainment bias.
Our results show that human microsatellites are on average longer than their chimpanzee homologues. This provides further support for the empirical finding of a bias towards mutations which increase the length of microsatellites, and suggests further that mutation-driven expansion is progressing faster in humans than in chimpanzees. Although the exact mechanism for this cannot be determined at present, it would be interesting to see whether this major deviation from the molecular clock for microsatellite evolution reflects a nuclear genome-wide phenomenon for other types of mutation.
MATERIALS AND METHODS
Amplification of microsatellite loci
(CA)n microsatellite loci were identified from a library of 500-750 bp fragments of AluI-digested chimpanzee DNA by standard methods (22). Pairs of primers were designed from the sequences of 24 clones using the criterion of an uninterrupted TG/CA repeat of at least 12 repeat units (maximum cloned was 23 repeat units, mean size 17.3 repeat units). In addition, primers were designed to amplify one interrupted repeat of total 23 repeat units with four interruptions of two or three bases each (locus 22.52). Primer sequences for previously unpublished loci are given in Table 1. PCR reactions were labelled by incorporation of [α-32P]dCTP, and PCR products were compared with a known sequence ladder on 6% denaturing acrylamide gels.
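The selection criterion of an uninterrupted TG/CA repeat of at least 12 units is easy to express as a pattern match on a clone sequence. The sketch below is illustrative only and is not the screening method used here, which relied on a clone library and sequencing; the names and the test sequence are hypothetical.

```python
import re

MIN_UNITS = 12  # minimum number of uninterrupted repeat units

# The four motifs cover both strands and both reading frames of a CA repeat.
REPEAT = re.compile(
    "|".join(f"(?:{unit}){{{MIN_UNITS},}}" for unit in ("CA", "AC", "TG", "GT"))
)


def find_repeats(sequence):
    """Return (start, motif, units) for each qualifying uninterrupted run."""
    hits = []
    for match in REPEAT.finditer(sequence.upper()):
        run = match.group(0)
        hits.append((match.start(), run[:2], len(run) // 2))
    return hits


# Hypothetical clone: a 15-unit CA run qualifies, an 11-unit GT run does not.
clone = "GGATT" + "CA" * 15 + "TTGACC" + "GT" * 11 + "AAC"
for start, motif, units in find_repeats(clone):
    print(f"({motif}){units} at position {start}")
```

An interrupted repeat such as locus 22.52 would not be reported as a single run by this pattern, since each uninterrupted stretch is evaluated separately.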
DNA samples
Chimpanzee DNA samples were from unrelated Pan troglodytes troglodytes individuals at the International Centre for Medical Research, Gabon. Human DNA samples were collected from East Anglians and black South Africans. All DNA samples were from peripheral blood lymphocytes.
ACKNOWLEDGEMENTS
We thank E. Jean Wickings of CIRMF, Gabon, for providing chimpanzee DNA samples, and T. Jenkins, M. Kotze and R. Ramesar for human DNA samples from Africa. This work was supported by the Leverhulme Trust (Grant F/752/A). D.C.R. is a Glaxo Wellcome Research Fellow.
REFERENCES
This page is run by Oxford University Press, Great Clarendon Street, Oxford OX2 6DP, as part of the OUP Journals
Last modification: 12 Aug 1998
Copyright©Oxford University Press, 1998.