| Human Molecular Genetics | Pages |
De novo mutations and allelic diversity at minisatellite locus D7S22 investigated by allele-specific four-state MVR-PCR analysis
Introduction
Results
Allele-specific four-state MVR-PCR method
Allelic variation
Allelic variation and repeat unit composition in alleles of <2 kb
DNA sequencing of selected alleles of <2 kb
Allelic variation and repeat unit composition in alleles in size group 2-4 kb
Allelic variation and repeat unit composition in alleles of >4 kb
Investigation of de novo mutations by MVR-PCR
Discussion
Materials And Methods
Materials
Southern analysis
MVR-PCR
Gel extraction of size-separated genomic DNA
DNA sequencing
References
INTRODUCTION
Human minisatellite loci frequently show an extreme variation in allele length revealed by Southern blot analysis. The high degree of length polymorphism is due to allelic variation in the number of tandemly repeated units (1-5), and is generated by a high spontaneous germline mutation rate (6,7). The tandemly repeated units in a hypervariable minisatellite locus also usually show sequence and/or micro-length variation in the consensus sequence (1,5). This variation can be investigated with the minisatellite variant repeat unit mapping by PCR (MVR-PCR) method (8), which reveals the interspersion pattern of variant repeats differing in sequence and/or length at a given position. Analysing hypervariable minisatellite loci using this approach has revealed an extremely high level of inter-allelic variation (8-13). Further improvements of the method have allowed simultaneous detection of several polymorphic sites within the repeat array (11,14) as well as the generation of MVR maps of haplotypes directly from genomic DNA by allele-specific MVR-PCR (15). The MVR-PCR method thus reveals a level of minisatellite variation beyond the length variation detected by Southern analysis, which is useful for forensic casework (16) as well as for studying relationships between human populations (13,17,18).
The MVR-PCR approach has been particularly important for studying minisatellite instability. The distribution of variant repeats within the repeat array in alleles at a given minisatellite locus has given indirect information about the evolution of the locus. MVR-PCR analysis of haplotypes in parent/child samples or in father/sperm samples of de novo mutants identified in pedigrees or by small-pool PCR (19) has given direct and detailed knowledge about the mutation mechanisms leading to the generation of new alleles (8-12,18-21).
The human minisatellite locus D7S22 (g3) located on chromosome 7 (7q36-qter) (22,23) is a hypervariable minisatellite locus widely used in forensic casework and paternity testing. Southern blot analysis has indicated a heterozygosity frequency of at least 96% in Caucasians (23,24). The repeat array is GC-rich with a strand asymmetry in base composition, and it consists of 37 bp repeat units with some interspersed 33 and 36 bp repeats (23,25). Comparisons of sequenced repeat units from cloned DNA (23) and from small alleles (25) have revealed several base pair substitutions within the repeat array. The paternal mutation rate in D7S22 (~1.2%) is almost 10-fold higher than the maternal mutation rate (26,27). The alleles are distributed in several allele length groups, each group dominated by a particular flanking haplotype, and there is a heterogeneity in the mutation rate associated with allele length and allelic state at a substitution polymorphism (54C/G) close to the repeat array (26).
To investigate further the allelic diversity and mutation mechanisms at this minisatellite locus, we have developed an allele-specific four-state MVR-PCR assay. In this study, we present the method and the results from analyses of 150 alleles selected from different size groups and allelic states at 54C/G (26). We also present analyses of 54 small families with D7S22 de novo mutations obtained from extensive paternity casework (26).
RESULTS
Allele-specific four-state MVR-PCR method
Four different discrete repeat variants were detected with the selected MVR-specific primers. The variants were arbitrarily named A (tag-A), G (tag-G), a (tag-a) and 3 (tag-3). The two alternative flanking primers 54C and 54G discriminated between the allelic states at the substitution polymorphism 54 bp upstream of the repeat array (expected heterozygosity frequency 0.48) (26). Thus, allele-specific four-state MVR maps were generated directly from genomic DNA in heterozygous individuals (Fig. 1).
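The expected heterozygosity of 0.48 quoted for the 54C/G site can be related to allele frequencies. The following is a minimal sketch that assumes a two-allele site in Hardy-Weinberg proportions, so that H = 2p(1 - p); this assumption and the function name are illustrative and are not stated in the text.

```python
import math

def allele_freqs_from_heterozygosity(h):
    """Solve 2*p*(1-p) = h for p at a biallelic site (Hardy-Weinberg assumed)."""
    disc = 1.0 - 2.0 * h
    if disc < 0:
        raise ValueError("expected heterozygosity of a biallelic site cannot exceed 0.5")
    root = math.sqrt(disc)
    # The two roots are the frequencies of the more and less common allele.
    return (1.0 + root) / 2.0, (1.0 - root) / 2.0

major, minor = allele_freqs_from_heterozygosity(0.48)
print(round(major, 3), round(minor, 3))  # 0.6 0.4
```

Under these assumptions the two allelic states would have frequencies of about 0.6 and 0.4; the calculation does not say which of 54C and 54G is the more common.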
Figure 1. A Southern blot showing three alleles (A-C) analysed by allele-specific MVR-PCR. One of four variant repeats (G, A, a, 3) was detected in a given position in the repeat array of each allele. Examples of null-alleles are indicated (*) in some positions in allele A. Alleles A, B and C were 12.0, 3.2 and 2.7 kb respectively, when sized on blots of HinfI-digested genomic DNA. MVR-PCR of allele A was performed using flanking primer 54C while MVR-PCR of alleles B and C was performed using flanking primer 54G. Autoradiography was for 1 h 40 min. Prolonged exposure of the blot was sometimes necessary in order to detect the smallest and largest variant repeats in a given allele. The arrow indicates the first scored repeat (a G-variant) in allele A. The base pair size of selected fragments in the BRL-ladder (B) is shown.
In some positions, the MVR-specific primers failed to detect any of the four repeat variants. These uncharacterized variant repeats were called null (0)-variants, and the percentage of 0-variants varied from 10 to 20% in the alleles investigated. In some maps, the 3-variants differed more in band intensity than expected. This was probably caused by suboptimal primer annealing due to additional sequence variation located in the 5′ region of the MVR-specific primer site. With micro-variation in size between repeats (23,25), 0-variants and band intensity differences, diploid codes could not be typed with certainty. However, with allele-specific MVR-PCR, this variation did not create any typing problem.
When MVR-PCR products were detected with an ABD373 (ABD-based method), we managed to type at most 30 repeats into the repeat array. Three different variants were detected in the same lane and, using an internal ladder (25), the fragments could be sized with high precision. These fragment size measurements showed that the size of the smallest repeat variant (first repeat variant in the MVR map) varied between alleles from different subgroups, revealing a difference in number of 0-variants close to the flanking DNA. In contrast to the ABD method where detection of a given fragment is not dependent on its size but on its molar concentration, larger fragments bind more probe when using the Southern method. In agreement with this, MVR-PCR performed with the Southern method (Fig.
Allelic variation
The mapping of alleles revealed highly polymorphic interspersion patterns (MVR-codes). All differently sized alleles could also be distinguished by comparison of MVR-codes, and several alleles typed as identical on Southern blots revealed different MVR-codes. Alleles close in size and with an identical flanking haplotype tended to show higher similarity in MVR-code, and could be classified into subgroups. Alleles in these subgroups revealed a high content of one particular repeat variant and a characteristic interspersion pattern, indicating a close evolutionary relationship. In general, the inter-allelic variation was not biased towards any particular part of the repeat array.
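The comparison of MVR-codes described above can be illustrated by treating each code as a string over the symbols A, G, a, 3 and 0 (the null variant). The sketch below computes the variant composition of a code and the number of leading repeats two codes share. The function names and the two example codes are hypothetical and were invented for illustration; they are not data from this study.

```python
from collections import Counter

VARIANTS = ("A", "G", "a", "3", "0")  # four detected variants plus the null (0) variant

def composition(mvr_code):
    """Return the fraction of each variant in an MVR-code string."""
    if not mvr_code:
        raise ValueError("empty MVR-code")
    unknown = set(mvr_code) - set(VARIANTS)
    if unknown:
        raise ValueError(f"unexpected symbols: {sorted(unknown)}")
    counts = Counter(mvr_code)
    return {v: counts[v] / len(mvr_code) for v in VARIANTS}

def shared_prefix_length(code_x, code_y):
    """Number of leading repeats (read from the flanking end) on which two codes agree."""
    n = 0
    for x, y in zip(code_x, code_y):
        if x != y:
            break
        n += 1
    return n

# Hypothetical codes, invented for illustration only.
allele_1 = "G3303a33G3"
allele_2 = "G3303a3AG3"

print(composition(allele_1)["3"])               # 0.6, i.e. a "3-rich" code
print(shared_prefix_length(allele_1, allele_2)) # 7
```

A simple composition measure of this kind reflects what is meant by a "high content of one particular repeat variant"; it does not capture the interspersion pattern itself, which requires position-by-position comparison.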
Allelic variation and repeat unit composition in alleles of <2 kb
MVR-codes of the alleles in this size group are given in Figure 2.
Figure 2. MVR-maps of alleles of <2 kb (MVR-code). Alleles with identical MVR-codes are grouped, and the number of alleles in each group is given (No alleles). Allele size is given in number of repeats (R). One allele from each group was analysed by DNA sequencing, and repeat variants were differentiated further (subscript numbers) by additional substitution polymorphisms. Repeat variants with additional substitutions not present in 14R are shown with a subscript N. Repeat variants D, C and G2 correspond to D, C and B2, respectively, in Andreassen and Olaisen (25).
DNA sequencing of selected alleles of <2 kb
Eleven of the 14Rs were analysed further by DNA sequencing to disclose the total number of substitution polymorphisms within the repeat array. Eight of these were identical to previously published 14Rs (25), revealing a repeat array composed of nine different repeat variants. One additional base pair substitution was revealed in three 14Rs (two of these were identical) compared with the other 14Rs. The tandem repeat array of one allele from each of the other subgroups (Fig.
Allelic variation and repeat unit composition in alleles in size group 2-4 kb
All 37 alleles in this size group were mapped in full with the Southern method. All alleles were 54G at the flanking site. Mapping of the alleles revealed highly polymorphic interspersion patterns (Figs 3 and 4).
Figure 3. 3-rich alleles in the size group 2-4 kb. Gaps (--) were introduced in some maps to illustrate the similarity in the MVR-code between alleles. The number of alleles with identical MVR-codes is given (No).
Figure 4. G-rich alleles in the size group 2-4 kb. Segments of AG-variants or A-variants are shown in parentheses, with the number of AGs or As, respectively, denoted by a number. Gaps (--) were introduced in some maps to illustrate the similarity in the MVR-code between alleles. The number of alleles with identical MVR-codes is given (No).
Allelic variation and repeat unit composition in alleles of >4 kb
A total of 49 alleles were mapped, and a selection of MVR maps from 20 alleles of >4 kb is shown in Figure 5.
Figure 5. MVR maps of representatives from alleles of >4 kb. The size of alleles on Southern blots (size) and state at the flanking polymorphism (54) is given for each allele.
Investigation of de novo mutations by MVR-PCR
In 32 cases, no difference between the progenitor and mutant was detected, while in the remaining 22 cases the site where the mutation occurred could be identified. In eight of these cases, the alleles were mapped in full. Distribution of the mutations within the repeat array in these eight alleles is illustrated in Figure 6.
Figure 6. Relative location of mutations (arrows) in the repeat array from eight cases with de novo mutation where the alleles were mapped in full. The site of mutation within the repeat array was revealed by comparison of progenitor and mutant and, in cases with ambiguity in the positioning of the mutation site, the site closest to the flanking DNA was chosen as the position where the mutation had occurred. Location of the 54C/G polymorphic flanking site relative to the repeat array is indicated (54C/G).
In 46 cases, the progenitor allele length exceeded 60 repeats, and in 14 of these cases (30%) the site where mutation occurred was revealed within the MVR-mapped segment. The number of repeats investigated in these 46 cases, 2760 altogether, represents 35% of all repeats in these alleles. Thus, 30% of mutations were revealed within 35% of the total number of repeats. This indicates a fairly even distribution of mutation events between the repeats investigated (35%) and the repeats further into the repeat array (65%). For the 14 cases where the mutation site was revealed, we produced a cumulative hazard plot with SPSS (28) to estimate the probability of mutation as a function of location within the 60 repeats mapped. There was no evidence for a higher mutation rate in the repeats close to the flanking DNA. On the contrary, if there was any tendency, the first 25 repeats appeared more stable than the remaining repeats further into the array.
In five cases (three gain and two loss mutations), the alleles in the parental sample were 54C and at the same time too close in size to be size separated. Although the mutations were clearly seen in the diploid MVR-codes, haplotype MVR maps of progenitors and non-progenitors were not generated in these cases. In the remaining 17 cases (10 gain and seven loss mutations), the haplotypes of progenitors and non-progenitors in parental samples and mutants in the child were revealed by allele-specific MVR-PCR or by MVR-PCR on size-separated HinfI-digested alleles. When analysing size-separated alleles, the average number of variants mapped was a little less than in MVR maps generated directly from genomic DNA, but in all cases the alleles were mapped at least 50 repeats into the tandemly repeated array. In all these cases, the mutant revealed an identical MVR-code to the progenitor on both sides of the deleted or inserted repeats.
Five of the loss mutations were deletions of 1-3 repeats without any other changes in MVR-codes compared with the progenitor. One case was a large deletion of 91 repeats (based on Southern blot size measurements). Due to the large deletion, the progenitor was not mapped far enough into the repeat array to reveal whether this was a pure deletion of repeats. However, there was no sign of inter-allelic transfer from the non-progenitor when comparing the interspersion pattern in the corresponding position with the site where the deletion occurred. MVR maps of mutant, progenitor and non-progenitor in gain mutations and in one case with both loss and gain (case 11) are given in Figure 7.
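The proportions quoted for the 46 long progenitor alleles can be checked with a few lines of arithmetic. The sketch below uses only the figures given in the text; the implied mean allele length is a derived value that is not reported in the text and is approximate, since the 35% figure is rounded. All variable names are illustrative.

```python
cases_long_progenitor = 46       # progenitor alleles longer than 60 repeats
repeats_mapped_per_case = 60     # repeats typed from the flanking end in each case
cases_site_found = 14            # mutation site located inside the mapped segment
fraction_of_array_mapped = 0.35  # share of all repeats that was mapped (from the text)

repeats_mapped = cases_long_progenitor * repeats_mapped_per_case
share_sites_found = cases_site_found / cases_long_progenitor

# Derived, not reported: the mean progenitor length implied by the 35% figure.
implied_total_repeats = repeats_mapped / fraction_of_array_mapped
implied_mean_allele_length = implied_total_repeats / cases_long_progenitor

print(repeats_mapped)                     # 2760
print(round(share_sites_found * 100))     # 30
print(round(implied_mean_allele_length))  # 171
```

If mutations were spread evenly along the array, the share of mutation sites found in the mapped segment (30%) would be expected to be close to the share of repeats mapped (35%), which is the comparison the text draws.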
Figure 7. Partial MVR maps of mutants (m) with additional repeats compared with progenitors (p) (underlined). Partial MVR maps of non-progenitors in the corresponding position (n) are given (alleles were aligned from the side of the repeat array close to the 54C/G polymorphism). For each allele, the allelic state in the 54 site (54), the size of the allele in number of repeats (R) and the size in kb on HinfI blots (in parentheses) is given. The position in the repeat array where the mutation occurred is given as the first repeat revealing a difference between progenitor and mutant (M).
In de novo mutations where the parental alleles were gel separated, the progenitor allele could be identified by comparison of MVR-codes even if the exact site where the mutation occurred was not detected. This allowed us to reveal the size of the progenitor, and in 14 out of 17 cases (82%) the progenitor was the parental allele closest in size to the mutant allele.
DISCUSSION
The MVR maps of alleles revealed the existence of different subgroups of alleles that are closely related. The small alleles (<2 kb), the 3-rich and the G-rich alleles share the 54G flanking haplotype, but the MVR maps showed that they could be divided further into three different lineages. The alleles >4 kb comprise a fourth group defined by their interspersion pattern, but this allele group could not be divided in subgroups based on common MVR characteristics. The MVR-PCR of D7S22 gives information about relationships among alleles which is not obtained by Southern analysis, and might, as in other minisatellites (17,18), be a tool for investigation of populations and their demographic history.
MVR-PCR analysis of size-separated parental alleles from the mutation material showed that in most cases (82%) the progenitor was the parental allele closest in size to the mutant. This is in good accordance with earlier findings (29), and it supports results from a previous study (26) where we reported that most mutations in D7S22 involved small size changes.
If new alleles were generated by inter-allelic unequal recombination, one would expect these alleles to have the flanking haplotype and interspersion pattern from one subgroup combined with the interspersion pattern from another subgroup beyond the mutation site. No such alleles were found in the population material (e.g. alleles starting with MVR-codes from 3-rich alleles and ending with MVR-codes from G-rich alleles). Furthermore, the de novo mutations did not seem to be combinations of the parental alleles. These findings support earlier observations (11,19,29) that minisatellite mutation ordinarily is not accompanied by inter-allelic unequal recombination.
In a study of length and sequence variation in small alleles (<2 kb), we concluded that rare small alleles are likely to originate from the common 14R (25). The low variation revealed by DNA sequencing of the 14Rs in this study supports that this is a homogeneous, stable allele. To investigate further the mechanism involved in generating the other small alleles, rare small alleles larger than 14R were grouped by MVR-PCR. One allele from each of these groups was analysed by DNA sequencing. In accordance with our earlier findings (25), the DNA sequencing suggested that the other larger rare alleles (Fig.
In the de novo mutation material, the ratio between intra- and inter-allelic events could be studied further. While most loss mutations are pure losses of repeats, at least one (case 7, Fig.
As pointed out in other studies (19,20), the conversion-like events indicate that the mutations occur in meiosis, where homologues are paired. None of the de novo mutations investigated in this study indicated that any post-meiotic mechanisms are involved.
Mutation studies in several other minisatellites have revealed polarized variation with a hot spot for mutation within the repeat array (9,10,19,21). The mutation rate in these minisatellites seems to be independent of allele length, and the highly unstable hot spot for mutation is the least homogeneous part of the repeat array judged from the distribution of MVR variation (19,20). Based on these findings, a mutation model has been proposed which implies that the high instability revealed in hypervariable GC-rich minisatellites is not an intrinsic property of the repeat array itself, but is regulated by presently unknown cis-acting factors (19). Such a hypothesis could also explain the observed heterogeneity in mutation rate associated with some flanking markers (20,21,26). In a study of D7S22, we reported a heterogeneity in mutation rate associated with the allele length and the state at the 54C/G site (26). The finding that a lineage structure is present in alleles of <4 kb, but not in larger alleles is in agreement with our previous results that larger alleles have a higher mutation rate. These factors could themselves somehow contribute to the heterogeneity in mutation rate, or be passenger effects of other unknown mutation enhancers such as internal attributes of the repeat array or cis-acting factor(s). In this study, we have applied MVR-PCR analysis to explore this further. First, MVR-PCR analysis shows that the repeat array is composed of several different repeat variants. A further typing of sequence variation by DNA sequencing (25) shows that the repeat array can be divided into numerous different variants, and homogeneous stretches with identical repeats are rare. We cannot rule out that one repeat variant might be more unstable than others, but when mapping the location of a mutation in the progenitors there was no obvious preference for mutation occurring in one particular repeat variant. 
Second, the MVR mapping indicates that there is no polarized variation or hot spot for mutation in this minisatellite locus. Therefore, an explanation for the observed association between allele length and mutation could be that the GC-rich repeat array itself affects the stability. Since mutation occurs at any part of the repeat array, larger alleles would have more repeat array available for mutation. As a consequence of this, instability of alleles would increase with the size of the repeat array, possibly in combination with a lower limit of the number of repeats needed for mutation (30). Such a mechanism does not rule out the possibility that there are additional cis-acting factors affecting the mutation rate, or that the impact of factor(s) affecting the mutation rate might be different for intra- and inter-allelic events.
MATERIALS AND METHODS
Materials
Genomic DNA was obtained from paternity case material consisting of 5700 mother/child pairs and 4600 fathers for allelic diversity analysis by MVR-PCR. Individuals likely to be heterozygous at the 54C/G flanking polymorphism were selected for MVR-PCR based on allele size on Southern blots (26). A total of 64 alleles of <2 kb were analysed. Of these, 34 were the common 14R while the remaining alleles were rare alleles larger than 14R. Twenty-four alleles in this size group were analysed further by nucleotide sequencing of the tandem repeat array. In addition, 37 alleles from the 2-4 kb size group, 32 alleles in the 4-9 kb size group and 17 alleles >9 kb were analysed by allele-specific four-state MVR-PCR. Genomic DNA from 54 small families (mother/child/father) with a de novo mutation in locus D7S22 in the child was obtained from the same paternity case material and analysed by MVR-PCR.
Southern analysis
DNA was extracted from blood samples and analysed by the Southern method as described (26).
MVR-PCR
Four MVR-specific primers were selected based on sequence data from a cloned large allele (EMBL accession no. M31583) and from small alleles (25).
A 20 nucleotide 5′ extension (tag-tail) (8) was added to each of the primers. Together, these primers detected sequence variation in two sites separated by 9 bp as well as a 4 bp deletion. The MVR-specific primers used were: tag-A, 5′-tag-tail-GGCAGGGAGAGGCAGGAA-3′; tag-G, 5′-tag-tail-GGCAGGGAGAGGCAGGAG-3′; tag-a, 5′-tag-tail-GGCAGGGAGGGGCAGGAA-3′; tag-3, 5′-tag-tail-GGTGGTGTGGGCAGGGG-3′. Allele-specific flanking primers were constructed based on a base pair substitution close to the repeat array (26). These primers were used to haplotype alleles directly in genomic DNA. The allele-specific flanking primers used were: 54C: 3′-GTTATTATAAAGGGTCGAAGAGCCACAAC-5′; and 54G: 3′-CTTATTATAAAGGGTCGAAGAGCCACAAC-5′.
For MVR-PCR, genomic DNA (50 ng) was added to 1 nM of one of the MVR-specific primers, 1 µM tag-primer, 2 µM of one of the flanking primers, 1.75 mM MgCl2, 0.25 mM dNTP, 5 µl 10× PCR reaction buffer (Promega), together with H2O to a final volume of 50 µl. The MVR-PCR reactions were performed on a Perkin Elmer 9600. The cycling conditions were: initial denaturation 94°C for 90 s, 20 cycles of 94°C for 30 s, 65°C for 30 s, 70°C for 3 min, and a terminal extension for 10 min. The annealing temperature when amplifying with tag-a or tag-3 was 67°C.
When MVR-PCR products were detected on ABD373 (ABD-based method), the concentration of MVR-specific primers was 20 nM, the tag-primers were labelled with different fluorochromes (one label in combination with each of the four MVR-specific primers) and 26 cycles were performed. All alleles of <2 kb were detected using an ABD 373 (Applied Biosystems Division), denaturing 6% polyacrylamide gel, TBE buffer and set-up parameters as described by the manufacturer. Genescan-2500 and Genescan-372 software were used in fragment length measurements.
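The two allele-specific flanking primers listed above are written in the 3′ to 5′ direction. As a small illustration, the sketch below rewrites them 5′ to 3′ and locates the position at which they differ. The sequences are copied from the text; the dictionary and function names are hypothetical.

```python
# Flanking primer sequences exactly as written in the text (3' -> 5').
FLANKING_3_TO_5 = {
    "54C": "GTTATTATAAAGGGTCGAAGAGCCACAAC",
    "54G": "CTTATTATAAAGGGTCGAAGAGCCACAAC",
}

def to_5_to_3(seq_3_to_5):
    """Reverse the written order only; this is NOT a reverse complement."""
    return seq_3_to_5[::-1]

primers = {name: to_5_to_3(seq) for name, seq in FLANKING_3_TO_5.items()}

# 0-based positions (5' -> 3') at which the two primers differ.
mismatches = [i for i, (x, y) in enumerate(zip(primers["54C"], primers["54G"])) if x != y]

print(len(primers["54C"]))  # 29
print(mismatches)           # [28], the 3'-terminal base
print(primers["54C"][-1], primers["54G"][-1])  # G C
```

The two primers differ only at the 3′-terminal base, which is the usual design for allele-specific priming. That primer 54C ends in G is consistent with it pairing with a C on the template strand, although the strand orientation is an inference here and is not spelled out in the text.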
MVR-PCR products from alleles of >2 kb were separated in 1% agarose gel, Southern blotted (26) and detected using the phosphatase-labelled probe g3 (Cellmark) (31). Typing of the Southern blots was performed manually.
All haplotypes in the population material were obtained by allele-specific four-state MVR-PCR. The individuals in the mutation material were screened by allele-specific MVR. When homozygous individuals (54G/G or 54C/C) were screened, diploid codes were generated, but the site where the mutation occurred could be localized by comparison of MVR maps from mother, child and father since any length change would displace the MVR-code in the mutant. In cases where the mutation site was revealed, haplotypes of the progenitor and non-progenitor as well as the mutant allele in the child were obtained by allele-specific MVR-PCR or by size separation of HinfI-digested alleles followed by MVR-PCR of genomic DNA extracted from gel slices. In one case, the haplotypes in the child were deduced by comparing the diploid code in the child with the haplotypes in the mother and father.
Gel extraction of size-separated genomic DNA
Samples of HinfI-digested genomic DNA were size separated in 0.7% agarose gel. The kb DNA ladder (Stratagene) was used in adjacent lanes as a molecular weight standard. Using UV light and ethidium bromide staining of the gels (0.5 µg/ml), the area containing D7S22 alleles was identified by comparison with the kb DNA ladder and sliced out. Genomic DNA was recovered from gel slices using the Qiaex II gel extraction kit (Qiagen).
DNA sequencing
DNA sequencing of the tandem repeat array in small alleles was performed as described (25).
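The localisation of a mutation site by comparing progenitor and mutant MVR-codes, as described under MVR-PCR, can be sketched as a string comparison. The example below assumes both alleles have been mapped in full and read from the flanking end, and it reports the first repeat that differs, the length change, and whether the mutant matches the progenitor on both sides of a single contiguous gain or loss. The function names and the two codes are hypothetical and were invented for illustration; this is not the procedure used in the study, where typing was done manually.

```python
def common_prefix(x, y):
    """Length of the shared leading run of two strings."""
    n = 0
    for a, b in zip(x, y):
        if a != b:
            break
        n += 1
    return n

def describe_length_change(progenitor, mutant):
    """Compare two fully mapped MVR-codes read from the flanking end."""
    delta = len(mutant) - len(progenitor)          # >0 gain, <0 loss of repeats
    prefix = common_prefix(progenitor, mutant)
    shorter = min(len(progenitor), len(mutant))
    # Shared trailing run, capped so it cannot overlap the shared leading run.
    suffix = min(common_prefix(progenitor[::-1], mutant[::-1]), shorter - prefix)
    return {
        "delta": delta,
        "first_difference": None if progenitor == mutant else prefix,
        # True when the codes agree on both sides of one contiguous gain or loss.
        "simple": prefix + suffix == shorter,
    }

# Hypothetical codes, invented for illustration only.
progenitor = "GAG3aA3G"
mutant = "GAG3a3aA3G"   # two extra repeats relative to the progenitor

result = describe_length_change(progenitor, mutant)
print(result)  # {'delta': 2, 'first_difference': 5, 'simple': True}
```

Reporting the first differing repeat follows the convention stated in the legend of Figure 7. Where the inserted or deleted repeats are identical to their neighbours, the true site is ambiguous and several positions up to that repeat are equally compatible with the codes.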
This page is run by Oxford University Press, Great Clarendon Street, Oxford OX2 6DP, as part of the OUP Journals
Last modification: 13 Nov 1998
Copyright © Oxford University Press, 1998.
Articles by Andreassen, R.
Articles by Olaisen, B.