| Human Molecular Genetics | Pages |
Re-examination of factors associated with expansion of CGG repeats using a single nucleotide polymorphism in FMR1
INTRODUCTION
Information obtained from allele state at flanking polymorphic markers was essential for cloning of the FMR1 gene in 1991 (1). Since then, a number of studies have focused on the use of polymorphic markers to investigate the origins of fragile X expansion mutations. The most studied polymorphism is the CGG repeat in exon 1 of FMR1 itself, which ranges from six to ~60 repeats in normal individuals. The repeat is unstable in non-penetrant premutation carriers (~60-200 repeats) and in full mutation patients (>200 repeats). Expansion to the full mutation state results in hypermethylation of the repeat and a nearby CpG island of the FMR1 promoter, thus suppressing transcription of FMR1 (1-4). The exact molecular basis of the instability is largely unknown; however, models assuming multistep pathways are thought to be the most accurate in describing CGG repeat length progression (5-8). In these models, four general states of the repeat alleles are described: N, normal and stable alleles (<41 repeats); S, high end normal or predisposed alleles (41-60 repeats); Z, premutation and unstable alleles; L, expanded, full mutation alleles. The transition steps between pools have different rates, and the mutational mechanisms that move CGG repeat alleles from one pool to the next are not well defined (5,9). Currently, data from dinucleotide polymorphisms in and around FMR1 provide the best estimates of the mutational steps involved in producing unstable and expanded alleles.
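The four allele pools described above can be illustrated with a small length-based classifier. This is only a sketch: the function name is hypothetical, the cut-offs at 60 and 200 repeats are the approximate values quoted in the text, and the multistep models define the pools by stability as well as by length, which a pure length rule cannot capture.

```python
def classify_repeat_pool(cgg_repeats: int) -> str:
    """Assign a CGG repeat allele to one of the four pools by length alone.

    Assumed cut-offs (approximate, taken from the text):
      N: fewer than 41 repeats (normal, stable)
      S: 41-60 repeats (high end normal or predisposed)
      Z: 61-200 repeats (premutation, unstable)
      L: more than 200 repeats (full mutation)
    """
    if cgg_repeats < 1:
        raise ValueError("repeat count must be a positive integer")
    if cgg_repeats < 41:
        return "N"
    if cgg_repeats <= 60:
        return "S"
    if cgg_repeats <= 200:
        return "Z"
    return "L"


if __name__ == "__main__":
    for n in (30, 45, 90, 350):
        print(n, classify_repeat_pool(n))
```

A length-only rule is a simplification; in practice the purity of the tract (number and position of AGG interruptions) also bears on whether an allele behaves as stable or unstable.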
Several studies have shown that the CGG repeat of FMR1 often contains one or more AGG interruptions, generally at regular intervals in normal individuals (10-12). In stable repeat tracts, the number and position of AGG interruptions can provide information as to the age of alleles (13), particularly when combined with haplotypes defined by flanking loci (14,15). These interruptions, thought to have been present in the FMR1 repeat since early hominoid ancestors (16), appear to play a crucial role in the maintenance of repeat stability. FMR1 repeat tracts with at least 34 pure CGGs have been observed to be unstable upon transmission, including all premutation alleles where interspersion patterns have been determined (12,14,17,18). Thus, loss of one or more AGG interspersions, creating a longer perfect repeat tract, almost certainly represents one mutational step hypothesized for repeat tracts to progress to instability and disease (9). Alternatively, a slow, sequential lengthening of a perfect repeat array may represent another route to premutation alleles (10).
One paradoxical aspect of the fragile X mutation has been the maintenance of linkage disequilibrium with flanking markers despite the high estimated overall mutation rate of 0.8 × 10⁻⁴ from normal repeat alleles to unstable alleles (9). The most characterized marker loci used in association studies (Fig.
Figure 1. (A) Representative picture of SSCP analysis for novel single nucleotide polymorphism. Samples 1-6 are normal (non-fragile X) DNA subjected to PCR with ATL1 primers and run on a 6% non-denaturing gel. (B) Detail of automated sequence analysis for ATL1 polymorphism. The arrow indicates base 19445 of the FMR1 sequence. (Left) ATL1-A homozygote (male); (right) ATL1-G homozygote (male). (C) Representative dot blots probed with allele-specific oligonucleotides. The A probe was used on the left and the G probe on the right. Samples are dotted in rows of eight, increasing from top to bottom and from left to right. Sample 3 is a known A homozygote (male), sample 4 is a water control and sample 5 is a known G homozygote (male). (D) Location of polymorphic markers in the FMR1 gene region. The 5′ end of the human FMR1 gene, through exon 3 and the upstream region are depicted (not to scale). Exons are shown as red boxes. Arrows indicate the approximate locations of markers used in this and previous studies and distances are indicated in green between markers. The ATL1 polymorphism is shown in sequence context below the gene.
Despite the utility of these microsatellite markers, they have certain disadvantages common to dinucleotide repeats. Determination of allele state at microsatellite markers requires gel analysis, which is not practical for rapid processing of large numbers of samples. Also, discernment of exact allele state by gel analysis can often be difficult due to a ladder of bands of similar intensity. This problem is particularly evident with the complex marker FRAXAC2 (21). Gel analysis, in particular, suffers from variation in migration of repeat-rich PCR products, resulting in difficulty comparing studies between laboratories which use different standards for size determination. In addition to these largely technical issues, microsatellites have a high mutation rate, up to ~1/1000 at some loci (22). 
While this leads to the informativeness of multiple alleles, the high mutation rate may blur allelic associations over time. For example, almost two thirds of full mutation chromosomes are linked to four or five haplotypes, while the remainder of the full mutations appear to occur on rare backgrounds (9). These rare haplotypes may well be due to mutations at the microsatellites or recombination which alters haplotypes after the transition to S alleles but before full expansion, a process that is estimated to take >90 generations (23). Indeed, it has also been suggested that genome-wide or regional instability could affect both the microsatellite markers and the FMR1 repeat simultaneously (24-26), obscuring the linear relationship of haplotypes and the CGG repeat. Therefore, more stable markers, such as single nucleotide polymorphisms (SNPs), could provide a distinct advance to studies investigating the origin of CGG repeat instability in FMR1. Indeed, SNP loci are both technically superior to microsatellites and much more stable (27). Unfortunately, SNPs suffer from having only two alleles, and if the variant allele is either very infrequent or of very recent origin, the utility of the SNP for haplotype analysis is limited. We have previously described two SNPs in FMR1: FMRa, the deletion of a T nucleotide in intron 1, and FMRb, a G→A transition in exon 5 (10). However, the rarer allele of each marker is completely linked to the same FRAXAC1 allele in both normal and fragile X Caucasian samples (10,15), thus minimizing the additional information obtained by typing these two SNPs. Here we report the identification and characterization of a novel SNP in intron 1 of FMR1, 5.6 kb distal of the CGG repeat. This SNP, named ATL1, has two alleles at near equivalent frequencies in the normal population and displays marked disequilibrium with fragile X chromosomes. 
Moreover, ATL1 alleles define haplogroups of each previously defined haplotype, thereby providing considerable insight into the mutational history of FMR1 alleles and potential mechanisms of instability.
RESULTS
In order to identify new polymorphisms in the FMR1 gene, single-stranded conformational polymorphism (SSCP) analysis was conducted on sequences in intron 1. Several primer sets of 200-300 bp products were tested on a panel of normal and fragile X male samples of various FMR1 haplotypes. Use of one primer set (ATL1-F and ATL1-R) yielded differing band patterns in a number of the samples (Fig.
ATL1 and CGG repeat length
This assay was tested first on X chromosomes with normal CGG repeat lengths at FMR1 (see Materials and Methods). In 564 chromosomes tested (Table 1), the ATL1-A and ATL1-G alleles were represented at 60 and 40%, respectively. The two alleles of ATL1 are linked to similar numbers of different CGG repeat length alleles, with the A allele on 28 individual repeat length backgrounds and the G allele on 31. However, a significant difference (Fisher's exact test, P < 0.001) was observed in the overall length distribution of repeats with respect to each ATL1 allele, shown graphically in Figure
Figure 2. CGG repeat distribution for the two alleles of the ATL1 polymorphism. Frequencies on the y-axis are the percentage of all 564 normal X chromosomes tested.
Further studies revealed a highly significant difference (χ² = 89.33, P < 0.001) in the distribution of the two ATL1 alleles between normal repeat length and fragile X chromosomes. Among 152 fragile X chromosomes tested (Table 1), the G allele was found on 83%. Chromosomes with 41-60 CGG repeats are termed intermediate or `gray zone' alleles. Interestingly, if the 24 chromosomes in the normal repeat sample set with CGG repeats between 41 and 60 are examined separately, the distribution of the two ATL1 alleles is the same as that seen in fragile X chromosomes, at 17% A and 83% G. An independently ascertained group of 34 intermediate allele and premutation-sized chromosomes (see Materials and Methods) showed a similar distribution of 15% A and 85% G. These results further extend previous observations that intermediate alleles comprise the pool from which expanded repeat chromosomes are derived. In addition, we have seen that the G allele of ATL1 is in significant linkage disequilibrium with larger sized CGG repeat alleles at FMR1.
Table 1.
| CGG repeat size | n | Percentage A allele (n) | Percentage G allele (n) |
| Normal | 564 | 60 (340) | 40 (224) |
| Repeats < 30 (14-29) | 196 | 54 (103) | 46 (90) |
| Repeats = 30 | 207 | 81 (168) | 19 (39) |
| Repeats > 30 (31-57) | 161 | 41 (66) | 59 (95) |
| Intermediate and premutation-sized alleles | | | |
| Sample set I (41-57) | 24 | 17 (4) | 83 (20) |
| Sample set II (41-74) | 34 | 15 (5) | 85 (29) |
| Fragile (>200) | 152 | 17 (26) | 83 (126) |
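The comparison of ATL1 allele distributions between normal and fragile X chromosomes can be retraced from the counts in Table 1 (340 A and 224 G among normal chromosomes; 26 A and 126 G among fragile X chromosomes). The sketch below assumes a Pearson χ² test on the 2 × 2 table without continuity correction; the text does not state which variant was used, and the function name is illustrative. Under that assumption the statistic agrees with the reported value of 89.33.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its P value for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Rows: normal, fragile X; columns: ATL1-A, ATL1-G (counts from Table 1).
    chi2, p = chi_square_2x2(340, 224, 26, 126)
    print(f"chi-square = {chi2:.2f}, P = {p:.3g}")
```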
ATL1 and haplotype associations
Marked linkage disequilibrium has been observed between the CGG repeat locus and flanking microsatellite markers (locations shown in Fig.
First, the ATL1-G allele is associated with many more haplotypes than the ATL1-A allele, even though the G allele is the less frequent in the normal chromosomes tested. The A allele is almost exclusively (95%, 323 of 340 chromosomes) linked to AC1 allele 3. Furthermore, 15 of the 17 chromosomes not linked to AC1 allele 3 have AC1 alleles mutated by only one dinucleotide repeat, thus being AC1-2 or AC1-4. In contrast, the ATL1-G allele is distributed among three predominant alleles at the FRAXAC1 marker, with AC1 alleles 1, 3 and 4 represented on normal G allele chromosomes at 22, 41 and 36%, respectively. For the fragile X chromosomes examined, the A allele is completely associated with AC1 allele 3. The G allele is linked to the same three alleles of AC1, but in a dissimilar distribution from control chromosomes, at 35, 16 and 47% for AC1 alleles 1, 3 and 4, respectively.
For further examination of common haplotypes, the results from our microsatellite analysis are indicated graphically in Figure In contrast, the 7-3 haplotype has been hypothesized to have a negative effect on the probability for expansion. We saw the 7-3 haplotype itself present on 62% of all normal X chromosomes tested, comparable with previous studies (9,14). Interestingly, the 7-3 haplotype is not equally observed between the A and G alleles in normal and fragile X chromosomes. Indeed, the 7-3-A allele is significantly under-represented (χ² = 79.4, P < 0.001) in fragile X chromosomes, at 12% compared with 51% of all normal chromosomes, while the 7-3-G allele is found at similar levels (χ² = 1.36, not significant) in normal (11%) and fragile X chromosomes (9%). Thus, the ATL1 locus can be used to discriminate between two 7-3 haplotypes, one protected from expansion (7-3-A) and one not protected (7-3-G).
Table 2.
DXS548-FRAXAC1 haplotype
Percentage of normal chromosomes (n = 564): % ATL1-A (n), % ATL1-G (n)
Percentage of fragile chromosomes (n = 152): % ATL1-A (n), % ATL1-G (n)
0-1
0.2 (1)
1-1
2.9 (15)
0.7 (1)
2-1
3.9 (22)
22.9 (35)
3-1
0.2 (1)
0.7 (1)
5-1
0.2 (1)
0.7 (1)
6-1
0.2 (1)
0.4 (2)
1.3 (2)
7-1
1.4 (8)
3.3 (5)
2-2
0.2 (1)
6-2
0.2 (1)
7-2
1.1 (6)
0.2 (1)
1.3 (2)
8-2
0.2 (1)
1-3
0.7 (4)
2-3
0.9 (5)
1.6 (9)
1.3 (2)
2.0 (3)
3-3
0.7 (1)
5-3
0.2 (1)
0.2 (1)
6-3
3.9 (22)
2.1 (12)
2.6 (4)
2.0 (3)
7-3
50.9 (287)
11.3 (64)
11.8 (18)
8.5 (13)
8-3
1.4 (8)
0.2 (1)
1.3 (2)
2-4
0.5 (3)
1.3 (2)
3-4
0.7 (1)
4-4
0.2 (1)
0.7 (1)
5-4
0.7 (4)
0.7 (1)
6-4
7.1 (40)
32.7 (50)
7-4
1.4 (8)
5.5 (31)
2.6 (4)
8-4
0.2 (1)
7-5
0.2 (1)
Total
340
224
26
126
In a subset of normal and fragile X chromosomes, we also determined allele state at the FRAXAC2 complex microsatellite marker. The results are shown in Table 3, with chromosomes separated by DXS548-FRAXAC1-FRAXAC2 haplotype and by allele state at ATL1. As in Table 2, the chromosomes are ordered by allele state at FRAXAC1, as it is the marker closest to the repeat itself. Even with the large sample size, many of the haplotypes only occur on one chromosome or are only associated with one AC2 allele, so the amount of information obtained by typing this marker is minimized. However, we observed significantly unequal distributions of the AC2-4 (χ² = 34.4, P < 0.001) and AC2-4+ (χ² = 162.5, P < 0.001) alleles based on their ATL1 association. For example, Table 3 shows that addition of the FRAXAC2 marker splits the 7-2 haplotype into two groups: 7-2-4, which is only associated with ATL1-G and includes normal and fragile X chromosomes; 7-2-4+, which is only associated with ATL1-A and only on normal chromosomes. This split is also found on the 6-3-4 versus 6-3-4+ haplotypes and on 7-3-4 versus 7-3-4+. Although both of these AC2-4+ groups show fragile X chromosomes with the ATL1-A allele, expansions to full mutations appear to occur at a lower frequency on AC2-4+, ATL1-A chromosomes than on AC2-4, ATL1-G chromosomes.
ATL1 and AGG interspersion pattern
To completely describe the relationship between known FMR1 polymorphisms and ATL1, we studied the AGG interruption pattern of the FMR1 CGG repeat on 203 of our normal repeat length X chromosomes (10,14). Of these chromosomes, 122 have the A allele of ATL1 and 81 have the G allele. Figure
Table 3.
| DXS548-AC1-AC2 haplotype | Normal ATL1-A | Normal ATL1-G | Fragile ATL1-A | Fragile ATL1-G |
| 0-1-3 | 1 | |||
| 1-1-3 | 13 | |||
| 1-1-4 | 1 | |||
| 2-1-3 | 13 | 24 | ||
| 2-1-4+ | 1 | |||
| 3-1-3 | 1 | |||
| 5-1-5 | 1 | |||
| 6-1-3 | 1 | 1 | ||
| 6-1-4+ | 1 | |||
| 7-1-2 | 1 | |||
| 7-1-3 | 2 | 3 | ||
| 7-1-4+ | 2 | |||
| 2-2-4 | 1 | |||
| 7-2-4 | 1 | 2 | ||
| 7-2-4+ | 4 | |||
| 1-3-4 | 2 | |||
| 1-3-5 | 2 | |||
| 2-3-4 | 2 | 5 | 1 | 2 |
| 2-3-4+ | 3 | 1 | ||
| 3-3-3 | 1 | |||
| 6-3-4 | 2 | 1 | 3 | |
| 6-3-4+ | 15 | 3 | 2 | |
| 7-3-2 | 1 | |||
| 7-3-2+ | 1 | |||
| 7-3-3 | 2 | |||
| 7-3-3+ | 17 | 3 | 1 | |
| 7-3-4 | 27 | 34 | 4 | |
| 7-3-4+ | 176 | 11 | 7 | 1 |
| 7-3-5 | 1 | |||
| 7-3-5+ | 1 | 1 | ||
| 7-3-6+ | 1 | |||
| 8-3-4+ | 6 | 1 | 1 | |
| 2-4-5 | 1 | |||
| 2-4-6 | 1 | |||
| 2-4-6+ | 2 | |||
| 3-4-5 | 1 | |||
| 4-4-5 | 1 | |||
| 5-4-3 | 1 | |||
| 5-4-4 | 1 | |||
| 6-4-3 | 1 | |||
| 6-4-4 | 12 | 7 | ||
| 6-4-5 | 9 | 14 | ||
| 6-4-5+ | 3 | |||
| 6-4-6+ | 3 | 4 | ||
| 7-4-4 | 1 | |||
| 7-4-5 | 1 | 2 | 1 | |
| 7-4-5+ | 2 | |||
| 7-4-6 | 1 | |||
| 7-4-6+ | 5 | 20 | ||
| 7-4-7+ | 1 | |||
| 8-4-6 | 1 | |||
| 7-5-0+ | 1 | |||
| Total chromosomes | 268 | 162 | 12 | 82 |
Figure 3. Microsatellite haplotype analysis of X chromosomes. The results of Table 2 are depicted in graphical form, with normal chromosomes on the left and fragile X chromosomes on the right. In each group, percentages of ATL1 alleles are indicated at the side as shades of red and percentages of microsatellite haplotypes as shades of blue.
Within the most common normal haplotype (7-3-X), the use of ATL1 clearly shows a separation of CGG repeat alleles with the 9+n and 10+n interruption patterns. The overall repeat length is similar for chromosomes with the A or G allele linked to this haplotype (Fig. Previously, repeat tracts containing >24 pure repeats have been suggested to represent alleles at risk for expansion to fragile X syndrome (10). These alleles have been hypothesized to account for 0.5-4% of normal X chromosomes, possibly corresponding to the S pool of alleles in the multistep progression models (5,6,10). In our survey we find 7/205 control samples tested have tracts of >24 perfect repeats, or 3.4%. No alleles with >24 pure CGG repeats were observed linked to the ATL1-A allele; thus, all seven samples with >24 pure repeats are linked to ATL1-G. Four of these seven X chromosomes have the `predisposing' 2-1-X haplotype background and have a 9+9+n repeat configuration. The remaining three are either 9+n or completely pure and are associated with haplotypes more rarely seen in fragile X syndrome: 1-3-4, 2-3-4 and 7-4-6+. Three completely pure repeat tracts are linked to the A allele but none exceeded 23 repeats; in contrast, those observed linked to the G allele were 54, 27 and 24 uninterrupted CGG repeats, respectively.
Table 4.
| DXS548-FRAXAC1-FRAXAC2 haplotype | ATL1-A | ATL1-G |
| 1-1-3 | | 9+9+26, 9+9+25 |
| 2-1-2 | | 9+9+31, 9+9+35, 9+9+24, 9+9+32, 9+9+23, 9+9+30, 9+9+22 |
| 2-1-3 | | 9+9+28, 9+9+22, 9+9+25, 9+9+21, 9+9+25, 9+9+10+24, 9+9+24, 9+9+10+20, 9+9+24, 9+9+9+9+9 |
| 6-1-3 | | 9+9+38, 9+9+24 |
| 6-4-4 | | 10+24+9, 9+64, 9+10+12+9 |
| 6-4-5 | | 9+33, 9+31 |
| 7-3-4 | | 9+9+21 |
| 7-3-4+ | 10+32, 10+9+9+10, 10+9+21 | |
| 8-3-4+ | 11+44, 9+48 | |
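The interruption notation used in Table 4 can be made concrete with a short parser. The sketch below assumes the usual reading of the notation, in which each number is a run of uninterrupted CGG repeats and each `+` stands for a single AGG, so that 9+64 corresponds to 74 repeats in total (the upper bound quoted for sample set II). The class and function names are hypothetical; the thresholds of >24 and ≥34 pure repeats are the ones cited in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepeatTract:
    """CGG runs of a repeat tract, 5' to 3', separated by single AGG interruptions."""
    cgg_runs: tuple

    @property
    def n_agg(self) -> int:
        return len(self.cgg_runs) - 1

    @property
    def total_repeats(self) -> int:
        # Every AGG counts as one repeat unit in the overall length.
        return sum(self.cgg_runs) + self.n_agg

    @property
    def longest_pure_run(self) -> int:
        return max(self.cgg_runs)

    @property
    def first_agg_class(self) -> str:
        # 9+n: first AGG in the tenth position; 10+n: in the eleventh position.
        if self.n_agg == 0:
            return "pure"
        return {9: "9+n", 10: "10+n"}.get(self.cgg_runs[0], "other")


def parse_tract(pattern: str) -> RepeatTract:
    """Parse a pattern such as '9+9+26' (spaces tolerated) into a RepeatTract."""
    runs = tuple(int(part) for part in pattern.replace(" ", "").split("+"))
    if not runs or any(r < 0 for r in runs):
        raise ValueError(f"invalid interruption pattern: {pattern!r}")
    return RepeatTract(runs)


if __name__ == "__main__":
    for pattern in ("10+9+9", "9+9+26", "9+64", "54"):
        t = parse_tract(pattern)
        print(pattern, t.total_repeats, t.first_agg_class,
              "pure run >24" if t.longest_pure_run > 24 else "",
              "pure run >=34" if t.longest_pure_run >= 34 else "")
```

Keeping the runs as a tuple rather than expanding the tract into a full sequence makes both the overall length and the longest pure run one-line computations.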
Figure 4. Analysis of AGG interruption pattern. A total of 203 control human chromosomes and 5 chimpanzee chromosomes are depicted, with CGG repeats as open circles and AGG repeats as dark circles. Alleles are organized by DXS548-FRAXAC1-FRAXAC2 haplotype, listed by ascending FRAXAC1 allele and separated by allele state at ATL1, with A alleles on the left and G alleles on the right. An X in the haplotype indicates chromosomes where the FRAXAC2 allele is not known (i.e. 7-3-X).
To further examine the ATL1 status of chromosomes with larger repeat tracts (Table 4), we tested the 34 intermediate allele and premutation-sized chromosomes mentioned previously (see Materials and Methods). This set of chromosomes contained a large proportion of 2-1-3 haplotype samples, reflecting the tendency for repeats on the haplotype to accrue large 3′-end tracts after two 9+ units (14). We observed exclusive linkage of the ATL1-G allele to FRAXAC1 alleles 1 and 4 and thus to haplotypes considered predisposed to repeat expansion. The ATL1-A allele was only observed on 5/34 chromosomes and exclusively on the 7-3-4+ or 8-3-4+ haplotypes. Again, we saw a significant difference (Fisher's exact test, P < 0.005) between the two ATL1 alleles and the 9+n or 10+n repeat configurations. Of the 29 chromosomes with the ATL1-G allele, 28 (97%) had the 9+n repeat pattern. This pattern was seen on 1/5 of ATL1-A chromosomes, while the 10+n pattern was found on 3/5. These data suggest that the strong linkage disequilibrium of ATL1 among fragile X chromosomes may result from an association with predisposing risk factors leading to expansion of CGG repeats at the FMR1 locus, including flanking microsatellite haplotype and length and purity of the CGG repeat.
Origin of ATL1
Chromosomes with the ATL1-A allele appear to be remarkably homogeneous for both the microsatellite markers tested and the CGG interruption pattern, in contrast to those with the ATL1-G allele. Given this fact, we wished to examine two possible explanations: either the ATL1-G allele is the ancestral state or the G allele is associated with greater instability at surrounding repeat sequences. To characterize human chromosomes of advanced genetic age, we examined the ATL1 allele state in several isolated African populations, including 10 !Kung, 10 Khwe, three Mbuti pygmy and two Biaka pygmy samples (all males). All of these samples but one Khwe male, or 96%, had the ATL1-G allele (data not shown). In addition, 36 normal repeat length X chromosomes known to be African-American in origin were tested and 72% of these had the G allele. In this open population, the A allele is completely linked to AC1 allele 3 and 78% of the A allele chromosomes were of the 7-3-X haplotype (data not shown). To further determine the ancestral ATL1 allele, we tested the ATL1 status of nine male chimpanzees (Pan troglodytes, sequence of five samples shown in Fig.
DISCUSSION
We have identified a single nucleotide polymorphism in intron 1 of the human FMR1 gene. The two alleles of this marker, either an A or a G nucleotide, are not equally linked to chromosomes with expanded CGG repeats and, thus, fragile X syndrome. Expansion mutations appear to occur on both ATL1 backgrounds but are five times more commonly associated with the G allele. Previous reports have suggested that SNPs located extremely close to other repeat arrays actually may influence the stability of the repeat (28,29). However, as ATL1 is 5.6 kb away from the CGG repeat of FMR1, we believe this SNP is less likely to directly affect repeat stability. Thus, ATL1 is more likely linked to cis-acting sequences directly influencing stability. Accordingly, we have used the ATL1 marker to revisit all of the known risk factors identified with predisposition to CGG repeat expansion in FMR1: overall repeat length, haplotype at three flanking microsatellites and CGG repeat structure.
Repeat structure: position of first AGG interruption
Our sample set represents one of the largest groups of normal and fragile X chromosomes examined at the FMR1 locus thus far, particularly in such detail. We observed a range of repeat lengths and haplotypes quite similar to previous studies, including a notable difference in haplotype distribution between normal and fragile X chromosomes (reviewed in ref. 9). However, use of the ATL1 polymorphism has allowed us to further separate X chromosomes into related groups with apparently quite different rates of progression to repeat expansion. For example (Fig.
The sampled X chromosomes with the ATL1-A allele appear to have radiated from a common founder with the 7-3-4+ haplotype, as 66% of normal and 58% of fragile X ATL1-A chromosomes tested at all markers still maintain this haplotype. The remaining A allele samples appear to have mostly undergone small changes at the DXS548, FRAXAC2 and (much more rarely) FRAXAC1 loci. Normal African-American chromosomes also showed a tight linkage of the 7-3-X haplotype to ATL1-A. As the A allele was observed in one African bushman, the ATL1 mutation may pre-date the split of African and non-African populations, generally estimated at 100 000 years ago (30), or this single individual may carry an outbred rather than indigenous haplotype. Still, the significant under-representation of 7-3-X, ATL1-A chromosomes with the fully mutated CGG repeat implies either the presence of some protective cis-acting factor, which is itself in linkage disequilibrium with the A allele of ATL1, or the absence of a predisposing factor.
Figure 5. Distribution of first AGG interruption in normal X chromosomes. A total of 203 samples are grouped by ATL1 allele. Graphs indicate relative frequency within each ATL1 group of four different placements of the first AGG interruption within the CGG repeat of FMR1. 10+n denotes an AGG in the eleventh position of the array and 9+n denotes an AGG in the tenth position. Other indicates any other position of the first AGG and Pure indicates the presence of no AGGs in the CGG repeat.
In addition to ATL1, any cis-acting factor seems to also be significantly linked to the position of the first AGG interruption (Figs Pure repeat tracts in excess of 24 repeats have been hypothesized to lead to unstable tracts (10) and tracts with ≥34 repeats have been shown to be unstable upon transmission (18). If the 9+n configuration itself was also linked to unstable alleles, we would expect to find it over-represented on large normal and premutation chromosomes. Indeed, in our study we found a highly significant linkage between intermediate and premutation-sized alleles and 9+n repeat tracts (Fig. Previous studies have found the 10+n repeat pattern in almost all populations, including genetically closed populations (13,15). If the 9+n pattern is the ancestral state, as has been suggested (13), addition of a 5′ CGG to create the 10+n pattern must also have occurred at least 100 000 years ago (30). However, the 10+n pattern is significantly under-represented in unstable repeat arrays. Thus, we propose that the 10+n pattern itself may serve to actually stabilize the repeat array. If the 10+n structure were more stable, selective pressure might explain why this array is the most common in almost all populations tested (13,15), despite theoretically occurring later in evolution. 
This is the first suggestion that the 5′ end of the repeat tract might be an important factor for risk of expansion; at present, we cannot distinguish between simple linkage of the repeat pattern to other causative factors or an actual involvement of the 5′ end of the tract in repeat stability. Clearly, CGG tracts with other interspersion patterns, such as 8+n, 10+n, 11+n, etc. occur on both large normal and premutation chromosomes, but at vastly reduced rates. Here the utility of the ATL1 marker becomes evident: within a group of chromosomes with the same microsatellite haplotype and similar CGG repeat lengths, the ATL1 allele state closely corresponds to the position of the first AGG interruption, particularly the ATL1-A allele to the 10+n pattern. This is most clearly seen on the 2-3, 6-3 and 7-3 haplotypes in Figure
Haplotype associations and mechanistic implications of ATL1
The majority (70-80%) of fully mutated fragile X alleles has been shown to be linked to three predominant DXS548-FRAXAC1 marker haplotypes (76% in this study). This leaves 20-30% of expanded fragile X alleles linked to `rare' haplotypes. Several previous reports have suggested that general instability at repeat loci could be responsible for mutating not only the CGG repeat but also the flanking dinucleotide markers to produce these rare haplotypes (25). There is, however, limited evidence for this hypothesis and, in such a case, the use of a haplotype of single nucleotide polymorphisms could be quite valuable. As these biallelic markers would presumably not be affected by general instability, they could distinguish founder chromosomes which have over time radiated into a number of small haplotypes surrounding the predominant founder haplotype from chromosomes which experienced a sudden change at most or all of the repeat markers at the FMR1 locus. Our data from the ATL1 polymorphism alone seem to point to the former case. For example, although the ATL1-A allele is observed associated with the AC1-3 allele in 323 chromosomes out of 340, 88% of the remaining ATL1-A chromosomes carry FRAXAC1 alleles differing by a single dinucleotide from the AC1-3 allele (i.e. AC1-2 and AC1-4).
In our samples, we found only fragile X chromosomes and no normal chromosomes on the following haplotypes: 2-4-5; 3-3-3; 3-4-5; 4-4-5; 5-1-5; 5-4-3; 5-4-4. All of these chromosomes had the ATL1-G allele and carry FRAXAC1 and FRAXAC2 alleles exclusively or almost exclusively linked to ATL1-G on all chromosomes studied (Table 3). Thus, the `rare' nature of these haplotypes derives mostly from the DXS548 allele on these chromosomes. Therefore, instead of generalized microsatellite instability in the FMR1 region causing rare haplotypes, specific instability of DXS548 alleles may occur or recombination between DXS548 and FRAXAC1 could reshuffle haplotypes during the progression of N to S to Z alleles. Indeed, recombination between DXS548 and the CGG repeat upon transmission has been reported in several fragile X pedigrees (31,32), although it is unclear if recombination in this 150 kb interval is elevated in general or enhanced in combination with lengthy CGG repeats.
The data reported above suggest that the most stable markers in the FMR1 region are the first AGG interruption in the CGG repeat itself and SNPs like ATL1. These markers reveal nearly complete distinctions between groups of X chromosomes demonstrating unequal frequencies of CGG repeat expansion (Fig.
MATERIALS AND METHODS
Sample populations
Caucasian DNA samples from the normal repeat length population were composed of two groups, which were subsequently combined as their CGG repeat lengths and DXS548-FRAXAC1 haplotype distributions were not significantly different. The ethnicity of the samples was self-reported as Caucasian at the time of sampling. Of the samples, 360 were random male samples from an ongoing survey of the FMR1 gene among students in special education classes in the five county metropolitan area of Atlanta, as previously described (33); 158 were from male blood donors in Wessex, UK and have also been previously described (24). For the AGG interruption analysis, the UK chromosomes were added to a group of 46 US Caucasian males previously studied (10,15). A collection of 34 intermediate allele and premutation-sized chromosomes from Wessex, UK was also examined, but as this group was not a random collection, we could not add them into the `normal' sample set without significantly altering the repeat and haplotype distributions. Fragile X samples were also composed of two groups: 67 were US Caucasian males previously studied (10) and 85 were of similar background ascertained from the Southeastern USA. The African !Kung and Khwe male samples were generously provided by Dr Douglas C. Wallace (34). The African pygmy male samples are three Mbuti (NA10495A, NA10492 and NA10494) and two Biaka (NA10469A and NA10470), obtained from the NIGMS Human Genetic Mutant Cell Repository. African-American samples were also collected from special education classes in metropolitan Atlanta (33). The nine male chimpanzee DNAs, five of which have been previously studied (16), were collected from the Yerkes Primate Center.
SSCP analysis and DNA sequencing
Primer sets amplifying products of ~250 bp were selected in and around the FMR1 gene. Reaction conditions for each set were optimized with the PCR Optimizer Kit (Invitrogen). Aliquots of 1 pmol each γ-32P-labeled primer and 20 pmol each unlabeled primer, the appropriate buffer, 200 µM dNTPs and a 3:1 mix of Taq to Pfu (Stratagene) polymerases were used to amplify 100-200 ng genomic DNA. A panel of 20-30 normal samples and 15-20 fragile X samples of various DXS548-FRAXAC1 haplotypes was tested for each primer set. After addition of formamide loading buffer, reactions were denatured at 95°C and 2-4 µl were loaded on gels made of 0.5× MDE gel solution and 0.6× TBE. Gels were run at 15-25 W in 0.6× TBE buffer for 6-8 h.
Shifted samples and several non-shifted controls were reamplified without radioactivity and products were cloned with the TA Cloning Kit (Invitrogen). Multiple clones from separate PCRs for each sample were sequenced with vector primers on the ABI373 automated sequencer and sequences were aligned using the GeneWorks program (Oxford).
Polymorphism detection
The novel single nucleotide polymorphism was characterized essentially according to the protocol of Handelin and Shuber (35). The region containing the marker was amplified using primers ATL1-F (CCC TGA TGA AGA ACT TGT ATC TC) and ATL1-R (GAA ATT ACA CAC ATA GGT GGC ACT). Approximately 100 ng genomic DNA was added to a reaction of 1× Gene-Amp PCR Buffer II (Perkin-Elmer), 1.5 mM MgCl2, 160 µM dNTPs, 30 pmol each primer, 1 U Taq and 0.3 U Pfu polymerases. Reactions were heated at 94°C for 2 min, then amplified for 30 cycles of 94°C for 15 s, 62°C for 30 s, 72°C for 1 min, followed by 10 min at 72°C and a 4°C soak. Following PCR, denaturation buffer was added to the reactions and half of each was blotted onto duplicate Hybond N+ membranes (Amersham).
Probes ATL1-A (AAA TGT TTT TGC ATT TG) and ATL1-G (AAA TGC TTT TGC ATT TG) were labeled with [γ-32P]ATP and T4 polynucleotide kinase. Each probe was added to 20 ml hybridization solution containing TMAC (35) and blots were exposed to the probe in solution for 2.5 h at 52°C. Blots were washed twice with TMAC wash solution (35) for 20 min at 52°C and either exposed to a bio-imaging screen (Fuji) for 1 h or to film overnight at -70°C.
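The two allele-specific probes differ at a single base, which the short sketch below locates by direct comparison of the sequences as printed above. The helper names are illustrative. The reading that the probe names (A and G) refer to the base on the strand complementary to the probe is an inference from the sequences, not something the text states.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Probe sequences as printed in the text, with spaces removed.
PROBES = {
    "ATL1-A": "AAATGTTTTTGCATTTG",
    "ATL1-G": "AAATGCTTTTGCATTTG",
}


def mismatch_positions(seq_a, seq_b):
    """Return 0-based positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


if __name__ == "__main__":
    sites = mismatch_positions(PROBES["ATL1-A"], PROBES["ATL1-G"])
    print("differing positions (0-based):", sites)
    for name, seq in PROBES.items():
        base = seq[sites[0]]
        # Assumption: the probe name reflects the base on the target strand.
        print(name, "probe base:", base, "target-strand base:", COMPLEMENT[base])
```

Both probes are 17 nucleotides long and differ only at the sixth base (T versus C), whose complements are A and G.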
Haplotype analysis
For a fraction of the samples, haplotypes had previously been determined (10,14,15,24). Allele state at the DXS548 microsatellite was determined either alone by radioactive methods previously described (19) or multiplexed with the FRAXAC1 and FRAXAC2 microsatellites in a fluorescent method (36). The allele numbering of Eichler et al. (14) was used for all three microsatellites; in addition, one sample was found to contain a 208 bp DXS548 allele, which was designated 0, and a 160 bp FRAXAC2 allele was designated 0.5 or 0+. FRAXA allele size was also determined by a fluorescent method as previously described (33,36). AGG interspersion analysis had been completed previously on 203 samples (10,14,15).
Statistical analysis
Distributions of alleles were compared using standard χ² analysis where possible. However, if the expected frequency was <5 for any cell, the StatXAct-3 program (Cytel Software) was used to calculate the Fisher-Freeman-Halton exact test (designated as Fisher's exact test). Because of the large sizes of the data sets, the Monte Carlo method of repeated sampling was used for this calculation.
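The decision rule above can be sketched as follows. This is not the StatXAct-3 implementation: it is a minimal standard-library illustration that checks the expected counts of a contingency table and, when any is <5, estimates a Fisher-Freeman-Halton p-value by Monte Carlo sampling of tables with the observed margins. All function names, the add-one p-value estimator and the example counts are assumptions made for illustration; the counts are not data from this study.

```python
import math
import random


def expected_counts(table):
    """Expected cell counts under independence (row total * column total / n)."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    return [[r * c / n for c in cols] for r in rows]


def needs_exact_test(table, threshold=5.0):
    """True if any expected count falls below the threshold."""
    return any(e < threshold for row in expected_counts(table) for e in row)


def chi_square_statistic(table):
    """Pearson chi-square statistic; assumes no empty row or column."""
    exp = expected_counts(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, exp)
               for o, e in zip(obs_row, exp_row))


def log_table_probability(table):
    """Log probability of a table given its margins (multivariate hypergeometric)."""
    def log_fact(k):
        return math.lgamma(k + 1)
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    return (sum(log_fact(r) for r in rows) + sum(log_fact(c) for c in cols)
            - log_fact(n) - sum(log_fact(x) for row in table for x in row))


def monte_carlo_exact_p(table, n_samples=10000, seed=0):
    """Monte Carlo estimate of the Fisher-Freeman-Halton p-value."""
    rng = random.Random(seed)
    n_rows, n_cols = len(table), len(table[0])
    # One (row, column) label pair per observation; shuffling the column
    # labels keeps both sets of margins fixed.
    row_labels = [i for i, row in enumerate(table) for _ in range(sum(row))]
    col_labels = [j for row in table for j, x in enumerate(row) for _ in range(x)]
    observed = log_table_probability(table)
    hits = 0
    for _ in range(n_samples):
        rng.shuffle(col_labels)
        sim = [[0] * n_cols for _ in range(n_rows)]
        for i, j in zip(row_labels, col_labels):
            sim[i][j] += 1
        # Count tables no more probable than the observed one.
        if log_table_probability(sim) <= observed + 1e-9:
            hits += 1
    return (hits + 1) / (n_samples + 1)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    example = [[12, 1, 0], [3, 9, 4]]
    if needs_exact_test(example):
        print("exact test, Monte Carlo p ~", monte_carlo_exact_p(example))
    else:
        print("chi-square statistic:", chi_square_statistic(example))
```

Ordering tables by their probability under fixed margins is what distinguishes the Fisher-Freeman-Halton test from a permutation test based on the χ² statistic; the precision of the estimate depends on the number of samples drawn.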
ACKNOWLEDGEMENTS
We would like to thank Dr Doug Wallace and Dr Jeanette J.A. Holden for generously providing samples, Dr Pat Jacobs and Dr Newton Morton for helpful comments and the Warren laboratory for technical advice, particularly Jane Iber and Aileen Kenneson. W.P. and C.B.K. are Associates and S.T.W. an Investigator for the Howard Hughes Medical Institute. This work was supported, in part, by NIH grants HD29909, HD20521 and HD35576.
REFERENCES
Last modification: 12 Nov 1998
Copyright©Oxford University Press, 1998.