The most common mutation causing Friedreich ataxia (FRDA), an autosomal recessive neurodegenerative disease, is the hyperexpansion of a polymorphic GAA triplet repeat localized within an Alu sequence (GAA-Alu) in the first intron of the frataxin (X25) gene. GAA-Alu belongs to the AluSx subfamily and contains several polymorphisms in strong linkage disequilibrium either with a subgroup of normal alleles, or with hyperexpanded FRDA-associated alleles. GAA repeat sizes in 300 normal chromosomes (97 from carriers and 203 from controls) were distributed in two separate groups: 83% of them contained between six and 10 triplets (small normal alleles), while the remaining 17% had more than 12 triplets, up to 36 (large normal alleles). Sequence analysis showed that no normal, stable allele contained more than 27 uninterrupted GAA triplets. All longer normal alleles were interrupted by a hexanucleotide repeat (GAGGAA). An allele containing an uninterrupted run of 34 GAA triplets was stably transmitted in four instances, but in one case underwent hyperexpansion to 650 triplets. Overall, our results suggest that the FRDA-associated expanded GAA repeats originate from normal alleles by recurrent expansions of alleles at risk.
The most common DNA abnormality associated with Friedreich ataxia (FRDA; 1) is the expansion of a GAA triplet repeat polymorphism localized within an Alu sequence in the first intron of the gene encoding frataxin, on chromosome 9q13. It is found in 95% of all FRDA chromosomes, with the remaining ones carrying point mutations in the frataxin gene, including missense, nonsense and splice site mutations (1,2). The triplet repeat expansion in FRDA has three novel and so far unique features: it involves GAA trinucleotides (polypurine), is in an intron, and is associated with an autosomal recessive disease. Repeats in normal chromosomes are stable when transmitted from parent to offspring and of equal size in all tissues (1,3), while hyperexpanded, disease-associated repeats show meiotic as well as mitotic instability (3-5). Analysis of FRDA families (1,4) has shown that maternally transmitted expansions contract or expand with equal frequency, while paternal transmission almost always results in size contraction (4). Mitotic instability has been demonstrated in different cell types from the same patient (5). In particular, most brain regions show a very complex pattern of allele sizes, indicating extensive heterogeneity (3).
Hyperexpansion of the GAA repeat leads to suppression of frataxin gene expression, probably through a directional blockade of transcriptional elongation resulting from the formation of a non-B DNA structure (6). Such a loss-of-function pathogenetic mechanism fits the model for a recessive disease. Furthermore, the residual amount of frataxin mRNA and protein is inversely proportional to expansion sizes (Campuzano et al., submitted). This graded effect provides a biological basis for the correlation between expansion sizes and phenotypic features, including age at onset as well as severity and extent of disease involvement, that has been determined in several studies (5,7,8).
The underlying process by which hyperexpansion of normal alleles occurs in FRDA remains unknown. The finding of strong linkage disequilibrium between FRDA and neighboring markers (9-11) suggests either that a few ancestral events, possibly even a single one, gave rise to the FRDA expansion, or that there are recurrent expansions in alleles at risk (12). The second hypothesis is supported by the observation of a de novo hyperexpansion of a premutated FRDA allele, which we report here. In addition, to obtain clues about possible mechanisms responsible for the expansions and polymorphisms of normal alleles, we describe the characteristics of the Alu sequence containing the GAA triplet repeat polymorphism, and the size distribution and structure of normal alleles.
Sequence analysis revealed that the GAA repeat is localized approximately in the middle of the Alu sequence that contains it (GAA-Alu) and is preceded by an A6TACA16 sequence, a degenerate form of the canonical A5TACA6 sequence linking the two halves of Alu repeats (13; Fig. 1). GAA-Alu is flanked by a 13 bp perfect direct repeat (AAAATGGATTTCC), underlined in Figure 1. When GAA-Alu was used to perform a BLASTN search (14) on REPBASE (Alu Repeat database), the best score (89% identity) was obtained with the human AluSx subfamily consensus sequence (15).
We determined the GAA repeat size in 300 normal alleles (Fig. 2). These alleles were stably transmitted from parent to child, as no length variation was detected in 66 examined meioses (Table 1). The majority of normal alleles (82.5%) contained between six and 10 GAA units, with a peak at nine triplets (49%), and were assigned to a `small normal' group. The remaining alleles (17.5%) formed the `large normal' group. Large normal alleles contained >12 triplets and had a uniform size distribution. The largest alleles in this group had sizes corresponding to 34, 35 and 36 triplets (Table 2). There was no statistically significant difference in the size distribution of normal GAA alleles between healthy carriers from FRDA families (97 chromosomes) and control subjects (203 chromosomes).
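The grouping described above can be written as a small Python sketch. The function names, the "unclassified" label for sizes that fall in neither group, and the example list of allele sizes are illustrative assumptions, not part of the study; the upper bound of 36 simply reflects the largest normal allele reported here.

```python
from collections import Counter

# Size ranges taken from the text: small normal = 6-10 triplets,
# large normal = more than 12 triplets, up to the largest observed (36).
SMALL_NORMAL = range(6, 11)
LARGEST_OBSERVED_NORMAL = 36  # assumption: used only as an upper cutoff


def classify_normal_allele(n_triplets: int) -> str:
    """Assign a GAA allele size to the groups used in the text."""
    if n_triplets in SMALL_NORMAL:
        return "small normal"
    if 12 < n_triplets <= LARGEST_OBSERVED_NORMAL:
        return "large normal"
    # Sizes outside both ranges are not assigned to a group in the text.
    return "unclassified"


def summarize(allele_sizes):
    """Count how many alleles fall into each group."""
    return dict(Counter(classify_normal_allele(n) for n in allele_sizes))


if __name__ == "__main__":
    hypothetical_sizes = [9, 9, 8, 21, 34]  # made-up example values
    print(summarize(hypothetical_sizes))
```

Expanded, disease-associated alleles are deliberately left outside this classification, since the sketch only covers the normal range.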
Sequence analysis of seven normal alleles whose length ranged between 21 and 36 triplets revealed a pure GAA repeat only in those containing 27 or fewer triplets (Table 2). The three longest alleles, corresponding to 34, 35 and 36 triplets, were interrupted by between five and nine tandemly arranged GAGGAA hexanucleotide units localized in the 3' half of the repeat and followed by four GAA triplets. These alleles were found in individuals of different ethnic backgrounds (French-Acadian, Italian, and Anglo-Saxon). No interruptions were seen in the small normal group.
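As an illustration of how a sequenced repeat tract could be checked for interruptions, the following Python sketch reports the longest uninterrupted GAA run and the longest tandem array of GAGGAA units. The function names and the example tract are hypothetical; the tract is built only to match the general structure described above (pure GAA triplets, then GAGGAA units, then four GAA triplets) and is not a sequence reported in this study.

```python
import re


def longest_pure_gaa_run(tract: str) -> int:
    """Length, in triplets, of the longest run of consecutive GAA units."""
    runs = re.findall(r"(?:GAA)+", tract.upper())
    return max((len(run) // 3 for run in runs), default=0)


def longest_gaggaa_array(tract: str) -> int:
    """Number of units in the longest tandem array of GAGGAA hexanucleotides."""
    arrays = re.findall(r"(?:GAGGAA)+", tract.upper())
    return max((len(arr) // 6 for arr in arrays), default=0)


def is_interrupted(tract: str) -> bool:
    """True if the tract contains at least one GAGGAA unit."""
    return longest_gaggaa_array(tract) > 0


if __name__ == "__main__":
    # Hypothetical 102 bp tract (34 triplet-equivalents):
    # 20 pure GAA + 5 GAGGAA units + 4 GAA.
    example = "GAA" * 20 + "GAGGAA" * 5 + "GAA" * 4
    print(len(example) // 3, longest_pure_gaa_run(example),
          longest_gaggaa_array(example), is_interrupted(example))
```

Note that the GAA at the end of each GAGGAA unit is itself a GAA triplet, so the last unit joins the four trailing triplets into a short run of five; the longest pure run is still the 5' block.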
An individual with typical FRDA carrying two expanded GAA repeats, both containing ~650 GAA triplets, was identified by PCR. The patient's father carried a normal allele with 21 GAA triplets, as shown by sequencing, and an expanded one with ~1050 triplets, as determined by PCR. Sequencing also revealed that the patient's mother carried a smaller allele with nine GAA triplets and a larger one containing an uninterrupted run of 34 GAA triplets. Sequence analysis of additional family members revealed that the (GAA)34 allele had been transmitted to the patient's mother from her father. The same allele was found in her brother and in two unaffected sibs of the patient. Haplotype analysis using three microsatellites spanning the FRDA region confirmed that one chromosome carrying ~650 repeats in the patient corresponds to the maternal chromosome carrying 34 repeats, thus excluding paternal isodisomy as the cause of homozygous expansion in this individual. Therefore, in four out of five instances of parent-to-child transmission this allele remained stable, and in one it underwent hyperexpansion.
Alu sequences are a heterogeneous group of primate-specific interspersed repetitive DNA elements with an estimated frequency of 500 000 to 1 million copies per genome. They may serve as functional pol III genes and are probably derived from 7SL genes. Their pervasiveness and variability are the result of constant amplification and retrotransposon-mediated reinsertion throughout the genome over 65 million years of primate evolution (15). Despite their diversity, Alu sequences can be grouped into subfamilies whose members share a few common diagnostic base changes. By comparing differences between these sequences, Alu elements can be used as molecular clocks to estimate the age of a particular subfamily or member of a subfamily. We analyzed the Alu sequence containing the GAA repeat associated with FRDA (GAA-Alu) and assigned it to the human AluSx subfamily. Identity between GAA-Alu and the AluSx consensus sequence was 89%, in agreement with the overall 92 ± 3% identity between individual AluSx subfamily sequences and the consensus sequence. According to similarity calculations, the average age of the AluSx subfamily has been estimated at 37 million years (15). The FRDA-associated GAA repeat lies in the middle of GAA-Alu, preceded by a stretch of 16 As, apparently derived from an expansion of the canonical A5TACA6 sequence linking the two halves of Alu sequences. GAA-Alu is flanked by a 13 bp perfect direct repeat (AAAATGGATTTCC), suggesting a recent Alu retroposition/insertion event, an idea supported by the estimated age of the AluSx subfamily (17).
Analysis of the length polymorphism of the FRDA-associated GAA repeat in normal alleles suggests that it was generated by two types of events. Small changes, plus or minus one trinucleotide, may have caused size heterogeneity within the `small normal' and the `large normal' groups (Fig. 3). Such small changes were likely to be the consequence of occasional events of polymerase `stuttering' during DNA replication, i.e., slippage followed by mis-realignment of the newly synthesized strand by one or, rarely, a few repeat units (18). This basic polymorphism-generating mechanism has been postulated for all simple-sequence repeats (19). By comparison, the jump from the `small normal' to the `large normal' group was probably a rare or singular event (Fig. 3). Preliminary linkage disequilibrium results revealing the association of different marker haplotypes with `small normal' and `large normal' alleles (20) have been reinforced by our findings with a polymorphism (Alu VpA) involving the deletion of two TAA tandem repeats in the poly(A) tail at the 3' end of GAA-Alu. This polymorphism is eight times more frequent in normal than in FRDA chromosomes and is in complete linkage disequilibrium with normal alleles carrying eight GAA repeats. It is hard to speculate about the mechanism leading to a sudden doubling of the repeat size. Unequal sister-chromatid exchange and gene conversion have been proposed as generators of variability in VNTRs (19), but additional data are needed to test these hypotheses, as well as alternative ones such as the occurrence of an exceptionally large slippage event. Regardless of the mechanism, it seems likely that after a number of small increases in size because of slippage events, `large normal' alleles eventually reach the threshold for instability and undergo hyperexpansion (Fig. 3). This is again in agreement with preliminary linkage disequilibrium data which show that the marker haplotypes associated with FRDA chromosomes are also associated with some `large normal' alleles (20).
Patient P.C. is an individual with a clinical diagnosis of FRDA, who was referred to us for molecular analysis. This patient has the typical features of the disease, including relentlessly progressive ataxia with onset at age 15, tendon areflexia, loss of vibration and position sense, kyphoscoliosis, pes cavus, hypertrophic cardiomyopathy, and neurophysiologic evidence of axonal sensory neuropathy.
Venous blood (5-10 ml) was obtained for DNA analysis, and DNA was extracted from peripheral blood lymphocytes using the QIAamp Tissue Kit (Qiagen), following the manufacturer's recommendations. In order to detect expanded repeats, the portion of intron 1 of the frataxin gene containing the GAA triplet repeat was amplified using primers `Bam' and `2500F' and the GeneAmp XL PCR kit (Perkin Elmer), as described. Amplification products were electrophoresed on a 0.8% agarose gel and visualized by ethidium bromide staining. The size of PCR products was determined by comparing their migration rate with a molecular weight standard (1 kb ladder, Gibco-BRL), and the number of triplets was estimated as (s - 1500)/3 (where s = size in bp of the PCR product), rounding the result to the nearest multiple of 50. Alleles in the normal range were more accurately sized by amplification using either the GAA-F/GAA-R primers (1) or the following primer pair:
GAA-147F, 5'-GAAGAAACTTTGGGATTGGTTGC-3'
GAA-602R, 5'-TTTTCCAGAGATGCTGGGAAATC-3'
Both primer pairs flank the GAA repeat and generate a PCR product of 451 + 3n bp (GAA-F/GAA-R) or 430 + 3n bp (GAA-147F/GAA-602R), where n = number of GAA triplets. To analyze GAA repeat length of normal alleles and to detect other possible sequence variations, PCR products were digested either with BstNI (GAA-F/GAA-R amplification products) or with MspI (GAA-147F/GAA-602R amplification products) to generate smaller fragments. Digested PCR products were separated by electrophoresis through a 6% denaturing polyacrylamide gel. The GAA repeat number was obtained by subtracting 191 bp from the length of the BstNI restriction fragment containing the GAA repeat, or 92 bp from the length of the corresponding MspI restriction fragment, and dividing by 3. The presence of interruptions containing the GAGG sequence in the run of GAA triplets was revealed by digestion with the restriction enzyme MnlI, which cuts DNA 6 bp 5' to a GAGG sequence. Digestion of GAA-F/GAA-R PCR products with MnlI generates five invariant fragments of 168, 86, 32, 24 and 6 bp, and a fragment of 137 + 3n bp, which is cut by the enzyme when it contains GAGGAA repeat units. Size estimates were made either by comparison with a pUC18 sequence ladder (Silver Sequence DNA sequencing system, Promega), or with a set of reference alleles whose length had been determined by sequencing.
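The sizing arithmetic used in this section can be summarized in a short Python sketch. The constants (1500, 451, 430, 191, 92 and 137 bp) are those given in the text; the function names, the dictionary layout, the half-up rounding rule, and the decision to raise an error when a fragment length is not a whole number of triplets are illustrative assumptions.

```python
import math

# Offsets in bp, taken from the text.
PRODUCT_BASE_BP = {"GAA-F/GAA-R": 451, "GAA-147F/GAA-602R": 430}
FRAGMENT_OFFSET_BP = {"BstNI": 191, "MspI": 92}
MNLI_VARIABLE_BASE_BP = 137


def expanded_triplets(product_bp: float, flank_bp: int = 1500, step: int = 50) -> int:
    """Estimate triplets in an expanded allele as (s - 1500)/3,
    rounded to the nearest multiple of 50 (ties rounded up: an assumption)."""
    raw = (product_bp - flank_bp) / 3
    return int(math.floor(raw / step + 0.5) * step)


def normal_product_size(n_triplets: int, primer_pair: str) -> int:
    """Expected PCR product size for a normal allele with n triplets."""
    return PRODUCT_BASE_BP[primer_pair] + 3 * n_triplets


def triplets_from_fragment(fragment_bp: int, enzyme: str) -> int:
    """Repeat number from the BstNI or MspI fragment carrying the repeat."""
    n, remainder = divmod(fragment_bp - FRAGMENT_OFFSET_BP[enzyme], 3)
    if remainder:
        raise ValueError("fragment length is not a whole number of triplets")
    return n


def mnli_variable_fragment(n_triplets: int) -> int:
    """Size of the uncut MnlI fragment (137 + 3n bp) from GAA-F/GAA-R products."""
    return MNLI_VARIABLE_BASE_BP + 3 * n_triplets


if __name__ == "__main__":
    print(expanded_triplets(3450))                 # product of 3450 bp
    print(normal_product_size(9, "GAA-F/GAA-R"))   # nine-triplet allele
    print(triplets_from_fragment(218, "BstNI"))    # 218 bp BstNI fragment
    print(mnli_variable_fragment(34))              # uninterrupted 34-triplet allele
```

An MnlI fragment shorter than the value returned by `mnli_variable_fragment` for the sized allele would, under the scheme described above, point to a GAGG-containing interruption.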
GAA-F/GAA-R amplification products were purified from NuSieve gels using the QIAquick Gel Extraction Kit (Qiagen), and directly sequenced using the Sequenase PCR Product Sequencing Kit (USB). Because of the presence of a poly-A stretch preceding the GAA repeat, which interferes with the sequencing reaction, samples were sequenced on the opposite (CTT) strand, using the GAA-R primer.
All members of the family in which a new expansion of the GAA triplet repeat was observed were typed with the microsatellite markers D9S1844, D9S273 and D9S1799. Genotyping was performed by PCR amplification with an end-labeled and a cold primer, followed by separation of the products on 6% polyacrylamide sequencing gels and detection by autoradiography. The map distances between these markers are: D9S1844-1.1 cM-D9S273-0.5 cM-D9S1799, with the FRDA locus localized between D9S1844 and D9S273.
All statistical calculations were performed by using the SPSS/PC+ computer package.
We thank S.Jiralerspong for suggestions and editing of the manuscript. This work was supported by grants from the National Institutes of Health (NS34192, M.P.), the Muscular Dystrophy Association, USA (M.P.), the Italian Telethon Grant No. 722 to S.C. and by grants from the Ministries of Health and the University. We thank Drs K.Ohshima and R.D.Wells for useful discussions and suggestions. We are grateful to the families involved in this study for their participation and encouragement.
*To whom correspondence should be addressed at: Centre de Recherche Louis-Charles Simard, Pavillon De Sève, 1560 Sherbrooke Est, Montréal, Québec H2L 4M1, Canada. Tel: +1 514 281 6000; Fax: +1 514 896 4762; Email: pandolm@magellan.umontreal.ca
+These authors contributed equally to this work.
Human Molecular Genetics
This page is maintained by OUP admin. Last updated Tue Jul 15 11:13:42 BST 1997. Part of the OUP Journals World Wide Web service. Copyright Oxford University Press, 1996


