Primate origin of the CMT1A-REP repeat and analysis of a putative transposon-associated recombinational hotspot
Hidenori Kiyosawa and Phillip F. Chance*
Division of Neurology, The Children's Hospital of Philadelphia and Departments of Neurology and Pediatrics, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania 19104, USA
Received February 14, 1996; Revised and Accepted March 20, 1996
The CMT1A-REP repeat on chromosome 17p11.2-12 is proposed to mediate misalignment and meiotic unequal crossover, leading to a 1.5 Mb duplication associated with Charcot-Marie-Tooth neuropathy type 1A (CMT1A) and a reciprocal deletion associated with hereditary neuropathy with liability to pressure palsies (HNPP). Restriction endonuclease mapping indicated that the size of the CMT1A-REP repeat is approximately 24 kb, and DNA sequence analysis determined that the repeat is flanked by inverted Alu sequences. Full length Alu sequences are present at the centromeric ends of the proximal and distal CMT1A-REP repeats and at the telomeric end of the distal repeat. A truncated Alu sequence is present at the telomeric end of the proximal repeat, suggesting that the distal CMT1A-REP repeat is the progenitor copy. The crossover breakpoints for a series of unrelated CMT1A and HNPP patients were mapped using a variant SacI site found only in the proximal CMT1A-REP repeat. Seventy-six percent (66/85) of patients had breakpoints which mapped to a 3.2 kb interval, providing further evidence for a recombinational hotspot within the CMT1A-REP repeat. A mariner-like element was mapped within the CMT1A-REP repeat approximately 700 bp centromeric to the 3.2 kb interval containing the hotspot. Analysis of this sequence suggested that it does not encode a functional transposon. By Northern blot analysis, a cloned fragment from the CMT1A-REP repeat containing the mariner-like sequence detected a 2.2 kb transcript only in testis. Two cDNA clones which contain the mariner-like element were isolated from a human testis cDNA library. These clones, which are interrupted by Alu and other repeats, appear to be non-functional versions of the transposon. The functional relationship of the mariner-like element to the recombinational hotspot remains unknown. The origin of the CMT1A-REP repeat was investigated through an analysis of homologous sequences in non-human primates. Southern blot analysis indicated that the chimpanzee has two copies of a CMT1A-REP-like sequence, whereas gorilla, orangutan, and gibbon have a single copy. A high degree of conservation amongst non-human primates for restriction fragments specific to the human distal CMT1A-REP repeat provides further evidence that the distal repeat is the progenitor copy. The mariner-like sequence was detected in association with the CMT1A-REP sequence in all primates studied, suggesting that the mariner-like element was introduced into the progenitor CMT1A-REP sequence prior to emergence of the proximal and distal CMT1A-REP repeats. These observations suggest that the CMT1A-REP sequence appeared as a repeat before the divergence of chimpanzee and human, but after the divergence of gorilla and human around 6 to 7 million years ago.
The CMT1A-REP repeat is a low copy number repeat located on human chromosome 17p11.2-12 and it flanks a 1.5 megabase pair (Mb) segment that is duplicated in Charcot-Marie-Tooth disease type 1A (CMT1A) and reciprocally deleted in hereditary neuropathy with liability to pressure palsies (HNPP) (1 -4 ). Misalignment and unequal crossover mediated by the CMT1A-REP repeat is a proposed mutational mechanism for CMT1A and HNPP (2 ,4 ,5 ). Normal individuals have two copies of the CMT1A-REP repeat on each chromosome 17, while three copies of the repeat are found on CMT1A chromosomes and only one copy of the repeat is present on HNPP chromosomes (4 ,5 ). Both disorders are autosomal dominant demyelinating peripheral neuropathies, yet their clinical and histological features are quite different (6 ). The peripheral myelin protein-22 gene (PMP22) maps within the duplicated or deleted region (7 -10 ), and it is proposed that altered expression of PMP22 through gene dosage leads to the demyelinating neuropathy phenotype (11 -13 ). Homogeneity for size of the duplication/deletion in unrelated patients (1 ,5 ,14 ) and the frequent occurrence of de novo duplication/deletion events (2 ,5 ,14 ,15 ), suggest that a precise, recurring meiotic mechanism accounts for generation of the duplication and the deletion.
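The copy-number bookkeeping of the proposed unequal crossover can be illustrated with a minimal Python sketch. The list-based representation of the chromosome, the element labels and the function names are assumptions introduced here for illustration only; the sketch does not model the molecular mechanism, only the outcome that one recombinant carries three repeats and the other carries one.

```python
# Toy model: the 17p11.2-12 region as an ordered list of elements,
# centromere to telomere. Labels are illustrative only.
NORMAL = ["REP_proximal", "segment_1.5Mb", "REP_distal"]


def unequal_crossover(chrom_a, chrom_b):
    """Recombine chrom_a and chrom_b after misalignment.

    The distal repeat of chrom_a is assumed to pair with the proximal
    repeat of chrom_b, and the exchange occurs within the paired repeats.
    Returns (duplication_product, deletion_product).
    """
    i = chrom_a.index("REP_distal")    # pairing partner on chrom_a
    j = chrom_b.index("REP_proximal")  # pairing partner on chrom_b
    hybrid = "REP_hybrid"              # recombinant repeat spanning the breakpoint
    duplication = chrom_a[:i] + [hybrid] + chrom_b[j + 1:]
    deletion = chrom_b[:j] + [hybrid] + chrom_a[i + 1:]
    return duplication, deletion


def count_repeats(chrom):
    """Count CMT1A-REP copies (any element whose label starts with 'REP')."""
    return sum(1 for element in chrom if element.startswith("REP"))


dup, dele = unequal_crossover(NORMAL, NORMAL)
print("repeats:", count_repeats(NORMAL), count_repeats(dup), count_repeats(dele))
print("1.5 Mb segments:", dup.count("segment_1.5Mb"), dele.count("segment_1.5Mb"))
```

In this toy model the duplication product carries two copies of the 1.5 Mb segment and the deletion product carries none, which corresponds to the CMT1A and HNPP chromosomes described above.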
The CMT1A-REP repeat was initially estimated to span at least 17 kb (4), and subsequent analysis found it to be approximately 27 kb (16). Sequence analysis of a 5 kb portion from the middle of the repeat found >98% sequence identity between the proximal and distal CMT1A-REP repeats, and restriction endonuclease mapping suggested that the repeats are homologous along their entire length (16).
We have previously constructed a detailed physical map of the CMT1A-REP repeat regions and shown that all crossover breakpoints in a series of CMT1A and HNPP patients occurred within the CMT1A-REP repeat (16 ). Although crossovers mediated by the CMT1A-REP could occur at any region within the 27 kb repeat, our analysis indicated an apparent recombinational hotspot as 77% (40 of 52) of CMT1A and HNPP chromosomes contained breakpoints which mapped specifically within a 7.9 kb interval (16 ). The observation of a recombinational hotspot within the CMT1A-REP repeat has been independently confirmed (17 ) and a sequence having high identity to a transposable element (a mariner-like element) was found to map to the region of the hotspot (17 ). This element, termed MITE (Mariner insect transposon-like element), is proposed to mediate strand exchange events via cleavage by a transposase at or near the 3' end of the element.
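One way to see why 40 of 52 breakpoints in a 7.9 kb interval points to a hotspot is to compare that count with what a uniform distribution of crossovers along the 27 kb repeat would give. The following Python sketch does this with an exact binomial tail probability. The uniform null model and the function name are assumptions made here for illustration; this test is not part of the original analysis.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Figures quoted in the text.
breakpoints_in_interval = 40
chromosomes_total = 52
interval_kb = 7.9
repeat_kb = 27.0

observed_fraction = breakpoints_in_interval / chromosomes_total
# Assumption: under a uniform model, a breakpoint falls in the interval
# with probability equal to the interval's share of the repeat length.
expected_fraction = interval_kb / repeat_kb
p_value = binomial_upper_tail(
    breakpoints_in_interval, chromosomes_total, expected_fraction
)

print(f"observed fraction: {observed_fraction:.2f}")
print(f"expected fraction under uniform model: {expected_fraction:.2f}")
print(f"upper-tail probability: {p_value:.2e}")
```

The observed fraction is far above the interval's share of the repeat length, so under these assumptions the clustering is very unlikely to arise by chance.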
The high degree of DNA sequence homology (>98%) between the proximal and distal CMT1A-REP repeats hampered the identification of variant restriction enzyme sites needed to detect patient-specific, junctional fragments by routine Southern blot analysis. In this report, we present such novel-sized, patient-specific restriction fragments, utilizing variant sites specific to the proximal and distal CMT1A-REP repeats. To better understand the origin, evolution and nature of the CMT1A-REP repeat we extended DNA sequence analysis, searched for evidence supporting the role of a transposable element and determined copy number and hybridization patterns in a series of non-human primates.
Previously we showed that 77% of unrelated CMT1A and HNPP patients had crossover breakpoints which mapped within regions B and C, defining a 7.9 kb interval of CMT1A-REP (Fig. 1). A polymorphic HindIII site in the proximal repeat defines the B/C interface, but could not be used to further differentiate regions B and C (designated B/C) (16). As this region contains the majority of breakpoints, we extended DNA sequencing in the centromeric direction and identified a SacI site specific to the proximal CMT1A-REP repeat. This SacI site was utilized to subdivide region B/C into two regions (B1 and B2/C) and to detect the patient-specific, novel junctional fragments shown in Figure 2.
DNA sequence analysis was directed to region B1, as 76% of CMT1A and HNPP patients were found to have breakpoints mapping within this 3.2 kb interval. We detected a 1.4 kb segment having 77% sequence homology to a previously reported transposable element of the mariner family (Fig. 1; 18). This mariner-like element, which is located at the 3' (telomeric) end of region A, is flanked by 36 bp inverted repeats as described for other mariner-like elements (18,19). The mariner-like element located in the CMT1A-REP repeats appears to be non-functional, as numerous stop codons were found in the putative open reading frame (ORF). However, it remains possible that this non-functional mariner sequence is a target for a functional form of the transposon protein transcribed from a gene located elsewhere.
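The observation that a putative ORF is interrupted by stop codons can be checked mechanically. The Python sketch below scans each forward reading frame of a sequence for in-frame stop codons. The toy sequence and the function name are hypothetical and are not taken from the mariner-like element itself; the standard stop codons TAA, TAG and TGA are assumed.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_stops(sequence, frame=0):
    """Return 0-based start positions of stop codons in the given reading frame."""
    seq = sequence.upper()
    return [
        pos
        for pos in range(frame, len(seq) - 2, 3)
        if seq[pos:pos + 3] in STOP_CODONS
    ]


# Hypothetical toy sequence, not taken from the CMT1A-REP mariner-like element.
toy_orf = "ATGGCCTAAGGCTGACCCTAG"
for frame in range(3):
    print("frame", frame, "stop codon positions:", in_frame_stops(toy_orf, frame))
```

A frame with many internal stop positions, as in frame 0 of the toy sequence, cannot encode a full-length protein without frameshifts or read-through.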
To test the possibility that a mariner-like element is expressed in human tissues, we performed Northern blot hybridization on polyA+ RNA samples from various organs and tissues using probe pHK0.7D, which contains an internal 0.7 kb portion of the mariner-like element from the distal CMT1A-REP repeat. The results of this analysis are shown in Figure 4. In lane 4, the sample from testis had a visible band of 2.2 kb. There were no detectable bands in other lanes except very large transcripts of unknown origin (about 4.5 kb in lanes 2 and 7, and 7.0 kb in lanes 1-5 and 16).
Figure 4. Northern analysis of human mariner-like element. Northern membranes prepared from polyA+ RNA of various human tissues and organs were probed with pHK0.7D containing an internal 0.7 kb of the human mariner-like element from the distal CMT1A-REP repeat. Only the sample from testis had a visible band of 2.2 kb, consistent with the size of isolated cDNA clones from the testis cDNA library. Hybridization to β-actin is also shown as a control. Lane 1: spleen; lane 2: thymus; lane 3: prostate; lane 4: testis; lane 5: ovary; lane 6: small intestine; lane 7: colon (mucosal lining); lane 8: peripheral blood leukocyte; lane 9: heart; lane 10: brain; lane 11: placenta; lane 12: lung; lane 13: liver; lane 14: skeletal muscle; lane 15: kidney; lane 16: pancreas.
Previous studies have found that de novo duplication/deletion events are almost exclusively of paternal origin, suggesting that unequal crossover occurs at a much higher rate during spermatogenesis (5 ,20 ,21 ). The Northern analysis described above suggested that a mariner-like element is transcribed in testis. We searched for expressed mariner-like sequences in a human testis cDNA library using probe pHK1.8D that maps to the distal CMT1A-REP repeat and contains about 80% of the mariner-like element. Of three independent clones isolated, two proved to be identical by DNA sequence analysis. The inserts of two different clones (pcHMT1 and pcHMT2) were both 2.2 kb which is the same size as the mariner transcript described above. By sequence analysis neither of these clones appears to be derived from any previously reported mariner-like genomic sequences or from the mariner-like sequence which maps within the CMT1A-REP repeat. The sequence similarities among the various mariner-like elements detected in humans are summarized in Table 1 . As shown in Figure 5 , neither cDNA clone contains a complete transposon sequence. pcHMT1 has the transposon sequence in reversed orientation relative to transcriptional direction and lacks approximately 260 bp of the 5' end of the transposon sequence. There are three repetitive sequence elements within the transcript (Alu, SVA, and THE1, Fig. 5 ) (22 ). pcHMT2 has a proper orientation relative to transcriptional direction. This clone also has an Alu sequence insertion and lacks the 5' inverted repeat. Neither of these cDNA clones appears to encode a functional protein and their roles are unknown.
Table 1. Comparison of DNA sequence identity between human mariner-like elements
            CR-MLE   U38613   U38614   U38615   pcHMT1   pcHMT2
CR-MLE(1)   -        77%      79%      80%      75%      80%
U38613(2)   -        -        74%      73%      79%      80%
U38614(2)   -        -        -        68%      75%      76%
U38615(2)   -        -        -        -        76%      76%
pcHMT1(3)   -        -        -        -        -        76%
pcHMT2(3)   -        -        -        -        -        -
(1) Mariner-like element found in the proximal CMT1A-REP repeat.
(2) GenBank accession numbers for previously reported human mariner-like elements (18).
(3) Only the portions of cDNA clones containing a mariner-like element were compared with other mariner-like elements.
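Pairwise identity values of the kind reported for the mariner-like elements are derived from aligned sequences. The Python sketch below shows one common convention, counting matches over alignment columns in which neither sequence has a gap. The toy sequences are hypothetical, and the convention used by the original sequence analysis software may differ; this is an illustration, not a reconstruction of how the published values were obtained.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns (one convention among several)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical pre-aligned toy sequences, for illustration only.
seq_x = "ACGTTGCA-ACGT"
seq_y = "ACGATGCATACGA"
print(f"{percent_identity(seq_x, seq_y):.1f}% identity")
```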
Repeat elements often possess characteristic sequences at each end or in their flanking sequences. Though the CMT1A-REP repeat is extremely large (27 kb, an estimate before this study) as compared to other more widely dispersed or high-copy number repeats, it may be better understood as a DNA segmental duplication itself. To investigate the origin of the CMT1A-REP repeat we sequenced and analyzed the centromeric and telomeric ends of the proximal and distal CMT1A-REP repeats and their flanking sequences (Figs 1 and 6 ).
At the centromeric ends of the CMT1A-REP repeat, both the proximal and distal CMT1A-REP repeats have a full length Alu-Sx sequence, and the centromeric end of the Alu sequence coincides with the centromeric end of the CMT1A-REP repeat (Fig. 6A). Alu-J sequences were found at the telomeric ends of the proximal and distal CMT1A-REP repeats. A full length Alu-J sequence was present in the distal CMT1A-REP repeat (292 bp), whereas the Alu-J sequence was truncated in the proximal CMT1A-REP repeat (116 bp). The telomeric end of the CMT1A-REP repeat is defined by the truncation point of the Alu-J sequence in the proximal CMT1A-REP repeat (Fig. 6B). The existence of a full length Alu sequence at the 3' (telomeric) end of the distal CMT1A-REP repeat suggests that it might be the original (progenitor) sequence. The Alu sequences located at the two ends of each CMT1A-REP repeat are in inverted orientation relative to each other and thus serve as inverted repeats flanking the CMT1A-REP repeats.
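The notion of inverted orientation can be made concrete: two elements are in inverted orientation when one resembles the reverse complement of the other. The Python sketch below checks this for a perfect inverted pair. The toy flanking sequences and the function names are hypothetical; real Alu elements of different subfamilies have diverged in sequence, so a practical check would rely on alignment rather than exact equality.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def is_inverted_pair(left, right):
    """True if `right` is exactly the reverse complement of `left`."""
    return reverse_complement(left).upper() == right.upper()


# Hypothetical toy flanks, for illustration only.
left_flank = "GGCCGGGCGCGGT"
right_flank = reverse_complement(left_flank)
print(left_flank, right_flank, is_inverted_pair(left_flank, right_flank))
```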
Figure 5. Schematic presentation of human mariner cDNA clones. Two transcribed mariner-like sequences isolated from a human testis cDNA library using probe pHK1.8D from the distal CMT1A-REP repeat. Insert sizes (bp) of two different clones (pcHMT1 and pcHMT2) are shown. Neither cDNA clone contains a complete transposon sequence nor do they appear to encode a functional protein. Locations of the mariner sequences are shown. Unshaded areas indicate portions of mariner-like sequences present and shaded areas indicate absent portions relative to published human mariner sequences (17). pcHMT1 has a transposon sequence in reversed orientation relative to transcriptional direction and lacks approximately 260 bp of the 5' end of transposon sequence. Three repetitive sequence elements are found within this transcript (Alu-Sx, SVA, and THE1) (21). pcHMT2 has a proper orientation relative to transcriptional direction, but has an Alu sequence insertion and lacks the 5' inverted repeat.
We previously estimated the size of the CMT1A-REP repeat to be 27 kb based on restriction enzyme digestion and hybridization analysis (16 ). In this study, we identified the ends of the CMT1A-REP repeats at the DNA sequence level. Combined with data previously presented (16 ), the revised estimated size of the CMT1A-REP repeat unit is 24 kb. The region sequenced is summarized in Figure 1 . All regions sequenced were found to exhibit 98-99% identity between the proximal and distal CMT1A-REP repeats.
Figure 6. DNA sequence of the centromeric (5') and telomeric (3') ends of the proximal and distal CMT1A-REP repeats and their flanking sequences. Numbering system submitted to GenBank has been modified for simplicity. Alu sequences define the ends of the CMT1A-REP repeat and are shown in boxes. (A) Centromeric ends of the proximal and distal CMT1A-REP repeats. The centromeric ends of Alu sequences define the centromeric end of the CMT1A-REP repeats. (B) Telomeric ends of the proximal and distal CMT1A-REP repeats. The site of truncation for the Alu sequence within the proximal CMT1A-REP repeat defines the telomeric boundary of sequence identity between the proximal and distal repeats. This site marks the telomeric end of the CMT1A-REP repeat. The distal CMT1A-REP repeat has a full size continuous Alu sequence, almost two thirds of which is not a part of the CMT1A-REP repeat.
In order to understand how the CMT1A-REP repeat evolved we searched for homologous sequences and determined copy number in a series of non-human primates. The hybridization patterns using total genomic DNA and probes from the CMT1A-REP repeats were analyzed in the chimpanzee (Pan troglodytes), gorilla (Gorilla gorilla), orangutan (Pongo pygmaeus), and gibbon (Hylobates lar). Lack of hybridization to bovine, murine, rabbit, or Drosophila DNA was previously reported (4). The probes used were pHK1.85D and pHK1.0P for EcoRI digestion, and pHK5.2P for HindIII digestion (Fig. 1). The human fragments detected with these probes are shown in Figure 1. As shown in Figure 7A,B, all primates tested had sequences homologous to the CMT1A-REP repeat sequence. The sizes of fragments detected with probes pHK1.0P and pHK5.2P in the various non-human primates were similar to those found in humans, suggesting a high degree of conservation of DNA sequence and restriction site placement within the CMT1A-REP repeat. Initially, probe pHK1.85D which maps to the middle portion of the CMT1A-REP repeat was tested.
As shown in Figure 7 A, probe pHK1.85D detects two fragments in humans and the chimpanzee, while the gorilla, orangutan, and gibbon appear to have a single fragment. This observation suggested that the CMT1A-REP repeat might be present in two copies in the chimpanzee. To address the possibility of restriction site polymorphism in the chimpanzee, a second probe from another region of the CMT1A-REP repeat was tested. As seen with probe pHK1.85D, probe pHK1.0P which maps to the telomeric region of the CMT1A-REP repeat and detects fragments of 3.2 kb and 2.3 kb in humans, also detects two fragments of apparently the same size in the chimpanzee. A single 3.2 kb fragment is detected with probe pHK1.0P in the gorilla and gibbon while the orangutan has a larger sized fragment. The results obtained with these two probes which map to different regions of the CMT1A-REP repeat suggest that the extra fragments detected in the chimpanzee are due to the presence of two copies of the CMT1A-REP repeat, rather than restriction site polymorphisms in this region of the chimpanzee chromosome.
Figure 7. Southern blot analysis with human CMT1A-REP repeat probes in primate DNA. Sizes of the human bands are shown in kb. Lane 1, human; lane 2, chimpanzee; lane 3, gorilla; lane 4, orangutan; lane 5, gibbon. (A) DNA digested with EcoRI. Probes used are shown next to the lanes. Probe pHK1.85D maps to the middle portion of the CMT1A-REP repeat and probe pHK1.0P maps to the telomeric end of the CMT1A-REP repeat. Both probes detect two fragments in the human and chimpanzee samples and only one fragment in the gorilla, orangutan and gibbon. (B) DNA digested with HindIII, probed with pHK5.2P which maps to the middle of the CMT1A-REP repeat and is adjacent to probe pHK1.85D. Probe pHK5.2P detects a 12.7 kb fragment from the proximal repeat and a 14.2 kb fragment from the distal repeat as previously described (16). A single band of 14.2 kb is seen in the gorilla, orangutan and gibbon. A broad band consisting of two high molecular weight fragments visible by short exposure is seen in the chimpanzee.
To further explore the possibility that the CMT1A-REP repeat is present in two copies in the chimpanzee genome a third probe was tested. As diagrammed in Figure 1 and shown in Figure 7 B, probe pHK5.2P detects a 14.2 kb fragment from the distal CMT1A-REP repeat and a 12.7 kb fragment from the proximal CMT1A-REP repeat (16 ). All primates tested had a 14.2 kb fragment. The chimpanzee had an additional, smaller fragment as seen in humans. As shown in Figure 7 B, a broad band is seen in the chimpanzee which consisted of a 14.2 kb fragment and a slightly smaller fragment visible when exposure time was reduced (data not shown). Additionally, a 1.8 kb fragment which is characteristic of the distal CMT1A-REP repeat in humans was seen in all primate samples. We have previously reported that a polymorphic HindIII site is present in the proximal CMT1A-REP repeat at the B2/C interface resulting in either a 1.8 kb HindIII fragment or a 3.0 kb fragment (Figs 1 , 3 ; ref. 16 ). A non-polymorphic HindIII site is present at the same position in the distal CMT1A-REP repeat in all human populations (Fig. 1 ; ref. 16 ). The chimpanzee appears to have this HindIII site in both repeats as evidenced by detection of only the 1.8 kb fragment shown in lane 2 of Figure 7 B.
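The effect of a polymorphic restriction site on fragment size can be sketched in a few lines of Python. Fragment sizes are simply the distances between consecutive cut sites, so losing a site merges two neighboring fragments into one. The coordinates below are hypothetical and were chosen so that the middle site splits a 3.0 kb interval into 1.8 kb and 1.2 kb pieces; only the 1.8 kb and 3.0 kb sizes come from the text, and which fragment a given probe detects is not modelled.

```python
def fragment_sizes(cut_sites_kb):
    """Sizes (kb) of fragments between consecutive cut sites."""
    sites = sorted(cut_sites_kb)
    return [round(b - a, 3) for a, b in zip(sites, sites[1:])]


# Hypothetical site coordinates in kb, for illustration only.
with_site = [0.0, 1.8, 3.0]     # polymorphic HindIII site present
without_site = [0.0, 3.0]       # polymorphic HindIII site absent

print("site present:", fragment_sizes(with_site))
print("site absent:", fragment_sizes(without_site))
```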
Therefore, all three probes tested from the CMT1A-REP repeat consistently detected two fragments in humans and in the chimpanzee, while only one fragment was detected in the gorilla, orangutan and gibbon. Furthermore, probe pHK5.2P uniformly detected a 14.2 kb fragment in all primates which is characteristic of the distal CMT1A-REP repeat in humans (Fig. 1 ; ref. 16 ). Probe pHK1.0P detected a 3.2 kb fragment in the chimpanzee, gorilla and gibbon, which also maps to the distal CMT1A-REP repeat in humans (Fig. 1 ; ref. 16 ). While two fragments were detected with probe pHK1.85D in the chimpanzee, the fragment sizes found in the various non-human primates did not coincide with those seen in humans. These observations in primates with probes pHK1.0P and pHK5.2P suggest that the distal CMT1A-REP repeat is the progenitor copy.
We performed additional Southern blot analysis to determine if the mariner-like element is also contained within the CMT1A-REP-like sequence in non-human primates. Probe pHK0.7D, an internal 0.7 kb fragment of the mariner-like element from the distal CMT1A-REP, was hybridized to the same membrane used to generate an autoradiograph for Figure 7 A. A similar pattern of bands as presented in Figure 7 A with probe pHK1.85D was obtained with probe pHK0.7D (data not shown). We did not detect the 6.0 kb band from the distal CMT1A-REP in humans as this fragment does not contain the mariner-like element. Instead, a band of 1.8 kb from the distal repeat containing the mariner-like element in humans was seen (see Fig. 1 ). An additional band of 4.7 kb was seen in the chimpanzee suggesting that probe pHK0.7D spans an EcoRI site in this species. In the gorilla and gibbon we detected the same sized fragments as those presented in Figure 7 A. A 4.5 kb fragment was seen in the orangutan. These observations suggest that the mariner-like element is a part of the CMT1A-REP-like sequence in non-human primates and that insertion of the mariner-like element into this region occurred before duplication of the CMT1A-REP-like sequence in the chimpanzee.
Repeated sequences may contribute to the speciation process by enhancing chromosomal misalignment during meiosis, leading to genomic rearrangements which disrupt or change the expression of genes. In particular, large scale rearrangements such as those mediated by the CMT1A-REP repeats in CMT1A and HNPP could alter the expression of many genes at once, leading to marked phenotypic consequences.
We have previously described an apparent non-random clustering of crossover breakpoints within a 7.9 kb interval of the CMT1A-REP repeat, providing direct evidence that this sequence contains a recombinational hotspot leading to CMT1A and HNPP (16 ). We have confirmed this observation through further analysis of the distribution of crossover breakpoints in a larger series of CMT1A and HNPP patients using a variant SacI site located in the proximal CMT1A-REP repeat. Seventy-six percent of combined CMT1A and HNPP chromosomes had breakpoints within a 3.2 kb interval (region B1) providing further evidence for a recombinational hotspot. Our observations confirm the localization of a transposon sequence within the CMT1A-REP repeat having high sequence homology to a mariner element (17 ). We mapped this mariner-like element within region A of the CMT1A-REP repeat, approximately 700 bp centromeric to the 3.2 kb interval which contains the hotspot. Only 5% of the breakpoints map to region A (about 10 kb) that contains the mariner-like element and 1% of the breakpoints map to region B2/C that is adjacent to region B1 in the telomeric direction. Therefore, the recombinational hotspot does not map within the mariner-like sequence.
DNA sequence identity is very high between the proximal and the distal CMT1A-REP repeats. The 11 kb of continuous DNA segment sequenced in the middle portion of the CMT1A-REP repeats had 99% sequence identity between the proximal and distal repeats. The conservation of restriction sites suggests that a high degree of sequence identity likely exists throughout the entire repeated region. While crossovers could have potentially occurred within any region of the repeat, we observed an unequal distribution of breakpoints raising the possibility that a highly sequence-specific element may be involved. A further analysis of the distribution of the breakpoints by sequencing of the CMT1A-REP repeats in individual patients may define the recombinational hotspot at the nucleotide level.
Region D spans about 8 kb and harbors 18% of the total breakpoints. The breakpoint distribution within the region D requires further analysis to explore the possibility of an additional hotspot such as that found in region B1. Further DNA sequence analysis is necessary to identify restriction site differences useful to divide region D into smaller units.
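The unequal distribution of breakpoints becomes clearer when the percentages are normalized by region length. The Python sketch below computes the percentage of breakpoints per kb for each region, using the figures quoted in the text. The length used for region B2/C (7.9 kb minus 3.2 kb) is an assumption derived here from the reported interval sizes, and all lengths are approximate.

```python
# Percent of breakpoints per region (from the text) and approximate sizes.
# The B2/C length is an assumption: 7.9 kb (regions B and C) minus 3.2 kb (B1).
regions = {
    "A": {"percent": 5, "length_kb": 10.0},
    "B1": {"percent": 76, "length_kb": 3.2},
    "B2/C": {"percent": 1, "length_kb": 7.9 - 3.2},
    "D": {"percent": 18, "length_kb": 8.0},
}

density = {
    name: info["percent"] / info["length_kb"] for name, info in regions.items()
}

for name, value in sorted(density.items(), key=lambda item: item[1], reverse=True):
    print(f"region {name}: {value:.2f} % of breakpoints per kb")
```

Under these assumptions region B1 stands out by roughly an order of magnitude over region D, the next densest region, which is consistent with the suggestion that region D deserves finer subdivision before a second hotspot can be excluded.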
There are an estimated 100 copies of mariner-like sequences in the human genome, including one known to map within the ABL gene (19 ). The role, if any, of these sequences in humans is unknown and functional elements mapping to the CMT1A/HNPP region or elsewhere in the genome have yet to be identified (17 ,18 ). The presence of a mariner-like sequence in the CMT1A-REP repeat raises the possibility of a functional role for this widely distributed transposon sequence in the recombinational hotspot leading to CMT1A and HNPP (17 ). We isolated two cDNA clones containing portions of the mariner-like element which do not seem to encode a functional transposon protein as they also contain numerous repetitive sequences disrupting the transposon sequence. Among 16 various tissue RNA samples tested in this study, only the testis sample produced a visible band of 2.2 kb, the same size as the insert of both cDNA clones isolated. Even if these transcripts do not encode a functional protein, it is an interesting and unexplained coincidence that only testis had detectable expressed sequences containing the mariner-like element. The size of the transposon belonging to the mariner family is about 1250 bp (18 ,19 ). We did not detect any fragments by Northern blot analysis in this size range. Furthermore, we did not find evidence suggesting that the mariner-like element within the CMT1A-REP repeat is transcribed. As proposed earlier, it remains possible that the mariner-like sequence within CMT1A-REP repeat exists as a target site for a protein with transposase activity (17 ). A continued search in other cDNA libraries or the application of RT-PCR techniques using primers from conserved regions of the mariner-like element is necessary to further address the possibility of a functional, expressed transposon sequence.
How the CMT1A-REP repeats appeared is another interesting question. It was previously reported that CMT1A-REP-like sequences are not detected in bovine, murine, rabbit, or Drosophila genomes, yet a single copy of the repeat was found in an unspecified monkey species (4 ). We extended our analysis to include higher non-human primates and detected CMT1A-REP-like sequences in all non-human primates tested. Given the location of the CMT1A-REP repeat on chromosome 17 in humans, these homologous sequences likely map to chromosome 19 in the chimpanzee and orangutan, chromosome 4 or 19 in the gorilla and chromosome 8, 13 or 16 in the gibbon (24 ). Southern blot analysis indicated that the chimpanzee has two copies of a CMT1A-REP-like sequence, while gorilla, orangutan, and gibbon likely have only one copy. The dating of evolutionary divergence between human and chimpanzee is estimated to have occurred 4 to 6 million years ago and that between human and gorilla is estimated at 7 to 8 million years ago using mitochondrial DNA sequences (26 ). Our result is consistent with this dating. Unless it was an independent event, duplication of the original CMT1A-REP sequence should have occurred before human-chimpanzee divergence, but after human-gorilla divergence. This is an example in humans in which the progenitor and progeny sequence and their locations on the chromosome are known. There are other reports indicating the existence of species-specific repetitive DNA elements such as Alu (27 ). It is also known that a progenitor Xba1 DNA element gave rise to a progeny Xba2 repeat, resulting in two copies of the Xba element in human, but there remains only one original copy in chimpanzee, gorilla, and orangutan (27 ). The hybridization analysis of primates, together with DNA sequence analysis of each end of the CMT1A-REP repeats, suggests that the distal CMT1A-REP repeat is more likely the original, progenitor sequence. 
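The reasoning that places the duplication between two divergence events can be written out as a small Python sketch. It assumes, as the text does, that the duplication was a single event rather than independent events in separate lineages. The divergence estimates and copy numbers are those quoted above; the data structures and the function name are illustrative assumptions.

```python
# Divergence from the human lineage, in million years ago, as (min, max).
divergence_from_human_mya = {"chimpanzee": (4, 6), "gorilla": (7, 8)}
# Copy number of the CMT1A-REP-like sequence inferred from Southern blots.
copies = {"human": 2, "chimpanzee": 2, "gorilla": 1, "orangutan": 1, "gibbon": 1}


def bracket_duplication(copies, divergence):
    """Return (predates, postdates): the duplication is inferred to be older
    than the split from `predates` and younger than the split from `postdates`.
    Assumes a single duplication event (parsimony)."""
    shares = [s for s in divergence if copies[s] == copies["human"]]
    lacks = [s for s in divergence if copies[s] < copies["human"]]
    # Tightest lower bound on age: the oldest split from a species sharing it.
    predates = max(shares, key=lambda s: divergence[s][1])
    # Tightest upper bound on age: the most recent split from a species lacking it.
    postdates = min(lacks, key=lambda s: divergence[s][0])
    return predates, postdates


predates, postdates = bracket_duplication(copies, divergence_from_human_mya)
print(f"older than the human-{predates} split {divergence_from_human_mya[predates]} Mya")
print(f"younger than the human-{postdates} split {divergence_from_human_mya[postdates]} Mya")
```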
How the CMT1A-REP sequence itself became duplicated during non-human primate evolution is unknown. The major difference between the Xba element and the CMT1A-REP repeat is their sizes, about 300 bp and 24 kb, respectively. The CMT1A-REP repeat is also flanked by inverted repeats that are complete Alu sequences except for the Alu sequence that flanks the telomeric end of the proximal CMT1A-REP repeat and is only 110 bp. As we discussed in an earlier report (16 ), the size of the CMT1A-REP repeat itself (24 kb) and the size of the duplicated/deleted region (1.5 Mb) are also major differences from other repeated sequences that are proposed to mediate meiotic unequal crossover leading to human disorders (28 -35 ).
While our observations suggest that the distal CMT1A-REP repeat is the progenitor sequence, how this sequence became duplicated during primate evolution is unknown. Additional physical mapping of the CMT1A-REP repeat region in the chimpanzee and other non-human primates is now required to further explore this interesting transposition. The degree of homology indicated by the intensity of molecular hybridization and the similarity of fragment sizes suggests that a high degree of sequence conservation is present amongst humans and other primates for the CMT1A-REP repeat and the mariner-like element contained within it. Our analysis also suggests that introduction of the mariner-like element within the primordial CMT1A/HNPP region occurred prior to the molecular event which led to genesis of the CMT1A-REP repeat. Additional physical mapping and sequence analysis in non-human primates are also needed to clarify the relationships between the CMT1A-REP repeat and the mariner-like element and to explore a possible role for the mariner-like element in the generation of the CMT1A-REP repeat during primate evolution.
A diagnosis of CMT1A was established by the presence of typical clinical features of distal motor and sensory abnormalities, electrophysiologic studies (35 ) and presence of the chromosome 17p11.2-12 duplication (1 ,2 ). A diagnosis of HNPP was established in probands and individuals at risk by a history of prolonged palsies following mild trauma, characteristic neurophysiologic findings (37 ) and presence of the deletion in chromosome 17p11.2-12 associated with HNPP (5 ). Evidence of the CMT1A duplication included presence of a novel 500 kb SacII fragment detected with VAW409, or demonstration of trisomic dosage for one or more markers (VAW409, EW401, VAW412, 5H5, 4A11, 6G1) known to map within the duplication (1 ,2 ,7 ). Evidence of the HNPP deletion included presence of a novel 770/820 kb SacII fragment detected with a cloned fragment from the distal CMT1A-REP repeat (38 ), or demonstration of misinheritance of alleles for one or more markers within the deletion (5 ). Under a protocol of informed consent (The Children's Hospital of Philadelphia), 30 cc of blood was obtained by venipuncture for DNA isolation and establishment of permanent cell lines (39 ).
Total human DNA and YAC DNA were isolated from peripheral blood or lymphoblastoid cell lines as previously described (16,40). Approximately 5 µg of total genomic DNA per sample was digested with appropriate restriction enzymes according to the manufacturer's instructions. Size fractionation was carried out by 1% agarose gel electrophoresis in 2× TAE at 30 V for 16 h, and the resultant fragments were transferred to nylon membranes (Zeta-Probe GT, Bio-Rad). Probes were labeled by the random hexamer primer method (41) and hybridized at 60°C overnight. Gene copy number was estimated by direct quantitative assessment of radioactive signals on Southern blots using a PhosphorImager (Molecular Dynamics, Naperville). Dosage ratios obtained with CMT1A and HNPP patient samples were normalized to reflect a ratio of 1.0 for normal persons. Northern blot membranes were purchased from Clontech and hybridized according to the manufacturer's instructions. Membranes were then washed and exposed to X-ray film (Hyperfilm, Amersham) for 1-3 days at -70°C.
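The normalization of dosage ratios can be sketched as follows in Python. The sketch assumes that each sample's signal for a probe inside the duplicated or deleted region is divided by the signal for a control probe outside it, and that this ratio is then scaled by the same ratio in a normal control; the use of a control probe, the function name and the signal values are assumptions for illustration and are not specified in the text. With two copies in a normal person, idealized values are about 1.5 for a duplication (three copies) and about 0.5 for a deletion (one copy).

```python
def normalized_dosage(test_signal, reference_signal, normal_test, normal_reference):
    """Dosage ratio of a sample, scaled so that a normal control equals 1.0.

    test_signal / reference_signal: signals for a probe inside the
    duplicated or deleted region and for a control probe outside it.
    normal_test / normal_reference: the same two signals in a normal control.
    """
    sample_ratio = test_signal / reference_signal
    normal_ratio = normal_test / normal_reference
    return sample_ratio / normal_ratio


# Hypothetical signal intensities (arbitrary units), for illustration only.
normal = normalized_dosage(2000, 1000, 2000, 1000)
duplication = normalized_dosage(3100, 1050, 2000, 1000)
deletion = normalized_dosage(950, 980, 2000, 1000)
print(round(normal, 2), round(duplication, 2), round(deletion, 2))
```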
Bacterial plasmids and cosmids were purified either on Qiagen columns according to the manufacturer's instructions or by CTAB (cetyltrimethylammonium bromide) precipitation (42). EcoRI subfragments of the CMT1A-REP repeats had previously been subcloned into either pGEM-3Z or pGEM-7Zf(-) (Promega) (16). pHK1.85D is our equivalent of pNEA102, a 1.85 kb EcoRI end fragment of cosmid c20G2 that hybridizes to a 7.8 kb EcoRI fragment from the proximal CMT1A-REP repeat and a 6.0 kb EcoRI fragment from the distal repeat (4,37). pHK1.0P contains a 1.0 kb PstI-EcoRI fragment that hybridizes to a 2.3 kb EcoRI fragment from the proximal CMT1A-REP repeat and a 3.2 kb EcoRI fragment from the distal CMT1A-REP repeat (16). pHK1.8D is derived from the 1.8 kb EcoRI fragment of the distal CMT1A-REP repeat and is homologous to the 1.8 kb centromeric end of pHK7.8P. pHK0.7D contains a 0.7 kb EcoRI-HindIII fragment that resides entirely within the mariner-like element. The locations of the probes used are shown in Figure 1. DNA sequencing was performed using an automated DNA sequencer (Applied Biosystems, Inc.) and the sequence was analyzed with the Sequence Analysis Software Package from Genetics Computer Group. DNA sequence obtained from the proximal and distal CMT1A-REP repeats was submitted to GenBank (centromeric end of the proximal CMT1A-REP: U48215; centromeric end of the distal CMT1A-REP: U48216; telomeric end of the proximal CMT1A-REP: U48217; telomeric end of the distal CMT1A-REP: U48218). The sequences of internal portions of the CMT1A-REP repeats were updated (proximal CMT1A-REP: L44118; distal CMT1A-REP: L44119).
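The probe-to-fragment relationships stated above can be summarized as a small lookup, sketched here for illustration only. The EcoRI fragment sizes are taken from the text; the function name and the sizing tolerance of 0.2 kb are hypothetical assumptions, not part of the protocol.

```python
# EcoRI fragment sizes (kb) detected by each probe in the proximal and
# distal CMT1A-REP repeats, as stated in the text.
PROBE_FRAGMENTS_KB = {
    "pHK1.85D": {"proximal": 7.8, "distal": 6.0},
    "pHK1.0P": {"proximal": 2.3, "distal": 3.2},
}


def assign_repeat(probe, observed_kb, tolerance_kb=0.2):
    """Return 'proximal' or 'distal' for an observed EcoRI band, else None.

    tolerance_kb is an assumed allowance for gel sizing error.
    """
    expected = PROBE_FRAGMENTS_KB[probe]
    matches = [
        repeat
        for repeat, size_kb in expected.items()
        if abs(observed_kb - size_kb) <= tolerance_kb
    ]
    # Report a repeat only when exactly one expected fragment matches.
    return matches[0] if len(matches) == 1 else None


print(assign_repeat("pHK1.85D", 7.7))
print(assign_repeat("pHK1.0P", 3.2))
print(assign_repeat("pHK1.0P", 5.0))
```

A band that matches neither expected size returns None, so an unexpected fragment is flagged for inspection rather than forced into one of the two repeats.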
A human testis cDNA library (Stratagene) was screened with the probe pHK1.8D, which contains about 80% of the human mariner-like element (18). Screening and conversion of the phage to plasmid were performed according to the manufacturer's instructions. GenBank accession numbers are U48696 for pcHMT1 and U48697 for pcHMT2.
The generous participation of patients and their families is appreciated. We thank Dr Jerzy Jurka for use of a pythia server and suggestions for analysis of human repetitive elements. We thank Aki Ito for technical assistance. We thank Dr H. H. Kazazian for suggestions and critical review of the manuscript. H.K. is supported by a postdoctoral fellowship from the Muscular Dystrophy Association. P.F.C. is supported by the Muscular Dystrophy Association, the March of Dimes Birth Defects Foundation, the National Institutes of Health (R01-NS30804) and a gift from the Myer S. Shandelman Trust.
2 Raeymaekers, P., Timmerman, V., Nelis, E., De Jonghe, P., Hoogendijk, J.E., Baas, F., Barker, D.F., Martin, J.-J., De Visser, M., Bolhuis, P.A. and Van Broeckhoven, C. (1991) Neuromusc. Dis., 1, 93-97.
3 Raeymaekers, P., Timmerman, V., Nelis, E., Van Hul, W., De Jonghe, P., Martin, J.-J. and Van Broeckhoven, C. (1992) J. Med. Genet., 29, 5-11.
9 Timmerman, V., Nelis, E., Van Hul, W., Nieuwenhuijsen, B.W., Chen, K.L., Wang, S., Ben Othman, K., Cullen, B., Leach, R.J., Hanemann, C.O., De Jonghe, P., Raeymaekers, P., van Ommen, G.J.B., Martin, J.-J., Muller, H.W., Vance, J.M., Fischbeck, K.H. and Van Broeckhoven, C. (1992) Nature Genet., 1, 171-175.
10 Valentijn, L.J., Bolhuis, P.A., Zorn, I., Hogendijk, J.E., van den Bosch, N., Hensels, G.W., Stanton Jr., V.P., Housman, D.E., Fischbeck, K.H., Ross, D.A., Nicholson, G.A., Meershoek, E.J., Dauwerse, H.G., van Ommen, G.J.B. and Baas, F. (1992) Nature Genet., 1, 166-170.
19 Augé-Gouillou, C., Bigot, Y., Pollet, N., Hamelin, M.-H., Meunier-Rotival, M. and Periquet, G. (1995) FEBS Lett., 368, 541-546.
20 Palau, F., Lofgren, A., De Jonghe, P., Bort, S., Nelis, E., Sevilla, T., Martin, J.J. et al. (1993) Hum. Mol. Genet., 2, 2031-2035.
21 Lorenzetti, D., Pareyson, D., Sghirlanzoni, A., Roa, B.B., Abbas, N.E., Pandolfo, M., Di Donato, S. and Lupski, J.R. (1995) Am. J. Hum. Genet., 56, 91-98.
22 Jurka, J., Walichiewicz, J. and Milosavljevic, A. (1992) J. Mol. Evol., 35, 286-291.
23 Spradling, A. and Rubin, G.M. (1982) Science, 218, 341-347.
25 Jauch, A., Wienberg, J., Stanyon, R., Arnold, N., Tofanelli, S., Ishida, T. and Cremer, T. (1992) Proc. Natl Acad. Sci. USA, 89, 8611-8615.
26 Horai, S., Satta, Y., Hayasaka, K., Kondo, R., Inoue, T., Ishida, T., Hayashi, S. and Takahata, N. (1992) J. Mol. Evol., 35, 32-43.
27 Minghetti, P.P. and Dugaiczyk, A. (1993) Proc. Natl Acad. Sci. USA, 90, 1872-1876.
28 Tsubota, S.I., Rosenberg, D., Szostak, H. et al. (1989) Genetics, 122, 881-890.
29 Metzenberg, A.B., Wurzer, G., Huisman, T.H.J. and Smithies, O. (1991) Genetics, 128, 143-161.
40 Chance, P.F., Bird, T.D., O'Connell, P., Lalouel, J.-M., Lipe, H. and Leppert, L. (1990) Am. J. Hum. Genet., 47, 915-925.
41 Feinberg, A.P. and Vogelstein, B. (1984) Anal. Biochem., 137, 266-267.
42 Del Sal, G., Sena, E.P. and Schneider, C. (1989) Biotechniques, 7, 514-520.
*To whom correspondence should be addressed at: Division of Neurology, 516 Abramson Pediatric Research Center, The Children's Hospital of Philadelphia, 34th and Civic Center Blvd., Philadelphia, Pennsylvania, 19104, USA
Copyright Oxford University Press, 1996