The gene which is defective in Duchenne muscular dystrophy (DMD) is the largest known gene. The product of the gene in muscle, dystrophin, is a 427 kDa protein. The same gene encodes at least six additional products: two non-muscle dystrophin isoforms transcribed from promoters located in the 5'-end region of the gene and four smaller proteins transcribed from internal promoters located further downstream. Several other genes, encoding evolutionarily related proteins, have been identified. These include a structurally very similar gene in vertebrates encoding utrophin (DRP1), which is closely related to dystrophin, and a number of small and simple genes in vertebrates or invertebrates encoding proteins similar to some of the small products of the DMD gene. We have isolated a sea urchin gene showing very strong sequence and structural homology with the DMD and utrophin genes. Sequence and intron/exon structure similarities suggest that this gene is related to a precursor of both the DMD gene and the gene encoding utrophin. The sea urchin gene has the unique complex structure of the DMD gene. There is at least one, and possibly more, product(s) transcribed from internal promoters, as well as a large product of >300 kDa containing at least three of the four major domains of dystrophin. The small product seems to be evolutionarily related to Dp116, one of the small products of the human DMD gene. Partial characterization of this gene helped us to construct an evolutionary tree connecting the vertebrate dystrophin gene family with related genes in invertebrates. The constructed evolutionary tree also implies that the vertebrate small and simple structured gene encoding a Dp71-like protein, called DRP2, evolved from the dystrophin/utrophin ancestral large and complex gene by a duplication of only a small part of the gene.
The gene which is defective in Duchenne muscular dystrophy (DMD) is the largest gene known to date, spanning >2500 kb of the X chromosome. The product of the gene in muscle, dystrophin, is a 427 kDa rod-shaped protein consisting of four domains: an N-terminal actin binding domain, 24 triple helix spectrin-like repeats with four hinge regions, a cysteine-rich domain with two potential calcium binding motifs and a unique C-terminal domain (1). Dystrophin is believed to form a linkage between the cytoskeletal actin and a group of membrane proteins (dystrophin-associated proteins, DAPs) (2,3). Association with the DAPs is mediated mainly by the cysteine-rich and C-terminal domains of dystrophin (4,5). One of the DAPs, α-dystroglycan, binds to a subunit of laminin, thus forming a linkage between this complex and the extracellular matrix (6-8).
The DMD gene also encodes two non-muscle isoforms of dystrophin, each controlled by a different promoter located in the 5'-end of the gene; the brain-type (9-11) and Purkinje cell-type dystrophins (12). In addition, internal promoters located within introns further downstream in the huge DMD gene regulate expression of smaller products. Dp71, a 70.8 kDa protein, consists of only the cysteine-rich and C-terminal domains of dystrophin (13,14). It is the most abundant non-muscle product of the DMD gene. The highest levels of Dp71 are found in the brain (15,16). The other known small products of the DMD gene consist of the cysteine-rich and C-terminal domains with various extensions into the spectrin-like repeats domain (reviewed in 17). These products are Dp116 (18), Dp140 (19) and Dp260 (20), which are expressed mainly in Schwann cells, brain and retina respectively and have molecular weights of 116, 140 and 260 kDa. The functions of the non-muscle dystrophins and of the smaller products of the DMD gene are not known.
Several genes encoding proteins with various levels of sequence identity with dystrophin have been cloned. Utrophin (DRP1), encoded by an autosomal gene, consists of all four domains of dystrophin. At the amino acid level the two proteins are ~51% identical. The exon/intron structures of the two genes are also very similar (21,22). The utrophin gene also encodes smaller products transcribed from downstream promoters (23,24).
The other known genes that seem to be related to the DMD gene are much smaller, simple genes encoding single proteins with low but significant homology to the cysteine-rich and C-terminal domains of dystrophin. These include the Torpedo 87 kDa phosphoprotein and its mammalian homologue dystrobrevin (25-27) and a Drosophila 75 kDa protein, dah (28). Another Drosophila protein, MSP300, shows low but significant sequence homology to the spectrin-like domain of dystrophin and has an actin binding domain (29). However, it does not contain the cysteine-rich and C-terminal domains which are characteristic of all other known DMD gene products and dystrophin-related proteins. A small, simply structured gene encoding a 110 kDa protein (DRP2), similar in amino acid sequence to Dp71 but also containing two spectrin-like repeats (as in Dp116), has recently been characterized in vertebrates (30).
No genes homologous to the complex DMD gene have been identified so far in invertebrates. To gain further insight into the evolution of the unusually large and complex DMD gene and into the function of its various products we attempted to clone homologous genes from invertebrates.
Sea urchins are deuterostomes, which evolved early in the evolutionary branch leading to the primitive chordates and to vertebrates. Here we report on the identification and partial characterization of a sea urchin gene closely related in structure and encoded proteins to the vertebrate DMD gene. The data indicate that at least some of the features of the complex DMD gene are very ancient. The sea urchin gene seems to be related to an ancestral gene of both the utrophin and dystrophin genes. The evolutionary tree derived from this study also suggests that the small and simple structured DRP2 gene evolved from a complex gene by duplication of only a part of the gene.
Database screening with a sequence of a cDNA clone randomly isolated from a sea urchin embryo cDNA library (Strongylocentrotus purpuratus) revealed a sequence with homology to a region in mammalian Dp71 mRNA (E.Davidson, personal communication). Using two different synthetic oligonucleotides containing parts of this sequence as probes to screen a cDNA library prepared from RNA of 24 h old S.purpuratus embryos we isolated two positive clones. Additional clones were isolated from a 43 h embryo library using new probes and 5' walking. The combined cloned sequence contained a 5'-untranslated region (UTR), an open reading frame (ORF) encoding a 98 kDa protein (based on the first in-frame ATG) and a long (>3 kb) 3'-UTR.
Alignment of the deduced amino acid sequence of the protein encoded by the sea urchin gene, as obtained and confirmed by sequencing several independently isolated different cDNA clones, with that of dystrophin (Fig. 1A) showed that the ORF of the sea urchin cDNA begins with a sequence which is homologous to exon 56 of dystrophin. This exon is the first coding exon of Dp116 mRNA (18). However, unlike the case of Dp116 mRNA, in which the initiator ATG is found in the first (Dp116-specific) exon located in intron 55 of the DMD gene, the first ATG codon in the sea urchin cDNA is located 132 codons downstream of the beginning of the conserved ORF. Thus, while the mRNA of the sea urchin protein is structurally similar to mammalian Dp116 mRNA, the encoded protein is smaller than Dp116 (98 versus 116 kDa) and contains only one triple helix repeat, whereas Dp116 contains two repeats.
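The distinction drawn above, between where the conserved ORF begins and where the first in-frame ATG lies, can be illustrated with a minimal Python sketch. The function name `first_in_frame_atg` and the toy sequence are hypothetical and chosen only for illustration; they are not the SuDp98 sequence or the procedure used in this study.

```python
from typing import Optional


def first_in_frame_atg(orf: str) -> Optional[int]:
    """Return the 0-based codon index of the first ATG in reading frame 0.

    Returns None when no in-frame ATG is present. RNA input (AUG) is accepted.
    """
    seq = orf.upper().replace("U", "T")
    # Walk the sequence codon by codon, staying in frame 0.
    for codon_index in range(len(seq) // 3):
        codon = seq[3 * codon_index: 3 * codon_index + 3]
        if codon == "ATG":
            return codon_index
    return None


# Hypothetical toy ORF: codons GCT GAA ATG CCC TAA
toy_orf = "GCTGAAATGCCCTAA"
print(first_in_frame_atg(toy_orf))  # codon index of the first in-frame ATG
```

In this toy case two codons precede the first in-frame ATG; in the sea urchin cDNA the analogous upstream stretch is the conserved region that is homologous to dystrophin but lies 5' of the initiator codon.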
Comparison of the deduced amino acid sequence of the cloned sea urchin cDNA with that of the corresponding region of human dystrophin and Dp71 reveals a relatively high homology, with stretches of 10-20 identical amino acids (Fig. 1A). Moreover, the known functional domains identified so far in the C-terminal region of dystrophin are conserved in the sea urchin protein. These include the WWP domain (reviewed in 31), the coiled coil region (1), the EF hands (1) and the ZZ domain (32). It should also be noted that some highly conserved stretches of amino acids are found outside the known functional domains. The conservation of these sequences suggests that they are functionally important.
As was previously shown, due to alternative splicing two exons which are present in dystrophin mRNA (exons 71 and 78) are missing in the published sequence of mammalian Dp71 mRNA. The proximal alternative splicing results in an in-frame deletion of 13 amino acids. The distal alternative splicing results in a deletion of 32 nt and in a frameshift. Consequently, the last 13 amino acids of dystrophin are replaced in the published sequence of mammalian Dp71 by 32 new amino acids (14). As can be seen in Figure 1B, in the sea urchin protein the sequence in these two regions is similar to the published sequence of Dp71 and not to that of dystrophin. The substantial conservation of the amino acid sequence resulting from the absence of exon 78 suggests a functional role for this sequence.
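The reading-frame arithmetic behind these two splicing events can be checked with a short Python sketch. The function name `splice_effect` is an illustrative assumption; the lengths are taken from the text (13 amino acids, i.e. 39 nt, for the proximal event and 32 nt for the distal event).

```python
def splice_effect(deleted_nt: int) -> str:
    """Classify the effect of removing `deleted_nt` nucleotides on the reading frame."""
    if deleted_nt <= 0:
        raise ValueError("deleted_nt must be a positive number of nucleotides")
    # The frame is preserved only when whole codons (multiples of 3 nt) are removed.
    return "in-frame" if deleted_nt % 3 == 0 else "frameshift"


proximal_nt = 13 * 3  # 13 amino acids deleted -> 39 nt
distal_nt = 32        # 32 nt deleted

print("proximal:", splice_effect(proximal_nt))
print("distal:", splice_effect(distal_nt))
```

Because 32 is not a multiple of 3, the distal deletion shifts the frame, which is why the C-terminal residues are replaced by a new sequence rather than simply shortened.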
Interestingly, the sea urchin protein contains near its C-terminus a short sequence which is not found in dystrophin, with six adjacent histidine residues (confirmed by genomic DNA sequencing). The function of such a histidine run is unknown.
On the basis of the calculated molecular weight of the cloned protein and its sequence homology to Dp116 and Dp71 we called this protein SuDp98.
The 3'-UTRs of mRNAs encoding homologous proteins in distantly related organisms often show conservation of sequence and size (33). Mammals and birds separated ~300 000 000 years ago, yet the 3'-UTRs of mammalian and avian dystrophin mRNAs are both very large, spanning 2.7 kb and sharing very high sequence homology (34). In addition, the various DMD gene products share the same 3'-UTR (13). The high sequence conservation over a period of 300 000 000 years indicates an important yet unknown function. Interestingly, the mRNA of SuDp98 also has an unusually large 3'-UTR spanning >3 kb. However, no significant sequence conservation between the 3'-UTR of human dystrophin and that of SuDp98 was detected.
In order to determine the exon/intron structure of the SuDp98 gene we amplified by PCR sea urchin genomic DNA fragments corresponding to most of the SuDp98 coding sequence. Exon-intron borders (according to the structure of the human DMD gene) were sequenced and the borders in the sea urchin gene were determined.
Alignment of the exon/intron map of the gene encoding SuDp98 with that of the corresponding region in the human DMD gene reveals a significant proportion of the borders in identical locations in the two genes (Fig. 2). However, some of the introns are missing in the sea urchin gene (introns 60, 63 and 66) and in some cases either the 3' or the 5' border was shifted (introns 58, 68 and 74). (The numbering of exons is according to the published structure of human dystrophin; 35).
One of the most puzzling features regarding the dystrophin gene is its enormous size, which is due to the great number and size of introns (some of which are 200-300 kb long). Since mutations interfering with expression of dystrophin are lethal, the huge size should be a great disadvantage and one would expect a very strong selective pressure to greatly reduce the size of introns. Yet this unusual size and structure of the gene is present in mammals and in birds, which separated ~300 million years ago. It was of interest to determine whether the sea urchin gene also has an unusually huge size.
In order to compare the size of the gene encoding SuDp98 with that of the corresponding region in the DMD gene we amplified partially overlapping fragments of S.purpuratus genomic DNA, using oligonucleotide primers along SuDp98 mRNA (data not shown). The calculated genomic DNA size between exons 62 and 79 (numbering is according to human DMD gene exons) is ~20 kb. The corresponding region in the human DMD gene is ~150 kb (15). Thus the average intron size in the region of the sea urchin gene encoding SuDp98 is 7.5 times smaller than the average size of introns in the corresponding region in the DMD gene.
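The 7.5-fold figure follows directly from the two region sizes quoted above, as the following minimal Python sketch shows. The function name `fold_difference` is an illustrative assumption. Note that the calculation compares whole genomic regions; reading it as a ratio of average intron sizes assumes that exons contribute little to either region and that the number of introns is comparable.

```python
def fold_difference(larger_kb: float, smaller_kb: float) -> float:
    """Return how many times larger the first region is than the second."""
    if larger_kb <= 0 or smaller_kb <= 0:
        raise ValueError("region sizes must be positive")
    return larger_kb / smaller_kb


human_region_kb = 150.0      # exons 62-79, human DMD gene (approximate, from the text)
sea_urchin_region_kb = 20.0  # corresponding sea urchin region (approximate, from the text)

ratio = fold_difference(human_region_kb, sea_urchin_region_kb)
print(f"human region is {ratio:.1f} times larger")
```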
As mentioned above (Fig. 1A), the sequence 5' of the first AUG of SuDp98 mRNA contains an in-frame ORF of 131 codons with significant amino acid sequence homology to the corresponding region in dystrophin (32% sequence identity). This conservation suggested that this sequence is part of a coding sequence of an mRNA which is larger than SuDp98 mRNA. Indeed, by screening the 43 h embryo cDNA library with the 5' coding sequence of SuDp98 cDNA we isolated several clones which did not contain the distal part of the 5'-UTR of SuDp98 mRNA, located upstream of that ORF. However, they did contain the 131 codons (which in SuDp98 are part of the 5'-UTR) as well as additional 5' in-frame coding sequences. Thus the 131 codons which are part of the 5'-UTR of SuDp98 mRNA are part of the coding sequence of a larger mRNA. These results and the similarity between Dp116 and SuDp98 mRNAs indicate that SuDp98 is a genuine evolutionary homologue of the mammalian DMD gene product Dp116 (Fig. 3).
Isolation of cDNA clones from embryonic cDNA libraries indicated expression of the sea urchin gene related to the DMD gene. We tested expression of this gene more directly by RT-PCR of embryo RNA samples. cDNA was synthesized on total RNA samples using an antisense primer at the beginning of the ORF of SuDp98 mRNA. PCR was performed on the cDNAs using the same antisense primer and one of two sense primers: (i) a primer from the specific 5'-non-coding sequence of SuDp98 mRNA or (ii) a primer from exon 53. We found that both SuDp98 and the longer mRNA(s) were expressed in S.purpuratus embryos of various ages (not shown).
Earlier studies demonstrated a high level of structural and sequence similarity between mammalian dystrophin and utrophin, indicating a common evolutionary origin, most likely as a result of gene duplication (22). A much smaller sequence similarity was found between the 87 kDa Torpedo phosphoprotein or its mammalian homologue, dystrobrevin, and the homologous region in human dystrophin (25,27). The cloning of a sea urchin gene encoding a dystrophin-like protein provided an opportunity to extend this comparison over a considerably larger evolutionary scale. The results which are presented in Table 1 show that the hinge 4 region seems to be the most conserved domain. The cysteine-rich and the C-terminal domains are also highly conserved. The amino acid sequence of the spectrin-like repeats is most divergent. However, as mentioned above, the repeat structure of this domain is conserved.
Table 1.
Comparison made between the homologous regions showed that (i) the amino acid sequence identity between SuDp98 and dystrophin is similar to that between SuDp98 and utrophin and (ii) the sequence identity between dystrophin and utrophin is greater than between SuDp98 and each one of them (Table 1). Assuming a similar rate of amino acid substitution, these data suggest that divergence of dystrophin and utrophin (apparently by gene duplication) occurred after divergence of the ancestral sea urchin gene and a common precursor of dystrophin and utrophin (Fig. 5). Since both dystrophin and utrophin are found in distantly related vertebrates, it is likely that duplication of the ancestral gene occurred before vertebrate radiation.
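The inference above can be expressed as a simple rule: among three homologues evolving at a similar rate, the pair with the highest sequence identity diverged most recently, and the remaining protein is the outgroup. The Python sketch below illustrates only this logic. The function names are hypothetical, and the identity values are placeholders chosen for illustration; they are not the values of Table 1.

```python
from typing import Dict, FrozenSet, Tuple


def closest_pair(identity: Dict[FrozenSet[str], float]) -> Tuple[str, str]:
    """Return the two proteins sharing the highest percent identity (names sorted)."""
    best = max(identity, key=identity.get)
    first, second = sorted(best)
    return first, second


def outgroup(identity: Dict[FrozenSet[str], float]) -> str:
    """For exactly three proteins, return the one outside the closest pair."""
    names = set().union(*identity)
    if len(names) != 3:
        raise ValueError("this sketch handles exactly three proteins")
    (remaining,) = names - set(closest_pair(identity))
    return remaining


# PLACEHOLDER identities (percent), for illustration only; not taken from Table 1.
placeholder_identity = {
    frozenset({"dystrophin", "utrophin"}): 60.0,
    frozenset({"SuDp98", "dystrophin"}): 45.0,
    frozenset({"SuDp98", "utrophin"}): 44.0,
}

print("most recent split:", closest_pair(placeholder_identity))
print("earliest-diverging lineage:", outgroup(placeholder_identity))
```

The rule holds only under the stated assumption of similar substitution rates in all lineages, which is the caveat raised later in this section.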
Similar considerations suggest that the recently discovered DRP2 evolved after divergence of the ancestral Dp98 and the common precursors of dystrophin and utrophin but before separation of dystrophin and utrophin.
The sequence similarity between SuDp98 and Torpedo and human dystrobrevins is much smaller than that between SuDp98 and dystrophin, utrophin and DRP2, indicating a much earlier divergence of the gene encoding dystrobrevin (or Torpedo 87 kDa phosphoprotein) and the ancestral gene for the genes encoding dystrophin, utrophin and the sea urchin protein.
It should, however, be emphasized that the possibility that specific selective pressures differentially accelerated or slowed down the rate of change in some of the protein lineages cannot be excluded.
The results of comparison of the exon/intron structure of the genes encoding dystrophins and related proteins are mostly compatible with the suggested evolutionary relationships between the genes. Thus the dystrophin and utrophin genes have an almost identical structure (location of introns). Also, the location of introns in the DRP2 gene is identical to that found in the homologous region of utrophin (30). As shown above, there is a significant but lower similarity of the exon-intron borders in the SuDp98 gene and those in the DMD gene. There is some similarity between the structure of SuDp98 and the DMD gene and that of the gene encoding dah in Drosophila, indicating a common origin (Fig. 2).
We have here described the cloning of a sea urchin gene with great similarity to the vertebrate dystrophin gene. Several features indicate that the encoded protein is an invertebrate homologue of dystrophin and utrophin.
(i) Substantial sequence identity exists between dystrophin/utrophin and the sea urchin protein, especially in the C-terminal regions. The cloned part of the cDNA encoded three of the four domains of dystrophins.
(ii) We have identified in the sea urchin gene an internal transcript which is structurally similar to the transcript of the DMD gene encoding Dp116. Both are initiated at homologous positions in the two genes.
(iii) The exon/intron structure of the sea urchin gene is similar to that of the DMD gene.
(iv) The homologues of exons 71 and 78, which are alternatively spliced in the human DMD gene products, are absent in the sea urchin mRNA.
To the best of our knowledge the sea urchin gene is the first characterized invertebrate gene with such a high level of similarity to the DMD gene. As mentioned above, the two other invertebrate proteins that show some sequence similarity to dystrophin are Drosophila MSP300 and dah. However, MSP300 does not contain the characteristic cysteine-rich and C-terminal domains and dah, like Dp71, consists of only these two domains. Also, unlike the sea urchin protein, these two proteins are encoded by small simple genes and not by complex DMD-like genes.
As described in the Introduction and presented schematically in Figure 5, the known genes encoding dystrophin and related proteins can be roughly divided into two groups: (i) large and complex genes encoding dystrophin-like proteins and encoding additional smaller products, containing mainly the C-terminal and the cysteine-rich domains and sometimes part of the spectrin-like region (the genes encoding dystrophin and utrophin); (ii) much smaller and simpler genes, encoding relatively small products consisting mostly of the two C-terminal domains (DRP2, dystrobrevin, dah). An intriguing question is the evolutionary relationship between the large compound genes and the small genes. Partial characterization of the sea urchin gene and the indicated evolutionary relationships imply that the existence of the large compound gene preceded formation of the small DRP2 gene and that the latter evolved relatively recently by duplication of the 3'-region of a large gene, independently of dah and dystrobrevin (Fig. 5). Sequence comparison suggests that this duplication occurred after the divergence of echinoderms and vertebrates, but before the duplication that resulted in formation of the dystrophin and utrophin genes. This does not imply that other small dystrophin-related proteins like dah and dystrobrevin also evolved by partial duplication of a large gene. These genes may be related to a small ancestral gene from which the complex and large DMD-like genes evolved by addition of exons and promoters or by fusion with pre-existing genes (such as the Drosophila MSP300 gene).
Regardless of the mode of evolution, the existence of this pattern of genes and their products in this gene family and the widespread prevalence of the small products consisting mainly or only of the cysteine-rich and the C-terminal domains support the notion based on some experimental data (16,36,37) that in addition to the partially documented function of dystrophin as a link between the cytoskeleton, the sarcolemma and extracellular matrix, there are some, as yet unknown, important functions associated specifically with the highly conserved C-terminal and cysteine-rich domains which are common to the full-length dystrophins as well as the small products. These functions are probably mediated by complexing with DAPs. It should be mentioned in this context that a mutation in the Drosophila dah gene, which encodes a protein similar in structure to Dp71, is embryonic lethal, affecting formation of the cleavage furrows during early embryogenesis (28). More detailed investigation of the occurrence and expression of various products of the DMD family in distantly related organisms may contribute to the elucidation of their evolution and function.
Roberts and Bobrow (38) used a different approach to study evolution of the DMD and related genes. They used degenerate primers to clone a part of the 3' coding sequence of such genes from various vertebrates and invertebrates. The phylogenetic tree of the dystrophin gene family, based on these studies, is very similar to that of our studies.
Plasmid DNA and gel-purified PCR products were sequenced using an Applied Biosystems DNA sequencer.
Sequence analysis was done using the software package of the Genetics Computer Group (GCG) of the University of Wisconsin. BESTFIT was used to compare sequences and DOTPLOT to analyse the repeat structure of the cloned sea urchin gene.
The neighbour joining method of the DISTANCE program was used for comparison of multiple proteins and construction of the phylogenetic tree.
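As a rough illustration of the quantities that underlie such comparisons, the following Python sketch computes percent identity and an uncorrected distance (p-distance) from two pre-aligned sequences. This is not the BESTFIT or DISTANCE algorithm of the GCG package; the function names and the toy alignment are assumptions made for illustration, and columns containing a gap are simply ignored.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


def p_distance(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Uncorrected distance: fraction of compared columns that differ."""
    return 1.0 - percent_identity(aligned_a, aligned_b, gap) / 100.0


# Hypothetical toy alignment (not real dystrophin-family sequence)
toy_a = "ACDE-G"
toy_b = "ACEEFG"
print(percent_identity(toy_a, toy_b), round(p_distance(toy_a, toy_b), 3))
```

A matrix of such pairwise distances is the kind of input that a neighbour joining procedure takes when building a tree.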
Sea urchin (S.purpuratus) 24 and 43 h embryo cDNA libraries in the λZapII vector were generously provided by E.Davidson. When using synthetic oligonucleotides for the screening, hybridization was performed at 50°C in a solution containing 8× SSC, 5× Denhardt's solution, 0.1% SDS, 0.1% sodium pyrophosphate and 50 µg/ml sonicated Escherichia coli DNA. When using DNA fragments as probes, hybridization was in a solution containing 0.6 M NaCl, 0.12 M Tris-HCl, pH 8, 8 mM EDTA, 0.1% SDS, 0.1% sodium pyrophosphate, 10× Denhardt's solution and 50 µg/ml sonicated E.coli DNA. After hybridization the filters were washed at 50°C in a solution containing 2× SSC and 0.1% SDS (for oligonucleotide probes) or at 65°C in a solution containing 0.2× SSC and 0.1% SDS (for DNA probes). Bacteriophage that gave a positive signal on duplicate filters were plaque purified. The insert cDNA was subcloned in a plasmid vector using the rapid excision kit from Stratagene.
PCR amplification of genomic DNA and cDNA was performed using the expanded High Fidelity System (Boehringer Mannheim). One hundred nanograms of genomic DNA or the cDNA product of 2 µg total RNA were amplified using the following program: one cycle of 2 min at 95°C, 35 cycles of 20 s at 94°C, 30 s at 58°C and 3 min at 72°C and then one cycle at 4°C. For long DNA fragments the extension time was 7-10 min instead of 3 min. The PCR products were analysed on 1% agarose gels. cDNA was synthesized on total RNA samples using MMLV reverse transcriptase (Promega).
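For planning purposes, the programmed hold time of the cycling protocol above can be estimated with a short Python sketch. The function name `program_seconds` is an illustrative assumption; the estimate ignores temperature ramping and the final 4°C hold, so actual run times would be somewhat longer.

```python
def program_seconds(cycles: int, denature_s: int, anneal_s: int,
                    extend_s: int, initial_denature_s: int) -> int:
    """Total programmed hold time in seconds, excluding ramping and the final hold."""
    return initial_denature_s + cycles * (denature_s + anneal_s + extend_s)


# Standard program from the text: 2 min at 95°C, then 35 x (20 s, 30 s, 3 min)
standard = program_seconds(35, 20, 30, 3 * 60, 2 * 60)

# Long fragments: extension time of 7-10 min instead of 3 min
long_low = program_seconds(35, 20, 30, 7 * 60, 2 * 60)
long_high = program_seconds(35, 20, 30, 10 * 60, 2 * 60)

print(f"standard: {standard / 60:.1f} min")
print(f"long fragments: {long_low / 60:.1f}-{long_high / 60:.1f} min")
```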
We have recently completed the sequencing of the sea urchin large dystrophin related protein. The new data support our conclusion that this gene is a genuine dystrophin/utrophin homologue.
We wish to thank Drs E.Davidson and F.Wilt for sea urchin cDNA and genomic DNA libraries, Dr Davidson for unpublished data on cDNA sequences, Ms S.Neuman and Ms Z.Levi for excellent technical assistance and Ms V.Laufer for secretarial assistance. This work was supported in part by the Muscular Dystrophy Association, USA, Association Française contre les Myopathies, France, the Muscular Dystrophy Group of Great Britain and Northern Ireland, The Minerva Foundation and the Forchheimer Center for Molecular Genetics.
Human Molecular Genetics
Introduction
Results
Isolation of cDNA clones encoding a sea urchin 98 kDa protein with high homology to the C-terminal region of dystrophin
The 3'-UTR
Exon/intron structure of the gene region encoding SuDp98
The region of the gene encoding SuDp98 is significantly smaller than the corresponding region in the human DMD gene
The cloned sea urchin gene encodes a larger product(s)
Expression of the sea urchin DMD-like gene
Evolutionary relation between the gene encoding SuDp98 and other genes in the dystrophin gene family
Discussion
Materials And Methods
DNA sequencing and sequence analysis
RT/PCR and PCR
Note Added In Proof
Acknowledgements
References
This page is run by Oxford University Press, Great Clarendon Street, Oxford OX2 6DP, as part of the OUP Journals
Last modification: 14 Mar 1998
Copyright© Oxford University Press, 1998.