The Survival Motor Neuron (SMN) gene shows deletions in the majority of patients with Spinal Muscular Atrophy (SMA), a disease of motor neuron degeneration. To date only two missense mutations have been reported in SMN in patients with SMA. The fact that no SMN-homologues have been forthcoming from database searching has resulted in a lack of hypotheses concerning the structural and functional consequences of these mutations. Recently, SMN has been shown to interact with heterogeneous nuclear ribonucleoproteins (hnRNPs), suggesting a role in mRNA metabolism. We describe a novel missense mutation and the subsequent identification of a triplicated tyrosine-glycine (Y-G) peptide sequence at the C-terminus of SMN which encompasses each of the three predicted amino acid sequence substitutions. We have identified apparent orthologues of SMN in Caenorhabditis elegans and Schizosaccharomyces pombe. These sequences retain the highly conserved Y-G motif and provide additional support for a role of SMN in mRNA metabolism.
Autosomal recessive proximal spinal muscular atrophy (SMA) is characterised pathologically by death of lower motor neurons in the spinal cord and clinically by neurogenic amyotrophy. It has an estimated incidence of ~1 in 10 000 live births (1,2). The phenotype is variable and classified by age of onset and maximal motor milestones achieved (3). Type I SMA (Werdnig-Hoffmann disease) is a severe neonatal form with onset in utero or soon after birth and death before the age of 2 years. Type II children have onset in the first 2 years of life, are never able to stand and have variable survival depending on the degree of respiratory muscle involvement. Type III SMA children have a milder form of the disease and achieve the ability to stand unaided. Life expectancy may be normal. Adult onset SMA appears to be genetically heterogeneous with only a proportion of cases being linked to the other forms of the disease (4,5).
The three childhood forms of the disease have been linked to chromosome 5q13 (6-9). Two candidate genes, the Survival Motor Neuron (SMN) gene (10) and the Neuronal Apoptosis Inhibitory Protein (NAIP) gene (11), lie within an inverted duplication in the candidate region. Both show homozygous deletions in patients with SMA but neither fulfils all the criteria for the disease gene (12,13). The NAIP gene has been shown to be a good candidate on biological grounds in that it functions as a negative regulator of apoptosis (14). However, exon 5 of this gene is homozygously deleted in only 10-60% of patients depending on disease severity (11,15). Furthermore, homozygous deletions are occasionally observed in asymptomatic carriers. A role for NAIP in the pathogenesis of SMA therefore remains uncertain.
The two copies of the Survival Motor Neuron (SMN) gene differ by two single nucleotides: one in exon 7, which does not alter the amino acid composition, and one in exon 8, which is an untranslated exon (10). Therefore, both genes predict identical proteins. Single stranded conformation polymorphism (SSCP) analysis can be used to distinguish between the two copies of SMN. Exon 7 of the telomeric copy of SMN is disrupted by deletion or gene conversion in greater than 95% of patients with SMA of all grades of severity (10,16-19). In addition to deletions of exons 7 and 8, several other types of mutation have been described, including splice site mutations which are predicted to disrupt exon 7 (10), a short deletion which introduces a stop codon in exon 3 (20), a duplication which introduces a stop codon in exon 6 (21) and a frameshift resulting from a 5 bp deletion in exon 3 which results in a premature stop codon (22). Each of these mutations is predicted to produce a protein truncated at its C-terminus. While providing strong evidence that SMN is an SMA-determining gene, these data provide little information about the function of SMN except to suggest that the region encoded by exons 6 and 7 is important, as this is disrupted in all types of mutation.
The absence of significant sequence similarity between SMN and entries in protein or nucleic acid databases has limited the predictive potential of sequence analysis. The discovery of missense mutations might provide important information about the structure and function of this protein. Lefebvre et al. (10) found one patient in whom there was a missense mutation at codon 272 altering the sequence from tyrosine to cysteine. At a recent meeting the same group reported a second missense mutation which leads to a glycine to serine substitution at codon 275 (23). We now describe a novel missense mutation which, together with these other amino acid substitutions, has enabled us to propose that the known interaction of SMN with RNA binding proteins (24) occurs via its tyrosine- and glycine-rich C-terminal region. Using this region we have identified potential orthologues of SMN in the nematode Caenorhabditis elegans and in the yeast Schizosaccharomyces pombe.
Genomic DNA from a child with a clinical diagnosis of SMA and that of his parents was used to amplify exon 7 and exon 8 of the SMN gene by PCR and the products were resolved by single stranded conformation polymorphism (SSCP) analysis (Fig. 1). The father is deleted for the centromeric version of exons 7 and 8 but retains a normal copy of the telomeric gene which has not been passed on to his son. Homozygous deletion of the centromeric copy of SMN is seen in normal controls and does not appear to be associated with a phenotype (10,15). The mother has both a normal telomeric and centromeric gene but also a variant exon 7 which has been inherited by the affected child. This amplicon was cloned and sequenced, revealing the basis for the SSCP variant to be a G to T transversion at the second nucleotide of exon 7 of the telomeric SMN gene, leading to a glycine to valine substitution in the protein at amino acid position 279 (Fig. 2). Sequencing of numerous subclones revealed only the variant telomeric SMN or the normal centromeric exon 7, suggesting that on the other chromosome the patient is deleted for telomeric exon 7. Sequencing of maternal DNA confirmed that she also possessed the variant. In over 1500 SSCP reactions we have not observed this variant band in either patients or control individuals.
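The relationship between a G to T change and a glycine to valine substitution can be checked against the standard genetic code. The following is a minimal Python sketch, not part of the original analysis: the function names are hypothetical, and because the codon actually used at position 279 is not given in the text, all four glycine codons are tried. It also classifies the base change, since G (a purine) to T (a pyrimidine) is a transversion.

```python
# Illustrative sketch only; helper names are hypothetical.
PURINES = {"A", "G"}

# Subset of the standard genetic code: glycine and valine codons only.
CODON_TABLE = {
    "GGT": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "GTT": "Val", "GTC": "Val", "GTA": "Val", "GTG": "Val",
}

def substitution_class(ref, alt):
    """Classify a single-base change as a transition or a transversion."""
    same_chemical_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_chemical_class else "transversion"

def mutate_second_base(codon, alt):
    """Return the codon with its middle base replaced by alt."""
    return codon[0] + alt + codon[2]

print("G to T is a", substitution_class("G", "T"))

# Every glycine codon (GGN) becomes a valine codon (GTN) when the
# middle G is replaced by T, whichever codon the gene actually uses.
for codon in ("GGT", "GGC", "GGA", "GGG"):
    mutant = mutate_second_base(codon, "T")
    print(codon, CODON_TABLE[codon], "->", mutant, CODON_TABLE[mutant])
```

Because glycine codons all have the form GGN and valine codons the form GTN, a Gly to Val change caused by a single G to T substitution must fall at the middle position of the codon.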
The clustering of SMN missense mutations to a C-terminal dodecapeptide region that is highly conserved in S.pombe and C.elegans homologues indicates a prominent role for this region in SMN protein function. This is consistent with the truncated protein described by Parsons et al. (21) which is predicted to terminate just before the dodecapeptide and indicates that the region N-terminal to this truncation is insufficient to provide a fully functional protein. Observation of an internal triplication of a YxxG motif within the dodecapeptide prompted a search of the SwissProt database for all occurrences of the sequence (YxxGYxxGYxxG). This revealed six such examples of which three are known RNA binding proteins: (a) human eukaryotic initiation factor 4B; (b) hnRNP A/B (p38) found in the shrimp Artemia; and (c) a Drosophila RNA binding protein, termed squid or hnRNP40.
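A pattern search of this kind amounts to scanning each protein sequence for three consecutive Y-x-x-G repeats. The following is a minimal Python sketch, assuming one-letter amino acid codes. The tool used for the original SwissProt search is not stated in the text, the function name is hypothetical, and the toy sequences are invented for illustration (they are not real SMN or hnRNP sequences).

```python
import re

# Three back-to-back Y-x-x-G repeats, where "x" is any residue.
# The lookahead lets overlapping occurrences be reported as well.
YXXG_TRIPLET = re.compile(r"(?=((?:Y..G){3}))")

def find_yg_triplets(sequences):
    """Map each sequence name to a list of (1-based start, matched text)."""
    hits = {}
    for name, seq in sequences.items():
        found = [(m.start() + 1, m.group(1))
                 for m in YXXG_TRIPLET.finditer(seq.upper())]
        if found:
            hits[name] = found
    return hits

# Invented toy sequences, for illustration only.
TOY_SEQUENCES = {
    "toy_hit": "MKLYAAGYCCGYDDGPQ",   # contains YAAG-YCCG-YDDG
    "toy_miss": "MKLYAAGYCCGFDDGPQ",  # third repeat starts with F, not Y
}

print(find_yg_triplets(TOY_SEQUENCES))
```

Sequences without the triplicated motif are simply absent from the returned dictionary, so the result can be read directly as the list of candidate proteins.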
Subsequently it was noted that other RNA binding proteins, particularly hnRNP A/B proteins, also contain sequences that have a high content of glycine and hydrophobic residues (tyrosine and phenylalanine) in regions that are C-terminal to RNA-binding arginine-glycine-glycine domains ('RGG boxes') (26).
The compositional similarity of the SMN dodecapeptide to C-terminal regions of hnRNPs appears to accord with similarities in their functions. The C-terminal Gly-, Tyr- and Phe-rich region of hnRNP A1 is known to bind the RGG box of hnRNP A/B proteins (27). SMN interacts with the RGG box of hnRNP U and the RGG box containing protein fibrillarin (24). It is proposed that these interactions are likely to be mediated by the SMN Gly- and Tyr-rich dodecapeptide region and that substitutions, such as the missense Tyr272 -> Cys mutation, reduce the affinity of SMN for RGG boxes in a similar manner to that observed upon substitution of Phe and Tyr residues in the hnRNP A1 C-terminal region (27). Further experiments are required to explore this hypothesis and to understand the potential role of SMN in regulating RNA metabolism.
It is striking that the tyrosine-glycine motif which contains all of the amino acid substitutions in patients with SMA is perfectly conserved down to C.elegans (with the exception of the conservative Gly-279 to Ala substitution mentioned above) and that all of the mutated amino acids are preserved in the S.pombe sequence. There are regions of homology at both the 5' and 3' ends of the SMN gene, suggesting that both the N- and C-terminal regions of the protein have functional significance. It is notable, however, that the proline rich central domain of the human SMN is not conserved in the yeast and nematode. The analysis of the function of this region may therefore be important in understanding the role of SMN in mammalian cells. The detection of homologues of SMN in species as phylogenetically distant from human as yeast and nematode suggests that SMN has a housekeeping role which is fundamental to eukaryotic cells. The apparent absence of an SMN-like gene in S.cerevisiae may reflect the fact that this organism is thought to be phylogenetically more distant from Homo sapiens than is S.pombe. We are currently investigating the S.pombe orthologue.
The clustering of SMA point mutations in the SMN gene resulting in substitutions within a tyrosine-glycine rich dodecapeptide motif similar in sequence to proteins involved in mRNA metabolism suggests that it is the interaction of SMN with hnRNPs as detected by the yeast two-hybrid system (24) which is relevant to the disease. However, it is difficult to explain why mutations in SMN lead to such a specific neuromuscular phenotype. One possibility is that SMN regulates the post-transcriptional modification of a specific class of mRNAs that encode proteins with a relatively high specificity for motoneurons. An alternative explanation draws on the observation that SMN-hnRNP interactions are detected in a two-hybrid system using a non-neuronal (HeLa cell) library (24) and suggests that neuron-specific proteins that interact with SMN remain to be discovered. The consequence of the discovery of a C.elegans SMN homologue is that a well characterised model system is now available for mutational studies of SMN which are likely to be important in elucidating the role of SMN in mRNA metabolism and in further defining its cellular specificity.
The patient was referred for molecular diagnosis and was the first child of non-consanguineous parents. Fetal movements were noted to be decreased from ~22 weeks gestation. He was floppy at birth and had decreased limb movements. Electromyography was consistent with motor neurone degeneration.
Single stranded conformation analysis. DNA from the patient and his parents was analysed by PCR amplification of exons 7 and 8 of the SMN gene with the incorporation of 35S labelled d-ATP using primers as described by Lefebvre et al. Amplification of genomic DNA was performed in a 5 µl volume using unlabelled primers (20 mM) and contained 200 mM dNTPs (deficient in d-ATP), 0.1 U of Taq Polymerase (Boehringer) and 0.2 µl of 35S labelled d-ATP. Annealing was performed at 55°C for 30 s and polymerisation for 1 min at 72°C. Products were analysed on a non-denaturing gel (Hydrolink-MDE) at 6 W for 13.5 and 14.5 h at room temperature for exons 7 and 8 respectively.
The exon 7 PCR products from the patient and his mother were subcloned into the pGEM-T cloning vector (Promega) according to the manufacturer's instructions. Individual clones were sequenced using the Sequenase 2 protocol (Boehringer).
A mouse adult brain library (courtesy of Dr D. Blake) was screened using the human SMN cDNA. Two hundred thousand colonies were screened and a 0.7 kb clone containing sequence corresponding to parts of exon 1 to 7 of SMN was isolated. Primers from this sequence were then used to screen a second mouse brain library (Stratagene) to isolate overlapping clones giving a 1.2 kb cDNA.
BLASTP searches (Altschul et al., 1990) were performed at the National Center for Biotechnology Information (USA) using the non-redundant database of translated sequences.
This work was supported by grants from the Medical Research Council (UK), The Wellcome Trust, The Muscular Dystrophy Association (USA) and the Muscular Dystrophy Group (UK). We are grateful to Payam Mohaghegh and Nick Owen for helpful discussions and technical support.
Human Molecular Genetics
Introduction
Results
Discussion
Materials And Methods
Patient
DNA procedures
Cloning and sequencing
Isolation of the murine homologue of SMN
Database analysis
Acknowledgements
References
Copyright Oxford University Press, 1996
