The childhood-onset spinal muscular atrophies are a clinically heterogeneous group of autosomal recessive disorders characterized by selective degeneration of the anterior horn cells with subsequent weakness and atrophy of limb muscles. The disease locus has been mapped to a region of chromosome 5q13 characterized by genetic instability and DNA duplication. Among the duplicated genes in this region, SMNT (telomeric copy; survival motor neuron) is thought to be the major disease determining gene since it is missing in the majority of SMA patients and since small, intragenic mutations in the gene have been associated with the disorder. Approximately half of the severely affected SMA I patients are also missing both homologues of a neighboring gene, the neuronal apoptosis inhibitory protein (NAIP). These data indicate that loss of NAIP may affect disease severity and further, that the molecular events underlying the childhood-onset SMAs are complex, possibly involving multiple genes. We report a third multicopy gene in the SMA region, encoding the p44 subunit of basal transcription factor II (BTF2p44). One copy of this transcription-repair gene is deleted in at least 15% of all SMA cases.
The childhood-onset spinal muscular atrophies (SMA) are autosomal recessive disorders characterized by degeneration of spinal cord anterior horn cells and proximal muscle wasting. Three forms of the disease are commonly recognized based on phenotypic severity and age of onset: Type I SMA (Werdnig-Hoffmann disease) is the most severe, with onset of symptoms prior to six months of age and death expected in the majority of cases by two years of age; onset ranges from six months to one year in Type II SMA (intermediate type) and from one year to 17 years in Type III SMA (Kugelberg-Welander disease). Type III SMA is clinically very heterogeneous.
All three forms of SMA have been localized by linkage analysis to chromosome 5q13 (1-3). Meiotic breakpoint mapping and linkage disequilibrium studies have further refined the disease locus to a relatively small genomic region. The SMA genomic region is characterized by an abundance of low-copy repeat sequences, consisting of multicopy sequence tagged sites (STSs), microsatellite loci, genes and pseudogenes (4-8). Two genes and a cDNA clone mapping to the SMA region have been shown to be preferentially deleted among SMA patients. Neuronal apoptosis inhibitory protein (NAIP) exists as a small cluster of gene(s) and pseudogenes mapping exclusively to the SMA region, as does the homologous cDNA clone XS2G3 (9,10). NAIP copy number appears to vary among individuals. Most individuals harbor multiple internally deleted and truncated copies together with at least one 'intact' copy of NAIP identified by the presence of NAIP exon 5. The intact form of NAIP, as well as XS2G3, demonstrates homozygous deletion in 45% of Type I SMA patients and 18% of Type II and Type III patients (9,10).
Survival motor neuron (SMN) also maps exclusively to the SMA region where it exists as two highly homologous, intact genes, the centromeric SMN (SMNC or BCD541) and the telomeric SMN (SMNT) (11). The telomeric copy of SMN is missing in ~95% of SMA I patients (11,12). The homozygous loss of SMNT homologues can apparently result from both deletion and gene conversion events (13); for brevity, however, we will use 'deletion' to describe both mechanisms in the remainder of the manuscript. Less frequently, patients have been identified who harbor a small intragenic mutation in one SMNT gene; these individuals are presumably compound heterozygotes with one deleted and one mutated SMNT gene (11,14). The role of SMNC in SMA etiology is unclear. SMNC appears to undergo alternative splicing and the alternate transcript is more prevalent in SMA patients than in control individuals (11,15). Because of the preferential deletion of SMNT in SMA patients and evidence for small intragenic deletion mutations within the gene (14), deletion of SMNT appears sufficient to cause disease, while the role of NAIP remains more speculative.
While the extent of SMNT deletion has been shown to be nearly equal in all three types of SMA, deletion of NAIP and several multicopy microsatellite markers like c212 and c272/Ag1CA appears to correlate with disease severity (9,10,16-19). These data have led to the suggestion that loss of NAIP, and possibly other nearby genes, leads to a more severe disease phenotype. Additionally, some studies have indicated that a greater copy number of SMNC is associated with a milder phenotype (20).
The BTF2p44 protein is a subunit of the RNA polymerase II complex which is involved in transcription and transcription-mediated DNA repair (21). A single copy of the gene has been previously mapped to the SMA region (22). We report the presence of multiple copies of the gene in the SMA region, and the localization of the gene copies in close proximity to NAIP and SMN. Furthermore, we report a base pair polymorphism which distinguishes two gene copies, and document the SMA-associated deletion of one version of the gene.
In a search for genes which map to the SMA interval, YACs spanning the region (4), as well as YAC-derived phage clones, were used for exon-trapping (23) and direct cDNA selection (24). In this fashion, we identified several cDNA clones with nearly complete sequence identity to the basal transcription factor II p44 subunit (BTF2p44) gene (21). To elucidate the entire BTF2p44 coding sequence, we used the cDNA clones to identify homologous phage and cosmid genomic clones and sequenced the clones with oligonucleotide primers derived from the cDNA clones. We identified a total of 16 exons including both 5' and 3' untranslated regions (UTRs) as shown in Figure 1. Intronic primer sequences for PCR of internal exons are described in Table 1. Crude restriction mapping has indicated a size of 20-30 kb for the gene (data not shown).
Previously reported physical maps showing the relative positions and orientations of NAIP and SMN (9,11) reveal major discrepancies, presumably reflecting individual chromosomal variations in gene organization throughout this region (25,26). The mapping of BTF2p44 relative to SMN and NAIP has been facilitated by the presence of the multicopy microsatellite marker CMS1 (4). This marker, reported to be present near the 5' end of NAIP (9), also maps to the 3' UTR of some BTF2p44 cDNA clones, based upon DNA sequence analysis (Fig. 1). We were able to position BTF2p44 relative to NAIP and SMN by PCR amplification of, and allele specific oligonucleotide hybridization to, YAC clones from the region which have been previously described (11). SMNC, ψ-NAIP, and a centromere-specific form of BTF2p44 (C-BTF2p44; see next section) map to YAC clone 759A3 whereas SMNT and the telomere-specific form of BTF2p44 (T-BTF2p44), along with a copy of NAIP, map to YAC 595c11 (Fig. 2A). Both versions of SMN and BTF2p44 map to YAC clone 920C9. The multicopy microsatellite marker CMS1 (D5Z9) maps to YAC clones from both the telomeric location (YAC 595c11) and the centromeric location (YAC 759A3). We know from previous studies that two CMS1 loci also map to a single cosmid clone, or within 40 kb of one another (7). Thus, it is certain that two or more copies of CMS1 can reside on a single chromosome although the extent of individual variation in regard to copy number has not been characterized. More recently, the complete sequence of PAC125D9 has been assembled from the sequencing of overlapping plasmid subclones of this PAC (Qianfa Chen et al., unpublished data). Comparison of BTF2p44 DNA sequence with that of PAC125D9, which contains both SMNT and NAIP (9), places 3' exons of BTF2p44 on one end of the PAC proximal to NAIP, with the more 5' exons presumably extending off the PAC (Fig. 2B). Considered together, these data indicate that the 3' UTR of the BTF2p44 gene is adjacent to the 5' end of NAIP.
The preferential deletion of SMNT and full-length NAIP in SMA patients implicates these genes in the etiology or clinical manifestation of this disorder. Because BTF2p44 maps in close proximity to these two genes, we were interested in learning whether this gene was also preferentially deleted in patients with SMA. As with the multicopy SMN and NAIP genes, deletion analysis is complicated by the presence of at least two highly homologous gene copies, and by the possibility that gene copy number varies among individuals. We used direct DNA sequence comparison of multiple cDNA clones and SSCP analysis of individual YAC clones (data not shown) to distinguish two closely related homologues of BTF2p44 which we have designated C-BTF2p44 and T-BTF2p44 based upon their relative proximity to the centromere on the YAC contig reported by Lefebvre et al. (11). These versions differ at base pair 453 in exon 7 (see Fig. 1) where an A -> G transition results in an amino acid change from isoleucine in the telomeric copy to a methionine in the centromeric copy, and at base pair 706 in exon 10 where a G -> C transversion results in an amino acid change from valine in the telomeric copy to a leucine in the centromeric copy. Although the bp453 difference could be detected by SSCP analysis, it was more reliably documented using allele specific oligomer (ASO) hybridization (27). The bp706 polymorphism was screened using a restriction assay based on an NlaIII/DdeI restriction polymorphism created by the base difference.
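The two base differences described above can be illustrated with a short sketch. It is not part of the original analysis. The isoleucine codon ATA is implied by the genetic code (it is the only isoleucine codon that an A -> G change converts to methionine, ATG); the valine codon GTT and the position of the changed base within it are assumptions chosen only for illustration, since the text gives neither. All function names are hypothetical.

```python
# Subset of the standard genetic code covering the codons used in this sketch.
CODON_TO_AA = {"ATA": "Ile", "ATG": "Met", "GTT": "Val", "CTT": "Leu"}
PURINES = {"A", "G"}


def substitution_class(ref: str, alt: str) -> str:
    """Classify a single-base change as a transition or a transversion."""
    if ref == alt:
        raise ValueError("reference and alternate bases are identical")
    # Purine<->purine or pyrimidine<->pyrimidine is a transition.
    same_chemical_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_chemical_class else "transversion"


def apply_substitution(codon: str, position: int, alt: str) -> str:
    """Return the codon with the base at a 0-based position replaced by alt."""
    return codon[:position] + alt + codon[position + 1:]


def describe(codon: str, position: int, alt: str) -> str:
    """Summarise the base change and the resulting amino acid change."""
    ref = codon[position]
    new_codon = apply_substitution(codon, position, alt)
    return (f"{ref}->{alt} {substitution_class(ref, alt)}: "
            f"{CODON_TO_AA[codon]} ({codon}) -> {CODON_TO_AA[new_codon]} ({new_codon})")


if __name__ == "__main__":
    print(describe("ATA", 2, "G"))  # bp453, exon 7 (telomeric -> centromeric)
    print(describe("GTT", 0, "C"))  # bp706, exon 10; codon GTT is an assumption
```

Any valine codon (GTN) would serve equally well for the second case, because a G -> C change at the first position always yields a leucine codon (CTN).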
A dot blot hybridization assay was developed for the bp453 polymorphism wherein a PCR product spanning BTF2p44 exon 7 was immobilized to nylon membrane and hybridized to oligonucleotide probes specific for T-BTF2p44 (5187) and C-BTF2p44 (5188) (Fig. 3). PCR amplification was performed on a sample of SMA Types I, II or III and control individuals chosen independently of SMN and NAIP deletion status. 14.3% (5/35) of unrelated SMA Type I, 14.8% (4/27) of unrelated Type II and 12.5% (3/24) of unrelated Type III patients were found to lack detectable hybridization with 5187, indicating a homozygous deletion of T-BTF2p44. One control individual (1/81 or 1.2%) was identified who showed no 5187 hybridization, indicating homozygous absence of the telomeric copy of this gene.
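The percentages above follow directly from the reported counts. The sketch below recomputes them and, as an illustration only, pools the three SMA types and compares them with the controls using a one-sided Fisher's exact test. The test is an assumption of this sketch and is not an analysis reported in the text; the function names are hypothetical.

```python
from math import comb

# (individuals lacking 5187 hybridization, individuals tested), as reported.
COUNTS = {
    "SMA I": (5, 35),
    "SMA II": (4, 27),
    "SMA III": (3, 24),
    "control": (1, 81),
}


def percent(deleted: int, total: int) -> float:
    """Deletion frequency as a percentage."""
    return 100.0 * deleted / total


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Returns the hypergeometric probability of observing a or more events
    in the first row, given fixed row and column totals.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(col1, k) * comb(n - col1, row1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, row1)


if __name__ == "__main__":
    for group, (deleted, total) in COUNTS.items():
        print(f"{group}: {deleted}/{total} = {percent(deleted, total):.1f}%")

    sma_deleted = sum(COUNTS[g][0] for g in ("SMA I", "SMA II", "SMA III"))
    sma_total = sum(COUNTS[g][1] for g in ("SMA I", "SMA II", "SMA III"))
    ctl_deleted, ctl_total = COUNTS["control"]
    print(f"pooled SMA: {sma_deleted}/{sma_total} = "
          f"{percent(sma_deleted, sma_total):.1f}%")
    p = fisher_one_sided(sma_deleted, sma_total - sma_deleted,
                         ctl_deleted, ctl_total - ctl_deleted)
    print(f"one-sided Fisher's exact p = {p:.2g}")
```

Pooling the three types is reasonable here only because the per-type frequencies are so similar; a per-type comparison would have less power with these sample sizes.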
Figure 5 shows the pattern of expression of BTF2p44 gene transcripts in multiple tissues. All tissues display transcripts of 4.0, 7.5 and >9.5 kb. Kidney and pancreas also show evidence of another transcript ~3.0 kb in size. The largest transcript appears to be most highly expressed in skeletal muscle. To test whether the two forms of BTF2p44 are specifically expressed in different tissues, we performed both ASO dot blot hybridization and NlaIII and DdeI restriction analysis on RT-PCR amplification products from adult liver, fetal liver, fetal muscle and fetal brain and found no tissue specific expression of either T-BTF2p44 or C-BTF2p44; both forms are expressed in all tissues (data not shown). Similarly, both versions of BTF2p44 were expressed in SMA individuals who contain at least one copy of T-BTF2p44 (data not shown). RT-PCR experiments with muscle and lymphoblast RNA, together with analysis of cDNA clones, indicate that exon 14 is expressed in these tissues as well (data not shown).
The p44 subunit of basal transcription factor II (BTF2p44) is part of a transcription-repair complex involving many protein components (21,28). The gene encoding this subunit was recently mapped to the SMA region on chromosome 5 (22). We report further characterization of this gene including identification of the full coding sequence consisting of at least 16 exons, and evidence for two and possibly more highly homologous gene copies mapping to the SMA region at chromosome 5q13. We report for the first time an exon (exon 14) which, if expressed in a full length transcript, would prematurely stop the protein coding sequence and eliminate the zinc binding motif encoded by exon 16. Two highly homologous copies of BTF2p44 are described which differ by two non-conservative amino acid changes, Ile -> Met in exon 7 and Leu -> Val in exon 10. Both forms of BTF2p44 appear to be ubiquitously expressed and are present in fetal and adult tissues and in SMA and control lymphoblasts. The variable presence of exons 11 and 14 in isolated cDNAs suggests that BTF2p44 transcripts are synthesized both with and without these exons, although we have not successfully quantitated the relative levels of each species. The functional significance of the two forms of BTF2p44 or the alternatively transcribed messages is not known. It is possible these alternate transcripts are expressed from pseudogene copies of BTF2p44, similar to B-cadherin in this same region (8); however, we have no evidence for the existence of incomplete BTF2p44 gene copies. All BTF2p44 DNA sequence variations detected in genomic DNA have likewise been identified in RNA preparations from lymphoblast cell lines by reverse transcriptase PCR amplification followed by DNA sequencing. These data indicate that non-expressed BTF2p44 pseudogenes do not exist, although the multiple copy nature of BTF2p44 makes this assertion difficult to prove.
Physical mapping in the SMA region is complicated and imprecise (25 ), presumably due to the great instability and variability arising from the duplication, deletion and gene conversion of highly homologous DNA in this region. Because physical maps are ill-defined, it is difficult to define the extent and content of SMA deletions. Indeed, it is not known whether deletions are continuous, interrupted or both across this region. Evidently, gene conversion accounts for the loss of some SMNT gene copies (20 ), and presumably for the loss of other highly homologous loci in the region; distinguishing gene conversion from a true deletion is difficult in the SMA region. It is interesting that no unambiguous evidence has emerged defining deletion breakpoints in SMA patients. In a region of extensive DNA duplication, certain molecular events would leave little trace of DNA deletion or rearrangement, i.e., gene conversion events, unequal crossing over, or deletion within direct repeat duplicated DNA segments. On the other hand, deletion involving inverted repeat gene segments, or deletion events spanning segments of unique sequence DNA, would be expected to leave detectable evidence, for example, in the form of altered restriction fragments. Unfortunately, the current state of physical mapping of the multicopy BTF2p44 gene does little to clarify these issues; deletion of T-BTF2p44 does not show any significant correlation to disease severity. The physical mapping and deletion data are consistent with either an interrupted pattern of small scale deletions, or a large scale, continuous deletion mechanism. The current data are not, however, consistent with small-scale continuous deletions.
The localization of BTF2p44 to non-overlapping YAC clones demonstrates its multicopy nature. The presence of the multicopy microsatellite marker CMS1 in the 3'-UTR of this gene is consistent with this interpretation. CMS1 has been reported to lie immediately proximal to the 5'-end of NAIP (9), suggesting that BTF2p44 and NAIP lie adjacent to one another at least in one orientation. This interpretation is supported by the mapping of BTF2p44 to PAC125D9 which contains the full length NAIP sequence.
Distinct copies of BTF2p44 were identified based on an A -> G transition in exon 7 and a G -> T transversion in exon 10. Dot blot ASO YAC screening of bp453 and restriction analysis at bp706 localized one copy to YAC 595c11, which contains full length NAIP and SMNT (9,11). We designated this the telomeric copy (T-BTF2p44). The centromeric copy (C-BTF2p44) was localized to the centromeric YAC clone 759A3 (11) along with another copy of NAIP and SMNC. Both homologues of T-BTF2p44 are missing in ~14% of all SMA individuals, with no apparent correlation to severity of phenotype. By contrast, NAIP appears to be most frequently deleted in more severely affected SMA patients (9). Assay of the bp706 polymorphism detects deletions in SMA individuals less frequently than the bp453 polymorphism does. It is therefore probable that this base change does not absolutely delineate two different versions of the gene, but that bp706 is a polymorphic site within one (or more) gene copies. For example, three copies of BTF2p44 might exist in the human genome, one of which is polymorphic at bp706. An assay of this polymorphism would detect a deletion when all non-deleted copies of BTF2p44 contained the alternate allele; no deletion would be detected if the remaining two (or more) gene copies were polymorphic at bp706. The 'deletion frequency' would then be a combinatorial measure of both the rate of deletion of T-BTF2p44 and the frequency of the polymorphism. Again, the fact that all non-SMA individuals examined exhibited both versions of the bp706 polymorphism indicates that in the normal human genome both bp706 versions of the gene are present.
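The combinatorial argument above can be made concrete with a toy model. This is a sketch under stated assumptions, not a model fitted to the data: it assumes that each remaining gene copy independently carries the telomeric-type bp706 allele with the same probability, and every parameter value in the example is hypothetical.

```python
def apparent_deletion_frequency(true_deletion_freq: float,
                                allele_freq_per_copy: float,
                                remaining_copies: int) -> float:
    """Frequency at which a bp706 assay would report a deletion.

    true_deletion_freq   -- frequency of homozygous T-BTF2p44 deletion
    allele_freq_per_copy -- assumed probability that any one remaining gene
                            copy carries the telomeric-type bp706 allele
    remaining_copies     -- assumed number of non-deleted gene copies

    A deletion is scored only if none of the remaining copies carries the
    telomeric-type allele, which would otherwise mask the loss.
    """
    if not 0.0 <= true_deletion_freq <= 1.0:
        raise ValueError("true_deletion_freq must lie in [0, 1]")
    if not 0.0 <= allele_freq_per_copy <= 1.0:
        raise ValueError("allele_freq_per_copy must lie in [0, 1]")
    if remaining_copies < 0:
        raise ValueError("remaining_copies must be non-negative")
    prob_not_masked = (1.0 - allele_freq_per_copy) ** remaining_copies
    return true_deletion_freq * prob_not_masked


if __name__ == "__main__":
    # Hypothetical values: 14% true deletion frequency, two remaining copies.
    for q in (0.0, 0.25, 0.5, 1.0):
        freq = apparent_deletion_frequency(0.14, q, 2)
        print(f"allele frequency per copy {q:.2f}: apparent frequency {freq:.3f}")
```

Under this model the bp706 assay can never report more deletions than the bp453 assay, which matches the direction of the difference described in the text.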
Two of the 13 SMA patients who were homozygous null for T-BTF2p44 and SMNT contained at least one copy of NAIP exon 5. Interpretation of this result is not straightforward. Based upon physical mapping of BTF2p44 and NAIP in the PAC125D9 clone (Fig. 2B), as well as the mapping of CMS1 relative to both BTF2p44 and NAIP, we have represented NAIP residing between SMNT and T-BTF2p44 in Figure 2A. If this is the correct gene order, the deletion data would be consistent with a non-continuous deletion event which excludes NAIP in these two cases.
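The reasoning about continuity can be sketched as a simple check against an assumed gene order. The order used below is the one represented in Figure 2A (SMNT, NAIP, T-BTF2p44); treating each locus as a single unit ignores the multicopy nature of these genes, so this is only an illustration with hypothetical names.

```python
# Assumed gene order, as represented in Figure 2A of the text.
GENE_ORDER = ["SMNT", "NAIP", "T-BTF2p44"]


def deletion_is_contiguous(deleted, order=GENE_ORDER) -> bool:
    """Return True if the deleted loci occupy adjacent positions in the order.

    An empty or single-locus deletion is trivially contiguous.
    """
    positions = sorted(order.index(gene) for gene in deleted)
    return all(b - a == 1 for a, b in zip(positions, positions[1:]))


if __name__ == "__main__":
    # Patients null for SMNT and T-BTF2p44 who retain NAIP exon 5.
    print(deletion_is_contiguous({"SMNT", "T-BTF2p44"}))
    # A deletion removing all three loci.
    print(deletion_is_contiguous({"SMNT", "NAIP", "T-BTF2p44"}))
```

If the true gene order differs from the assumed one, the same observation could be explained by a single continuous deletion, which is why the interpretation depends on the physical map.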
Two Type III SMA siblings were shown to contain at least one copy of SMNT exons 7 and 8, as well as exon 5 of NAIP, yet were deleted for T-BTF2p44. These two patients do not show a loss at bp706 in exon 10. Whereas this deletion may suggest a role for BTF2p44 in SMA etiology, it may also reflect rarely occurring deletions that are unrelated to the disease. The presence of the BTF2p44 deletion in a small percentage of the normal population indicates that such a deletion could occur in a small percentage of SMA individuals independent of an SMA-causing deletion. We are currently looking for point mutations within SMNT in these SMA individuals.
The high frequency of SMNT deletions among SMA patients, together with documentation of several small intragenic mutations within this gene (11,12), strongly implicates SMNT as a causative agent in SMA. It remains to be determined, however, whether all cases of SMA must lack a functional copy of SMNT. The rare occurrence of unaffected, carrier parents who are homozygous null for exons 7 and 8 of SMNT (29-31) and the rare occurrence of affected individuals who contain at least one copy of SMNT (although they may still harbor point mutations) leave open the possibility that SMA can result from deletions or mutations of other genes in the region, either singly or perhaps in combination.
BTF2p44 is an interesting addition to the genes preferentially deleted in SMA individuals since it has a well-defined function in transcription and repair. A recent report shows that BTF2p44 and a specific protein kinase, MO15, interact within a transcription-repair complex where the p44 subunit imparts to MO15 the ability to specifically phosphorylate RNA polymerase II and to perform nucleotide excision repair of mutated DNA (32). We have physically and genetically mapped MO15 to the same YAC contig containing SMN, NAIP and BTF2p44, although just outside of the critical disease gene region (33; unpublished observation). It is interesting, though highly speculative, to hypothesize that MO15 and BTF2p44 exist in a transcription-repair complex together with SMNT, NAIP and other proteins. Defects in the associated transcription-repair proteins are correlated with several diseases (xeroderma pigmentosum, Cockayne's syndrome and trichothiodystrophy), all of which demonstrate profound neurological dysfunction (34).
The exon 7 base pair polymorphism was PCR amplified in YACs, phage and cosmid subclones, and in total human genomic DNA. These reactions were performed in a total volume of 25 µl, with 100 ng total human genomic DNA, 0.2 mM of each dNTP, 2.5 µl of 10× Boehringer-Mannheim PCR buffer, 0.2 µM each of primers 4500 and 4501 and 0.25 U Taq polymerase. A touchdown program was applied in a Perkin-Elmer 9600 Thermal Cycler. Aliquots (8 µl) of each amplification reaction were denatured through the addition of 50 µl of a 500 mM NaOH, 2.0 M NaCl and 2.5 mM EDTA solution. The denatured products were then dot-blotted onto Hybond-N+ (Amersham, UK) membrane using a Bethesda Research Laboratories vacuum Hybri-Dot Manifold as described (35).
Alternate 15 bp oligomers were synthesized on an Applied Biosystems 392 DNA/RNA Synthesizer. End-labeling was performed on the oligomers at 37°C for 1 h after addition of 2.5 µl of NEB T4 polynucleotide kinase buffer, 17 µl of [γ-32P]ATP and 1.2 µM oligomer in a final volume of 25 µl. The reaction mix was raised to 100 µl volume with distilled water and placed on NA45 paper; unincorporated radioactively labeled ATP was subsequently removed through a series of four room temperature washes in TE + 175 mM NaCl. The remaining labeled oligomers were stripped from the NA45 membrane with a 10 min, 65°C wash in the hybridization solution (5× SSPE, 0.5% SDS). The dot blots described above were then incubated at 47°C for 2 h in a rolling hybridization oven. The blot was then subjected to two 15 min washes in 3 M TMA at 47°C, after which 1 h of autoradiography was performed using DuPont Reflection film at room temperature.
Approximately 210 ng of genomic DNA was used in a PCR reaction with primers p44-706-F2B (5' CGT ACC ATG TTA TTT TAG ATG 3') and p44-706-R2C (5' TAC GAA TAA GTG AGC ATT CAG 3'). After initial denaturation the PCR conditions included annealing temperature of 57°C, extension time of 45 s at 72°C and denaturation for 40 s at 94°C, for 36 cycles. PCR products were purified using the Wizard PCR purification system (Promega, Madison, WI). One half of the purified product was digested with DdeI and the other half with NlaIII (New England Biolabs, Beverly, MA) in 30 µl reactions according to the manufacturer's conditions. The digestion products were separated on a 15% nondenaturing acrylamide gel and visualized with ethidium bromide for gel photography.
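As a rough plausibility check on the annealing temperature, the two primer sequences given above can be characterised with the Wallace rule of thumb, Tm ≈ 2(A+T) + 4(G+C) °C. This estimate is an assumption of the sketch, is only approximate for short oligonucleotides, and is not stated in the text to be the method used to choose the 57°C annealing temperature.

```python
# Primer sequences as given in the text, written 5' to 3'.
PRIMERS = {
    "p44-706-F2B": "CGT ACC ATG TTA TTT TAG ATG",
    "p44-706-R2C": "TAC GAA TAA GTG AGC ATT CAG",
}


def clean(seq: str) -> str:
    """Remove triplet spacing and normalise to upper case."""
    return seq.replace(" ", "").upper()


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the primer."""
    s = clean(seq)
    return (s.count("G") + s.count("C")) / len(s)


def wallace_tm(seq: str) -> int:
    """Wallace-rule melting temperature estimate in degrees Celsius."""
    s = clean(seq)
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(clean(seq))} nt, "
              f"GC {100 * gc_fraction(seq):.0f}%, "
              f"Wallace Tm ~{wallace_tm(seq)} °C")
```

For these two 21-mers the rule gives estimates of 56 and 58°C, on either side of the 57°C annealing temperature reported above.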
A fragment was amplified using primers F1 and B14 from c19-2. This fragment was labeled by the random hexamer method and hybridized in 10 ml of 5× SSPE, 10× Denhardt's solution, 100 µg/ml salmon sperm DNA, 2% SDS and 50% formamide overnight at 42°C. Two washes were performed at 42°C for 20 min in 2× SSC and 0.5% SDS, followed by two more washes at 50°C for 20 min in 0.1× SSC and 0.1% SDS. Autoradiography was performed at -80°C for 3 days using DuPont Reflection film with an amplification screen.
This work was supported by the Families of SMA (Chicago, IL), Andrews' Buddies, Inc., the Muscular Dystrophy Association of America (LMK and TCG), and the National Institutes of Health Grants NS28877 (to TCG) and NS23740 (to LMK). CHW is a recipient of an NIH Clinical Investigator Award, NSO1576, and receives support from the Colleen Giblen Foundation for Pediatric Neurology Research. LMK is an associate investigator of the Howard Hughes Medical Institute.
Human Molecular Genetics Pages
Copyright Oxford University Press, 1996.


