A gene transcribed from the bidirectional ATM promoter coding for a serine rich protein: amino acid sequence, structure and expression studies
Philip J. Byrd*, Paul R. Cooper+, Tatjana Stankovic, Harjit S. Kullar, Giles D. J. Watts, Paula J. Robinson and A. Malcolm R. Taylor
CRC Institute for Cancer Studies, The University of Birmingham Medical School, Birmingham B15 2TJ, UK
Received June 21, 1996; Revised and Accepted August 1, 1996
DDBJ/EMBL/GenBank accession no. X97186
In an earlier report we showed that the 5' end of the gene for ataxia telangiectasia, ATM, is within 700 bp of the 5' end of a novel gene, E14, and suggested that the CpG island that separates these genes functions as a bidirectional promoter. We have now determined the complete amino acid sequence of the E14 protein and defined the exon/intron structure of the gene, and we estimate that the complete gene is more than 55 kb in length. The E14 gene appears to be a housekeeping gene that is expressed in all tissues, including all parts of the brain. The E14/ATM promoter organisation is conserved in man, monkey and mouse, although the mouse promoter is more compact and appears to lack two of the four putative Sp1 boxes found in the human promoter. Reporter gene constructs showed that the human and mouse E14/ATM promoters were indeed bidirectional, that the ATM side of the human promoter was three times stronger than the E14 side, and that the mouse promoter (in human cells) directed transcription with equal efficiency in both directions, but at a lower level than the human promoter. Analysis of a small number of A-T patients for mutations in the promoter region or the E14 coding sequence did not provide evidence to suggest that E14 contributes to the A-T phenotype.
In the course of completing the 5' half of the cDNA for the gene for ataxia telangiectasia, ATM (1-3), we identified a CpG island immediately upstream of the end of a 5' ATM RACE product. Within this island and ~700 bp upstream of the 5' end of the ATM RACE product we found the first exon of a gene that we had isolated from this region by exon trapping (2). This novel gene, which we called E14, appeared to be transcribed in the opposite direction to the ATM gene from an intergenic region that had features of a bidirectional promoter. Since the proteins encoded by some divergently transcribed genes have been found to interact or to be involved in the same biochemical pathway (4-6), we decided to characterise the E14 protein as a protein that potentially interacts with the ATM protein or which is involved in the same pathway as the ATM protein.
At the present time there is very little known about the biochemical properties or function of the ATM protein, beyond a superficial knowledge of its role in the response to DNA damage and its apparent involvement in the maintenance of the stability of the genome (7). The ATM protein has a phosphatidylinositol 3-kinase (PI 3-kinase) like domain at its C-terminus (1), which is a feature common to a family of yeast, Drosophila and mammalian genes that are variously involved in cell cycle control, DNA repair, mitotic chromosome stability and meiotic recombination (8-11). The yeast TOR2 and mammalian RAFT1 genes are, so far, the only members of this family which have been shown to have lipid kinase activity (12,13). It has been suggested that the ATM protein is functionally related to DNA-PK, a member of the PI 3-kinase family which has serine/threonine protein kinase activity, on the basis that both are involved in aspects of the repair of ionizing radiation damaged DNA and both have a role in the recombination of immune system genes (7,11). By analogy with DNA-PK, which functions as a heterotrimer of a 450 kDa catalytic subunit DNA-PKcs and the 70 kDa and 80 kDa Ku antigens (14), the ATM protein might be expected to function as a complex with several other proteins, one of which could possibly be the E14 protein.
In this report we present the complete amino acid sequence of the E14 protein, the exon/intron structure of the gene, and expression analysis that suggests that it is a housekeeping gene. We also compare the E14/ATM promoter sequences of man, monkey and mouse and evaluate the bidirectional activity of two of these promoters. Finally, we report our investigation to identify mutations in the promoter region and the E14 coding sequence in a small number of A-T patients.
The E14 gene was identified originally (2) as an exon which we isolated from cosmids that we cloned from an ATM region YAC. Initially a series of cDNA clones were assembled into a 1200 bp contig which was found to have an open reading frame (ORF) which started 35 bp in from the 5' end but encountered no in frame stop codons downstream. Subsequently, this contig was extended by a combination of RACE and the isolation of additional cDNA clones into a sequence of ~4.5 kb which contained a 4281 bp ORF. Further cDNA library screens for non-coding sequences at the 3' end identified two polyadenylated overlapping clones, each of which contained canonical polyadenylation sequences AATAAA 30-40 bp upstream of their poly-A tracts. The complete E14 cDNA sequence is 5867 bp (GenBank accession no. X97186) excluding the poly-A tail. The shorter sequence, produced by the internal polyadenylation signal, is 5672 bp. The ORF is predicted to encode a serine rich protein of 1427 amino acids (Fig. 1) with a molecular weight of ~157 kDa. The function of this gene is unknown and a protein database search failed to suggest a possible function because no significant matches were found to any motif or domain in any other protein for which a function is known.
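The figures quoted above can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not the authors' calculation: it assumes an average residue mass of about 110 Da (a common rule of thumb, not a value taken from this report) and treats the 4281 bp ORF as excluding the stop codon, which is consistent with the 1427 residues quoted. All function and variable names are hypothetical.

```python
# Lengths quoted in the text (base pairs and residues).
ORF_BP = 4281
FULL_CDNA_BP = 5867      # complete cDNA, excluding the poly-A tail
SHORT_CDNA_BP = 5672     # transcript ending at the internal poly-A signal
REPORTED_RESIDUES = 1427

# Assumption (not from the paper): average mass of one amino acid residue.
AVG_RESIDUE_DA = 110.0


def codons_in_orf(orf_bp: int) -> int:
    """Return the number of codons in an ORF, requiring a whole number."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf_bp // 3


def approx_mass_kda(residues: int, avg_residue_da: float = AVG_RESIDUE_DA) -> float:
    """Rough protein mass in kDa from residue count and an average residue mass."""
    return residues * avg_residue_da / 1000.0


codons = codons_in_orf(ORF_BP)
mass_kda = approx_mass_kda(REPORTED_RESIDUES)
utr_difference_bp = FULL_CDNA_BP - SHORT_CDNA_BP

print(f"codons in ORF: {codons}")
print(f"approximate mass: {mass_kda:.1f} kDa")
print(f"length difference between the two cDNAs: {utr_difference_bp} bp")
```

Under these assumptions the codon count matches the reported protein length, and the rough mass estimate falls close to the ~157 kDa given in the text. A composition-aware calculation would give a somewhat different value for a serine rich protein, since serine residues are lighter than the average.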
Hybridisation of a zoo-blot with a probe representing the entire E14 coding sequence showed that the E14 sequence has been conserved through mammalian evolution with hybridizing bands being apparent in monkey, rat, mouse, dog, cow and rabbit (data not shown). A very weakly hybridising band was just discernible in chicken DNA after 15 days' exposure, suggesting that E14 related sequences might be present in birds. A more obvious discrete band was identified in yeast DNA after equally long exposure, raising the possibility that E14 sequences have been conserved in lower eukaryotes. However, further Southern blotting experiments of yeast DNA digested with five different restriction endonucleases showed that the hybridizing bands, which again could be seen only after a long exposure, were those that could also be seen on the ethidium bromide stained gel prior to blotting (data not shown). This suggested that the E14 probe was hybridising to sequences of a repetitive nature in the yeast genome. Additionally, a search of the complete yeast DNA database failed to identify potentially homologous sequences at either the DNA or protein level, other than those with a very high serine content.
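Database hits that share little beyond a high serine content are usually a consequence of biased composition rather than common ancestry. As a minimal sketch of how such bias can be quantified, the code below computes the fraction of a given residue in a protein sequence. The two sequences are invented toy examples, not E14 or yeast sequences, and the function name is hypothetical.

```python
from collections import Counter


def residue_fraction(protein: str, residue: str = "S") -> float:
    """Return the fraction of `residue` (one-letter code) in `protein`."""
    seq = protein.upper()
    if not seq:
        raise ValueError("empty sequence")
    return Counter(seq)[residue.upper()] / len(seq)


# Hypothetical toy sequences, for illustration only.
toy_serine_rich = "MSSDSSKSSRSSESSPSSLS"
toy_typical = "MKTAYIAKQRQISFVKSHFS"

for name, seq in [("serine rich", toy_serine_rich), ("typical", toy_typical)]:
    print(f"{name}: serine fraction = {residue_fraction(seq):.2f}")
```

Two unrelated sequences that both score highly on a measure like this can align by chance over serine runs, which is one reason such matches are not taken as evidence of homology.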
Analysis of the expression of the E14 gene showed that it was expressed in spleen, thymus, prostate, testis, ovary, small intestine, colon and peripheral blood leukocytes with expression apparently being highest in testis, taking the loading of mRNA in each track into account (Fig. 2). Expression was also found in heart, brain, placenta, lung, liver, skeletal muscle, kidney and pancreas (data not shown), in fact in all tissues examined. Interestingly, expression was evident in all parts of the brain including the cerebellum (Fig. 2), which is of particular importance in terms of A-T. E14 therefore appears to be a housekeeping gene of unknown function. The E14 probe detected mRNA species of ~8.8 kb, ~6.25 kb and ~5.3 kb by northern blotting. The 6.25 kb species was the most abundant, followed by the 5.3 kb, whilst the 8.8 kb mRNA was present at the lowest levels. The 6.25 kb species approximates in size to the 5.9 kb polyadenylated cDNA whilst the 5.3 kb species is smaller than expected for the 5.7 kb polyadenylated cDNA; the origin of the least abundant 8.8 kb mRNA is unclear. The relative abundance of these three mRNA species was approximately the same in different tissues.
1 Savitsky, K., Bar-Shira, A., Gilad, S., et al. (1995). A single ataxia telangiectasia gene with a product similar to PI-3 kinase. Science, 268, 1749-1753.
2 Byrd, P.J., McConville, C.M., Cooper, P., et al. (1996). Mutations revealed by sequencing the 5' half of the gene for ataxia telangiectasia. Hum. Mol. Genet., 5, 145-149.
3 Savitsky, K., Sfez, S., Tagle, D.A., et al. (1995). The complete sequence of the coding region of the ATM gene reveals similarity to cell cycle regulators in different species. Hum. Mol. Genet., 4, 2025-2032.
4 Gavalas, A. and Zalkin, H. (1995). Analysis of the chicken GPAT/AIRC bidirectional promoter for de novo purine nucleotide synthesis. J. Biol. Chem., 270, 2403-2410.
5 Heikkila, P., Soininen, R. and Tryggvason, K. (1993). Directional regulatory activity of cis-acting elements in the bidirectional [alpha]1(IV) and [alpha]2(IV) collagen gene promoter. J. Biol. Chem., 268, 24677-24682.
6 Wright, K.L., White, L.C., Kelly, A., Beck, S., Trowsdale, J. and Ting, J.P.-Y. (1995). Coordinate regulation of the human TAP1 and LMP2 genes from a shared bidirectional promoter. J. Exp. Med., 181, 1459-1471.
7 Thacker, J. (1994). Cellular radiosensitivity in ataxia telangiectasia. Intl J. Radiat. Biol., 66, S87-S96.
8 Hari, K.L., Santerre, A., Sekelsky, J.J., McKim, K.S., Boyd, J.B. and Hawley, R.S. (1995). The mei-41 gene of D. melanogaster is a structural and functional homolog of the human ataxia telangiectasia gene. Cell, 82, 815-822.
9 Greenwell, P.W., Kronmal, S.L., Porter, S.E., Gassenhuber, J., Obermaier, B. and Petes, T.D. (1995). TEL1, a gene involved in controlling telomere length in S. cerevisiae, is homologous to the human ataxia telangiectasia gene. Cell, 82, 823-830.
10 Morrow, D.M., Tagle, D.A., Shiloh, Y., Collins, F.S. and Hieter, P. (1995). TEL1, an S. cerevisiae homolog of the human gene mutated in ataxia telangiectasia, is functionally related to the yeast checkpoint gene MEC1. Cell, 82, 831-840.
11 Hartley, K.O., Gell, D., Smith, G.C.M., et al. (1995). DNA-dependent protein kinase catalytic subunit: a relative of phosphatidylinositol 3-kinase and the ataxia telangiectasia gene product. Cell, 82, 849-856.
12 Cardenas, M.E. and Heitman, J. (1995). FKBP12-rapamycin target TOR2 is a vacuolar protein with an associated phosphatidylinositol-4 kinase activity. EMBO J., 14, 5892-5907.
13 Sabatini, D.M., Pierchala, B.A., Barrow, R.K., Schell, M.J. and Snyder, S.H. (1995). The rapamycin and FKBP12 target (RAFT) displays phosphatidylinositol 4-kinase activity. J. Biol. Chem., 270, 20875-20878.
14 Jackson, S.P. and Jeggo, P.A. (1995). DNA double-strand break repair and V(D)J recombination: involvement of DNA-PK. TIBS, 20, 412-415.
15 Shapiro, M.B. and Senapathy, P. (1987). RNA splice junctions of different classes of eukaryotes: sequence statistics and functional implications in gene expression. Nucleic Acids Res., 15, 7155-7174.
16 Patthy, L. (1987). Intron-dependent evolution: preferred types of exons and introns. FEBS Lett., 214, 1-7.
17 Long, M., Rosenberg, C. and Gilbert, W. (1995). Intron phase correlations and the evolution of the intron/exon structure of genes. Proc. Natl Acad. Sci. USA, 92, 12495-12499.
18 McConville, C.M., Stankovic, T., Byrd, P.J., et al. (1996). Mutations associated with variant phenotypes in ataxia telangiectasia. Am. J. Hum. Genet., 59, 320-330.
19 Miki, Y., Swenson, J., Shattuck-Eidens, D., et al. (1994). A strong candidate for the breast and ovarian cancer susceptibility gene BRCA1. Science, 266, 66-71.
20 Tavtigian, S.V., Simard, J., Rommens, J., et al. (1996). The complete BRCA2 gene and mutations in chromosome 13q-linked kindreds. Nature Genet., 12, 333-337.
21 Lambrechts, M.G., Pretorius, I.S., Marmur, J. and Sollitti, P. (1995). The S1, S2 and SGA 1 ancestral genes for the STA glucoamylase genes all map to chromosome-IX in Saccharomyces cerevisiae. Yeast, 11, 783-787.
22 Lefebvre, O., Ruth, J. and Sentenac, A. (1994). A mutation in the largest subunit of yeast TFIIIC affects transfer-RNA and 5-S RNA-synthesis-Identification of 2 classes of suppressors. J. Biol. Chem., 269, 23374-23381.
23 Imai, T., Yamauchi, M., Seki, N., et al. (1996). Identification and characterization of a new gene physically linked to the ATM gene. Genome Res., 6, 439-447.
24 Shinya, E. and Shimada, T. (1994). Identification of two initiator elements in the bidirectional promoter of the human dihydrofolate reductase and mismatch repair protein 1 genes. Nucleic Acids Res., 22, 2143-2149.
25 Lefebvre, S., Burglen, L., Reboullet, S., et al. (1995). Identification and characterization of a spinal muscular atrophy-determining gene. Cell, 80, 155-166.
26 Roy, N., Mahadevan, M.S., McLean, M., et al. (1995). The gene for neuronal apoptosis inhibitory protein is partially deleted in individuals with spinal muscular atrophy. Cell, 80, 167-178.
27 Oettinger, M.A., Schatz, D.G., Gorka, C. and Baltimore, D. (1990). RAG-1 and RAG-2, adjacent genes that synergistically activate V(D)J recombination. Science, 248, 1517-1523.
28 Uziel, T., Savitsky, K., Platzer, M., et al. (1996). Genomic organization of the ATM gene. Genomics, 33, 317-320.
29 Nasrin, N., Ercolani, L., Denaro, M., Kong, X.F., Kang, I. and Alexander, M. (1990). An insulin responsive element in the glyceraldehyde-3-phosphate dehydrogenase gene binds a nuclear protein induced by insulin in cultured cells and by nutritional manipulations in vivo. Proc. Natl Acad. Sci. USA, 87, 5273-5277.
30 Poteat, H.T., Kadison, P., McGuire, K., et al. (1989). Response of the human T-cell leukaemia virus type 1 long terminal repeat to cyclic AMP. J. Virol., 1604-1611.
31 Lee, K.A.W. and Green, M.R. (1987). A cellular transcription factor E4F1 interacts with an E1A-inducible enhancer and mediates constitutive enhancer function in vitro. EMBO J., 6, 1345-1353.
32 Watanabe, H., Wada, T. and Handa, H. (1990). Transcription factor E4TF1 contains two subunits with different functions. EMBO J., 9, 841-847.
33 Munroe, D.J., Loebbert, R., Bric, E., et al. (1995). Systematic screening of an arrayed cDNA library by PCR. Proc. Natl Acad. Sci. USA, 92, 2209-2213.
34 Huen, D.S., Henderson, S.A., Croom-Carter, D. and Rowe, M. (1995). The Epstein-Barr virus latent membrane protein-1 (LMP1) mediates activation of NF-[kappa]B and cell surface phenotype via two effector regions in its carboxy-terminal cytoplasmic domain. Oncogene, 10, 549-560.
35 Liu, Q. and Sommer, S.S. (1995). Restriction endonuclease fingerprinting (REF): a sensitive method for screening mutations in long, contiguous segments of DNA. BioTechniques, 18, 470-477.
*To whom correspondence should be addressed
+Present address: Department of Human Genetics, Roswell Park Cancer Institute, Elm and Carlton Streets, Buffalo, NY 14263, USA
Copyright Oxford University Press, 1996