An RBM homologue maps to the mouse Y chromosome and is expressed in germ cells
David J. Elliott, Kun Ma, Shona M. Kerr, Rekben Thakrar, Robert Speed, Ann C. Chandley and Howard Cooke*
MRC Human Genetics Unit, Western General Hospital, Crewe Road, Edinburgh EH4 2XU, UK
Received December 27, 1995; Revised and Accepted April 2, 1996
We have isolated a murine homologue of the human Y-linked RBM genes (previously termed YRRM), a gene family implicated in spermatogenesis and which encodes proteins containing an RNA recognition motif. A number of very similar copies of this gene (called Rbm) are present in the mouse. These mouse homologues are also Y-encoded, mapping on the short arm of the chromosome, proximal to Sry. Expression is confined to the testis, specifically the germ line on the basis of lack of expression in the germ-line negative testes of adult sex-reversed mice. The timing of Rbm transcription is regulated, with fetal message levels reaching a peak at 15 d.p.c. Transcripts are clearly detectable by 4 days after birth and reach their highest level at 14 d.p.p. which is the time at which the Y chromosome condenses during meiotic prophase. These results suggest that Rbm is functionally involved in germline RNA metabolism.
As a result of the absence of essential genes from the mammalian Y chromosome, many deletions can occur without fatal consequences for their carrier. The availability of such Y chromosome deletions in the human population has compensated for the absence of recombination-based methods of genetic mapping and resulted in the development of some of the first chromosome interval maps (1 ) and the most complete YAC contig to date for any chromosome (2 ). Although deletions of the Y chromosome are not lethal they are not necessarily without consequence. For example, those which involve the sex determining gene SRY can result in sex reversal and others which involve as yet unknown genes can lead to small stature and infertility (3 ,4 ).
Deletions of the human Y involving the distal euchromatic part of the long arm can result in oligo- and azoospermia (5). This is thought to be due to the loss of a locus, AZF (azoospermia factor), which has been postulated to be present in this Yq11.23 region (5). A range of different male infertility phenotypes has been associated with deletions in this region (6) and by positional cloning a gene family (YRRM, now renamed RBM, for RNA binding motif, in the Genome Data Base) has been found (7) with members in or close to this region. Members of this family are transcribed in the testis, giving rise to mRNAs which potentially encode an hnRNP protein similar to hnRNP G (26), a widely expressed member of the large family of heterogeneous nuclear ribonucleoproteins (hnRNPs). HnRNPs are in general highly abundant nuclear proteins (8), which are found associated with nuclear transcripts forming ribonucleoprotein particles. Some hnRNPs shuttle between nucleus and cytoplasm, although in these cases they are found at steady state to be more concentrated in the nucleus (9,10). A common feature of these and many other RNA binding proteins is the RNA recognition motif (RRM), which can be present in more than one copy in the protein.
Members of the hnRNP family may have more functions than RNA binding and particle formation (11) and may be involved in several aspects of RNA metabolism. HnRNP A1 has been implicated in influencing splice site choice by competing with the SR proteins (12), and immunodepletion of hnRNP proteins can reduce splicing efficiency in vitro (13). P element disruption of a Drosophila gene encoding an hnRNP leads to spermatogenic failure as its only apparent effect (14), and another Drosophila gene encoding an RRM-containing protein, snf, has been shown to be an essential gene involved in splice site recognition in the sex determination pathway. Some mutations in this gene can give rise to dominant lethal phenotypes (15). In the context of the testis both the RNA binding activities and the potential of these proteins to influence splicing are intriguing. Sequestration of some mRNAs is known to occur in testis and there are many other messages which have spliced forms which are apparently testis-specific (16). The role of the CREM protein in the testis switches in response to follicle-stimulating hormone as a result of a change in polyadenylation site (17).
The analysis of the human RBM gene family and proof or disproof of its involvement in spermatogenesis is complicated by the presence of between 20 and 40 genes and pseudogenes (Prosser et al., submitted). Zoo blots with RBM cDNA probes suggested that there was a low copy number or perhaps a single copy Y-linked homologue in species other than primates (18 ). We wished to identify the mouse gene because of the relative ease with which developmental and biochemical material can be obtained and to determine if it corresponded to any Y-linked locus known to be involved in spermatogenesis. Several of these have already been mapped in the mouse. Mapping of homologous mouse sequences would also facilitate the construction of mice with this gene or genes either disrupted or deleted and so provide a rigorous test of the hypothesis that this gene (or genes) plays an essential role in spermatogenesis in the mouse. Such mice might also provide a useful model system for some forms of human male infertility and could give insight into the function of the RBM protein. The cloning of these genes in mice is also an important first step in studying the biochemistry of this protein. Here we report the cloning and characterization of a murine homologue of the RBM cDNA, the mapping of the mouse Rbm gene family on the Y chromosome and the timing of its expression during the development of the testis.
We isolated a mouse genomic lambda clone (λ3.2) by screening a λ2001 library with the human RBM cDNA under low stringency conditions. Homology detectable by hybridization with the human cDNA was confined in this clone to a 750 bp fragment generated by XbaI digestion of the phage DNA. This fragment was subcloned into a plasmid vector and partially sequenced. Comparison of this sequence with the human RBM cDNA sequence revealed high homology to the RNP1 region of this gene.
Two approaches were taken to confirm the Y chromosomal origin of this clone. In the first, the λ3.2 DNA was used as an in situ hybridization probe to male mouse metaphase spreads. This gave a clear signal in the pericentric region of the Y chromosome (data not shown). To confirm this result and to provide a more precise localization, primers were synthesized (g632 and g747; see Materials and Methods for sequences) based on the genomic sequence. DNA from male, female, XXSxra and XXSxrb mice (19) was tested by PCR amplification. All except the female DNA sample amplified, giving a product of the predicted size (data not shown). This STS maps one or more copies of the gene within the Sxra region of the short arm of the mouse Y chromosome, and excludes the possibility that all copies of the gene are within the region defined by the Sxrb deletion. One or more copies of this STS map within the Sxra region to the same deletion interval as Sry, the sex determining locus, but this does not exclude the possibility of copies elsewhere on the Y chromosome. Further analysis using the Yd chromosomes (21) places most if not all of these sequences proximal to Sry (22) in the region containing the Sx1 repeats (23). The locations of the Sxra breakpoint, the Sxrb deletion, Sry, Rbm and the Yd deletions are illustrated schematically in Figure 1.
Screening of mouse testis cDNA libraries with the human RBM cDNA identified many positively hybridizing clones, presumably due to the homology between different hnRNP genes. To focus on the closest homologues of the RBM genes we analysed 50 primary positives derived by hybridization of a directionally cloned adult testis cDNA library in λZAP with the human RBM clone pMK5 (7). Amplification of the phage DNA was carried out with one vector primer (d15) and a primer g842 derived from the region of λ3.2 homologous to the coding sequence of the human cDNA (see Materials and Methods). One cDNA clone was purified to homogeneity. The plasmid moiety was excised and the 1.6 kb insert of this plasmid (pmRbm1) sequenced.
An open reading frame (ORF) of 1251 bp initiating with a methionine residue was the longest present and the conceptual translation in this frame is shown in Figure 3 , aligned (using the GCG pileup program) with the predicted human hnRNP G and RBM polypeptide sequences. The N-terminal regions, including RNP1 and RNP2 of the RNA recognition motif, are highly conserved between these proteins. In addition, a second motif MNGXXLDG is also found in all three proteins just to the C-terminal side of RNP1. The region of the mouse Rbm which is C-terminal to this motif, between residues 90 and 470, is basic and contains a high content of glycine residues interspersed with basic and aromatic residues. A number of each of these residues is either conserved or changed conservatively, suggesting that they might be functionally important. The SRGY motif present in the human RBM genes is present once in the mouse Rbm, and is absent from the human hnRNP G protein. In human RBM, this tetrapeptide is part of a more complex repeat motif which includes the sequence SSRETREYAPP and is repeated three times (the third repeat has a conservative E-D substitution in position 7). This motif is not found in the mouse Rbm. A third motif GYGGX is found in the C-terminal region of all three proteins. The remainder of the C-terminus (residues 471-515) is the least conserved region.
Figure 3. Peptide sequence comparison of mouse RBM, human RBM and human hnRNP G predicted proteins. The RNP1 motif is boxed and cross hatched, the RNP2 motif is boxed and shaded, the SRGY repeat is in bold typeface and other highly conserved motifs are shaded.
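The ORF search described above can be illustrated with a minimal sketch. It assumes a forward-strand scan of all three reading frames for the longest ATG-initiated frame ending in a stop codon; the function name `longest_orf` and the toy sequence are hypothetical and are not the pmRbm1 sequence or the software actually used. The final lines check only the arithmetic that a 1251 bp frame is a whole number of codons; whether that figure includes the stop codon is not stated in the text.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end) of the longest ATG-initiated ORF on the forward strand.

    `end` is exclusive and includes the stop codon. Returns (0, 0) if no
    complete ORF is found.
    """
    seq = seq.upper()
    best = (0, 0)
    for frame in range(3):
        start = None
        # Walk the frame codon by codon.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None  # look for the next ATG in this frame
    return best


# Toy sequence (invented for illustration only).
toy = "CCATGGCTTAAGGATGAAACCCGGGTTTTAGCC"
orf_start, orf_end = longest_orf(toy)
orf_seq = toy[orf_start:orf_end]

# Arithmetic check for the ORF length quoted in the text.
orf_bp = 1251
codons, remainder = divmod(orf_bp, 3)
```

For the toy sequence the longest frame is the 18 bp stretch `ATGAAACCCGGGTTTTAG`; for a 1251 bp frame the division gives 417 codons with no remainder.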
As we had sequenced only one cDNA, it was possible that the sequences detected by Southern blot which are derived from regions of the Y chromosome more proximal than the Sxr breakpoint might give rise to other transcripts encoding significantly different proteins. The h114/h118 PCR product which detects these other genomic fragments was therefore used as a probe on the same adult testis cDNA library (derived from a mouse strain with a Mus domesticus Y chromosome) and a further three cDNA clones were isolated. Sequencing these shows a number of differences from pmRbm1, but these could be due to RNA processing (in one case to incomplete splicing resulting in a message retaining one intron, or to differences of less than six bases in the site of polyadenylation) as well as to transcription of different genes. In contrast, sequencing of RT-PCR products derived from a Mus musculus testis showed at least one clear difference between the two subspecies, with a C to G change at position 784 resulting in a non-conservative Q to E amino acid change (data not shown). These results are consistent either with transcripts originating predominantly from a single gene or with multiple transcribed genes which are identical within a strain but different between strains.
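The link between a single C to G nucleotide change and a glutamine (Q) to glutamate (E) substitution can be checked against the standard genetic code. The sketch below is illustrative only: it enumerates every single C to G substitution in the two glutamine codons and translates the result. It does not use the Rbm sequence itself, and the actual codon at position 784 is not given in the text.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG-ordered table.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def c_to_g_variants(codon):
    """Yield (mutant_codon, amino_acid) for each single C-to-G substitution."""
    for pos, base in enumerate(codon):
        if base == "C":
            mutant = codon[:pos] + "G" + codon[pos + 1:]
            yield mutant, CODON_TABLE[mutant]


# Glutamine codons in the standard code.
gln_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "Q")

# Map each glutamine codon to the outcomes of a C-to-G change.
outcomes = {codon: list(c_to_g_variants(codon)) for codon in gln_codons}
```

Both glutamine codons (CAA and CAG) carry their only C in the first position, so a C to G change there gives GAA or GAG, each of which encodes glutamate.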
We wished to use RT-PCR to detect transcripts of this gene simply and reliably without the complication of signals from contaminating genomic DNA. Homology between the ubiquitously expressed hnRNP G and the mouse Rbm transcript might be another source of false signals. Dot blot comparison was used to find regions of Rbm with minimal homology to human hnRNP G and primers were selected within these regions. Primers h114 and h118 are suitable as they fail to amplify mouse genomic DNA.
A variety of adult mouse tissues was assayed by RT-PCR using the h114/h118 primer pair and 30 cycles of amplification. There are no detectable transcripts in kidney, placenta, lung, spleen, heart or brain (not shown). Testes from XXSxra (Fig. 4 , lane 12) and XXSxrb mice (not shown) were also negative under these PCR conditions but testis RNA from normal adult mice is strongly positive (Fig. 4 , lane 9). Nested PCR using primers h118 and h115 in the second set of 30 cycles produced a low level of product from the XXSxr samples but not from a control kidney RNA sample (not shown). This may reflect the presence of small regions of cells in these testes which have lost an X chromosome and can proceed partially through spermatogenesis and implies that the gene or genes present in the Sxr region is transcribed. This is supported by the detection of transcription using RT-PCR in the testes of XSxraO and to a lesser extent XSxrbO mice (Fig. 4 , lanes 11 and 13). The reduction in signal in XSxrbO mice compared with XSxraO could be due to the presence of a highly transcribed gene in the Sxrb deletion but is perhaps more likely to be due to the much earlier failure of spermatogenesis in these animals compared with XSxraO animals.
Figure 4. Reverse transcription-PCR analysis of Rbm mRNA expression. RNA was isolated from embryos 15.5 and 17.5 days post coitum (lanes 1 and 2), and from the testes of newborn, 2 day, 4 day, 6 day, 8 day, 10 day, and adult (lanes 3-9), and from the testes of XSxraO, XXSxra and XSxrbO (lanes 11-13). No signal was observed without reverse transcriptase (lane 10), or in the water control (lane 14).
We have also examined the timing of expression of this sequence during development using RT-PCR (Fig. 4). Transcripts are detectable in the testis during embryonic development with an apparent peak at 15 d.p.c. (lane 1). They are barely detectable at birth or 2 d.p.p. (lanes 2 and 3) but are found at 4 d.p.p. (although at an apparently reduced level compared with later stages) and continue to be detectable thereafter (lanes 4-9).
Northern blot analysis of total RNA from kidney, testes and brain is shown in Figure 5 . A transcript of 1.7 kb, consistent with the sequenced length of the cDNA clone, is detectable in the testis sample when it is probed with the PCR product produced by h114/h118 from either testis cDNA or the plasmid pmRbm1 as a template. As expected, no Rbm transcript was detected in RNA isolated from either brain or kidney.
Figure 5. Quantitation of Rbm mRNA expression during the first wave of spermatogenesis by Northern blotting. Total RNA isolated from the testes of adult mice and those of mice between 6 and 18 days after birth, from round spermatids isolated by Staput fractionation (25), and from male brain and kidney was electrophoresed through a formaldehyde/agarose gel. After blotting, the filter was hybridized with a random primed probe specific for Rbm (using the PCR product derived from primers h114 and h118 as a template). The level of Rbm transcript in each lane was quantified on a phosphor imager, and then the filter was re-probed using a random primed probe specific for ribosomal protein S16.
Northern blot quantitation of the level of Rbm mRNA in the testes of prepubescent mice shows that a maximal amount of the RNA is present at 14 d.p.p. when corrected for loading by comparison with the level of a transcript encoding ribosomal protein S16 (24) (Fig. 5). At this stage, spermatogonia and early spermatocytes are present in the testis. However, Rbm transcript was also detected in RNA from round spermatids isolated from adult testes by Staput fractionation (25). The presence of Rbm transcript in round spermatids suggests that the mRNA may have a long half-life, and may be translated at a later stage than when it is transcribed. An additional higher molecular weight transcript was detected with the ribosomal protein S16 probe, which may correspond to the previously reported pre-mRNA from this gene (24).
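The loading correction described above amounts to dividing the Rbm signal in each lane by the S16 signal in the same lane and comparing the resulting ratios. A minimal sketch follows; the function name `normalise_to_loading_control`, the lane labels and all signal values are placeholders invented for illustration and are not measurements from this study.

```python
def normalise_to_loading_control(target, control):
    """Return target/control ratios per lane, scaled so the largest ratio is 1.0.

    `target` and `control` map lane labels to band intensities
    (arbitrary phosphor imager units).
    """
    if target.keys() != control.keys():
        raise ValueError("target and control must cover the same lanes")
    ratios = {}
    for lane, signal in target.items():
        if control[lane] <= 0:
            raise ValueError(f"non-positive control signal in lane {lane!r}")
        ratios[lane] = signal / control[lane]
    peak = max(ratios.values())
    if peak <= 0:
        raise ValueError("no positive target signal to scale against")
    # Express every lane relative to the lane with the highest corrected level.
    return {lane: ratio / peak for lane, ratio in ratios.items()}


# Placeholder intensities (NOT data from the paper).
rbm_signal = {"lane_a": 100.0, "lane_b": 450.0, "lane_c": 300.0}
s16_signal = {"lane_a": 200.0, "lane_b": 300.0, "lane_c": 300.0}

relative = normalise_to_loading_control(rbm_signal, s16_signal)
peak_lane = max(relative, key=relative.get)
```

With these placeholder values the raw Rbm signals alone would overstate the difference between lanes; after correction the lanes stand at one third, one, and two thirds of the peak.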
A number of candidate genes and two genetically defined loci on the mouse Y chromosome have been implicated in spermatogenesis. One locus, Spy (27), has been defined and mapped to the region of the Y chromosome short arm deleted by the Sxrb deletion on the basis of differences between XOSxra and XOSxrb testes. This locus is thought to be required for the early stages of spermatogonial development. Two genes, Smcy (Hya) and Ube1y-1, are known to be located in this region and are therefore strong candidates for Spy (28,29). A long arm locus involved in sperm head morphology has also been described and transcripts deriving from this region of the chromosome have been detected (30). Our mapping data exclude the mouse Rbm genes described here from these regions of the chromosome and therefore suggest that they do not correspond to any of these loci, in particular Spy.
Spermatogenesis is a complex developmental process which includes both germ cell division and differentiation, and requires the support of somatic cells in the testes. The chronology of these events in mouse spermatogenesis has been determined by timing the appearance of specific cell types during the first wave of spermatogenesis after birth (31,32). The absence of transcripts from XXSxra and XXSxrb testes suggests that transcription of Rbm is confined to the germ line. Transcripts from the Rbm gene are detectable during embryonic development with a peak at around 15.5 days post coitum; this is the time at which gonocytes and undifferentiated type A spermatogonia are at the end of their period of embryonic proliferation. Transcripts are again detectable four days after birth, which is about the time at which spermatogonia resume division. Expression is maximal prior to condensation of the Y chromosome (33). Hence Rbm may have multiple roles at different stages of germ cell development.
The protein encoded by the Rbm gene has substantial differences from the human RBM protein. Although the RNA recognition motifs are conserved, the SRGY motif is repeated in the human protein but is a single SRGY tetrapeptide in mouse. At the peptide level the human and mouse sequences are 46% identical and 66% similar (using the GCG program gap). This homology is present throughout the reading frame, in contrast to the Sry gene in which the HMG box is the only region conserved between species (34). Several specific sequence motifs are shared between hnRNP proteins, although the functional significance of these apart from the RNA recognition motif is not known. In general, the hnRNP proteins are thought to be modular, containing an auxiliary domain attached to an RNA binding domain (containing one or more RRMs). The auxiliary domain is thought to be responsible for the biological properties of the molecules, such as interacting with other proteins and targeting within the nucleus. However, a basic auxiliary domain such as that found in RBM might also play a role in RNA binding. The extreme C-terminus is the least conserved region between these homologues, suggesting it might play a species/molecule-specific role. A detailed analysis of the evolution of this gene will require the sequence of the mouse hnRNP G gene in order to be able to compare the related autosomal gene with the Y-linked one in the two species.
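To make the distinction between percent identity and percent similarity concrete, the following sketch scores two pre-aligned peptide strings column by column. It is an illustration under stated assumptions, not a reimplementation of the GCG gap program: the conservative-substitution groups in `SIMILAR_GROUPS` are a simplified, hypothetical scheme, gapped columns are skipped, and the two peptides are made up for the example.

```python
# Simplified conservative-substitution groups (an assumption for illustration;
# GCG gap derives similarity from a scoring matrix instead).
SIMILAR_GROUPS = ("KRH", "DE", "ST", "NQ", "AVLIM", "FYW")


def percent_identity_similarity(aligned_a, aligned_b):
    """Return (% identity, % similarity) over ungapped columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip columns containing a gap
        compared += 1
        if x == y:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(x in group and y in group for group in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


# Made-up aligned peptides (not taken from Rbm, RBM or hnRNP G).
identity, similarity = percent_identity_similarity("MNGKSLDG", "MNGRALDG")
```

Here six of eight columns are identical and one further column (K versus R) is a conservative change, so similarity is always at least as high as identity, as in the 46% and 66% figures quoted above.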
The Rbm genes and related sequences appear to be confined to a small region of the chromosome and the family is less variable in sequence than the corresponding human family, which is widely distributed on the Y chromosome and is not obviously part of a repeated DNA organization (Prosser et al., submitted). The Y chromosomes of both mouse and humans contain a number of gene and pseudogene families apart from the Rbm/RBM genes. In humans the TSPY gene family consists of about 200 copies (35) and in mouse the short arm of the Y chromosome has two copies of Zfy (36) and a number of pseudogenes related to Ube1Y-1 (37). A possible reason why these gene families might arise frequently on the Y chromosome could be the lack of recombination. Gene duplication in a recombining region of the genome might be corrected by the recombination process whereas on the Y chromosome such duplications are more likely to survive. If there was no requirement for multiple active genes (and our cDNA sequence data are consistent with this) then divergence might quickly result in the presence of closely linked pseudogenes. A further consequence of this is that gene conversion between gene and pseudogene could be a frequent event resulting in loss of function. An alternative explanation is that multiple copies of the genes are a response to a requirement for a large amount of mRNA. Because of the repeated nature of the Rbm genes in mouse, knockout experiments designed to test these hypotheses will not be straightforward and so a biochemical approach to the function of the Rbm protein may be more appropriate.
A genomic λ clone was initially isolated from a male C129 library in λ2001 (a gift of M. Smith) by hybridization with a labelled human cDNA probe in 5× SSC at 60 °C followed by washing at room temperature in 2× SSC. Subsequently, a further clone was isolated from this library using a mouse cDNA PCR product as a probe at high stringency. Genomic P1 clones were purchased from Genome Systems.
cDNA clones were isolated from an adult CD-1 testis library in the λZAP vector and after purification were excised to give plasmids by infection with helper phage. The initial clone was derived by low stringency hybridization to a human RBM cDNA probe. Subsequent clones were isolated by high or low stringency hybridization with this clone or with RT-PCR products.
Southern and Northern blots and hybridizations were carried out by standard methods (38 ). Ribosomal protein S16 mRNA was detected using a random primed probe prepared using a PCR product made from primers 5'-AGGAGCGATTTGCTGGTGTGGA and 5'- TCCATCTCAAAGGCCTGGTAGC on the mouse cDNA library as a template.
All PCR primers were designed using the Primer program (E. Lander; provided by the MRC HGMP resource centre) to have annealing temperatures of 60 °C and were used at a final concentration of 1 mM in TAPS buffer containing 0.2 mM dNTPs and 2 mM MgCl2 (39). PCR reactions were carried out on samples prepared either as genomic DNA, by reverse transcription of total RNA with an oligo dT12-18 primer, or by toothpicking samples of bacterial colonies or phage plaques directly into PCR reaction mixes. Primers used for STS mapping were g632 5'-AGA GGC TTT GCT TTC CTT AC and g747 5'-AGG AAT TTG CTC ATT TTT CAG C. Primers used for RT-PCR analysis were h114 5'-GAT GGT GCC TCA TGG AAT CT and h118 5'-AAA TAT GCC AAG AAGG AGA GCC. Intron spanning primers were g632 and h295 5'-TTC CCT TCA CAT GAA GGA CC (this oligonucleotide was synthesized with a 5' biotin modification). Phage DNA was amplified during cDNA screening with vector primers d15 (5'-AGCGGATAACAATTTCACACAGGA) and c853 (5'-ATTAACCCTCACTAAAGGGA). Primer g842 used for library screening was 5'-GCAGCGCGTCGGAAAGTAAGG. All amplifications were carried out using a Hybaid OmniGene thermal cycler with a 4 min initial denaturation at 95 °C in the absence of polymerase, which was added while reactions were held at 80 °C. Subsequent cycles were 94 °C for 45 s, 60 °C for 45 s and 72 °C for 45 s, repeated for 30 or 35 cycles and followed by a final extension of 10 min at 72 °C. Biotinylated products for sequencing were bound directly to streptavidin beads (39) and sequenced using dideoxy terminators (US Biochemicals, Cleveland, Ohio). The mouse Rbm cDNA sequence has been deposited in the GenBank database with the accession number GB:MMU36929.
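The cycling conditions given above can be written out as a simple program description, which also makes it easy to estimate the programmed hold time for the 30 and 35 cycle runs. This is a sketch only: the names `CYCLE_STEPS` and `programmed_seconds` are hypothetical, and the estimate deliberately ignores ramp times between temperatures and the 80 °C hold during polymerase addition, neither of which is specified in the text.

```python
# Steps as (label, temperature in deg C, duration in seconds), from the text.
INITIAL_DENATURATION = ("initial denaturation", 95, 4 * 60)
CYCLE_STEPS = [
    ("denature", 94, 45),
    ("anneal", 60, 45),
    ("extend", 72, 45),
]
FINAL_EXTENSION = ("final extension", 72, 10 * 60)


def programmed_seconds(cycles):
    """Total programmed hold time in seconds for a given number of cycles.

    Excludes temperature ramping and the manual hot-start hold at 80 deg C.
    """
    if cycles < 1:
        raise ValueError("cycles must be at least 1")
    per_cycle = sum(duration for _, _, duration in CYCLE_STEPS)
    return INITIAL_DENATURATION[2] + cycles * per_cycle + FINAL_EXTENSION[2]


# Hold times, in minutes, for the two cycle numbers used.
minutes_30 = programmed_seconds(30) / 60
minutes_35 = programmed_seconds(35) / 60
```

Under these assumptions the programmed holds come to 81.5 min for 30 cycles and 92.75 min for 35 cycles; real run times will be longer once ramping is included.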
We would like to thank Eva Eicher and Kenn Albrecht for the kind gift of RNA from XO sex-reversed animals, Paul Burgoyne and our colleagues in the MRC Human Genetics Unit for helpful discussions and criticisms, and Bruce Cattanach for XXSxr mice.
Copyright Oxford University Press, 1996