Human Molecular Genetics
Drosophila-related expressed sequences
Expressed Sequence Tags (ESTs): An Overview
ESTs And Human Diseases
Drosophila Melanogaster: An Ideal Model Organism
Drosophila-Related Expressed Sequences
Mapping Of DRES In Humans And Mice
Systematic Expression Studies Of DRES Genes
DRES Database
Acknowledgements
References
The study of model organisms has been instrumental in elucidating the basic mechanisms of human biology. Drosophila melanogaster has been the target of extensive genetic analyses over the past 90 years, and a notable amount of information is known about its gene structure, gene regulation and gene function. The vast gene resource generated by the expressed sequence tag (EST) efforts was exploited to identify, using a bioinformatic approach, novel human and murine gene transcripts homologous to Drosophila mutant genes. A systematic characterization of these genes, named Drosophila-related expressed sequences (DRES), was performed, including genomic mapping in human and mouse and detailed study of their expression patterns by RNA in situ hybridization. Comparison between DRES genes and their putative partners in Drosophila contributes to the understanding of their function in mammals and to the discovery of their possible role in disease.
The identification of all human genes represents one of the most important aims of the Human Genome Project. It is anticipated that this goal will be achieved in its entirety by the year 2005 through the complete sequencing of the whole human genome. However, in the past 6-7 years, a 'shortcut' approach was devised which gave the scientific community access to partial sequence information for a significant number of human genes. This approach consisted of the random sequencing of human cDNA clones, which generated the so-called expressed sequence tags (ESTs).
It was evident that ESTs colinear with genomic DNAs could be easily converted into sequence-tagged sites (STSs).
In October 1994, the Merck-funded EST project spurred a tremendous increase in the number of ESTs deposited in dbEST. This project was carried out by the Genome Sequencing Center at Washington University in St Louis, MO, in collaboration with the IMAGE consortium.
How is the availability of this enormous amount of information affecting the strategies aimed at the identification of genes involved in human diseases? First of all, it is interesting to note that >80% of positionally cloned genes mutated in human disease states are represented by exact matches with one or more ESTs in dbEST (as of June 1996) (Table).
The identification of each of these disease genes has relied on a common strategy, namely the so-called positional candidate gene approach.
Comparative genomics represents another promising strategy to assess the function of genes. The complete sequencing of the genome of several organisms [...]
The fruitfly Drosophila melanogaster is one of the most valuable organisms in biological research. This is probably because Drosophila, more than other model organisms, is well suited to the application of the tools of genetics, biochemistry, molecular biology, electrophysiology and other biological techniques. Over the past 90 years, most of the knowledge about genetic phenomena such as mutation and recombination has been obtained by using Drosophila as a model organism. More than 11 000 genetic loci and 38 000 alleles have been identified to date; for a significant number of them, the genes responsible have been molecularly cloned, using techniques such as chromosome walking and transposon tagging. All this information has been deposited in FlyBase, a comprehensive database containing information on the genetics and molecular biology of this organism.
The identification of additional Drosophila mutant genes and the dissection of their biological functions will be greatly enhanced by the ongoing effort of the Drosophila Genome Project, which includes the mapping and sequencing of the entire Drosophila genome as well as the generation of several thousand ESTs.
The remarkable amount of information available makes Drosophila one of the most valuable model organisms for studying the function of genes conserved during evolution. A considerable number of genes are highly conserved between humans and Drosophila and play a similar biological role in the two species. More interestingly, the phenotypes caused by mutations in some of these genes can be very similar, or affect related systems and organs, in both man and fly.
Therefore, in theory, the systematic identification of human genes similar to genes involved in the generation of mutant phenotypes in Drosophila could provide a number of promising candidate genes for human diseases. Furthermore, comparing these novel human genes with genes that have been well characterized in the fly facilitates the process of deciphering their function in mammals. On the basis of this hypothesis, we decided to pursue this strategy. Until all human genes have been identified via large-scale sequencing approaches, the most obvious reservoir of human transcribed sequences is represented by the EST data.
In October 1995, to evaluate the feasibility of this approach, we started to query dbEST using the text query interface available at the NCBI, entering the names of selected Drosophila mutant genes as keywords. We were encouraged to observe that a significant percentage of our searches identified human ESTs showing a remarkable similarity to the Drosophila gene product used as query. This was possible because EST sequences in dbEST are periodically run through a BLAST homology search.
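Comparing a nucleotide EST with a protein sequence, as in the similarity between human ESTs and Drosophila gene products described above, requires translating the EST in each of its possible reading frames, because neither the strand nor the frame of an EST is known in advance. The following is a minimal sketch of six-frame translation in Python. It assumes the standard genetic code and plain A/C/G/T input; the function names are illustrative and are not taken from any BLAST implementation.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons only; unknown codons become 'X'."""
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translations(seq: str) -> dict:
    """Translate a DNA sequence in three forward and three reverse frames."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


frames = six_frame_translations("atggcctaa")
for name, peptide in frames.items():
    print(name, peptide)
```

A translated search then compares the protein query with each of the six conceptual translations, so an EST sequenced from either strand and starting in any frame can still be matched.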
We were aware that the list of DRES genes identified in this way was not comprehensive, as our search was based exclusively on keywords. For this reason, we adopted a more systematic approach based on an automated dbEST searching procedure. We ran a series of TBLASTN searches using all of the Drosophila melanogaster protein entries as query sequences and dbEST (dynamically translated in all six reading frames) as the target database. We organized the collection of TBLASTN output results in a searchable database named DRES search engine.
One of the main advantages of this 'TBLASTN-based' procedure is that, because the entire process is automated, dbEST can be reanalyzed periodically. Furthermore, it allows a more immediate identification of DRES members of putative gene families. One example is the putative human GOLIATH gene family: the TBLASTN search versus dbEST performed using the Drosophila Goliath protein as query reveals at least four different human cDNAs (DRES58, 119, 120 and 121) showing significant similarity to their Drosophila counterpart (Fig.).
The combination of the keyword-based and DRES search engine strategies allowed us to identify 121 DRES (as of May 1997). The degree of similarity between these human cDNAs and their Drosophila counterparts is in all cases highly significant (with P values ranging from 1.4e-70 to 1.1e-06), which suggests that the function of these genes may have been conserved during evolution. DRES may thus be considered candidate genes for human disorders whose phenotype resembles that observed in Drosophila.
In order to test this hypothesis, the next obvious step is the regional mapping of these transcripts. At the time we started the project (October 1995), very few data on EST mapping were available. For this reason, we decided to map the first set of DRES identified, using a combined approach including both fluorescence in situ hybridization (FISH) and radiation hybrid mapping. Subsequently, a large-scale EST mapping consortium was started to systematically map by radiation hybrids the EST clusters generated by UniGene, providing the public databases with regional mapping information for more than 17 000 putative human transcripts.
The mapping data are particularly important for the evaluation of DRES as candidate genes for human disorders. DRES may be considered positional candidate genes for human diseases mapped to the corresponding genomic region. This hypothesis is particularly intriguing when the phenotype of the Drosophila mutant resembles the phenotype of the human disease. There are several such examples among DRES genes: two of them are DRES9 and DRES10, which are significantly similar to the Drosophila proteins retinal degeneration B (rdgB) [...]
The identification of DRES genes responsible for human disorders can be difficult for several reasons. First, most of the standard procedures used for mutation detection require the determination of the genomic structure of the gene under examination, which is a time-consuming process. Second, in many cases, candidate diseases are clinically heterogeneous, which hampers the collection of an appropriate subset of patients to be tested. Third, many genetic diseases either have not been mapped yet or have not been assigned with sufficient accuracy to a given chromosomal region. Nevertheless, the involvement of DRES genes in human disorders has already been demonstrated in a few instances. A novel homeobox-containing gene with significant similarity to the Drosophila Goosecoid (DRES112) has been found to be mutated in Rieger syndrome, an autosomal dominant disorder characterized by ocular anterior chamber anomalies, dental hypoplasia and craniofacial dysmorphism.
Murine homologs of DRES genes (Dres) could also represent candidates for mouse mutant phenotypes, either spontaneous or induced. Accordingly, we decided to determine the mapping assignment of Dres cDNAs in the mouse genome in order to anchor their location to existing genetic maps. To systematically identify murine Dres genes, besides using standard experimental approaches such as library screening and PCR-based techniques, we took advantage of the presence of over 190 000 mouse ESTs in dbEST, mainly generated by the WashU-HHMI Mouse EST Project. This expanding resource allows us in many cases to identify murine Dres bioinformatically, simply by running a BLASTN search against dbEST using human DRES nucleotide sequences as queries. These cDNAs are currently used for mapping experiments through the analysis of strain-specific polymorphisms on interspecific backcross DNA panels. The mapping assignment of Dres genes will provide a catalog of positional candidates for mouse mutant phenotypes. This catalog will become increasingly valuable as the number of regionally mapped murine mutant phenotypes increases. The ongoing large-scale systematic generation of mouse mutants represents a promising strategy towards this goal.
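The positional candidate reasoning described above, in which a mapped transcript becomes a candidate for any disease or mutant phenotype assigned to the same genomic region, can be sketched as a simple interval-overlap check. The example below is only an illustration under stated assumptions: the class and function names are hypothetical, positions are in arbitrary map units, and the gene and disease entries are placeholder data, not real DRES or disease map positions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MapInterval:
    """A named interval on a chromosome, in arbitrary map units."""
    name: str
    chromosome: str
    start: float
    end: float


def overlaps(a: MapInterval, b: MapInterval) -> bool:
    """True if both intervals lie on the same chromosome and intersect."""
    return a.chromosome == b.chromosome and a.start <= b.end and b.start <= a.end


def positional_candidates(genes, disease_loci):
    """Map each disease locus name to the genes whose interval overlaps it."""
    return {
        locus.name: [gene.name for gene in genes if overlaps(gene, locus)]
        for locus in disease_loci
    }


# Hypothetical placeholder data, for illustration only.
genes = [
    MapInterval("gene_A", "1", 10.0, 12.0),
    MapInterval("gene_B", "1", 40.0, 41.0),
    MapInterval("gene_C", "X", 5.0, 6.0),
]
disease_loci = [
    MapInterval("disease_1", "1", 8.0, 15.0),
    MapInterval("disease_2", "X", 20.0, 30.0),
]

candidates = positional_candidates(genes, disease_loci)
print(candidates)
```

An overlap of this kind is only a first filter: as the text notes, a candidate becomes more compelling when the phenotype of the corresponding Drosophila mutant also resembles the mapped phenotype.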
In addition to sequence homology, a homologous pattern of gene expression is a common criterion used to diagnose the phylogenetic descent of two genes and their functional conservation. In fact, genes that are themselves homologous may be involved in disparate and non-homologous developmental processes and be expressed in non-homologous structures. One paradigmatic example is EYA1, a human homolog of Drosophila eyes absent, a gene required in the fly for survival and differentiation of eye progenitor cells. Mutations in the EYA1 gene have been found in patients with branchio-oto-renal (BOR) syndrome.
Taking these considerations into account, we decided to perform a detailed and systematic study of the expression of DRES genes in mammals, both during development and in adult tissues. Towards this goal, we decided to use RNA in situ hybridization, a technique which allows very accurate analysis of the spatial and temporal pattern of expression of gene transcripts. We are carrying out this analysis on mouse embryonic and adult tissue sections, using the murine homologs of DRES as probes. An example of the usefulness of the expression studies of Dres genes is the analysis of the expression of Dres9 and Dres10 (Fig.). RNA in situ hybridization studies performed on mouse embryo tissue sections at various developmental stages revealed that Dres9, similar to its Drosophila rdgB counterpart, is expressed at very high levels in the neural retina.
These systematic expression studies will provide useful information on the putative function of DRES genes in vertebrates and their possible involvement in human inherited disorders. Moreover, the correlation with the expression pattern of the corresponding Drosophila genes will be helpful in assessing a conserved function of these genes during evolution. More specific insights into the biological role of these transcripts in mammals will be derived from the study of knockout mice carrying null mutations of Dres genes.
In summary, the suggestion that a given DRES is involved in human and/or mouse inherited disorders arises from the global evaluation of sequence homology and mapping assignment data, integrated with the analysis of expression patterns. We created the DRES database (DRES db) (Table). For each DRES, the following types of data are available:
(i) GenBank accession number: a direct link to dbEST at NCBI allows the retrieval of all relevant information on the cDNA clone from which the EST was originally generated.
(ii) Drosophila gene and gene symbol: all available information for every Drosophila gene can be retrieved from FlyBase, including a detailed description of the Drosophila phenotype integrated with molecular biology data and bibliographic references.
(iii) Drosophila sequence accession number: a hypertext link to the Drosophila sequence deposited in public databases.
(iv) BLASTX P value: a link to the BLASTX output obtained using each DRES as query sequence against a non-redundant protein database. This allows a visualization of the degree of homology between the human and the Drosophila sequences.
(v) UniGene entry: this link makes it possible to determine whether a given DRES is present in a UniGene cluster.
(vi) Additional sequence information: in this field, all additional sequence data generated on DRES can be retrieved, including human and/or murine full-length transcripts.
(vii) Mapping data: this field contains all DRES mapping information generated as previously described, including FISH and radiation hybrids, as well as mapping data generated by other genome centers. In addition, a direct link to the gene map of the OMIM database allows the retrieval of all human diseases mapped to the corresponding genomic region.
(viii) Expression data (to be implemented): a link to the Gene Expression Database (GXD).
The DRES database constitutes a valuable tool which facilitates both the evaluation of DRES as candidate genes for human diseases and the understanding of their biological role in mammals.
We wish to thank Gyorgy Simon and Alessandro Guffanti for bioinformatic support, Massimo Zollo and the Tigem Sequencing Core, and Melissa Smith for preparation of this manuscript. The financial support of the Italian Telethon Foundation (Grant n. B.37) is gratefully acknowledged.
Resources | URLs
The Genome Database | http://gdbwww.gdb.org/
Human Genome Project Resources | http://gdbwww.gdb.org/gdb/hgpResources.html
dbEST | http://www.ncbi.nlm.nih.gov/dbEST/index.html
The I.M.A.G.E. Consortium | http://www-bio.llnl.gov/bbrp/image/image.html
WashU-Merck Human EST Project | http://genome.wustl.edu/est/esthmpg.html
WashU-HHMI Mouse EST Project | http://genome.wustl.edu/est/mouse_esthmpg.html
UniGene | http://www.ncbi.nlm.nih.gov/Schuler/UniGene
XREF db | http://www.ncbi.nlm.nih.gov/XREFdb/
Online Mendelian Inheritance in Man | http://www3.ncbi.nlm.nih.gov/omim/
The Human Gene Map | http://www.ncbi.nlm.nih.gov/SCIENCE96/
FlyBase | http://flybase.bio.indiana.edu:80/
The Drosophila related expressed sequences homepage | http://www.tigem.it/LOCAL/drosophila/dros.html
DRES search engine | http://gcg.tigem.it/DRES/dresearch.html
DRES db | http://www.tigem.it/LOCAL/drosophila/html/drostable.html
This page is maintained by OUP admin. Last updated Fri Sep 12 18:09:23 BST 1997. Part of the OUP Journals World Wide Web service.
Copyright
Oxford University Press, 1997