Human Molecular Genetics, 2001, Vol. 10, No. 20 2209-2214
© 2001 Oxford University Press
Sequence variation and disease in the wake of the draft human genome
MRC Functional Genetics Unit, Department of Human Anatomy and Genetics, University of Oxford, South Parks Road, Oxford OX1 3QX, UK
Received July 1, 2001; Accepted July 17, 2001.
| ABSTRACT |
|---|
The sequencing phase of the human genome project will soon be over. In its wake, repertoires of sequence polymorphisms among the human population are being sampled and a battery of functional genomics projects, from gene and protein expression studies to whole proteome interaction experiments, is generating vast quantities of data. Now that the data, or the means to generate data, are available, it is the application of this information in enhancing our understanding of biology that represents the next formidable challenge. Two prominent issues should be considered. First, existing data must be analysed using the best methods available. The prediction of enzymatic activity for bestrophin, whose gene is mutated in Best macular dystrophy, is described in this review. This is an example of the experimentally testable hypotheses that can result from such detailed and exhaustive analyses. Secondly, the torrents of data from high-throughput studies will need to be made more accessible to all using web-based resources that integrate and digest complementary data types. The internet sites that showcase the human genome sequence are blazing a new trail. Ultimately, the success of genome sequencing and functional genomics will be measured not by the quantity and accuracy of raw data generated, but by how rapidly they can be harnessed to span the divide between genotype and phenotype.
Inherited variations in the human genome provide a basis for phenotypic differences. Most of these are neutral and have little effect on an individual's health. Single nucleotide polymorphisms (SNPs) are single base pair variants with allele frequencies of at least 1%. These represent 90% of all polymorphisms in the human population. Of the more than 1.4 million SNPs identified in the draft human genome (1), only 60 000 are estimated to cause amino acid substitutions (1) (http://snp.cshl.org/). Of the estimated 1000 detrimental polymorphisms predicted in each individual (2), most will contribute to complex polygenic traits rather than being directly responsible for single gene disorders. In the human population, only approximately 1000 genes (so-called disease genes) are known to be associated with Mendelian inheritable diseases (3).
The recent availability of the human genome draft sequence (4,5) has already allowed a dramatic acceleration in disease gene discovery. Previously, this involved positional cloning after the use of genetic markers in linkage disequilibrium and association studies. The growing availability of high density cytogenic SNP linkage maps now allows the loci of rare Mendelian disorders and even some complex traits with a polygenic basis to be identified.
A further revolution is needed, however, if these breakthroughs are to usher in a new age of genetic medicine (6) in which (i) the genetic basis of all common heritable diseases, traits or predispositions can be identified; (ii) the genetic makeup of each individual can contribute to clinical diagnoses and prognoses; (iii) the heterogeneous origins of diseases in different patients can be unravelled even when they share similar symptoms; and (iv) the resulting treatments can be adjusted to match the pharmacogenetic profiles of patient and drugs. Some of these goals are within reach and may be achievable within the next 10 years; for example, by simply extending the use of association studies with SNP linkage disequilibrium profiles without actually identifying the allelic culprits (7).
Efforts to exploit genetic information in tackling diseases can be divided into two broad approaches: discovery genetics and discovery genomics (7) or, in approximate terms, diseases in search of genes and genes in search of a disease. The great advantage of the former, in proceeding from known disorders, is that, by definition, any disease genes identified will be of immediate relevance to disease diagnosis and often treatment. Already, a growing proportion of the more easily tackled common Mendelian disorders with distinct phenotypes and large families have been, or are being, addressed. Many of the remaining polygenic traits, however, are not easy targets for linkage analysis or high-throughput screening. More worrying still, there is a danger that discovery genetics may miss many potential therapeutic targets which are themselves not disease genes.
Paralogues of known human disease genes represent additions to the standard gene targets selected by discovery genetics. Paralogues are homologous genes that arose from intra-genome duplications. Some human paralogous genes have been found to be mutated in similar diseases. For example, discovery of polycystin-2 was greatly facilitated by prior identification of polycystin-1; both of these genes are mutated in autosomal dominant polycystic kidney disease (8). The availability of the human genome draft sequence allowed an initial investigation into whether novel disease gene paralogues can be identified (4). The vast majority of the 286 candidates, however, appear to represent pseudogenes or to have arisen due to the error-prone nature of the initial draft genome sequence (C.Ponting, unpublished data). For example, novel paralogues of […]- and […]-sarcoglycans, which are mutated in human limb-girdle muscular dystrophies (9), and a dystrophin-related protein were predicted on chromosomes 8p22 and 2q34, respectively (unpublished data). However, attempts to identify these gene products using human cDNA libraries have not been successful (S.Phelps and D.Powell, unpublished data), probably indicating that they both represent pseudogenes.
Discovery genomics has the potential to deliver a greater range of gene targets for both disease diagnosis and therapeutics and yet poses a far more exacting challenge. Not only must the cellular roles of genes be accurately predicted but variations in molecular function and dysfunction must somehow be correlated with pathologies of entire systems, linking phenomena at vastly different scales. Even when the process of gene identification is successful, there is little guarantee that the disease in question will be a matter of priority and real significance for public health (10). The paucity of studies that have identified likely disease genes ab initio is an illustration of the many difficulties that remain to be overcome. One rare example of disease genomics in action is the prediction of a link between high levels of iron found in dopaminergic neurons of patients suffering from neurodegenerative diseases and proteins believed to contain one or both of putative ferric reductase and catecholamine-binding domains (11). This hypothesis has yet to be tested empirically.
It is clear that the crucial task of predicting the molecular function and cellular role of genetic sequences can only be achieved by taking into account all available information, from homology to gene expression patterns to analogies with previously described genotype–phenotype relationships. This requires not only that high-throughput data representing gene expression, tissue expression, protein localization and binding partner information be widely available but also that all the various data and results of different analyses be comprehensively cross-linked and integrated. It is only through such a confluence of experimental information that biological knowledge can be teased out of raw sequence data.
| GENOME ANNOTATION RESOURCES AND SEQUENCE VARIATION |
|---|
Free and direct access to the human genome sequence and its annotation is provided by Ensembl (http://www.ensembl.org/; see also http://www.ensembl.org/genome/central/), by the National Center for Biotechnology Information (NCBI; http://www.ncbi.nlm.nih.gov/genome/guide/human/) and by the University of California at Santa Cruz (http://genome.cse.ucsc.edu/goldenPath/hgTracks.html). The greatest value of these web sites stems from their provision of different bioinformatics data cross-referenced and mapped directly onto the human genome sequence. Thus, specific regions of chromosomes may be viewed in the context of other vertebrate sequences, predicted genes and serial analysis of gene expression (SAGE) libraries. Annotated gene products are further labelled by predicted domains, known protein tertiary structures and exonic structures. The combination and juxtaposition of data from all these different sources immediately challenges scientists to associate disparate findings.
Two features of genome annotation with these resources are perhaps of particular interest to medical geneticists. The first is mapping of SNPs onto the genome. SNPs in disease genes can be used as potential candidates for causative phenotypic variations. The second is the highlighting of gene variants known to be associated with human disease genes. In both Ensembl and NCBI's LocusLink (http://www.ncbi.nlm.nih.gov/LocusLink/), links from sequence to disease are provided using the online version of the Mendelian Inheritance in Man database (OMIM), curated by Victor McKusick and colleagues, and accessible via NCBI's Entrez system (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM).
OMIM provides succinct synopses of clinical diagnoses and results from genetics experiments. Allelic variants are provided, thereby allowing links between gene sequences and phenotypic information. The undoubted utility of this database, however, is diminished by the low correspondence between allelic variant and sequence information. Only 60% of OMIM entries contain allelic variants that can all be mapped faithfully onto sequences linked to OMIM via LocusLink. Some OMIM entries are not even internally consistent. For example, OMIM entry 120150 for α1 chain collagen contains allelic variants W→C and G→C that centre on a single amino acid. Many of these discrepancies are likely to have arisen from a combination of sequence errors, alternative transcripts and gene mispredictions, in addition to annotation errors. It is advisable, therefore, that analyses of variants for any particular disease listed in OMIM should proceed by first retrieving the original sequences from the associated literature sources.
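The kind of consistency check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the variant notation (wild-type residue, 1-based position, mutant residue, e.g. `W3C`), the function names and the toy sequence are all hypothetical and are not taken from OMIM, LocusLink or SwissProt.

```python
import re
from collections import defaultdict

# Assumed notation: wild-type residue, 1-based position, mutant residue (e.g. "W3C").
VARIANT_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_variant(label):
    """Split a label such as 'W3C' into (wild_type, position, mutant)."""
    match = VARIANT_RE.match(label)
    if match is None:
        raise ValueError(f"unrecognised variant label: {label!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant


def maps_faithfully(label, sequence):
    """True if the variant's wild-type residue is found at its stated position."""
    wild_type, position, _ = parse_variant(label)
    return 1 <= position <= len(sequence) and sequence[position - 1] == wild_type


def conflicting_positions(labels):
    """Positions at which variants disagree about the wild-type residue."""
    seen = defaultdict(set)
    for label in labels:
        wild_type, position, _ = parse_variant(label)
        seen[position].add(wild_type)
    return {pos: sorted(res) for pos, res in seen.items() if len(res) > 1}


# Hypothetical toy data, chosen only to exercise the functions.
toy_sequence = "MKWGLC"
toy_variants = ["W3C", "G3C", "L5P"]

unmapped = [v for v in toy_variants if not maps_faithfully(v, toy_sequence)]
print("variants that do not map:", unmapped)
print("internally inconsistent positions:", conflicting_positions(toy_variants))
```

A check of this sort only flags disagreements; deciding whether the sequence, the transcript or the annotation is at fault still requires going back to the original literature.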
Another repository of allelic variant information is SwissProt (http://ca.expasy.org/sprot/). This is a protein sequence database that provides a high degree of annotation and a low level of redundancy (12). SwissProt lists allelic variants that are associated with human disease and provides helpful links to OMIM entries. In contrast to OMIM, SwissProt variants all map faithfully onto their corresponding sequences. Since SwissProt entries are linked through to the human genomic sequence from Ensembl, this means that missense mutations can be located within the human genome and thus may be compared alongside SNPs in protein coding regions.
| AMINO ACID ALLELIC VARIANTS |
|---|
The latest version of SwissProt (June 18, 2001) contains 10 121 missense mutations from 734 protein sequences that have been linked to human disease by OMIM. An analysis of the frequency matrix for disease-associated missense mutations (Fig. 1) reveals, as expected, some differences from amino acid substitution rates seen in wild-type proteins. In particular, of the eight mutations that occur significantly more frequently (>10×) in disease-associated than in wild-type proteins (bars shown in red in Fig. 1), seven changes involve two amino acid types, cysteine (C→R, C→Y and C→G) and arginine (R→G, C→R, H→R, W→R and R→L). Other substitutions commonly found in wild-type proteins such as L→M, L→I, A→S and F→Y are relatively rare causes of disease, even though each of these mutations can arise from single base changes. Presumably, this is because similarities in physicochemical properties between these amino acids usually ensure conservation of protein function despite their substitutions.
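The comparison of disease-associated and wild-type substitution frequencies can be sketched as follows. This is a minimal illustration, assuming that substitutions are supplied as (wild-type, mutant) pairs and that enrichment is measured as a simple ratio of relative frequencies; the function names and all counts are hypothetical and do not reproduce the data behind Figure 1.

```python
from collections import Counter


def substitution_frequencies(pairs):
    """Convert a list of (wild_type, mutant) pairs into relative frequencies."""
    counts = Counter(pairs)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def enriched_substitutions(disease_pairs, background_pairs, fold=10.0):
    """Substitutions whose disease frequency exceeds `fold` times the background.

    Substitutions absent from the background set are skipped rather than
    being treated as infinitely enriched.
    """
    disease = substitution_frequencies(disease_pairs)
    background = substitution_frequencies(background_pairs)
    ratios = {}
    for pair, freq in disease.items():
        if pair in background:
            ratio = freq / background[pair]
            if ratio > fold:
                ratios[pair] = ratio
    return ratios


# Hypothetical counts, chosen only to exercise the function.
toy_disease = [("C", "R")] * 30 + [("L", "I")] * 2 + [("A", "S")] * 68
toy_background = [("C", "R")] * 1 + [("L", "I")] * 40 + [("A", "S")] * 59

for (wild_type, mutant), ratio in enriched_substitutions(toy_disease, toy_background).items():
    print(f"{wild_type}->{mutant}: about {ratio:.0f}-fold enriched")
```

A real analysis would also need a rule for substitutions with very small background counts, where ratios become unstable.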
Cysteine and arginine are also prominent when amino acids are ranked according to the differences between mutation frequencies and background amino acid frequencies (Fig. 2). These two amino acids are both more highly substituted and more highly substituting in disease-associated variations. For cysteine, this is likely to arise from its unusual chemical properties, in particular the gain or loss, in disease variants, of disulphide bridges or free thiols. For example, a C→T variant, predicted to disrupt a disulphide bridge, occurs in the haemochromatosis gene product (HFE or HLA-H) that is associated with hereditary haemochromatosis in Northern Europeans (2).
The pre-eminence of arginine in disease-associated missense variants may be due, in part, to the degenerate arrangement of the triplet genetic code. Mutations in a single base can cause substitutions between arginine and the highest number, 12, of other amino acid types. This factor alone, however, does not account for the popularity of arginine, since leucine also more frequently replaces, or is replaced by, highly occurring amino acids but features infrequently in disease-causing mutations. This suggests that the physicochemical properties of arginine, in particular its participation in salt bridges and its prevalence in solvent-accessible peripheries rather than the hydrophobic interiors of proteins, are important for molecular stability and function.
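The count of 12 for arginine can be checked directly against the standard genetic code. The sketch below enumerates, for each amino acid, the other amino acids reachable by changing a single base in any of its codons; stop codons are excluded. The function and variable names are illustrative choices, not part of any published tool.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def single_base_neighbours(amino_acid):
    """Other amino acids reachable from `amino_acid` by changing one base."""
    reachable = set()
    for codon, aa in CODON_TABLE.items():
        if aa != amino_acid:
            continue
        for position in range(3):
            for base in BASES:
                if base == codon[position]:
                    continue
                mutant = codon[:position] + base + codon[position + 1:]
                target = CODON_TABLE[mutant]
                if target not in ("*", amino_acid):
                    reachable.add(target)
    return reachable


neighbour_counts = {
    aa: len(single_base_neighbours(aa)) for aa in set(AMINO_ACIDS) - {"*"}
}
print("arginine neighbours:", "".join(sorted(single_base_neighbours("R"))))
print("arginine:", neighbour_counts["R"], "leucine:", neighbour_counts["L"])
```

The calculation counts amino acid types only; it does not weight them by codon usage or by mutation rates, which would be needed to model how often each substitution actually arises.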
| FINDING HOMOLOGUES |
|---|
Identification of a disease-associated gene with amino acid allelic variations provides the starting point for an investigative chain pointing from genotype to a prospective phenotype. The first link in this chain is the determination of the molecular basis of disease. Often, the first clues for a gene's role result from an analysis of its sequence, particularly by making inferences from the common molecular function of its homologues. Determination of those homologues that are orthologues leads to a greater refinement in prediction power since orthologues possess the greatest similarities in function. Orthologues are genes that arose from speciation rather than intra-genomic duplication events and are often the most sequence-similar genes from different species. It is important here to note that only a minority of human, fruit fly and Caenorhabditis elegans (nematode worm) proteins have detectable orthologues in each of the three organisms (4). It is assumed that it is this minority of proteins that governs homologous biological processes such as morphogenesis and cellular metabolism. This hypothesis is consistent with the finding that fruit fly versions of human genes are found in comparatively greater numbers for cancer, malformation syndromes, metabolic diseases, renal diseases and neurological diseases than for other disorders (13).
Methods to predict homology by sequence similarity have advanced in recent years to the extent that many divergent homologues can now be identified. Many such analyses have been employed successfully to understand the molecular function of monogenic disease gene products (reviewed in 14). Guidance for best practice in such analyses has been provided in the literature (15). Yet the published analyses of disease gene sequences frequently fail to exploit fully their predictive potential. In order to assist such analyses we suggest the following five-point approach.
(I) Query protein sequence databases using BLAST (http://www.ncbi.nlm.nih.gov/BLAST/) and PSI-BLAST (16)
BLAST interrogates databases searching for sequences with significant similarity to a reference gene, or its gene product. In the absence of biases in amino acid composition, a pair of sequences that are aligned with a score x and have been assigned an expect (E)-value of <2 × 10⁻³ is highly likely to be homologous. (An E-value represents the number of different alignments with scores equivalent to, or better than, x that are expected from the database search simply by chance.) For protein sequence database searches, PSI-BLAST (http://www.ncbi.nlm.nih.gov/Education/BLASTinfo/psi1.html) is preferred to gapped BLAST since it detects significantly greater numbers of homologues (17). PSI-BLAST queries protein databases in an iterative manner using previously found homologues to detect increasingly more subtle, yet significant, sequence similarities. Detection of divergent homologues is important since their molecular functions may be more conserved than their sequences initially may suggest.
(II) Use all available sequence databases
These should include non-redundant (nr) nucleotide and protein sequence databases, expressed sequence tag (EST) databases and predicted proteins from incompletely sequenced genomes. The latter should include the human and mouse drafts (e.g. http://www.ensembl.org/perl/blastview and http://mouse.ensembl.org/perl/blastview) and unfinished microbial genome sequences (http://www.ncbi.nlm.nih.gov/Microb_blast/unfinishedgenome.html). A useful ability to search predicted prokaryotic protein sequences using PSI-BLAST is provided by the ViruloGenome site (http://www.vge.ac.uk/blast/psiblast.html).
(III) Investigate sequences with marginal similarities
Results from PSI-BLAST searches are not symmetrical: a search of a database with sequence B might detect a significantly sequence-similar homologue A, but a search with A's sequence might not detect B with significance. Thus, in an investigation of a sequence (A), it is sometimes worthwhile performing additional reciprocal PSI-BLAST database searches using sequences (B) that just fail to be aligned with significant statistics (E > 2 × 10⁻³).
(IV) Search for protein repeats
Some proteins contain informative internal repetitions that may not be detectable using BLAST, but are found using Prospero (18) (http://www.well.ox.ac.uk/ariadne/). Prospero is implemented on the SMART web site (http://smart.embl-heidelberg.de/).
(V) Search for domains, repeats and motifs
Ninety-one percent of human disease gene products listed in SwissProt contain a domain that is recognized by either Pfam (http://www.sanger.ac.uk/Pfam/) or SMART (http://smart.embl-heidelberg.de/) resources. Consequently, it is suggested that protein sequences be annotated by domain, repeat or motif family using these resources prior to performing BLAST searches.
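The E-value used in step (I) can be made concrete with the Karlin-Altschul relation E = K·m·n·e^(−λS), where S is the alignment score, m and n are the query and database lengths, and λ and K are statistical parameters of the scoring system. The sketch below is an illustration only: the parameter values, lengths and scores are placeholders chosen for the example, not values taken from any particular search, and real BLAST implementations apply further corrections such as effective lengths.

```python
import math


def expect_value(score, query_length, database_length, lam, k):
    """Karlin-Altschul expectation: E = K * m * n * exp(-lambda * S)."""
    return k * query_length * database_length * math.exp(-lam * score)


def is_significant(e_value, threshold=2e-3):
    """Apply the significance threshold quoted in step (I)."""
    return e_value < threshold


# Placeholder values, assumed purely for illustration.
LAMBDA, K = 0.267, 0.041
QUERY_LENGTH, DATABASE_LENGTH = 500, 200_000_000

scores = [80, 120, 160]
e_values = [
    expect_value(s, QUERY_LENGTH, DATABASE_LENGTH, LAMBDA, K) for s in scores
]
for score, e_value in zip(scores, e_values):
    print(f"score {score}: E = {e_value:.2e}, significant = {is_significant(e_value)}")
```

Because E falls exponentially with the score but rises only linearly with database size, a modest gain in alignment score outweighs a large increase in the amount of sequence searched.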
| BESTROPHIN: A WORKED EXAMPLE OF DISCOVERY GENETICS |
|---|
In order to illustrate how different genomes, sequence databases and analysis programs can be used to detect previously unforeseen evolutionary relationships, we shall describe the analysis of bestrophin, the product of the Best macular dystrophy or vitelliform macular dystrophy type-2 gene (VMD2). Patients with mutations in VMD2 are visually impaired with egg-like lesions in the macular area (19,20) (http://www.uni-wuerzburg.de/humangenetics/vmd2.html). BLAST-based analysis of the bestrophin sequence shows that it is a member of a protein family with numerous representatives in C.elegans (21) (http://www.sanger.ac.uk/cgi-bin/Pfam/getacc?PF01062), but no other versions in mammals.
The existence of a bestrophin paralogue (BEST2) in mammals, however, can be established with BLAST-based searches of protein databases, EST databases and the human draft sequence. The human BEST2 sequence can be predicted from a human mRNA sequence (FLJ20132), from human genomic sequence in chromosome 19p13.2 (accession no. AC018761) and by comparison with an EST coding for mouse BEST2 (GeneInfo code 11655934). Whether mutations of the BEST2 gene result in visual defects remains to be investigated.
The molecular functions of bestrophin are unknown. In order to determine whether the identification of additional bestrophin homologues might provide some insight into their functions, we performed additional database searches. PSI-BLAST searches of the nr database at the NCBI identified no further homologues. A similar conclusion was drawn from PSI-BLAST searches of the nr database at the ViruloGenome site (http://www.vge.ac.uk/blast/psiblast.html). In the latter search, an Escherichia coli sequence (ORF b1520; GeneInfo code 7466763) showed marginal sequence similarity to bestrophin (E = 5.3). However, in a reciprocal search of the ViruloGenome nr database using this sequence and default search parameters, PSI-BLAST identified bestrophin as an E.coli b1520 homologue in round 2 with significant statistics (E = 8 × 10⁻⁴). The multiple sequence alignment of bestrophin and E.coli b1520 homologues is presented in Figure 3.
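The logic of the reciprocal search can be sketched as follows. This is a minimal illustration, assuming that a search is available as a function returning hit identifiers with their E-values; `toy_search` is a stand-in for a real PSI-BLAST run, the upper bound used to define a "marginal" hit is an arbitrary assumption, and `orfX` is a made-up entry. Only the two E-values for bestrophin and b1520 come from the text.

```python
from typing import Callable, Dict, List, Tuple

# A search maps a query identifier to {hit identifier: E-value}.
Search = Callable[[str], Dict[str, float]]


def reciprocal_rescue(
    query: str,
    search: Search,
    significant: float = 2e-3,
    marginal: float = 10.0,
) -> List[Tuple[str, float, float]]:
    """Return marginal hits of `query` that detect `query` significantly in return."""
    rescued = []
    for hit, e_forward in search(query).items():
        if significant <= e_forward <= marginal:
            # Re-run the search with the marginal hit as the query.
            e_reverse = search(hit).get(query)
            if e_reverse is not None and e_reverse < significant:
                rescued.append((hit, e_forward, e_reverse))
    return rescued


# Stand-in results; 'orfX' is hypothetical and has no reciprocal hit.
toy_results = {
    "bestrophin": {"b1520": 5.3, "orfX": 7.0},
    "b1520": {"bestrophin": 8e-4},
    "orfX": {},
}


def toy_search(query: str) -> Dict[str, float]:
    """Look up precomputed results instead of running a real database search."""
    return toy_results.get(query, {})


print(reciprocal_rescue("bestrophin", toy_search))
```

In practice each call to the search function would be a full iterative database search, so the number of marginal hits followed up is usually kept small.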
This example highlights the benefits of using PSI-BLAST, reciprocal searches and comprehensive databases (steps I, II and III above). Unfortunately, since none of the newly identified bestrophin homologues have been subjects of empirical investigation, the molecular function of bestrophin cannot simply be inferred from that of its homologues. In such circumstances, it is sometimes worthwhile to consider several rules-of-thumb for predicting function by analogy (22). These include that: (i) active sites usually consist of conserved polar residues (C, D, E, H, K, N, Q, R, S and T); (ii) large aromatic residues (F, H, W and Y) are often found in protein–ligand binding sites; (iii) C, D, E, H, N and Q can coordinate zinc ions in active sites or zinc fingers; and (iv) H, K, R, S and T sometimes are involved in binding phosphate or sulphate groups.
For the bestrophin homologues, five amino acids are absolutely conserved (Fig. 3): S, R, P, Y and D. With the exception of proline (P), which is likely to have a structural, rather than a functional role, conservation of these amino acids implies that the bestrophin family possesses catalytic activity. The precise type or specificity of this enzymatic activity, however, cannot be predicted directly.
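The rules-of-thumb above lend themselves to a simple screen of a multiple alignment. The sketch below finds absolutely conserved columns and labels each conserved residue according to rules (i) to (iv). It is an illustration only: the toy alignment is invented for the example and is not the bestrophin alignment of Figure 3, and the function names are hypothetical.

```python
POLAR = set("CDEHKNQRST")         # rule (i): candidate active-site residues
AROMATIC = set("FHWY")            # rule (ii): candidate ligand-binding residues
ZINC_BINDING = set("CDEHNQ")      # rule (iii): possible zinc ligands
PHOSPHATE_BINDING = set("HKRST")  # rule (iv): possible phosphate or sulphate binders


def conserved_columns(alignment):
    """Yield (column_index, residue) for columns identical in every sequence."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("aligned sequences must have equal length")
    for index in range(length):
        column = {row[index] for row in alignment}
        if len(column) == 1 and "-" not in column:
            yield index, column.pop()


def annotate(residue):
    """List the rules-of-thumb that apply to a conserved residue."""
    labels = []
    if residue in POLAR:
        labels.append("polar")
    if residue in AROMATIC:
        labels.append("aromatic")
    if residue in ZINC_BINDING:
        labels.append("zinc")
    if residue in PHOSPHATE_BINDING:
        labels.append("phosphate")
    return labels


# Invented toy alignment ('-' marks a gap), used only to exercise the functions.
toy_alignment = ["ASRK-PYD", "GSRL-PYD", "VSRMAPYD"]

for index, residue in conserved_columns(toy_alignment):
    print(index, residue, annotate(residue) or ["no rule applies"])
```

Such labels are prompts for hypotheses rather than predictions: a residue with no label, such as proline, may still be conserved for structural reasons.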
| FUTURE DIRECTIONS |
|---|
The increasing availability of vast quantities of data from high-throughput projects represents an almost unprecedented wealth for all of biology. Two efforts from the biological community are required to transform these newfound riches into medical and scientific breakthroughs. The first is to make more effective use of current data and analytic methods so as to exploit fully the new genetic and molecular information. The second requirement is for more comprehensive integration of contrasting bioinformatics data types. New sets of focused and empirically useful hypotheses will be inspired by the conjunction and cross-referencing of complementary information for genome sequences, gene and tissue expression and protein post-translational modification, localisation and interaction, as well as comparisons of homologues across species, whether paralogues or orthologues. The ultimate goal must be to span the divides between genetic and molecular phenomena and cellular and whole organism physiology.
| ACKNOWLEDGEMENTS |
|---|
We would like to thank Drs Pat Clissold and Richard Emes for their helpful suggestions and comments.
| FOOTNOTES |
|---|
+ To whom correspondence should be addressed. Tel/Fax: +44 1865 272175; Email: chris.ponting@anat.ox.ac.uk
| REFERENCES |
|---|
1 Sachidanandam, R., Weissman, D., Schmidt, S.C., Kakol, J.M., Stein, L.D., Marth, G., Sherry, S., Mullikin, J.C., Mortimore, B.J., Willey, D.L. et al. (2001) A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms. Nature, 409, 928–933.
2 Sunyaev, S., Ramensky, V., Koch, I., Lathe, W., III, Kondrashov, A.S. and Bork, P. (2001) Prediction of deleterious human alleles. Hum. Mol. Genet., 10, 591–597.
3 Antonarakis, S.E. and McKusick, V.A. (2000) OMIM passes the 1,000-disease-gene mark. Nat. Genet., 25, 11.
4 Lander, E.S., Linton, L.M., Birren, B., Nusbaum, C., Zody, M.C., Baldwin, J., Devon, K., Dewar, K., Doyle, M., FitzHugh, W. et al. (2001) Initial sequencing and analysis of the human genome. Nature, 409, 860–921.
5 Venter, J.C., Adams, M.D., Myers, E.W., Li, P.W., Mural, R.J., Sutton, G.G., Smith, H.O., Yandell, M., Evans, C.A., Holt, R.A. et al. (2001) The sequence of the human genome. Science, 291, 1304–1351.
6 Collins, F.S. and McKusick, V.A. (2001) Implications of the human genome project for medical science. J. Am. Med. Assoc., 285, 540–544.
7 Roses, A.D. (2000) Pharmacogenetics and the practice of medicine. Nature, 405, 857–865.
8 Schneider, M.C., Rodriguez, A.M., Nomura, H., Zhou, J., Morton, C.C., Reeders, S.T. and Weremowicz, S. (1996) A gene similar to PKD1 maps to chromosome 4q22: a candidate gene for PKD2. Genomics, 15, 1–4.
9 Bushby, K.M.D. (1999) The limb-girdle muscular dystrophies-multiple genes, multiple mechanisms. Hum. Mol. Genet., 8, 1875–1882.
10 Risch, N.J. (2000) Searching for genetic determinants in the new millennium. Nature, 405, 847–856.
11 Ponting, C.P. (2001) Domain homologues of dopamine β-hydroxylase and ferric reductase: roles for iron metabolism in neurodegenerative disorders? Hum. Mol. Genet., 10, 1853–1858.
12 Bairoch, A. and Apweiler, R. (2000) The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res., 28, 45–48.
13 Fortini, M.E., Skupski, M.P., Boguski, M.S. and Hariharan, I.K. (2000) A survey of human disease gene counterparts in the Drosophila genome. J. Cell Biol., 150, F23–F29.
14 Sreekumar, K.R., Aravind, L. and Koonin, E.V. (2001) Computational analysis of human disease-associated genes and their protein products. Curr. Opin. Genet. Dev., 11, 247–257.
15 Bork, P. and Koonin, E.V. (1998) Predicting functions from protein sequences – where are the bottlenecks? Nat. Genet., 18, 313–318.
16 Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W. and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389–3402.
17 Park, J., Karplus, K., Barrett, C., Hughey, R., Haussler, D., Hubbard, T. and Chothia, C. (1998) Sequence comparisons using multiple sequences detect three times as many remote homologues as pairwise methods. J. Mol. Biol., 284, 1201–1210.
18 Mott, R. (2000) Accurate formula for P-values of gapped local sequence and profile alignments. J. Mol. Biol., 300, 649–659.
19 Marquardt, A., Stöhr, H., Passmore, L.A., Krämer, F., Rivera, A. and Weber, B.H.F. (1998) Mutations in a novel gene, VMD2, encoding a protein of unknown properties cause juvenile-onset vitelliform macular dystrophy (Best's disease). Hum. Mol. Genet., 7, 1517–1525.
20 Petrukhin, K., Koisti, M.J., Bakall, B., Li, W., Xie, G., Marknell, T., Sandgren, O., Forsman, K., Holmgren, G., Andreasson, S. et al. (1998) Identification of the gene responsible for Best macular dystrophy. Nat. Genet., 19, 241–247.
21 Sonnhammer, E.L. and Durbin, R. (1997) Analysis of protein domain families in Caenorhabditis elegans. Genomics, 46, 200–216.
22 Ponting, C.P. (2001) Issues in predicting protein function from sequence. Brief. Bioinformatics, 2, 19–29.
23 Tatusov, R.L., Altschul, S.F. and Koonin, E.V. (1994) Detection of conserved segments in proteins: iterative scanning of sequence databases with alignment blocks. Proc. Natl Acad. Sci. USA, 91, 12091–12095.
24 Goodstadt, L. and Ponting, C.P. (2001) CHROMA: Consensus-based colouring of multiple alignments for publication. Bioinformatics, in press.