

Human Molecular Genetics Advance Access originally published online on April 11, 2007
Human Molecular Genetics 2007 16(11):1381-1390; doi:10.1093/hmg/ddm089

© The Author 2007. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

An integrative genomics strategy for systematic characterization of genetic loci modulating phenotypes

Lei Bao1,2, Jeremy L. Peirce2,3, Mi Zhou1,2, Hongqiang Li2,3, Dan Goldowitz2,3, Robert W. Williams2,3, Lu Lu2,3,4 and Yan Cui1,2,*

1 Department of Molecular Sciences, 2 Center of Genomics and Bioinformatics and 3 Department of Anatomy and Neurobiology, University of Tennessee Health Science Center, Memphis, TN 38163, USA and 4 Key Laboratory of Nerve Regeneration, Nantong University, Jiangsu Province, China

* To whom correspondence should be addressed at: Department of Molecular Sciences, University of Tennessee Health Science Center, 858 Madison Avenue, Memphis, TN 38163, USA. Tel: +1 9014483240; Fax: +1 9014487360; Email: ycui2{at}utmem.edu

Received December 11, 2006; Revised February 8, 2007; Accepted April 1, 2007


    ABSTRACT
Naturally occurring genetic variations may affect certain phenotypes by influencing transcript levels of the genes that are causally related to those phenotypes. Genomic regions harboring common sequence variants that modulate gene expression can be mapped as quantitative trait loci (QTLs) using a newly developed genetical genomics approach. This enables a new strategy for systematically mapping novel genetic loci underlying various phenotypes. In this work, we started from a seed set of genes with variants that are known to affect behavioral and neurological phenotypes (as recorded in the Mammalian Phenotype Ontology database) and used microarrays to analyze their expression levels in brain samples of a panel of BXD recombinant inbred mouse strains. We then systematically mapped the QTLs controlling the expression of these genes. Candidate causal genes in the QTL intervals were evaluated for evidence of functional genetic polymorphisms. Using this method, we were able to predict novel genetic loci and causal genes for a number of behavioral and neurological phenotypes. Lines of independent evidence supporting some of our results were provided by transcription factor binding site analysis and by biomedical literature. This strategy integrates gene–phenotype relations from decades of experimental mutagenesis studies and new genomic resources to provide an approach to rapidly expand knowledge of genetic loci modulating phenotypes.


    INTRODUCTION
Systematically characterizing the molecular basis of highly variable phenotypes is a central theme of genetics in the post-genomic era. Locating chromosomal intervals harboring coding and non-coding sequence variants that influence phenotypes is the first step in the forward genetics approach. Quantitative trait locus (QTL) mapping has long been used for this purpose. Recent advances in genetical genomics (1–33) have inspired us to propose a novel approach to expanding and integrating knowledge on the genetic loci influencing phenotypes. Genetical genomics approaches treat gene expression levels as intermediate traits between DNA sequence variations and phenotypes and use QTL mapping to identify genetic loci controlling gene expression. It is known that many genes underlying physiological/clinical phenotypes show significant expression level variations in relevant tissues across genetically segregating populations (34), and transcriptional regulation of these genes may play an important role in phenotype manifestation. Upstream regulators (transcription factors, signaling molecules, etc.) of these genes are likely to be the genetic drivers of the corresponding phenotypes as well. Therefore, by identifying a genomic region harboring such upstream regulators using QTL mapping, it is possible to discover novel genetic loci influencing the phenotypes. This approach starts with the genes whose mutations are known to affect phenotypes. Fortunately, a large number of such genes have been identified by mutagenesis coupled with high-throughput phenotype screening (35–37). Mammalian Phenotype Ontology (MPO) (38), a newly developed collection of known allele–phenotype relations, enabled a systematic, large-scale application of our approach. The MPO uses controlled vocabularies to describe the phenotypes and hierarchically organizes them as a directed acyclic graph (DAG).
Each node in the MPO represents a category of phenotypes and is associated with gene variants (alleles) causing these phenotypes in genetic engineering or mutagenesis experiments. These phenotypically categorized genes provide an excellent starting point for our integrative genomics approach, a flow chart of which is shown in Figure 1. In this work, we retrieved genes affecting mouse behavioral and neurological phenotypes from the MPO database, then used microarrays to assess their expression levels in the brain samples of a BXD recombinant inbred (RI) mouse panel and mapped the QTLs modulating their expression levels (expression QTLs, eQTLs). The BXD RI strains were derived by crossing two inbred parental strains, C57BL/6J and DBA/2J, and then inbreeding the progeny for many generations (39,40). The eQTLs may also influence these behavioral/neurological phenotypes; some of them are in fact coincident with previously identified QTLs that influence behavioral/neurological phenotypes. Finally, we used non-synonymous single nucleotide polymorphisms (nsSNPs) and expression microarray data to screen for causal genes in eQTL intervals. Lines of independent evidence from transcription factor binding site (TFBS) analysis and biomedical literature support our results, indicating the value of the method in providing high quality candidate QTLs.


Figure 1. Flow chart of the integrative genomics strategy for systematic characterization of genetic loci influencing phenotypes.

 

    RESULTS
Systematic mapping of genetic loci influencing mouse phenotypes using mRNA expression data
The MPO hierarchically organizes a set of MP terms as a DAG, from the rather general to the more specific. At the most general level (immediate child nodes of the MPO root), phenotypes are categorized on the basis of anatomical tissue types. In this study, we focused on the MP terms descended from one of these general phenotypes: the ‘behavior/neurological phenotypes’ (MP:0005386). For each MP term, we extracted from the Mammalian Phenotype Browser (38) a set of genes whose mutant or engineered alleles are associated with that term. We assessed the transcript levels of these genes in BXD brain samples using microarrays. A total of 630 genes on the Affymetrix M430 microarray were associated with the behavior/neurological phenotypes node. Likewise, the genes associated with a child MP term form a subset of these genes. For each of the 630 genes, we attempted to identify a trans-acting locus regulating its transcript level. Using this approach, we mapped 53 significant QTLs for genes associated with behavior/neurological phenotypes (Table 1). Altogether, 40 MP terms were associated with at least one of these 53 trans-acting QTLs. Figure 2 shows a part of the MPO DAG and the QTLs associated with the MP terms. The entire DAG of mapped MP terms can be found in Supplementary Material (Fig. S1).


Figure 2. QTL mapping of behavior/neurological phenotypes (MP:0005386). Shown are the mapping results for a representative part (rooted at MP:0002066) of the DAG structure. The entire DAG of mapped MP terms can be found in Supplementary Material (Fig. S1). MPO nodes are represented by squares. Only nodes having at least one associated eQTL are shown. In each node, the MP term is shown. The horizontal bar with alternating black–gray coloring designates the chromosome boundaries. The vertical lines designate the peak locations of associated eQTLs and their lengths are proportional to the LRS score. The root node (MP:0002066) is represented in an enlarged view. The DAG structure implies that all the eQTLs associated with a child node are also associated with its parent node, hence the root node includes all the eQTLs.

 


Table 1. eQTLs and associated MP terms

 
Screening for causal genes
The ultimate goal of QTL mapping is to identify the causal genes that are responsible for the QTL effects, i.e. the quantitative trait genes (QTGs). C57BL/6J and DBA/2J, the two parental strains of the BXD panel, have been well sequenced, and dense sets of SNPs between these strains are available. For each eQTL, we employ SNP data and gene expression data in a two-pronged approach to prioritize the causal genes (Fig. 1). (The candidate causal gene list can be found in Supplementary Material, Table S1.)

Two types of DNA sequence polymorphisms may underlie QTL effects. The first type consists of nsSNPs located in the coding region of the QTG and modifying the protein sequence. The gain or loss of function of a protein product may affect the transcript level of its target gene(s). Therefore, in the QTL interval, genes with nsSNPs between the two parental strains are excellent candidate QTGs. For example, as shown in Figure 3, Ntsr2 encodes a low-affinity neurotensin receptor and a null allele of Ntsr2 results in abnormal thermal nociception (41). Indeed, a previous study showed that thermal nociception varied across BXD strains (42). A QTL modulating the transcript level of Ntsr2 was identified. Adcy2, located in the QTL interval, is a strong candidate QTG. Adcy2 encodes a brain adenylate cyclase and has two missense polymorphisms (V ↔ A and R ↔ Q) between the two parental strains. Activation of adenylate cyclases is known to downregulate the mRNA levels of neurotensin receptor (43), and adenylate cyclases are known to modulate thermal nociception (44). Therefore, Adcy2 may regulate the expression of Ntsr2, which in turn regulates thermal nociception. Transcription factors that lie in the QTL intervals and carry nsSNPs are also high-priority candidate QTGs, especially when putative binding sites exist in the promoter regions of the target genes. For instance, four of the nsSNP-bearing candidate QTGs encode transcription factors (Ets2, Gtf2a1, Stat4 and E2f2) with known DNA-binding sites, as listed in the TransFac database (45). Table 2 shows the results of TFBS prediction in the promoter regions of the target genes. We found TFBSs in five out of seven putative regulatory relations. This immediately suggests the potential regulatory roles of these transcription factors in the manifestation of the corresponding phenotypes listed in Table 1.
Interestingly, Kcna1, Zic2 and Dlgh4 have co-localized eQTLs and putative Stat4 binding sites in their promoter regions. In contrast, Pxmp3 and Rorb also have co-localized eQTLs, but neither has putative E2f2-binding sites in its promoter region, indicating that, in this case, E2f2 is not a good candidate QTG.


Figure 3. Converged evidence designates Adcy2 as a candidate QTG. The eQTL modulating the Ntsr2 transcript level is identified (thick arrow). Null allele of Ntsr2 results in abnormal thermal nociception (41). Adcy2, located in the eQTL interval and encoding a brain adenylate cyclase, is a strong candidate gene underlying the eQTL and controlling the thermal nociception phenotype. Adenylate cyclases are known to regulate mRNA level of Ntsr2 (43) and to modulate the same phenotype as does Ntsr2 (44).

 


Table 2. Promoter sequence analysis supports predicted regulatory relations

 
The second type of polymorphism is located within the regulatory regions of the QTG and may affect the QTG's mRNA abundance. Current understanding of regulatory polymorphisms is sparse; therefore, we cannot prioritize candidate causal genes by simply cataloging regulatory sequence polymorphisms. However, if the QTG has a regulatory polymorphism that changes its expression level, it will give rise to a cis-acting eQTL near its own location. Additionally, we computed Pearson's correlation coefficient (PCC) between the expression levels of the target gene and the candidate QTG. Genes residing in the QTL interval and having both a significant cis-acting eQTL and a significant PCC are considered to be likely QTGs. Table 3 shows two predicted QTGs of this type with supporting evidence from literature. In the first example, a genetic locus influencing abnormal involuntary movement (MP:0003492) was identified by mapping the eQTL of Myo6. Myo7a in the QTL interval was identified as the only candidate QTG with significant cis-acting eQTL and PCC (P < 0.01). Both Myo7a and Myo6 encode unconventional myosin isozymes expressed in the inner ear. Indeed, mutated alleles of Myo7a and Myo6 both cause common behavior/neurological phenotypes: abnormal head movements (MP:0000436) and circling (MP:0001394) as well as other shared phenotypes like inherited deafness (46,47). These data suggest that a regulatory polymorphism in the Myo7a gene alters its own gene expression level and consequently alters expression levels of relevant genes (e.g. Myo6) through cellular signaling pathways. In the second example, a genetic locus influencing abnormal visual placing response (MP:0001526) was identified by mapping the eQTL of Bbs4. Ttc8 in the QTL interval was identified as one candidate QTG with significant cis-acting eQTL and PCC (P < 10⁻⁵). Both Ttc8 and Bbs4 encode tetratricopeptide repeat domain proteins and were identified as molecular determinants of Bardet–Biedl syndrome by positional cloning (48).
Interestingly, one major symptom of Bardet–Biedl syndrome is poor visual acuity and blindness, which may explain the abnormal visual placing response of mice. These data suggest that Ttc8 might play a similar role as Bbs4 in mouse abnormal visual placing behavior.


Table 3. Candidate QTGs identified by expression analysis

 
Co-localization of gene expression QTLs and physiological and behavioral QTLs
A large set of QTLs regulating physiological and behavioral phenotypes (pQTLs) has already been acquired using BXD lines. We assembled a data set containing 67 BXD behavioral and neurological phenotypes, each of which has a significant QTL (genome-wide adjusted P < 0.05). We found that 10 of the 53 loci associated with the MPO gene set were co-localized with these 67 pQTLs [i.e. the peaks of LOD (log of odds) are within 5 Mb] (Supplementary Material, Table S2). Direct links between BXD phenotypes and the specific MP terms are often unclear; however, plausible hypotheses about gene–phenotype relations can be generated by the co-localization analysis. For example, as shown in Figure 4, a pQTL influencing the open-field behavior of BXD mice in response to 5 mg/kg cocaine (49) co-localizes with the eQTL modulating expression of Ank2, which encodes a cytoskeletal adaptor protein. Notably, a major goal of open-field behavioral assays is to test locomotor activity differences. Some independent lines of evidence support the involvement of Ank2 in a molecular pathway underlying differences in cocaine-related locomotion. First, a null allele (knockout) of Ank2 results in abnormal locomotor activity in the absence of cocaine (50). Second, at the molecular level, cocaine is reported to modulate the functioning of Ank2. In addition, cocaine causes dissociation of Ank2 from inositol (1,4,5)-trisphosphate receptors and translocation to the plasma membrane and nucleus. This increases the intracellular calcium concentration and affects the activity of cytoskeletal proteins (51,52). Therefore, it is possible that Ank2 activity mediates at least part of cocaine's effect on mouse locomotor activity.
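The co-localization criterion described above (eQTL and pQTL peaks on the same chromosome and within 5 Mb) can be illustrated with a minimal Python sketch. The peak positions and the trait label below are invented placeholders, not values from Table S2, and treating the 5 Mb bound as inclusive is an assumption.

```python
def colocalized(eqtls, pqtls, max_dist_mb=5.0):
    """Pair each eQTL with every pQTL whose peak lies within max_dist_mb.

    eqtls, pqtls: lists of (name, chromosome, peak position in Mb).
    The inclusive bound (<=) is an assumption of this sketch.
    """
    pairs = []
    for e_name, e_chr, e_pos in eqtls:
        for p_name, p_chr, p_pos in pqtls:
            if e_chr == p_chr and abs(e_pos - p_pos) <= max_dist_mb:
                pairs.append((e_name, p_name))
    return pairs


# Hypothetical peaks, for illustration only.
eqtls = [("Ank2 eQTL", "7", 41.0), ("geneX eQTL", "3", 90.0)]
pqtls = [("cocaine open-field pQTL", "7", 44.5), ("other pQTL", "3", 120.0)]

pairs = colocalized(eqtls, pqtls)
```

Comparing peaks rather than whole confidence intervals keeps the test simple; an interval-overlap criterion would be a reasonable alternative but is not what the text describes.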


Figure 4. An example of co-localization of pQTL and eQTL. A pQTL underlying the variations in locomotor activity in response to cocaine across BXD mice panel co-localizes with an eQTL modulating the Ank2 transcript level. Interestingly, cocaine is known to modulate the functioning of Ank2 protein (51,52), and null allele of Ank2 leads to abnormal locomotor activity in the absence of cocaine (50).

 

    DISCUSSION
Targeted mutation (knockout and knockin technologies), genome-wide ENU mutagenesis and studies of spontaneous mutations have led to an accelerated accumulation of data associating phenotypes with their causal genes. The crucial question is, based on these known but dispersed associations, can we systematically identify additional genes and pathways underlying a phenotype and fill in missing components linking genes to phenotypes? The intersection of genetical genomics and the known effects of specific single-gene mutations on physiological and clinical phenotypes provides an excellent tool for dissecting the phenotypes. We have integrated these two approaches to understanding mouse phenotypes (the confirmed genotype–phenotype relations from single-gene manipulations and the transcript–eQTL pairs identified by genetical genomics approaches) to predict potential genetic loci and QTGs underlying a broad category of phenotypes.

Gene expression as an intermediate phenotype
The majority of MPs, from normal range physiological and behavioral variation to disease states, are genetically complex. Even in a defined genetic background, polygenic influences and complicated environmental effects serve to obscure genetic analysis. Cui et al. (53) estimated heritability of transcript levels using F1 hybrid mice. Although different transcripts may have very different heritability, a substantial portion (25%) of transcripts are highly heritable (heritability > 0.5) (53). This suggests that transcript abundance may be a valuable intermediate phenotype between genomic DNA sequence variation and more complex system-level phenotypes. While gene expression phenotypes are also subject to complex regulation and environmental noise, they are considerably closer to the level of molecular influence than the phenotypes they modulate. Another advantage of using a molecular-level phenotype like transcript abundance is that it is possible to examine separately the influences of independent genetic pathways. This is an important consideration since even in animal models with a defined genetic background there may be multiple independent pathways that produce indistinguishable phenotypes.

Alternative and complementary methods
Although there are several alternative methods for the genetic dissection of complex traits, each has its advantages and drawbacks. A phenotype-driven approach uses genome-wide mutagenesis to produce phenotypic abnormalities, for which the causal gene must then be identified with considerable effort. Another approach, gene-driven, is the international effort aimed at systematically knocking out every gene in the mouse genome (37,54), which is currently underway and for which assessment of a large set of physiological and disease phenotypes for these mutants is also planned. Again, this is a powerful screening approach to elucidating molecular mechanisms underlying specific phenotypes, but it is laborious and expensive. Furthermore, approaches that mutate one gene at a time require additional steps to analyze pathways of effect. Both of these approaches, and indeed any method that identifies, with reasonable certainty, a specific gene involved in a phenotype, are complementary to the approach we describe. The more extensive the collection of target genes with known effects, the more efficiently our method functions.

Co-localization of eQTL and pQTL
Phenotype-driven studies of complex traits have identified a large collection of pQTLs, yet further characterization of the molecular determinants has proven difficult. For instance, identification of the 67 pQTLs already acquired for the BXD RI lines has not resulted in identification of the underlying genes in most (if not all) cases. However, the availability of co-localized eQTLs provides additional information that helps to uncover genes regulating these phenotypes. As Schadt and coworkers proposed (14), co-localization of an eQTL and a pQTL may be interpreted by one of three different models: (i) causal model, where the common QTL acts on the gene expression trait and the gene regulates the phenotype trait; (ii) reactive model, where the common QTL acts on the phenotype trait and the gene expression trait is reactive to the phenotype and (iii) independent model, where the common QTL acts on the expression trait and phenotype trait independently. By mining the biological literature for the implied association of the gene and the phenotype trait, it is sometimes possible to pick the likely model for action of the pQTL, expanding our understanding of its action to the molecular level. The hypothetical model shown in Figure 4 is one example of this approach.

Potential and limitations of the method
Our method is a powerful protocol for connecting the rapidly increasing universe of genetical genomics gene expression data with known gene–phenotype relations. Of course, as the tools and data sets available for genetical genomics investigations improve, so will our method. In this paper, for instance, we only applied our method to one MPO branch (MP:0005386), for which we have assessed gene expression data in a highly relevant tissue. Currently 34 high-level classes of phenotypes are included in the MPO database, many of which are related to specific tissue types. As gene expression is profiled in greater numbers of tissues, many more MPO phenotype branches can be investigated in the same way. In fact, this method can make use of any known gene–phenotype relation and is not limited to MPO entries. Furthermore, if the researchers only have a handful of genes to start with, real-time RT–PCR instead of microarrays can be used to measure the gene expression levels in relevant tissues. The statistical power of the genetical genomics data sets available is also undergoing rapid improvement. The current expression data set consists of only 42 strains, and expression profiling of the remaining BXD RI strains will improve our power to detect QTLs. Also, an ambitious project with the goal of generating 1000 RI lines by crossing eight common inbred mouse strains is underway (55). This project, the Collaborative Cross, increases the sample size and power by over 10-fold. By increasing the size of the mapping panel and/or by including more variations between parental strains, many more significant QTLs and QTGs will be identified.

The final data-availability limitation is categorical: our method does not currently incorporate data relating mRNA abundance to protein abundance and function, because whole proteome data sets are not yet available. Since phenotype manifestation involves complex interactions at the levels of DNA polymorphism, mRNA abundance and protein abundance and modification, there are important layers of interaction we are not able to identify. Whole proteome data sets are rapidly approaching reality, however, and our method is readily extensible to include these data as they become available.


    MATERIALS AND METHODS
Data collection and microarray experiments
MP annotations of mouse genes were retrieved from the Mammalian Phenotype Browser (38). We focused on MP terms descended from the behavior/neurological phenotype branch (MP:0005386). Noting that annotations were usually assigned at the most specific levels, we recursively included annotations implied by the structure of the MPO, so that if an MP term was associated with a gene, then all the ancestors of the MP term were also associated with that gene. The steady-state transcript levels of these genes in brains from 42 BXD strains were measured using Affymetrix M430 microarrays. Samples from brain tissues (forebrain minus olfactory bulb, plus the entire midbrain) were hybridized in small pools (n = 3) of closely matched age and same sex to a total of 83 Affymetrix M430A and B array pairs. A table that summarizes information on strain, sex and age of the mice is available at the Genenetwork website (http://www.genenetwork.org/dbdoc/IBR_M_0405_R.html). The original microarray data were processed using the Robust Multichip Average method (56). The genotypes of the BXD strains were characterized using 3795 markers that have been carefully error-checked (57,58).
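The recursive inclusion of annotations along the MPO structure can be sketched in Python as follows. This is only an illustration under stated assumptions: the term names other than MP:0005386, the parent–child edges and the gene assignments are hypothetical, and the function name `propagate_annotations` is ours.

```python
from collections import defaultdict


def propagate_annotations(parents, direct):
    """Return term -> set of genes, including genes inherited from descendants.

    parents: dict mapping a term to the list of its parent terms (a DAG).
    direct:  dict mapping a term to the set of genes annotated to it directly.
    """
    full = defaultdict(set)
    for term, genes in direct.items():
        # Walk upward from the annotated term, visiting each ancestor once.
        stack, seen = [term], set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            full[current] |= genes
            stack.extend(parents.get(current, []))
    return dict(full)


# Toy DAG: "child_A", "child_B" and "leaf" are hypothetical terms; "leaf" has
# two parents, which a tree could not represent but a DAG can.
parents = {
    "leaf": ["child_A", "child_B"],
    "child_A": ["MP:0005386"],
    "child_B": ["MP:0005386"],
}
direct = {"leaf": {"gene1"}, "child_B": {"gene2"}}

annotated = propagate_annotations(parents, direct)
```

After propagation, the gene set of any child term is a subset of the gene set of each of its ancestors, which is the property the mapping step relies on.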

QTL mapping and significance testing
Genome-wide QTL mapping was carried out for each transcript of interest following standard marker regression mapping protocols (59). A trans-acting eQTL was conservatively defined as an eQTL that mapped to a chromosome different from the one on which the target gene regulated by the eQTL is located. The highest trans-acting LOD of each transcript across the genome was determined and the corresponding empirical P-value (60) was estimated by 5000 independent permutations of the original transcript trait values. For each MP term (i.e. each node in the MPO DAG), we used Bonferroni correction or q-value (pointwise false discovery rate) (61) to correct for testing of multiple genes associated with it. An eQTL was considered significant and retained for further analysis only if the Bonferroni corrected P < 5% or q < 10%. The confidence intervals of the retained eQTLs were estimated using the 1.5 LOD rule (62,63).
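A minimal Python sketch of the permutation step is given below. It assumes genotypes coded 0/1 for the two parental alleles, computes the single-marker regression LOD as −(n/2)·log10(1 − r²), and uses the (hits + 1)/(permutations + 1) estimator for the empirical P-value; these details, the toy markers and the function names are assumptions of the sketch and not a description of the software actually used.

```python
import math
import random


def lod_score(genotypes, trait):
    """Single-marker regression LOD: -(n/2) * log10(1 - r^2)."""
    n = len(trait)
    mean_g = sum(genotypes) / n
    mean_t = sum(trait) / n
    sxy = sum((g - mean_g) * (t - mean_t) for g, t in zip(genotypes, trait))
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((t - mean_t) ** 2 for t in trait)
    if sxx == 0 or syy == 0:
        return 0.0
    r2 = min(sxy * sxy / (sxx * syy), 1 - 1e-12)  # guard against log10(0)
    return -(n / 2) * math.log10(1 - r2)


def max_trans_lod(markers, trait, gene_chrom):
    """Highest LOD over markers on chromosomes other than the gene's own."""
    return max(lod_score(g, trait) for chrom, g in markers if chrom != gene_chrom)


def permutation_p(markers, trait, gene_chrom, n_perm=5000, seed=0):
    """Empirical P-value of the highest trans-acting LOD by trait permutation."""
    rng = random.Random(seed)
    observed = max_trans_lod(markers, trait, gene_chrom)
    shuffled = list(trait)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if max_trans_lod(markers, shuffled, gene_chrom) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy data: three markers as (chromosome, genotypes across 8 strains).
markers = [
    ("1", [0, 0, 0, 0, 1, 1, 1, 1]),
    ("2", [0, 1, 0, 1, 0, 1, 0, 1]),
    ("3", [0, 0, 1, 1, 0, 0, 1, 1]),
]
trait = [1.0, 1.2, 0.9, 1.1, 2.0, 2.1, 1.9, 2.2]

# The target gene is taken to lie on chromosome 2, so that marker is skipped.
observed, p_value = permutation_p(markers, trait, "2", n_perm=1000)
```

Permuting the trait values while keeping the genotypes fixed preserves the marker correlation structure, so the maximum over the genome is compared against an appropriate null distribution. The resulting P-values would then be Bonferroni adjusted over the genes attached to each MP term.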

Assessing candidate QTGs by expression analysis
If the QTG affects the downstream target gene expression by altering its own transcript level, the candidate QTG should itself have a cis-acting eQTL. We screened for candidate QTGs with cis-acting eQTLs that overlap the trans-acting eQTL interval of the target gene. Because we need only test the trans-acting eQTL interval instead of the whole genome, we simply required LOD ≥ 4.3. Next, all the candidates with cis-acting eQTLs were assessed for pairwise PCCs between their transcript level and the target gene transcript level. The raw P-value of PCC was Bonferroni adjusted to correct for testing multiple transcripts in the QTL interval. Genes having a cis-acting eQTL plus a significant PCC (adjusted P < 0.01) were considered candidate QTGs.
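The two-step screen (cis-acting eQTL with LOD ≥ 4.3, then a Bonferroni-adjusted PCC test) might be sketched in Python as below. Several choices are assumptions of the sketch rather than statements about the original analysis: the P-value of the correlation is approximated with the Fisher z-transform, the Bonferroni factor is the number of candidates that pass the cis-eQTL filter, and the gene names and expression values are invented.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def fisher_p(r, n):
    """Two-sided P-value for r via the Fisher z approximation (an assumption)."""
    r = max(min(r, 0.999999), -0.999999)
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def candidate_qtgs(target, candidates, min_cis_lod=4.3, alpha=0.01):
    """Keep genes with a cis-eQTL and a Bonferroni-significant PCC."""
    with_cis = {g: c for g, c in candidates.items() if c["cis_lod"] >= min_cis_lod}
    n_tests = len(with_cis)  # Bonferroni factor; this choice is an assumption
    selected = {}
    for gene, info in with_cis.items():
        r = pearson_r(target, info["expr"])
        p_adj = min(1.0, fisher_p(r, len(target)) * n_tests)
        if p_adj < alpha:
            selected[gene] = (r, p_adj)
    return selected


# Hypothetical expression values across 10 strains.
target = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
candidates = {
    "geneA": {"cis_lod": 6.0, "expr": [1.1, 2.0, 2.9, 4.2, 5.1, 5.9, 7.0, 8.1, 9.2, 9.9]},
    "geneB": {"cis_lod": 5.0, "expr": [5, 1, 4, 2, 6, 3, 5, 2, 4, 3]},
    "geneC": {"cis_lod": 2.0, "expr": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]},
}

selected = candidate_qtgs(target, candidates)
```

In this toy case only the first gene survives: the second has a cis-eQTL but no correlation with the target, and the third is correlated but lacks a cis-eQTL.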

nsSNPs analysis
Genes within eQTL intervals that harbored between-strain nsSNPs were automatically considered QTG candidates. SNPs annotated as missense or nonsense mutations between two mouse strains (C57BL/6J versus DBA/2J) were extracted from the Celera mouse RefSNP database (64). Entries that also have within-strain nsSNPs were filtered out. The position of an nsSNP in the mouse genome (NCBI Build 33) was determined by aligning its flanking DNA sequences with the genome sequence using the BLAT program (65). The genomic locations for all RefSeq (66) mRNA transcripts were obtained from the UCSC Genome Browser site (genome.ucsc.edu). We required an nsSNP to be located within the coding region of the gene and to produce an amino acid change in the same transcription orientation. Thus, 4464 nsSNPs were assigned to 2624 out of 16042 genes with genome location information, of which 146 are nonsense mutations and 4318 are missense mutations.

TFBS analysis
For each predicted regulatory pair where the regulatory gene encodes a transcription factor, we used the MATCH program (67) to screen putative TFBSs in the 1000 bp upstream of the target gene transcription start site, which was extracted from the mouse genome annotation database (68). Putative TFBSs were identified when the computed matrix similarity scores were higher than MATCH's preset cutoffs, which were designed to minimize the sum of false-positive and false-negative error rates.
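The idea of scanning a promoter with a position matrix and keeping windows above a similarity cutoff can be illustrated with the simplified Python sketch below. It is not the MATCH algorithm: the score is a plain min–max normalization of summed matrix counts, only the forward strand is scanned, and the matrix, sequence and cutoff are invented for illustration.

```python
def matrix_similarity(pfm, site):
    """Score a site against a position frequency matrix, scaled to [0, 1].

    pfm: list of dicts, one per position, mapping base -> count.
    Simplified score: (current - min) / (max - min) over summed counts.
    """
    current = sum(col.get(base, 0) for col, base in zip(pfm, site))
    lowest = sum(min(col.values()) for col in pfm)
    highest = sum(max(col.values()) for col in pfm)
    return (current - lowest) / (highest - lowest)


def scan_promoter(pfm, promoter, cutoff):
    """Return (offset, site, score) for every window scoring >= cutoff."""
    width = len(pfm)
    hits = []
    for i in range(len(promoter) - width + 1):
        site = promoter[i:i + width]
        score = matrix_similarity(pfm, site)
        if score >= cutoff:
            hits.append((i, site, round(score, 3)))
    return hits


# Hypothetical 4-bp matrix with consensus AGTC and a toy upstream sequence.
pfm = [
    {"A": 8, "C": 1, "G": 1, "T": 0},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 1, "C": 0, "G": 0, "T": 9},
    {"A": 0, "C": 9, "G": 1, "T": 0},
]
promoter = "TTAGTCGGAGTGCC"

hits = scan_promoter(pfm, promoter, cutoff=0.9)
```

In practice the sequence would be the 1000 bp upstream of the transcription start site, both strands would be scanned, and the cutoff would be matrix-specific rather than a single fixed value.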


    SUPPLEMENTARY MATERIAL
Supplementary Material is available at HMG Online.


    ACKNOWLEDGEMENTS
 
This work was supported by a PhRMA Foundation grant (Y.C.), and NIH grants HD052472 (D.G.), AA014425 (L.L.) and DA021131 (R.W.W.).

Conflict of Interest statement. None declared.


    REFERENCES

  1. Damerval C., Maurice A., Josse J.M., de-Vienne D. Quantitative trait loci underlying gene product variation: a novel perspective for analyzing regulation of genome expression. Genetics (1994) 137:289–301.

  2. Jansen R.C., Nap J.P. Genetical genomics: the added value from segregation. Trends Genet. (2001) 17:388–391.

  3. Brem R.B., Yvert G., Clinton R., Kruglyak L. Genetic dissection of transcriptional regulation in budding yeast. Science (2002) 296:752–755.

  4. Schadt E.E., Monks S.A., Drake T.A., Lusis A.J., Che N., Colinayo V., Ruff T.G., Milligan S.B., Lamb J.R., Cavet G., et al. Genetics of gene expression surveyed in maize, mouse and man. Nature (2003) 422:297–302.

  5. Chesler E.J., Williams R.W. Brain gene expression: genomics and genetics. Int. Rev. Neurobiol. (2004) 60:59–95.

  6. Morley M., Molony C.M., Weber T.M., Devlin J.L., Ewens K.G., Spielman R.S., Cheung V.G. Genetic analysis of genome-wide variation in human gene expression. Nature (2004) 430:743–747.

  7. Chesler E.J., Lu L., Shou S., Qu Y., Gu J., Wang J., Hsu H.C., Mountz J.D., Baldwin N.E., Langston M.A., et al. Complex trait analysis of gene expression uncovers polygenic and pleiotropic networks that modulate nervous system function. Nat. Genet. (2005) 37:233–242.

  8. Baldwin N.E., Chesler E.J., Kirov S., Langston M.A., Snoddy J.R., Williams R.W., Zhang B. Computational, integrative, and comparative methods for the elucidation of genetic coexpression networks. J. Biomed. Biotechnol. (2005) 2005:172–180.

  9. Schadt E.E. Exploiting naturally occurring DNA variation and molecular profiling data to dissect disease and drug response traits. Curr. Opin. Biotechnol. (2005) 16:647–654.

  10. de Koning D.J., Haley C.S. Genetical genomics in humans and model organisms. Trends Genet. (2005) 21:377–381.

  11. Li J., Burmeister M. Genetical genomics: combining genetics with gene expression analysis. Hum. Mol. Genet. (2005) 14:R163–R169.

  12. Li H., Lu L., Manly K.F., Chesler E.J., Bao L., Wang J., Zhou M., Williams R.W., Cui Y. Inferring gene transcriptional modulatory relations: a genetical genomics approach. Hum. Mol. Genet. (2005) 14:1119–1125.

  13. Hubner N., Wallace C.A., Zimdahl H., Petretto E., Schulz H., Maciver F., Mueller M., Hummel O., Monti J., Zidek V., et al. Integrated transcriptional profiling and linkage analysis for identification of genes underlying disease. Nat. Genet. (2005) 37:243–253.

  14. Schadt E.E., Lamb J., Yang X., Zhu J., Edwards S., Guhathakurta D., Sieberts S.K., Monks S., Reitman M., Zhang C., et al. An integrative genomics approach to infer causal associations between gene expression and disease. Nat. Genet. (2005) 37:710–717.

  15. Brem R.B., Kruglyak L. The landscape of genetic complexity across 5,700 gene expression traits in yeast. Proc. Natl Acad. Sci. USA (2005) 102:1572–1577.

  16. Cheung V.G., Spielman R.S., Ewens K.G., Weber T.M., Morley M., Burdick J.T. Mapping determinants of human gene expression by regional and genome-wide association. Nature (2005) 437:1365–1369.

  17. Bystrykh L., Weersing E., Dontje B., Sutton S., Pletcher M.T., Wiltshire T., Su A.I., Vellenga E., Wang J., Manly K.F., et al. Uncovering regulatory pathways that affect hematopoietic stem cell function using ‘genetical genomics’. Nat. Genet. (2005) 37:225–232.

  18. Lan H., Chen M., Flowers J.B., Yandell B.S., Stapleton D.S., Mata C.M., Mui E.T., Flowers M.T., Schueler K.L., Manly K.F., et al. Combined expression trait correlations and expression quantitative trait locus mapping. PLoS Genet. (2006) 2:e6.[CrossRef][Medline]

  19. Bao L., Wei L., Peirce J., Homayouni R., Li H., Zhou M., Chen H., Lu L., Williams R., Pfeffer L., et al. Combining gene expression QTL mapping and phenotypic spectrum analysis to uncover gene regulatory relations. Mamm. Genome (2006) 17:575–583.[CrossRef][Web of Science][Medline]

  20. Cui Y. Elucidating gene regulatory networks underlying complex phenotypes: genetical genomics and Bayesian network. In: Microarrays and Transcription Networks—Shannon F., ed. (2006) Georgetown: Landes Bioscience. 114–126.

  21. Williams R.W. Expression genetics and the phenotype revolution. Mamm. Genome (2006) 17:496–502.[CrossRef][Web of Science][Medline]

  22. Wang S., Yehya N., Schadt E.E., Wang H., Drake T.A., Lusis A.J. Genetic and genomic analysis of a fat mass trait with complex inheritance reveals marked sex specificity. PLoS Genet. (2006) 2:e15.[CrossRef]

  23. Rockman M.V., Kruglyak L. Genetics of global gene expression. Nat. Rev. Genet. (2006) 7:862.[CrossRef][Web of Science][Medline]

  24. Drake T.A., Schadt E.E., Lusis A.J. Integrating genetic and gene expression data: application to cardiovascular and metabolic traits in mice. Mamm. Genome (2006) 17:466–479.[CrossRef][Web of Science][Medline]

  25. Ghazalpour A., Doss S., Zhang B., Wang S., Plaisier C., Castellanos R., Brozell A., Schadt E.E., Drake T.A., Lusis A.J., et al. Integrating genetic and network analysis to characterize genes related to mouse weight. PLoS Genet. (2006) 2:e130.[CrossRef][Medline]

  26. Tu Z., Wang L., Arbeitman M.N., Chen T., Sun F. An integrative approach for causal gene identification and gene regulatory pathway inference. Bioinformatics (2006) 22:e489–e496.[Abstract/Free Full Text]

  27. Li H., Chen H., Bao L., Manly K.F., Chesler E.J., Lu L., Wang J., Zhou M., Williams R.W., Cui Y. Integrative genetic analysis of transcription modules: towards filling the gap between genetic loci and inherited traits. Hum. Mol. Genet. (2006) 15:481–492.[Abstract/Free Full Text]

  28. Yang X., Schadt E.E., Wang S., Wang H., Arnold A.P., Ingram-Drake L., Drake T.A., Lusis A.J. Tissue-specific expression and regulation of sexually dimorphic genes in mice. Genome Res. (2006).

  29. Keurentjes J.J.B., Fu J., Terpstra I.R., Garcia J.M., van den Ackerveken G., Snoek L.B., Peeters A.J.M., Vreugdenhil D., Koornneef M., Jansen R.C. Regulatory network construction in Arabidopsis by using genome-wide gene expression quantitative trait loci. Proc. Natl Acad. Sci. USA (2007) 104:1708–1713.[Abstract/Free Full Text]

  30. Bao L., Zhou M., Wu L., Lu L., Goldowitz D., Williams R.W., Cui Y. PolymiRTS database: linking polymorphisms in microRNA target sites with complex traits. Nucleic Acids Res. (2007) 35:D51–D54.[Abstract/Free Full Text]

  31. Bing N., Hoeschele I. Genetical genomics analysis of a yeast segregant population for transcription network inference. Genetics (2005) 170:533–542.[Abstract/Free Full Text]

  32. Li Y., Alvarez O.A., Gutteling E.W., Tijsterman M., Fu J., Riksen J.A., Hazendonk E., Prins P., Plasterk R.H., Jansen R.C., et al. Mapping determinants of gene expression plasticity by genetical genomics in C. elegans. PLoS Genet. (2006) 2:e222.[CrossRef][Medline]

  33. Kulp D., Jagalur M. Causal inference of regulator-target pairs by gene mapping of expression phenotypes. BMC Genomics (2006) 7:125.[CrossRef][Medline]

  34. Abiola O., Angel J.M., Avner P., Bachmanov A.A., Belknap J.K., Bennett B., Blankenhorn E.P., Blizard D.A., Bolivar V., Brockmann G.A., et al. The nature and identification of quantitative trait loci: a community's view. Nat. Rev. Genet. (2003) 4:911–916.[Web of Science][Medline]

  35. Goldowitz D., Frankel W.N., Takahashi J.S., Holtz-Vitaterna M., Bult C., Kibbe W.A., Snoddy J., Li Y., Pretel S., Yates J., et al. Large-scale mutagenesis of the mouse to understand the genetic bases of nervous system structure and function. Mol. Brain Res. (2004) 132:105.[Medline]

  36. Austin C.P., Battey J.F., Bradley A., Bucan M., Capecchi M., Collins F.S., Dove W.F., Duyk G., Dymecki S., Eppig J.T., et al. The knockout mouse project. Nat. Genet. (2004) 36:921–924.[CrossRef][Web of Science][Medline]

  37. Auwerx J., Avner P., Baldock R., Ballabio A., Balling R., Barbacid M., Berns A., Bradley A., Brown S., Carmeliet P., et al. The European dimension for the mouse genome mutagenesis program. Nat. Genet. (2004) 36:925–927.[CrossRef][Web of Science][Medline]

  38. Smith C.L., Goldsmith C.A., Eppig J.T. The Mammalian Phenotype Ontology as a tool for annotating, analyzing and comparing phenotypic information. Genome Biol. (2005) 6:R7.[CrossRef][Medline]

  39. Taylor B.A. Recombinant inbred strains. In: Genetic Variation in the Laboratory Mouse (1989) 2nd edn. 773–796.

  40. Peirce J.L., Lu L., Gu J., Silver L.M., Williams R.W. A new set of BXD recombinant inbred lines from advanced intercross populations in mice. BMC Genet. (2004) 5:7.[CrossRef][Medline]

  41. Maeno H., Yamada K., Santo-Yamada Y., Aoki K., Sun Y.J., Sato E., Fukushima T., Ogura H., Araki T., Kamichi S., et al. Comparison of mice deficient in the high- or low-affinity neurotensin receptors, Ntsr1 or Ntsr2, reveals a novel function for Ntsr2 in thermal nociception. Brain Res. (2004) 998:122–129.[CrossRef][Web of Science][Medline]

  42. Mogil J.S., Richards S.P., O'Toole L.A., Helms M.L., Mitchell S.R., Belknap J.K. Genetic sensitivity to hot-plate nociception in DBA/2J and C57BL/6J inbred mouse strains: possible sex-specific mediation by delta2-opioid receptors. Pain (1997) 70:267–277.[CrossRef][Web of Science][Medline]

  43. Scarceriaux V., Souaze F., Bachelet C.M., Forgez P., Bourdel E., Martinez J., Rostene W., Pelaprat D. Neurotensin receptor down-regulation induced by dexamethasone and forskolin in rat hypothalamic cultures is mediated by endogenous neurotensin. J. Neuroendocrinol. (1996) 8:587–593.[CrossRef][Web of Science][Medline]

  44. Sluka K.A. Stimulation of deep somatic tissue with capsaicin produces long-lasting mechanical allodynia and heat hypoalgesia that depends on early activation of the cAMP pathway. J. Neurosci. (2002) 22:5687–5693.[Abstract/Free Full Text]

  45. Matys V., Fricke E., Geffers R., Gossling E., Haubrock M., Hehl R., Hornischer K., Karas D., Kel A.E., Kel-Margoulis O.V., et al. TRANSFAC: transcriptional regulation, from patterns to profiles. Nucleic Acids Res. (2003) 31:374–378.[Abstract/Free Full Text]

  46. Libby R.T., Steel K.P. The roles of unconventional myosins in hearing and deafness. Essays Biochem. (2000) 35:159–174.[Medline]

  47. Hasson T., Gillespie P.G., Garcia J.A., MacDonald R.B., Zhao Y., Yee A.G., Mooseker M.S., Corey D.P. Unconventional myosins in inner-ear sensory epithelia. J. Cell Biol. (1997) 137:1287–1307.[Abstract/Free Full Text]

  48. Sheffield V.C. Use of isolated populations in the study of a human obesity syndrome, the Bardet–Biedl syndrome. Pediatr. Res. (2004) 55:908–911.[CrossRef][Web of Science][Medline]

  49. Jones B.C., Tarantino L.M., Rodriguez L.A., Reed C.L., McClearn G.E., Plomin R., Erwin V.G. Quantitative-trait loci analysis of cocaine-related behaviours and neurochemistry. Pharmacogenetics (1999) 9:607–617.[Web of Science][Medline]

  50. Scotland P., Zhou D., Benveniste H., Bennett V. Nervous system defects of AnkyrinB (–/–) mice suggest functional overlap between the cell adhesion molecule L1 and 440-kD AnkyrinB in premyelinated axons. J. Cell Biol. (1998) 143:1305–1315.[Abstract/Free Full Text]

  51. Hayashi T., Su T.P. Regulating ankyrin dynamics: roles of sigma-1 receptors. Proc. Natl Acad. Sci. USA (2001) 98:491–496.[Abstract/Free Full Text]

  52. Su T.P., Hayashi T. Cocaine affects the dynamics of cytoskeletal proteins via sigma(1) receptors. Trends Pharmacol. Sci. (2001) 22:456–458.[CrossRef][Medline]

  53. Cui X., Affourtit J., Shockley K.R., Woo Y., Churchill G.A. Inheritance patterns of transcript levels in F1 hybrid mice. Genetics (2006) 174:627–637.[Abstract/Free Full Text]

  54. Nadeau J.H., Balling R., Barsh G., Beier D., Brown S.D., Bucan M., Camper S., Carlson G., Copeland N., Eppig J., et al. Sequence interpretation. Functional annotation of mouse genome sequences. Science (2001) 291:1251–1255.[Free Full Text]

  55. Vogel G. Genetics. Scientists dream of 1001 complex mice. Science (2003) 301:456–457.[Abstract/Free Full Text]

  56. Bolstad B.M., Irizarry R.A., Astrand M., Speed T.P. A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics (2003) 19:185–193.[Abstract/Free Full Text]

  57. Williams R.W., Gu J., Qi S., Lu L. The genetic structure of recombinant inbred mice: high-resolution consensus maps for complex trait analysis. Genome Biol. (2001) 2:RESEARCH0046.

  58. Wiltshire T., Pletcher M.T., Batalov S., Barnes S.W., Tarantino L.M., Cooke M.P., Wu H., Smylie K., Santrosyan A., Copeland N.G., et al. Genome-wide single-nucleotide polymorphism analysis defines haplotype patterns in mouse. Proc. Natl Acad. Sci. USA (2003) 100:3380–3385.[Abstract/Free Full Text]

  59. Manly K.F., Cudmore R.H. Jr, Meer J.M. Map Manager QTX, cross-platform software for genetic mapping. Mamm. Genome (2001) 12:930–932.[CrossRef][Web of Science][Medline]

  60. Churchill G.A., Doerge R.W. Empirical threshold values for quantitative trait mapping. Genetics (1994) 138:963–971.[Abstract]

  61. Storey J.D., Tibshirani R. Statistical significance for genomewide studies. Proc. Natl Acad. Sci. USA (2003) 100:9440–9445.[Abstract/Free Full Text]

  62. Lander E.S., Botstein D. Mapping mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics (1989) 121:185–199.[Abstract/Free Full Text]

  63. Dupuis J., Siegmund D. Statistical methods for mapping quantitative trait loci from a dense set of markers. Genetics (1999) 151:373–386.[Abstract/Free Full Text]

  64. Kerlavage A., Bonazzi V., di Tommaso M., Lawrence C., Li P., Mayberry F., Mural R., Nodell M., Yandell M., Zhang J., et al. The Celera Discovery System. Nucleic Acids Res. (2002) 30:129–136.[Abstract/Free Full Text]

  65. Kent W.J. BLAT—the BLAST-like alignment tool. Genome Res. (2002) 12:656–664.[Abstract/Free Full Text]

  66. Pruitt K.D., Tatusova T., Maglott D.R. NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. (2005) 33:D501–D504.[Abstract/Free Full Text]

  67. Kel A.E., Gossling E., Reuter I., Cheremushkin E., Kel-Margoulis O.V., Wingender E. MATCH: a tool for searching transcription factor binding sites in DNA sequences. Nucleic Acids Res. (2003) 31:3576–3579.[Abstract/Free Full Text]

  68. Kent W.J., Sugnet C.W., Furey T.S., Roskin K.M., Pringle T.H., Zahler A.M., Haussler D. The human genome browser at UCSC. Genome Res. (2002) 12:996–1006.[Abstract/Free Full Text]

