Human Molecular Genetics Advance Access originally published online on September 9, 2005
Human Molecular Genetics 2005 14(20):3057-3063; doi:10.1093/hmg/ddi338

© The Author 2005. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Mapping cis-regulatory domains in the human genome using multi-species conservation of synteny

Nadav Ahituv1,2,†, Shyam Prabhakar1,2,†, Francis Poulin1,‡, Edward M. Rubin1,2 and Olivier Couronne1,2,*

1Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA and 2US DOE Joint Genome Institute, Walnut Creek, CA, USA

* To whom correspondence should be addressed at: Genomics Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, MS 84-171, Berkeley, CA 94720, USA. Tel: +1 5104865468; Fax: +1 5104864229; Email: ocouronne@mac.com

Received July 1, 2005; Revised August 31, 2005; Accepted September 6, 2005


    ABSTRACT
 
Our inability to associate distant regulatory elements with the genes they regulate has largely precluded their examination for sequence alterations contributing to human disease. One major obstacle is the large genomic space surrounding targeted genes in which such elements could potentially reside. In order to delineate gene regulatory boundaries, we used whole-genome human–mouse–chicken (HMC) and human–mouse–frog (HMF) multiple alignments to compile conserved blocks of synteny (CBSs), under the hypothesis that these blocks have been kept intact throughout evolution at least in part by the requirement of regulatory elements to stay linked to the genes they regulate. A total of 2116 and 1942 CBSs >200 kb were assembled for HMC and HMF, respectively, encompassing 1.53 and 0.86 Gb of human sequence. To support the existence of complex long-range regulatory domains within these CBSs, we analyzed the prevalence and distribution of chromosomal aberrations leading to position effects (disruption of a gene's regulatory environment), observing a clear bias not only for mapping onto CBS but also for longer CBS size. Our results provide an extensive data set characterizing the regulatory domains of genes and the conserved regulatory elements within them.


    INTRODUCTION
 
With the availability of the human and other vertebrate genomes, the annotation and analysis of distant regulatory elements using comparative genomics have been greatly facilitated (1,2). Studies using this approach suggest that cis-regulatory elements may lie very far from the genes they regulate (3,4), pointing to one of the difficulties in associating alterations within them with human disease. Primary insights into the ability of these alterations to cause human disease were obtained through chromosomal rearrangements causing position effects. Position effects lead to a phenotype similar to that resulting from mutations within the gene, thought to be brought about by the removal of a gene's regulatory environment (5,6), and thus provide evidence that disruption of distant regulatory architecture can lead to human disease. However, the association of human disease with nucleotide changes among these distant regulatory elements is hindered by the unavailability of a regulatory code and by the inability to link them to particular genes.

A mechanism by which evolutionary constraints against chromosomal breakage are thought to be maintained is the need for distant cis-regulatory elements to remain in the vicinity of the genes they act on. On the basis of this assumption, synteny blocks (chromosomal segments in which all sequences are in the same order and orientation in the species analyzed) can be used to delimit borders for distant cis-regulatory elements regulating a given gene, a strategy that has been minimally explored (3,7,8). To identify syntenic blocks on a whole-genome scale, we generated multiple alignments of the HMC genomes as well as alignments of the HMF genomes. We reasoned that these genomes would be the most suitable for this analysis because they provide adequate evolutionary divergence. Characterization of these conserved blocks of synteny (CBSs) revealed a decrease in gene density and an increase in the density and evolutionary conservation of conserved non-coding sequences (CNSs) with block size. In order to validate the existence of distal regulatory networks within these blocks, we assessed the prevalence and distribution of position effects within them.


    RESULTS
 
To identify syntenic blocks on a whole-genome scale, we generated multiple alignments of HMC genomes as well as alignments of HMF genomes. Alignments were carried out by locally aligning the genomes to one another and then applying a computational algorithm to cluster all these alignments into an n-dimensional segmental map (Materials and Methods). HMC and HMF were found to have 2116 and 1942 CBSs (Supplementary Material, Table S1), the largest being 5.68 and 2.93 Mb, respectively, with a cumulative length of 1.53 and 0.86 Gb in human (Table 1).


Table 1. Characteristics of the HMC and the HMF CBSs
 
Characterization of CNSs and genes within CBSs
We then characterized these CBSs in order to assess whether they have any unique sequence attributes. Analysis of their CNSs, evolutionarily conserved sequences that show no evidence of transcription (Materials and Methods), was carried out using Gumby (S. Prabhakar, manuscript in preparation), a tool for detecting statistically significant conserved regions in pairwise or multiple alignments of DNA sequences (Materials and Methods). Gumby identified a total of 37 191 CNSs for HMC and 9884 CNSs for HMF, with a Gumby conservation P-value of <0.01 [at this threshold, Gumby assigns 3% of the human genome to human–mouse conserved regions compared to the 5% quoted in Waterston et al. (9)].

We next analyzed several aspects of transcribed sequence within our CBSs, finding, among other things, no correlation between gene size and CBS size and no enrichment for miRNAs within our CBSs [only 35% (135/377) of miRNAs mapped to our HMC CBSs, which encompass half of the human genome]. However, analysis of the distribution of CNSs and genes within our CBSs shows that CBS size is inversely correlated with gene density (decreasing from 0.9 genes per 100 kb in the shortest segments of both multiple alignments to 0.4 and 0.3 genes per 100 kb in the longest HMC and HMF CBSs; Fig. 1A) and directly correlated with the median CNS density (8-fold difference between the shortest and longest CBSs; Fig. 1A). We also found CBS size to be directly correlated with the degree of CNS conservation, with the median Gumby P-value dropping from 10^-7 and 10^-6 for the smallest segments to 10^-14 and 10^-11 for the largest segments in HMC and HMF, respectively (Fig. 1B). This indicates that longer CBSs harbor fewer genes and CNSs that are both denser and more strongly conserved. As an example of these trends, our analysis of chromosome 16 shows a long CBS covering 5.6 Mb with gene and non-coding densities per 100 kb of 0.5 and 7.5, compared with a shorter CBS covering 894 kb with gene and non-coding densities per 100 kb of 1.5 and 1.0 (Fig. 2).
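The per-100 kb densities quoted for the two chromosome 16 blocks can be reproduced with a short calculation. The sketch below is illustrative only: the function and variable names are hypothetical, the CNS and gene counts are those given in the Figure 2 legend, and the length of the longer block is assumed to be 5.68 Mb (the size reported for the largest HMC CBS) rather than the rounded 5.6 Mb.

```python
def per_100kb(count, length_bp):
    """Number of features per 100 kb of sequence."""
    return count / (length_bp / 100_000)

# Counts from the Figure 2 legend; 5.68 Mb is an assumed unrounded length.
long_cbs = {"length_bp": 5_680_000, "cns": 428, "genes": 27}
short_cbs = {"length_bp": 894_000, "cns": 9, "genes": 13}

for name, cbs in (("long CBS", long_cbs), ("short CBS", short_cbs)):
    cns_density = per_100kb(cbs["cns"], cbs["length_bp"])
    gene_density = per_100kb(cbs["genes"], cbs["length_bp"])
    print(f"{name}: {cns_density:.1f} CNSs and {gene_density:.1f} genes per 100 kb")
```

Under these assumptions the rounded values agree with the densities stated in the text for both blocks.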



Figure 1. CBS size is directly correlated with CNS density and evolutionary conservation and inversely correlated with gene density in HMC and HMF multiple alignments. (A) Median CNS and gene density compared with CBS size. (B) Median Gumby P-value of the evolutionary conservation of CNS compared with CBS size.

 


Figure 2. Human chromosome 16 as an example of CBS trends. (A) HMC CBSs (colors of blocks indicate the different chicken and mouse chromosomes that the sequence is derived from). (B) Normalized density of HMC CNSs; densities are normalized so that the darkest shade in each track denotes 3.5 times the genomic average. (C) Conservation plot of two HMC syntenic segments. Conserved regions with a Gumby P-value <0.01 are depicted as blue (exonic) and magenta (non-exonic) bars (bar height is directly correlated with evolutionary conservation), with the gene structure shown below them. The longer CBS is 5.6 Mb in human, containing 428 HMC CNSs and 27 genes including SALL1, with a non-coding density and gene density of 7.5 and 0.5, respectively, per 100 kb. The shorter CBS is 894 kb long in human and contains nine HMC CNSs and 13 genes, giving a non-coding density and gene density of 1.0 and 1.5, respectively, per 100 kb. The red arrow depicts the approximate region of the chromosomal translocation leading to Townes–Brocks syndrome in one patient (18).

 
To examine whether there is a functional uniqueness to the genes that are located near CNSs within the CBSs, we analyzed their biological processes and molecular functions using the GO database (10,11). We identified the gene closest in distance to each CNS, and when many CNSs had the same closest gene, this gene was counted only once. Overall, we observed that the set of genes flanking these CNSs is enriched for genes involved in transcription regulation or development, with 40% (618 genes when compared with 441 expected, P-value=10^-17; Z-score test) and 69% (278 genes when compared with 164 expected, P-value=10^-19; Z-score test) more such genes than expected for HMC and HMF, respectively (see Materials and Methods). This indicates that these clusters of highly conserved CNSs tend to reside in the vicinity of genes involved in transcription regulation or development and are likely to regulate them.

Mapping and distribution of position effects within CBSs
In order to validate that our CBSs harbor distant regulatory structure, we analyzed the prevalence and distribution of position effects within them. We searched the literature for position effects leading to human disease where the regulatory elements were removed from the postulated regulated gene; for the 17 that fit these criteria (Table 2), we used the target gene as an anchor for our alignments. As control groups, we chose the entire set of known genes in the human genome and the deleted subset of large-scale copy number polymorphisms (CNPs) (12), chromosomal deletions leading to no apparent phenotype, encompassing 44 alignable CNPs (see Materials and Methods). Because of the incomplete nature of the frog genome, several of these regions were absent in the HMF alignments; hence, this analysis was performed using HMC only. In terms of prevalence, we observed a skew where 88% (15/17) of position effects mapped to our CBSs versus ~50% of CNPs and all known genes (Table 3). The likelihood that at least 15 of 17 genes randomly chosen from the set of 20 399 known genes would map to HMC CBSs is ~0.0018 (one-sided P-value from Fisher's exact test). It is worth noting that the two genes that did not map to our >200 kb CBSs are alpha-globin and beta-globin, most likely because of the extensive genomic variations between species in these regions (13). Interestingly, these are the only non-developmental genes in our position effect list, and other than SOST, these are the only genes that are not transcription factors.
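The one-sided probability quoted above corresponds to a hypergeometric tail, which can be sketched as follows. The calculation assumes that 10 546 of the 20 399 known genes map to HMC CBSs; that count is inferred from the denominator reported for known genes in the CBS size comparison rather than stated explicitly for this test, and the function name is hypothetical.

```python
from math import comb

def hypergeom_tail(k_min, draws, successes, population):
    """P(X >= k_min) when `draws` items are sampled without replacement
    from `population` items, of which `successes` carry the property."""
    total = comb(population, draws)
    hits = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(k_min, min(draws, successes) + 1)
    )
    return hits / total

# Assumed inputs: 17 position-effect genes, at least 15 inside CBSs,
# 10 546 of 20 399 known genes inside HMC CBSs (inferred, see lead-in).
p_value = hypergeom_tail(15, 17, 10_546, 20_399)
print(f"one-sided P = {p_value:.4f}")
```

With these inputs the tail probability comes out close to the ~0.0018 reported in the text, which suggests the inferred gene count is a reasonable reading of the data.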


Table 2. Characterization of position effects within HMC CBSs
 

Table 3. Prevalence of known genes, CNP deletions and position effects in HMC CBSs
 
We next analyzed the sizes of the CBSs, restricting the comparison to the position effects, CNPs and known genes that mapped within the HMC alignment. This examination revealed that whereas 80% (12/15) of the position effects fall within the top half of CBSs in terms of length, only 47.6% (10/21) of CNPs and 43.6% (4598/10 546) of all known genes are in the upper 50th percentile of CBSs (Fig. 3). Combined, these results support the existence of long-range regulatory elements within our CBSs and suggest that disruption of long syntenic blocks is more likely to lead to a more striking and observable phenotype.



Figure 3. Over-representation of position effect genes compared with deleted CNPs and known genes in HMC CBSs (blue, all genes; purple, deleted CNPs; red, position effects). The Y-axis represents the percentage of blocks proportional to CBS size, which is the X-axis.

 

    DISCUSSION
 
Examining multiple alignments between several vertebrates allowed us to characterize CBSs on a whole-genome scale. CBS size showed an inverse correlation with gene density and a direct correlation with the density and evolutionary conservation of CNSs. These findings are in accordance with those from the chicken genome analysis (14), where it was observed that many of the human–chicken CNSs appear in clusters far away from genes. Combined, these observations suggest that longer regions of unbroken synteny are enriched for regulatory information.

Our analysis of the genes in the vicinity of these CNSs displayed enrichment for genes involved in transcription regulation or development. This is consistent with a previous characterization of human gene deserts (the 3% longest intergenic intervals in the human genome, with the shortest one covering 640 kb), where stable gene deserts, deserts whose sequence harbors substantially denser human–chicken CNSs, were shown to be enriched for transcription regulation and developmental genes (15). Our HMC CBSs overlap 161 of the 169 such stable deserts that were successfully converted to the May 2004 human genome assembly, with 82% of the stable deserts overlapping at the nucleotide level. Our findings are also consistent with a study that characterized non-coding ultra-conserved regions (UCRs) using human–mouse–fugu sequence alignments and found them to be clustered near transcription factors and regulators of development (16). A comparison of these clusters with our data shows that HMC CBSs cover 94% (133/141) at the level of segment overlap and 88% at the nucleotide level. In addition, a recent study using the Drosophila melanogaster and Caenorhabditis elegans genomes (17) demonstrated that transcription factors and developmental genes are flanked by significantly more intergenic DNA than other genes with simpler functions. In our study, a similar theme was observed in vertebrates, in addition to the observation that these regions have denser and more strongly conserved CNSs. This further supports the theory proposed in the metazoan study (17) that an expansion of intergenic regions occurred during evolution in order to accommodate numerous cis-regulatory elements. The drawback in this evolutionary scenario is the increased probability of a random chromosomal rearrangement event that would disrupt the cis-regulatory architecture or other unknown factors and that would lead to developmental defects. Thus, complex cis-architecture may be one of the major constraints on chromosomal rearrangement during evolution.

Position effects are an ideal subset of chromosomal aberrations to study the disruption of long-range regulatory domains. Using this subset, we were able to validate that our CBSs harbor distant regulatory architecture which when disrupted may lead to human disease. One such example is SALL1, a gene that maps to the longest CBS in our data set, 5.6 Mb in size (Fig. 2). Mutations in SALL1 lead to Townes–Brocks syndrome; also, a translocation in one patient ~180 kb telomeric to SALL1 leads to the same syndrome (18) (Table 2). Analysis of this region in our CBS list (Supplementary Material, Table S1) suggests that removal of this 3.1 Mb region encompassing 241 CNSs (Fig. 2) is the cause of Townes–Brocks syndrome in this patient. In addition, transgenic mouse data indicate that a significant percentage of non-coding sequences conserved between human and fugu in this segment behave as enhancer elements, with expression patterns similar to those of SALL1 (N. Ahituv, manuscript in preparation).

An important point regarding the disruption of regulatory architecture as a cause of human disease is that the phenotype caused by this disruption may only be a subset of the phenotype brought about by mutations in the coding region of the gene, thus making it difficult for clinicians to associate it with that disease/gene. One such example is the postulated SHH limb enhancer located 1 Mb away from the gene. Mutations in the coding region of SHH lead to a large spectrum of phenotypes among which the most prominent is holoprosencephaly (19), whereas both a chromosomal breakpoint and single nucleotide changes within the limb enhancer are suggested to cause preaxial polydactyly (20,21). Analysis of both the gene and the limb enhancer in our data set shows that they both map to the same CBS, totaling 1.94 Mb in size (Table 2); moreover, additional position effects leading to holoprosencephaly (22) also map within this block. We could thus speculate that numerous genetically unaddressed phenotypes, some even leading to embryonic lethality, could be caused by disruption of distant regulatory elements of genes, which would also probably map to our CBSs.

In general, using syntenic blocks to delineate boundaries of regulatory domains would seem an obvious approach when undertaking a comparative genomics endeavor, though it is not usually taken advantage of. The SALL1 and SHH examples show that this approach helps to better understand the boundaries of regulatory domains surrounding these genes and the CNSs within them. One could query a gene of interest for its CBS using our list (Supplementary Material, Table S1) and get a better sense of the domains and the CNSs within them. One major limitation of this method is that the gene of interest may lie within blocks smaller than our 200 kb cutoff size, as we observed with the alpha- and beta-globin genes. A way around that is to use the UCSC browser-chained BLASTZ alignments (11), with the limitations being that these are pairwise alignments that use less stringent filters and, consequently, tolerate very large insertions and deletions. Another limitation in using our CBS list is that HMC and HMF may not have enough evolutionary information to define a sufficiently precise cis-regulatory domain map, an obstacle which may be addressed in the future by the completion of several additional vertebrate genomes. In summary, our results provide a regulatory terrain for researchers around their gene of interest, highlighting evolutionarily conserved CNSs and delimiting their borders, in order to facilitate the search for regulatory mutations leading to human disease.


    MATERIALS AND METHODS
 
Data source
Human (hg17, May 2004), mouse and chicken genomes and associated data (gene annotations, spliced ESTs, mRNAs, repeats, miRNAs) were downloaded from the UCSC Genome website, http://genome.ucsc.edu (11). The Xenopus tropicalis v3.0 assembly was downloaded from the JGI website, http://jgi.doe.gov. Gene ontology (10) data were obtained from http://www.geneontology.org and http://genome.ucsc.edu. The human gene set and their GO data used in this work are from the ‘known genes’ set developed at UCSC and available at http://genome.ucsc.edu.

Segmental map
The segmental homology map was computed using a clustering algorithm which takes BLASTZ (23) local alignments as input. All pairwise alignments were merged into n-dimensional anchors and then clustered by PARAGON (24). The n-dimensional segmental map problem was resolved in a graph-theoretic framework. Conserved BLASTZ anchors comprised the vertices of the graph and these were connected by edges if the distance between the anchors was less than 150 kb in all the aligned genomes. Each connected subgraph represents a CBS. Human segments <200 kb were disregarded. We then realigned all the CBSs using the global aligner MLAGAN (25). We filtered out three-way syntenic segments with a human–mouse non-coding nucleotide mismatch rate higher than two standard deviations above the whole-genome average, as these are likely to result from alignment artifacts and paralogy, rather than true orthology.
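The graph-theoretic step described above (anchors as vertices, edges between anchors closer than 150 kb in every genome, connected subgraphs as CBSs, human segments <200 kb discarded) can be sketched with a union-find structure. This is a minimal illustration and not PARAGON itself: the anchor representation, the function names, the requirement that linked anchors share a chromosome in each genome, and the toy coordinates are all assumptions, and orientation and the later MLAGAN and mismatch-rate filters are ignored.

```python
from collections import defaultdict

MAX_GAP = 150_000         # edge threshold stated in the text
MIN_HUMAN_SPAN = 200_000  # minimum human block size stated in the text

def linked(a, b):
    """True if two anchors lie on the same chromosome and within MAX_GAP
    of each other in every aligned genome (assumed edge rule)."""
    return all(a[g][0] == b[g][0] and abs(a[g][1] - b[g][1]) < MAX_GAP for g in a)

def cluster(anchors):
    """Group anchors into connected components using union-find."""
    parent = list(range(len(anchors)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Quadratic pairwise comparison: fine for a sketch, not for whole genomes.
    for i in range(len(anchors)):
        for j in range(i + 1, len(anchors)):
            if linked(anchors[i], anchors[j]):
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i, anchor in enumerate(anchors):
        groups[find(i)].append(anchor)
    return list(groups.values())

def human_span(block):
    positions = [anchor["human"][1] for anchor in block]
    return max(positions) - min(positions)

def conserved_blocks(anchors):
    """Connected components whose human span reaches the size cutoff."""
    return [b for b in cluster(anchors) if human_span(b) >= MIN_HUMAN_SPAN]

# Toy anchors with made-up coordinates: genome -> (chromosome, position).
toy_anchors = [
    {"human": ("chr16", 1_000_000), "mouse": ("chr8", 5_000_000), "chicken": ("chr11", 2_000_000)},
    {"human": ("chr16", 1_120_000), "mouse": ("chr8", 5_100_000), "chicken": ("chr11", 2_090_000)},
    {"human": ("chr16", 1_240_000), "mouse": ("chr8", 5_200_000), "chicken": ("chr11", 2_180_000)},
    {"human": ("chr16", 1_300_000), "mouse": ("chr2", 9_000_000), "chicken": ("chr11", 2_250_000)},
]
blocks = conserved_blocks(toy_anchors)
print(len(blocks), [len(b) for b in blocks])
```

In the toy data the first three anchors chain into one block spanning 240 kb in human, while the fourth anchor maps to a different mouse chromosome, forms its own component and is dropped by the size cutoff.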

Identification of conserved regions
MLAGAN alignments of synteny blocks were scanned for evolutionarily conserved regions using Gumby (v1.5; S. Prabhakar, manuscript in preparation). Gumby goes through the following three-step process to identify statistically significant conservation in the global alignment input. (i) Non-coding regions in the alignment are used to estimate the local neutral mismatch rates among all pairs of aligned sequences (26). The rates are used to derive a log-likelihood scoring scheme for slow versus neutral evolution, where the slow rate is set to two-thirds of the neutral rate. (ii) Each alignment position is then assigned a conservation score using a phylogenetically weighted sum-of-pairs scheme. (iii) Conserved regions of any length are identified as alignment blocks with a high cumulative conservation score and assigned P-values using Karlin–Altschul statistics (27). We used a threshold P-value of 0.01 in a baseline human sequence length of 100 kb. Transcribed sequences in the conserved set were filtered out using known genes, spliced ESTs and mRNA annotations obtained from the UCSC genome browser (intronic sequences were not filtered out). All CNSs that fall within the CBSs can be obtained from the Supplementary Material, Table S1. CNS sets from Gumby analysis of pairwise whole-genome alignments are available through the RankVISTA tracks on the VISTA browser, http://genome.lbl.gov/vista/index.shtml. CNS density was measured as the number of CNSs (regardless of size) per 100 kb of DNA.
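The scoring idea behind steps (i) and (iii) can be illustrated for a single pair of aligned sequences. The sketch below is not Gumby's implementation: it assumes a simple match/mismatch model in which each column is scored by the log-likelihood ratio of the slow rate (two-thirds of the neutral mismatch rate) to the neutral rate, and it finds the highest-scoring contiguous block with a standard maximum-sum scan. The neutral rate of 0.3, the column encoding and all function names are illustrative assumptions, and the phylogenetic weighting of step (ii) and the Karlin–Altschul P-value are omitted.

```python
from math import log

def column_scores(neutral_rate, slow_fraction=2.0 / 3.0):
    """Log-likelihood ratio scores (slow vs neutral) for a matching and a
    mismatching alignment column under a simple two-state model."""
    slow_rate = slow_fraction * neutral_rate
    match = log((1.0 - slow_rate) / (1.0 - neutral_rate))  # positive
    mismatch = log(slow_rate / neutral_rate)               # negative
    return match, mismatch

def best_segment(scores):
    """Return (start, end, score) of the highest-scoring contiguous run,
    with `end` exclusive (Kadane-style scan)."""
    best_score, best_start, best_end = 0.0, 0, 0
    run_score, run_start = 0.0, 0
    for i, s in enumerate(scores):
        if run_score <= 0.0:
            run_score, run_start = 0.0, i  # restart the run here
        run_score += s
        if run_score > best_score:
            best_score, best_start, best_end = run_score, run_start, i + 1
    return best_start, best_end, best_score

# Toy alignment: True marks a matching column, False a mismatch.
columns = [False, True, True, True, True, False, True]
match, mismatch = column_scores(neutral_rate=0.3)
scores = [match if is_match else mismatch for is_match in columns]
start, end, score = best_segment(scores)
print(start, end, round(score, 3))
```

A block found this way would then need a significance estimate, which is where the Karlin–Altschul statistics cited above come in; that step depends on parameters not given in the text and is left out here.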

Analysis of gene functionality
Each CNS was coupled to the closest gene. We then created a simulated GO term that encompasses everything that has either development or transcription regulator activity at GO level 2. The distribution of this GO term was then compared with the overall GO term distribution of all the genes in the human genome using a Z-score test.
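As a rough check on the enrichment figures reported in the Results, the excess over expectation and an approximate Z-score can be computed from the observed and expected gene counts alone. The sketch below assumes that the variance of the count is approximately equal to the expected count (a Poisson-style simplification); the exact formula used for the Z-score test is not given in the text, so this is only one plausible reading, and the function names are hypothetical.

```python
from math import erfc, sqrt

def excess_percent(observed, expected):
    """Percentage by which the observed count exceeds the expected count."""
    return 100.0 * (observed - expected) / expected

def z_score(observed, expected):
    """Approximate Z-score, assuming variance ~ expected count."""
    return (observed - expected) / sqrt(expected)

def upper_tail_p(z):
    """One-sided P-value of a standard normal deviate."""
    return 0.5 * erfc(z / sqrt(2.0))

# Observed and expected counts as reported for HMC and HMF.
for label, observed, expected in (("HMC", 618, 441), ("HMF", 278, 164)):
    z = z_score(observed, expected)
    print(f"{label}: {excess_percent(observed, expected):.1f}% excess, "
          f"Z = {z:.2f}, P = {upper_tail_p(z):.1e}")
```

Under this simplification the P-values land at the same orders of magnitude as those quoted in the Results, which is consistent with, but does not establish, the test having been computed in a similar way.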

Analysis of chromosomal aberrations
We searched the literature using PubMed (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=File&DB=pubmed) and OMIM (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=search&DB=omim) for position effects leading to human disease where the regulatory elements were removed from the postulated regulated gene, and this gene was used as an anchor for our alignments. As control groups, we used all known genes from the UCSC genome website (11) and large-scale CNPs corresponding to deletions (loss) (12), under the assumption that deletions have a greater potential to disrupt regulation networks. We were able to map only 44 of these CNP deletions to the human May 2004 freeze using the coordinate conversion webtool at http://genome.ucsc.edu and used those as our data set.


    ACKNOWLEDGEMENTS
 
We thank members of the Rubin Laboratory for helpful comments and suggestions. F.P. was supported by a fellowship from the Canadian Institutes for Health Research. This work was performed under the auspices of the US Department of Energy's Office of Science, Biological and Environmental Research Program and by the University of California, Lawrence Livermore National Laboratory under contract no. W-7405-Eng-48, Lawrence Berkeley National Laboratory under contract no. DE-AC03-76SF00098 and Los Alamos National Laboratory under contract No. W-7405-ENG-36 and was supported by NIH-NHLBI U1HL66681B.

Conflict of Interest statement. None declared.


    FOOTNOTES
 
† The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.

‡ Present address: Department of Integrative Biology, University of California, Berkeley, CA, USA.


    REFERENCES
 

  1. Boffelli, D., Nobrega, M.A. and Rubin, E.M. (2004) Comparative genomics at the vertebrate extremes. Nat. Rev. Genet., 5, 456–465.

  2. Dermitzakis, E.T., Reymond, A. and Antonarakis, S.E. (2005) Conserved non-genic sequences—an unexpected feature of mammalian genomes. Nat. Rev. Genet., 6, 151–157.

  3. Nobrega, M.A., Ovcharenko, I., Afzal, V. and Rubin, E.M. (2003) Scanning human gene deserts for long-range enhancers. Science, 302, 413.

  4. Sagai, T., Hosoya, M., Mizushina, Y., Tamura, M. and Shiroishi, T. (2005) Elimination of a long-range cis-regulatory module causes complete loss of limb-specific Shh expression and truncation of the mouse limb. Development, 132, 797–803.

  5. Ahituv, N., Rubin, E.M. and Nobrega, M.A. (2004) Exploiting human–fish genome comparisons for deciphering gene regulation. Hum. Mol. Genet., 13, R261–R266.

  6. Kleinjan, D.A. and van Heyningen, V. (2005) Long-range control of gene expression: emerging mechanisms and disruption in disease. Am. J. Hum. Genet., 76, 8–32.

  7. Flint, J., Tufarelli, C., Peden, J., Clark, K., Daniels, R.J., Hardison, R., Miller, W., Philipsen, S., Tan-Un, K.C., McMorrow, T. et al. (2001) Comparative genome analysis delimits a chromosomal domain and identifies key regulatory elements in the alpha globin cluster. Hum. Mol. Genet., 10, 371–382.

  8. Goode, D.K., Snell, P., Smith, S.F., Cooke, J.E. and Elgar, G. (2005) Highly conserved regulatory elements around the SHH gene may contribute to the maintenance of conserved synteny across human chromosome 7q36.3. Genomics, 3, 172–181.

  9. Waterston, R.H., Lindblad-Toh, K., Birney, E., Rogers, J., Abril, J.F., Agarwal, P., Agarwala, R., Ainscough, R., Alexandersson, M., An, P. et al. (2002) Initial sequencing and comparative analysis of the mouse genome. Nature, 420, 520–562.

  10. Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., Butler, H., Cherry, J.M., Davis, A.P., Dolinski, K., Dwight, S.S., Eppig, J.T. et al. (2000) Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet., 25, 25–29.

  11. Karolchik, D., Baertsch, R., Diekhans, M., Furey, T.S., Hinrichs, A., Lu, Y.T., Roskin, K.M., Schwartz, M., Sugnet, C.W., Thomas, D.J. et al. (2003) The UCSC Genome Browser Database. Nucleic Acids Res., 31, 51–54.

  12. Sebat, J., Lakshmi, B., Troge, J., Alexander, J., Young, J., Lundin, P., Maner, S., Massa, H., Walker, M., Chi, M. et al. (2004) Large-scale copy number polymorphism in the human genome. Science, 305, 525–528.

  13. Hardison, R. (1998) Hemoglobins from bacteria to man: evolution of different patterns of gene expression. J. Exp. Biol., 201, 1099–1117.

  14. Hillier, L.W., Miller, W., Birney, E., Warren, W., Hardison, R.C., Ponting, C.P., Bork, P., Burt, D.W., Groenen, M.A., Delany, M.E. et al. (2004) Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature, 432, 695–716.

  15. Ovcharenko, I., Loots, G.G., Nobrega, M.A., Hardison, R.C., Miller, W. and Stubbs, L. (2005) Evolution and functional classification of vertebrate gene deserts. Genome Res., 15, 137–145.

  16. Sandelin, A., Bailey, P., Bruce, S., Engstrom, P.G., Klos, J.M., Wasserman, W.W., Ericson, J. and Lenhard, B. (2004) Arrays of ultraconserved non-coding regions span the loci of key developmental genes in vertebrate genomes. BMC Genomics, 5, 99.

  17. Nelson, C.E., Hersh, B.M. and Carroll, S.B. (2004) The regulatory content of intergenic DNA shapes genome architecture. Genome Biol., 5, R25.

  18. Marlin, S., Blanchard, S., Slim, R., Lacombe, D., Denoyelle, F., Alessandri, J.L., Calzolari, E., Drouin-Garraud, V., Ferraz, F.G., Fourmaintraux, A. et al. (1999) Townes–Brocks syndrome: detection of a SALL1 mutation hot spot and evidence for a position effect in one patient. Hum. Mutat., 14, 377–386.

  19. Wallis, D. and Muenke, M. (2000) Mutations in holoprosencephaly. Hum. Mutat., 16, 99–108.

  20. Lettice, L.A., Horikoshi, T., Heaney, S.J., van Baren, M.J., van der Linde, H.C., Breedveld, G.J., Joosse, M., Akarsu, N., Oostra, B.A., Endo, N. et al. (2002) Disruption of a long-range cis-acting regulator for Shh causes preaxial polydactyly. Proc. Natl Acad. Sci. USA, 99, 7548–7553.

  21. Lettice, L.A., Heaney, S.J., Purdie, L.A., Li, L., de Beer, P., Oostra, B.A., Goode, D., Elgar, G., Hill, R.E. and de Graaff, E. (2003) A long-range Shh enhancer regulates expression in the developing limb and fin and is associated with preaxial polydactyly. Hum. Mol. Genet., 12, 1725–1735.

  22. Roessler, E., Ward, D.E., Gaudenz, K., Belloni, E., Scherer, S.W., Donnai, D., Siegel-Bartelt, J., Tsui, L.C. and Muenke, M. (1997) Cytogenetic rearrangements involving the loss of the sonic hedgehog gene at 7q36 cause holoprosencephaly. Hum. Genet., 100, 172–181.

  23. Schwartz, S., Kent, W.J., Smit, A., Zhang, Z., Baertsch, R., Hardison, R.C., Haussler, D. and Miller, W. (2003) Human–mouse alignments with BLASTZ. Genome Res., 13, 103–107.

  24. Schmutz, J., Martin, J., Terry, A., Couronne, O., Grimwood, J., Lowry, S., Gordon, L.A., Scott, D., Xie, G., Huang, W. et al. (2004) The DNA sequence and comparative analysis of human chromosome 5. Nature, 431, 268–274.

  25. Brudno, M., Do, C.B., Cooper, G.M., Kim, M.F., Davydov, E., Green, E.D., Sidow, A. and Batzoglou, S. (2003) LAGAN and Multi-LAGAN: efficient tools for large-scale multiple alignment of genomic DNA. Genome Res., 13, 721–731.[Abstract/Free Full Text]

  26. Cooper, G.M., Brudno, M., Stone, E.A., Dubchak, I., Batzoglou, S. and Sidow, A. (2004) Characterization of evolutionary rates and constraints in three mammalian genomes. Genome Res., 14, 539–548.[Abstract/Free Full Text]

  27. Karlin, S. and Altschul, S.F. (1990) Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes. Proc. Natl Acad. Sci. USA, 87, 2264–2268.[Abstract/Free Full Text]

  28. Davies, A.F., Mirza, G., Flinter, F. and Ragoussis, J. (1999) An interstitial deletion of 6p24-p25 proximal to the FKHL7 locus and including AP-2alpha that affects anterior eye chamber development. J. Med. Genet., 36, 708–710.[Abstract/Free Full Text]

  29. Fang, J., Dagenais, S.L., Erickson, R.P., Arlt, M.F., Glynn, M.W., Gorski, J.L., Seaver, L.H. and Glover, T.W. (2000) Mutations in FOXC2 (MFH-1), a forkhead family transcription factor, are responsible for the hereditary lymphedema-distichiasis syndrome. Am. J. Hum. Genet., 67, 1382–1388.

  30. Crisponi, L., Uda, M., Deiana, M., Loi, A., Nagaraja, R., Chiappe, F., Schlessinger, D., Cao, A. and Pilia, G. (2004) FOXL2 inactivation by a translocation 171 kb away: analysis of 500 kb of chromosome 3 for candidate long-range regulatory sequences. Genomics, 83, 757–764.

  31. Vortkamp, A., Gessler, M. and Grzeschik, K.H. (1991) GLI3 zinc-finger gene interrupted by translocations in Greig syndrome families. Nature, 352, 539–540.

  32. Barbour, V.M., Tufarelli, C., Sharpe, J.A., Smith, Z.E., Ayyub, H., Heinlein, C.A., Sloane-Stanley, J., Indrak, K., Wood, W.G. and Higgs, D.R. (2000) alpha-thalassemia resulting from a negative chromosomal position effect. Blood, 96, 800–807.

  33. Kioussis, D., Vanin, E., deLange, T., Flavell, R.A. and Grosveld, F.G. (1983) Beta-globin gene inactivation by DNA translocation in gamma beta-thalassaemia. Nature, 306, 662–666.

  34. Spitz, F., Montavon, T., Monso-Hinard, C., Morris, M., Ventruto, M.L., Antonarakis, S., Ventruto, V. and Duboule, D. (2002) A t(2;8) balanced translocation with breakpoints near the human HOXD complex causes mesomelic dysplasia and vertebral defects. Genomics, 79, 493–498.

  35. Jamieson, R.V., Perveen, R., Kerr, B., Carette, M., Yardley, J., Heon, E., Wirth, M.G., van Heyningen, V., Donnai, D., Munier, F. et al. (2002) Domain disruption and mutation of the bZIP transcription factor, MAF, associated with cataract, ocular anterior segment dysgenesis and coloboma. Hum. Mol. Genet., 11, 33–42.

  36. Fantes, J., Redeker, B., Breen, M., Boyle, S., Brown, J., Fletcher, J., Jones, S., Bickmore, W., Fukushima, Y., Mannens, M. et al. (1995) Aniridia-associated cytogenetic rearrangements suggest that a position effect may cause the mutant phenotype. Hum. Mol. Genet., 4, 415–422.

  37. Flomen, R.H., Vatcheva, R., Gorman, P.A., Baptista, P.R., Groet, J., Barisic, I., Ligutic, I. and Nizetic, D. (1998) Construction and analysis of a sequence-ready map in 4q25: Rieger syndrome can be caused by haploinsufficiency of RIEG, but also by chromosome breaks approximately 90 kb upstream of this gene. Genomics, 47, 409–413.

  38. de Kok, Y.J., Vossenaar, E.R., Cremers, C.W., Dahl, N., Laporte, J., Hu, L.J., Lacombe, D., Fischel-Ghodsian, N., Friedman, R.A., Parnes, L.S. et al. (1996) Identification of a hot spot for microdeletions in patients with X-linked deafness type 3 (DFN3) 900 kb proximal to the DFN3 gene POU3F4. Hum. Mol. Genet., 5, 1229–1235.

  39. Wallis, D.E., Roessler, E., Hehr, U., Nanni, L., Wiltshire, T., Richieri-Costa, A., Gillessen-Kaesbach, G., Zackai, E.H., Rommens, J. and Muenke, M. (1999) Mutations in the homeodomain of the human SIX3 gene cause holoprosencephaly. Nat. Genet., 22, 196–198.

  40. Balemans, W., Patel, N., Ebeling, M., Van Hul, E., Wuyts, W., Lacza, C., Dioszegi, M., Dikkers, F.G., Hildering, P., Willems, P.J. et al. (2002) Identification of a 52 kb deletion downstream of the SOST gene in patients with van Buchem disease. J. Med. Genet., 39, 91–97.

  41. Wirth, J., Wagner, T., Meyer, J., Pfeiffer, R.A., Tietze, H.U., Schempp, W. and Scherer, G. (1996) Translocation breakpoints in three patients with campomelic dysplasia and autosomal sex reversal map more than 130 kb from SOX9. Hum. Genet., 97, 186–193.

  42. Rose, C.S., Patel, P., Reardon, W., Malcolm, S. and Winter, R.M. (1997) The TWIST gene, although not disrupted in Saethre–Chotzen patients with apparently balanced translocations of 7p21, is mutated in familial and sporadic cases. Hum. Mol. Genet., 6, 1369–1373.

