Evolution at the nucleotide level: the problem of multiple whole-genome alignment
¹Department of Electrical Engineering and Computer Sciences and ²Department of Mathematics, University of California, Berkeley, CA 94720, USA
* To whom correspondence should be addressed at: Department of Electrical Engineering and Computer Sciences, 207 Cory Hall No. 1772, University of California, Berkeley, CA 94720-1772, USA. Email: cdewey@eecs.berkeley.edu
Received February 1, 2006; Revised March 9, 2006; Accepted March 9, 2006
With the genome sequences of numerous species at hand, we have the opportunity to discover how evolution has acted at every nucleotide in our genome. To this end, we must identify sets of nucleotides that have descended from a common ancestral nucleotide. The problem of identifying evolutionarily related nucleotides is that of sequence alignment. When the sequences under consideration are entire genomes, the problem becomes that of multiple whole-genome alignment. In this paper, we first state a series of definitions for homology, and for its subrelations, between single nucleotides. Within this framework, we review the methods currently available for aligning multiple large genomes. We then describe a subset of tools that make biological inferences from multiple whole-genome alignments.