
Human Molecular Genetics Pages 217-227  


Sequence analysis of an 80 kb human neocentromere



Alyssa E. Barry, Emily V. Howman, Michael R. Cancilla, Richard Saffery and K. H. Andy Choo*

The Murdoch Institute, Royal Children’s Hospital, Flemington Road, Parkville 3052, Australia

Received July 7, 1998; Revised and Accepted November 3, 1998

DDBJ/EMBL/GenBank accession no. AF04284 (See Corrigenda).

We previously described the cloning of an 80 kb DNA corresponding to the core protein-binding domain of a human chromosome 10-derived neocentromere. Here we report the complete sequence of this DNA (designated NC DNA) and its detailed structural analysis. The sequence is devoid of human centromeric [alpha]-satellite DNA and the pericentric [beta]- and [gamma]-satellites, the ATRS and 48 bp repeat DNA. One copy of a sequence that is related to the CENPB box motif is present, and a number of copies of other pericentric sequences including pJ[alpha] and classical satellites I and III are present, but both their relative sparsity and non-tandem organization suggest that each sequence, on its own, is unlikely to mimic any role the sequence may have in the normal centromere. The DNA-binding motifs of the architectural and regulatory proteins HMGI and topoII have a normal abundance and random distribution, implying that these sequences are not key functional elements. The total A + T content of the sequence is not notably different from that of the human genome, but an abundance of AT-rich islands and a biased distribution of these islands within the NC sequence are clearly discernible and may be functionally significant. Substantial amounts of transposable elements and low copy number tandem repeats, including several that are highly AT- and purine-rich, are also present and may act as functional elements. One of the AT-rich tandem repeats (AT28) may form interesting structures and is described in detail. The defined features show only a loose resemblance to the structures of known centromeres, highlighting the possibility that, rather than a conserved primary sequence, it is the overall composition and distribution patterns of various unknown functional elements, or any ‘ordinary’ DNA under appropriate epigenetic influences, that determine centromere formation and function. This is the first detailed analysis of a neocentromere DNA and provides a basis for comparison against future sequences.

INTRODUCTION

The centromere is an essential component of all eukaryotic chromosomes, and appears as a primary constriction on all metaphase chromosomes. It functions as the site for kinetochore assembly and spindle fibre attachment, allowing the faithful pairing and segregation of sister chromatids during cell division. An undefined interplay between the centromeric DNA and centromere-binding proteins presumably underlies the formation of the kinetochore complex to allow correct centromere function. An increasing number of kinetochore proteins have now been identified and shown to be highly conserved through evolution (1-4); however, the precise composition of the functional centromere DNA has so far eluded researchers. Despite the fact that in all higher eukaryotes this DNA is made up of highly reiterated satellite sequences, with these sequences generally having a high A + T content, its primary sequence is highly variable between species, making it difficult to correlate sequence conservation with function (5).

Normal human centromeres vary in size between 1 and 4 Mb and are largely composed of highly repetitive [alpha]-satellite sequences. This DNA consists of a 171 bp tandem repeat unit that contains numerous binding sites for centromere protein B (CENPB) (6,7) and pJ[alpha] (8-10). Other types of satellite DNA found at the pericentric regions of human chromosomes include classical satellites I, II and III, and [beta]- and [gamma]-satellites (4,11). Repeats found at or near the human centromere also include an AT-rich sequence (ATRS) (12) and a novel 48 bp repeat (13-15). Some interspersed human transposable elements have also been detected pericentrically (16,17). To date, of all the known centromeric/pericentric satellite DNA sequences, only [alpha]-satellite has been shown experimentally to exhibit functional centromeric activity (18-22).

In recent years, numerous morphologically abnormal marker chromosomes have been described whose centromeres are devoid of detectable levels of [alpha]-satellite DNA but remain functional in mitosis (5,23). These so-called ‘analphoid’ ([alpha]-satellite-negative) marker chromosomes generally have lost their normal centromere through chromosomal rearrangements, leading to the formation of a new centromere or neocentromere at a previously non-centromeric region on the chromosome arm. These regions have been proposed to be the sites of latent centromeres that can become activated through unknown epigenetic mechanisms (5,24-27).

We previously described a chromosome 10-derived analphoid marker chromosome in a young boy with mild developmental impairment. This marker chromosome, designated mardel(10), has acquired neocentromere activity in a region corresponding to the q25.2 band on normal chromosome 10 (28). Using positional cloning, an 80 kb DNA spanning the core centromere protein-binding domain of the neocentromere was isolated from both a normal chromosome 10 (29) and the mardel(10) chromosome (30). Extensive restriction map comparisons between the normal chromosome 10 and the mardel(10) DNA have revealed an identical structural organization (29,30), supporting the hypothesis of neocentromere formation on mardel(10) via an epigenetic mechanism. In this communication, we present the complete nucleotide sequence of the 80 kb neocentromere DNA derived from the mardel(10) chromosome and discuss the results of our extensive computational analyses of this DNA.

RESULTS

Generation of the neocentromere (NC) sequence

In an earlier study, we designated the neocentromere DNA cloned from the normal 10q25.2 region as the HC DNA (29). To distinguish that DNA from the sequence isolated directly from the mardel(10) chromosome (30) and analysed in detail in the present study, the latter sequence is referred to as NC (neocentromere) DNA. The completed NC sequence consists of 80 155 bp and can be accessed from GenBank (accession no. AF04284). Nucleotides 1 and 80 155 correspond to the q[prime] (proximal to the normal chromosome 10 centromere) and p[prime] (distal to the normal chromosome 10 centromere) ends, respectively, of the NC DNA on the mardel(10) chromosome (29).

The NC DNA sequence represents the core centromere protein-binding domain of the mardel(10) neocentromere and is expected to contain DNA motifs or structures that enable the nucleation of key centromere proteins to elicit functional activity. A comprehensive multi-organism centromere DNA database was created for this study by compiling all the known centromeric and pericentric DNA sequences from the GenBank database (see Materials and Methods). Homology searches of the NC sequence against this database revealed no striking homologies to any of the sequences in the database. This result indicated that the NC sequence is unique with respect to previously characterized centromere DNA sequences from a range of organisms. Therefore, identification of putative functional elements required a more detailed computational analysis of the NC sequence.

Sequence composition, A + T content and EST matches

The NC sequence is composed of 28.79% A, 20.63% C, 20.87% G and 29.71% T nucleotides. This translates to an A + T nucleotide content of ~58%, compared with 58.7% for 10 Mb of random human genomic sequences (see Materials and Methods), both of which are within the normal range for the human genome (31). A more detailed regional analysis of the NC DNA involving scanning for nucleotide stretches over 30 bp with an A + T content >65% has revealed many small AT-rich islands (Fig. 1); this minimum level was chosen based on our calculation that the human centromeric [alpha]-satellite consensus sequence has an A + T content of >65% (32). As can be seen from Figure 1, these islands are not evenly distributed within the NC sequence, with the first 40 kb on the q[prime] half showing a relatively higher density than the remaining p[prime] half of the NC DNA. However, the overall A + T content for each of these two 40 kb regions remained similar. This analysis therefore shows that there is a higher number of AT-rich islands (or clusters of A + T nucleotides interspersed with clusters of G + C nucleotides) in the 0-40 kb region compared with the 40-80 kb region, where there is a more even distribution of the four nucleotides, A, C, G and T. Of particular interest is an AT-rich island of ~600 bp which contains >80% A or T nucleotides (Fig. 1; position ~15 kb, arrowhead). This sequence, designated AT28, is described later.


Figure 1. Distribution of AT-rich islands in the NC sequence. Islands of 30 bp with A + T contents >65% were plotted using the BASEPAIRPLOT program (see Materials and Methods). Window size was set at 30 bp and shift at 25 bp. The arrowhead points to the position of the AT28 sequence which forms an ~600 bp island of >80% A + T.
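The island scan described above can be illustrated with a simple sliding-window calculation. The following Python sketch is not the BASEPAIRPLOT program; it only mirrors the parameters quoted in the Figure 1 caption (30 bp window, 25 bp shift, A + T >65%). The function names and the toy sequence are assumptions introduced for illustration.

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of each of A, C, G and T in seq."""
    seq = seq.upper()
    counts = Counter(seq)
    return {base: 100.0 * counts[base] / len(seq) for base in "ACGT"}


def at_rich_windows(seq, window=30, shift=25, threshold=65.0):
    """Return (start, A+T %) for each window whose A+T content exceeds threshold.

    window=30 and shift=25 follow the Figure 1 caption; start is 0-based.
    """
    seq = seq.upper()
    hits = []
    for start in range(0, len(seq) - window + 1, shift):
        chunk = seq[start:start + window]
        at_percent = 100.0 * sum(chunk.count(b) for b in "AT") / window
        if at_percent > threshold:
            hits.append((start, at_percent))
    return hits


# Toy sequence (an assumption for illustration, not NC DNA):
# 30 bp of A/T followed by 30 bp of G/C.
toy = "AT" * 15 + "GC" * 15
print(base_composition(toy))  # {'A': 25.0, 'C': 25.0, 'G': 25.0, 'T': 25.0}
print(at_rich_windows(toy))   # [(0, 100.0)]
```

Applied to the reported base percentages, the same arithmetic gives 28.79% + 29.71% = 58.50% A + T for the NC sequence as a whole.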

Table 1. Prevalence of human pericentric sequences and putative protein-binding motifs in NC DNA
Name (size) | Present | Expected no. (human genome) | Observed no. (NC-DNA) | Homology search (%)
[alpha]-Satellite (171 bp) | no | nd | 0 | >80
[beta]-Satellite (68 bp) | no | nd | 0 | >80
[gamma]-Satellite (220 bp) | no | nd | 0 | >80
Satellite IA (17 bp) | yes | 1.05 | 3.00 | >88
Satellite IB (24 bp) | no | nd | 0 | >80
Satellite III (5 bp) (a) | yes | 264.87 | 213.70 | 100
48 bp repeat (48 bp) | no | nd | 0 | >80
ATRS (483 bp) | no | nd | 0 | >80
CENPB box motif (15 bp) | no | 2.07 | 1.00 | >93
pJ[alpha] motif (9 bp) | yes | 1.14 | 3.00 | 100
HMGI motif (6 bp) (b) | yes | 1416.20 | 1339.00 | 100
TopoII motif (18 bp) | yes | 124.30 | 121.00 | >94
nd, not done.
(a) The values shown for satellite III are the averages of search results for the five possible motif combinations (TTGGA, TGGAT, GGATT, GATTG, ATTGG).
(b) HMGI recognizes strings of more than six A or T nucleotides (W6); therefore, we used the search string (W)6 S (where S = G or C) so that only one match, rather than a number of overlapping matches, was reported for more than six A or T nucleotides.

The NC DNA spans a normal genomic region (29,30) and, therefore, may contain genes. To determine the presence of expressed sequences and, therefore, potential genes in the NC DNA, we carried out a PowerBLAST search (see Materials and Methods) of the sequence against the GenBank expressed sequence tag (EST) databases. We found three regions in the NC DNA that contained significant matches to human EST sequences. More than 96% identity to the NC DNA was observed with GenBank EST clones AA774571 (NC DNA no. 11 627-13 562), AA584927 (NC DNA no. 41 168-41 536), and AA813722 and AA860318 (NC DNA no. 45 247-45 499). This indicates the presence of expressed sequences in the NC DNA; however, further analysis is required to determine whether or not these are genes.


Pericentric DNA sequences and putative protein-binding motifs

A number of different human pericentric sequences and protein-binding motifs have been described previously. The results of homology searches for these sequences in the NC DNA are summarized in Table 1 and discussed below.

In order to determine whether the abundance of a particular sequence motif within the NC DNA was different from a random stretch of human DNA sequence, it was necessary to derive an expected abundance value for such a motif. This was achieved by creating a mini-database containing sequences derived from >10 Mb of randomly selected human genomic sequences and using this mini-database to calculate the expected abundance for any 80 kb of human genomic sequence (see Materials and Methods).
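One plausible way to derive such an expected value is to scale the motif count found in the reference set by the ratio of sequence lengths. The sketch below assumes simple proportional scaling, which may differ from the procedure given in Materials and Methods; the function name `expected_count` and the numbers in the example are hypothetical.

```python
def expected_count(ref_matches, ref_length_bp, target_length_bp=80_155):
    """Scale a motif count from a reference set to a target length.

    Assumes matches are spread uniformly, so the expectation is proportional
    to sequence length. 80 155 bp is the reported NC sequence length.
    """
    if ref_length_bp <= 0:
        raise ValueError("reference length must be positive")
    return ref_matches * target_length_bp / ref_length_bp


# Hypothetical example: 1000 matches in exactly 10 Mb of reference sequence.
print(expected_count(1000, 10_000_000))  # 8.0155
```

An observed count in the NC DNA can then be compared directly with this per-80 kb expectation, as in the expected and observed columns of Table 1.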

[alpha]-Satellite, [beta]-satellite, [gamma]-satellite, classical satellites I and III and other pericentric sequences

Fluorescence in situ hybridization (FISH) studies have indicated that functional neocentromeres, including that of the mardel(10) chromosome (28,29), lack demonstrable levels of [alpha]-satellite DNA (5). The NC sequence shows no significant homologies to the consensus 171 bp [alpha]-satellite DNA (32). A direct search for sequences homologous to the pericentric 68 bp [beta]-satellite and 220 bp [gamma]-satellite also proved negative. No recognizable homologies were found with the pericentric 48 bp repeat and ATRS sequences. These results clearly indicate that these well-defined components of the centromere or pericentric regions of normal human chromosomes do not contribute to an essential part of the core protein-binding domain of the mardel(10) neocentromere.

Human satellite I DNA is present in the pericentric regions of chromosomes 3, 4, 13, 14, 15, 21 and 22 (33-35). The satellite I monomer is made up of a 42 bp sequence consisting of two parts: IA, a 17 bp conserved sequence of ACAWAAAATAWSAAAGT; and IB, a more variable 25 bp sequence of ACMYMARVYATRDATTHTATWCTGT (36). Within the NC DNA, satellite IB sequences were not detected, and the presence of three copies (compared with the expected 1.05 copies; Table 1) of the satellite IA monomer is also unlikely to be functionally significant.
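The satellite I patterns above are written with IUPAC ambiguity codes (for example W = A or T, S = C or G). As an illustration only, such a degenerate pattern can be expanded into a regular expression for exact matching. This Python sketch is an assumption about how such a search could be done, not the search tool used in the study; `iupac_to_regex` and the toy target are hypothetical.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def iupac_to_regex(motif):
    """Translate a degenerate motif into a compiled regular expression."""
    return re.compile("".join(f"[{IUPAC[base]}]" for base in motif.upper()))


SAT_IA = "ACAWAAAATAWSAAAGT"
SAT_IB = "ACMYMARVYATRDATTHTATWCTGT"
print(len(SAT_IA), len(SAT_IB), len(SAT_IA) + len(SAT_IB))  # 17 25 42

# Toy target (an assumption for illustration): one concrete IA instance in filler.
toy = "GGGG" + "ACATAAAATAACAAAGT" + "CCCC"
print([m.start() for m in iupac_to_regex(SAT_IA).finditer(toy)])  # [4]
```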

Satellite III DNA has an underlying 5 bp repeated motif, TGGAA, that appears to be evolutionarily conserved and found at the pericentric regions of most, if not all, human chromosomes (37). As with the [alpha]-satellite DNA, low stringency FISH analysis failed to detect this DNA at the mardel(10) neocentromere (28). A search of the NC sequence using all five combinations of the satellite III monomer, TGGAA, GGAAT, GAATG, AATGG and ATGGA [since the ‘phasing’ of the sequence of this repeat monomer is not known (36,38)], showed a prevalence similar to that expected (Table 1). Furthermore, unlike the clustered and tandemly organized structure of normal pericentric satellite III arrays, the observed matches are randomly distributed and non-contiguous. Based on such a dissimilar organization, it becomes difficult to infer any functional significance for the observed satellite III matches, especially given the limited understanding of the functional role of even the authentic pericentric satellite III arrays.
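Because the phase of the repeat is unknown, the search covers every cyclic rotation of the 5 bp monomer, and Table 1 reports the average over these rotations. A minimal Python sketch of that idea follows; the helper names and the toy array are assumptions for illustration, and overlapping matches are counted, which may or may not match the original search settings.

```python
def cyclic_rotations(motif):
    """Return every cyclic rotation (phase) of a tandem-repeat monomer."""
    return [motif[i:] + motif[:i] for i in range(len(motif))]


def count_overlapping(seq, word):
    """Count occurrences of word in seq, allowing overlaps."""
    count, start = 0, seq.find(word)
    while start != -1:
        count += 1
        start = seq.find(word, start + 1)
    return count


def mean_rotation_count(seq, motif):
    """Average the match counts over all phases of the monomer."""
    rotations = cyclic_rotations(motif)
    return sum(count_overlapping(seq, r) for r in rotations) / len(rotations)


print(cyclic_rotations("TGGAA"))
# ['TGGAA', 'GGAAT', 'GAATG', 'AATGG', 'ATGGA']

# Toy tandem array (an assumption for illustration): four copies of TGGAA.
toy = "TGGAA" * 4
print(mean_rotation_count(toy, "TGGAA"))  # 3.2
```

In a true tandem array the rotations score almost equally, whereas scattered single matches, as reported for the NC DNA, are not expected to line up end to end.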

CENPB, pJ[alpha], HMGI and topoisomerase II (topoII) protein-binding motifs

A homology search of the NC DNA for sequences related to the 15 bp degenerate CENPB box motif, TTCGNNNNANNCGGG (39), revealed just one match at the 14/15 nucleotide level, due presumably to a chance occurrence (Table 1). This low level of CENPB box motif is consistent with the absence of detectable CENPB protein binding on the mardel(10) chromosome (28) and shows that the CENPB protein is not necessary for neocentromere formation. Analysis of the NC sequence identified three perfect matches with the human pJ[alpha] motif at positions 10 719, 17 813 and 79 535. However, in view of the high prevalence of the pJ[alpha] motif within the [alpha]-satellite DNA of normal human centromeres (8,9), it is doubtful that these three sporadic copies of pJ[alpha] can exert a major impact on neocentromere formation or activity.
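A match "at the 14/15 nucleotide level" means that one position of the degenerate CENPB box is allowed to disagree. The Python sketch below shows one way to scan for such near matches; it is an illustration under that assumption, not the program used in the study, and the function names and toy sequence are hypothetical.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_mismatches(motif, window):
    """Count positions where the window base is not allowed by the motif code."""
    return sum(base not in IUPAC[code] for code, base in zip(motif, window))


def find_with_mismatches(seq, motif, max_mismatches=1):
    """Return (0-based start, mismatches) for every window within the limit."""
    seq, motif = seq.upper(), motif.upper()
    hits = []
    for start in range(len(seq) - len(motif) + 1):
        n = count_mismatches(motif, seq[start:start + len(motif)])
        if n <= max_mismatches:
            hits.append((start, n))
    return hits


CENPB_BOX = "TTCGNNNNANNCGGG"

# Toy target (an assumption for illustration): a box with its first T changed to A.
toy = "GG" + "ATCGACGTACCCGGG" + "AA"
print(find_with_mismatches(toy, CENPB_BOX))  # [(2, 1)]
print(round(100 * 14 / 15, 1))               # 93.3
```

The 93.3% figure corresponds to the >93% homology threshold listed for the CENPB box in Table 1.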


Figure 2. Distribution of putative HMGI and topoII-binding motifs in the NC DNA. (A) The number of sequences with a 100% match to the HMGI motif, WWWWWWS, was plotted using EWINDOWS/ESTATPLOT (see Materials and Methods) with window size set at 100 bp and shift at 95 bp. (B) The number of sequences with >94% match to the vertebrate topoII motif, RNYNNCNNGYNGKTNYNY, was determined using FINDPATTERNS (see Materials and Methods) allowing for a mismatch of one nucleotide. Matches to both forward (+) and reverse complement (-) strands are shown. Each bar represents one match to a motif in the NC DNA.

We also searched the NC sequence against the known DNA-binding motifs of two proteins that have been shown to interact with the centromere: the high-mobility-group protein I (HMGI) and DNA topoisomerase II (topoII). HMGI is an abundant protein that recognizes stretches of six or more A or T nucleotides (40). Thus, when searching for this motif, the search string WWWWWWS was used to omit overlapping matches where there were more than six A or T nucleotides. When the NC sequence was searched for these HMGI-binding motifs, the observed prevalence was similar to that expected (Table 1). As shown in Figure 2A, the putative HMGI-binding sites appear to be uniformly distributed along the NC DNA, except for a noticeable clustering at the AT28 motif region at position ~15 kb.
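The effect of the trailing S in the WWWWWWS string can be seen with a small regular-expression example: a long run of A or T followed by G or C yields a single match rather than one match per overlapping hexamer. This Python sketch is an illustration only; the function name and toy sequence are assumptions, not part of the original analysis.

```python
import re

# Six A/T (W) followed by one G/C (S), as in the search string WWWWWWS.
HMGI_PATTERN = re.compile(r"[AT]{6}[GC]")


def hmgi_sites(seq):
    """Return 0-based start positions of non-overlapping WWWWWWS matches."""
    return [m.start() for m in HMGI_PATTERN.finditer(seq.upper())]


# Toy sequence (an assumption for illustration): a 9 bp A run ending in C,
# then a 6 bp T run ending in G.
toy = "G" + "A" * 9 + "C" + "GGGG" + "TTTTTT" + "G"
print(hmgi_sites(toy))  # [4, 15]
```

The 9 bp A run contains four overlapping W6 hexamers but is reported once, anchored at the last six A residues before the C.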

Searches for topoII-binding sites were based on >94% identity to the published 18 bp vertebrate topoII motif, RNYNNCNNGYNGKTNYNY (41). The abundance of this motif was found to be similar to the expected value (Table 1). Matches to the topoII motifs are distributed throughout the NC DNA but appear to be more concentrated at the p[prime] end segment of ~25 kb (Fig. 2B). As for the other putative functional elements, these relatively randomly distributed HMGI- and topoII-binding motifs, despite minor clustering, are unlikely to exert any major influence on centromere function.
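Searching both strands, as in Figure 2B, is equivalent to scanning the forward strand with the motif and with its reverse complement. For a degenerate motif, the ambiguity codes must be complemented as well (R pairs with Y, K with M, and so on). The Python sketch below illustrates this and the identity arithmetic for a single mismatch; the helper names are assumptions for illustration and this is not the FINDPATTERNS program.

```python
# Complement of each IUPAC code: A<->T, C<->G, R<->Y, K<->M, B<->V, D<->H;
# S, W and N are their own complements.
IUPAC_COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(motif):
    """Return the reverse complement of a (possibly degenerate) motif."""
    return motif.upper().translate(IUPAC_COMPLEMENT)[::-1]


def percent_identity(motif_length, mismatches):
    """Percentage of motif positions that agree with the target."""
    return 100.0 * (motif_length - mismatches) / motif_length


TOPOII = "RNYNNCNNGYNGKTNYNY"
print(reverse_complement(TOPOII))                  # RNRNAMCNRCNNGNNRNY
print(round(percent_identity(len(TOPOII), 1), 1))  # 94.4
```

One mismatch in 18 positions therefore corresponds to the >94% threshold used for the topoII motif.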

Tandem repeats

An initial examination of the NC sequence indicated a lack of any large stretches (>1 kb) of tandemly repeated DNA, suggesting that only relatively small repeat arrays, if any, were present. Detailed computational analysis of the NC sequence has identified a number of such small arrays distributed throughout the sequence (Table 2 and Fig. 3). Except for the two sequence-tagged sites (STSs) AC24 and AC32 (microsatellite arrays with 24 and 39 copies of the AC dinucleotide, respectively) and the AT28 VNTR (discussed below), the arrays identified all contain <10 copies of tandem repeats.

The AT28 region was previously shown to be a variable number tandem repeat (VNTR) that varied between the NC DNA on the mardel(10) chromosome and the HC DNA on normal chromosome 10 (30). The region is ~600 bp in size and is the largest member of the tandem arrays within the NC DNA, residing between nucleotides 15 164 and 15 753 (Fig. 1). The AT28 DNA, which has a high A + T content of 82.1%, is made up of 12 perfectly conserved and 11 variant copies of the 28 bp monomer (Fig. 4A and B). Figure 4C shows that the consensus sequence of these repeats has a structure composed of a slightly imperfect 24 bp palindrome containing an internal non-palindromic 4 bp GTGT sequence, and within this a perfect 16 bp palindrome containing a central TGTG sequence, with the left-hand arm of the larger palindrome containing an additional 11 bp mirror repeat centred around a middle A nucleotide.
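The structural features described for the AT28 consensus can be checked computationally. The Python sketch below assumes that the 28 bp monomer listed for AT28 in Table 2 corresponds to the consensus in Figure 4C, and the slice coordinates are our reading of the description rather than values taken from the figure. Here a palindrome means that one arm equals the reverse complement of the other, and a mirror repeat reads the same in both directions on one strand.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def arm_mismatches(left, right):
    """Count positions where left differs from the reverse complement of right."""
    return sum(a != b for a, b in zip(left, reverse_complement(right)))


def is_mirror(seq):
    """True if seq reads the same in both directions on one strand."""
    return seq == seq[::-1]


# Assumption: the AT28 monomer as listed in Table 2.
monomer = "ATGTATATATGTGTATATAGACATAAAT"

# 16 bp element: 6 bp arms flanking a central TGTG (0-based slice 3:19).
inner = monomer[3:19]
print(inner[6:10], arm_mismatches(inner[:6], inner[10:]))    # TGTG 0

# 24 bp element: 10 bp arms flanking a central GTGT (0-based slice 0:24).
outer = monomer[0:24]
print(outer[10:14], arm_mismatches(outer[:10], outer[14:]))  # GTGT 1

# 11 bp mirror repeat centred on an A (0-based slice 1:12).
mirror = monomer[1:12]
print(mirror, mirror[5], is_mirror(mirror))  # TGTATATATGT A True
```

Under these assumptions the single mismatched pair in the 24 bp element falls at monomer positions 5 and 20 (1-based), in agreement with the Figure 4C legend.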

Table 2. Low copy number tandem repeats within the NC sequence
Name | Position (length in bp) | Sequence (size in bp) | No. of copies | Homology (%) | Composition (%)
T1 | 2558-2581 (24) | CAGGCACAGTGG (12) | 2 | 100 | 43.4 A + T
T2 | 3021-3050 (30) | TAACAAAGTG (10) | 3 | 83.3 | 63.3 A + T
T3 | 4173-4211 (39) | AAAAAAATAATTT (13) | 3 | 82.1 | 92.3 A + T
AC24 | 6598-6645 (48) | AC (2) | 24 | 100 | 100 A + C
GA1 | 10 097-10 136 (40) | GAAAGAAAGG (10) | 4 | 82.5 | 97.5 G + A
AT4 | 11 996-12 028 (32) | ATTT (4) | 8 | 100 | 100 A + T
AT28 | 15 164-15 753 (589) | ATGTATATATGTGTATATAGACATAAAT (28) | 21 | 82.1 | 81.8 A + T
GA2 | 17 406-17 564 (159) | AAGAAGGAAGGAAGAGAAGAAAGAAAAGAAAGAAAAAAAAGGAAAGAAAATA (53) | 3 | 81.8 | 98.7 G + A
CT1 | 21 375-21 639 (265) | TTCCCTCCCCCCCCCCTTCCCTCCCTCCTCCCTTCCTTCCTCCCTTCCTTCCT (53) | 5 | 74 | 99.2 C + T
T4 | 22 560-22 604 (45) | AATATTACAATAATT (15) | 3 | 80 | 66.0 A + T
T5 | 23 098-23 130 (33) | TTTTAAAAATA (11) | 3 | 81.8 | 87.8 A + T
T6 | 26 597-26 626 (30) | ATCAATTATT (10) | 3 | 83.3 | 73.3 A + T
GA4 | 28 591-28 626 (36) | AAGAAAGGAGGG (12) | 3 | 88.9 | 100 G + A
T7 | 28 955-28 984 (30) | TAAAAAAATT (10) | 3 | 83.3 | 93.3 A + T
T8 | 31 566-31 598 (33) | TATATTGTAAT (11) | 3 | 81.8 | 90.9 A + T
T9 | 35 484-35 525 (42) | ACTCAGCATAGTGG (14) | 3 | 81 | 38.1 A + T
T10 | 38 855-38 887 (33) | AAGGTGGAATA (11) | 3 | 81.8 | 54.6 A + T
T11 | 40 124-40 153 (30) | TTTATAAATT (10) | 3 | 83.3 | 90.0 A + T
T12 | 44 786-44 821 (36) | CTGTGGTTGTTG (12) | 3 | 88.9 | 61.2 A + T
T13 | 46 193-46 208 (12) | CATT (4) | 3 | 88.9 | 68.7 A + T
AT5 | 46 208-46 230 (20) | TATT (4) | 5 | 100 | 100 A + T
AC32 | 51 968-52 045 (78) | AC (2) | 39 | 100 | 100 A + C
T14 | 53 919-53 990 (72) | ACCAATCAGCACTCTGTAAAATGG (24) | 3 | 88.9 | 59.7 A + T
T15 | 54 208-54 473 (266) | AGGAAGAAACTCCAGACACACCATCTTTAAGAGCTGTAACACTCACTGCAAGGGTCTGCGGCTTCATTCTTGAAGTCAGCAAGACCAAGAACCCACTGGAAGGAAACAATTCCGGACACATTTTGGTGACCCA (133) | 2 | 88.3 | 52.6 A + T
T16 | 55 342-55 511 (170) | GTAAGGGTGCAGGTTTTCAAAAATGTGTTGGTAAGGGCCACTAAATCTGACATTCCTTGGTCCTCCTTGTGGTCTAGGAGGAAAA (85) | 2 | 89.4 | 52.3 A + T
T17 | 55 515-55 674 (160) | GTGTTTCTGCTGCTGCATTGGTGGGCTCAACTATTCCAATCAGCAGGGTCCAGTGACCTTTGCGGGTTCTTGGGTCGGGG (80) | 2 | 92.5 | 45.8 A + T
GA5 | 57 642-57 692 (51) | GGAAAGAGAGAGAGAAA (17) | 3 | 82.4 | 92.0 G + A
GA3 | 57 742-57 851 (110) | GAGAGAGAGAGAGGGAAAGACA (22) | 5 | 80.9 | 92.7 G + A
T18 | 58 130-59 211 (82) | TGTGTCTAGCTAAAGGATTGTAAATGCACCAATCAGCACTC (41) | 2 | 100 | 58.5 A + T
T19 | 59 465-59 730 (266) | AACAAAVTCCAGACACACCATCTTTCAGAGCTGTAACACTCACCGCAAGGGTCTGTGGCTTCATTCTTGAAGTCAGCAAGACCAAGAACCCACCGGAAGGAACAAATTCCAGACACAGTAGGAAATCTGTATT (133) | 2 | 86.5 | 51.2 A + T
T20 | 64 525-64 554 (30) | ATAAAATAAG (10) | 3 | 86.7 | 93.3 A + T
T21 | 68 662-68 703 (42) | ATAAAAAAATTAAA (14) | 3 | 85.7 | 95.2 A + T
T22 | 70 318-70 362 (45) | ATATATATCTGTGTG (15) | 3 | 82.2 | 82.2 A + T
T23 | 75 789-75 827 (39) | TAAAAAAGAATAA (13) | 3 | 87.2 | 94.8 A + T
Only repeats that have >80% homology (except for CT1 which has 74% homology) between monomers are shown.

In order to discount the possibility that the previously observed difference between the NC and HC DNA in the AT28 region may be directly responsible for the neocentromeric activation of the NC DNA, sequences from a number of normal chromosomes 10 were analysed. The results indicate that the AT28 sequence varies in copy number between 11 and 19 monomeric units within the conserved core region (Fig. 4A and B) on the various chromosomes 10, a range which encompasses the NC DNA copy number of 18 on the mardel(10) chromosome in the somatic cell hybrid BE2C1-18-5f (Table 3). In addition to copy number variation, minor differences in nucleotide sequences were also observed on the different chromosomes 10 (data not shown). These results indicate that the AT28 sequence conforms to the normal polymorphic variation expected of a VNTR (42,43). In addition, the copy number is the same as the sequenced allele (data not shown) from the father (CE) of the mardel(10) patient from whom we previously have shown the mardel(10) chromosome was derived (29). This lack of any observable change between the progenitor normal DNA and the activated neocentromere DNA suggests that the polymorphic differences detected in the AT28 alleles cannot be directly responsible for neocentromere activation.


Figure 3. Distribution of low copy number tandem repeats in the NC DNA. Refer to Table 2 for details.


Figure 4. Structure and sequence of AT28. (A) Arrangement of 28 bp tandemly repeating monomers (open arrows). PCR primers N17 and N18 (30) amplify an ~1.2 kb fragment containing the AT28 region. Two RsaI (R) sites are present immediately outside the highly conserved core of the AT28 region. (B) Nucleotide sequence of AT28 showing alignment of each 28 bp monomer and derivation of a consensus sequence. Dashed lines represent gaps introduced to optimize alignment. Asterisks denote perfectly conserved monomers. The RsaI sites are underlined. The nucleotide positions of the AT28 region in the NC DNA are shown. (C) Structural features of AT28. Closed arrows show a perfect palindrome around a central TGTG sequence (underlined). Hashed arrows represent a larger and slightly imperfect (at positions 5 and 20) palindrome around a central GTGT sequence (hashed underline). Open arrows indicate a mirror image around a central A nucleotide (double underlined).

Table 3. Polymorphism analysis of AT28 sequence on different human chromosomes 10
Template | Copy number
C10-1.5RV | 19
BE2C1-18-1f | 15
BE2C1-18-5f | 18
AE (mother) | 15
CE (father) | 18
GM10926D | 11
MK (female) | 17
C198b (male) | 17
WM (female) | 16
C10-1.5RV is a 1.5 kb EcoRV fragment subcloned from cosmid Y6C10 of the normal chromosome 10 HC DNA (29). BE2C1-18-1f and BE2C1-18-5f are somatic hybrid cell lines containing the normal chromosome 10 or the mardel(10) chromosome of the patient, BE, respectively (29). AE and CE are the parents of BE. GM10926D is a somatic hybrid cell line containing one normal chromosome 10. MK, C198b and WM are three normal unrelated individuals. PCR primers N17 and N18 (30) were used to amplify an ~1.2 kb fragment from the template DNA. The PCR products were digested with RsaI, and the fragment containing the conserved core AT28 region subcloned for sequence determination of copy number. Where two chromosomes 10 are present in total human genomic DNA templates, only one AT28 allele was subcloned and analysed.

Human transposable elements

Human transposable elements include all interspersed repetitive sequences and comprise 36% of the human genome. The most common of these are Alu elements which account for ~10% of human DNA (44). Using the web-based search tool ‘Repeat Masker’ (45), all human transposable elements in the NC sequence were identified and classified (Table 4). The observed proportion of NC DNA that is composed of transposable elements was compared with that observed when >7 Mb of random human genomic sequences was screened (44). Overall, with ~40% of the NC sequence being composed of these elements, the representation of human transposable elements in this sequence is not greatly different from that found in the normal human genome. When the individual elements are considered, a slight increase is seen in Alu and MIR, with a much more significant increase observed in the non-mariner DNA transposons and the HERV component of the LTR elements. These increases are counterbalanced by a significant reduction in the level of the LINE1 sequences, although such an alternate distribution of SINE and LINE elements is commonly seen within the human genome (46,47). As shown in Figure 5 and Table 4, although the transposable elements are distributed broadly across the entire NC sequence, significant clustering of these elements within certain segments is discernible, especially around positions 0-10 kb and 45-65 kb. Thus, whilst the abundance of these sequences within the NC DNA appears normal, the possibility that their specific distribution pattern may be important for neocentromere activity should be considered.


Figure 5. Distribution of human transposable elements in the NC DNA. Positions of these elements are shown by the vertical bars. Each bar represents the approximate proportion (not to scale) of regions composed of transposable elements.

DISCUSSION

In this study, we have elucidated the full nucleotide sequence of the 80 kb NC DNA derived from the analphoid neocentromere at 10q25.2 (28-30). The study represents the first detailed structural characterization of the core centromere antigen-binding domain of any neocentromere DNA. The analysis has established the absence of [alpha]-satellite, [beta]-satellite, [gamma]-satellite, 48 bp repeats and ATRS in the NC sequence. Furthermore, although sequence motifs for the putative [alpha]-satellite DNA-binding proteins, pJ[alpha] (8,10) and CENPB, and the classical satellites IA- and III-related pericentric sequences (36,38) were detected, their relatively low abundance and non-tandem nature within the NC DNA indicate that they are unlikely to mimic any possible role these sequences may have in the normal centromere (11,37).

In addition to CENPB and pJ[alpha], at least three other known DNA-binding proteins have been proposed to be associated with the eukaryotic centromere. These are centromere protein A (CENPA), HMGI and topoII. CENPA is a member of a growing class of proteins referred to as histone H3-like proteins (48-50). The protein is found in association with histone H4 and the other core histones (51,52), and is postulated to act as a histone H3 homologue by replacing one or both copies of histone H3 in centromeric nucleosomes (48). The DNA recognition sequence(s) of CENPA have not been defined and could not be included in the present analysis. HMG proteins are abundant, heterogeneous and non-histone components of chromatin (53). These proteins have been shown to interact with the minor groove of the DNA helix, bind to irregular DNA structures and, via their capacity to bend DNA, are thought to facilitate the formation of higher-order nucleoprotein complexes (54). HMGI binds to sequences where there are stretches of six or more A/T nucleotides (40,55). This protein co-localizes with the G-band as well as the centromeric and telomeric regions of mouse and human metaphase chromosomes (56). More specifically, HMGI has been shown to bind to the 172 bp [alpha]-satellite DNA repeat of the African green monkey in vitro (57). TopoII, on the other hand, works by cleaving and opening one DNA helix transiently, passing a second intact DNA helix through the opening, and then resealing the break (58-62). Through this ability to alter DNA topology, topoII is thought to have a role in a host of cellular processes requiring the modulation of double-stranded DNA, including chromosome condensation and segregation during mitosis. The temporal and spatial distribution of topoII on the chromosome scaffold, including that of the centromere, suggests a role for this protein in the integrity of the centromere structure and/or its function (63,64). Scaffold attachment regions (SARs), which contain topoII motifs, have also been found within the [alpha]-satellite sequences of chromosome 1 (65). This, together with our own observation that the vertebrate topoII motif is present at >85% homology match in the [alpha]-satellite consensus sequence, and at 100% match to a number of other non-chromosome 1-specific [alpha]-satellite sequences, suggests that DNA binding of this protein to normal centromeres may be mediated by a typical vertebrate topoII consensus sequence in most [alpha]-satellite monomers. Our analysis has indicated that the NC sequence contains the expected abundance of putative binding motifs for both HMGI and topoII. Overall, these two motifs are distributed throughout the NC DNA relatively randomly and, therefore, are unlikely to exert any major influence on neocentromere formation. However, it is possible that some localized concentrations of these motifs (at position ~15 kb for HMGI and at the p[prime] terminal segment of the NC sequence for topoII) may have an architectural or regulatory role on the modelling of the NC region into a functional higher-order neocentromere structure.

Despite the observed lack of evolutionary conservation of the centromere at the nucleotide sequence level [the CEN-DNA paradox (5)], a repetitive and high A + T content appears to be a recurring theme in many centromeric sequences from wide-ranging organisms (for examples, see refs 4,66-72), suggesting that AT-richness could be important for centromere function. Although the NC sequence as a whole does not show a higher than expected A + T content, the q[prime] end of the sequence demonstrates a significantly higher level of AT-rich islands than the remaining regions. Of particular interest is the AT28 region, which constitutes an ~600 bp stretch of a tandemly repeated sequence that is composed of >80% A + T nucleotides. This region contains a basic 28 bp repeating unit whose palindromic and mirror-image motifs may form interesting secondary structures. The functional significance of this structure and those of the other AT-rich islands is unclear, although it is possible that through some as yet unknown pattern of distribution of these AT-rich islands, a critical A + T constellation or threshold may exist that facilitates the neocentromeric activity of the NC DNA.
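The notion of an AT-rich island can be made concrete with a sliding-window scan. The sketch below is an illustration only and is not the BASEPAIRPLOT analysis used in this study; the window size, step and 80% threshold are assumed values, and the function names are hypothetical.

```python
def at_fraction(seq):
    """Fraction of A and T bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("A") + seq.count("T")) / len(seq)

def at_rich_windows(seq, window=100, step=50, threshold=0.8):
    """Return (start, fraction) for each window whose A + T fraction
    is at or above the threshold. All parameters are illustrative."""
    hits = []
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        frac = at_fraction(seq[start:start + window])
        if frac >= threshold:
            hits.append((start, frac))
    return hits

# Made-up 300 bp sequence with a 100 bp AT-only block in the middle.
toy = "GC" * 50 + "AT" * 50 + "GC" * 50
print(at_rich_windows(toy))
```

Adjacent or overlapping hits could then be merged into islands, and their positions compared along the length of the sequence.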

Our analysis has defined the positions and nucleotide sequences of various low copy number tandem repeats within the NC DNA. A number of these (e.g. T3, AT4, AT28, T5, T6, T7, T8, T11, AT5 and T20-T23; Table 2) are very high in A + T content and will clearly contribute to the AT-rich islands discussed earlier. On the other hand, several of the repeat sequences (e.g. GA1, GA2, CT1 reverse complement, GA4, GA5 and GA3; Table 2) are extremely rich in purine residues. It is interesting that a purine-based AAGAG satellite sequence has been identified within the defined functional centromeric region of the Drosophila Dp1187 minichromosome (72). Whether the various low copy number purine-rich tandem repeats seen in the NC sequence have any functional relevance akin to that of the Drosophila minichromosome is unclear. Furthermore, although these tandem repeats are much smaller than those seen in normal centromeres, they may nonetheless be important for the formation of essential secondary structures necessary for neocentromere function.
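For readers who wish to examine purine-rich tandem repeats such as the Drosophila AAGAG satellite, the following Python sketch computes the purine fraction of a repeat unit and the longest uninterrupted tandem run of that unit in a sequence. It is a simplified illustration, not the TANDEM program used here; it only finds perfect, contiguous copies, and the function names and test sequence are hypothetical.

```python
import re

def purine_fraction(seq):
    """Fraction of purine bases (A or G) in seq."""
    s = seq.upper()
    if not s:
        return 0.0
    return (s.count("A") + s.count("G")) / len(s)

def longest_tandem_run(seq, unit):
    """Largest number of perfect, back-to-back copies of unit in seq."""
    unit = unit.upper()
    runs = re.findall(r"(?:%s)+" % re.escape(unit), seq.upper())
    return max((len(run) // len(unit) for run in runs), default=0)

# Made-up sequence carrying four tandem copies of AAGAG.
toy_repeat = "CC" + "AAGAG" * 4 + "TT"
print(purine_fraction("AAGAG"), longest_tandem_run(toy_repeat, "AAGAG"))
```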

Table 4. Composition of human transposable elements in NC DNA

                    SINEs           LINEs           DNA transposons  Elements with LTRs      Unclassified  Total transposable
                    Alu     MIR     LINE1   LINE2   Mariner  Others  HERVs   MaLRs   Others  elements      elements
Human genome        10      1.7     14.6    2.1     0.1      1.5     1.3     2.6     0.7     0.8           35.5
NC DNA (80.15 kb)   12.26   3.48    4.85    2.87    0.09     5.73    7.66    2.68    0.83    0             40.48
0-10                13.11   3.21    16      1.04    0        25.6    3.93    0       0       0             62.9
9-19                14.63   3.9     0       6.27    0        0       0       0       0       0             24.8
18-28               14.88   2.37    5.54    7.21    0        2.92    0       3.56    0       0             36.48
27-37               17.75   4.43    0       4.52    0        5.35    0       17.3    0       0             49.3
36-46               8.88    6.76    0       2.6     0        2.84    0       0       0       0             21.08
45-55               15.78   3.11    19.58   2.13    0        1.99    11.1    0       0       0             53.7
54-64               5.97    2.87    0       0.53    0        3.11    53.9    0       1.3     0             67.66
63-73               4.87    3.01    0       0       0.79     9.78    0       3.32    6.18    0             28.05
72-80.15            14.53   1.68    2.59    1.57    0        0       0       0       0       0             20.37

Repeat Masker (45) was used to search for human transposable elements. The overall composition (%) of the entire NC DNA (top section of the table) as well as those for 10 kb overlapping fragments (bottom section of table) are compared with those expected for a random DNA sequence in the human genome (44). SINEs include Alu and medium reiterated repeats (MIR). LINEs include all LINE1 and LINE2 elements. DNA transposons include the mariner elements and other non-mariner DNA transposons. Elements with LTRs refer to the retroviral LTRs (HERVs), mammalian long terminal repeats (MaLRs) and other elements not covered by these two categories. The last column shows the total composition of transposable elements in the NC DNA and each of the 10 kb subregions.
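The subregion rows of Table 4 follow a tiling of 10 kb fragments that overlap by 1 kb (0-10, 9-19, ... 72-80.15). The sketch below reproduces that tiling in base pairs and shows how a percentage composition could be derived from annotated element coordinates. It is a minimal illustration, assuming non-overlapping annotation intervals; the function names and the example intervals are hypothetical and are not Repeat Masker output.

```python
def overlapping_windows(total_bp, size_bp=10000, overlap_bp=1000):
    """Tile [0, total_bp) with windows of size_bp that overlap by
    overlap_bp; the final window is truncated at total_bp."""
    step = size_bp - overlap_bp
    windows = []
    start = 0
    while start < total_bp:
        end = min(start + size_bp, total_bp)
        windows.append((start, end))
        if end >= total_bp:
            break
        start += step
    return windows

def percent_covered(intervals, start, end):
    """Percent of the window [start, end) covered by the given
    non-overlapping (start, end) annotation intervals."""
    covered = sum(max(0, min(e, end) - max(s, start)) for s, e in intervals)
    return 100.0 * covered / (end - start)

windows = overlapping_windows(80155)

# Made-up element coordinates, for illustration only.
toy_elements = [(0, 1000), (9500, 10500)]
print(windows)
print(percent_covered(toy_elements, *windows[0]))
```

With a total length of 80 155 bp this yields nine windows, the last of which spans 72 000-80 155 bp, matching the row labels of the table.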

The Dp1187 minichromosome has also been shown to contain a substantial level of transposable elements (72). A detailed search for corresponding human elements has revealed their presence in ~40% of the NC DNA sequence. Whilst this level of transposable elements is not outstandingly different from that found in the human genome, some regional clustering is observed in the NC DNA. Such clustering is itself not uncommon in the human genome and can result in a predisposition to instability of certain chromosomal segments (46). This instability may have implications for neocentromere activation. Furthermore, although a number of previous studies have described the presence of transposable elements in the normal human centromere regions (12,73,74), centromeric heterochromatin is thought to be poor in these elements (75). Whether the transposable elements play any role in neocentromere function is not known, but such elements clearly are not detrimental to this function.

In this study, we have characterized the core centromere protein-binding domain of our neocentromere. We were unable to detect any large blocks of tandemly repeated DNA, the single most prevalent feature of all eukaryotic centromeres (except that of Saccharomyces cerevisiae) studied to date. Whilst the sequence (or subregions of it) loosely demonstrates some centromeric features, such as enrichment for AT-rich islands, and the presence of transposable elements as well as interspersed stretches of highly AT- and purine-rich tandem repeats, no single commanding feature comparable with those of any known centromeric DNA is apparent. Importantly, this study indicates that a large tandem array of a particular sequence motif is not essential for centromere activity. This outcome adds to the mounting evidence and belief that there is no ubiquitous ‘magic’ centromere sequence for the eukaryotic centromeres. Indeed, based on existing information, two speculations can be put forward to account for the sequence nature of eukaryotic centromeres. In the first, rather than there being a specific primary sequence requirement, it is the overall nucleotide composition or a particular combination and spatial distribution of different ordinary DNAs that influences centromere conformation and function (5,24,27,72). It may be the presence of many different motifs or repeats of broadly varied sequences within a specific distance that provides the signal. In the second, perhaps more extreme, speculation, there may not even be any compositional, combinational or spatial requirement for an ‘ordinary’ DNA sequence to be transformed into a centromere or neocentromere. In recent years, increasing attention has been turned to the possible role of epigenetic factors in influencing centromere and neocentromere activities (24-27,29), although what these factors are remains a subject for further work.
The detailed study of centromeric structures and sequences is clearly important for understanding the molecular basis of centromere function. The results presented here provide one such study for a neocentromere with which others can be compared.

MATERIALS AND METHODS

Generation of the NC DNA sequence

Overlapping YAC and BAC clones spanning the neocentromere (30) and total genomic DNA isolated from a somatic hybrid cell line (BE2C1-18-5f) containing the mardel(10) chromosome in a CHOK1 rodent background (29) were used as templates for subcloning the NC DNA. Long (2-4 kb) PCR products were generated using the Long Template PCR kit (Boehringer Mannheim). These products were then purified using the High-Pure kit (Boehringer Mannheim) and either subcloned into pGEM-T (Promega) and end-sequenced with vector-specific primers or sequenced directly with internal primers designed from the generated sequences. Large cloned products were subcloned further into smaller fragments utilizing restriction endonuclease sites or, for larger clones, nested deletions were performed using the Erase-a-base nested deletion kit (Promega). Automated sequencing was carried out using ABI Prism cycle sequencing, and the reactions were electrophoresed on an ABI377 system according to the manufacturer’s instructions. Sequences were edited and contigs assembled using Sequencher software (Gene Codes). To determine the presence of EST matches to the NC DNA, a PowerBLAST search (76) of the human EST databases was conducted on a UNIX interface at the Australian National Genome Information Service, ANGIS (77).

Calculation of normal A + T content and expected abundance of motifs

A total of 10 098 857 bp (>10 Mb) of human genomic sequences randomly derived from 65 distinct loci were selected from GenBank. The accession numbers for these sequences are: AC000377, AC002127, AC002308, AC002366, AC002380, AC002996, AC003016, AC002427, AC004770, AC004501, AC000026, AC000100, AC002368, AC002381, AC002382, AC002383, AC002385, AC002462, AC002463, AC002487, AC002524, AC003015, AC003037, AC003099, AC003684, AC003986, AC004003, AC004083, AC004226, AC004552, AC004615, AC004673, AC004782, AC005220, AC005368, AC005392, AF017257, AF064858, HS164C20, HS175E3, HS390C10, HS445C9, HS57G9, HS941F9, HSAC002064, HSAC002087, HSAF001550, HSU91323, HSU95738, HUAC002041, AC003100, AC003666, AC003685, AC003973, AC003990, AC004038, AC004103, AC004254, AC004554, HUAC004158, HUAC004382, HUAC004531, HUAC004605, HUU91321 and HUU91324. The A + T content was determined for the listed sequences and used to calculate an average percentage of A + T nucleotides over the >10 Mb of random human genomic sequences. The abundance of CENPB, pJ[alpha], satellite IA, satellite III, topoII and HMGI motifs was also determined for each of the listed sequences. From this, the average number of motifs per kb was determined and thus the expected number for 80 155 - (motif size - 1) bp, corresponding to the size of the NC DNA, was established. The size 80 155 - (motif size - 1) bp was used because 80 155 bp is the size of the NC DNA and a motif can begin only at positions that leave room for its full length; for example, there are 80 151 possible positions for a 5mer sequence in 80 155 bp. The expected abundances of the other sequences listed in Table 1 were not determined as they were too large for this analysis.
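The size correction described above can be written out as a short calculation. The following Python sketch is illustrative only: the function names are hypothetical, the motif density of 2.0 per kb is a made-up value rather than a result from this study, and scaling the per-kb density by the number of possible motif positions is our reading of the text.

```python
NC_LENGTH_BP = 80155  # size of the NC DNA given in the text

def motif_positions(seq_len, motif_size):
    """Number of positions at which a motif of motif_size can start
    in a sequence of seq_len bp: seq_len - (motif_size - 1)."""
    return max(seq_len - (motif_size - 1), 0)

def expected_motif_count(motifs_per_kb, seq_len, motif_size):
    """Expected number of motifs, scaling a per-kb density (assumed to be
    derived from the genomic sample) to the available positions."""
    return motifs_per_kb * motif_positions(seq_len, motif_size) / 1000.0

# A 5mer has 80 151 possible start positions in 80 155 bp.
print(motif_positions(NC_LENGTH_BP, 5))
# Hypothetical density of 2.0 motifs per kb, for illustration only.
print(expected_motif_count(2.0, NC_LENGTH_BP, 5))
```

For short motifs the correction is negligible relative to 80 kb, which is consistent with the expected abundances being determined only for the smaller motifs.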

Computational analysis

The composition of the NC DNA was determined using COMPOSITION analysis (78). Searches for matches to small (<30 bp) sequence motifs were carried out using FIND-PATTERNS and EWINDOWS/ESTATPLOT of the WGCG Software Package (78) on a web-based interface at ANGIS (77). For the determination of homology to sequences >30 bp in size, we created our own comprehensive database for all known centromere-derived DNA sequences on the GenBank database using CreateDB (78). This was done by searching GenBank for all sequence entries with the word ‘centromere’ and selecting only those that were derived from the centromere DNA of an organism presented. BLAST and FASTA searches of the NC DNA sequence were then performed on this centromere database, using default parameters. AT-rich islands were determined with BASEPAIRPLOT (78) and tandem repeats with TANDEM (78). Palindromes were identified with STEMLOOP (78). Mirror repeats were identified by eye. The web-based search tool ‘Repeat Masker’ (45) was used in identification of human transposable elements and some simple repetitive regions that were not detected by TANDEM.
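The distinction between palindromes and mirror repeats drawn above can be stated precisely: a palindrome reads the same as its reverse complement, whereas a mirror repeat reads the same when simply reversed on one strand. The sketch below checks both properties for a given sequence. It is a minimal illustration with hypothetical function names; it tests perfect matches only and does not reproduce STEMLOOP, which also handles stems separated by loops.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G and T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def is_palindrome(seq):
    """True if seq equals its own reverse complement."""
    s = seq.upper()
    return s == reverse_complement(s)

def is_mirror_repeat(seq):
    """True if seq reads the same in both directions on one strand."""
    s = seq.upper()
    return s == s[::-1]

# GAATTC is a palindrome but not a mirror repeat;
# AGGAAGGA is a mirror repeat but not a palindrome.
for candidate in ("GAATTC", "AGGAAGGA"):
    print(candidate, is_palindrome(candidate), is_mirror_repeat(candidate))
```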

ACKNOWLEDGEMENTS

We thank Vivien Bonazzi, Bruno Gaeta and Kirsten Balding for advice on computer analyses, and NH&MRC and AMRAD Corp. Ltd for funding support. K.H.A.C. is a Principal Research Fellow of NH&MRC and a Senior Associate of the University of Melbourne.

REFERENCES

1. Brinkley, B.R., Ouspenski, I. and Zinkowski, R.P. (1992) Structure and molecular organization of the centromere/kinetochore complex. Trends Cell. Biol., 2, 14-21.

2. Earnshaw, W.C. and Mackay, A.M. (1994) Role of non-histone proteins in the chromosomal events of mitosis. FASEB J., 8, 947-956.

3. Pluta, A.F., Mackay, A.M., Ainsztein, A.M., Goldberg, I.G. and Earnshaw, W.C. (1995) The centromere: hub of chromosomal activities. Science, 270, 1591-1594.

4. Choo, K.H.A. (1997) The Centromere. Oxford University Press, Oxford.

5. Choo, K.H.A. (1997) Centromere DNA dynamics: latent centromeres and neocentromere formation. Am. J. Hum. Genet., 61, 1225-1233.

6. Muro, Y., Masumoto, H., Yoda, K., Nozaki, N., Ohashi, M. and Okazaki, T. (1992) Centromere protein B assembles human centromeric alpha-satellite DNA at the 17-bp sequence, CENPB-box. J. Cell Biol., 116, 585-596.

7. Pluta, A.F., Saitoh, N., Goldberg, I. and Earnshaw, W.C. (1992) Identification of a subdomain of CENPB that is necessary and sufficient for localization to the human centromere. J. Cell Biol., 116, 1081-1093.

8. Gaff, C., du Sart, D., Kalitsis, P., Iannello, R., Nagy, A. and Choo, K.H.A. (1994) A novel nuclear protein binds centromeric alpha-satellite DNA. Hum. Mol. Genet., 3, 711-716.

9. Romanova, L.Y., Deriagan, G.V., Mashkova, T.D., Tumeneva, I.G., Mushegian, A.R., Kisselev, L.L. and Alexandrov, I.A. (1996) Evidence for selection in evolution of alpha-satellite DNA: the central role of CENPB/pJ[alpha] binding region. J. Mol. Biol., 261, 334-340.

10. Hudson, D.F., Fowler, K.J., Earle, E., Saffery, R., Kalitsis, P., Trowell, H., Hill, J., Wreford, N.G., de Kretser, D.M., Cancilla, M.R., Howman, E., Hii, L., Cutts, S.M., Irvine, D.V. and Choo, K.H.A. (1998) Centromere protein B null mice are mitotically and meiotically normal but have lower body and testis weights. J. Cell Biol., 141, 309-319.

11. Lee, C., Wevrick, R., Fisher, R.B., Ferguson-Smith, M.A. and Lin, C.C. (1997) Human centromeric DNAs. Hum. Genet., 100, 291-304.

12. Wevrick, R., Willard, V.P. and Willard, H.F. (1992) Structure of DNA near long tandem arrays of alpha-satellite DNA at the centromeres of human chromosome 7. Genomics, 14, 912-923.

13. Metzdorf, R., Gottert, E. and Blin, N. (1988) A novel centromeric repetitive DNA from human chromosome 22. Chromosoma, 97, 154-158.

14. Mullenbach, R., Lutz, S., Holzmann, K., Dooley, S. and Blin, N. (1992) A non-alphoid repetitive DNA sequence from human chromosome 21. Hum. Genet., 89, 519-523.

15. Cooper, K.F., Fisher, R.B. and Tyler-Smith, C. (1992) Structure of the pericentric long arm region of the human Y chromosome. J. Mol. Biol., 228, 421-432.

16. Potter, S.S. (1984) Rearranged sequences of a human KpnI element. Proc. Natl Acad. Sci. USA, 81, 1012-1016.

17. Higgins, M.J., Wang, H.S., Shtromas, I., Haliotis, T., Roder, J.C., Holden, J.J. and White, B.N. (1985) Organisation of a repetitive 1.8 kb KpnI sequence localized in the heterochromatin of chromosome 15. Chromosoma, 93, 77-86.

18. Tyler-Smith, C., Oakey, R.J., Larin, Z., Fisher, R.B., Crocker, M., Affara, N.A. et al. (1993) Localization of DNA sequences required for human centromere function through an analysis of rearranged Y chromosomes. Nature Genet., 5, 368-375.

19. Brown, K.E., Barnett, M.A., Burgtorf, C., Shaw, P., Buckle, V.J. and Brown, W.R. (1994) Dissecting the centromere of the human Y chromosome with cloned telomeric DNA. Hum. Mol. Genet., 3, 1227-1237.

20. Farr, C.J., Bayne, R.A.L., Kipling, D., Mills, W., Critcher, R. and Cooke, H.J. (1995) Generation of a human X-derived minichromosome using telomere-associated chromosome fragmentation. EMBO J., 14, 5444-5454.

21. Harrington, J.J., Van Bokkelen, G., Mays, R.W., Gustashaw, K. and Willard, H.F. (1997) Formation of de novo centromeres and construction of first-generation human artificial chromosomes. Nature Genet., 15, 345-355.

22. Ikeno, M., Grimes, B., Okazaki, T., Nakano, M., Saitoh, K., Hoshino, H., McGill, N.I., Cooke, H. and Masumoto, H. (1998) Construction of YAC-based mammalian artificial chromosomes. Nature Biotech., 16, 431-439.

23. Depinet, T.W., Zackowski, J.L., Earnshaw, W.C., Kaffe, S., Sekhon, G.S., Stallard, R., Sullivan, B.A., Vance, G.H., Van Dyke, D.L., Willard, H.F., Zinn, A.B. and Schwartz, S. (1997) Characterization of neocentromeres in marker chromosomes lacking detectable alpha-satellite DNA. Hum. Mol. Genet., 6, 1195-1204.

24. Brown, W. and Tyler-Smith, C. (1995) Centromere activation. Trends Genet., 11, 337-339.

25. Karpen, G.H. and Allshire, R.C. (1997) The case for epigenetic effects on centromere identity and function. Trends Genet., 13, 489-496.

26. Choo, K.H.A. (1998) Turning on the centromere. Nature Genet., 18, 3-4.

27. Williams, B.C., Murphy, T.D., Goldberg, M.L. and Karpen, G.H. (1998) Neocentromere activity of structurally acentric minichromosomes in Drosophila. Nature Genet., 18, 30-37.

28. Voullaire, L.E., Slater, H.R., Petrovic, V. and Choo, K.H.A. (1993) A functional marker centromere with no detectable alpha-satellite, satellite III or CENPB protein: activation of a latent centromere? Am. J. Hum. Genet., 52, 1153-1163.

29. du Sart, D., Cancilla, M.R., Earle, E., Mao, J., Saffery, R., Tainton, K.M., Kalitsis, P., Martyn, J., Barry, A.E. and Choo, K.H.A. (1997) A functional neocentromere formed through activation of a latent human centromere and consisting of non-alpha-satellite DNA. Nature Genet., 16, 144-153.

30. Cancilla, M.R., Tainton, K.M., Barry, A.E., Larionov, V., Kouprina, N., Resnick, M., duSart, D. and Choo, K.H.A. (1998) Direct cloning of human 10q25 neocentromere DNA using transformation associated recombination (TAR) in yeast. Genomics, 47, 399-404.

31. Bernardi, G. (1995) The human genome: organization and evolutionary history. Annu. Rev. Genet., 29, 445-476.

32. Choo, K.H., Vissel, B., Nagy, A., Earle, E. and Kalitsis, P. (1991) A survey of the genomic distribution of alpha-satellite DNA on all human chromosomes, and derivation of a new consensus sequence. Nucleic Acids Res., 19, 1179-1182.

33. Kalitsis, P., Earle, E., Vissel, B., Shaffer, L.G. and Choo, K.H.A. (1993) A chromosome 13-specific human satellite I DNA subfamily with minor presence on chromosome 21: further studies on Robertsonian translocations. Genomics, 16, 104-112.

34. Meyne, J., Goodwin, H.E. and Moyzis, R.K. (1994) Chromosome localization and orientation of the simple sequence repeat of human satellite I DNA. Chromosoma, 103, 99-103.

35. Tagarro, I., Wiegant, J., Raap, A.K., Gonzalez-Aguilera, J.J. and Fernandez-Peralta, A.M. (1994) Assignment of human satellite I DNA as revealed by fluorescent in situ hybridization with oligonucleotides. Hum. Genet., 93, 125-128.

36. Prosser, J., Frommer, M., Paul, C. and Vincent, P.C. (1986) Sequence relationships of three human satellite DNAs. J. Mol. Biol., 187, 145-155.

37. Grady, D., Ratliff, R., Robinson, D., McCanlies, E., Meyne, J. and Moyzis, R.K. (1992) Highly conserved repetitive DNA sequences are present at human centromeres. Proc. Natl Acad. Sci. USA, 89, 1695-1699.

38. Frommer, M., Prosser, J., Tkachuk, D., Reisner, A.H. and Vincent, P.C. (1982) Simple repeated sequences in human satellite DNA. Nucleic Acids Res., 10, 547-563.

39. Kipling, D., Mitchell, A., Masumoto, H., Wilson, H., Nicol, L. and Cooke, H. (1995) CENPB binds a novel centromeric sequence in the Asian mouse Mus caroli. Mol. Cell. Biol., 15, 4009-4020.

40. Solomon, M., Strauss, F. and Varshavsky, A. (1986) A mammalian high mobility group protein recognizes any stretch of six A·T base pairs in duplex DNA. Proc. Natl Acad. Sci. USA, 83, 1276-1280.

41. Spitzner, J.R. and Muller, M.T. (1988) A consensus sequence for cleavage by vertebrate DNA topoisomerase II. Nucleic Acids Res., 16, 5533-5556.

42. Jeffreys, A.J., Wilson, V. and Thein, S.L. (1985) Hypervariable ‘minisatellite’ regions in human DNA. Nature, 314, 67-73.

43. Nakamura, Y., Leppert, M., O'Connell, P., Wolff, R., Holm, T., Culver, M., Martin, C., Fujimoto, E., Hoff, M., Kumlin, E. and White, R. (1987) Variable number of tandem repeat (VNTR) markers for human gene mapping. Science, 235, 1616-1622.

44. Smit, A.F.A. (1996) The origin of interspersed repeats in the human genome. Curr. Opin. Genet. Dev., 6, 743-748.

45. Smit, A.F.A. and Green, P. Repeat Masker (http://ftp.genome.washington.edu/RM/RepeatMasker.html).

46. Calabretta, B., Robberson, D.L., Barrera-Saldana, H.A., Lambrou, T.P. and Saunders, G.F. (1982) Genome instability in a region of human DNA enriched in Alu repeat sequences. Nature, 296, 219-225.

47. Korenberg, J.R. and Rykowski, M.C. (1988) Human genome organization: Alu, LINEs and the molecular structure of metaphase chromosome bands. Cell, 53, 391-400.

48. Sullivan, K.F., Hechenberger, M. and Masri, K. (1994) Human CENPA contains a histone H3 related histone fold domain that is required for targeting to the human centromere. J. Cell Biol., 127, 581-592.

49. Wilson, R., Ainscough, R., Baynes, C., Berks, M., Bonfield, J., Burton, J., Connell, M., Copsey, T., Cooper, J. et al. (1994) 2.2 Mb of contiguous nucleotide sequence from chromosome III of C. elegans. Nature, 368, 32-38.

50. Stoler, S., Keith, K.C., Curnick, K.E. and Fitzgerald-Hayes, M. (1995) A mutation in CSE4, an essential gene encoding a novel chromatin-associated protein in yeast, causes chromosome nondisjunction and cell-cycle arrest at mitosis. Genes Dev., 9, 573-586.

51. Palmer, D.K. and Margolis, R.L. (1985) Kinetochore components recognized by human autoantibodies are present on mononucleosomes. Mol. Cell. Biol., 5, 173-186.

52. Palmer, D.K., O'Day, K., Wener, M.H., Andrews, B.S. and Margolis, R.L. (1987) A 17-kD centromere protein (CENPA) copurifies with nucleosome core particles and with histones. J. Cell Biol., 104, 805-815.

53. Johns, E.W. (1982) The HMG Chromosomal Proteins. Academic Press, New York.

54. Grosschedl, R., Giese, K. and Pagel, J. (1994) HMG domain proteins: architectural elements in the assembly of nucleoprotein structures. Trends Genet., 10, 94-100.

55. Levinger, L. and Varshavsky, A. (1982) Protein D1 preferentially binds (A+T)-rich satellite DNA. Proc. Natl Acad. Sci. USA, 79, 7152-7156.

56. Disney, J., Johnson, K., Magnuson, N., Sylvester, S. and Reeves, R. (1989) High mobility group protein HMGI localizes to G/Q- and C-bands of human and mouse chromosomes. J. Cell Biol., 109, 1975-1982.

57. Strauss, F. and Varshavsky, A. (1984) A protein binds to satellite DNA repeat at three specific sites that would be brought into mutual proximity by DNA folding in the nucleosome. Cell, 37, 889-901.

58. Earnshaw, W.C. and Heck, M. (1985) Localization of topoisomerase II in mitotic chromosomes. J. Cell Biol., 100, 1716-1725.

59. Earnshaw, W.C., Halligan, B., Cooke, C.A., Heck, M. and Liu, L. (1985) Topoisomerase II is a structural component of mitotic chromosome scaffolds. J. Cell Biol., 100, 1706-1715.

60. Gasser, S. and Laemmli, U. (1986) The organization of chromatin loops: characterization of a scaffold attachment site. EMBO J., 5, 511-518.

61. Roca, J. (1995) The mechanisms of DNA topoisomerases. Trends Biochem. Sci., 20, 156-160.

62. Berger, J., Gamblin, S., Harrison, S. and Wang, J. (1996) Structure and mechanism of DNA topoisomerase II. Nature, 379, 225-232.

63. Rattner, J.B., Hendzel, M.J., Sommer Furbee, C., Muller, M.T. and Bazett-Jones, D.P. (1996) Topoisomerase II[alpha] is associated with the mammalian centromere in a cell cycle- and species-specific manner and is required for proper centromere/kinetochore structure. J. Cell Biol., 134, 1097-1107.

64. Sumner, A. (1996) The distribution of topoisomerase II on mammalian chromosomes. Chromosome Res., 4, 4-5.

65. Strissel, P.L., Espinosa, R. III, Rowley, J.D. and Swift, H. (1996) Scaffold attachment regions in centromere-associated DNA. Chromosoma, 105, 122-133.

66. Rattner, J.B. (1991) The structure of the mammalian centromere. BioEssays, 13, 51-56.

67. Schulman, I. and Bloom, K.S. (1991) Centromeres: an integrated protein:DNA complex required for chromosome movement. Annu. Rev. Cell Biol., 7, 311-336.

68. Alfenito, M.R. and Birchler, J.A. (1993) Molecular characterization of a maize B chromosome centric sequence. Genetics, 135, 589-597.

69. Clarke, L., Baum, M., Marschall, I.G., Ngan, V.K. and Steiner, N.C. (1993) Structure and function of Schizosaccharomyces pombe centromeres. Cold Spring Harbor Symp. Quant. Biol., 58, 687-695.

70. Tyler-Smith, C. and Willard, H.F. (1993) Mammalian chromosome structure. Curr. Opin. Genet. Dev., 3, 390-397.

71. Brown, K.E., Barnett, M.A., Burgtorf, C., Shaw, P., Buckle, V.J. and Brown, W.R. (1994) Dissecting the centromere of the human Y chromosome with cloned telomeric DNA. Hum. Mol. Genet., 3, 1227-1237.

72. Sun, X., Wahlstrom, J. and Karpen, G. (1997) Molecular structure of a functional Drosophila centromere. Cell, 91, 1007-1019.

73. Behrens, F., Claussen, U., Iyer, L.M., Green, E.D., Horsthemke, B., Williamson, R., Huxley, C. and Coutelle, D. (1997) Isolation of DNA from the centromere of human chromosome 7 by microdissection. Chromosome Res., 5, 215-220.

74. Greig, G.M. and Willard, H.F. (1992) Beta-satellite DNA: characterization and localization of two subfamilies from the distal and proximal short arms of the human acrocentric chromosomes. Genomics, 12, 573-580.

75. Moyzis, R.K., Torney, D.C., Meyne, J., Buckingham, J.M., Wu, J.-R., Burks, C., Sirotkin, K.M. and Goad, W.B. (1989) The distribution of interspersed repetitive DNA sequences in the human genome. Genomics, 4, 273-289.

76. Zhang, J. and Madden, T.L. (1997) PowerBLAST: a new network BLAST application for interactive and automated sequences analysis and annotation. Genome Res., 7, 649-656.

77. Australian National Genome Information Service (ANGIS) (http://mel1.angis.org.au/).

78. Genetics Computer Group (1994) Program Manual for the Wisconsin Package, Version 8.


*To whom correspondence should be addressed. Tel: +61 3 9345 5045; Fax: +61 3 9348 1391; Email: choo@cryptic.rch.unimelb.edu.au


Last modification: 4 Feb 1999
Copyright©Oxford University Press, 1999.

Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us    What's this?


This article has been cited by other articles:


Home page
Genome ResHome page
A. E. Hall, G. C. Kettler, and D. Preuss
Dynamic evolution at pericentromeres
Genome Res., March 1, 2006; 16(3): 355 - 364.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
S. E. Hall, S. Luo, A. E. Hall, and D. Preuss
Differential Rates of Local and Global Homogenization in Centromere Satellites From Arabidopsis Relatives
Genetics, August 1, 2005; 170(4): 1913 - 1927.
[Abstract] [Full Text] [PDF]


Home page
J HeredHome page
G. C. Ferreri, D. M. Liscinsky, J. A. Mack, M. D. B. Eldridge, and R. J. O'Neill
Retention of Latent Centromeres in the Mammalian Genome
J. Hered., May 1, 2005; 96(3): 217 - 224.
[Abstract] [Full Text] [PDF]


Home page
J. Biol. Chem.Home page
H. Sumer, R. Saffery, N. Wong, J. M. Craig, and K. H. A. Choo
Effects of Scaffold/Matrix Alteration on Centromeric Function and Gene Expression
J. Biol. Chem., September 3, 2004; 279(36): 37631 - 37639.
[Abstract] [Full Text] [PDF]


Home page
J HeredHome page
R. J. O'Neill, M. D. B. Eldridge, and C. J. Metcalfe
Centromere Dynamics and Chromosome Evolution in Marsupials
J. Hered., September 1, 2004; 95(5): 375 - 381.
[Abstract] [Full Text] [PDF]


Home page
GENES CELLSHome page
C. Obuse, H. Yang, N. Nozaki, S. Goto, T. Okazaki, and K. Yoda
Proteomics analysis of the centromere complex from HeLa interphase cells: UV-damaged DNA binding protein 1 (DDB-1) is a component of the CEN-complex, while BMI-1 is transiently co-localized with the centromeric region in interphase
Genes Cells, February 1, 2004; 9(2): 105 - 120.
[Abstract] [Full Text] [PDF]


Home page
Genome ResHome page
D. Schindelhauer and T. Schwarz
Evidence for a Fast, Intrachromosomal Conversion Mechanism From Mapping of Nucleotide Variants Within a Homogeneous alpha -Satellite DNA Array
Genome Res., December 1, 2002; 12(12): 1815 - 1826.
[Abstract] [Full Text] [PDF]


Home page
Am. J. Pathol.Home page
H. Tsuda, T. Takarabe, Y. Kanai, T. Fukutomi, and S. Hirohashi
Correlation of DNA Hypomethylation at Pericentromeric Heterochromatin Regions of Chromosomes 16 and 1 with Histological Features and Chromosomal Abnormalities of Human Breast Carcinomas
Am. J. Pathol., September 1, 2002; 161(3): 859 - 866.
[Abstract] [Full Text] [PDF]


Home page
Proc. Natl. Acad. Sci. USAHome page
Y. Saito, Y. Kanai, M. Sakamoto, H. Saito, H. Ishii, and S. Hirohashi
Overexpression of a splice variant of DNA methyltransferase 3b, DNMT3b4, associated with DNA hypomethylation on pericentromeric satellite regions during human hepatocarcinogenesis
PNAS, July 23, 2002; 99(15): 10060 - 10065.
[Abstract] [Full Text] [PDF]


Home page
Mol. Cell. Biol.Home page
S. Ando, H. Yang, N. Nozaki, T. Okazaki, and K. Yoda
CENP-A, -B, and -C Chromatin Complex That Contains the I-Type {alpha}-Satellite Array Constitutes the Prekinetochore in HeLa Cells
Mol. Cell. Biol., April 1, 2002; 22(7): 2229 - 2241.
[Abstract] [Full Text] [PDF]


Home page
Hum Mol GenetHome page
A. S. Kondrashov and S. A. Shabalina
Classification of common conserved sequences in mammalian intergenic regions
Hum. Mol. Genet., March 1, 2002; 11(6): 669 - 674.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
K. A. Maggert and G. H. Karpen
The Activation of a Neocentromere in Drosophila Requires Proximity to an Endogenous Centromere
Genetics, August 1, 2001; 158(4): 1615 - 1628.
[Abstract] [Full Text] [PDF]


Home page
Proc. Natl. Acad. Sci. USAHome page
H. F. Willard
Neocentromeres and human artificial chromosomes: An unnatural act
PNAS, May 8, 2001; 98(10): 5374 - 5376.
[Full Text] [PDF]


Home page
JCBHome page
R. D. Shelby, K. Monier, and K. F. Sullivan
Chromatin Assembly at Kinetochores Is Uncoupled from DNA Replication
J. Cell Biol., November 27, 2000; 151(5): 1113 - 1118.
[Abstract] [Full Text] [PDF]


Home page
J. Med. Genet.Home page
A. E COCKWELL, B. GIBBONS, I. E MOORE, and J. A CROLLA
An analphoid supernumerary marker chromosome derived from chromosome 3 ascertained in a fetus with multiple malformations
J. Med. Genet., October 1, 2000; 37(10): 807 - 810.
[Full Text]


Home page
Hum Mol GenetHome page
T. A. Ebersole, A. Ross, E. Clark, N. McGill, D. Schindelhauer, H. Cooke, and B. Grimes
Mammalian artificial chromosome formation from circular alphoid input DNA does not require telomere repeats
Hum. Mol. Genet., July 1, 2000; 9(11): 1623 - 1631.
[Abstract] [Full Text] [PDF]


Home page
Genome ResHome page
K. A. Maggert and G. H. Karpen
Acquisition and Metastability of Centromere Identity and Function: Sequence Analysis of a Human Neocentromere
Genome Res., June 1, 2000; 10(6): 725 - 728.
[Full Text]


Home page
Genome ResHome page
A. E. Barry, M. Bateman, E. V. Howman, M. R. Cancilla, K. M. Tainton, D. V. Irvine, R. Saffery, and K.H. A. Choo
The 10q25 Neocentromere and its Inactive Progenitor Have Identical Primary Nucleotide Sequence: Further Evidence for Epigenetic Modification
Genome Res., June 1, 2000; 10(6): 832 - 838.
[Abstract] [Full Text]


Home page
Proc. Natl. Acad. Sci. USAHome page
E. V. Howman, K. J. Fowler, A. J. Newson, S. Redward, A. C. MacDonald, P. Kalitsis, and K. H. A. Choo
Early disruption of centromeric chromatin organization in centromere protein A (Cenpa) null mice
PNAS, February 1, 2000; 97(3): 1148 - 1153.

J. Koch
Neocentromeres and alpha satellite: a proposed structural code for functional human centromere DNA
Hum. Mol. Genet., January 22, 2000; 9(2): 149 - 154.

R. Saffery, D. V. Irvine, B. Griffiths, P. Kalitsis, L. Wordeman, and K. H. A. Choo
Human centromeres and neocentromeres show identical distribution patterns of >20 functionally important kinetochore-associated proteins
Hum. Mol. Genet., January 22, 2000; 9(2): 175 - 185.

E. Earle, A. Saxena, A. MacDonald, D. F. Hudson, L. G. Shaffer, R. Saffery, M. R. Cancilla, S. M. Cutts, E. Howman, and K. H. A. Choo
Poly(ADP-ribose) polymerase at active centromeres and neocentromeres at metaphase
Hum. Mol. Genet., January 22, 2000; 9(2): 187 - 194.

S. Henikoff, K. Ahmad, J. S. Platero, and B. van Steensel
From the Cover: Heterochromatic deposition of centromeric histone H3-like proteins
PNAS, January 18, 2000; 97(2): 716 - 721.

E. Csonka, I. Cserpan, K. Fodor, G. Hollo, R. Katona, J. Kereso, T. Praznovszky, B. Szakal, A. Telenius, G. deJong, et al.
Novel generation of human satellite DNA-based artificial chromosomes in mammalian cells
J. Cell Sci., January 9, 2000; 113(18): 3207 - 3216.

G. Montefalcone, S. Tempesta, M. Rocchi, and N. Archidiacono
Centromere Repositioning
Genome Res., December 1, 1999; 9(12): 1184 - 1188.

A. W. I. Lo, G. C.-C. Liao, M. Rocchi, and K. H. A. Choo
Extreme Reduction of Chromosome-Specific α-Satellite Array Is Unusually Common in Human Chromosome 21
Genome Res., October 1, 1999; 9(10): 895 - 908.

A. W. I. Lo, D. J. Magliano, M. C. Sibson, P. Kalitsis, J. M. Craig, and K. H. A. Choo
A Novel Chromatin Immunoprecipitation and Array (CIA) Analysis Identifies a 460-kb CENP-A-Binding Neocentromere DNA
Genome Res., March 1, 2001; 11(3): 448 - 457.