Facioscapulohumeral muscular dystrophy (FSHD) is an autosomal dominant neuromuscular disease that has been linked to deletions within a tandem array of 3.2 kb repeats adjacent to the telomere of 4q. These repeats are also present in other locations in the human genome, including the short arms of all the acrocentric chromosomes. Here, we examine two models for the role of this repeat in FSHD. First, because of the extensive similarity between the 3.2 kb repeats on 4q and those adjacent to rDNA on the acrocentric chromosomes, we investigated whether the FSHD region on 4q is involved in sub-nuclear localization, specifically to the nucleolus. The results likely exclude any involvement of nucleolar localization in the development of FSHD. Second, we investigated a model that suggests that a functional gene may be buried within the tandem array of 3.2 kb repeats. Toward this end, we evaluated the evolutionary conservation of the repeat and a double homeodomain sequence within the repeat in a variety of primate species. The genomic organization of the 3.2 kb repeat in humans, great apes and lower primates identified the FSHD-associated repeat on chromosome 4q as the likely ancestral copy. The sequence of the rhesus monkey double homeodomain reveals significant sequence identity with the human 4q sequence. These results strongly suggest a functional role for a component of the FSHD-associated repeat.
Facioscapulohumeral muscular dystrophy (FSHD) is an autosomal dominant neuromuscular disease which primarily affects the shoulder girdle and the periorbital and perioral muscles of the face (1). Retinal vasculopathy and sensorineural hearing loss are also often part of the clinical presentation (2-4). Age of onset is generally in the second decade of life, although several infantile onset cases have been reported with the same range of clinical signs and symptoms (5). Penetrance is estimated at <5% for ages 0-4 years, and increases steadily to 95% by the late teens (6).
FSHD has been linked to the most distal genetic markers on the long arm of chromosome 4 (7). A probe designated p13E-11 detects an EcoRI fragment length polymorphism which segregates with the disease (8). Physical maps place this EcoRI fragment distal to all known markers on chromosome 4q (9-11). The FSHD region is thus defined by the boundaries of this EcoRI fragment, which ranges from ~40 to 200 kb in the normal population (8). Within this EcoRI fragment lies a tandem array of 3.2 kb repeats (8). A single unit of this repeat has been sequenced and found to be precisely 3254 bp (12). High resolution fluorescence in situ hybridization (FISH) places this 3.2 kb repeat array immediately adjacent to the telomere of 4q (13).
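As a rough illustration of the sizes quoted above, the following Python sketch estimates how many 3254 bp units could fit in an EcoRI fragment of a given size. The function name and the `flanking_kb` parameter are hypothetical: the text does not state how much non-repeat sequence the fragment carries, so the default of zero yields only an upper bound, not a measured copy number.

```python
REPEAT_UNIT_BP = 3254  # sequenced length of one repeat unit, as stated in the text


def max_repeat_units(fragment_kb: float, flanking_kb: float = 0.0) -> int:
    """Upper bound on whole repeat units that fit in an EcoRI fragment.

    flanking_kb is an assumed amount of non-repeat sequence inside the
    fragment; it is not given in the text and defaults to zero.
    """
    if flanking_kb < 0 or fragment_kb < flanking_kb:
        raise ValueError("flanking_kb must lie between 0 and fragment_kb")
    usable_bp = (fragment_kb - flanking_kb) * 1000
    return int(usable_bp // REPEAT_UNIT_BP)


# Approximate ends of the normal size range quoted in the text (~40 to 200 kb).
for size_kb in (40, 200):
    print(f"{size_kb} kb fragment: at most {max_repeat_units(size_kb)} repeat units")
```

Under the zero-flank assumption, the normal range corresponds to roughly a dozen to about sixty units, which gives a sense of scale for the deletions discussed here.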
A decrease in the number of these tandem repeats accounts for the variation in size of the EcoRI fragment in both familial and sporadic cases of FSHD (14). The repeat contains several interesting sequence components, including a class of heterochromatic elements, LSau, and a double homeodomain (12,15). The homeodomains are separated by a 45 nucleotide spacer and the entire sequence is in an open reading frame (ORF) (12,15). In the human genome, the 3.2 kb repeat is represented on several chromosomes other than 4q, although the homeodomains are not in an ORF at these other loci (8,12,15,16). The short arms of all of the acrocentric chromosomes contain this repeat, in the heterochromatic regions adjacent to the rDNA gene clusters (8,15). The 3.2 kb tandem repeat is also present on human chromosomes 1q12 and 10qter (15). An overlap in fragment size exists between the 4qter and 10qter repeat sequences, which can complicate FSHD analysis (17).
The role of this repeat in the development of FSHD has yet to be determined. Three mechanisms have been proposed for the molecular etiology of FSHD. One possibility is that the heterochromatic elements within the repeat are essential for maintaining or establishing proper chromatin structure in this region of the genome. Deletions of the 3.2 kb subtelomeric repeat may alter local chromatin structure, affecting the expression of nearby genes (12,15). Such position effects have been demonstrated in mice, Drosophila and yeast (18-23). However, a recently identified gene, FRG1, which maps 100 kb proximal to the repeat, does not demonstrate differences in transcriptional levels between control and FSHD individuals (24).
In this study, we examine two additional models for the function of the 3.2 kb repeat in the pathogenesis of FSHD. First, we hypothesize that the 3.2 kb repeat may be involved in nuclear positioning, as this sequence cross-hybridizes to the nucleolar organizing region (NOR) on all of the acrocentric chromosomes (12,15,25). Deletions of several copies of the 3.2 kb repeat sequence at the FSHD locus may disrupt nuclear localization signals required for expression of genes in the region.
A less complex model for the role of the 3.2 kb repeat, however, is that the repeat contains some portion of the FSHD gene itself. The rostro-caudal pattern of affected musculature in FSHD is consistent with involvement of a homeobox gene. A functional protein cannot be generated if deletions of the 3.2 kb repeat remove essential segments of the gene, such as the double homeodomain. The repetitive nature of this subtelomeric region and the representation of these sequences on multiple chromosomes in the human genome have, however, complicated the isolation of candidate genes for FSHD. For this reason, parallel studies were performed in a variety of non-human primates, which demonstrate a lesser degree of 3.2 kb dispersal than humans (12). Primate studies were necessary, as the 3.2 kb repeat homeodomain is not present in rodent genomes, such as mouse or rat (12, S. T. Winokur et al., unpublished data). Interestingly, if a homeodomain is indeed involved, it may be one which is specific to primate development. The tissues involved in FSHD are those which have a distinct function in primates: the facial muscles allow for enhanced communication through facial expression, the shoulder blades and arm sockets are positioned and designed for a wide range of arm movement above the head (26), and visual acuity (particularly of the retinal fovea) is a distinguishing characteristic of primate development (27).
The extensive homology between the 3.2 kb repeats on the acrocentric chromosomes and chromosome 4q, as well as their proximity to rDNA on the acrocentric chromosomes, has led to the postulate that the FSHD-associated repeat may be involved in positioning the 4q telomere near the nucleolus. To test this hypothesis, interphase lymphoblasts from FSHD-affected and unaffected individuals were examined for differences in the position of chromosome 4qter loci in relation to the nucleolus in deleted (FSHD-affected) and normal chromosome 4 homologs.
The nucleoli were identified by a biotin-labeled rDNA probe hybridized to lymphoblast interphase nuclei utilizing FISH (Fig. 1). Since unsynchronized lymphoblasts were used for this study, the nucleoli were observed in various stages of the cell cycle. Only nuclei in which a fully formed nucleolus was demonstrated by a single fused rDNA signal were analyzed. Since the 3.2 kb repeat is found at several loci in addition to 4q, the single copy cosmid D4S139 (c88F8) was used to identify the location of 4qter relative to the nucleolus (15). D4S139 resides 180 kb proximal to the 3.2 kb repeat array, a distance at which two signals are barely resolvable in interphase nuclei (28).
Several hypotheses have now been proposed for the role of the deleted 3.2 kb repeat in the pathogenesis of FSHD: (i) an alteration in chromatin structure leading to a position effect on the expression of nearby genes; (ii) disruption of nuclear localization signals required for proper expression of the FSHD gene; and (iii) deletion of a portion of the FSHD gene within the tandem array of 3.2 kb repeats. Dominantly inherited disorders such as FSHD are not often the phenotypic manifestation of simple deletions. However, the early embryo often requires precise levels of transcription factors (such as homeobox proteins) for proper development (30,31). Thus, haploinsufficiency resulting from deletion of one chromosome 4 homolog could explain this mode of transmission. Alternatively, a dominant-negative mutation could be the result of a chimeric protein produced by deletion of sequences within the 3.2 kb repeat.
The first hypothesis for the 3.2 kb repeat involvement in FSHD suggests that alterations in chromatin structure result from deletions of this repeat. The 3.2 kb repeat lies immediately adjacent to the telomere of 4q and contains heterochromatic sequences. The structural organization of heterochromatic sequences on the short arms of the acrocentric chromosomes may give some insight as to a potential function for the 3.2 kb repeat. While β-satellite and rDNA sequences do not overlap, this study demonstrates that the 3.2 kb repeat is interspersed both within the rDNA gene cluster and β-satellite sequences in these heterochromatic regions of the genome. Thus, the 3.2 kb repeat could function as a spacer between actively transcribed genes such as rDNA and heterochromatin.
Deletions of these heterochromatic repeats may change the chromatin structure in such a way as to affect expression of nearby genes. This hypothesis cannot be examined fully until genes are isolated in the vicinity of the FSHD-associated repeat. The FRG1 gene, which recently has been isolated and maps 100 kb proximal to the 3.2 kb repeat array, does not demonstrate differences in transcription levels between FSHD individuals and controls (24). However, genes in closer proximity to the deleted EcoRI fragment may exhibit such a position effect. In addition, genes with relatively weak promoters may exhibit position effects over even greater genomic distances.
The second hypothesis proposes that sequences within the 3.2 kb repeat may be responsible for localizing the 4q telomeric region within the interphase nucleus. In the human genome, heterochromatic sequences within the 3.2 kb repeat are present on the short arms of the acrocentric chromosomes (the NOR) as well as the subtelomeric region of 4q (15,29). Consequently, we reasoned that the FSHD region might necessitate localization to the nucleolus for proper gene expression. However, when lymphoblasts from FSHD-affected and control individuals were examined for the nuclear sublocalization of 4qter loci, no differences were seen in deleted (FSHD) and control chromosome 4 homologs (Fig. 1, Table 1). Nucleolar localization of the FSHD region can thus be excluded by the data presented in this study. Furthermore, this hypothesis is rendered less likely by the finding that only human and orangutan exhibit synteny between the 3.2 kb repeat and rDNA. If the relationship served some biological purpose it would be expected across all the species studied.
The data presented here most strongly support the third hypothesis, that a portion of the FSHD gene itself resides within the deleted repeat. The 3.2 kb repeat on chromosome 4 contains a 405 nucleotide double homeodomain within a continuous ORF (12,15). This ORF is seen only in the 3.2 kb repeat emanating from the ancestral copy in the FSHD region on chromosome 4, and not from the other regions of the genome in which this repeat is represented (12,16 and S. T. Winokur, unpublished data). Furthermore, analysis of the chromosomal location of this repeat in a wide range of primates reveals that the only consistent location of the 3.2 kb repeat across species is on the distal end of chromosome 4q, the FSHD syntenic region. Thus, the ancestral copy of this repeat contains an ORF and localizes to the syntenic 4q region. Sequence analysis of the double homeodomain from rhesus monkey, in which this region is represented as a predominantly single-copy sequence, reveals a remarkably high degree of conservation with the human 4q sequence throughout the entire 405 nucleotide stretch encoding the double homeodomain. These data support the likelihood of a bona fide homeobox gene residing within the FSHD region.
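The lengths quoted for the double homeodomain can be checked with simple arithmetic. The Python sketch below assumes, as an interpretation not spelled out in the text, that the 405 nucleotides include the 45 nucleotide spacer and that the two homeodomains are of equal length; the function name is hypothetical.

```python
DOUBLE_HOMEODOMAIN_NT = 405  # length quoted in the text
SPACER_NT = 45               # spacer between the two homeodomains, from the text


def homeodomain_layout(total_nt: int, spacer_nt: int) -> dict:
    """Split a double homeodomain into codon counts.

    Assumes the total includes the spacer and both homeodomains are the
    same length; raises if the lengths cannot keep a single reading frame.
    """
    domains_nt = total_nt - spacer_nt
    if domains_nt <= 0 or domains_nt % 2:
        raise ValueError("lengths do not split into two equal homeodomains")
    per_domain_nt = domains_nt // 2
    if total_nt % 3 or spacer_nt % 3 or per_domain_nt % 3:
        raise ValueError("lengths are not whole numbers of codons")
    return {
        "total_codons": total_nt // 3,
        "spacer_codons": spacer_nt // 3,
        "codons_per_homeodomain": per_domain_nt // 3,
    }


print(homeodomain_layout(DOUBLE_HOMEODOMAIN_NT, SPACER_NT))
```

Under these assumptions each homeodomain spans 60 codons, the length of a canonical homeodomain, and every segment is a whole number of codons, which is consistent with the single continuous ORF described above.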
Indeed, the clinical pattern of FSHD is consistent with involvement of a homeobox gene. Developmental patterning of the vertebrate body plan is established in great part by the proper expression of homeobox genes (32). The muscles involved in FSHD are affected predominantly in a rostro-caudal fashion, with marked weakness of the facial, shoulder girdle and upper arm muscles, although peroneal and anterior tibial weakness in the lower limb can often precede more proximal lower limb weakness (33). Other anterior body structures are also often affected: retinal vascular involvement and sensorineural hearing loss are frequently part of the FSHD phenotype (2-4). All of these structures, the musculature of the face and shoulder girdle, the vasculature of the retina and the neuronal innervation of the ear, arise from the same branchial pouch and groove in the developing embryo (34). Asymmetry of muscle involvement and occasional loss of a complete muscle group (such as the pectoralis) are often seen as well (33). This would be the expected result of a suboptimal level of homeobox gene product (i.e. haploinsufficiency) at a critical time period when these structures are forming (30). Finally, although a 95% penetrance level is not seen until after the second decade of life, many infantile and juvenile cases have been documented (5). The explanation for this wide range of age of onset awaits full characterization of the gene responsible for FSHD.
The evolutionary distribution of the 3.2 kb repeat among primates lends support to the involvement of a homeobox gene in FSHD. Sequences related to the 4q double homeodomain are present in all primates examined in this study. These sequences are not significantly conserved below the level of New World monkey (12, and S. T. Winokur, unpublished data). However, the anatomical structures involved in FSHD are those which have evolved along with primates. The facial muscles (particularly the periorbital and perioral muscles) allow for enhanced facial expression and communication, the upper body musculature has adapted to the increased use of the upper body for food gathering, carrying and tool use, and visual acuity has increased along with enhanced hand-eye coordination (35).
The mechanisms behind dispersal of the 3.2 kb repeat in the genomes of primate species remain a puzzle. Only the human and orangutan genomes demonstrate extensive radiation of this repeat, even though the great apes (gorilla and chimpanzee) diverged at a later date than the orangutan. These data would imply that amplification of the 3.2 kb repeat occurred on several independent occasions, one of which was following the divergence of the orangutan and another following the divergence of man. Some insight into the mechanism underlying this 3.2 kb repeat dispersal may be gained through the structural organization studies presented here. Since both the 3.2 kb repeat and β-satellite contain heterochromatic sequences and are interspersed within the genome (Fig. 2), they may associate in the interphase nucleus. Amplification of these repeats may then occur through transposition or recombination.
More important for identifying the gene responsible for FSHD is the consistent presence of the 3.2 kb repeat at the terminus of 4q in all primates studied and the conservation of an ORF throughout the double homeodomain even in lower primates (Fig. 7). This strongly suggests that 4q represents the ancestral copy of this genomic segment. However, isolation of a full-length homeobox gene from the FSHD region presents a formidable technical challenge. First, as is now well documented, the FSHD region contains multiple repetitive sequences dispersed throughout the human genome (8,12,15). In addition, these sequences yield transcribed pseudogenes from loci other than 4q, complicating identification of FSHD region transcripts (12,16). Furthermore, if indeed a bona fide homeobox gene is involved in FSHD, it appears to be restricted to primates. As homeobox gene expression is often temporally and spatially limited in the embryo, the appropriate tissue from which to isolate this gene may be difficult to obtain.
The data presented in this study provide compelling evidence for homeobox gene involvement in FSHD and warrant the challenge of its isolation. Consequently, we have chosen to pursue the identification of the ancestral gene in lower primates. To this end, we have embarked on a strategy to isolate cDNAs from genomic clones spanning the 4qter syntenic region from the rhesus monkey. The high degree of sequence conservation between the FSHD region on 4q and that of the rhesus monkey (Fig. 7) suggests that this may be a worthwhile endeavor.
A Chinese hamster-human somatic cell hybrid, HHW 686, containing chromosomes 5 and 13 as the only human components, was used for the study of the structural organization of the 3.2 kb repeat and rDNA-containing region on the short arm of the acrocentric chromosomes. A lymphoblast cell line, HHW 1430, from an FSHD-affected individual was used for the study of nucleolar localization. One of the chromosome 4 homologs in this cell line has a large deletion of several copies of the 3.2 kb repeat in the EcoRI fragment associated with FSHD. A parallel study was performed on a control lymphoblast cell line, HHW 1949. Several primate cell lines and lymphoblast cultures were used to examine the evolutionary relatedness and chromosomal origins of the 3.2 kb repeat. These include the human lymphoblast cell line HHW 1430 (described above), fibroblast cell lines from the chimpanzee (pygmy chimp, Pan paniscus), orangutan (Pongo pygmaeus) and marmoset (Callithrix jacchus) derived from samples provided by Oliver Ryder at the San Diego Zoo, and lymphoblastoid cell lines from gorilla (Gorilla gorilla rok) (ATCC CRL 1854) and stump-tail monkey (Coriell Cell Repository GM03443). FISH analysis on rhesus monkey (Macaca mulatta) chromosomes was performed on both lymphocytes from a fresh blood sample generously provided to us by Ron Walgenbach at the UC Davis Regional Primate Center and a cell line derived from fetal retinal epithelial cells (ATCC CRL 1783).
Genomic DNA was obtained from human (Homo sapiens) (Promega), a human-hamster hybrid HHW 416 (11) containing an intact chromosome 4 as the only human chromosome in a hamster background, great apes: chimpanzee (P.paniscus), gorilla (G.gorilla rok) and orangutan (P.pygmaeus) (Bios Laboratories), Old World monkeys: African Green (Cercopithecus aethiops) (COS-7 kidney cell line), rhesus (M.mulatta) (Clontech), crab-eating macaque (Macaca fascicularis) (Bios) and stump-tail (ATCC CRL 1854) and a New World monkey (marmoset, Callithrix jacchus) (Bios Laboratories). Ten μg of genomic DNA was digested with either PstI or EcoRI and run on a 0.8% agarose gel in 1× TAE at 110 V for 5 h. DNA was transferred to Nytran nylon membrane and hybridized at 55°C overnight with a 32P random prime-labeled 4q double homeodomain probe (see below). Membranes were washed in 0.1× SSC/0.1% SDS for 1 h and exposed to X-Omat film overnight. The same blots were stripped in 0.1 M NaOH and rehybridized with the 3.2 kb repeat subclone.
Probes used for FISH analysis include a single unit of the 3.2 kb KpnI repeat, subcloned into Bluescript KSII+ from cosmid 119G6 (D4S809) (15). The chromosome 4 origin of the 3.2 kb repeat hybridization signal was verified through co-hybridization experiments with a chromosome 4-specific cosmid, c88F8, from the locus D4S139 (15,36). The rDNA plasmid clone was kindly provided by Mark Leonard. A biotin-labeled β-satellite DNA probe specific for all the human acrocentric chromosomes was used (Oncor). An 18 kb rhesus genomic clone was isolated from a rhesus monkey genomic EMBL library (Clontech) from the FSHD syntenic region of chromosome 4q. Probes for Southern blot analysis include the 3.2 kb repeat and a PCR product encompassing the entire 405 bp double homeodomain using primers I4qhox5 (5'-GGACGGCGACGGAGACTCGTT-3') and 4qhox3b (5'-ACCCTGTCCCGGTGCCTGGCCCTTCG-3'). PCR was performed in 67 mM Tris-HCl, pH 8, 16.6 mM (NH4)2SO4, 6.7 mM MgCl2, 10 mM β-mercaptoethanol, 10% dimethylsulfoxide and 1.25 mM dNTP at 94°C for 20 s, 58°C for 20 s, 72°C for 20 s (30 cycles) with an initial denaturation at 94°C for 3 min and a final extension at 72°C for 10 min.
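For readers reproducing the amplification, the short Python sketch below summarizes the two primers and the programmed cycling time from the conditions stated above. The function names are hypothetical, and the time estimate ignores thermocycler ramping, which depends on the instrument and is not given in the text.

```python
# Primer sequences copied from the text (5' to 3').
PRIMERS = {
    "I4qhox5": "GGACGGCGACGGAGACTCGTT",
    "4qhox3b": "ACCCTGTCCCGGTGCCTGGCCCTTCG",
}


def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer."""
    primer = primer.upper()
    return sum(base in "GC" for base in primer) / len(primer)


def program_seconds(cycles: int = 30,
                    step_seconds: tuple = (20, 20, 20),
                    initial_seconds: int = 3 * 60,
                    final_seconds: int = 10 * 60) -> int:
    """Programmed hold time only: initial denaturation, cycles, final extension."""
    return initial_seconds + cycles * sum(step_seconds) + final_seconds


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC content {gc_fraction(seq):.1%}")
print(f"Programmed time: {program_seconds() / 60:.0f} min (ramping excluded)")
```

Both primers are GC-rich, which may be one reason the reaction includes 10% dimethylsulfoxide, although the text does not state the rationale.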
The cell lines used for in situ hybridization to metaphase and interphase nuclei were harvested using standard cytogenetic procedures. The linearly extended chromatin from the human-hamster hybrids was prepared as previously described (37). A 2 μl cell suspension (2000-4000 cells) in phosphate-buffered saline was placed at one end of a glass slide, air-dried and immediately lysed with 5 μl of a solution of 0.5% SDS/50 mM EDTA/200 mM Tris (pH 7.4). After dissolving for 5 min, the slide was tilted to allow the drop of DNA to run down the slide. The DNA stream was air-dried and fixed to the slide with methanol/acetic acid (3:1) fixative.
Probes for FISH were prepared by nick translation with either biotin-11-dATP (BRL BioNick) or digoxigenin-11-dUTP (Boehringer Mannheim Biochemicals). Hybridization and washes were performed under identical conditions of stringency for all species examined. Slides were hybridized at 37°C overnight with 1-4 ng/μl of each probe, 50% formamide, 10% dextran, 2× SSC and 50 ng/μl Cot-1 DNA to suppress highly repetitive sequence hybridization. Slides were washed twice for 3 min each at 72°C in 0.5× SSC and then placed in 4× SSC. The detection of biotin and digoxigenin was performed with FITC-avidin DCS (Vector Laboratories) and rhodamine anti-digoxigenin Fab fragment (Boehringer Mannheim). Slides were viewed with a Zeiss Axiophot epifluorescence microscope equipped with a double band filter (Zeiss 51006) and a triple band filter (Chroma Tech 61002). The images were captured and digitally enhanced using the Oncor Imaging system and Adobe Photoshop. Primate chromosomes were identified and named according to their human homologs (38,39).
An adult rhesus liver genomic library (Clontech) was screened with the 4q double homeodomain as a probe. Then 10^6 plaque-forming units (p.f.u.) were plated in K803 cells on to NZY plates and lifted on to Dupont NEN nylon filters. Hybridization was carried out at 65°C overnight and filters were washed in 0.5× SSC/0.05% SDS for 1 h. X-Omat film was exposed for 16 h. Secondary p.f.u.s were treated in an identical manner. Large-scale lambda preps (Qiagen) were then performed on isolated clones. Inserts were ligated into pZero (Invitrogen) following digestion with BamHI or XhoI. A rhesus genomic subclone containing the double homeodomain was sequenced with 35S using the standard Sanger dideoxynucleotide sequencing protocol and the Amersham sequencing kit. Primers used for sequencing span the 4q double homeodomain. Sequence comparisons and translation of the 4q and rhesus double homeodomain sequences were performed using MacVector software.
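The comparison and translation steps described above can be illustrated with a minimal Python sketch. This is not the MacVector procedure used in the study; it is a stand-in that assumes two already aligned, equal-length sequences and the standard genetic code. The function names and the two short sequences are placeholders invented for illustration, not the human or rhesus double homeodomain sequences.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate a DNA sequence in frame 1; '*' marks a stop codon."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_open_reading_frame(dna: str) -> bool:
    """True if frame 1 contains no stop codon."""
    return "*" not in translate(dna)


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of matching positions in two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Placeholder sequences for illustration only (not real data).
toy_a = "ATGGCCAAGCGT"
toy_b = "ATGGCTAAGCGT"
print(translate(toy_a), translate(toy_b))
print(f"{percent_identity(toy_a, toy_b):.1f}% nucleotide identity")
print(is_open_reading_frame(toy_a))
```

In the placeholder pair, the single nucleotide difference falls at a third codon position and leaves the translation unchanged, the kind of silent change such a comparison is meant to distinguish from substitutions that alter the protein.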
We would also like to thank members of the FSH Consortium for sharing data and information in advance of publication. The authors wish to acknowledge the Muscular Dystrophy Association for their support of this work.
Similar studies of primates using FISH from the laboratory of Dr Jane Hewitt have recently been accepted for publication in the journal Chromosoma.
Human Molecular Genetics
Copyright Oxford University Press, 1996
