Human Molecular Genetics Advance Access originally published online on February 19, 2004

Human Molecular Genetics, 2004, Vol. 13, No. 7 683-691
DOI: 10.1093/hmg/ddh091

A candidate gene association study on preterm delivery: application of high-throughput genotyping technology and advanced statistical methods

Ke Hao1, Xiaobin Wang2, Tianhua Niu1,3, Xin Xu1, Ang Li2, Weili Chang2, Lin Wang1, Guang Li2, Nan Laird4 and Xiping Xu1,*

1Program for Population Genetics, Harvard School of Public Health, Boston, MA, USA, 2Department of Pediatrics, Boston University Medical Center, Boston, MA and the Mary Ann and J. Milburn Smith Child Health Research Program, Children's Memorial Hospital, Chicago, IL, USA, 3Division of Preventive Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA and 4Department of Biostatistics, Harvard School of Public Health, Boston, MA USA

Received November 12, 2003; Accepted February 4, 2004


    ABSTRACT
Preterm delivery (PTD) is the leading cause of infant mortality and morbidity worldwide. The etiology of PTD is largely unknown but is believed to be complex, encompassing multiple genetic and environmental determinants. To date, reports of genetic studies on PTD are sparse. We conducted a large-scale case–control study exploring the associations of 426 single-nucleotide polymorphisms with PTD in 300 mothers with PTD and 458 mothers with term deliveries at the Boston Medical Center. Twenty-five candidate genes were included in the final haplotype analysis, and a significant association of the F5 gene haplotype with PTD was revealed and remained significant after Bonferroni correction for multiple testing (P=0.025). We applied different statistical algorithms (both Gibbs sampling and expectation-maximization) in reconstructing haplotype phases and different tests (both likelihood ratio test and permutation test) in association analyses, and all yielded similar results. We also performed exploratory ethnicity-specific analyses, which confirmed the consistent findings of the F5 gene across the ethnic groups. Moreover, IL1R2 (P=0.002 in Blacks), NOS2A (P<0.001 in Whites) and OPRM1 (P=0.004 in Hispanics) gene haplotypes were associated with PTD in specific ethnic groups but not at the global significance level. In summary, our results underscore the potentially important role of F5 gene variants in the pathogenesis of PTD, and demonstrate the utility of high-throughput genotyping and a haplotype-based approach in dissecting the genetic basis of complex traits.


    INTRODUCTION
Preterm delivery (PTD, <37 weeks of gestation) is the leading cause of neonatal mortality and postnatal morbidity, with an incidence rate as high as 12% in the USA (1). The etiology of PTD remains largely unknown. Most studies have focused on identifying socio-environmental and medical variables for PTD. Although a number of such variables have been identified, most cases of PTD occurring in the general population cannot be explained by such known risk factors (2). Until recently, there have been limited data on whether genetic factors play a role in the pathogenesis of PTD (3). However, it has been increasingly recognized that PTD is a complex trait determined by multiple environmental and genetic factors. The literature provides considerable evidence for a familial or intergenerational influence on both low birthweight (LBW) and PTD (4–6). Human twin studies showed that the heritability was 17–27% for PTD in an Australian population (7) and 25–40% for gestational length in a Swedish population (8). A few genetic association studies of PTD have been reported. In a study of inner city African American women, the T2 allele of the tumor necrosis factor alpha (TNFα) –308 polymorphism (located in the promoter region and associated with an up-regulation of TNFα gene transcription) was associated with an increased risk of preterm premature rupture of the membranes (PPROM) (9). In another case–control study, this mutation was associated with spontaneous preterm birth in Black mothers (10). Also in an African American population, fetal genotype of a matrix metalloproteinase-1 (MMP-1) mutation was found in association with PPROM, indicating a significant genetic influence on the fetal membrane tensile strength (11). In 80 Turkish mothers, the Glu27Gln mutation in the β2 adrenergic receptor (ADRB2) was associated with preterm labor (12). This mutation is known to down-regulate receptor expression and induce receptor internalization.
Furthermore, there was also evidence of gene–environment interaction. A significant interaction was found between metabolic gene polymorphisms (CYP1A1 HincII RFLP and GSTT1 deletion) and low-level exposure to benzene on shortened gestation among healthy Chinese petrochemical female workers (13). These two gene variants were also shown to interact with maternal smoking during pregnancy in reducing infant birthweight and gestational age in an urban US population (14).

Available epidemiological, clinical and laboratory studies suggest that multiple environmental and genetic factors may affect the risk of PTD independently or interactively via five major pathogenic pathways: (1) intrauterine infection/inflammation; (2) maternal–fetal hypothalamic–pituitary adrenal axis activation; (3) uteroplacental vascular pathology; (4) pathologic uterine contraction; and (5) susceptibility to environmental toxins (15). Ultimately, these pathways converge on final clinical presentations characterized as preterm labor, PPROM, or medical induction due to maternal or fetal health threat; all of these conditions may lead to PTD.

Because of the heterogeneous etiology of PTD, it is necessary to study a large number of important candidate genes in order to systematically examine the genetic influences on PTD. To date, all the published genetic studies of PTD examined only one or a few genes in a given population. In this report, we describe a large-scale candidate gene study of PTD in a high-risk multi-ethnic US population.


    RESULTS
Demographic characteristics
Our study included a total of 758 mothers (300 preterm cases and 458 term controls) who delivered a singleton live birth at the Boston Medical Center (BMC). The cases and controls were well separated (Fig. 1) in terms of gestational age (an average of 33.6 weeks among cases versus an average of 39.7 weeks among controls). As shown in Table 1, more than half of the mothers were Black (n=453), followed by Hispanics (n=194) and Whites (n=111). The Black cases on average had a shorter gestational age (33.3 weeks) compared with White (34.5 weeks) or Hispanic (34.1 weeks) cases. The Black preterm infants also had lower birthweight than their White and Hispanic counterparts. On the other hand, Black mothers had a higher pre-pregnant body weight as well as a greater body mass index than White and Hispanic mothers. White mothers were more likely to be married, and had the highest rates of illicit drug use, cigarette smoking as well as alcohol consumption during pregnancy (Table 1).



Figure 1. Distributions of gestational age in preterm cases and term controls.

 

Table 1. Clinical and socio-demographic characteristics of study subjects by ethnicity and case–control status
 
SNP genotyping
We identified 55 candidate genes implicated in five major pathogenic pathways of PTD based on both biological plausibility and the supporting literature. A total of 232 single-nucleotide polymorphisms (SNPs) among the 426 selected potential SNPs were found to be polymorphic in the study subjects. Of these, 97 SNPs with minor allele frequency (MAF) <10% in Blacks, Whites or Hispanics were removed from subsequent statistical analysis. Moreover, 24 SNPs were excluded from the subsequent analysis due to inconsistent genotype calls among the 50 duplicate samples, or due to deviations from Hardy–Weinberg equilibrium at the P<0.001 level. Consequently, a total of 111 SNPs located on 31 candidate genes were available for the genetic association tests.
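The two per-SNP filters described above (MAF and Hardy–Weinberg equilibrium) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, the Hardy–Weinberg check is assumed to be a 1-df Pearson χ² goodness-of-fit test (the text does not say which test was used), and for brevity the same genotype counts are used for both filters, whereas the study checked Hardy–Weinberg equilibrium within controls only.

```python
from math import erfc, sqrt

def minor_allele_freq(n_aa, n_ab, n_bb):
    """Minor allele frequency from genotype counts (AA, AB, BB)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    return min(p, 1 - p)

def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square goodness-of-fit test against
    Hardy-Weinberg proportions (assumed test; not stated in the text)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df.
    return erfc(sqrt(chi2 / 2))

def passes_qc(counts_by_group, maf_min=0.10, hwe_alpha=0.001):
    """Keep a SNP only if, in every group, MAF >= maf_min and the
    Hardy-Weinberg p-value is >= hwe_alpha."""
    for counts in counts_by_group.values():
        if minor_allele_freq(*counts) < maf_min:
            return False
        if hwe_chi2_pvalue(*counts) < hwe_alpha:
            return False
    return True

# SNP bookkeeping reported in the text: 232 polymorphic, 97 + 24 removed.
snps_remaining = 232 - 97 - 24
```

The thresholds (10% and 0.001) are the ones stated in the text; the genotype counts passed to these functions would come from the study data, which are not reproduced here.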

Comparison of self-reported and predicted ethnicity
We assigned a total of 761 study subjects (455 Blacks, 195 Hispanics and 111 Whites) into K=3 populations with genotype data from the 31 loci using the structure algorithm (16). Among the 455 self-reported Blacks, 449 (98.7%) subjects were classified as Black (probability of ‘Black’ population membership, 95.6±6.2%), whereas the remaining six (1.3%) subjects were admixed individuals with the largest population probability being Hispanic/White; among the 195 self-reported Hispanics, 190 (97.4%) subjects were classified as Hispanic (probability of ‘Hispanic’ population membership, 95.3±6.5%), whereas the remaining five (2.6%) subjects were admixed individuals with the largest population probability being Black/White; among the 111 self-reported Whites, 106 (95.5%) subjects were classified as White (probability of ‘White’ population membership, 96.3±5.8%), whereas the remaining five (4.5%) subjects were admixed individuals with the largest population probability being Black/Hispanic. Overall, among the total 761 subjects, the self-reported ethnicity information of 745 (97.9%) of the subjects matched the results of the population classification using the genetic data for the 31 candidate genes. These findings were in agreement with those of Rosenberg et al. (17), who showed that there was a general agreement of genetic and predefined populations, suggesting that self-reported ancestry can facilitate assessments of epidemiological risks. Therefore, self-reported ethnicity groups have accounted for, on average, 95% of the underlying admixtures, and were justified in our analysis.
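The concordance percentages quoted above follow directly from the reported counts. The short sketch below reproduces the arithmetic; the variable and function names are illustrative only.

```python
# self-reported group -> (number self-reported, number classified into the same group)
concordance = {"Black": (455, 449), "Hispanic": (195, 190), "White": (111, 106)}

def percent(part, whole):
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100.0 * part / whole, 1)

per_group = {group: percent(k, n) for group, (n, k) in concordance.items()}
total_n = sum(n for n, _ in concordance.values())
total_match = sum(k for _, k in concordance.values())
overall = percent(total_match, total_n)
```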

Pairwise linkage disequilibrium (LD) analysis
Consistent with the previous reports, the pairwise LDs measured in D' decay rapidly with an increasing physical distance. Using the exponential decay model, we found that the intercept (A), as expected, was very close to 1 across all three ethnic groups, but the decay rate (k) was twice as high in Black (0.038±0.005) as in White (0.018±0.002) and Hispanic mothers (0.018±0.003). This observation was in agreement with previous findings that LD in populations of African ancestry is markedly less extensive compared with other ethnic populations (18).

Haplotype-based association test
A haplotype block was defined as containing at least two SNPs. Among the 31 candidate genes, 25 (80.6%) formed one haplotype block (Table 2) and no gene contained more than one block. Hence, 25 genes were finally included in the haplotype analyses. Because common haplotypes were mostly shared across different ethnicities (19), as also observed in our study (although the frequency of each specific haplotype may differ from ethnicity to ethnicity), the same sets of genetic markers were used for Blacks, Whites and Hispanics, rather than defining different sets of markers for different ethnicities. This also enabled us to conduct the global test, with all populations combined into one group in the analysis. The tests used the entire sample, and found that the factor 5 (F5) gene was associated with PTD with an unadjusted P-value of 0.001 (Table 2), which remained significant after Bonferroni correction for multiple testing (P=0.025). SNPs that did not belong to any haplotype block were examined in single-marker analysis; however, no significant association was observed.


Table 2. Haplotype association of candidate genes with PTD
 
While the direction of association of the F5 haplotypes was similar in the three ethnic populations, the gene effect was more significant among Blacks and Hispanics than among Whites (Table 3). Because it is possible that some genetic variation plays a role only in specific ethnic groups, we also conducted population-stratified analysis on all the other genes. We found suggestive association of interleukin 1 receptor 2 (IL1R2), nitric oxide synthase 2A (NOS2A) and opioid receptor mu 1 (OPRM1) at the 0.005 level in the Black, White and Hispanic groups, respectively (Table 4).


Table 3. Combined (main) and ethnicity-specific (exploratory) haplotype analyses on association of F5 gene with PTD
 

Table 4. Ethnicity-specific (exploratory) analyses on association of IL1R2, NOS2A and OPRM1 genes with PTD
 

    DISCUSSION
Little is known about the genetics of PTD. This study examined a large number of SNPs of important candidate genes identified along the five major pathogenic pathways of PTD based on biological plausibility and available literature. Our study, carried out in a multi-ethnic US population, is the largest genetic association study of PTD to date. Our results showed that the F5 gene was significantly associated with PTD. Our further exploratory ethnicity-specific analysis provided suggestive evidence of associations of IL1R2, NOS2A and OPRM1 gene haplotypes with PTD in the specific ethnic groups, but such results were not globally significant. Owing to the reduced sample size in the stratified analysis, we recognize the limited statistical power to detect association with global significance. However, this exploration may offer valuable suggestions for future research.

While further functional studies are needed to fully understand the underlying biological mechanisms of the observed genetic associations, our findings are biologically plausible. Uteroplacental vasculopathy plays a pivotal role in PTD (20). All nutrients and oxygen required by the fetus for energy production and growth are transported by the placenta. Like other organs, placental vessels are endothelium-lined and their normal function depends on the balance of procoagulant and anticoagulant mechanisms for damage repair as well as the maintenance of blood fluidity. Pregnancy induces marked changes in the coagulation system. An imbalance in the coagulation system may result in placental thrombosis, infarction or hypoperfusion, which in turn leads to adverse pregnancy outcomes, including PTD (21). The risk is further increased among pregnant women who have acquired or genetic risk factors for thrombosis (22). F5 plays an essential role in the regulation of blood coagulation. Its activated form (Va) acts as a protein cofactor in the prothrombin-activating complex and accelerates Xa-catalyzed conversion of prothrombin into thrombin (23–25). F5 polymorphism has been shown to be associated with an increased risk of severe pre-eclampsia and placental infarction (26–28). In the present study, a missense mutation (rs6019) located at the 107th amino acid residue of F5 showed consistent association with PTD among all three ethnic populations. This mutation, resulting in a substitution of Asp with His, may modify the protein function and deserves further investigation. We had insufficient statistical power to examine the effect of the well-documented F5 Leiden mutation (29), owing to its rarity in our study sample (<1%).

In regard to IL1R2, NOS2A and OPRM1, their gene products could be biologically relevant to PTD. Intrauterine infection/inflammation is believed to play an important role in PTD (30). Inflammatory cytokines stimulate cascades of inflammatory responses (31). Among them, the IL1α level was previously shown to be elevated in vaginal secretions among women with bacterial vaginosis (31). The amniotic fluids of patients with PTD and intra-amniotic fluid infections also had detectable IL1α and IL1β (32). Thus, IL1R2 gene variants might play an important role in inflammatory reactions in PTD. In addition, NOS2A is a key regulator of vascular nitric oxide production and is involved in cytokine release and lymphocyte activation, and its molecular variants could be associated with a higher risk of PTD (33). In this study, we found that an OPRM1 variant was associated with PTD. Environmental exposures such as smoking, benzene exposure and illicit drug use have been associated with PTD (13,14,34). The opioid receptor is involved in the experience of pain as well as the response to environmental exposures such as drug use.

Our data support the previous notion that the effect of a SNP may differ significantly among different ethnic populations. We found some differences in haplotype/SNP associations with PTD among Blacks, Whites and Hispanics (shown in Tables 3 and 4). Such differences may be attributable to the observation that certain polymorphisms are present only in a certain ethnic population. Another plausible explanation is that the difference of LD patterns across different ethnic populations can lead to different association results. In other words, the disease-causing allele could be in strong LD with a SNP in one ethnic population, but not in others. Table 4 shows a few examples of different haplotype profiles and disease associations in different ethnic groups. If the SNP marker is truly the disease mutation, we would expect a consistent effect in all populations. As shown in Table 3, the F5 missense mutation (rs6019) may belong to this category. Moreover, the different types and levels of environmental exposures, such as smoking or illicit drug use, may confer differential risks in populations with different genetic susceptibilities.

This study attempted to apply advanced statistical methods to address two common challenges encountered in genetic dissection of complex traits. (1) Multiple testing is almost inevitable in large-scale gene mapping efforts. As mentioned in the Materials and Methods section, haplotype-based association tests offer advantages over conventional single-marker analysis. Compared with the haplotype-based tests, single-marker analyses involved many more tests and yielded less significant results owing to low statistical power. For these reasons, we attempted to detect genetic associations by making use of markers that span each of the candidate genes in the haplotype analyses. Under such circumstances, the haplotype blocks become the basic units of genetic variants of interest in the association analyses. Even with haplotype analysis, multiple testing remains an issue. Correcting this problem in a valid and efficient manner has become a crucial part of these kinds of studies. Considering that the 25 candidate genes (Table 2) are largely independent of each other, we applied the conservative Bonferroni adjustment, and found that F5 was significantly associated with PTD after correction. (2) Population admixture is a known confounding factor for population-based association analysis that may result in inflated type I error. In this study, we tried to reduce this effect by using a within-population permutation procedure in the association analysis. We examined the consistency between the self-reported ethnicity information (Blacks, Whites and Hispanics) in our study and the population classification based on genetic data using a Bayesian clustering method. We found that over 95% of the self-reported data matched the classification based on genetic data, suggesting that self-reported ethnicity is generally adequate for the association analysis. In this study we did not examine possible subpopulations within each ethnicity because of the limited sample size.

Of note, due to the vast number of SNPs deposited in publicly available databases, it has become much more efficient to confirm and use publicly available SNPs than performing de novo SNP discoveries in-house. However, most of these SNPs do not have known biological functions, including those SNPs evaluated in our study. Thus, these SNPs could simply serve as functionally neutral markers for the real disease-causing locus that could be located in the nearby genomic segments (i.e. haplotype blocks) that are inherited together.

In summary, this study underscores the potentially important role of F5 haplotypes in the etiology of PTD. It is imperative to study, either retrospectively or prospectively, other PTD cases and controls to see whether the association found here is real, after correcting for both population stratification and multiple testing. Furthermore, we not only demonstrated the feasibility and power of using high-throughput genotyping, but also showed that haplotype-based strategies may offer significantly greater advantages than single-SNP analyses in identifying genetic susceptibility alleles underlying complex traits. This study is only the beginning of our effort to understand the genetic influences on PTD. How these genes and their expression affect PTD and how they interact with other genes within the same pathway and in different pathways remain to be determined. Finally, it is notable that genetic factors of both the mother and the baby may be involved in PTD, and future studies may need to take into account both mother and baby genotypes in building the statistical and genetic model of PTD.


    MATERIALS AND METHODS
Study population and data collection
This study was part of an ongoing molecular epidemiological study of PTD conducted at the BMC, which serves a multi-ethnic urban population in Boston, USA. Details on the study site, the population and data collection procedures were described previously (14). Briefly, this is a case–control study, where PTD cases were defined as mothers who delivered singleton, live and preterm (<37 weeks of gestation regardless of birthweight) infants; controls were defined as mothers who delivered singleton, live and term (>=37 weeks of gestation) infants with birthweight >2500 g during the same time period. Cases and controls were matched for maternal age and ethnicity. Multiple-gestation pregnancies (e.g. twin births), PTD as a result of maternal trauma and newborns with major birth defects or chromosomal abnormalities were excluded. All eligible mothers were approached postpartum by the research staff for voluntary enrollment into the study. The participation rates were ~90 and 85% among approached cases and controls, respectively. There were no significant differences between study participants and non-participants in regard to gestational age, maternal ethnicity or other sociodemographic characteristics (data not shown). After written informed consent was obtained, a maternal questionnaire interview and a medical record review were conducted by trained research staff to collect relevant epidemiological and clinical information. Maternal venous blood samples were also collected for DNA extraction and subsequent genetic analysis. Institutional Review Boards of the Boston University Medical Center, the Children's Memorial Hospital in Chicago, and the Harvard School of Public Health have approved the study protocol.

Candidate SNP selection
We searched for all potential SNPs within each candidate gene from dbSNP (www.ncbi.nih.gov/dbSNP). For genes with a large number of SNPs, we selected six to 10 SNPs using the following criteria: biological function (non-synonymous SNP), allele frequency (if available), physical distance among SNPs, and assay success rate. Based on the above criteria, a total of 426 SNPs for 55 candidate genes (an average of ~8 SNPs per gene) were selected for the subsequent high-throughput genotyping.

High-throughput genotyping
Genotype data were obtained using the BeadArray™ technology of Illumina Inc., which can achieve a very high throughput and accuracy with reasonably low costs (35,36). In brief, fiber-optic bundles were miniaturized in high-density arrays. Ninety-six of these arrays were held together in a matrix that is compatible with the 96-well microtiter plate format. In the multiplex assays, each SNP was denoted by a unique identifier. These identifiers were matched in real time with a laboratory information management system database to track, store and manage the flow of the genotype data generated by the pipeline. Each genotype call was assigned a GenCall™ score that correlated with the genotyping accuracy. Details on genotyping technology and assay procedures were described elsewhere (35). In addition to the routine quality assurance procedures implemented as a part of high-throughput genotyping by Illumina Inc., SNP data were checked for reproducibility in 50 duplicate samples and for conformity to Hardy–Weinberg equilibrium within controls in each ethnic group.

Population classification of subjects based on genetic data
We applied a clustering method based on a Bayesian model, implemented in the program structure (http://pritch.bsd.uchicago.edu/), to infer population structure and assign individuals to populations (16). The consistency between the predicted and the self-reported ethnicity was determined.

Statistical analysis
The central focus of our analysis is to test the genetic associations of candidate SNPs with PTD and to address significant statistical challenges arising from multiple testing and population admixture. To ensure adequate statistical power, we first determined allele frequencies for all SNPs by maternal ethnicity. Simulation showed that in population-based association studies, SNPs with a MAF greater than 10% are required for 80% power to detect an odds ratio (OR) of 1.5 for a sample size of 1000 (37). Thus, in the subsequent analysis, we removed those SNPs with MAF <10%.

Owing to the relatively large number of SNPs, the statistical power would be low if using conventional single-marker analysis because of the necessity to correct for an inflated type I error due to multiple testing. In addition, single-marker analysis cannot take into account LD information. Therefore, in our study, we first examined pairwise LD on SNPs within each gene to define potential LD blocks, and then conducted haplotype-based association analyses.

Pairwise LD analysis.
We studied pairwise LD between SNPs within each gene. LD was measured using D', which was computed for alleles at pairs of SNP loci using the EM algorithm (38). To estimate the relationship between LD and physical distance, we calculated all the D' values for pairs of SNPs with MAF >10% in the three ethnic groups as a function of their physical distances (Fig. 2). An exponential decay model was fitted to the data, expressing D' as a function of physical distance (d) (39): $D' = A e^{-k \cdot d}$. The two parameters of the model, A and k, were estimated based on the observed data on D' and d for each ethnic group.
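As an illustration of how A and k might be estimated, the sketch below fits the decay model by ordinary least squares on the log scale, ln(D') = ln(A) − k·d. This is an assumption made for simplicity: the text does not state which fitting procedure was used, and a nonlinear least-squares fit on the original scale would weight the points differently. The function name and the synthetic data are hypothetical, and the units of d are not given in the text.

```python
from math import exp, log

def fit_exponential_decay(distances, d_primes):
    """Estimate (A, k) in D' = A * exp(-k * d) by linear least squares
    on ln(D'). Pairs with D' <= 0 are skipped because their log is undefined."""
    pairs = [(d, log(y)) for d, y in zip(distances, d_primes) if y > 0]
    n = len(pairs)
    mean_d = sum(d for d, _ in pairs) / n
    mean_ly = sum(ly for _, ly in pairs) / n
    sxx = sum((d - mean_d) ** 2 for d, _ in pairs)
    sxy = sum((d - mean_d) * (ly - mean_ly) for d, ly in pairs)
    slope = sxy / sxx
    intercept = mean_ly - slope * mean_d
    return exp(intercept), -slope

# Synthetic, noise-free data generated with A = 1 and k = 0.038
# (the decay rate reported for Black mothers), only to exercise the fit.
example_d = [0.0, 10.0, 20.0, 40.0, 80.0]
example_dprime = [exp(-0.038 * d) for d in example_d]
fitted_a, fitted_k = fit_exponential_decay(example_d, example_dprime)
```

On noise-free input the fit recovers the generating parameters; with real D' values, which are noisy and bounded above by 1, the two fitting approaches can give somewhat different estimates.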



Figure 2. In pairwise LD analysis, only intragenic SNP pairs were examined, and the results are presented as black dots in the figure. The LD strength, D', was observed to be strong (D'>0.8) for most pairs and to decay with increasing physical distance. To measure the rate of decay, we fitted the exponential decay model: $D' = A e^{-k \cdot d}$. We found the intercept, A, was very close to 1 for all three ethnic populations, and the decay rate, k, was greater in Blacks (k, 0.038; standard error, 0.005) than in Hispanics (k, 0.018; standard error, 0.003) or Whites (k, 0.018; standard error, 0.002). The fitted curves are presented in the figure.

 
Haplotype-based analyses.
Haplotype-based analyses consisted of three steps: (1) haplotype block partitioning; (2) haplotype phase reconstruction; and (3) haplotype-based association tests. The details of each step were delineated as follows.

(1) Haplotype block partitioning: owing to the block-like structure of LD, long-range haplotypes bearing multiple contiguously spaced SNPs can be divided into discrete blocks of limited haplotype diversity, and the use of haplotype blocks may improve the power of association tests over the use of single SNPs (19,40–42). Haplotype blocks can be defined in various ways, and there is no standard definition to date (41,43). In the current study, we define a ‘haplotype block’ as a set of consecutive SNPs on the same chromosome whose pairwise D' values all exceed a threshold of 0.8. This definition appears to be both conceptually intuitive and statistically stringent; it assumes limited recurrent/backward mutations and/or recombination within each block, but allows for recombination between blocks (43).
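The block definition above can be sketched as a greedy left-to-right scan over a matrix of pairwise D' values. The scan order and the function name are assumptions for illustration; the text gives the definition of a block but not the search procedure used to find it. Blocks of fewer than two SNPs are discarded, in line with the requirement stated in the Results.

```python
def partition_blocks(d_prime, threshold=0.8):
    """Greedy partition of SNPs (ordered along the gene) into haplotype blocks.

    d_prime is a symmetric matrix (list of lists) of pairwise D' values.
    A block is a run of consecutive SNPs in which every pair has
    D' > threshold. Returns inclusive (start, end) index pairs for
    blocks containing at least two SNPs."""
    n = len(d_prime)
    blocks = []
    start = 0
    while start < n:
        end = start
        # Extend the block while the next SNP is in strong LD with
        # every SNP already in the block.
        while end + 1 < n and all(
            d_prime[i][end + 1] > threshold for i in range(start, end + 1)
        ):
            end += 1
        if end > start:
            blocks.append((start, end))
        start = end + 1
    return blocks

# Hypothetical D' values for four SNPs: the first three form one block.
example_matrix = [
    [1.00, 0.95, 0.90, 0.30],
    [0.95, 1.00, 0.85, 0.40],
    [0.90, 0.85, 1.00, 0.50],
    [0.30, 0.40, 0.50, 1.00],
]
example_blocks = partition_blocks(example_matrix)
```

In this example the fourth SNP falls outside the block and would be handled by single-marker analysis, as described in the Results.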

(2) Haplotype phase reconstruction: haplotype-based studies typically require knowledge of phase information about the individuals studied. In our study, we employed the HAPLOTYPER program (44) to reconstruct haplotype phases based on measured genotype data. HAPLOTYPER is based on a stochastic sampling strategy that has been shown to be robust to violation of Hardy–Weinberg equilibrium (HWE) and the presence of missing data (44).

Computational haplotype inference as well as haplotype-based statistical tests using diploid genotype data is a novel and intriguing research field. Whether to phase cases and controls jointly or separately is a debatable issue. In our analysis, we first phased the haplotypes jointly in cases and controls under H0, assuming that cases and controls are genetically homogeneous. When the genetic effect of a single locus is weak, reconstruction of haplotype phases by combining cases and controls is conceivably tolerable. However, in the case of a strong association in regard to the SNPs under investigation, joint phasing could be problematic. On the other hand, phasing in cases and controls separately could result in very unstable test statistics, which deviate from the χ² distribution under H0. Using HAPLOTYPER, we found that phasing jointly or separately among cases or controls did not make significant differences, and we applied the results from separate phasing for the cases and the controls.
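To make the phasing problem concrete, the sketch below estimates two-SNP haplotype frequencies with an expectation-maximization (EM) algorithm and derives D' from them. This is not HAPLOTYPER, which uses a stochastic sampling strategy and handles many SNPs and missing data; it is a textbook two-locus EM under Hardy–Weinberg assumptions, with hypothetical function names and genotypes coded as the count (0, 1 or 2) of one allele at each SNP.

```python
def em_two_snp_haplotypes(genotypes, n_iter=100):
    """Estimate frequencies of haplotypes '00', '01', '10', '11' from
    unphased two-SNP genotypes given as (g1, g2) allele counts."""
    counts = {"00": 0.0, "01": 0.0, "10": 0.0, "11": 0.0}
    n_double_het = 0
    alleles = {0: (0, 0), 1: (0, 1), 2: (1, 1)}
    for g1, g2 in genotypes:
        if g1 == 1 and g2 == 1:
            # Phase is ambiguous: either 00/11 or 01/10.
            n_double_het += 1
            continue
        # At most one locus is heterozygous, so phase is determined.
        for x, y in zip(alleles[g1], alleles[g2]):
            counts[f"{x}{y}"] += 1
    total = 2 * len(genotypes)
    freqs = {h: 0.25 for h in counts}
    for _ in range(n_iter):
        # E-step: probability that a double heterozygote is 00/11.
        cis = freqs["00"] * freqs["11"]
        trans = freqs["01"] * freqs["10"]
        w = cis / (cis + trans) if (cis + trans) > 0 else 0.5
        # M-step: add expected counts from double heterozygotes.
        new = dict(counts)
        new["00"] += w * n_double_het
        new["11"] += w * n_double_het
        new["01"] += (1 - w) * n_double_het
        new["10"] += (1 - w) * n_double_het
        freqs = {h: c / total for h, c in new.items()}
    return freqs

def d_prime(freqs):
    """Normalized LD coefficient |D| / D_max from haplotype frequencies."""
    p = freqs["10"] + freqs["11"]  # allele '1' frequency at SNP 1
    q = freqs["01"] + freqs["11"]  # allele '1' frequency at SNP 2
    d = freqs["11"] - p * q
    if d >= 0:
        d_max = min(p * (1 - q), (1 - p) * q)
    else:
        d_max = min(p * q, (1 - p) * (1 - q))
    return abs(d) / d_max if d_max > 0 else 0.0

# Hypothetical sample: 10 subjects of each of three genotypes.
example_genotypes = [(0, 0)] * 10 + [(2, 2)] * 10 + [(1, 1)] * 10
example_freqs = em_two_snp_haplotypes(example_genotypes)
example_d_prime = d_prime(example_freqs)
```

Running the EM on cases and controls pooled, or on each group separately, corresponds to the joint and separate phasing options discussed above.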

(3) Haplotype-based association tests: for each haplotype block, we used a Monte Carlo approach to test for association with PTD (haplotype frequency differences in cases and controls) in the entire sample. First, for a haplotype block with R distinct haplotypes, we calculated a χ² statistic (45) from a simple R×2 table generated from all the cases and controls. The P-value of the χ² test was denoted as $p_{\mathrm{observed}}$, and was treated as a new test statistic. The null distribution of $p_{\mathrm{observed}}$ was approximated using a permutation procedure. In each iteration of the permutation procedure, the cases and controls within each ethnic population were randomly permuted (i.e. fixing the total numbers of the cases and the controls in each population), and haplotypes were inferred separately for cases and for controls in each population. Then a χ² value was calculated using the entire permuted sample, and its corresponding P-value was designated as $p_{\mathrm{permute}}$. A total of 1000 iterations were run in the current analysis. The significance of the haplotype test was determined by $p = P(p_{\mathrm{observed}} > p_{\mathrm{permute}})$.
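The within-population permutation procedure can be sketched as below. Two simplifications are assumed and should be kept in mind: haplotypes are taken as already phased and are not re-inferred in each permutation, and the χ² statistic itself is used as the test statistic. With phases fixed, the number of haplotype categories R does not change between permutations, so ranking by the χ² statistic is equivalent to ranking by its P-value. The function names and the data layout are hypothetical.

```python
import random
from collections import Counter

def chi2_stat(haplotypes, is_case):
    """Pearson chi-square statistic for the R x 2 table of
    haplotype by case-control status (one entry per chromosome)."""
    table = Counter(zip(haplotypes, is_case))
    hap_totals = Counter(haplotypes)
    n = len(haplotypes)
    n_case = sum(1 for c in is_case if c)
    col_totals = {True: n_case, False: n - n_case}
    stat = 0.0
    for hap, row_total in hap_totals.items():
        for status, col_total in col_totals.items():
            expected = row_total * col_total / n
            if expected > 0:
                stat += (table[(hap, status)] - expected) ** 2 / expected
    return stat

def stratified_permutation_pvalue(subjects, n_perm=1000, seed=0):
    """subjects: list of (stratum, is_case, (hap_a, hap_b)).
    Case-control labels are shuffled among subjects within each stratum,
    so the numbers of cases and controls per stratum stay fixed."""
    rng = random.Random(seed)
    haps = [h for _, _, pair in subjects for h in pair]
    labels = [bool(case) for _, case, _ in subjects]

    def stat_for(subject_labels):
        # Each subject contributes two chromosomes with the same label.
        per_chromosome = [lab for lab in subject_labels for _ in range(2)]
        return chi2_stat(haps, per_chromosome)

    observed = stat_for(labels)
    by_stratum = {}
    for idx, (stratum, _, _) in enumerate(subjects):
        by_stratum.setdefault(stratum, []).append(idx)

    n_extreme = 0
    for _ in range(n_perm):
        permuted = list(labels)
        for idxs in by_stratum.values():
            shuffled = [labels[i] for i in idxs]
            rng.shuffle(shuffled)
            for i, lab in zip(idxs, shuffled):
                permuted[i] = lab
        # ">=" counts ties, which is slightly more conservative than
        # the strict inequality written in the text.
        if stat_for(permuted) >= observed:
            n_extreme += 1
    return n_extreme / n_perm
```

Shuffling labels only within each ethnic group keeps any difference in haplotype frequencies between groups from being mistaken for a case–control difference, which is the purpose of the stratified design described above.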

Multiple testing adjustment.
The Bonferroni correction was used for multiple testing (the 25 candidate genes were treated as 25 independent statistical tests) by multiplying the nominal P-value of each test by 25 (i.e. the number of tests conducted).
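
As a small worked illustration of this adjustment, the sketch below multiplies each nominal P-value by the number of tests. The function name `bonferroni` and the example values are hypothetical, and capping the adjusted value at 1 is a common convention assumed here rather than something stated in the text. For example, a nominal P-value of 0.001 across 25 tests becomes 0.001 × 25 = 0.025.

```python
def bonferroni(p_values, n_tests):
    """Multiply each nominal P-value by n_tests, capping the result at 1.0.

    The cap is an assumed convention so that adjusted values remain valid
    probabilities.
    """
    return [min(1.0, p * n_tests) for p in p_values]


# Hypothetical nominal P-values for three of the 25 gene-level tests.
adjusted = bonferroni([0.001, 0.04, 0.5], n_tests=25)
print(adjusted)
```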


    ACKNOWLEDGEMENTS
 
We thank the Boston University Medical Center preterm study advisory group, Drs Barry Zuckerman, Phillip Stubblefield, Paul Wise, Howard Bauchner, Jerome Klein, Milton Ketochuck, for their support and guidance throughout the study. We thank Dr Wing Hung Wong for his guidance throughout this work and his critical review of the manuscript. We thank Colleen Pearson and Katherine Ortiz for their effort in field data collection. We thank the nursing staff of Labor and Delivery at the Boston Medical Center for their assistance to our study. We thank Ann Ramsey for administrative support, and Lingling Fu and Qin Wang for data entry and management. This study was supported in part by grants 20-FY98-0701 and 20-FY02-56 from the March of Dimes Birth Defects Foundation, USA; R01 HD41702 from the National Institute of Child Health and Human Development; and R01ES11682, R21 ES11666, and ES-00002 from the National Institute of Environmental Health Sciences.


    FOOTNOTES
 
* To whom correspondence should be addressed at: Program for Population Genetics, Harvard School of Public Health, 665 Huntington Avenue FXB101, Boston, MA 02115, USA. Email: xu{at}hsph.harvard.edu


    REFERENCES

  1. Guyer, B., MacDorman, M.F., Martin, J.A., Peters, K.D. and Strobino, D.M. (1998) Annual summary of vital statistics—1997. Pediatrics, 102, 1333–1349.

  2. Kramer, M.S. (1987) Intrauterine growth and gestational duration determinants. Pediatrics, 80, 502–511.

  3. Wildschut, H.I., Lumey, L.H. and Lunt, P.W. (1991) Is preterm delivery genetically determined? Paediatr. Perinat. Epidemiol., 5, 363–372.

  4. Khoury, M.J. and Cohen, B.H. (1987) Genetic heterogeneity of prematurity and intrauterine growth retardation: clues from the Old Order Amish. Am. J. Obstet. Gynecol., 157, 400–410.

  5. Johnstone, F. and Inglis, L. (1974) Familial trends in low birth weight. Br. Med. J., 3, 659–661.

  6. Magnus, P., Bakketeig, L.S. and Skjaerven, R. (1993) Correlations of birth weight and gestational age across generations. Ann. Hum. Biol., 20, 231–238.

  7. Treloar, S.A., Macones, G.A., Mitchell, L.E. and Martin, N.G. (2000) Genetic influences on premature parturition in an Australian twin sample. Twin Res., 3, 80–82.

  8. Clausson, B., Lichtenstein, P. and Cnattingius, S. (2000) Genetic influence on birthweight and gestational length determined by studies in offspring of twins. BJOG, 107, 375–381.

  9. Dizon-Townson, D.S., Major, H., Varner, M. and Ward, K. (1997) A promoter mutation that increases transcription of the tumor necrosis factor-alpha gene is not associated with preterm delivery. Am. J. Obstet. Gynecol., 177, 810–813.

  10. Roberts, A.K., Monzon-Bordonaba, F., Van Deerlin, P.G., Holder, J., Macones, G.A., Morgan, M.A., Strauss, J.F., 3rd and Parry, S. (1999) Association of polymorphism within the promoter of the tumor necrosis factor alpha gene with increased risk of preterm premature rupture of the fetal membranes. Am. J. Obstet. Gynecol., 180, 1297–1302.

  11. Fujimoto, T., Parry, S., Urbanek, M., Sammel, M., Macones, G., Kuivaniemi, H., Romero, R. and Strauss, J.F., 3rd (2002) A single nucleotide polymorphism in the matrix metalloproteinase-1 (MMP-1) promoter influences amnion cell MMP-1 expression and risk for preterm premature rupture of the fetal membranes. J. Biol. Chem., 277, 6296–6302.

  12. Ozkur, M., Dogulu, F., Ozkur, A., Gokmen, B., Inaloz, S.S. and Aynacioglu, A.S. (2002) Association of the Gln27Glu polymorphism of the beta-2-adrenergic receptor with preterm labor. Int. J. Gynaecol. Obstet., 77, 209–215.

  13. Wang, X., Chen, D., Niu, T., Wang, Z., Wang, L., Ryan, L., Smith, T., Christiani, D.C., Zuckerman, B. and Xu, X. (2000) Genetic susceptibility to benzene and shortened gestation: evidence of gene-environment interaction. Am. J. Epidemiol., 152, 693–700.

  14. Wang, X., Zuckerman, B., Pearson, C., Kaufman, G., Chen, C., Wang, G., Niu, T., Wise, P.H., Bauchner, H. and Xu, X. (2002) Maternal cigarette smoking, metabolic gene polymorphism, and infant birth weight. JAMA, 287, 195–202.

  15. Wang, X., Zuckerman, B., Kaufman, G., Wise, P., Hill, M., Niu, T., Ryan, L., Wu, D. and Xu, X. (2001) Molecular epidemiology of preterm delivery: methodology and challenges. Paediatr. Perinat. Epidemiol., 15 (Suppl. 2), 63–77.

  16. Pritchard, J.K., Stephens, M. and Donnelly, P. (2000) Inference of population structure using multilocus genotype data. Genetics, 155, 945–959.

  17. Rosenberg, N.A., Pritchard, J.K., Weber, J.L., Cann, H.M., Kidd, K.K., Zhivotovsky, L.A. and Feldman, M.W. (2002) Genetic structure of human populations. Science, 298, 2381–2385.

  18. Reich, D.E., Cargill, M., Bolk, S., Ireland, J., Sabeti, P.C., Richter, D.J., Lavery, T., Kouyoumjian, R., Farhadian, S.F., Ward, R. and Lander, E.S. (2001) Linkage disequilibrium in the human genome. Nature, 411, 199–204.

  19. Gabriel, S.B., Schaffner, S.F., Nguyen, H., Moore, J.M., Roy, J., Blumenstiel, B., Higgins, J., DeFelice, M., Lochner, A., Faggart, M. et al. (2002) The structure of haplotype blocks in the human genome. Science, 296, 2225–2229.

  20. Goldenberg, R.L. and Rouse, D.J. (1998) Prevention of premature birth. N. Engl. J. Med., 339, 313–320.

  21. Girling, J. and de Swiet, M. (1998) Inherited thrombophilia and pregnancy. Curr. Opin. Obstet. Gynecol., 10, 135–144.

  22. Gerhardt, A., Scharf, R.E., Beckmann, M.W., Struve, S., Bender, H.G., Pillny, M., Sandmann, W. and Zotz, R.B. (2000) Prothrombin and factor V mutations in women with a history of thrombosis during pregnancy and the puerperium. N. Engl. J. Med., 342, 374–380.

  23. Bertina, R.M., Koeleman, B.P., Koster, T., Rosendaal, F.R., Dirven, R.J., de Ronde, H., van der Velden, P.A. and Reitsma, P.H. (1994) Mutation in blood coagulation factor V associated with resistance to activated protein C. Nature, 369, 64–67.

  24. Rosing, J. and Tans, G. (1997) Factor V. Int. J. Biochem. Cell Biol., 29, 1123–1126.

  25. Rosing, J. and Tans, G. (1997) Coagulation factor V: an old star shines again. Thromb. Haemost., 78, 427–433.

  26. Dizon-Townson, D.S., Meline, L., Nelson, L.M., Varner, M. and Ward, K. (1997) Fetal carriers of the factor V Leiden mutation are prone to miscarriage and placental infarction. Am. J. Obstet. Gynecol., 177, 402–405.

  27. Kingdom, J.C. and Kaufmann, P. (1997) Oxygen and placental villous development: origins of fetal hypoxia. Placenta, 18, 613–621; discussion 623–626.

  28. Luzi, G., Caserta, G., Iammarino, G., Clerici, G. and Di Renzo, G.C. (1999) Nitric oxide donors in pregnancy: fetomaternal hemodynamic effects induced in mild pre-eclampsia and threatened preterm labor. Ultrasound Obstet. Gynecol., 14, 101–109.

  29. Kosmas, I.P., Tatsioni, A. and Ioannidis, J.P. (2003) Association of Leiden mutation in Factor V gene with hypertension in pregnancy and pre-eclampsia: a meta-analysis. J. Hypertens., 21, 1221–1228.

  30. Goldenberg, R.L., Hauth, J.C. and Andrews, W.W. (2000) Intrauterine infection and preterm delivery. N. Engl. J. Med., 342, 1500–1507.

  31. Imseis, H.M., Greig, P.C., Livengood, C.H., 3rd, Shunior, E., Durda, P. and Erikson, M. (1997) Characterization of the inflammatory cytokines in the vagina during pregnancy and labor and with bacterial vaginosis. J. Soc. Gynecol. Investig., 4, 90–94.

  32. Romero, R., Mazor, M., Brandt, F., Sepulveda, W., Avila, C., Cotton, D.B. and Dinarello, C.A. (1992) Interleukin-1 alpha and interleukin-1 beta in preterm and term human parturition. Am. J. Reprod. Immunol., 27, 117–123.

  33. Suzuki, Y. (2002) Immunopathogenesis of cerebral toxoplasmosis. J. Infect. Dis., 186 (Suppl. 2), S234–S240.

  34. Vintzileos, A.M., Ananth, C.V., Smulian, J.C., Scorza, W.E. and Knuppel, R.A. (2002) The impact of prenatal care in the United States on preterm births in the presence and absence of antenatal high-risk conditions. Am. J. Obstet. Gynecol., 187, 1254–1257.

  35. Oliphant, A., Barker, D.L., Stuelpnagel, J.R. and Chee, M.S. (2002) BeadArray technology: enabling an accurate, cost-effective approach to high-throughput genotyping. Biotechniques, 32 (Suppl.), 56–58, 60–61.

  36. Walt, D.R. (2000) Techview: molecular biology. Bead-based fiber-optic arrays. Science, 287, 451–452.

  37. Johnson, G.C., Esposito, L., Barratt, B.J., Smith, A.N., Heward, J., Di Genova, G., Ueda, H., Cordell, H.J., Eaves, I.A., Dudbridge, F. et al. (2001) Haplotype tagging for the identification of common disease genes. Nat. Genet., 29, 233–237.

  38. Excoffier, L. and Slatkin, M. (1995) Maximum-likelihood estimation of molecular haplotype frequencies in a diploid population. Mol. Biol. Evol., 12, 921–927.

  39. Abecasis, G.R., Noguchi, E., Heinzmann, A., Traherne, J.A., Bhattacharyya, S., Leaves, N.I., Anderson, G.G., Zhang, Y., Lench, N.J., Carey, A. et al. (2001) Extent and distribution of linkage disequilibrium in three genomic regions. Am. J. Hum. Genet., 68, 191–197.

  40. Patil, N., Berno, A.J., Hinds, D.A., Barrett, W.A., Doshi, J.M., Hacker, C.R., Kautzer, C.R., Lee, D.H., Marjoribanks, C., McDonough, D.P. et al. (2001) Blocks of limited haplotype diversity revealed by high-resolution scanning of human chromosome 21. Science, 294, 1719–1723.

  41. Zhang, K., Deng, M., Chen, T., Waterman, M.S. and Sun, F. (2002) A dynamic programming algorithm for haplotype block partitioning. Proc. Natl Acad. Sci. USA, 99, 7335–7339.

  42. Daly, M.J., Rioux, J.D., Schaffner, S.F., Hudson, T.J. and Lander, E.S. (2001) High-resolution haplotype structure in the human genome. Nat. Genet., 29, 229–232.

  43. Wang, N., Akey, J.M., Zhang, K., Chakraborty, R. and Jin, L. (2002) Distribution of recombination crossovers and the origin of haplotype blocks: the interplay of population history, recombination, and mutation. Am. J. Hum. Genet., 71, 1227–1234.

  44. Niu, T., Qin, Z.S., Xu, X. and Liu, J.S. (2002) Bayesian haplotype inference for multiple linked single-nucleotide polymorphisms. Am. J. Hum. Genet., 70, 157–169.

  45. Humar, B., Graziano, F., Cascinu, S., Catalano, V., Ruzzo, A.M., Magnani, M., Toro, T., Burchill, T., Futschik, M.E., Merriman, T. and Guilford, P. (2002) Association of CDH1 haplotypes with susceptibility to sporadic diffuse gastric cancer. Oncogene, 21, 8192–8195.



