Statistical methods for detecting genotype-phenotype association in the presence of environmental covariates
This thesis shows how statistical methods based on logistic regression models can be used to analyze and interpret biological data. In genome-wide association studies, the aim is to detect association between genetic markers and a given phenotype. This thesis considers a situation where the phenotype is the absence or presence of a common disease, the genetic marker is a biallelic single nucleotide polymorphism (SNP), and environmental covariates are available. The main goal is to study and compare four statistical methods (Score test, Likelihood ratio test, Wald test and Cochran-Armitage test for trend) which, by using different approaches, test the hypothesis about whether there is an association or not between the disease and the genetic marker. The methods are applied to simulated datasets in order to measure their test size and statistical power, and to compare them. Interaction between the genetic marker and the environmental effect is also considered, and strategies for simulating cohort and case-control data with genotype and environmental covariates are studied.

The power simulations show that methods based on logistic regression models are appropriate for detecting genotype-phenotype association, but when the environmental effect is moderate, a simpler method (Cochran-Armitage test for trend), which does not require model fitting at all, is adequate. When an interaction effect is included in the model, the hypothesis testing becomes more complex. Several possible approaches to this problem are discussed.
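To illustrate why the Cochran-Armitage test for trend needs no model fitting, the following is a minimal sketch of the standard textbook form of the statistic for a 2 x 3 table of case and control counts by genotype. It is not taken from the thesis: the function name `cochran_armitage_trend`, the additive genotype scores (0, 1, 2), and the example counts are assumptions chosen for illustration only.

```python
import math


def cochran_armitage_trend(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend test for a 2 x k table of genotype counts.

    cases, controls: counts per genotype (e.g. 0, 1, 2 copies of an allele).
    scores: genotype weights; (0, 1, 2) is an assumed additive coding.
    Returns the chi-square statistic (1 df) and its asymptotic p-value.
    """
    totals = [r + s for r, s in zip(cases, controls)]  # column totals n_i
    n = sum(totals)      # total sample size N
    r_tot = sum(cases)   # total number of cases R
    s_tot = n - r_tot    # total number of controls S

    sum_tr = sum(t * r for t, r in zip(scores, cases))
    sum_tn = sum(t * m for t, m in zip(scores, totals))
    sum_t2n = sum(t * t * m for t, m in zip(scores, totals))

    # Trend statistic T = N * sum(t_i r_i) - R * sum(t_i n_i)
    trend = n * sum_tr - r_tot * sum_tn
    # Var(T) = (R S / N) * (N * sum(t_i^2 n_i) - (sum(t_i n_i))^2)
    spread = n * sum_t2n - sum_tn ** 2
    chi2 = n * trend ** 2 / (r_tot * s_tot * spread)

    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts, for illustration only (not data from the thesis).
chi2, p = cochran_armitage_trend(cases=[10, 20, 30], controls=[30, 20, 10])
print(f"chi2 = {chi2:.3f}, p = {p:.3g}")
```

The statistic is computed directly from the genotype counts, so no iterative estimation is involved. This simplicity has a cost: in this form the test cannot adjust for environmental covariates, which is where the tests based on logistic regression come in.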