Mixed-model and quasi-likelihood methods in genetic association studies /

Bibliographic Details
Author / Creator: Wang, Miaoyan, author.
Ann Arbor : ProQuest Dissertations & Theses, 2015
Description: 1 electronic resource (121 pages)
Format: E-Resource Dissertations
Local Note:School code: 0330
URL for this record: http://pi.lib.uchicago.edu/1001/cat/bib/10773301
Hidden Bibliographic Details
Other authors / contributors: University of Chicago. degree granting institution.
Notes: Advisors: Mary Sara McPeek. Committee members: Peter McCullagh; Dan Nicolae.
Dissertation Abstracts International, Volume: 77-02(E), Section: B.
Summary: In this dissertation, we develop statistical methods for two problems in genetic association analysis. At the heart of our methods are approaches to addressing relatedness and hidden sample structure. In problem 1, we make use of relatedness as a source of information to gain power. In problem 2, we account for relatedness and sample structure to avoid potential confounding.
Chapter 2 focuses on the selection of an optimal subset of individuals, chosen from a set of pedigrees, to genotype for a genetic association study. We consider samples that include arbitrarily related individuals, with the kinship matrix assumed known. Statistical dependence is an important feature of this type of family data: dependence not only between genotypes and phenotypes, but also among genotypes and among phenotypes. The retrospective version of a linear mixed model provides a natural way to incorporate partial information by making use of this dependence when both phenotype and pedigree information are available for a larger set of individuals from the same pedigree. We propose G-STRATEGY, which uses simulated annealing to maximize the non-centrality parameter of the quasi-score test for association in the retrospective model. G-STRATEGY compares very favorably to existing methods and achieves robustness of power against a wide range of alternative models. It is shown to be computationally feasible and is applied to a large dataset from the Icelandic Heart Association (IHA).
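To illustrate the general idea of choosing a fixed-size subset by simulated annealing, here is a minimal Python sketch. It is not the G-STRATEGY implementation: the function name `anneal_subset`, the swap move, the geometric cooling schedule, and all parameter values are assumptions made for illustration, and `score` is a hypothetical stand-in for the non-centrality parameter, which the abstract does not spell out.

```python
import math
import random

def anneal_subset(n, k, score, steps=5000, t0=1.0, cooling=0.999, seed=0):
    """Search for a size-k subset of range(n) with a large score.

    `score` is a hypothetical objective mapping a set of indices to a
    number; here it only stands in for a power-related criterion.
    Requires 0 < k < n.
    """
    rng = random.Random(seed)
    current = set(rng.sample(range(n), k))
    cur_score = score(current)
    best, best_score = set(current), cur_score
    t = t0
    for _ in range(steps):
        # Propose swapping one selected individual for one unselected one.
        out = rng.choice(sorted(current))
        inn = rng.choice([i for i in range(n) if i not in current])
        cand = (current - {out}) | {inn}
        cand_score = score(cand)
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature t decreases.
        delta = cand_score - cur_score
        if delta >= 0 or rng.random() < math.exp(delta / t):
            current, cur_score = cand, cand_score
            if cur_score > best_score:
                best, best_score = set(current), cur_score
        t *= cooling
    return best, best_score
```

For example, `anneal_subset(12, 4, lambda s: sum(s))` searches for four indices out of twelve under a toy additive score. A realistic objective would depend on the kinship matrix and the phenotypes, so it would not decompose additively over individuals.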
Chapter 3 focuses on joint genetic association mapping studies involving two interactive organisms, e.g., a highly coupled host-pathogen pair. Existing approaches to association mapping typically consider a single organism type and aim to identify genes in the genome that are statistically associated with the phenotype of interest. However, in a host-pathogen interactive system, the response (e.g., infection) often depends on the specific pairing of host and pathogen. In such cases, it is desirable to systematically model the trait and perform association mapping jointly by taking advantage of the genome sequences available for both partners of the system. Our solution is to develop a two-way mixed-effect model with multiple variance components. We use empirical genetic relatedness matrices (GRMs) to account for hidden sample structure, such as population stratification, admixture and/or cryptic relatedness. In addition to the host GRM and the pathogen GRM, we use a G x G interaction kernel matrix, which we define as the Hadamard (i.e., element-wise) product of these two GRMs, to model inter-genome interaction. We propose various quasi-score tests, depending on the specific goal, for assessing genetic association effects of individual SNPs and/or SNP pairs. In the context of an Arabidopsis-Xanthomonas dataset from Joy Bergelson's lab, we consider both the Gaussian and the Binomial-like two-way mixed-effect models within a quasi-likelihood framework. Because many SNP sites are present in only a subset of the sampled Xanthomonas strains, we extend the calculation of the empirical GRM for Xanthomonas to allow for this feature.
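The G x G kernel described above can be sketched in a few lines of Python with NumPy. This is a sketch under stated assumptions, not the dissertation's code: `empirical_grm` uses one common standardized-genotype estimator and simply drops monomorphic SNPs, it does not reproduce the extension for SNP sites present in only some strains, and the names `empirical_grm`, `interaction_kernel`, `host_idx`, and `path_idx` are hypothetical. Each observation is assumed to be one host-pathogen pairing, so both GRMs are first expanded to observation level and then multiplied element-wise.

```python
import numpy as np

def empirical_grm(genotypes):
    """One common empirical GRM (an assumption): Z Z^T / m.

    `genotypes` is an (individuals x SNPs) array with no missing values.
    Monomorphic SNPs are dropped because they cannot be standardized.
    """
    g = np.asarray(genotypes, dtype=float)
    g = g[:, g.std(axis=0) > 0]
    z = (g - g.mean(axis=0)) / g.std(axis=0)
    return z @ z.T / g.shape[1]

def interaction_kernel(k_host, k_path, host_idx, path_idx):
    """Expand both GRMs to observation level and take their Hadamard product.

    host_idx[a] and path_idx[a] give the host and the pathogen strain
    used in observation a (hypothetical encoding of the pairings).
    """
    host_idx = np.asarray(host_idx)
    path_idx = np.asarray(path_idx)
    kh = k_host[np.ix_(host_idx, host_idx)]  # host relatedness per observation pair
    kp = k_path[np.ix_(path_idx, path_idx)]  # pathogen relatedness per observation pair
    return kh, kp, kh * kp                   # kh * kp is the element-wise product
```

Because both expanded GRMs are positive semidefinite, their Hadamard product is too (Schur product theorem), so it can serve as a covariance kernel for an interaction variance component.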