A parallel computing-based gene-gene interaction detection method with covariate adjustment

by Meng Wang

THESIS 2017

M.Phil. Electronic and Computer Engineering

xi, 51 pages : illustrations ; 30 cm

Abstract

In genome-wide association studies (GWAS), detecting interactions among single nucleotide polymorphism (SNP) pairs and phenotypes is important for revealing the relationship between genotypes and genetic diseases. The most commonly used measure of interaction is the departure from a linear model that describes the statistical relationship between genotypes and phenotypes. Recently, a Boolean operation-based screening and testing (BOOST) method was proposed to detect interactions with log-linear models. Because interaction detection parallelizes naturally, a GPU-based implementation of the BOOST method, named GBOOST, was made available for acceleration. Neither BOOST nor GBOOST takes covariates into consideration in its models, which may lead to inaccurate or even wrong interaction results under some circumstances.

In this thesis, two covariate-adjusted interaction detection tools, BOOST 2.0 and GBOOST 2.0, will be presented. BOOST 2.0 is a CPU multi-threaded version of the covariate-adjusted method, and GBOOST 2.0 is a GPU-based implementation. We will introduce the log-linear models used in the method and the solutions that maximize their log-likelihood. Then the CPU multi-threaded and GPU implementations will be illustrated. BOOST 2.0 and GBOOST 2.0 are both divided into four modules: data loading, screening, testing and results mapping. In the data loading step, genetic data are transformed into a Boolean representation so that we can take advantage of the speed of bitwise operations. In the screening step, two fast approximate models are used to filter out SNP pairs with low interaction values. Screening is the most computationally intensive part, since it exhaustively calculates interaction values for all SNP pairs. In the testing step, we apply an iterative algorithm to calculate interaction values for the small portion of SNP pairs that have passed the screening step. Finally, we map the significantly interacting SNP pairs back to their positions on the corresponding chromosomes.
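The Boolean representation mentioned for the data loading step can be illustrated with a minimal Python sketch. It is not the BOOST 2.0 or GBOOST 2.0 code. It assumes genotypes coded as 0, 1 or 2 and a binary phenotype (0 = control, 1 = case), it ignores covariates, and the function names `encode_snp` and `contingency_table` and the toy data are hypothetical. The point is only that each cell of a genotype-by-genotype contingency table for a SNP pair can be counted with one bitwise AND and one population count.

```python
from itertools import product


def encode_snp(genotypes, phenotypes):
    """Return masks[c][g]: an int whose bit i is set when sample i belongs
    to class c (0 = control, 1 = case) and carries genotype g (0, 1 or 2)."""
    masks = [[0, 0, 0], [0, 0, 0]]
    for i, (g, c) in enumerate(zip(genotypes, phenotypes)):
        masks[c][g] |= 1 << i
    return masks


def contingency_table(masks_a, masks_b):
    """Count samples in every (class, genotype of SNP A, genotype of SNP B)
    cell using one bitwise AND and one population count per cell."""
    table = {}
    for c, ga, gb in product(range(2), range(3), range(3)):
        shared = masks_a[c][ga] & masks_b[c][gb]
        table[(c, ga, gb)] = bin(shared).count("1")
    return table


# Toy data invented for illustration only: six samples, two SNPs.
phenotypes = [0, 0, 0, 1, 1, 1]
snp_a = [0, 1, 2, 0, 1, 1]
snp_b = [0, 1, 1, 2, 1, 1]

table = contingency_table(
    encode_snp(snp_a, phenotypes),
    encode_snp(snp_b, phenotypes),
)
print(table[(1, 1, 1)])  # cases carrying genotype 1 at both SNPs
```

Python integers stand in here for the fixed-width machine words that a compiled CPU or GPU implementation would typically use; the counting logic is the same in both settings.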

The performance of BOOST 2.0 and GBOOST 2.0 will be compared with that of BOOST and GBOOST using simulated data. We will also present the discoveries made on real data with BOOST 2.0 and GBOOST 2.0.

Copyrighted to the author. Reproduction is prohibited without the author’s prior written consent.

Details

Collection: HKUST Electronic Theses
Degree: M.Phil.
Department: Electronic and Computer Engineering
Authors: Wang, Meng
Subjects: Nucleotides Detection Mathematical models Gene expression Parallel algorithms
Language: English
Call number: Thesis ECED 2017 Wang
DOI: 10.14711/thesis-991012536165303412
