Abstract:
|
This talk showcases my group's statistical developments in the analysis of big genomic data, with the goals of identifying disease susceptibility genes and predicting treatment response. I start with data examples from genome-wide association and pharmacogenomics studies, which contain a large number of variables, including a variety of disease phenotypes, clinical features, and whole-genome genetic variants. I then introduce our developments on significance tests for screening whole-genome data, on assessing confounding effects, and on a dimension reduction approach. Specifically, our new significance tests are based on the joint effect of multiple variables or of a group of variables. For one of these tests, we establish the asymptotic null distribution and power. We also show that another genome screening method, which is based on groups of variables, possesses the sure screening property. Simulations and real genome data are used to demonstrate our developments and to compare them with other methods in the literature.
|