Abstract:
|
Biological research often generates "big" data (for example, a single gene expression microarray can produce tens of thousands of rows of values), but the data are just numbers until statistics comes into play. Statisticians help biologists uncover the underlying realities hidden in generated or collected data and make comparisons and predictions. Statisticians, however, don't provide all the answers; rather, we help piece together the stories that biologists strive to tell in their quest for answers, while ensuring that the numbers are used and interpreted correctly. In this talk I will introduce a study on finding circulating biomarkers for coronary artery disease and describe how integral statistics was throughout the project. Random forests, linear regression, and logistic regression were all used to analyze a clinical cohort of over 700 patients in the Netherlands. Each data type (including proteomics, gene expression arrays, flow cytometry, and microRNAs), and its analysis, tells a different story about circulating cells related to unstable plaques; together they should also tell a comprehensive story that could help reduce major cardiac events such as heart attack or stroke.
|