Activity Number: 80
Type: Other
Date/Time: Monday, July 30, 2007 : 8:30 AM to 10:20 AM
Sponsor: ASA

Abstract - #308191

Title: Applications of Statistical Machine-Learning to Modern Biological Datasets
Author(s): Jon D. McAuliffe*+
Companies: University of Pennsylvania
Address: 400 Huntsman Hall, Philadelphia, PA, 19104

Abstract:
High-throughput experimentation is now a routine part of research in biology. The corresponding need to process and analyze large, complex biological datasets has given rise to the specialized field of bioinformatics. I will give some examples of the statistical and computational issues that arise in bioinformatics analyses and show how machine-learning methods have been used, with some success, to address them. First I will describe the "functional annotation" problem: how to determine what biological role is played by different parts of an organism's genome. Comparison with the genomes of related organisms can help considerably; I will explain the graphical model formalism and show how it is relevant to this comparison. Then I will describe the notion of "heterogeneous data integration" using support vector machines, with an application to discriminating among different classes of proteins in yeast. No background in biology will be assumed.
- The address information is for the authors that have a + after their name.
- Authors who are presenting talks have a * after their name.
Back to the full JSM 2007 program