Abstract #301887

This is the preliminary program for the 2003 Joint Statistical Meetings in San Francisco, California. The program currently includes the "technical" program (the schedule of invited, topic contributed, regular contributed, and poster sessions); Continuing Education courses (August 2-5, 2003); and Committee and Business Meetings. This online program will be updated frequently to reflect the most current revisions.

The views expressed here are those of the individual authors
and not necessarily those of the ASA or its board, officers, or staff.





JSM 2003 Abstract #301887
Activity Number: 283
Type: Topic Contributed
Date/Time: Tuesday, August 5, 2003 : 2:00 PM to 3:50 PM
Sponsor: Biopharmaceutical Section
Title: Mining Gene Expression Data and DNA Sequence Database: A Case Study of in Silico Target Identification
Author(s): Nanxiang Ge*+ and Hong Liu
Companies: Aventis and
Address: 2 Tulip Dr., Newtown, PA, 18940-9265,
Keywords: data mining ; logistic regression ; motif ; ROC curve ; mixed-effect model ; genome scan
Abstract:

With the availability of genome sequence data and large amounts of gene expression data, one important question in industry is how to transform this huge amount of data into something tangible. We present a case study on how statistical methods can be used to mine these large databases. Two types of data are considered in this talk: gene expression data and transcription factor binding site patterns in the gene regulatory region. A mixed-effect model was first used to analyze the gene expression data to identify genes that are either regulated or nonregulated. Scores reflecting the appearance of a particular motif in the promoter region of these genes were also generated. A statistical model was then developed relating the regulation to the motif pattern in the promoter region. This model was then applied to scan the whole genome with the goal of identifying more regulated genes. In this case study, the genome scan helped identify 138 putative regulated genes. Downstream validation suggests good predictive power of the model.
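
The abstract does not spell out the form of the statistical model. Since the keywords mention logistic regression and ROC curves, the workflow might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the motif scores are synthetic, a single score per gene is assumed, and the names `fit_logistic`, `auc`, `genome_scores`, and the `cutoff` value are all hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(scores, labels, lr=0.1, epochs=2000):
    """Fit P(regulated) = sigmoid(b0 + b1 * score) by batch gradient ascent."""
    b0, b1 = 0.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(scores, labels):
            err = y - sigmoid(b0 + b1 * x)  # gradient of the log-likelihood
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


def auc(probs, labels):
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Synthetic training set (assumption): genes labeled regulated (1) or
# nonregulated (0), with regulated genes tending to have higher motif scores.
random.seed(0)
train_scores = ([random.gauss(2.0, 1.0) for _ in range(100)]
                + [random.gauss(0.0, 1.0) for _ in range(100)])
train_labels = [1] * 100 + [0] * 100

b0, b1 = fit_logistic(train_scores, train_labels)
train_probs = [sigmoid(b0 + b1 * x) for x in train_scores]
train_auc = auc(train_probs, train_labels)

# "Genome scan" step: score unlabeled genes and flag those whose predicted
# probability of regulation exceeds a chosen cutoff (hypothetical value).
genome_scores = {"gene_%d" % i: random.gauss(0.5, 1.5) for i in range(1000)}
cutoff = 0.9
putative = [g for g, s in genome_scores.items()
            if sigmoid(b0 + b1 * s) >= cutoff]
```

In practice the labels would come from the expression analysis rather than being simulated, and the cutoff would be chosen from the ROC curve to balance sensitivity against false positives.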


  • The address information is for the authors that have a + after their name.
  • Authors who are presenting talks have a * after their name.


JSM 2003 For information, contact meetings@amstat.org or phone (703) 684-1221. If you have questions about the Continuing Education program, please contact the Education Department.
Revised March 2003