Abstract:
|
This paper describes a method that minimizes the jackknife estimate of bias when evaluating gene expression data. The setting is a case-control study of short-term (cases) and long-term (controls) survivors of colon cancer. The objective is to determine the smallest and most accurate gene set needed to correctly classify cases and controls. First, scan all n chips (n1 cases, n2 controls) and determine which genes have expression > L, a pre-specified level. Place the names of these k genes into the candidate "gene set" G(L). Second, restricting attention to the candidate genes, count the number x (0 <= x <= k) of candidate genes on each chip that have expression level > L. Define the proportion x/k as the decision measure used to classify a chip as case or control. Third, find the best cut-point on x/k for classifying cases and controls. Repeat steps 1 to 3 for different levels of L to obtain all candidate "gene sets". Determine the most accurate candidate "gene sets" (there may be more than one), judged by their classification of cases and controls. Use the jackknife to estimate the bias for each candidate "gene set" and select the one that has the smallest number of genes and the smallest bias.
|
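The procedure in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. All function names and the toy data are hypothetical. The sketch assumes that a gene enters G(L) when its expression exceeds L on at least one chip, that the cut-point search tries both directions of the rule, that the jackknifed statistic is the apparent classification accuracy with steps 1 to 3 re-run on each leave-one-out sample, and that ties among the most accurate gene sets are broken first by gene count and then by absolute bias.

```python
import numpy as np

def candidate_genes(X, L):
    # Step 1 (assumption): a gene is a candidate if its expression
    # exceeds L on at least one chip. X has shape (chips, genes).
    return np.flatnonzero((X > L).any(axis=0))

def decision_measure(X, genes, L):
    # Step 2: proportion x/k of candidate genes above L on each chip.
    return (X[:, genes] > L).mean(axis=1)

def best_cut(p, y):
    # Step 3: search observed values of x/k as cut-points, in both
    # directions (assumption), and keep the most accurate rule.
    best = (0.0, None, None)
    for c in np.unique(p):
        for sign in (1, -1):
            pred = (sign * p >= sign * c).astype(int)
            acc = float((pred == y).mean())
            if acc > best[0]:
                best = (acc, c, sign)
    return best

def apparent_accuracy(X, y, L):
    # Returns (accuracy at the best cut-point, number of candidate genes k).
    genes = candidate_genes(X, L)
    if genes.size == 0:
        return 0.0, 0
    acc, _, _ = best_cut(decision_measure(X, genes, L), y)
    return acc, genes.size

def jackknife_bias(X, y, L):
    # Standard jackknife bias estimate:
    # (n - 1) * (mean of leave-one-out estimates - full-sample estimate).
    n = len(y)
    full, _ = apparent_accuracy(X, y, L)
    loo = [apparent_accuracy(np.delete(X, i, axis=0), np.delete(y, i), L)[0]
           for i in range(n)]
    return (n - 1) * (np.mean(loo) - full)

def select_gene_set(X, y, levels):
    # Repeat steps 1 to 3 over the levels L, keep the most accurate
    # candidates, then prefer fewer genes and smaller |bias| (assumption).
    rows = []
    for L in levels:
        acc, k = apparent_accuracy(X, y, L)
        if k > 0:
            rows.append((L, k, acc, jackknife_bias(X, y, L)))
    if not rows:
        raise ValueError("no level L produced a non-empty gene set")
    top = max(r[2] for r in rows)
    most_accurate = [r for r in rows if r[2] == top]
    return min(most_accurate, key=lambda r: (r[1], abs(r[3])))

# Hypothetical toy data: 3 cases (y = 1) and 3 controls (y = 0), 3 genes.
X_toy = np.array([[10.0, 1.0, 1.0]] * 3 + [[1.0, 1.0, 1.0]] * 3)
y_toy = np.array([1, 1, 1, 0, 0, 0])
L_sel, k_sel, acc_sel, bias_sel = select_gene_set(X_toy, y_toy, [0.5, 5.0])
```

In the toy data only the first gene separates cases from controls, so the higher level L yields a one-gene set. Real chip data would need a finer grid of L values and a decision on how to handle ties in accuracy.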