JSM 2011 Online Program

The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.

Abstract Details

Activity Number: 318
Type: Invited
Date/Time: Tuesday, August 2, 2011 : 10:30 AM to 12:20 PM
Sponsor: ENAR
Abstract - #300515
Title: Outcome and Probability Dependent Sampling Designs and Inference
Author(s): Haibo Zhou*+
Companies: The University of North Carolina
Address: Chapel Hill, NC, 27514, United States
Keywords: Outcome dependent sampling; probability dependent sampling

Biomedical studies are often designed to assess the relationship between an exposure X of interest and the corresponding outcome Y of an individual, adjusted for confounding covariates Z. Because exposure ascertainment is costly, a full assessment of X on the whole study cohort is often not feasible. The two-stage stratified sampling design, introduced by Neyman (1938), is often used to enhance efficiency. At the first stage of a typical two-stage design, a relatively large random sample is drawn and measured for Y and Z, while X is ascertained at the second stage for a subsample drawn randomly, without replacement, from the first-stage data. Greater efficiency can be obtained through the two-stage sampling design (e.g., Breslow and Cain, 1988; Breslow et al., 2003; Wang and Zhou, 2010). Another method for improving study efficiency is biased sampling using the outcome-dependent sampling (ODS) scheme. For example, the case-control study (e.g., Anderson, 1972; Prentice and Pyke, 1979) is the best-known such design for binary outcomes, and many subsequent designs have emerged from it. Among others, case-cohort studies were introduced by Prentice (1986) to reduce cost by observing fewer subjects rather than following the whole cohort. Lu and Tsiatis (2006) propose a new way of estimating parameters in the linear transformation model for the case-cohort study. Zheng et al. (2010) describe likelihood-based approaches for combining family-based and population-based case-control data. Schildcrout and Heagerty (2008) describe sampling based on the presence or absence of variation in a binary response series and propose conditional maximum-likelihood analyses. Biased sampling schemes can be a cost-effective way to enhance study efficiency.
In this paper, we propose a new two-stage sampling design, the probability-dependent sampling scheme, in which the second-stage supplemental samples are drawn according to a sampling probability calculated from the first-stage data. The basic idea is to oversample subjects whose X lies in the two tails of its distribution. A semiparametric empirical likelihood inference procedure is proposed, and the asymptotic normality of the proposed estimator is established. Simulation results indicate that the sampling scheme and the proposed estimator are more efficient than the existing outcome-dependent sampling design and simple random sampling designs. We illustrate the proposed method with a data set from an environmental epidemiologic study that assesses the relationship between maternal polychlorinated biphenyl level and children's IQ test performance.
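
The abstract does not spell out how the sampling probability is computed, so the following Python sketch is only one possible reading of the design, not the authors' procedure. It assumes that X is ascertained on a first-stage simple random sample, that a working linear regression of X on Y (ignoring Z for brevity) is fitted to that sample, and that each remaining cohort member is then selected independently with a higher probability when the predicted X falls in either tail. The data model, the function names, and the values of `tail_frac`, `p_tail`, and `p_mid` are all hypothetical.

```python
import random


def simulate_cohort(n, beta=0.5, seed=0):
    """Simulate a hypothetical cohort (illustrative data model, not from the abstract)."""
    rng = random.Random(seed)
    cohort = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                        # exposure X (costly to measure)
        z = rng.gauss(0.0, 1.0)                        # confounding covariate Z
        y = beta * x + 0.3 * z + rng.gauss(0.0, 1.0)   # outcome Y
        cohort.append({"x": x, "z": z, "y": y})
    return cohort


def fit_line(u, v):
    """Least-squares intercept and slope for regressing v on u."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suu = sum((a - mu) ** 2 for a in u)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    slope = suv / suu
    return mv - slope * mu, slope


def two_stage_sample(cohort, n_first, tail_frac=0.2, p_tail=0.4, p_mid=0.1, seed=1):
    """Draw a first-stage random sample, then a tail-enriched second-stage sample."""
    rng = random.Random(seed)
    everyone = range(len(cohort))

    # Stage 1: simple random sample on which X is assumed to be observed.
    first = rng.sample(everyone, n_first)
    first_set = set(first)

    # Working model for X given Y, fitted to first-stage data only (an assumption).
    a, b = fit_line([cohort[i]["y"] for i in first],
                    [cohort[i]["x"] for i in first])

    # Predict X for everyone not yet sampled and locate the two tails.
    rest = [i for i in everyone if i not in first_set]
    pred = {i: a + b * cohort[i]["y"] for i in rest}
    ordered = sorted(pred.values())
    k = int(tail_frac * len(ordered))
    lo, hi = ordered[k], ordered[-k - 1]

    # Stage 2: independent selection with a higher probability in the tails.
    prob = {i: (p_tail if pred[i] <= lo or pred[i] >= hi else p_mid) for i in rest}
    second = [i for i in rest if rng.random() < prob[i]]
    return first, second, prob


if __name__ == "__main__":
    cohort = simulate_cohort(5000)
    first, second, prob = two_stage_sample(cohort, n_first=500)
    print(len(first), len(second))
```

The sketch returns the selection probabilities alongside the sampled indices because any analysis of the second-stage data has to account for the deliberately biased selection. Independent selection makes the second-stage sample size random; a fixed-size draw would need a different selection step.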

The address information is for the authors that have a + after their name.
Authors who are presenting talks have a * after their name.

For information, contact jsm@amstat.org or phone (888) 231-3473.