Abstract:
|
Many survey data sets combine data collected from more than one sample of the same population. In many instances, there is no obvious best way to combine the data from the multiple samples. In the National Longitudinal Survey of Youth 1997 (NLSY97), a cross-sectional sample was designed to represent the various segments of the eligible population in their proper population proportions, and a supplemental sample was designed to produce, in the most statistically efficient way, the required oversamples of Hispanic and non-Hispanic black youths.
We consider two competing approaches. In one approach, the weights are determined within each sample and then adjusted to allow combination of the samples. In the alternative approach, the weights are determined across samples according to each element's overall probability of selection, giving a single unified set of weights for the cumulated cases. A paper presented at JSM in 2000 (O'Muircheartaigh and Pedlow, 2000) gave an introduction and preliminary theoretical comparison of the two approaches. This paper follows up by using data from the first four rounds of NLSY97 to illustrate the principles involved and to test the effectiveness of the two approaches.
|
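As an illustration of the distinction drawn in the abstract, the two weighting approaches might be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure used for NLSY97: the function names, the compositing factor `share_cross`, and the assumption that the two samples are drawn independently (so that the overall selection probability is p1 + p2 - p1*p2) are all hypothetical choices made for the example.

```python
def within_sample_weight(p_cross, p_supp, in_cross, share_cross=0.5):
    """Approach 1 (sketch): weight within the case's own sample, then adjust.

    p_cross, p_supp: selection probabilities of the element in the
    cross-sectional and supplemental samples (p_supp is 0 for elements
    not eligible for the supplement).
    in_cross: True if the case was selected through the cross-sectional sample.
    share_cross: hypothetical compositing factor giving the fraction of the
    overlap domain's weight carried by the cross-sectional sample.
    """
    if p_supp == 0:
        # Element can only enter through the cross-sectional sample,
        # so no adjustment for combination is needed.
        return 1.0 / p_cross
    if in_cross:
        return share_cross / p_cross
    return (1.0 - share_cross) / p_supp


def combined_weight(p_cross, p_supp):
    """Approach 2 (sketch): inverse of the overall selection probability.

    Assumes the two samples are selected independently, so the chance of
    being selected in at least one of them is p1 + p2 - p1 * p2.
    """
    p_any = p_cross + p_supp - p_cross * p_supp
    return 1.0 / p_any


if __name__ == "__main__":
    # Hypothetical element eligible for both samples.
    p1, p2 = 0.01, 0.04
    print(within_sample_weight(p1, p2, in_cross=True))
    print(within_sample_weight(p1, p2, in_cross=False))
    print(combined_weight(p1, p2))
```

Under the first approach, two cases with identical characteristics can carry different weights depending on which sample they came from; under the second, every case with the same overall selection probability receives the same weight regardless of its sample of origin.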