All Times EDT

Abstract Details

Activity Number: 210 - Contributed Poster Presentations: Survey Research Methods Section
Type: Contributed
Date/Time: Tuesday, August 4, 2020 : 10:00 AM to 2:00 PM
Sponsor: Survey Research Methods Section
Abstract #313574
Title: Regression-Based Oversampling for Mutual Fund Owners
Author(s): Abhi Jain* and Meimeizi Zhu
Companies: NORC at the University of Chicago and NORC at the University of Chicago
Keywords: sampling; logistic; regression; survey
Abstract:

To support research on specific subgroups, survey statisticians use oversampling to select individuals in the subgroups of interest at disproportionately high rates. Oversampling increases the subgroup's sample size and thus yields more precise subgroup estimates. However, it is often unknown whether a given individual belongs to the subgroup of interest. In this case, we can fit a logistic regression model to estimate each individual's probability of group membership and use the estimated probabilities to oversample. In this paper, we present a case study in which we apply regression-based probabilistic sampling to oversample mutual fund owners. Using AmeriSpeak data, we develop the model by regressing mutual fund ownership on demographic characteristics that are predictive of ownership, including age, gender, income, education, and race, among others. This paper documents the specification and accuracy of the model and the sampling results. After oversampling based on the estimated probabilities, we achieved the desired proportion of mutual fund owners in the sample.
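
The general idea described above can be illustrated with a minimal sketch. It is not the authors' model or sampling design: the data are simulated, the two covariates (standardized age and income), the coefficients, the sample sizes, and all function names are hypothetical, and the choice of Poisson sampling with inclusion probabilities proportional to the estimated ownership probability is an assumption made only to keep the example short. The sketch fits a logistic regression on cases with known ownership, scores a frame where ownership is treated as unknown, and compares the owner share expected under the oversampling design with the share in the frame.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(beta, x):
    # Estimated probability of ownership for one covariate vector.
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))

def fit_logistic(X, y, lr=0.5, epochs=300):
    # Batch gradient ascent on the mean log-likelihood.
    # Returns [intercept, coefficient_1, ..., coefficient_k].
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for x, yi in zip(X, y):
            err = yi - predict(beta, x)
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def inclusion_probs(p_hat, n_sample):
    # Inclusion probabilities proportional to p_hat, capped at 1.
    total = sum(p_hat)
    return [min(1.0, n_sample * p / total) for p in p_hat]

rng = random.Random(2020)

def simulate(n):
    # Hypothetical data: ownership depends on standardized age and income.
    X, y = [], []
    for _ in range(n):
        age, income = rng.gauss(0, 1), rng.gauss(0, 1)
        p = sigmoid(-1.0 + 0.8 * age + 1.2 * income)
        X.append([age, income])
        y.append(1 if rng.random() < p else 0)
    return X, y

X_train, y_train = simulate(2000)  # cases with known ownership
X_frame, y_frame = simulate(3000)  # y_frame is unknown in practice; kept only for checking

beta = fit_logistic(X_train, y_train)
p_hat = [predict(beta, x) for x in X_frame]
pi = inclusion_probs(p_hat, n_sample=300)

# Poisson sampling: each unit is selected independently with probability pi[i].
sample = [i for i, p in enumerate(pi) if rng.random() < p]
# Design weights (inverse inclusion probabilities) undo the oversampling in estimation.
weights = {i: 1.0 / pi[i] for i in sample}

frame_share = sum(y_frame) / len(y_frame)
# Owner share expected under the design, computable here because y_frame was simulated.
expected_share = sum(p * yi for p, yi in zip(pi, y_frame)) / sum(pi)

print(f"owner share in frame:           {frame_share:.3f}")
print(f"expected owner share in sample: {expected_share:.3f}")
print(f"realized sample size:           {len(sample)}")
```

Because units are selected with unequal probabilities, any estimate for the full population would need the design weights, which is why the sketch keeps them alongside the selected units.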


Authors who are presenting talks have a * after their name.

Back to the full JSM 2020 program