606 – Sample Allocation
Constructing Strata of PSUs for the Residential Energy Consumption Survey
Rachel Harter
RTI International
Patrick Chen
RTI International
Joseph McMichael
RTI International
Edgardo Cureg
U.S. Energy Information Administration
Samson Adeshiyan
U.S. Energy Information Administration
Katherine Morton
RTI International
Textbooks provide guidance on the general principles and desirable properties of sampling strata. This paper reviews basic exploratory methods for selecting stratification variables and reducing the set of candidate variables: principal component analysis, cluster analysis, correlation analysis, regression analysis, and decision trees. All stratification methods use auxiliary variables that are available for the entire frame, but the decision tree, regression, and correlation approaches also use prior outcome data, which may be available for only a sample of units. The principal component method combined with cluster analysis, by contrast, focuses on relationships among the stratification variables themselves. Using both principal components with cluster analysis and decision trees, we stratify primary sampling units (PSUs) for the Residential Energy Consumption Survey and compare the resulting strata.
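To illustrate how a decision tree draws on prior outcome data, the following minimal sketch finds the single best split of a one-level regression tree: the auxiliary variable and threshold that minimize the within-stratum sum of squares of a prior outcome. It is not the authors' procedure. The PSU records, the variable names (`hdd`, `pct_urban`, `energy`), and the helper `best_split` are all hypothetical and exist only for this example.

```python
from statistics import fmean


def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, outcome, candidates):
    """Return (variable, threshold, within_sse) for the best single split.

    rows: list of dicts, one per PSU that has prior outcome data.
    outcome: key of the prior outcome variable.
    candidates: keys of auxiliary variables known for every frame unit.
    """
    best = None
    for var in candidates:
        levels = sorted({r[var] for r in rows})
        # Try the midpoint between each pair of consecutive distinct values.
        for lo, hi in zip(levels, levels[1:]):
            cut = (lo + hi) / 2
            left = [r[outcome] for r in rows if r[var] <= cut]
            right = [r[outcome] for r in rows if r[var] > cut]
            total = sse(left) + sse(right)
            if best is None or total < best[2]:
                best = (var, cut, total)
    return best


# Hypothetical PSU records: two auxiliary variables and one prior outcome.
psus = [
    {"hdd": 7000, "pct_urban": 80, "energy": 110.0},
    {"hdd": 6500, "pct_urban": 30, "energy": 105.0},
    {"hdd": 6000, "pct_urban": 60, "energy": 100.0},
    {"hdd": 2500, "pct_urban": 75, "energy": 62.0},
    {"hdd": 2000, "pct_urban": 35, "energy": 58.0},
    {"hdd": 1500, "pct_urban": 55, "energy": 55.0},
]

var, cut, within = best_split(psus, "energy", ["hdd", "pct_urban"])
# Assign each PSU to stratum 1 or 2 according to the chosen split.
strata = [1 if p[var] <= cut else 2 for p in psus]
print(var, cut, strata)
```

Because the split is defined only by an auxiliary variable, the same rule can be applied to every PSU on the frame, including those without prior outcome data. A full tree would apply the search recursively within each resulting group.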