Abstract Details

Activity Number: 182 - SPEED: Data Challenge
Type: Contributed
Date/Time: Monday, July 31, 2017 : 10:30 AM to 11:15 AM
Sponsor: Government Statistics Section
Abstract #325166
Title: Income and Expenditure in US Households: a Multivariate Analysis of Consumer Expenditure Fmli161 Dataset
Author(s): Mingzhao Hu*
Companies: University of Wisconsin - Madison
Keywords: Missing Values ; Principal Component Analysis ; Correspondence Analysis ; Cluster of Variables ; Canonical Correlation Analysis

In recent decades, data-driven approaches have been developed to analyze demographic and economic surveys on a large scale. This report applies multivariate analysis techniques to gain insight into the relationship between income and expenditure of American households, using the fmli161 dataset from the Consumer Expenditure Survey conducted by the Bureau of Labor Statistics. Initially, 35 variables are selected from three categories: demographics, income, and expenditure. Missing values and categorical variables are handled first, in a preliminary analysis. On the mathematical side, I propose to evaluate the data and the results for stability and reproducibility. Further interpretations beyond economics show the potential of the dataset. In conclusion, sparse PCA suggests FINCBTXM, FSALARYM, TOTEXPCQ, FOODCQ and HOUSCQ as the five most important of the selected variables, while cluster analysis gives more options depending on the number of clusters needed. CCA reveals a high correlation between income and expenditure for middle-class Americans, while correspondence analysis does not fully support suggestions of rebalancing higher educational rights based on race.
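
For readers unfamiliar with canonical correlation analysis (CCA), the following is a minimal sketch of how canonical correlations between an income block and an expenditure block could be computed. It is not the author's code: the data are synthetic, the column blocks are hypothetical stand-ins rather than actual fmli161 variables, and the QR-plus-SVD route assumes both blocks have full column rank.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations between two blocks of columns.

    Assumes X and Y have the same number of rows and full column rank.
    """
    # Center each block so correlations are measured about the mean.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Thin QR gives an orthonormal basis for each block's column space.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # Singular values of Qx^T Qy are the canonical correlations,
    # returned in descending order.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

# Synthetic illustration only: one shared latent factor drives the first
# column of each block; every other column is independent noise.
rng = np.random.default_rng(0)
n = 500
latent = rng.normal(size=n)
income = np.column_stack([
    latent + 0.5 * rng.normal(size=n),  # hypothetical income measure
    rng.normal(size=n),                 # unrelated income-side column
])
expenditure = np.column_stack([
    latent + 0.5 * rng.normal(size=n),  # hypothetical expenditure measure
    rng.normal(size=n),                 # unrelated expenditure-side columns
    rng.normal(size=n),
])

rho = canonical_correlations(income, expenditure)
print(rho)
```

In this toy setup the first canonical correlation should be large because of the shared latent factor, and the second should be close to zero. With real survey data, missing values and categorical variables would need to be handled before this step.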

Authors who are presenting talks have a * after their name.

Back to the full JSM 2017 program

Copyright © American Statistical Association