Abstract:
|
For the first time in history, scientists across all industries have an excess of data at their fingertips. This abundance has made data work such as analytics and modeling widely accessible. Using powerful programming languages and packages, scientists can implement predictive models fairly easily; however, unless properly treated, a surplus of data can drastically hurt the efficiency or, even worse, the accuracy of those models. As a result, dimensionality reduction has emerged as a means of optimizing the predictive power of a given dataset. This paper analyzes the theory behind five commercial dimensionality reduction algorithms, then tests the techniques against a realistic dataset to evaluate their efficacy in industrial modeling.
|