Abstract:
|
As data mining has matured, professionals outside the field of data science increasingly need to extract insights from their own data. We develop a framework that helps these professionals make key methodological decisions so as to maximize return. The framework is designed to find the optimal classification method among standard tools, including Decision Tree, Naïve Bayes, Logistic Regression, Random Forest, Boosting, Bagging, Neural Networks, and Support Vector Machine. For a given data set, we optimize the choice of method based on characteristics of the data set, including its format, size, and content. We present results of tests conducted with our framework on several real-world data sets.
|