Nonparametric Clustering Using Dirichlet Process Mixtures (306583)
*Hend Aljobaily, University of Northern Colorado
Keywords: Bayesian, Nonparametric, Machine Learning, Big Data
Traditional parametric models, which use a fixed and finite number of parameters, are often poorly suited to machine learning. Because the complexity of the model is fixed in advance, they may overfit or underfit the data. The Bayesian nonparametric (BNP) approach is an alternative to the traditional parametric approach. BNP models are probabilistic and data-driven, which makes them appropriate for machine learning. One of the most popular BNP models is the Dirichlet process (DP), which is used as a prior on the space of probability measures. In this study, a real-world example demonstrates the use of Dirichlet process mixtures and nonparametric clustering to analyze a large data set obtained from a U.S. government database.
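To illustrate how a DP prior lets the number of clusters be driven by the data, the following minimal sketch draws mixture weights with the truncated stick-breaking construction and assigns cluster labels from them. It is an illustration under stated assumptions, not the method used in the study: the function name stick_breaking_weights, the concentration value alpha=2.0, the truncation level of 50 sticks, and the sample size of 1000 are all hypothetical choices.

```python
import numpy as np


def stick_breaking_weights(alpha, num_sticks, rng):
    """Truncated stick-breaking weights for a DP with concentration alpha.

    Hypothetical helper for illustration; num_sticks is a truncation level.
    """
    # Each beta_k is the fraction broken off the remaining stick.
    betas = rng.beta(1.0, alpha, size=num_sticks)
    # remaining[k] = prod_{j<k} (1 - beta_j), the stick left before break k.
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas[:-1])))
    return betas * remaining


rng = np.random.default_rng(0)
weights = stick_breaking_weights(alpha=2.0, num_sticks=50, rng=rng)

# Assign 1000 observations to clusters; renormalize because of truncation.
labels = rng.choice(len(weights), size=1000, p=weights / weights.sum())
num_clusters = len(np.unique(labels))
print("occupied clusters:", num_clusters)
```

Although 50 components are available, typically only a small number end up occupied, and a larger alpha tends to spread the observations over more clusters. For fitting an actual DP mixture to data, a library implementation such as scikit-learn's BayesianGaussianMixture with a Dirichlet process weight prior is one option.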