Abstract:
|
Deep learning is a popular machine learning method that has attracted considerable interest in recent years and has benefited almost every aspect of modern big data applications. Most deep learning methods are regarded as black-box procedures, in the sense that their statistical properties remain largely mysterious. In our recent paper (joint with Yingying Fan and Jinchi Lv), we designed a simulation study with a latent subspace structure motivated by image recognition. We empirically demonstrated that the performance of a deep neural network (DNN) is comparable to that of the ideal procedure that knows the true latent subspace information a priori. We also showed that the DNN does not really perform efficient clustering in any of its layers. We provided statistical theory and heuristic arguments to support these empirical discoveries and demonstrated the utility of our theoretical framework in a real data application. In this talk, I will introduce our statistical insights into DNNs.
|