Analysis of big data demands computer-aided or even automated model building. Such data are extremely difficult to analyze with traditional statistical models and model-building methods. Deep learning has proved successful on a variety of challenging problems, such as playing Go (AlphaGo), driverless cars, and image classification. Theoretical understanding of deep learning, however, remains limited, which hinders its further development. In this talk, we study the capacity and generalization properties of deep neural networks (DNNs) under different scenarios of weight normalization. If time permits, we will also discuss how to use DNNs for nonlinear variable selection.