Large-scale biobanks, such as the UK Biobank, have emerged as a powerful resource for complex disease studies and precision medicine. Genomic information coupled with clinical, behavioral, and environmental measurements enables the discovery of novel genetic associations and disease mechanisms across the entire phenome. However, the scale and complex structure of biobank data remain substantial challenges. In this talk, I will first introduce a new statistical method that can analyze 500,000 samples for binary phenotypes while adjusting for family relatedness and case-control imbalance. This method, called SAIGE, applies the saddlepoint approximation to adjust for case-control imbalance on top of a recently developed generalized linear mixed model method. In addition, it uses optimization techniques to make the analysis of large-sample data feasible. I will also introduce our more recent efforts, including rare variant tests and gene-environment interaction analysis for biobank-scale data.