Online Program

Friday, February 19
CS07 Emerging Challenges and Methods for Large Databases
Fri, Feb 19, 11:00 AM - 12:30 PM
Diamond I&II

An Introduction to High-Performance Statistical Modeling Procedures in SAS (303138)

*Robert N. Rodriguez, SAS 

Keywords: distributed computing, parallel computing, Hadoop, statistical modeling, large data

Are you using ever-larger volumes of data for predictive modeling? Would parallel computing benefit your work? Are you hearing more about Hadoop? These trends call for statisticians to learn about the infrastructure and software used for large-scale data analysis. SAS has developed a series of high-performance procedures for statistical modeling and model selection, which are available in SAS/STAT® software. On a single machine, these procedures scale by using all of the machine's cores. In a distributed computing environment, they exploit parallel access to the data, along with all the cores and the large amounts of memory that are available. This presentation explains the architectural concepts, statistical capabilities, and practical benefits of these tools.