Online Program


Thursday, February 23
Registration Thu, Feb 23, 7:00 AM - 6:30 PM

SC1 Art and Practice of Classification and Regression Trees Thu, Feb 23, 8:00 AM - 5:30 PM
River Terrace 2
Instructor(s): Wei-Yin Loh, University of Wisconsin
It is more than 50 years since the first regression tree algorithm (AID, Morgan and Sonquist 1963) appeared. Rapidly increasing use of tree models among practitioners has stimulated many algorithmic advances over the last two decades. Modern tree models have higher prediction accuracy, increased computational speed, and negligible variable selection bias. They can fit linear models in the nodes using GLM, quantile, and other loss functions; response variables may be multivariate, longitudinal, or censored; and classification trees can employ linear splits and fit kernel and nearest-neighbor node models. The aims of the course are: (i) to briefly review the capabilities of the state-of-the-art methods and (ii) to show how to exploit free software to analyze data from initial data exploration to a final interpretable prediction model. Example applications include subgroup identification for precision medicine, missing value imputation, and propensity score estimation in sample surveys.

SC2 Becoming a Student of Leadership in Statistics Thu, Feb 23, 8:00 AM - 12:00 PM
City Terrace 7
Instructor(s): Matthew Gurka, University of Florida; Robert Rodriguez, SAS Institute Inc.; Gary R. Sullivan, Eli Lilly & Company
What is leadership? Much has been written and discussed within the statistics profession in the last few years on the topic and its importance in advancing our profession. This course will provide an introductory understanding of leadership as well as initial direction for statisticians who wish to develop as leaders. It will feature a leader in the statistics profession speaking on their personal journey as well as providing guidance on personal leadership development. You will also be introduced to some important leadership competencies - including influence, business acumen, and communication - and will begin to draft a plan for (1) developing your own leadership or (2) addressing a leadership challenge in your work. Finally, you will spend time reflecting on leadership learnings and networking with other statisticians and practitioners.

SC3 Peering into the Future: Introduction to Time Series Methods for Forecasting Thu, Feb 23, 8:00 AM - 12:00 PM
City Terrace 9
Instructor(s): Dave Dickey, North Carolina State University
This workshop will provide a practical guide to time series analysis and forecasting, focusing on examples and applications in modern software. Students will learn how to recognize autocorrelation when they see it and how to incorporate autocorrelation into their modeling. Models in the ARIMA class and their identification, fitting, and diagnostic testing will be emphasized and extended to models with deterministic trend functions (inputs) and ARMA errors. Diagnosing stationarity, a critical feature for proper analysis, will be demonstrated. After the course, students should be able to identify, fit, and forecast with this class of time series models and be aware of the consequences of having autocorrelated data. They should be able to recognize nonstationary cases in which the differences in the data, rather than the levels, should be analyzed. Underlying ideas and interpretation of output, rather than code, will be emphasized. No previous experience with any particular software is needed. Examples will be computed in SAS, but most modern statistical packages such as SPSS, R, STATA, etc. can be used for time series analysis.
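
As a rough illustration of the point about analyzing differences rather than levels, the sketch below simulates a random walk (a standard nonstationary series) and compares the lag-1 sample autocorrelation of the levels with that of the first differences. It is not taken from the course, which uses SAS; the helper names `autocorr` and `difference` and the simulated data are assumptions made for this example.

```python
import random

def autocorr(x, lag):
    """Sample autocorrelation of the series x at the given lag (lag >= 1)."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return num / denom

def difference(x):
    """First differences x[t] - x[t-1]."""
    return [b - a for a, b in zip(x, x[1:])]

random.seed(0)
# Simulated random walk: each level is the previous level plus white noise.
levels = []
total = 0.0
for _ in range(500):
    total += random.gauss(0, 1)
    levels.append(total)

r_levels = autocorr(levels, 1)
r_diffs = autocorr(difference(levels), 1)
print(f"lag-1 autocorrelation, levels:      {r_levels:.2f}")
print(f"lag-1 autocorrelation, differences: {r_diffs:.2f}")
```

The levels show autocorrelation close to 1, while the differenced series behaves like uncorrelated noise, which is the pattern that suggests modeling the differences.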

SC4 Producing High-Quality Figures in SAS to Meet Publication Requirement, with Practical Examples Thu, Feb 23, 8:00 AM - 12:00 PM
City Terrace 12
Instructor(s): Charlie Chunhua Liu, Allergan PLC
This half-day short course will cover publication requirements for high-quality figures, discuss principles for producing high-quality figures in SAS, and demonstrate the use of both SAS/GRAPH and ODS Graphics procedures to produce some commonly used types of figures (line plots, scatter plots, bee swarm plots, box plots, box plots overlaid with bee swarm plots, etc.).

The instructor will also demonstrate how to produce these figures in listing formats (EMF, EPS, etc.) and document formats (RTF, PDF, etc.).

SC5 Linear Mixed Models Through Health Sciences Applications Thu, Feb 23, 8:00 AM - 12:00 PM
River Terrace 3
Instructor(s): Constantine Daskalakis, Thomas Jefferson University
This course will focus on the heuristic understanding of linear mixed models and their implementation (including assessment of assumptions and model fit, and interpretation of results), rather than formal statistical theory. The following general topics will be covered:

a. Specification and interpretation of the fixed effects (population-averaged/mean) model.
b. Specification and interpretation of the random effects and their covariance structure (subject-specific effects).
c. Considerations regarding the error structure.
d. Statistical and graphical methods of assessment of (a), (b), and (c), and model selection strategies.
e. Determination, estimation, and testing of linear combinations/contrasts of coefficients to address scientific objectives.
f. Writing brief summaries of the results for non-statistical audiences.

These topics will be addressed through the analysis of data from two studies: (1) a school-based intervention program designed to impact students’ body mass index (BMI); and (2) an animal xenograft experiment designed to assess the effects of a drug and of radiotherapy on tumor growth.

SC6 Text Analytics and Its Applications Thu, Feb 23, 1:30 PM - 5:30 PM
City Terrace 7
Instructor(s): Edward Jones, Texas A&M University
Text analytics refers to the process of deriving actionable insights from text data. This half-day course explores the evolution and creative application of text analytics to solving business problems. Emphasis is placed on how text analytics is used for solving typical forecasting and classification problems by integrating structured and unstructured text data. Solutions are illustrated using SAS Text Miner, R and Python with real world applications in finance and social media.

SC7 Expressing Yourself with R Thu, Feb 23, 1:30 PM - 5:30 PM
City Terrace 9
Instructor(s): Hadley Wickham, RStudio
In this mini-workshop you'll learn how to better express yourself in R. To express yourself clearly in R you need to know how to write high quality functions and how to use a little functional programming (FP) to solve common programming challenges. You'll learn:

* The three key properties of a function.
* A proven strategy for writing new functions.
* How to use functions to reduce duplication in your code.
* How `lapply()` works and why it's so important.
* A handful of FP tools that increase the clarity of your code.

This workshop is suitable for beginning and intermediate R users. You need to know the basics of R (like importing your data and executing basic instructions). If you're an advanced R user, you probably won't learn anything completely new, but you will learn techniques that allow you to solve new challenges with greater ease.

The workshop will be hands-on and interactive, so please make sure to bring along your laptop with R installed!

SC8 Missing Data Analysis with R/SAS/Stata Thu, Feb 23, 1:30 PM - 5:30 PM
City Terrace 12
Instructor(s): Din Chen, The University of North Carolina at Chapel Hill; Frank Liu, Merck Research Labs
Missing data are nearly universal in applied research. Almost all applied researchers have faced the problem of missing data at some point. However, not all researchers assess missingness or use appropriate ways to deal with the missing data. Instead, researchers often drop the missing values (e.g., listwise deletion), which reduces the sample size and lowers statistical power, or use ad hoc single imputation such as LOCF for simplicity. Both approaches introduce the possibility of biased parameter estimates. Such inefficient and potentially biased statistical inference can lead to erroneous research conclusions.

This short course aims to address the problems of missing data. The concept of different missing data mechanisms or typologies, including missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR), will be discussed and illustrated with real clinical trial examples. Moreover, this short course will introduce how to conduct two commonly used model-based methods for missing data analysis, multiple imputation (Little & Rubin, 2002; Reiter & Raghunathan, 2007) and maximum likelihood (Allison, 2012), using R/SAS. Some sensitivity analysis approaches to handle missing data under MNAR will also be discussed briefly.
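
To make the two ad hoc approaches named above concrete, here is a minimal sketch of listwise deletion and last observation carried forward (LOCF) on a tiny, made-up longitudinal data set. It only illustrates why these shortcuts can mislead; it does not implement multiple imputation or maximum likelihood, and the data and function names are hypothetical.

```python
def listwise_delete(rows):
    """Keep only subjects with no missing (None) values."""
    return [r for r in rows if all(v is not None for v in r)]

def locf(row):
    """Fill each missing value with the subject's most recent observed value."""
    filled, last = [], None
    for v in row:
        if v is not None:
            last = v
        filled.append(last)
    return filled

def mean_last_visit(rows):
    """Mean outcome at the final visit."""
    vals = [r[-1] for r in rows]
    return sum(vals) / len(vals)

# Hypothetical data: each row is one subject's outcome at visits 1 to 4.
visits = [
    [10.0, 9.5, 9.0, 8.5],
    [12.0, 11.0, None, None],  # dropped out after visit 2
    [11.0, None, 10.0, 9.0],   # one intermittent missing value
    [13.0, 12.5, 12.0, None],  # dropped out after visit 3
]

complete = listwise_delete(visits)
filled = [locf(r) for r in visits]

print(len(complete), "of", len(visits), "subjects remain after listwise deletion")
print("mean at visit 4, complete cases:", mean_last_visit(complete))
print("mean at visit 4, LOCF:", mean_last_visit(filled))
```

Listwise deletion keeps a single subject here, and the two approaches give quite different visit-4 means, so the conclusion depends heavily on an untested assumption about why values are missing.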

SC9 Bootstrap Methods and Permutation Tests Thu, Feb 23, 1:30 PM - 5:30 PM
River Terrace 3
Instructor(s): Tim Hesterberg, Google
We begin with a graphical approach to bootstrapping and permutation testing, illuminating basic statistical concepts of standard errors, confidence intervals, p-values and significance tests.

We consider a variety of statistics (mean, trimmed mean, regression, etc.), and a number of sampling situations (one-sample, two-sample, stratified, finite-population), stressing the common techniques that apply in these situations. We'll look at applications from a variety of fields, including telecommunications, finance, and biopharm.

These methods let us do confidence intervals and hypothesis tests when formulas are not available. This lets us do better statistics, e.g. use robust methods (we can use a median or trimmed mean instead of a mean, for example). They can help clients understand statistical variability. And some of the methods are more accurate than standard methods.
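
The following sketch shows one simple version of each idea: a percentile bootstrap confidence interval for a median, and a two-sample permutation test for a difference in means. It is an illustration under assumptions, not the instructor's material or the R resample package; the data are made up, and the percentile interval is only one of several bootstrap intervals.

```python
import random
import statistics

def bootstrap_ci(data, stat, n_boot=2000, alpha=0.05, rng=None):
    """Percentile bootstrap confidence interval for stat(data)."""
    rng = rng or random.Random()
    n = len(data)
    # Resample with replacement, recompute the statistic, and sort the replicates.
    reps = sorted(stat([rng.choice(data) for _ in range(n)]) for _ in range(n_boot))
    lo = reps[round((alpha / 2) * n_boot)]
    hi = reps[round((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

def permutation_test(a, b, n_perm=2000, rng=None):
    """Two-sided permutation p-value for a difference in means."""
    rng = rng or random.Random()
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabeling of the two groups
        diff = abs(statistics.mean(pooled[:len(a)]) - statistics.mean(pooled[len(a):]))
        if diff >= observed:
            count += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return (count + 1) / (n_perm + 1)

# Hypothetical measurements for two groups.
a = [12.1, 14.3, 11.8, 15.2, 13.7, 12.9, 14.8, 13.1]
b = [10.2, 11.5, 9.8, 12.0, 10.9, 11.1, 10.4, 11.7]

rng = random.Random(1)
lo, hi = bootstrap_ci(a, statistics.median, rng=rng)
p = permutation_test(a, b, rng=rng)
print(f"95% percentile bootstrap CI for the median of a: ({lo}, {hi})")
print(f"permutation p-value for the difference in means: {p:.4f}")
```

Because `stat` is passed in as a function, the same resampling loop works for a trimmed mean or any other statistic for which no standard-error formula is at hand.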

PS1 Poster Session 1 and Opening Mixer Thu, Feb 23, 5:30 PM - 7:00 PM
Conference Center AB
Chair(s): Nancy Wang, Celerion

Web Scraping Government Tax Revenue with Machine Learning
Brian Arthur Dumbacher, U.S. Census Bureau
A Hierarchical Clustering Analysis (HCA) in Automatic Driving Regarding Vehicle-to-Vehicle Pedestrian Position Identification
Jie Xue, Purdue University
Communicating Statistics to Nonstatisticians
Kim Love, K. R. Love Quantitative Consulting and Collaboration
Good Statistical Practices: An Example of Meta-Analysis of Odds Ratios
Bei-Hung Chang, University of Massachusetts Medical School
A New Method to Assess Measurement Agreement in Machine Readings
Dong-Yun Kim, NHLBI/NIH
Smoking Tendencies Among Junior High School Students in Ghana: Applications of ROC Curve and AUC
Emmanuel Thompson, Southeast Missouri State University
An Application of Competing Risk Analysis in Large Cardiovascular Clinical Trials
Purva Jain, Beth Israel Deaconess Medical Center, Harvard Medical School
Profile Monitoring for Poisson Data with Fixed Effects Using Nonparametric Methods
Sepehr Piri, Virginia Commonwealth University
Interaction of Measurement Burden and Disease in the ENRICHD Clinical Trial
Probabilistic Record Linkage in R and Stata
Anders R Alexandersson, Florida Cancer Data System
Creating Reproducible Tables in R Markdown
Claire Palmer, University of Colorado at Denver School of Medicine
More Than Meets the Eye: Bayesian Inference in Nonparanormal Graphical Models
Jami Jackson Mulgrave, North Carolina State University
Communicating with Clinicians About Models That Predict Risk Using Interactive Web Graphics
Marshall Brown, Fred Hutchinson Cancer Research Center
R Resample Package
Tim Hesterberg, Google
StatTag: A Reproducible Research Tool for Generating Dynamic Documents Using Microsoft Word
Abigail S Baldridge, Northwestern University
Exhibits Open Thu, Feb 23, 5:30 PM - 7:00 PM
Conference Center AB

Friday, February 24
Registration Fri, Feb 24, 7:30 AM - 5:30 PM

Continental Breakfast Fri, Feb 24, 7:30 AM - 8:30 AM
Conference Center AB

Exhibits Open Fri, Feb 24, 7:30 AM - 6:30 PM
Conference Center AB

GS1 Keynote Address Fri, Feb 24, 8:00 AM - 9:00 AM
River Terrace 1
Chair(s): MoonJung Cho, Bureau of Labor Statistics

Snakes and Ladders: Challenges in Forging a Career in Statistics
David Lane Banks, Duke University
CS01 Presentation and Storytelling Fri, Feb 24, 9:15 AM - 10:45 AM
River Terrace 2
Chair(s): Cynthia R. Long, Palmer College of Chiropractic

9:20 AM The Statistician’s Role in Data Storytelling Projects: Case Studies and Best Practices
Haviland Wright, Boston University
10:05 AM Statistical Presentation Power: How to Reveal Your 'X Factor'!!!
Jennifer H Van Mullekom, Laboratory for Interdisciplinary Statistical Analysis (LISA), Virginia Tech
CS02 Beyond the Basics: Advanced Modeling Methods Fri, Feb 24, 9:15 AM - 10:45 AM
River Terrace 3
Chair(s): Shankang Qu, PepsiCo

9:20 AM Don't Be Silly; Do It Bayesian
Perceval Sondag, Arlenda
10:05 AM Improve Regression and Communicate Results Using Stochastic Gradient Boosting and LASSO
Charles William Harrison, Salford Systems
CS03 Data Wrangling and Visualization Fri, Feb 24, 9:15 AM - 10:45 AM
City Terrace 7
Chair(s): Robert P. Yerex, University of Virginia Medical Center

9:20 AM Discover and Visualize the Golden Paths, Unique Sequences, and Marvelous Associations Out of Your Big Data Using Link Analysis in SAS Enterprise Miner
Delali Agbenyegah, Alliance Data Card Services
10:05 AM Data Scraping, Parsing, Wrangling, and Cleaning
Mark Daniel Ward, Purdue University
CS04 Keep It Simple with R Fri, Feb 24, 9:15 AM - 10:45 AM
City Terrace 9

9:20 AM Reproducibility in Action
Richard Thomas Schwinn, U.S. Small Business Administration
10:05 AM Managing Many Models
Hadley Wickham, RStudio
CS05 Statistical Collaboration Fri, Feb 24, 11:00 AM - 12:30 PM
River Terrace 2
Chair(s): Michael Latta, YTMBA Research & Consulting and Coastal Carolina University

11:05 AM Practical Examples and Challenges of Statistical Consulting in Health Settings
Laura H Gunn, Stetson University
11:50 AM Panel Discussion on Statistical Volunteers
David J Corliss, Peace-Work
CS06 Business Intelligence Practices Fri, Feb 24, 11:00 AM - 12:30 PM
River Terrace 3
Chair(s): Madhuri Mulekar, University of South Alabama

11:05 AM On the Street: Conducting Business Research
Joyce Nilsson Orsini, Fordham University GBA
11:50 AM Bridging the Gap on Multi-Channel Attribution
John Lin, Epsilon Data Management
CS07 Surveys and Sentiment Analysis Fri, Feb 24, 11:00 AM - 12:30 PM
City Terrace 7
Chair(s): Susan Simmons, NC State Institute for Advanced Analytics

11:05 AM The Nexus Between Data Science, Survey Design, and Statistical Practice
Steven B Cohen, RTI International
11:50 AM Sentiment Analysis of Brand Social Mentions: The Polarity Classification and Beyond
Jin Su, Johnson & Johnson Vision Care, Inc.
CS08 Interactivity with R Shiny Fri, Feb 24, 11:00 AM - 12:30 PM
City Terrace 9
Chair(s): Edward Mulrow, NORC at the University of Chicago

11:05 AM Working with Shiny Things
Harlen Hays, Cerner Corporation
11:50 AM Rapid Data Visualization and Dissemination Using R and Shiny
Bogdan Alexandru Rau, UCLA Center for Health Policy Research
Lunch (on own) Fri, Feb 24, 12:30 PM - 2:00 PM

CS09 Organizational Impact Fri, Feb 24, 2:00 PM - 3:30 PM
River Terrace 2
Chair(s): Kathy Hanford, University of Nebraska-Lincoln

2:05 PM Developing a Data Science Center of Excellence (DS CoE)
Celeste R Fralick, Unaffiliated
2:50 PM My Marathon Journey for Analytics Change
Terri Henderson, Johnson & Johnson Vision Care, Inc.
CS10 Probability Distributions Fri, Feb 24, 2:00 PM - 3:30 PM
River Terrace 3
Chair(s): Kathleen Jablonski, The George Washington University

2:05 PM Modeling Proportions and Probabilities: The Beta Distribution Is Your Friend
Paul Teetor, William Blair & Co.
2:50 PM Probability Density for Repeated Events
Bruce Stephen Lund, Magnify Analytic Solutions
CS11 Text Analytics Fri, Feb 24, 2:00 PM - 3:30 PM
City Terrace 7
Chair(s): Laura H Gunn, Stetson University

2:05 PM Predicting Regulatory Risk from Unstructured Text Data
Danielle Leigh Boree, Johnson & Johnson Vision Care, Inc.
2:50 PM Using Text Analytics and Signal Detection to Predict Medical Device Recalls
Lisa Ensign, Significant Statistics
CS12 Going Big on Bayesian Fri, Feb 24, 2:00 PM - 3:30 PM
City Terrace 9
Chair(s): Alok Kumar Dwivedi, Texas Tech University Health Sciences Center

2:05 PM Introduction to Bayesian Analysis Using Stata
Chuck Huber, StataCorp
2:50 PM Bayesian Structural Equation Modeling
M'hamed Hamy Temkit, Mayo Clinic
CS13 Career and Personal Development Fri, Feb 24, 3:45 PM - 5:15 PM
River Terrace 2
Chair(s): Rich Newman, Johnson & Johnson Vision Care, Inc.

3:50 PM Soft Skills for Succeeding Outside of Academia
Diahanna L Post, Nielsen
4:35 PM Career Development for Statisticians in a Collaborative Environment: Importance of Effective Mentoring and Development of Soft Skills
Jay N Mandrekar, Mayo Clinic
CS14 Addressing Statistical Problems and Issues Fri, Feb 24, 3:45 PM - 5:15 PM
River Terrace 3
Chair(s): Bonita Singal, United States Department of Energy

3:50 PM Data Preparation: The Key for Meaningful Insights
Huiyu Qian, AutoAnything Inc.
4:35 PM Matched Case-Control Data Analysis
Yinghui Duan, Connecticut Institute for Clinical and Translational Science
CS15 Machine Learning Fri, Feb 24, 3:45 PM - 5:15 PM
City Terrace 7
Chair(s): John Stevens, Utah State University

3:50 PM Tree-Based Techniques for High-Dimensional Data
Wei-Yin Loh, University of Wisconsin
4:35 PM Intro to Deep Learning with TensorFlow
Denisa A.O. Roberts, ASAPP Inc.
CS16 Generalized Linear Mixed Models with R Fri, Feb 24, 3:45 PM - 5:15 PM
City Terrace 9
Chair(s): Doug Lehmann, University of Miami

3:50 PM Constructing and Analyzing Generalized Linear Mixed Models
Christina P Knudson, Macalester College
4:35 PM Simulation and Power Analysis of Generalized Linear Mixed Models
Brandon LeBeau, University of Iowa
PS2 Poster Session 2 and Refreshments Fri, Feb 24, 5:15 PM - 6:30 PM
Conference Center AB
Chair(s): Huanjun Zhang, Texas A&M University

Use of Longitudinal Models to Identify Subject-Specific Implausible Body Mass Index Measures: A Comparison with Screening for Population-Level Outliers
Carrie Tillotson, OCHIN, Inc.
Enhancing Monthly Retail Holiday Effect Methodology Through Daily Data
Rebecca Jean Hutchinson, U.S. Census Bureau
Statistics for Public Policy: Reflecting on Change in the Last 50 Years
Karen Moran Jackson, The University of Texas at Austin
Data Science, Statistics, Analytics, Data Engineering: What Does It All Mean?
Michael Latta, Coastal Carolina University
Iterative Semiparametric Generalized Linear Models
Busayasachee Puang-Ngern, Macquarie University
Expanding the Appeal of Model Selection Using Mixture Priors to Incorporate Expert Opinion: A Behavioral Economic Case Study
Christopher T Franck, Virginia Tech
Performance of Data Mining Methods in an Example with Ordinal and Imbalanced Data
Elena Rantou, FDA
Additive P-Value Combinations and an Application in Consumer Product Research
Georgette Asherman, Direct Effects, LLC
Pseudo-Maximum Likelihood Estimation with Sampling Weight for Modeling Count Data from a Complex Survey
Lin Dai, Medical University of South Carolina
Two-Step Logistic Regression Model for Predicting Phone Campaign Response
Sharon (Renting) Xu, AARP, Inc.
Detecting Interaction in Two-Way Unreplicated Experiments via Bayesian Model Selection
Thomas Anthony Metzger, Virginia Tech
Analyzing Shot Data with MANOVA
Victoria Cox, Dstl
What Effort Is Needed in Explaining Statistical Results to Pediatric Researchers? A Survey to Better Understand the Confidence and Knowledge of Pediatric Researchers
Curtis Dean Travers, Emory University School of Medicine
Rapid Data Visualization and Dissemination Using R and Shiny
Bogdan Alexandru Rau, UCLA Center for Health Policy Research
ShinySurvival: An Interactive Tool for Visualizing and Analyzing Survival Data in R
Felicia Powell Hardnett, CDC
Side-by-Side Bar Charts for More Than One Variable on Different Scales Using SAS SGPLOT
John Stephen Taylor, Johnson & Johnson Vision Care, Inc.
Good Old Excel: Using an Old Favorite to Explore, Visualize, and Share Data
Nola du Toit, NORC at the University of Chicago
Saturday, February 25
Registration Sat, Feb 25, 7:30 AM - 2:30 PM

Exhibits Open Sat, Feb 25, 7:30 AM - 1:00 PM
Conference Center AB

PS3 Poster Session 3 and Continental Breakfast Sat, Feb 25, 8:00 AM - 9:15 AM
Conference Center AB
Chair(s): Michael Devin Floyd, Saint Software

Variations in Statistical Practice Between North-American Stat Labs
Eric Vance, University of Colorado at Boulder
Not Just a Statistician: Experience on How to Communicate with Your Client
Kate Wan-Chu Chang, University of Michigan
Listwise Deletion or Multiple Imputation When Complex Sample Data Are MCAR or MAR: A Guide to Selecting an Appropriate Missing Data Treatment Method
Anh P. Kellermann, University of South Florida
Analysis of Bird Arrival Dates in Cayuga County
Caitlin Mary Cunningham, Le Moyne College
A Guide to Modeling Strategies for Immunological Count Data
Claire Palmer, University of Colorado at Denver School of Medicine
Blending Big Data Visualization Tools with Statistical Analysis: Improving Automotive Lubricants
Jim McAllister, Afton Chemical Corporation
Partial Least Squares Regression Analysis Identifies Interleukin-1 Receptor as a Predictor of Airway Neutrophils in Asthma
Michael David Evans, University of Wisconsin-Madison
Data Fusion Techniques for Estimating the Relative Abundance of Rare Species
Purna Gamage, Texas Tech University
Statistical Comparison of Particle Size Distributions
Scott J Richter, The University of North Carolina at Greensboro
Optimal Experimental Designs for Mixed Categorical and Continuous Responses
Soohyun Kim, Arizona State University
Balanced Salary Structure Modeling
Thor Dane Osborn, Sandia National Laboratories
Sequential Pattern Mining in Real-Time Marketing with Backward Match Algorithm
Yi Cao, Alliance Data Card Services
Using Shiny to Efficiently Process Survey Data
Carl Ganz, UCLA Center for Health Policy Research
Integrating R Programming Platforms into Community Collective Impact Efforts to Solve Social Problems
Frank M Ridzi, Central New York Community Foundation and Le Moyne College
Generating Tables and Statistical Summary Using PROC REPORT
Lei Zhang, University of Minnesota
A Nonlinear Regression Plugin for Rcmdr
Thomas Edward Burk, University of Minnesota
Predict Warriors' 73-Win on April 13, 2016
Jason Li, Morrill Learning Center
CS17 Ethical Guidelines Sat, Feb 25, 9:15 AM - 10:45 AM
River Terrace 2
Chair(s): Constantine Daskalakis, Thomas Jefferson University

9:20 AM How to Deal with Ethical Issues of Human Subjects, Difficult Colleagues, and Networking
Michael Latta, YTMBA Research & Consulting and Coastal Carolina University
10:05 AM How the New ASA Guidelines Help Practicing Statisticians
Alan C. Elliott, Southern Methodist University
CS18 Going Mainstream: Emerging Modeling Methods Sat, Feb 25, 9:15 AM - 10:45 AM
River Terrace 3
Chair(s): Viswanathan Ramakrishnan, Medical University of South Carolina

9:20 AM Amazon Product Co-Purchasing Network Estimation Through ERGM Model Using Reference Prior
Sayan Chakraborty, Michigan State University
10:05 AM Integrating Text Analytics with Traditional Structured Analytics
Edward Jones, Texas A&M University
CS19 Guided and Automatic Model Selection Sat, Feb 25, 9:15 AM - 10:45 AM
City Terrace 7
Chair(s): Inyoung Kim, Virginia Tech

9:20 AM Best Practices in Model Selection and Profiling
Scott Lee Wise, SAS Institute, Inc.
10:05 AM Designing Automated Workflows for Model Selection and Optimization
Christian Kendall, Salford Systems
CS20 A Graph Is Worth a Thousand Words Sat, Feb 25, 9:15 AM - 10:45 AM
City Terrace 9
Chair(s): Andrew D. Althouse, University of Pittsburgh Medical Center

9:20 AM Geospatial Analysis with R
Michael Jadoo, Unaffiliated
10:05 AM How to Avoid Some Common Graphical Mistakes
Naomi B. Robbins, NBR
CS21 Communicating to Motivate and Influence Sat, Feb 25, 11:00 AM - 12:30 PM
River Terrace 2
Chair(s): Yuanyuan Tang, Saint Luke's Health System

11:05 AM The Psychology of Influence
Colleen Mangeot, Cincinnati Children's Hospital Medical Center
11:50 AM Strategic Marketing and Communication for Statistical Consultants and Collaborators
Renita Canady, Association of American Medical Colleges
CS22 In THIS Corner: X1! When Model Variables Compete Sat, Feb 25, 11:00 AM - 12:30 PM
River Terrace 3
Chair(s): Michael Reiger, West Virginia University

11:05 AM Estimating with Weights: Common Sense vs. Unbiasedness
Tim Hesterberg, Google
11:50 AM Competing Risk Data and Semi-Competing Risk Data Analysis and Visualization in SAS and R
Ran Liao, Indiana University
CS23 Latent Variable and Mixed Effects Models Sat, Feb 25, 11:00 AM - 12:30 PM
City Terrace 7
Chair(s): Xianggui (Harvey) Qu, Oakland University

11:05 AM Nonparametric Mixed-Effects Regression for Large Samples
Nathaniel Erik Helwig, University of Minnesota
11:50 AM Optimization of Processes and Products from Historical (Un-Designed) Data
John F. MacGregor, ProSensus, Inc.
CS24 It's a Package Deal Sat, Feb 25, 11:00 AM - 12:30 PM
City Terrace 9
Chair(s): John Castelloe, SAS Institute

11:05 AM Introduction to JMP Software
Terrie Vasilopoulos, University of Florida
11:50 AM Logistic Regression Cross-Package Comparison
Lillian Ma, Capital One Bank
Lunch (on own) Sat, Feb 25, 12:30 PM - 2:00 PM

PCD1 Power and Sample Size Analysis Using Stata Sat, Feb 25, 2:00 PM - 4:00 PM
City Terrace 6
Instructor(s): Chuck Huber, StataCorp
Power and sample size analysis is a fundamental step in the planning of any research project. This talk will demonstrate how to use Stata's power command to calculate power, sample size and minimum detectable effect size. We will show how to create customized tables and graphs for many study designs with both continuous and categorical outcomes. We will also demonstrate how to add your own methods to the power command and how to calculate power for multilevel/longitudinal studies using simulation.
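
For readers unfamiliar with the calculation itself, the sketch below works through power and sample size for a two-sided, two-sample comparison of means using the normal approximation. It is not Stata's power command and does not reproduce its output; the function names and the example effect size and standard deviation are assumptions chosen for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def power_two_sample(delta, sd, n, alpha=0.05):
    """Approximate power of a two-sided z-test with n subjects per group."""
    se = sd * sqrt(2 / n)              # standard error of the mean difference
    z_crit = Z.inv_cdf(1 - alpha / 2)
    shift = abs(delta) / se
    return Z.cdf(shift - z_crit) + Z.cdf(-shift - z_crit)

def sample_size_per_group(delta, sd, power=0.80, alpha=0.05):
    """Smallest n per group reaching the target power (normal approximation)."""
    z_a = Z.inv_cdf(1 - alpha / 2)
    z_b = Z.inv_cdf(power)
    return ceil(2 * ((z_a + z_b) * sd / delta) ** 2)

# Hypothetical design: detect a 5-unit difference when the SD is 10.
n = sample_size_per_group(delta=5, sd=10)
print("n per group:", n)
print(f"power at that n: {power_two_sample(5, 10, n):.3f}")
```

A calculation based on the t distribution gives a slightly larger sample size than this approximation, and designs such as multilevel or longitudinal studies generally need simulation, as the abstract notes.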

PCD2 Xymp: A Web Application Supporting Best Practices in Bioassay Sat, Feb 25, 2:00 PM - 4:00 PM
City Terrace 8
Instructor(s): David Lansky, Precision Bioassay, Inc.
The software system consists of three components (each on a different virtual server): a web application (written in PHP), a database, and a collection of R programs, packages, and reports (Sweave and knitr). The system helps users perform randomized instances of routine bioassays, performs mixed model analyses (using linear or nonlinear models), and produces reports (including summaries). The statistical portion works well with simple or complex designs (from CRD to a strip-unit). The system contains many features to meet regulatory requirements (users with different levels of authorization, automatic tracking and reporting of re-analyses of data, etc.). The system is designed to be very easy for routine use in the lab, while providing a rich collection of modern statistical capabilities. The system is also designed to facilitate good collaboration between bioassay scientists and statisticians. Each assay has a protocol and each analysis has a protocol; the protocols capture all the statistical details, and lab users select protocols by name. The statisticians build the protocols.

PCD3 Marketing Mix Modeling and Optimization Using Bayesian Networks and BayesiaLab Sat, Feb 25, 2:00 PM - 4:00 PM
City Terrace 12
Instructor(s): Stefan Conrady, Bayesia USA
“Half the money I spend on advertising is wasted; the trouble is I don’t know which half.” Over the last century, various versions of this quote have been attributed to John Wanamaker, Henry Ford, and Henry Procter, among others. Yet, 100 years after these marketing pioneers, in this day and age of big data and advanced analytics, the quote still rings true among marketing executives. The ideal composition of advertising and marketing efforts remains the industry's Holy Grail. The current practice remains “more art than science.” The lack of a well-established marketing mix methodology has little to do with the domain itself. Rather, it reflects the fact that marketing is yet another domain that typically has to rely on non-experimental data for decision support.

The single most important thing we need to recognize about marketing mix modeling is that it is a causal question. This means we are not looking for a prediction of an outcome variable based on the observation of marketing variables. Rather, we are looking to manipulate marketing variables to optimize an outcome variable. Thus, we are performing an intervention, which requires us to perform causal inference. This leads us to the Holy Grail of statistics, i.e., causal inference from observational data.

In this workshop, we introduce the basic concepts of graphical models and how they can help us perform causal identification, e.g. using causal assumptions and the well-known Adjustment Criterion. While this is straightforward in theory, the complexity of the marketing domain prevents the practical application of this criterion. Thus, we introduce a new criterion (Shpitser and VanderWeele, 2011) that reduces the number of assumptions that we require for confounder selection and causal identification.

Implementation with BayesiaLab

With the confounders identified, we can now build a high-dimensional statistical model that represents the joint probability distribution of all marketing variables. We do that using the machine-learning algorithms of the BayesiaLab software platform. We obtain a Bayesian network that represents a multitude of relationships between all marketing variables and the outcome variable. Using BayesiaLab’s visualization functions, we can compare the machine-learned graph to our understanding of the domain. Furthermore, we can examine the (mostly nonlinear) response curves of the outcome variable as a function of the marketing variables. Most importantly, we use BayesiaLab to perform Likelihood Matching on all confounders to establish the causal response of the outcome variable.

With all causal response curves computed, we introduce cost functions for the marketing variables via BayesiaLab’s Function Node. On that basis, we proceed to BayesiaLab’s Target Optimization function, which, by means of a genetic algorithm, searches for an optimal combination of all marketing variables, while being subject to constraints of individual variables and an overall marketing budget constraint. The optimization report shows feasible solutions along with the degree of achievement.

PCD4 Dig Deeper and Uncover the Unexpected with JMP 13 and JMP Pro 13 Sat, Feb 25, 2:00 PM - 4:00 PM
City Terrace 10
Instructor(s): Mia Stephens, SAS Institute, Inc.; Scott Lee Wise, SAS Institute, Inc.
This session will cover how we can meet the challenge to explore, model and experiment on complex data analytic needs by:

• Increasing Ease and Efficiency of Preparing and Accessing Data
• Handling and Exploring all Types of Data, including Text
• Providing Next Generation Analytical Tools in Quality, DOE & Reliability
• Unleashing Advanced Analytics in Predictive Modeling
• Improving the Ways to Share and Report Out Analytics and Graphs

We will feature new ground-breaking methodology on relevant demos to maximize participant learning.

T1 Understanding and Working with Different (and Sometimes Difficult) People Sat, Feb 25, 2:00 PM - 4:00 PM
City Terrace 7
Instructor(s): Colleen Mangeot, Cincinnati Children's Hospital Medical Center
Do you have coworkers, researchers, or clients that are difficult to work with? Do you feel frustrated and/or confused about how to work with them? Do you wonder sometimes why they just don’t get it? This session will introduce the DISC model for understanding and working with different and sometimes difficult people. It will involve case studies and examples. The result? Improved relationships, increased effectiveness, greater influence, and ability to motivate others.

T2 Penalized Regression Methods for Generalized Linear Models in SAS/STAT Sat, Feb 25, 2:00 PM - 4:00 PM
City Terrace 9
Instructor(s): G Gordon Brown, SAS Institute, Inc.
Regression problems that have large numbers of candidate predictor variables occur in a wide variety of scientific fields and in business. These problems require you to perform statistical model selection to find an optimum model that is simple and has good predictive performance. For linear and generalized linear models you will see how to use the forward, backward, stepwise, and LASSO methods of variable selection. This tutorial presents modern variable selection methods for linear models using the adaptive LASSO, group LASSO, and elastic net penalized regression techniques, plus various screening methods. Penalized regression techniques yield a sequence of models and require at least one tuning method to choose the optimum model that has the minimum estimated prediction error. You will learn how to use fit criteria (such as AIC, SBC, and the Cp statistic), average square error on the validation data, and cross validation as tuning methods for penalized regression. Various examples will be provided using the GLMSELECT and HPGENSELECT procedures of SAS/STAT, which offer extensive customization options and powerful graphs for performing statistical model selection.
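
To make the idea of tuning a penalized regression concrete, here is a minimal pure-Python sketch, not the GLMSELECT or HPGENSELECT procedures. It fits the LASSO by coordinate descent for several penalty values and keeps the one with the lowest average square error on validation data. The objective scaling (1/(2n) times the residual sum of squares plus lambda times the L1 norm), the absence of an intercept, the toy data, and all function names are assumptions made for the example.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=200):
    """Minimize (1/(2n)) * RSS + lam * sum(|beta_j|), with no intercept."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0  # correlation of column j with the partial residual
            z = 0.0    # sum of squares of column j
            for i in range(n):
                partial = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (z / n) if z > 0 else 0.0
    return beta


def average_square_error(X, y, beta):
    """Mean squared prediction error of beta on the data (X, y)."""
    errors = [yi - sum(x * b for x, b in zip(row, beta)) for row, yi in zip(X, y)]
    return sum(e * e for e in errors) / len(errors)


def select_lambda(X_train, y_train, X_valid, y_valid, lambdas):
    """Return (lambda, validation ASE, coefficients) with the lowest validation ASE."""
    best = None
    for lam in lambdas:
        beta = lasso_coordinate_descent(X_train, y_train, lam)
        ase = average_square_error(X_valid, y_valid, beta)
        if best is None or ase < best[1]:
            best = (lam, ase, beta)
    return best


# Toy data (hypothetical): the response depends only on the first column,
# and the training responses carry a small disturbance.
X_train = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y_train = [2.2, -1.8, 1.8, -2.2]
X_valid = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y_valid = [2.0, -2.0, 2.0, -2.0]

lam, ase, beta = select_lambda(
    X_train, y_train, X_valid, y_valid, [0.0, 0.05, 0.1, 0.2, 0.5]
)
print(lam, ase, beta)
```

The same loop works with a fit criterion or a cross-validation estimate in place of the validation error; only the quantity being minimized changes.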

T3 Introduction to Spatial Analysis Through Statistics Sat, Feb 25, 2:00 PM - 4:00 PM
River Terrace 3
Instructor(s): Michael Devin Floyd, Saint Software; Phillip Stedman Floyd, Segal Consulting
The word spatial means related to space or geography. Thus, spatial analysis is analysis that takes the location of each observation into account. This course is about using spatial elements to derive conclusions or to remove dependence based on location. It assumes no prior knowledge of spatial analysis and starts from the beginning by defining spatial data and explaining why spatial analysis is relevant. Different mapping techniques are explored to visualize spatial information and better understand it. Tests for spatial dependence in datasets are discussed. Then, it is shown that spatial dependence can influence results. Spatial regression techniques are discussed as a way to mitigate the spatial dependence. To conclude, I talk about how accounting for spatial dependence influenced the research I did at the Louisiana Public Health Institute. Basic knowledge of regression and linear modeling is assumed.
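
The abstract does not say which tests for spatial dependence the course covers. One widely used statistic is Moran's I, sketched below in plain Python as an illustration only. The function names and the simple chain-shaped neighbor structure are assumptions made for the example; real analyses typically build weights from map adjacency or distance.

```python
def morans_i(values, weights):
    """Moran's I for values observed at n locations.

    weights[i][j] is the spatial weight between locations i and j
    (for example 1.0 for neighbors and 0.0 otherwise).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations between pairs of locations.
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    squared = sum(d * d for d in dev)
    return (n / total_weight) * (cross / squared)


def chain_weights(n):
    """Binary weights for n locations in a line, each adjacent to the next."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = 1.0
        w[i + 1][i] = 1.0
    return w


w = chain_weights(4)
print(morans_i([1.0, 2.0, 3.0, 4.0], w))    # smooth trend: positive
print(morans_i([1.0, -1.0, 1.0, -1.0], w))  # alternating values: negative
```

Values well above zero suggest that neighboring locations tend to have similar values, while values well below zero suggest that neighbors tend to differ. Judging significance requires a reference distribution, for example from permuting the values across locations.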

T4 How to Find (the Right) Clients for Your Independent Statistical Consulting Business Sat, Feb 25, 2:00 PM - 4:00 PM
River Terrace 2
Instructor(s): Karen Grace-Martin, The Analysis Factor
If you are starting an independent statistical consulting business, you will need to learn many business skills. The most important, yet intimidating, of these is finding and attracting clients.

Clients will hire (and re-hire) you only if they know, like, and trust you. This will only happen when you build a solid marketing process that conveys your strengths and what you can offer to the right clients. In this tutorial, you will learn how to get started on a simple, yet solid, marketing plan that allows the right clients to know, like, and trust you.

The instructor will share her personal experiences and case studies of colleagues who built a consulting business and guide you through small group exercises.

GS2 Closing General Session Sat, Feb 25, 4:15 PM - 5:30 PM
River Terrace 2
The Closing Session is your opportunity to interact with the CSP Steering Committee in an open discussion about how the conference went. CSPSC vice chair, Jean Adams, will lead a panel of committee members as they summarize their conference experience. The audience will then be invited to ask questions and provide feedback. The committee highly values suggestions for improvements gathered during this time. The best student poster award will also be presented during the Closing Session, and each attendee will have an opportunity to win a door prize.