Name: 2018 ASA Conference on Statistical Practice
Start: 2018-02-15T07:00:00+00:00
End: 2018-02-17
Location: Marriott Portland Downtown Waterfront


Thursday, February 15
Registration		Thu, Feb 15, 7:00 AM - 6:30 PM






SC1 Introduction to Big Data Analysis		Thu, Feb 15, 8:00 AM - 5:30 PM Salon A
Instructor(s): Fulya Gokalp Yavuz, Yildiz Technical University; Mark Daniel Ward, Purdue University


This one-day introductory workshop is geared toward CSP participants who want to revitalize or improve their data analysis skills, especially with an emphasis on big data. Ward and Gokalp Yavuz will present tools and techniques for these most fundamental, low-level aspects of data analysis. We are well versed in teaching such techniques to students who have no background in data analysis or programming. This workshop will bring people up to speed with powerful techniques for data analysis. This one-day course has no prerequisites. This workshop will be hands-on and driven by examples, using large data sets. The intended participants for the course are people who work in a data-driven environment and have an increasing need to perform aspects of large data analysis. Before data is gathered and organized, a great deal of data manipulation is necessary, especially for working with big data sets. Sometimes the data needs to be scraped from remote sources, and then parsed into more natural forms. This process often involves munging and cleaning the data. The need to be able to reproduce and reliably verify all of the methods used for the data wrangling is more important than ever.


SC2 An Introduction to D3.js: From Scattered to Scatterplot		Thu, Feb 15, 8:00 AM - 5:30 PM Salon B
Instructor(s): Scott Murray, O’Reilly Media


Interested in coding data visualizations on the web, but don't know where to start? This workshop will have you transforming data into visual images in no time at all, starting from scratch and building an interactive scatterplot by the end of the session. We'll use d3.js, the web's most powerful library for data visualization, to load data and translate values into SVG elements — drawing lines, points, and scaled axes to label our data. We’ll learn how to use motion and visual transitions, and introduce simple interactivity to make our charts more explorable. All methods and examples will be up-to-date for the current version of D3 (4.x as of this writing).


SC3 Collaboration Essentials for Practicing Statisticians and Data Scientists		Thu, Feb 15, 8:00 AM - 12:00 PM Salon C
Instructor(s): Heather Smith, Cal Poly; Eric Vance, LISA--University of Colorado Boulder


Statisticians and data scientists positively impact many people, organizations, and governments through the careful collection, analysis, and interpretation of data to solve problems and make decisions. To maximize their impact, statisticians and data scientists must effectively collaborate with a variety of domain experts who originate the data or the problems to be solved. In this short course, participants will learn and practice essential skills to improve their professional communication and collaboration to increase their effectiveness on the job. Specifically, participants will learn how to establish foundational collaborative relationships with domain experts; structure effective meetings; and effectively communicate with non-statisticians. Participants will also practice their newly acquired skills and learn how to improve their proficiency in these essential collaboration skills by using role-plays and video coaching and feedback reviews outside of this short course. In sum, participants will learn and practice how to leverage their technical skills to more effectively collaborate for maximal impact inside and outside of their organizations.


SC4 A Variety of Mixed Models: Linear, Generalized Linear, and Nonlinear		Thu, Feb 15, 8:00 AM - 12:00 PM Salon E
Instructor(s): David A. Dickey, NC State University


The MIXED procedure in SAS, for example, correctly handles linear models that have multiple sources of random effects such as random town-to-town, store-to-store, and aisle-to-aisle variation in sales. Associated fixed effects might be product price, color of packaging, and amount spent on advertising. The talk begins with a checklist for deciding when to treat effects as random versus fixed and follows with a series of examples. When the response variable is not normal, for example with a binary or Poisson response, additional complexities arise. Models with such non-normal responses are often analyzed by assuming that some transformation, or link function, of the expected value of Y results in a linear model with fixed and random effects. We are then in the generalized linear mixed model setting. It may be that a model cannot be linearized by a transformation, thus making it a nonlinear model. If random effects are involved, the model is referred to as a nonlinear mixed model. With a minimal amount of theory and an emphasis on examples, these types of models will be explained and illustrated. SAS will be used, but the ideas and interpretation are software independent.
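
For orientation, the three model classes named in this abstract can be written in common textbook notation. The symbols below (X and Z for the fixed- and random-effects design matrices, u for the random effects, g for the link function, f for the nonlinear mean function) are assumptions of this sketch and are not taken from the course materials.

```latex
\begin{align*}
% Linear mixed model: fixed effects \beta, random effects u
y &= X\beta + Zu + \varepsilon,
  & u &\sim N(0, G), \quad \varepsilon \sim N(0, R) \\
% Generalized linear mixed model: a link function g linearizes the conditional mean
g\bigl(\mathrm{E}[y \mid u]\bigr) &= X\beta + Zu \\
% Nonlinear mixed model: no transformation makes the mean linear in the parameters
y_{ij} &= f(x_{ij}, \beta, u_i) + \varepsilon_{ij}
\end{align*}
```

With a binary response, g would typically be the logit link; with a Poisson response, the log link.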


SC5 Cleaning Up the Data Cleaning Process: Challenges and Solutions in R		Thu, Feb 15, 8:00 AM - 12:00 PM Salon D
Instructor(s): Claus Thorn Ekstrøm, Biostatistics, University of Copenhagen; Anne Helby Petersen, Biostatistics, University of Copenhagen


Data cleaning and validation are the first steps in any data analysis, as the validity of the conclusions from the analysis hinges on the quality of the input data. Mistakes in the data can arise for any number of reasons, including erroneous codings, malfunctioning measurement equipment, and inconsistent data generation manuals. We present a systematic, analytical approach to data cleaning that will ensure the data cleaning process is just as structured and well-documented as the rest of the data analysis. The primary software tool is the dataMaid R package, which implements an extensive and customisable suite of quality assessment tools that can be used to identify potential problems in a dataset. The results are summarised in an auto-generated, non-technical, stand-alone document readable by statisticians and non-statisticians alike. Thus, the course teaches practical skills that aid the dialogue between data analysts and field experts, while also providing easy documentation of reproducible data cleaning steps and data quality control.


SC6 Effective Presentation for Statisticians and Data Scientists: Success=(PD)^2		Thu, Feb 15, 1:30 PM - 5:30 PM Salon C
Instructor(s): Jennifer H. Van Mullekom, Virginia Tech


Statisticians must be able to effectively convey their ideas to clients, collaborators, and decision-makers. Presenting in the modern world is even more daunting when speakers have the opportunity to employ slideware, videos, and live demos. Unfortunately, university coursework and professional development programs are often not targeted towards sharpening these skills. This short course, developed and taught by statisticians, will provide an opportunity to learn how to employ different methods and tools in the phases of the framework taught. The material covered in the course is geared toward data-based presentations and is based on the works of Garr Reynolds and Michael Alley, among others. The course will emphasize the importance of stepping away from the computer to Prepare an effective message aimed at your core point guided with a series of questions and tips. The Design phase emphasizes the importance of structure, streamlining, and good graphic design accompanied by a series of checklists. Of course, “Practice makes perfect” so we cannot skip this step. Finally, engaging the audience and effectively using the room and equipment is covered in the Deliver phase.


SC7 Statistical Learning Methods in R		Thu, Feb 15, 1:30 PM - 5:30 PM Salon E
Instructor(s): Kelly Sue McConville, Swarthmore College


Applied statisticians are often confronted with difficult modeling problems where standard regression approaches are not appropriate. For example, it may be that the number of possible predictors is large relative to the sample size or that the relationship between the variables is non-linear. This course will cover several statistical learning techniques that are designed to handle these difficult modeling problems. In particular, we will study penalized regression techniques (lasso, ridge, elastic net), non-parametric regression (regression and smoothing splines), and classification methods (support vector machines, trees). Using data from the Bureau of Labor Statistics, participants will learn how to fit these models in R. R Markdown files with the relevant code will be provided so that participants can actively follow along with the demonstrations.
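
As a point of reference, the three penalized regression techniques listed in this abstract can be seen as special cases of one objective. The parameterization below, with an overall penalty weight λ and a mixing parameter α, is one common convention and is an assumption of this sketch; the course may present it differently.

```latex
\begin{equation*}
\hat{\beta} = \arg\min_{\beta}\;
  \frac{1}{2n}\,\lVert y - X\beta \rVert_2^2
  + \lambda \Bigl[ \alpha \lVert \beta \rVert_1
  + \tfrac{1 - \alpha}{2} \lVert \beta \rVert_2^2 \Bigr],
\qquad \lambda \ge 0,\; 0 \le \alpha \le 1
\end{equation*}
```

Setting α = 1 gives the lasso, α = 0 gives ridge regression, and intermediate values give the elastic net.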


SC8 NISS Shortcourse: A Survey of Modern Data Science		Thu, Feb 15, 1:30 PM - 5:30 PM Salon D
Instructor(s): David Banks, Dept. of Statistical Science, Duke University


Modern data science is driven by applications, and these often entail Big Data and machine learning perspectives. This short course reviews key ideas and methods in nonparametric regression (starting with cross-validation and light bootstrap asymptotics, then moving on to the additive model, the generalized additive model, and neural networks). It also covers variable selection, with the Lasso and the Median Model, and describes the p >> n problem in the context of contributions by Candes and Tao, Donoho and Tanner, and Wainwright. The course next treats classification, with emphasis upon Random Forests, boosting, and ensemble strategies such as bagging, stacking and boosting.


PS1 Poster Session 1 and Opening Mixer		Thu, Feb 15, 5:30 PM - 7:00 PM Salons F-I


Chair(s): Alok Dwivedi, Texas Tech University Health Sciences Center El Paso (TTUHSC EP)

	Some Dimension Reduction Strategies for the Analysis of Survey Data Jiaying Weng, University of Kentucky
	Perl-Compatible Regular Expressions as a Tool to Abstract Semi-Structured Electronic Health Records Samantha Emily Montag, Northwestern University
	Collaborative Process to Efficiently Produce Publications in Multicenter Research Cody S. Olsen, University of Utah, Department of Pediatrics
	Developing a Comprehensive Personal Plan for Teleworking (Working Remotely) Julia Lull, Janssen Research & Development, LLC
	Thank You, Come Again: Modeling Repeat Purchase Behavior for Business Travelers Diag D. Davenport, Georgetown University
	Wavelet-Based Methods for Data-Driven Monitoring Achraf Cohen, University of West Florida
	A Simulation Study of Violations of the Local Independence Assumption in Latent Class Analyses Michael P. Chen, U.S. Centers for Disease Control and Prevention
	Impact of Linear Regression Predictor Omission on Estimation and Inference Julia L. Sharp, Colorado State University
	A Comparison of Standard Logistic Regression, Multilevel Modeling, Robust Error Estimation, and Exposure Simulation for Data Containing Quasi-Berkson Error Angelique Liddell Zeringue, Mercy Healthcare
	Statistical Modeling for Repeated Measures in Rubber Research Wenzhao Yang, The Dow Chemical Company
	Combining Historical Data and Propensity Score Methods in Observational Studies to Improve Internal Validity Miguel Marino, Oregon Health & Science University
	Marketing Communication Channel Preference Optimization Using a Two-Stage Statistical Modeling Hongying Yang, Statistical consultant
	Limitations of Propensity Score Methods: Demonstration Using a Real-World Example Gregory B. Tallman, Oregon State University/Oregon Health & Science University
	Effect Size Measures for Nonlinear Count Regression Models Stefany Coxe, Florida International University
	Appropriate Dimension Reduction for Sparse, High-Dimensional Data Using Intensity Plots and Other Visualizations Eugenie Jackson, West Virginia University
	Navigating Large-Scale Forest Plots Using R and Shiny Steele Valenzuela, Oregon Health & Science University
	Ranked-Choice Voting R Package Jay Lee, Reed College


Exhibits Open		Thu, Feb 15, 5:30 PM - 7:00 PM Salons F-I






Friday, February 16
Registration		Fri, Feb 16, 7:30 AM - 5:30 PM






Continental Breakfast		Fri, Feb 16, 7:30 AM - 8:30 AM Salons F-I






Exhibits Open		Fri, Feb 16, 7:30 AM - 6:30 PM Salons F-I






GS1 Keynote Address		Fri, Feb 16, 8:00 AM - 9:00 AM Salon E


Chair(s): Kim Love, K. R. Love Quantitative Consulting and Collaboration

8:05 AM	Reflections on Career Opportunities and Leadership in Statistics Lisa LaVange, The University of North Carolina


CS01 #LeadWithStatistics		Fri, Feb 16, 9:15 AM - 10:45 AM Salon A


Chair(s): Sejong Bae, Comprehensive Cancer Center, University of Alabama

9:20 AM	Q&A with Lisa LaVange Lisa LaVange, The University of North Carolina
10:05 AM	Developing and Delegating: Two Key Strategies to Master as a Technical Leader Diahanna L. Post, Nielsen, Columbia University


CS02 Practical Considerations for Modeling		Fri, Feb 16, 9:15 AM - 10:45 AM Salons BC


Chair(s): Trijya Singh, Le Moyne College

9:20 AM	Evaluating Model Fit for Predictive Validity Katherine M. Wright, Northwestern University
10:05 AM	Flexible Modeling and Experimental Design Strategies Timothy E. O'Brien, Loyola University Chicago


CS03 Text Analytics Applications		Fri, Feb 16, 9:15 AM - 10:45 AM Salon D


Chair(s): Steven Cohen, RTI International

9:20 AM	Approachable, Interpretable Tools for Mining and Summarizing Large Text Corpora in R Luke W. Miratrix, Harvard University
10:05 AM	Latent Dirichlet Allocation Topic Models Applied to the Center for Disease Control and Prevention’s Grant Matthew Keith Eblen, Centers for Disease Control and Prevention


CS04 Working with Messy Data		Fri, Feb 16, 9:15 AM - 10:45 AM Salon E


Chair(s): Karol Krotki, RTI

9:20 AM	Practical Time-Series Clustering for Messy Data in R Jonathan Robert Page, University of Hawaii Economic Research Organization (UHERO)
10:05 AM	Doing Data Linkage: A Behind-the-Scenes Look Clinton J. Thompson, National Center for Health Statistics, CDC


CS05 Collaboration Essentials		Fri, Feb 16, 11:00 AM - 12:30 PM Salon A


Chair(s): Terrie Vasilopoulos, University of Florida, College of Medicine

11:05 AM	Asking Great Questions Eric Vance, LISA--University of Colorado Boulder
11:50 AM	Listening, Paraphrasing, and Summarizing Heather Smith, Cal Poly


CS06 Bayesian Applications		Fri, Feb 16, 11:00 AM - 12:30 PM Salons BC


Chair(s): Mariangela Guidolin, Department of Statistical Sciences, University of Padua

11:05 AM	Bayesian Inference for Stochastic Processes Lyle David Broemeling, University of Texas MD Anderson Cancer Center
11:50 AM	Forecasting Periodic Accumulating Processes with Semiparametric Distributional Regression Models and Bayesian Updates Harlan D. Harris, WayUp


CS07 Exploring Big Data		Fri, Feb 16, 11:00 AM - 12:30 PM Salon D


Chair(s): Christina Phan Knudson, University of St. Thomas

11:05 AM	Exploratory Data Structure Comparisons by Use of Principal Component Analysis Anne Helby Petersen, Biostatistics, University of Copenhagen
11:50 AM	Tools for Exploratory Data Analysis Wendy L. Martinez, U.S. Bureau of Labor Statistics


CS08 Streamlining Your Work Using Apps		Fri, Feb 16, 11:00 AM - 12:30 PM Salon E


Chair(s): Blake Langlais, Mayo Clinic

11:05 AM	Mechanizing Clinical Review Processes with R Shiny for Efficiency and Standardization Jimmy Wong, Food and Drug Administration
11:50 AM	Building Shiny Apps: With Great Power Comes Great Responsibility Jessica Minnier, Oregon Health & Science University


Lunch (On Own)		Fri, Feb 16, 12:30 PM - 2:00 PM






CS09 Presenting and Storytelling		Fri, Feb 16, 2:00 PM - 3:30 PM Salon A


Chair(s): Shasha Bai, University of Arkansas for Medical Sciences

2:05 PM	How to Give a Really Awful Presentation Paul Teetor, William Blair & Co
2:50 PM	Telling the Story of Your Stats Jennifer H. Van Mullekom, Virginia Tech


CS10 Propensity Scores and Resampling Methods		Fri, Feb 16, 2:00 PM - 3:30 PM Salons BC


Chair(s): Christine Wells, UCLA Statistical Consulting Group

2:05 PM	CANCELED: A Streamlined Process for Conducting a Propensity Score-Based Analysis John A. Craycroft, University of Louisville
2:50 PM	Resampling Methods for Statistical Inference on Multi-Rater Kappas Chia-Ling Kuo, University of Connecticut Health


CS11 Data Mining Algorithms		Fri, Feb 16, 2:00 PM - 3:30 PM Salon D


Chair(s): Abbass Sharif, University of Southern California

2:05 PM	Stochastic Gradient Boosting on Distributed Data Roxy Cramer, Rogue Wave Software
2:50 PM	Deep Neural Networks for Scalable Prediction Lynd Bacon, Loma Buena Assoc./Notre Dame Univ./Northwestern Univ.


CS12 Education to Practice and Data Visualization		Fri, Feb 16, 2:00 PM - 3:30 PM Salon E


Chair(s): Chester Ismay, DataCamp

2:05 PM	What Is Happening at the School Level and Why It Is Important to Statistical Practice Jane Watson, University of Tasmania
2:50 PM	The Life-Cycle of a Project: Visualizing Data from Start to Finish Nola du Toit, NORC at the University of Chicago


CS13 Managing Up		Fri, Feb 16, 3:45 PM - 5:15 PM Salon A


Chair(s): Ronald Gangnon, University of Wisconsin School of Medicine and Public Health

3:50 PM	What Does It Take for an Organization to Make Difficult Information-Based Decisions? Using the Oregon Department of Forestry’s RipStream Project as a Case Study Jeremy Groom, Groom Analytics
4:35 PM	Statistics for Management of an Organization Joyce Nilsson Orsini, Fordham University Graduate School of Business


CS14 Working with Health Care Data		Fri, Feb 16, 3:45 PM - 5:15 PM Salons BC


Chair(s): Melanie Edwards, Exponent

3:50 PM	Application of Support Vector Machine Modeling and Graph Theory Metrics for Disease Classification Jessica Michelle Rudd, Kennesaw State University
4:35 PM	Assessing Correspondence Between Two Data Sources Across Categorical Covariates with Missing Data: Application to Electronic Health Records Emile Latour, Oregon Health & Science University


CS15 Statisticians Teaching		Fri, Feb 16, 3:45 PM - 5:15 PM Salon D


Chair(s): Georgette Asherman, Direct Effects, LLC

3:50 PM	Should I Bring a Basket of Fish or Some Fishing Poles? Kathy Hall, Hewlett Packard
4:35 PM	Engaging Undergraduates in Statistical Consulting Christina Phan Knudson, University of St. Thomas


CS16 Novel Applications of Data Visualization		Fri, Feb 16, 3:45 PM - 5:15 PM Salon E


Chair(s): Mary Grace Crissey, CSRA

3:50 PM	Warranty/Performance Text Exploration for Modern Reliability Scott Lee Wise, SAS Institute, Inc.
4:35 PM	Improving the Data Customer’s Ability to Visualize Historical Agricultural Data at the National Agricultural Statistics Service Irwin Anolik, USDA-NASS


PS2 Poster Session 2 and Refreshments		Fri, Feb 16, 5:15 PM - 6:30 PM Salons F-I


Chair(s): S. Keith Anderson, Mayo Clinic

	Data, Data Everywhere …, but Mind the Disclaimers: Benefits and Costs of Matching Large Cohorts to Individual US Mortality Case Data in the NDI, SSA Death Master File (DMF/SSDI), and More Sigurd Wilson Hermansen, Westat
	Curating and Visualizing Big Data from Wearable Activity Trackers Meike Niederhausen, OHSU-PSU School of Public Health
	Consensus Strategy for Variable Selection in Clinical Prediction Rule Development Miriam R. Elman, OHSU/OSU College of Pharmacy
	Reproducible Research Implemented Through Version Control Systems Lillian S. Lin, Montana State University
	The Boeing Applied Statistics ToolKit: Best Practices and Tools for Collaboration and Reproducibility in High-Throughput Consulting Robert Michael Lawton, Boeing Research & Technology
	Empirical Comparisons of Differential Expression Analysis Pipelines for RNA-Sequencing Data Lina Gao, Biostatistics Shared Resource (OHSU BSR); Biostatistics and Bioinformatics Unit (ONPRC BBU)
	A Practical Guide for Modeling Length of Stay with Focus on Right Skewness and Zero Inflation Lizhou Nie, Stony Brook University
	Nonparametric Estimation of Time-Variant Quantiles and Statistical Models Jessica Michelle Rudd, Kennesaw State University
	Estimating the Relative Excess Risk Due to Interaction in Clustered Data Settings Katharine Fischer Berry Correia, Harvard T.H. Chan School of Public Health
	Spatial Analysis of Fukushima Thyroid Ultrasound Examination Survey Data Emerson H. Webb, Reed College
	A Growth Reference for Mid-Upper-Arm Circumference for Age Among School-Age Children and Adolescents, with Validation for Mortality in Two Cohorts Lazarus K. Mramba, University of Florida
	Machine Learning Methods for Predicting Zygosity Ally Rochelle Avery, Washington State University
	Simulating Real-World Data with Time-Varying Variables Maria Emilia de Oliveira Montez-Rath, Stanford University
	Evaluating the Effectiveness of the Flipped Classroom Model Using Structural Equation Modeling Shan Wang, Assistant Professor
	Software for Covariate Specification in Linear, Logistic, and Survival Regression Sai Liu, Stanford University
	Exploratory Analyses from Different Forms of Interactive Visualizations Lata Kodali, Virginia Tech
	Using SAS Programming to Create Complex Paneled Graphs from Electronic Health Records Carrie Tillotson, OCHIN, Inc.
	An Algorithm to Identify Family Linkages Using Electronic Health Record Data Megan Hoopes, OCHIN, Inc.


Saturday, February 17
Registration		Sat, Feb 17, 7:30 AM - 2:30 PM






Exhibits Open		Sat, Feb 17, 7:30 AM - 1:00 PM Salons F-I






PS3 Poster Session 3 and Continental Breakfast		Sat, Feb 17, 8:00 AM - 9:15 AM Salons F-I


Chair(s): Edward Mulrow, NORC at the University of Chicago

	Thematic Feature Selection for Research Support Thealexa Becker, Federal Reserve Bank of Kansas City
	Systematizing Your Statistical Consulting Practice Terrie Vasilopoulos, University of Florida, College of Medicine
	Sixteen Personalities at Work Katherine Eleanor Tranbarger Freier, Intel Corporation
	Re-Examining Sick Quitter Hypothesis on Association of Alcohol Consumption with Coronary Heart Disease Amy Z. Fan, National Institutes of Health
	Comparisons of Propensity Score Analysis for Analyzing Rare Binary Outcome Jihye Park, Stony Brook University
	Understanding Graduate School Speed-Dating with Generalized Linear Mixed Models Christina Phan Knudson, University of St. Thomas
	Data Modeling to Mitigate the Impact of Missing Data in a Longitudinal Study of Injecting Drug Users Tania Amanda Patrao, University of Queensland, Australia
	Multivariate Statistical Analysis in Plastic Foam Research Wenyu Su, The Dow Chemical Company
	Win Ratio Application for a Composite Outcome in a Randomized Cardiovascular Trial Rose A. Hamershock, TIMI Study Group
	Statistical Analysis of Network Change Teresa Danielle Schmidt, Portland State University
	Exploring Data Quality and Time Series Event Detection in 2016 US Presidential Election Polls Kaelyn M. Rosenberg, Reed College
	Understanding and Using Ordinal Factor Analysis Nivedita Bhaktha, The Ohio State University
	An Easy-to-Use SAS® Macro for a Descriptive Statistics Table with P-Values Yuanchao Zheng, Stanford University
	Animated Data Visualization with Plotly: Useful Tool for Health Care Quality Improvement Eric A. Tesdahl, SpecialtyCare, Inc.
	Using Accessible Patient Data to Individualize Sample Timing for Pharmacokinetic Studies Matthew Stephen Shotwell, Vanderbilt University Medical Center


CS17 Passion for Statistics		Sat, Feb 17, 9:15 AM - 10:45 AM Salon A


Chair(s): Kathleen A. Jablonski, The George Washington University

9:20 AM	Am I Supposed to Enjoy My Job? Career Observations from a Biostatistician Daniel Thomas Cotton, Boehringer Ingelheim Pharmaceuticals
10:05 AM	Statistics in the Wild: Practicing Statistics in Nontraditional Places, from a Tiny Island in the Pacific to the Federal Cabinet Heather Krause, Datassist


CS18 Survival Analysis v. 'Survival' Analysis		Sat, Feb 17, 9:15 AM - 10:45 AM Salons BC


Chair(s): Yulia Marchenko, StataCorp LLC

9:20 AM	'How Long Would You Wait?' Using Time-to-Event (Survival) Analysis to Explore Waiting Times Ruth Hummel, SAS Institute
10:05 AM	Statistical Methods for National Security Risk Quantification and Optimal Resource Allocation Robert Brigantic, Pacific Northwest National Laboratory


CS19 Business Intelligence Applications		Sat, Feb 17, 9:15 AM - 10:45 AM Salon D


Chair(s): Chris Holloman, ICC

9:20 AM	Business Intelligence (BI) Reporting Solution: From Source to Nuts Andrew Piskorowski, Survey Research Center, University of Michigan
10:05 AM	Location Analytics: An Application of GIS Moxie Zhang, Esri (China)


CS20 Understanding Populations		Sat, Feb 17, 9:15 AM - 10:45 AM Salon E


Chair(s): Shelley DeVost, Los Angeles LGBT Center

9:20 AM	Quantifying Populations in Proximity to Oil and Gas Development: A National Spatial Analysis and Review Tanja Srebotnjak, Harvey Mudd College
10:05 AM	Approaches and Techniques for Estimating the Total Number of Species in a Population, with Emphasis on Application to Mineral Species Grethe Hystad, Purdue University Northwest


CS21 Developing Communication Skills		Sat, Feb 17, 11:00 AM - 12:30 PM Salon A


Chair(s): Cynthia R. Long, Palmer Center for Chiropractic Research, Palmer College of Chiropractic

11:05 AM	How to Communicate Statistics, and How Statisticians Should Communicate Achim Guettner, Novartis Pharma
11:50 AM	PANEL: Communication Skills: What's Next Lillian S. Lin, Montana State University; Kim Love, K. R. Love Quantitative Consulting and Collaboration; Alicia Toledano, Biostatistics Consulting, LLC; Eric Vance, LISA--University of Colorado Boulder


CS22 Small Sample Sizes and Non-Probability Sampling		Sat, Feb 17, 11:00 AM - 12:30 PM Salons BC


Chair(s): Amy Laird, Oregon Clinical & Translational Research Institute (OCTRI)

11:05 AM	Quantifying and Incorporating Sources of Variability and Uncertainty in Statistical Analyses with Very Small Sample Sizes Annette M. Bachand, Ramboll Environ
11:50 AM	Non-Probability Sampling: Wave of the Future in Survey Research? Karol Krotki, RTI


CS23 Data Science Applications		Sat, Feb 17, 11:00 AM - 12:30 PM Salon D


Chair(s): Sarah Burgoyne, Claritas

11:05 AM	Recent Advances in the Analysis and Detection of Communities in a Network Frederick Kin Hing Phoa, Institute of Statistical Science, Academia Sinica
11:50 AM	Firehose Data Science: Real-Time Analytics of Twitter Feeds David Corliss, Ford Motor Company


CS24 Causal Inference		Sat, Feb 17, 11:00 AM - 12:30 PM Salon E


Chair(s): Larisa G. Tereshchenko, Oregon Health and Science University

11:05 AM	Causal Inference with Multilevel Data Structures Luke Keele, Georgetown
11:50 AM	A Decision Tool for Causal Inference and Observational Data Analysis Methods in Comparative Effectiveness Research (DECODE CER) Douglas Landsittel, University of Pittsburgh


Lunch (On Own)		Sat, Feb 17, 12:30 PM - 2:00 PM






PCD1 Deploying Quantitative Models as 'Visuals' in Popular Data Visualization Platforms		Sat, Feb 17, 2:00 PM - 4:00 PM Salon E
Instructor(s): Daniel Fylstra, Frontline Systems Inc.


Data visualization and business intelligence tools such as Tableau and Power BI have become extremely popular in recent years. Tableau reports that over 90% of Fortune 500 companies are now customers, while Microsoft reports that over 200,000 organizations of all sizes are using Power BI. These tools currently offer easy-to-use access to many data sources, powerful facilities for "slicing and dicing" data, and rich, flexible data visualization, but only limited built-in analytics methods. A new avenue has emerged in the past year for extending analytics methods in both Tableau and Power BI, and this provides a new way for an analyst to develop quantitative models outside these platforms, then deploy them as 'visuals' inside Tableau and Power BI, in 'dashboards' which are often published for use by thousands of users in an organization. Though originally conceived as a way to extend the range of visualization styles, these components can perform arbitrary computations on data before it is rendered in visual form. In this session, Excel Solver developer Frontline Systems, one of the first to explore this new avenue, will demonstrate use of its tools to automatically convert existing quantitative models into 'visuals' for both Tableau and Power BI. Among other options, this enables an analyst to convert a predictive (data mining, machine learning) or prescriptive (optimization, simulation) model from Microsoft Excel into an easily deployed 'visual' with just two mouse clicks. No programming is required, but the ability to extend models using high-level RASON modeling language code or programming language code is available. These 'visuals' are full-fledged models that easily connect to any Tableau or Power BI data source, and re-solve the underlying problem whenever the data sources are refreshed.


PCD2 Handling Missing Data Using Multiple Imputation		Sat, Feb 17, 2:00 PM - 4:00 PM Salons BC
Instructor(s): Yulia Marchenko, StataCorp LLC


This workshop will cover the use of Stata to perform multiple-imputation analysis. Multiple imputation (MI) is a simulation-based technique for handling missing data. The course will provide a brief introduction to multiple imputation and will demonstrate how to perform multiple imputation in Stata. The three stages of MI (imputation, completed-data analysis, and pooling) will be discussed with accompanying Stata examples. Imputation using multivariate normal (MVN) and using chained equations (MICE, FCS) will be discussed. A number of examples demonstrating how to efficiently manage multiply imputed data within Stata will also be provided. Linear and logistic regression analysis of multiply imputed data as well as several postestimation features will be presented. No prior knowledge of Stata is required, but basic familiarity with multiple imputation will prove useful.
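
The pooling stage mentioned above is commonly carried out with Rubin's rules. The following is a minimal Python sketch of that step, not the Stata workflow taught in the workshop; the function name `pool_rubin` and the input numbers are hypothetical and chosen only for illustration.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m completed-data estimates and their variances with Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    pooled = mean(estimates)        # pooled point estimate
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation (sample) variance
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical numbers: one coefficient estimated on m = 3 imputed data sets.
pooled, total = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total)
```

The total variance combines the average sampling variance with the extra uncertainty due to the missing data, which is why the between-imputation term is inflated by 1 + 1/m.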


T1 Engage the Room: Mastering Your Personal Presentation Style		Sat, Feb 17, 2:00 PM - 4:00 PM Salon A
Instructor(s): Duncan Burl Gilles, Art of Problem Solving


As confident as we may be in the quality of our work, presentation can make or break the impact it has. Engaging the room and communicating clearly can make the difference between an unimpressed, bored audience and a thrilled audience eager to learn more. This course will focus on presentation techniques that help you communicate your ideas effectively and in an engaging manner. You’ll be trained on ways to draw your audience into your talk, engage them in active listening and thinking, and use your voice and the space of the room to command attention and convey your message. These are skills applicable in many areas – whether presenting your work to clients, teaching in the classroom, one-on-one interviews or discussions, and even CSP talks! After the talk, participants will have the chance to send a short video of a talk to the presenter for review and feedback.


T2 Applying Propensity Score Methods to Observational Studies Using R and SAS		Sat, Feb 17, 2:00 PM - 4:00 PM Eugene
Instructor(s): Wei Pan, Duke University


Observational studies are common in applied settings but pose threats to the validity of causal inference due to selection bias in the data. Propensity score methods have been increasingly used as a means of reducing selection bias to enhance the causal claims. A training course on the application of propensity score methods to observational studies using commonly used statistical software would be beneficial for applied statisticians and researchers to improve the quality of their observational studies. With this objective, the proposed course will introduce basic concepts and practical issues of propensity score methods, including matching, stratification, and weighting; the instructors will facilitate hands-on activities of applying propensity score methods to observational studies with real-world examples using R and SAS. No prior knowledge of propensity score methods or computer programming is required. Participants are encouraged to bring their own laptop computers for hands-on activities.
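
Of the three approaches named above, weighting is the simplest to sketch in a few lines. The Python example below assumes the propensity scores have already been estimated (for instance by logistic regression) and uses normalized inverse probability weights; the function name `ipw_ate` and the data are hypothetical, and the course itself uses R and SAS.

```python
def ipw_ate(treated, outcome, propensity):
    """Estimate the average treatment effect with normalized inverse probability weights."""
    w1 = w0 = y1 = y0 = 0.0
    for t, y, p in zip(treated, outcome, propensity):
        if not 0.0 < p < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        if t:
            w = 1.0 / p            # treated units are weighted by 1 / e(x)
            w1 += w
            y1 += w * y
        else:
            w = 1.0 / (1.0 - p)    # control units are weighted by 1 / (1 - e(x))
            w0 += w
            y0 += w * y
    # Difference of weighted outcome means (assumes both groups are non-empty).
    return y1 / w1 - y0 / w0


# Hypothetical data: four units with already-estimated propensity scores.
print(ipw_ate([1, 1, 0, 0], [3, 5, 1, 2], [0.5, 0.5, 0.5, 0.5]))
```

Propensity scores close to 0 or 1 produce very large weights, which is one of the practical issues such a course would be expected to address.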


T3 A Workshop on Validation of Discrete Response Statistical Models		Sat, Feb 17, 2:00 PM - 4:00 PM Portland
Instructor(s): Raul Eduardo Avelar Moran, Texas A&M Transportation Institute


Count models are widely used to analyze discrete data in various fields. When the intent of the analysis is prediction, model validation is an important step before the model can be offered with confidence to final users. This tutorial will discuss when and why to validate, and will demonstrate model validation techniques specific to discrete response models, such as Poisson and Negative Binomial Generalized Linear Regression Models.
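
One widely used way to validate a count model is to score its predictions on held-out data. The Python sketch below computes the mean Poisson deviance between observed counts and predicted means; it is an illustrative assumption about what validation could look like, not the tutorial's own procedure, and the function name `mean_poisson_deviance` is hypothetical.

```python
import math


def mean_poisson_deviance(observed, predicted):
    """Average Poisson deviance between observed counts and predicted means."""
    total = 0.0
    for y, mu in zip(observed, predicted):
        if mu <= 0 or y < 0:
            raise ValueError("predicted means must be positive and counts non-negative")
        # By convention y * log(y / mu) is taken to be 0 when y == 0.
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += 2.0 * (log_term - (y - mu))
    return total / len(observed)


# Hypothetical held-out counts and the means predicted by a fitted model.
print(mean_poisson_deviance([0, 2, 5], [0.5, 2.0, 4.0]))
```

Lower values indicate predictions closer to the held-out counts; comparing the held-out deviance with the in-sample deviance gives a rough check for overfitting.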


T4 Tools for Connecting R, SAS, and Stata to Word: A Practical Approach to Reproducibility		Sat, Feb 17, 2:00 PM - 4:00 PM Salon D
Instructor(s): Abigail S. Baldridge, Northwestern University; Leah J. Welty, Northwestern University


Reproducibility, wherein data analysis and documentation are sufficient so that results can be recomputed or verified, is an increasingly important component of statistical practice. “Weaving” tools such as R Markdown facilitate reproducibility by combining narrative text and analysis code in one plain-text document, but are of limited use when manuscripts or reports must be generated in MS Word (e.g., due to journal requirements or client preference). This course will: (1) summarize how weaving tools create Word documents, and the ensuing limitations; and (2) introduce an alternate approach using recently released StatTag software. StatTag is a free, open-source program that embeds results (values, tables, figures, or verbatim output) from R, SAS, or Stata directly in Word such that they can be automatically updated if code or data changes. This course is intended for a broad audience; prerequisites are experience preparing documents in Word and conducting analysis in any one of R, SAS, or Stata. The workshop will provide practical, hands-on examples drawn from R, SAS, and Stata, and will include an overview of weaving approaches as well as an introduction to StatTag.


GS2 Closing General Session		Sat, Feb 17, 4:15 PM - 5:30 PM Salon E


Chair(s): Eric Vance, LISA--University of Colorado Boulder
The Closing Session is an opportunity for you to interact with the CSP Steering Committee in an open discussion about how the conference went and how it could be improved in future years. CSPSC vice chair, Eric Vance, will lead a panel of committee members as they summarize their conference experience. The audience will then be invited to ask questions and provide feedback. The committee highly values suggestions for improvements gathered during this time. The best student poster will also be awarded during the Closing Session, and each attendee will have an opportunity to win a door prize.


American Statistical Association
