Keynote Address | Concurrent Sessions | Poster Sessions
Short Courses (full day) | Short Courses (half day) | Tutorials | Practical Computing Demonstrations | Closing General Session with Refreshments
Thursday, February 15 | ||
Registration
|
Thu, Feb 15, 7:00 AM - 6:30 PM
|
|
|
||
SC1
Introduction to Big Data Analysis
|
Thu, Feb 15, 8:00 AM - 5:30 PM
Salon A |
|
Instructor(s): Fulya Gokalp Yavuz, Yildiz Technical University; Mark Daniel Ward, Purdue University | ||
This one-day introductory workshop is geared toward CSP participants who want to revitalize or improve their data analysis skills, with an emphasis on big data. Ward and Gokalp Yavuz will present tools and techniques for the most fundamental, low-level aspects of data analysis. The instructors are well versed in teaching such techniques to students who have no background in data analysis or programming, and the course has no prerequisites. The workshop will bring participants up to speed with powerful techniques for data analysis. It will be hands-on and driven by examples that use large data sets. The intended participants are people who work in a data-driven environment and have an increasing need to analyze large data sets. Before data can be analyzed, a great deal of data manipulation is necessary, especially with big data sets. Sometimes the data need to be scraped from remote sources and then parsed into more natural forms, a process that often involves munging and cleaning the data. The need to reproduce and reliably verify all of the methods used for data wrangling is more important than ever.
|
||
|
||
SC2
An Introduction to D3.js: From Scattered to Scatterplot
|
Thu, Feb 15, 8:00 AM - 5:30 PM
Salon B |
|
Instructor(s): Scott Murray, O’Reilly Media
Download Handouts |
||
Interested in coding data visualizations on the web, but don't know where to start? This workshop will have you transforming data into visual images in no time at all, starting from scratch and building an interactive scatterplot by the end of the session. We'll use d3.js, the web's most powerful library for data visualization, to load data and translate values into SVG elements — drawing lines, points, and scaled axes to label our data. We’ll learn how to use motion and visual transitions, and introduce simple interactivity to make our charts more explorable. All methods and examples will be up-to-date for the current version of D3 (4.x as of this writing).
|
||
|
||
SC3
Collaboration Essentials for Practicing Statisticians and Data Scientists
|
Thu, Feb 15, 8:00 AM - 12:00 PM
Salon C |
|
Instructor(s): Heather Smith, Cal Poly; Eric Vance, LISA--University of Colorado Boulder
Download Handouts |
||
Statisticians and data scientists positively impact many people, organizations, and governments through the careful collection, analysis, and interpretation of data to solve problems and make decisions. To maximize their impact, statisticians and data scientists must effectively collaborate with a variety of domain experts who originate the data or the problems to be solved. In this short course, participants will learn and practice essential skills to improve their professional communication and collaboration to increase their effectiveness on the job. Specifically, participants will learn how to establish foundational collaborative relationships with domain experts; structure effective meetings; and effectively communicate with non-statisticians. Participants will also practice their newly acquired skills and learn how to improve their proficiency in these essential collaboration skills by using role-plays and video coaching and feedback reviews outside of this short course. In sum, participants will learn and practice how to leverage their technical skills to more effectively collaborate for maximal impact inside and outside of their organizations.
|
||
|
||
SC4
A Variety of Mixed Models: Linear, Generalized Linear, and Nonlinear
|
Thu, Feb 15, 8:00 AM - 12:00 PM
Salon E |
|
Instructor(s): David A. Dickey, NC State University
Download Handouts |
||
The MIXED procedure in SAS, for example, correctly handles linear models that have multiple sources of random effects, such as random town-to-town, store-to-store, and aisle-to-aisle variation in sales. Associated fixed effects might be product price, color of packaging, and amount spent on advertising. The talk begins with a checklist for deciding when to treat effects as random versus fixed and follows with a series of examples. When the response variable is not normal, for example with a binary or Poisson response, additional complexities arise. Models with such non-normal responses are often analyzed by assuming that some transformation, or link function, of the expected value of Y results in a linear model with fixed and random effects. We are then in the generalized linear mixed model setting. It may be that a model cannot be linearized by a transformation, thus making it a nonlinear model. If random effects are involved, the model is referred to as a nonlinear mixed model. With a minimal amount of theory and an emphasis on examples, these types of models will be explained and illustrated. SAS will be used, but the ideas and interpretation are software independent.
|
||
|
||
SC5
Cleaning Up the Data Cleaning Process: Challenges and Solutions in R
|
Thu, Feb 15, 8:00 AM - 12:00 PM
Salon D |
|
Instructor(s): Claus Thorn Ekstrøm, Biostatistics, University of Copenhagen; Anne Helby Petersen, Biostatistics, University of Copenhagen | ||
Data cleaning and validation are the first steps in any data analysis, as the validity of the conclusions from the analysis hinges on the quality of the input data. Mistakes in the data can arise for any number of reasons, including erroneous codings, malfunctioning measurement equipment, and inconsistent data generation manuals. We present a systematic, analytical approach to data cleaning that will ensure the data cleaning process is just as structured and well-documented as the rest of the data analysis. The primary software tool is the dataMaid R package, which implements an extensive and customisable suite of quality assessment tools that can be used to identify potential problems in a dataset. The results are summarised in an auto-generated, non-technical, stand-alone document readable by statisticians and non-statisticians alike. Thus, the course teaches practical skills that aid the dialogue between data analysts and field experts, while also providing easy documentation of reproducible data cleaning steps and data quality control.
|
||
|
||
SC6
Effective Presentation for Statisticians and Data Scientists: Success=(PD)^2
|
Thu, Feb 15, 1:30 PM - 5:30 PM
Salon C |
|
Instructor(s): Jennifer H. Van Mullekom, Virginia Tech
Download Handouts |
||
Statisticians must be able to effectively convey their ideas to clients, collaborators, and decision-makers. Presenting in the modern world is even more daunting when speakers have the opportunity to employ slideware, videos, and live demos. Unfortunately, university coursework and professional development programs are often not targeted towards sharpening these skills. This short course, developed and taught by statisticians, will provide an opportunity to learn how to employ different methods and tools in each phase of the framework taught: Prepare, Design, Practice, and Deliver. The material covered in the course is geared toward data-based presentations and is based on the works of Garr Reynolds and Michael Alley, among others. The course will emphasize the importance of stepping away from the computer to Prepare an effective message aimed at your core point, guided by a series of questions and tips. The Design phase emphasizes the importance of structure, streamlining, and good graphic design, accompanied by a series of checklists. Of course, “Practice makes perfect,” so we cannot skip this step. Finally, engaging the audience and effectively using the room and equipment are covered in the Deliver phase.
|
||
|
||
SC7
Statistical Learning Methods in R
|
Thu, Feb 15, 1:30 PM - 5:30 PM
Salon E |
|
Instructor(s): Kelly Sue McConville, Swarthmore College
Download Handouts |
||
Applied statisticians are often confronted with difficult modeling problems where standard regression approaches are not appropriate. For example, it may be that the number of possible predictors is large relative to the sample size or that the relationship between the variables is non-linear. This course will cover several statistical learning techniques that are designed to handle these difficult modeling problems. In particular, we will study penalized regression techniques (lasso, ridge, elastic net), non-parametric regression (regression and smoothing splines), and classification methods (support vector machines, trees). Using data from the Bureau of Labor Statistics, participants will learn how to fit these models in R. R Markdown files with the relevant code will be provided so that participants can actively follow along with the demonstrations.
|
||
|
||
SC8
NISS Shortcourse: A Survey of Modern Data Science
|
Thu, Feb 15, 1:30 PM - 5:30 PM
Salon D |
|
Instructor(s): David Banks, Dept. of Statistical Science, Duke University
Download Handouts |
||
Modern data science is driven by applications, and these often entail Big Data and machine learning perspectives. This short course reviews key ideas and methods in nonparametric regression (starting with cross-validation and light bootstrap asymptotics, then moving on to the additive model, the generalized additive model, and neural networks). It also covers variable selection, with the Lasso and the Median Model, and describes the p >> n problem in the context of contributions by Candes and Tao, Donoho and Tanner, and Wainwright. The course next treats classification, with emphasis upon Random Forests and on ensemble strategies such as bagging, stacking, and boosting.
|
||
|
||
PS1
Poster Session 1 and Opening Mixer
|
Thu, Feb 15, 5:30 PM - 7:00 PM
Salons F-I |
|
Chair(s): Alok Dwivedi, Texas Tech University Health Sciences Center El Paso (TTUHSC EP) | ||
|
||
1 Some Dimension Reduction Strategies for the Analysis of Survey Data
View Presentation Jiaying Weng, University of Kentucky |
||
2 Perl-Compatible Regular Expressions as a Tool to Abstract Semi-Structured Electronic Health Records
View Presentation Samantha Emily Montag, Northwestern University |
||
3 Collaborative Process to Efficiently Produce Publications in Multicenter Research
View Presentation Cody S. Olsen, University of Utah, Department of Pediatrics |
||
4 Developing a Comprehensive Personal Plan for Teleworking (Working Remotely)
View Presentation Julia Lull, Janssen Research & Development, LLC |
||
5 Thank You, Come Again: Modeling Repeat Purchase Behavior for Business Travelers
View Presentation Diag D. Davenport, Georgetown University |
||
6 Wavelet-Based Methods for Data-Driven Monitoring
View Presentation Achraf Cohen, University of West Florida |
||
7 A Simulation Study of Violations of the Local Independence Assumption in Latent Class Analyses
View Presentation Michael P. Chen, U.S. Centers for Disease Control and Prevention |
||
8 Impact of Linear Regression Predictor Omission on Estimation and Inference
View Presentation Julia L. Sharp, Colorado State University |
||
10 A Comparison of Standard Logistic Regression, Multilevel Modeling, Robust Error Estimation, and Exposure Simulation for Data Containing Quasi-Berkson Error
View Presentation Angelique Liddell Zeringue, Mercy Healthcare |
||
11 Statistical Modeling for Repeated Measures in Rubber Research
View Presentation Wenzhao Yang, The Dow Chemical Company |
||
12 Combining Historical Data and Propensity Score Methods in Observational Studies to Improve Internal Validity
Miguel Marino, Oregon Health & Science University |
||
13 Marketing Communication Channel Preference Optimization Using a Two-Stage Statistical Modeling
View Presentation Hongying Yang, Statistical consultant |
||
14 Limitations of Propensity Score Methods: Demonstration Using a Real-World Example
View Presentation Gregory B. Tallman, Oregon State University/Oregon Health & Science University |
||
15 Effect Size Measures for Nonlinear Count Regression Models
View Presentation Stefany Coxe, Florida International University |
||
16 Appropriate Dimension Reduction for Sparse, High-Dimensional Data Using Intensity Plots and Other Visualizations
View Presentation Eugenie Jackson, West Virginia University |
||
17 Navigating Large-Scale Forest Plots Using R and Shiny
View Presentation Steele Valenzuela, Oregon Health & Science University |
||
18 Ranked-Choice Voting R Package
View Presentation Jay Lee, Reed College |
||
Exhibits Open
|
Thu, Feb 15, 5:30 PM - 7:00 PM
Salons F-I |
|
|
||
Friday, February 16 | ||
Registration
|
Fri, Feb 16, 7:30 AM - 5:30 PM
|
|
|
||
Continental Breakfast
|
Fri, Feb 16, 7:30 AM - 8:30 AM
Salons F-I |
|
|
||
Exhibits Open
|
Fri, Feb 16, 7:30 AM - 6:30 PM
Salons F-I |
|
|
||
GS1
Keynote Address
|
Fri, Feb 16, 8:00 AM - 9:00 AM
Salon E |
|
Chair(s): Kim Love, K. R. Love Quantitative Consulting and Collaboration | ||
|
||
8:05 AM |
Reflections on Career Opportunities and Leadership in Statistics
Lisa LaVange, The University of North Carolina |
|
CS01
#LeadWithStatistics
|
Fri, Feb 16, 9:15 AM - 10:45 AM
Salon A |
|
Chair(s): Sejong Bae, Comprehensive Cancer Center, University of Alabama | ||
|
||
9:20 AM |
Q&A with Lisa LaVange
Lisa LaVange, The University of North Carolina |
|
10:05 AM |
Developing and Delegating: Two Key Strategies to Master as a Technical Leader
View Presentation Diahanna L. Post, Nielsen, Columbia University |
|
CS02
Practical Considerations for Modeling
|
Fri, Feb 16, 9:15 AM - 10:45 AM
Salons BC |
|
Chair(s): Trijya Singh, Le Moyne College | ||
|
||
9:20 AM |
Evaluating Model Fit for Predictive Validity
Katherine M. Wright, Northwestern University |
|
10:05 AM |
Flexible Modeling and Experimental Design Strategies
Timothy E. O'Brien, Loyola University Chicago |
|
CS03
Text Analytics Applications
|
Fri, Feb 16, 9:15 AM - 10:45 AM
Salon D |
|
Chair(s): Steven Cohen, RTI International | ||
|
||
9:20 AM |
Approachable, Interpretable Tools for Mining and Summarizing Large Text Corpora in R
View Presentation Luke W. Miratrix, Harvard University |
|
10:05 AM |
Latent Dirichlet Allocation Topic Models Applied to the Centers for Disease Control and Prevention’s Grant
View Presentation Matthew Keith Eblen, Centers for Disease Control and Prevention |
|
CS04
Working with Messy Data
|
Fri, Feb 16, 9:15 AM - 10:45 AM
Salon E |
|
Chair(s): Karol Krotki, RTI | ||
|
||
9:20 AM |
Practical Time-Series Clustering for Messy Data in R
View Presentation Jonathan Robert Page, University of Hawaii Economic Research Organization (UHERO) |
|
10:05 AM |
Doing Data Linkage: A Behind-the-Scenes Look
View Presentation Clinton J. Thompson, National Center for Health Statistics, CDC |
|
CS05
Collaboration Essentials
|
Fri, Feb 16, 11:00 AM - 12:30 PM
Salon A |
|
Chair(s): Terrie Vasilopoulos, University of Florida, College of Medicine | ||
|
||
11:05 AM |
Asking Great Questions
View Presentation Eric Vance, LISA--University of Colorado Boulder |
|
11:50 AM |
Listening, Paraphrasing, and Summarizing
View Presentation Heather Smith, Cal Poly |
|
CS06
Bayesian Applications
|
Fri, Feb 16, 11:00 AM - 12:30 PM
Salons BC |
|
Chair(s): Mariangela Guidolin, Department of Statistical Sciences, University of Padua | ||
|
||
11:05 AM |
Bayesian Inference for Stochastic Processes
View Presentation Lyle David Broemeling, University of Texas MD Anderson Cancer Center |
|
11:50 AM |
Forecasting Periodic Accumulating Processes with Semiparametric Distributional Regression Models and Bayesian Updates
Harlan D. Harris, WayUp |
|
CS07
Exploring Big Data
|
Fri, Feb 16, 11:00 AM - 12:30 PM
Salon D |
|
Chair(s): Christina Phan Knudson, University of St. Thomas | ||
|
||
11:05 AM |
Exploratory Data Structure Comparisons by Use of Principal Component Analysis
View Presentation Anne Helby Petersen, Biostatistics, University of Copenhagen |
|
11:50 AM |
Tools for Exploratory Data Analysis
View Presentation Wendy L. Martinez, U.S. Bureau of Labor Statistics |
|
CS08
Streamlining Your Work Using Apps
|
Fri, Feb 16, 11:00 AM - 12:30 PM
Salon E |
|
Chair(s): Blake Langlais, Mayo Clinic | ||
|
||
11:05 AM |
Mechanizing Clinical Review Processes with R Shiny for Efficiency and Standardization
View Presentation Jimmy Wong, Food and Drug Administration |
|
11:50 AM |
Building Shiny Apps: With Great Power Comes Great Responsibility
View Presentation Jessica Minnier, Oregon Health & Science University |
|
Lunch (On Own)
|
Fri, Feb 16, 12:30 PM - 2:00 PM
|
|
|
||
CS09
Presenting and Storytelling
|
Fri, Feb 16, 2:00 PM - 3:30 PM
Salon A |
|
Chair(s): Shasha Bai, University of Arkansas for Medical Sciences | ||
|
||
2:05 PM |
How to Give a Really Awful Presentation
View Presentation Paul Teetor, William Blair & Co |
|
2:50 PM |
Telling the Story of Your Stats
View Presentation Jennifer H. Van Mullekom, Virginia Tech |
|
CS10
Propensity Scores and Resampling Methods
|
Fri, Feb 16, 2:00 PM - 3:30 PM
Salons BC |
|
Chair(s): Christine Wells, UCLA Statistical Consulting Group | ||
|
||
2:05 PM |
CANCELED: A Streamlined Process for Conducting a Propensity Score-Based Analysis
John A. Craycroft, University of Louisville |
|
2:50 PM |
Resampling Methods for Statistical Inference on Multi-Rater Kappas
Chia-Ling Kuo, University of Connecticut Health |
|
CS11
Data Mining Algorithms
|
Fri, Feb 16, 2:00 PM - 3:30 PM
Salon D |
|
Chair(s): Abbass Sharif, University of Southern California | ||
|
||
2:05 PM |
Stochastic Gradient Boosting on Distributed Data
View Presentation Roxy Cramer, Rogue Wave Software |
|
2:50 PM |
Deep Neural Networks for Scalable Prediction
View Presentation Lynd Bacon, Loma Buena Assoc./Notre Dame Univ./Northwestern Univ. |
|
CS12
Education to Practice and Data Visualization
|
Fri, Feb 16, 2:00 PM - 3:30 PM
Salon E |
|
Chair(s): Chester Ismay, DataCamp | ||
|
||
2:05 PM |
What Is Happening at the School Level and Why It Is Important to Statistical Practice
View Presentation Jane Watson, University of Tasmania |
|
2:50 PM |
The Life-Cycle of a Project: Visualizing Data from Start to Finish
View Presentation Nola du Toit, NORC at the University of Chicago |
|
CS13
Managing Up
|
Fri, Feb 16, 3:45 PM - 5:15 PM
Salon A |
|
Chair(s): Ronald Gangnon, University of Wisconsin School of Medicine and Public Health | ||
|
||
3:50 PM |
What Does It Take for an Organization to Make Difficult Information-Based Decisions? Using the Oregon Department of Forestry’s RipStream Project as a Case Study
View Presentation Jeremy Groom, Groom Analytics |
|
4:35 PM |
Statistics for Management of an Organization
Joyce Nilsson Orsini, Fordham University Graduate School of Business |
|
CS14
Working with Health Care Data
|
Fri, Feb 16, 3:45 PM - 5:15 PM
Salons BC |
|
Chair(s): Melanie Edwards, Exponent | ||
|
||
3:50 PM |
Application of Support Vector Machine Modeling and Graph Theory Metrics for Disease Classification
View Presentation Jessica Michelle Rudd, Kennesaw State University |
|
4:35 PM |
Assessing Correspondence Between Two Data Sources Across Categorical Covariates with Missing Data: Application to Electronic Health Records
View Presentation Emile Latour, Oregon Health & Science University |
|
CS15
Statisticians Teaching
|
Fri, Feb 16, 3:45 PM - 5:15 PM
Salon D |
|
Chair(s): Georgette Asherman, Direct Effects, LLC | ||
|
||
3:50 PM |
Should I Bring a Basket of Fish or Some Fishing Poles?
View Presentation Kathy Hall, Hewlett Packard |
|
4:35 PM |
Engaging Undergraduates in Statistical Consulting
View Presentation Christina Phan Knudson, University of St. Thomas |
|
CS16
Novel Applications of Data Visualization
|
Fri, Feb 16, 3:45 PM - 5:15 PM
Salon E |
|
Chair(s): Mary Grace Crissey, CSRA | ||
|
||
3:50 PM |
Warranty/Performance Text Exploration for Modern Reliability
View Presentation Scott Lee Wise, SAS Institute, Inc. |
|
4:35 PM |
Improving the Data Customer’s Ability to Visualize Historical Agricultural Data at the National Agricultural Statistics Service
Irwin Anolik, USDA-NASS |
|
PS2
Poster Session 2 and Refreshments
|
Fri, Feb 16, 5:15 PM - 6:30 PM
Salons F-I |
|
Chair(s): S. Keith Anderson, Mayo Clinic | ||
|
||
1 Data, Data Everywhere …, but Mind the Disclaimers: Benefits and Costs of Matching Large Cohorts to Individual US Mortality Case Data in the NDI, SSA Death Master File (DMF/SSDI), and More
View Presentation Sigurd Wilson Hermansen, Westat |
||
2 Curating and Visualizing Big Data from Wearable Activity Trackers
View Presentation Meike Niederhausen, OHSU-PSU School of Public Health |
||
3 Consensus Strategy for Variable Selection in Clinical Prediction Rule Development
View Presentation Miriam R. Elman, OHSU/OSU College of Pharmacy |
||
4 Reproducible Research Implemented Through Version Control Systems
View Presentation Lillian S. Lin, Montana State University |
||
5 The Boeing Applied Statistics ToolKit: Best Practices and Tools for Collaboration and Reproducibility in High-Throughput Consulting
View Presentation Robert Michael Lawton, Boeing Research & Technology |
||
6 Empirical Comparisons of Differential Expression Analysis Pipelines for RNA-Sequencing Data
View Presentation Lina Gao, Biostatistics Shared Resource (OHSU BSR); Biostatistics and Bioinformatics Unit (ONPRC BBU) |
||
7 A Practical Guide for Modeling Length of Stay with Focus on Right Skewness and Zero Inflation
View Presentation Lizhou Nie, Stony Brook University |
||
8 Nonparametric Estimation of Time-Variant Quantiles and Statistical Models
View Presentation Jessica Michelle Rudd, Kennesaw State University |
||
9 Estimating the Relative Excess Risk Due to Interaction in Clustered Data Settings
View Presentation Katharine Fischer Berry Correia, Harvard T.H. Chan School of Public Health |
||
10 Spatial Analysis of Fukushima Thyroid Ultrasound Examination Survey Data
View Presentation Emerson H. Webb, Reed College |
||
11 A Growth Reference for Mid-Upper-Arm Circumference for Age Among School-Age Children and Adolescents, with Validation for Mortality in Two Cohorts
View Presentation Lazarus K. Mramba, University of Florida |
||
12 Machine Learning Methods for Predicting Zygosity
View Presentation Ally Rochelle Avery, Washington State University |
||
13 Simulating Real-World Data with Time-Varying Variables
View Presentation Maria Emilia de Oliveira Montez-Rath, Stanford University |
||
14 Evaluating the Effectiveness of the Flipped Classroom Model Using Structural Equation Modeling
View Presentation Shan Wang, Assistant Professor |
||
15 Software for Covariate Specification in Linear, Logistic, and Survival Regression
View Presentation Sai Liu, Stanford University |
||
16 Exploratory Analyses from Different Forms of Interactive Visualizations
Lata Kodali, Virginia Tech |
||
17 Using SAS Programming to Create Complex Paneled Graphs from Electronic Health Records
View Presentation Carrie Tillotson, OCHIN, Inc. |
||
18 An Algorithm to Identify Family Linkages Using Electronic Health Record Data
View Presentation Megan Hoopes, OCHIN, Inc. |
||
Saturday, February 17 | ||
Registration
|
Sat, Feb 17, 7:30 AM - 2:30 PM
|
|
|
||
Exhibits Open
|
Sat, Feb 17, 7:30 AM - 1:00 PM
Salons F-I |
|
|
||
PS3
Poster Session 3 and Continental Breakfast
|
Sat, Feb 17, 8:00 AM - 9:15 AM
Salons F-I |
|
Chair(s): Edward Mulrow, NORC at the University of Chicago | ||
|
||
1 Thematic Feature Selection for Research Support
View Presentation Thealexa Becker, Federal Reserve Bank of Kansas City |
||
2 Systematizing Your Statistical Consulting Practice
View Presentation Terrie Vasilopoulos, University of Florida, College of Medicine |
||
3 Sixteen Personalities at Work
View Presentation Katherine Eleanor Tranbarger Freier, Intel Corporation |
||
4 Re-Examining Sick Quitter Hypothesis on Association of Alcohol Consumption with Coronary Heart Disease
View Presentation Amy Z. Fan, National Institutes of Health |
||
5 Comparisons of Propensity Score Analysis for Analyzing Rare Binary Outcome
View Presentation Jihye Park, Stony Brook University |
||
6 Understanding Graduate School Speed-Dating with Generalized Linear Mixed Models
View Presentation Christina Phan Knudson, University of St. Thomas |
||
7 Data Modeling to Mitigate the Impact of Missing Data in a Longitudinal Study of Injecting Drug Users
View Presentation Tania Amanda Patrao, University of Queensland, Australia |
||
8 Multivariate Statistical Analysis in Plastic Foam Research
View Presentation Wenyu Su, The Dow Chemical Company |
||
9 Win Ratio Application for a Composite Outcome in a Randomized Cardiovascular Trial
View Presentation Rose A. Hamershock, TIMI Study Group |
||
10 Statistical Analysis of Network Change
View Presentation Teresa Danielle Schmidt, Portland State University |
||
11 Exploring Data Quality and Time Series Event Detection in 2016 US Presidential Election Polls
View Presentation Kaelyn M. Rosenberg, Reed College |
||
12 Understanding and Using Ordinal Factor Analysis
Nivedita Bhaktha, The Ohio State University |
||
13 An Easy-to-Use SAS® Macro for a Descriptive Statistics Table with P-Values
View Presentation Yuanchao Zheng, Stanford University |
||
14 Animated Data Visualization with Plotly: Useful Tool for Health Care Quality Improvement
View Presentation Eric A. Tesdahl, SpecialtyCare, Inc. |
||
15 Using Accessible Patient Data to Individualize Sample Timing for Pharmacokinetic Studies
View Presentation Matthew Stephen Shotwell, Vanderbilt University Medical Center |
||
CS17
Passion for Statistics
|
Sat, Feb 17, 9:15 AM - 10:45 AM
Salon A |
|
Chair(s): Kathleen A. Jablonski, The George Washington University | ||
|
||
9:20 AM |
Am I Supposed to Enjoy My Job? Career Observations from a Biostatistician
View Presentation Daniel Thomas Cotton, Boehringer Ingelheim Pharmaceuticals |
|
10:05 AM |
Statistics in the Wild: Practicing Statistics in Nontraditional Places, from a Tiny Island in the Pacific to the Federal Cabinet
Heather Krause, Datassist |
|
CS18
Survival Analysis v. 'Survival' Analysis
|
Sat, Feb 17, 9:15 AM - 10:45 AM
Salons BC |
|
Chair(s): Yulia Marchenko, StataCorp LLC | ||
|
||
9:20 AM |
'How Long Would You Wait?' Using Time-to-Event (Survival) Analysis to Explore Waiting Times
View Presentation Ruth Hummel, SAS Institute |
|
10:05 AM |
Statistical Methods for National Security Risk Quantification and Optimal Resource Allocation
View Presentation Robert Brigantic, Pacific Northwest National Laboratory |
|
CS19
Business Intelligence Applications
|
Sat, Feb 17, 9:15 AM - 10:45 AM
Salon D |
|
Chair(s): Chris Holloman, ICC | ||
|
||
9:20 AM |
Business Intelligence (BI) Reporting Solution: From Source to Nuts
View Presentation Andrew Piskorowski, Survey Research Center, University of Michigan |
|
10:05 AM |
Location Analytics: An Application of GIS
Moxie Zhang, Esri (China) |
|
CS20
Understanding Populations
|
Sat, Feb 17, 9:15 AM - 10:45 AM
Salon E |
|
Chair(s): Shelley DeVost, Los Angeles LGBT Center | ||
|
||
9:20 AM |
Quantifying Populations in Proximity to Oil and Gas Development: A National Spatial Analysis and Review
View Presentation Tanja Srebotnjak, Harvey Mudd College |
|
10:05 AM |
Approaches and Techniques for Estimating the Total Number of Species in a Population, with Emphasis on Application to Mineral Species
View Presentation Grethe Hystad, Purdue University Northwest |
|
CS21
Developing Communication Skills
|
Sat, Feb 17, 11:00 AM - 12:30 PM
Salon A |
|
Chair(s): Cynthia R. Long, Palmer Center for Chiropractic Research, Palmer College of Chiropractic | ||
|
||
11:05 AM |
How to Communicate Statistics, and How Statisticians Should Communicate
View Presentation Achim Guettner, Novartis Pharma |
|
11:50 AM |
PANEL: Communication Skills: What's Next
Lillian S. Lin, Montana State University; Kim Love, K. R. Love Quantitative Consulting and Collaboration; Alicia Toledano, Biostatistics Consulting, LLC; Eric Vance, LISA--University of Colorado Boulder |
|
CS22
Small Sample Sizes and Non-Probability Sampling
|
Sat, Feb 17, 11:00 AM - 12:30 PM
Salons BC |
|
Chair(s): Amy Laird, Oregon Clinical & Translational Research Institute (OCTRI) | ||
|
||
11:05 AM |
Quantifying and Incorporating Sources of Variability and Uncertainty in Statistical Analyses with Very Small Sample Sizes
View Presentation Annette M. Bachand, Ramboll Environ |
|
11:50 AM |
Non-Probability Sampling: Wave of the Future in Survey Research?
View Presentation Karol Krotki, RTI |
|
CS23
Data Science Applications
|
Sat, Feb 17, 11:00 AM - 12:30 PM
Salon D |
|
Chair(s): Sarah Burgoyne, Claritas | ||
|
||
11:05 AM |
Recent Advances in the Analysis and Detection of Communities in a Network
Frederick Kin Hing Phoa, Institute of Statistical Science, Academia Sinica |
|
11:50 AM |
Firehose Data Science: Real-Time Analytics of Twitter Feeds
View Presentation David Corliss, Ford Motor Company |
|
CS24
Causal Inference
|
Sat, Feb 17, 11:00 AM - 12:30 PM
Salon E |
|
Chair(s): Larisa G. Tereshchenko, Oregon Health and Science University | ||
|
||
11:05 AM |
Causal Inference with Multilevel Data Structures
Luke Keele, Georgetown |
|
11:50 AM |
A Decision Tool for Causal Inference and Observational Data Analysis Methods in Comparative Effectiveness Research (DECODE CER)
View Presentation Douglas Landsittel, University of Pittsburgh |
|
Lunch (On Own)
|
Sat, Feb 17, 12:30 PM - 2:00 PM
|
|
|
||
PCD1
Deploying Quantitative Models as 'Visuals' in Popular Data Visualization Platforms
|
Sat, Feb 17, 2:00 PM - 4:00 PM
Salon E |
|
Instructor(s): Daniel Fylstra, Frontline Systems Inc. | ||
Data visualization and business intelligence tools such as Tableau and Power BI have become extremely popular in recent years. Tableau reports that over 90% of Fortune 500 companies are now customers, while Microsoft reports that over 200,000 organizations of all sizes are using Power BI. These tools currently offer easy-to-use access to many data sources, powerful facilities for "slicing and dicing" data, and rich, flexible data visualization, but only limited built-in analytics methods. A new avenue has emerged in the past year for extending analytics methods in both Tableau and Power BI: an analyst can develop quantitative models outside these platforms, then deploy them as 'visuals' inside Tableau and Power BI, in 'dashboards' which are often published for use by thousands of users in an organization. Though originally conceived as a way to extend the range of visualization styles, these components can perform arbitrary computations on data before it is rendered in visual form. In this session, Excel Solver developer Frontline Systems, one of the first to explore this new avenue, will demonstrate use of its tools to automatically convert existing quantitative models into 'visuals' for both Tableau and Power BI. Among other options, this enables an analyst to convert a predictive (data mining, machine learning) or prescriptive (optimization, simulation) model from Microsoft Excel into an easily deployed 'visual' in just two mouse clicks. No programming is required, but the ability to extend models using high-level RASON modeling language code or programming language code is available. These 'visuals' are full-fledged models that easily connect to any Tableau or Power BI data source and re-solve the underlying problem whenever the data sources are refreshed.
|
||
PCD2
Handling Missing Data Using Multiple Imputation
|
Sat, Feb 17, 2:00 PM - 4:00 PM
Salons BC |
|
Instructor(s): Yulia Marchenko, StataCorp LLC | ||
This workshop will cover the use of Stata to perform multiple-imputation analysis. Multiple imputation (MI) is a simulation-based technique for handling missing data. The course will provide a brief introduction to multiple imputation and will demonstrate how to perform multiple imputation in Stata. The three stages of MI (imputation, completed-data analysis, and pooling) will be discussed with accompanying Stata examples. Imputation using multivariate normal (MVN) and using chained equations (MICE, FCS) will be discussed. A number of examples demonstrating how to efficiently manage multiply imputed data within Stata will also be provided. Linear and logistic regression analysis of multiply imputed data as well as several postestimation features will be presented. No prior knowledge of Stata is required, but basic familiarity with multiple imputation will prove useful.
|
||
T1
Engage the Room: Mastering Your Personal Presentation Style
|
Sat, Feb 17, 2:00 PM - 4:00 PM
Salon A |
|
Instructor(s): Duncan Burl Gilles, Art of Problem Solving
Download Handouts |
||
As confident as we may be in the quality of our work, presentation can make or break the impact it has. Engaging the room and communicating clearly can make the difference between an unimpressed, bored audience and a thrilled audience eager to learn more. This course will focus on presentation techniques that help you communicate your ideas effectively and in an engaging manner. You’ll be trained on ways to draw your audience into your talk, engage them in active listening and thinking, and use your voice and the space of the room to command attention and convey your message. These skills apply in many areas, whether you are presenting your work to clients, teaching in the classroom, holding one-on-one interviews or discussions, or even giving CSP talks! After the talk, participants will have the chance to send a short video of a talk to the presenter for review and feedback.
|
||
|
||
T2
Applying Propensity Score Methods to Observational Studies Using R and SAS
|
Sat, Feb 17, 2:00 PM - 4:00 PM
Eugene |
|
Instructor(s): Wei Pan, Duke University
Download Handouts |
||
Observational studies are common in applied settings but pose threats to the validity of causal inference due to selection bias in the data. Propensity score methods have been increasingly used as a means of reducing selection bias and strengthening causal claims. A training course on applying propensity score methods to observational studies with commonly used statistical software can help applied statisticians and researchers improve the quality of their observational studies. With this objective, this course will introduce basic concepts and practical issues of propensity score methods, including matching, stratification, and weighting; the instructors will facilitate hands-on activities of applying propensity score methods to observational studies with real-world examples using R and SAS. No prior knowledge of propensity score methods or computer programming is required. Participants are encouraged to bring their own laptop computers for hands-on activities.
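As a rough illustration of the weighting approach mentioned above, here is a minimal Python sketch of a normalized inverse-probability-weighted estimate of an average treatment effect. It is not from the course, which uses R and SAS. It assumes the propensity scores have already been estimated, and the function name `ipw_ate` and the eight-unit data set are hypothetical.

```python
def ipw_ate(treated, outcome, propensity):
    """Normalized inverse-probability-weighted difference in mean outcomes."""
    # Treated units are weighted by 1/p, control units by 1/(1 - p).
    w1 = [t / p for t, p in zip(treated, propensity)]
    w0 = [(1 - t) / (1 - p) for t, p in zip(treated, propensity)]
    mean1 = sum(w * y for w, y in zip(w1, outcome)) / sum(w1)
    mean0 = sum(w * y for w, y in zip(w0, outcome)) / sum(w0)
    return mean1 - mean0


# Made-up data: two covariate groups with propensity 0.25 and 0.75.
# The outcome is built so the treatment effect is 2 in both groups.
treated = [1, 0, 0, 0, 1, 1, 1, 0]
propensity = [0.25, 0.25, 0.25, 0.25, 0.75, 0.75, 0.75, 0.75]
outcome = [3, 1, 1, 1, 6, 6, 6, 4]

n_treated = sum(treated)
naive = (
    sum(y for t, y in zip(treated, outcome) if t == 1) / n_treated
    - sum(y for t, y in zip(treated, outcome) if t == 0) / (len(treated) - n_treated)
)
ate = ipw_ate(treated, outcome, propensity)
print(naive, ate)
```

In this constructed example the unweighted comparison overstates the effect because treated units are concentrated in the group with higher outcomes, while the weighted estimate recovers the effect built into the data.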
|
||
|
||
T3
A Workshop on Validation of Discrete Response Statistical Models
|
Sat, Feb 17, 2:00 PM - 4:00 PM
Portland |
|
Instructor(s): Raul Eduardo Avelar Moran, Texas A&M Transportation Institute | ||
Count models are widely used to analyze discrete data in various fields. When the intent of the analysis is prediction, model validation is an important step before the model can be offered with confidence to final users. This tutorial will discuss when and why to validate, and will demonstrate model validation techniques specific to discrete response models, such as Poisson and Negative Binomial Generalized Linear Regression Models.
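One common way to check a count model on held-out data is to compare observed and predicted counts with the Poisson deviance. The Python sketch below shows that calculation only; the abstract does not say which validation techniques the tutorial covers, so this is an assumed example, and the function name `poisson_deviance` and the numbers are hypothetical.

```python
from math import log


def poisson_deviance(observed, predicted):
    """Poisson deviance between observed counts and predicted means."""
    total = 0.0
    for y, mu in zip(observed, predicted):
        # The y * log(y / mu) term is defined as 0 when y == 0.
        term = y * log(y / mu) if y > 0 else 0.0
        total += term - (y - mu)
    return 2.0 * total


# Made-up holdout counts and model predictions for illustration.
observed = [0, 1, 2]
predicted = [0.5, 1.0, 2.0]

deviance = poisson_deviance(observed, predicted)
print(deviance)
```

A lower deviance on the holdout sample indicates predictions closer to the observed counts, so the measure can be used to compare candidate models fitted to the same training data.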
|
||
|
||
T4
Tools for Connecting R, SAS, and Stata to Word: A Practical Approach to Reproducibility
|
Sat, Feb 17, 2:00 PM - 4:00 PM
Salon D |
|
Instructor(s): Abigail S. Baldridge, Northwestern University; Leah J. Welty, Northwestern University
Download Handouts |
||
Reproducibility, wherein data analysis and documentation are sufficient so that results can be recomputed or verified, is an increasingly important component of statistical practice. “Weaving” tools such as R Markdown facilitate reproducibility by combining narrative text and analysis code in one plain-text document, but are of limited use when manuscripts or reports must be generated in MS Word (e.g. due to journal requirements or client preference). This course will: (1) summarize how weaving tools create Word documents, and the ensuing limitations; and (2) introduce an alternate approach using recently released StatTag software. StatTag is a free, open-source program that embeds results (values, tables, figures, or verbatim output) from R, SAS, or Stata directly in Word such that they can be automatically updated if code or data change. This course is intended for a broad audience; prerequisites are experience preparing documents in Word and conducting analysis in any one of R, SAS, or Stata. The workshop will provide practical, hands-on examples drawn from R, SAS, and Stata, and will include an overview of weaving approaches as well as an introduction to StatTag.
|
||
|
||
GS2
Closing General Session
|
Sat, Feb 17, 4:15 PM - 5:30 PM
Salon E |
|
Chair(s): Eric Vance, LISA--University of Colorado Boulder | ||
The Closing Session is an opportunity for you to interact with the CSP Steering Committee in an open discussion about how the conference went and how it could be improved in future years. CSPSC vice chair Eric Vance will lead a panel of committee members as they summarize their conference experience. The audience will then be invited to ask questions and provide feedback. The committee highly values suggestions for improvements gathered during this time. The award for best student poster will also be presented during the Closing Session, and each attendee will have an opportunity to win a door prize.
|
||