Online Program
Key:
Computational Statistics
Data Science Technologies
Data Visualization
Education
Machine Learning
Practice and Applications
Software
Friday, May 31
Exhibits Open
SDSS Hours
Fri, May 31, 7:30 AM - 3:45 PM
Grand Ballroom Foyer
Registration
SDSS Hours
Fri, May 31, 7:30 AM - 5:30 PM
Grand Ballroom Foyer
GS03 -
Friday Keynote Address
General Session
Fri, May 31, 8:30 AM - 9:45 AM
Grand Ballroom E
Organizer(s): Kelly McConville, Reed College
Chair(s): Jo Hardin, Pomona College
8:35 AM
Data Science: How the Union of Inferential Thinking and Computation Are Transforming Research and Education at Berkeley
Presentation
Fernando Perez, UC Berkeley
9:35 AM
Sponsor Spotlight - SAS
9:40 AM
Floor Discussion
PS04 -
Machine Learning E-Posters, I
E-Poster
Fri, May 31, 9:45 AM - 10:45 AM
Grand Ballroom Foyer
2
Artificial Intelligence Mammography Model and Healthcare Savings Opportunity
Olajide Israel Ajayi, Blue Cross NC
3
The Geometry of feature embeddings in kernel discriminant analysis-deterministic or randomized
Jiae Kim, The Ohio State University
4
Harnessing the Power of Machine Learning Methods in HIV Virologic Failure Risk Prediction
Presentation
Allan Kimaina, Brown University
5
Practical Considerations of Deep Learning in Digital Pathology
Shubing Wang, Merck
6
Identifying Shifts in Forest Communities Using Machine Learning Techniques
Trenton W Ford, University of Notre Dame
7
Rapid deployment of a Machine Learning-based derived biomarker using publicly available data sources for covariate adjusted descriptive modeling.
Presentation
Albert Taylor, Origent Data Sciences
8
Adaptively Stacked Ensembles for Influenza Forecasting with Incomplete Data
Presentation
Thomas Charles McAndrew, University of Massachusetts Amherst
9
Overcoming Big Data: Linking the 2014 National Hospital Care Survey to the 2014/2015 Medicare CMS Master Beneficiary Summary File
Scott Robert Campbell, National Opinion Research Center at University of Chicago
10
Comparing Performance of Lasso, Group Lasso, and Linear Regression with Categorical Predictors
Presentation
Yihuan Huang, UCLA
12
ML-assisted ongoing monitoring for fighting fraud and abuse
Jose Ferreira, Google
13
Time-aggregated forecasting for ultra high dimensional regression and time-series error
Sayar Karmakar, University of Florida
14
Empirical priors for prediction in sparse high-dimensional linear regression
Yiqi Tang, NC State University
CS20 -
Data Science Platforms: Spark
Invited
Fri, May 31, 10:30 AM - 12:05 PM
Grand Ballroom E
Organizer(s): Kevin Kuo, RStudio
Chair(s): Kevin Kuo, RStudio
10:35 AM
An R Interface to Hail
Presentation
Michael Lawrence, Genentech Research
11:05 AM
Scaling Sparklyr with Streams and Arrow
Javier Luraschi, RStudio
11:35 AM
Interpretable Machine Learning Using rsparkling
Navdeep Gill, H2O.ai
CS21 -
A Field Guide to Education Tools in Data Science
Invited
Fri, May 31, 10:30 AM - 12:05 PM
Grand Ballroom I
Organizer(s): Alison Hill, RStudio
Chair(s): Alison Hill, RStudio
10:35 AM
Necessity Is the Mother of Invention: Evolution of a Data Science Team
Adrienne Zell, Oregon Health and Science University
11:05 AM
Using Unit Testing to Teach Data Science
Presentation
Kyle Gorman, CUNY
11:35 AM
Data Presentation For Everyone: Simple Ways to Educate without Teaching
Presentation
Allison Sliter, Digimarc Inc
CS22 -
Building and Growing Data Science Teams
Invited
Fri, May 31, 10:30 AM - 12:05 PM
Grand Ballroom J
Organizer(s): Jacqueline Nolis, Nolis, LLC
Chair(s): Jacqueline Nolis, Nolis, LLC
10:35 AM
From Zero to A^X: Scaling Data Science Teams
Amanda Casari, Google Cloud
11:05 AM
Together at Last: Heterogeneous Teams and the Key to Success
Heather Nolis, T-Mobile
11:35 AM
Creating Effective Data Science Teams
Presentation
Mehar Singh, ProCogia
CS23 -
Advances in Analysis and Computing in Complex Data
Invited
Fri, May 31, 10:30 AM - 12:05 PM
Grand Ballroom K
Organizer(s): George Michailidis, University of Florida
Chair(s): Regina Liu, Rutgers University
10:35 AM
Graph-Based Change-Point Detection
Lynna Chu, UC Davis
11:05 AM
A Double Core Tensor Factorization and Its Applications to Heterogeneous Data
George Michailidis, University of Florida
11:35 AM
Individualized Fusion Learning (IFusion) with Applications to Personalized Inference
Minge Xie, Rutgers University
CS24 -
Recent Developments on Machine Learning
Invited
Fri, May 31, 10:30 AM - 12:05 PM
Regency Ballroom AB
Organizer(s): Xiaotong Shen, University of Minnesota
Chair(s): Xiaotong Shen, University of Minnesota
10:35 AM
Shrinking Characteristics of Precision Matrix Estimators
Adam J. Rothman, University of Minnesota
11:05 AM
P-Splines with an L1 Penalty for Repeated Measures
Hui Jiang, University of Michigan
11:35 AM
Community Detection with Dependent Connectivity
Annie Qu, University of Illinois at Urbana-Champaign
CS25 -
Software Packages for Data Science
Contributed
Fri, May 31, 10:30 AM - 12:05 PM
Regency Ballroom C
Chair(s): Amrina Ferdous, Boise State University
10:35 AM
An R Package for Linear Mediation Analysis with Complex Survey Data
Presentation
Yujiao Mai, St. Jude Children's Research Hospital
10:50 AM
GREIN: An Interactive Web Platform for Re-Analyzing GEO RNA-Seq Data
Presentation
Naim Al Mahi, University of Cincinnati
11:05 AM
Bioc2mlr: R Package to Bridge Between Bioconductor’s S4 Complex Genomic Data Container, to Mlr, a Meta Machine Learning Aggregator Package.
Dror Berel, Fred Hutch
CS26 -
Data Visualization in Applications
Contributed
Fri, May 31, 10:30 AM - 12:05 PM
Regency Ballroom EF
Chair(s): Oyeleke Olaoye
10:35 AM
Topological Data Analysis for Understanding Phenotypic Presentation in Aortic Stenosis
Sirish Shrestha, West Virginia University
10:50 AM
Assessing and Visualizing the Impact of Medical Coding Systems for Predicting Inpatient Mortality
Brian Hochrein, IBM Watson Health
11:05 AM
Methods for Visualizing Dimension Reduction in R
Tiffany Jiang, UC Davis
11:20 AM
Floor Discussion
CS27 -
Data Science Platforms: Deep Learning
Invited
Fri, May 31, 1:30 PM - 3:05 PM
Grand Ballroom E
Organizer(s): Javier Luraschi, RStudio
Chair(s): Javier Luraschi, RStudio
1:35 PM
Deep Learning and Probabilistic Programming with Applications to Intelligent Reality
Soren Harner, Permaling
2:05 PM
R Interfaces to TensorFlow and Keras
Kevin Kuo, RStudio
2:35 PM
Deep Learning Models at Scale with Apache Spark
Presentation
Joseph Kurata Bradley, Databricks, Inc.
CS28 -
Data Science Ethics Meet Reality
Invited
Fri, May 31, 1:30 PM - 3:05 PM
Grand Ballroom J
Organizer(s): Os Keyes, University of Washington
Chair(s): Brandeis Hill Marshall, Spelman College
1:35 PM
The Politics of Data
Presentation
Meg Drouhard, University of Washington
2:05 PM
The Political Consequences of Repurposing Data
Meg Young, University of Washington
2:35 PM
Beyond Methodological Rigor: Widening the Scope of Ethics in Data Science
Anissa Tanweer, University of Washington
CS29 -
The Cutting Edge in Statistical Machine Learning
Invited
Fri, May 31, 1:30 PM - 3:05 PM
Regency Ballroom AB
Organizer(s): Daniela Witten, University of Washington
Chair(s): Boxiang Wang, University of Iowa
1:35 PM
A Continuous-Time View of Early Stopping in Least Squares Regression
Ryan Tibshirani, Carnegie Mellon University
2:05 PM
Fused Lasso on Graphs: Applications to Nonparametric Statistical Problems
Oscar Hernan Madrid Padilla, UC Berkeley
2:35 PM
Two-Stage Computational Framework for Sparse Generalized Eigenvalue Problem
Kean Ming Tan, University of Minnesota
CS30 -
Data Visualization Education
Invited
Fri, May 31, 1:30 PM - 3:05 PM
Regency Ballroom EF
Organizer(s): Silas Bergen, Winona State University; Amelia McNamara, University of St. Thomas
Chair(s): Silas Bergen, Winona State University
1:35 PM
Teaching Data Visualization: Integrating Theory and Practice
Presentation
Michael Freeman, University of Washington
2:05 PM
A Three-Part Data Visualization Curriculum
Presentation
Jerzy Wieczorek, Colby College
2:35 PM
Help Me Understand: Guiding Visualization Users with Annotations
Robert Kosara, Tableau Software
CS31 -
Instructional Applications & Insights
Contributed
Fri, May 31, 1:30 PM - 3:05 PM
Grand Ballroom I
Chair(s): Emily Rose Flanagan, University of Washington
1:35 PM
Apply “STEAMS” Methodology on Managing Europe Travel
Charles Chen, Applied Materials
1:50 PM
A Robust and Dynamic Formulation for Predicting Student Offer Acceptance
Michael Liut, McMaster University
2:05 PM
P-Values: A Closer Look
Jeanne Li, Santa Barbara Cottage Hospital
2:20 PM
Floor Discussion
CS32 -
Statistical Methods for Analyzing Large Scale or Massive Data
Contributed
Fri, May 31, 1:30 PM - 3:05 PM
Grand Ballroom K
Chair(s): Alona Kryshchenko, California State University Channel Islands
1:35 PM
High-Dimensional Association Detection in Large Scale Genomic Studies
Hillary Koch, Pennsylvania State University
1:50 PM
Threshold Knot Selection for Large-Scale Spatial Models with Applications to the Deepwater Horizon Disaster
Casey Jelsema, West Virginia University
2:05 PM
Goodness-of-Fit Tests for Large Data Sets
Taras Lazariv, TU Dresden
2:20 PM
Big Data and Portfolio Optimization
Qiyu Wang, Zhejiang University of Finance and Economics
2:35 PM
An Application of Linear Programming to Computational Statistics
Presentation
John M. Ennis, Aigora
2:50 PM
Accelerate Pseudo-Proximal Map Algorithm and Its Application to Network Analysis
Dao Nguyen, University of Mississippi
Hackathon Update
Special Session
Fri, May 31, 1:30 PM - 3:05 PM
Regency Ballroom C
Join the Hackathon participants as they present their findings.
PS05 -
Machine Learning E-Posters, II
E-Poster
Fri, May 31, 3:00 PM - 4:00 PM
Grand Ballroom Foyer
1
Clustering Chocolate Types: Dark, White, Milk and Fruit
Kaitlyn Zhang, Stanford OHS
2
Statistical Approaches for Identifying Untargeted Metabolites Prognostic for Kidney Disease Progression in Type 2 Diabetic Patients: Application to the Chronic Renal Insufficiency Cohort Study
Jing Zhang, UCSD Moores Cancer Center
3
Genomic Determination Index
Cheng Cheng, St. Jude Children's Research Hospital
4
On Combining Data from Distinct Nonlinear Predictive Models
Presentation
Amrina Ferdous, Boise State University
5
Predicting Unknown Links for Interconnected Networks
Yubai Yuan, UIUC
6
A Bayesian Structural Time Series-Based Approach for Understanding and Predicting Temperatures in the Red Sea
Nabila Bounceur, King Abdullah University of Science and Technology
7
Is robustness trade-off really inevitable?
Jungeum Kim, Purdue Department of Statistics
8
Harnessing the Power of Machine Learning Methods in Prospective HIV Care and Treatment
Presentation
Allan Kimaina, Brown University
9
Machine Learning meets Survival Analysis for the personalized medicine
Jongyun Jung, University of Nevada, Las Vegas
10
Predicting Claims Litigation using Text Mining
Xiyue Liao, University of California, Santa Barbara
11
A Multicategory Kernel Distance Weighted Discrimination Method for Multiclass Classification
Boxiang Wang, University of Iowa
13
Comparison of Automated Liver Image Quality Evaluation Using Handcrafted Features and Convolutional Neural Networks
Wenyi Lin, University of California, San Diego
14
Statistical Learning on Next-Generation Sequencing of T cell Repertoire Data
Li Zhang, UCSF
CS33 -
Backend Data Science
Invited
Fri, May 31, 3:40 PM - 5:15 PM
Grand Ballroom E
Organizer(s): Edgar Ruiz, RStudio
Chair(s): Soren Harner, Permaling
3:45 PM
Data Science with Databases and R
James Blair, RStudio
4:15 PM
STOIC Next-Generation Spreadsheet: Bringing Data Science to the Masses
Ismael Ghalimi, STOIC
4:45 PM
Working with Images and Text in R Through Embeddings
Michael Lucy, Basilica
CS34 -
Computational Statistics for Large-Scale Biological Data
Invited
Fri, May 31, 3:40 PM - 5:15 PM
Grand Ballroom K
Organizer(s): Jacob Bien, University of Southern California
Chair(s): Kean Ming Tan, University of Minnesota
3:45 PM
Computationally Efficient High-Dimensional Interaction Modeling
Guo Yu, University of Washington
4:15 PM
Inference for Diversity Under Networked Models
Bryan Martin, University of Washington
4:45 PM
Variance Component Testing and Selection for a Longitudinal Microbiome Study
Jin Zhou, University of Arizona
CS35 -
Modern Multivariate Analysis
Invited
Fri, May 31, 3:40 PM - 5:15 PM
Regency Ballroom AB
Organizer(s): Adam J. Rothman, University of Minnesota
Chair(s): Adam J. Rothman, University of Minnesota
3:45 PM
The Multivariate Square Root Lasso: Computational and Theoretical Insights
Aaron Molstad, Fred Hutchinson Cancer Research Center
4:15 PM
Estimating Multiple Precision Matrices Using Cluster Fusion Regularization
Brad Price, West Virginia University
4:45 PM
$L_2$-Regularization and Some Path-Following Algorithms
Yunzhang Zhu, The Ohio State University
CS36 -
Democratizing Data Science with Workflows
Invited
Fri, May 31, 3:40 PM - 5:15 PM
Regency Ballroom C
Organizer(s): Michael I. Love, UNC-Chapel Hill
Chair(s): Stas Kolenikov, Abt Associates
3:45 PM
Publishing Literate Programming Workflows in Scientific Journals
Michael I. Love, UNC-Chapel Hill
4:15 PM
When Should You Add Github, Make and Docker to Your Data Science Workflow?
Tiffany Timbers, University of British Columbia
4:45 PM
Useful Tools for Teaching and Outreach in Data Science: Workflows, Case Studies, Github Classroom, and Slack
Stephanie Hicks, Johns Hopkins Bloomberg School of Public Health
CS37 -
Data Visualizations at the Institute for Health Metrics and Evaluation
Invited
Fri, May 31, 3:40 PM - 5:15 PM
Regency Ballroom EF
Organizer(s): Brian Dart, IHME
Chair(s): Disha Patel, University of Washington
3:45 PM
Building Interactive Data Visualization for a Global (Health) Audience
Ryan Shackleton, University of Washington
4:15 PM
The Story of a Chart: Data Visualization Principles to Simplify Complexity
Evan Laurie, University of Washington
4:45 PM
Behind the Scenes: Building Tools to Visualize Intermediate Results in Complex Data Science Pipelines
Marlena Bannick, University of Washington
CS38 -
Engaging Students in Statistics & Data Science
Contributed
Fri, May 31, 3:40 PM - 5:15 PM
Grand Ballroom I
Chair(s): Ted Laderas, Oregon Health & Science University
3:45 PM
STEAMS Approach on Playing Video Games
Mason Chen, Stanford OHS
4:00 PM
Competition Based Teaching of Machine Learning
Presentation
Mikael Vejdemo-Johansson, CUNY College of Staten Island
4:15 PM
Using R and SPSS for Teaching Statistics
Lucy Xiaojing Kerns, Youngstown State University
4:30 PM
Tools for R in Introductory Statistics Courses
Kelly Nicole Bodwin, Cal Poly - San Luis Obispo
4:45 PM
Teaching Data Science Students to Write Clean Code
Presentation
Todd Iverson, Winona State University
5:00 PM
Hack Weeks as a Model for Data Science Education and Collaboration
Daniela Huppenkothen, University of Washington
CS39 -
Data and Society
Contributed
Fri, May 31, 3:40 PM - 5:15 PM
Grand Ballroom J
Chair(s): Heather Nolis, T-Mobile
3:45 PM
Using Convolutional Neural Networks to Automatically Classify Logos on Shopping Receipts
Presentation
Émilie Mayer, Statistics Canada
4:00 PM
Using Topological Data Analysis to Assess Gerrymandering in Voting Districts
Courtney Thatcher, University of Puget Sound
4:15 PM
Predicting the Success of a Crowdfunding Campaign: Spatial Location-Based Trajectory Modeling
Han Yu, University of Northern Colorado
4:30 PM
Nurturing select customers using a state-space model (Investment Recommender / Resource allocation)
Eunice Kim, Microsoft
4:45 PM
Floor Discussion
CS40 -
SAS Open-Source Platforms for Analytics
Invited
Fri, May 31, 5:20 PM - 6:25 PM
Grand Ballroom E
Organizer(s): Jim Harner, West Virginia University
Chair(s): Wendy Martinez, Bureau of Labor Statistics
5:25 PM
SAS Viya: A Modern Scalable and Open Platform for Artificial Intelligence
Presentation
Wayne Thompson, SAS
5:55 PM
Making Predictive Modeling Approachable with JMP Pro
Jordan Hiller, JMP
CS41 -
Incorporating Ethics and Inclusion in Undergraduate Statistics Curriculum
Invited
Fri, May 31, 5:20 PM - 6:25 PM
Grand Ballroom I
Organizer(s): Brianna Heggeseth, Macalester College
Chair(s): Jingchen Hu, Vassar College
5:25 PM
Ethics in an Advanced Undergraduate Seminar: Statistical Analysis of Social Network Data
Miles Q. Ott, Smith College
5:55 PM
Intertwining Data Ethics into Intro Stats
Presentation
Brianna Heggeseth, Macalester College
CS42 -
Interoperability: Your R Package Can Depend on Its Friends
Invited
Fri, May 31, 5:20 PM - 6:25 PM
Regency Ballroom C
Organizer(s): Matthew N. McCall, University of Rochester
Chair(s): Xiaowei Yue, Virginia Polytechnic Institute and State University
5:25 PM
Case Studies in Interoperability: From Generic Classes to Specific Functions
Presentation
Matthew N. McCall, University of Rochester
5:55 PM
How Core Data Structures Drive Interoperability in the Bioconductor Project
Marcel Ramos, CUNY SPH
CS43 -
Grammar of Graphics: The Twentieth Anniversary
Invited
Fri, May 31, 5:20 PM - 6:25 PM
Regency Ballroom EF
Organizer(s): Jim Harner, West Virginia University
Chair(s): Claus Wilke, University of Texas at Austin
5:25 PM
Past, Present, and Future of Grammar of Graphics Systems
Lee Wilkinson, H2O.ai
5:55 PM
Discussant
Anushka Anand, Tableau
6:05 PM
Discussant
Jeffrey Heer, University of Washington
6:15 PM
Discussant
Bryan Van de Ven, Microsoft
CS44 -
Science and the Environment
Contributed
Fri, May 31, 5:20 PM - 6:25 PM
Grand Ballroom J
Chair(s): Melanie Edwards, Exponent, Inc.
5:25 PM
Trend Assessment for Daily Snow Depths with Changepoints Considerations
Jaechoul Lee, Boise State University
5:40 PM
Yield Forecasting Based on Short Time Series with High Spatial Resolution Data
Yuzhen Zhou, University of Nebraska Lincoln
5:55 PM
Are Forest Communities Impacted by Climate Change?
Jonathan Andrew Knott, Purdue University
6:10 PM
Extracting Signal from the Noisy Environment of an Ecosystem
Presentation
Pranita Pramod Patil, Harrisburg University of Science & Technology
CS45 -
Change Point Detection
Contributed
Fri, May 31, 5:20 PM - 6:25 PM
Grand Ballroom K
Chair(s): Dao Nguyen, University of Mississippi
5:25 PM
Detection of Structural Changes in Correctly Specified and Misspecified Conditional Quantile Polynomial Distributed Lag (QPDL) Model Using Change-Point Analysis
Presentation
Kwadwo Agyei Nyantakyi, Ghana Institute of Management and Public Administration
5:40 PM
Robust Graph Change-Point Detection for Brain Evolvement Study
Honglang Wang, Indiana University-Purdue University Indianapolis
5:55 PM
Graph Theoretic Statistics for Change Detection and Localization in Multivariate Data
Presentation
Matthew A. Hawks, US Naval Academy
6:10 PM
Floor Discussion
CS46 -
Recent Advancements in Deep Learning
Contributed
Fri, May 31, 5:20 PM - 6:25 PM
Regency Ballroom AB
Chair(s): Yunzhang Zhu, The Ohio State University
5:25 PM
Statistical Evaluation of Long Memory in Recurrent Neural Networks
Presentation
Alexander Greaves-Tunnell, University of Washington
5:40 PM
On Interpretable Machine Learning
Serge Berger, Microsoft
5:55 PM
Machine Learning Methods for Modeling Animal Movement
Dhanushi Wijeyakulasuriya, Pennsylvania State University
6:10 PM
Optimal Transport Classifier: Defending Against Adversarial Attacks by Regularized Deep Embedding
Yao Li, University of California, Davis