Online Program
Viewing Track 'Practice and Applications' only
Thursday, May 30
CS03 - Open Source and Community
Invited
Thu, May 30, 10:30 AM - 12:05 PM
Grand Ballroom J
Organizer(s): Gabriela de Queiroz, IBM
Chair(s): David Smith, Microsoft
10:35 AM
Getting Involved in Scientific Open Source: Lessons from 7 Years of Growing the ROpenSci Community
Karthik Ram, UC Berkeley
11:05 AM
Sustainers of the Tidyverse
Mara Averick, RStudio
11:35 AM
Building a Community: The R-Ladies Story
Gabriela de Queiroz, IBM
CS12 - Enterprise Applications of Data Science
Contributed
Thu, May 30, 1:30 PM - 3:05 PM
Grand Ballroom J
Chair(s): Gabriela de Queiroz, IBM
1:35 PM
Estimating Causal Effects in Large Scale Online Experiments and Designing Automated A/B Testing Platforms for Machine Learning
Zuzanna Klyszejko, MongoDB
1:50 PM
Data Storytelling: Improve Insight-To-Action Conversion for a Greater Real World Impact
Yu Zhou, Mastercard
2:05 PM
Detecting Innovative Companies via Their Website
Piet Daas, Statistics Netherlands
2:20 PM
Metrics and Modeling in Large-Scale Digital Experimentation
W. Duncan Wadsworth, Microsoft
2:35 PM
Forecasting at Scale to Champion Customer Trust
Ana Bertran, Salesforce
2:50 PM
Floor Discussion
PS02 - Data Science Applications E-Posters, I
E-Poster
Thu, May 30, 3:00 PM - 4:00 PM
Grand Ballroom Foyer
1
Automated Survey Text Analysis -- Supervised Latent Dirichlet Allocation (SLDA)
Christine P. Chai, Microsoft
2
Comparing various string similarity algorithms in the task of name-matching
Aleksandra Zaba, University of Utah
3
Hypothesis Testing in Nonlinear Function on Scalar Regression with Application to Child Growth Study
Mityl Biswas, NC State University
4
Comparing Object Correlation Metrics for Effective Space Traffic Management
Julie Zhang, University of Washington
5
Batch effect adjustment via ensemble learning in the validation of genomic classifiers
Yuqing Zhang, Boston University
6
Tensor Mixed Effects Model with Application to Nanomanufacturing Inspection
Xiaowei Yue, Virginia Polytechnic Institute and State University
7
Burst Detection in Call Trains for Identifying Fraud in Telecommunications
Miguel Raul Pebes Trujillo, Indiana University Bloomington, Department of Statistics
8
Active Labeling using Model-based Classification
Min Fang, San Jose State University
9
Analyzing Influence of Social Media Through Twitter
Dhrubajyoti Ghosh, North Carolina State University
10
Diversity of forest structure across the United States
Jessica Lynn Gilbert, Purdue University
11
ClusterJob, an Experiment Management System For Ambitious Data Science
Bekk Blando, Clemson University
12
A Maximum Likelihood Method for Correlated Discrete and Continuous Outcomes with Selection, Lagged Effects and Variance
Rhoda Nandai Muse, University of Arizona, Mathematics Department
13
Gender Distribution in Movie Roles
Vijay Ravuri, CalPoly SLO
14
Evaluating and forecasting the CD4 cell count evolution in HIV+ patients from a Bayesian stochastic model related to the logistic curve with multiple inflection points.
Victor Cruz-Torres, University of Puerto Rico
CS15 - Linguistic Diversity in NLP
Invited
Thu, May 30, 4:00 PM - 5:35 PM
Grand Ballroom J
Organizer(s): Rachael Tatman, Kaggle
Chair(s): Julia Silge, Stack Overflow
4:05 PM
An Introduction to Computational Sociolinguistics
Rachael Tatman, Kaggle
4:35 PM
English Isn't Generic for Language, Despite What NLP Papers Might Lead You to Believe
Emily M. Bender, University of Washington
5:05 PM
Learning the Language of BlackTwitter
Brandeis Hill Marshall, Spelman College
PS03 - Data Science Applications E-Posters, II
E-Poster
Thu, May 30, 5:30 PM - 6:30 PM
Grand Ballroom Foyer
1
Automated Analytics of the Solar Corona with Scalable Cloud Based Platforms
Lars K. S. Daldorff, JHU/APL
2
Modeling and Forecasting the Percent Changes in the National Park Visitation Counts Using Social Media Data
Russell Goebel, Western Washington University
3
Estimating Plant Growth Curves and Derivatives by Modeling Crowdsourced Image-Based Data
Haozhe Zhang, Iowa State University
4
Using Bayesian Networks to Perform Reject Inference
Billie Anderson, Harrisburg University
5
Usability evaluation of data presentation for official statistics
Lin Wang, U.S. Census Bureau
6
Do Unregistered Voters Want to Vote? Automatic Registration and Oregon Elections Turnout.
Matthew Stephan Yancheff, Reed College
7
Relationship between physical activity and depression in elderly Costa Ricans
Shu Li, Kent State University
8
Building an Interpretable Incident Prediction model for Site Reliability
Jiaping Zhang, Salesforce
9
For-estimation: Post-stratification to increase efficiency of forest attribute estimates
Miranda Rintoul, Reed College
10
Forecasting NBA Fan Support using Time Series Analysis
Victor Wilson, Cal Poly San Luis Obispo
11
Handling Missing Data in Cardiovascular Disease Prediction Using Neural Networks
Megan Shand, Broad Institute
12
Leverage Machine Learning to Advance Risk Prediction with Electronic Health Record
Yirui Hu, Geisinger
13
Multiple uses for chronic condition data mart
John Massman, Virginia Mason
14
Team Item Response Models
Deborshee Sen, Duke University
Friday, May 31
CS22 - Building and Growing Data Science Teams
Invited
Fri, May 31, 10:30 AM - 12:05 PM
Grand Ballroom J
Organizer(s): Jacqueline Nolis, Nolis, LLC
Chair(s): Jacqueline Nolis, Nolis, LLC
10:35 AM
From Zero to A^X: Scaling Data Science Teams
Amanda Casari, Google Cloud
11:05 AM
Together at Last: Heterogeneous Teams and the Key to Success
Heather Nolis, T-Mobile
11:35 AM
Creating Effective Data Science Teams
Mehar Singh, ProCogia
CS28 - Data Science Ethics Meet Reality
Invited
Fri, May 31, 1:30 PM - 3:05 PM
Grand Ballroom J
Organizer(s): Os Keyes, University of Washington
Chair(s): Brandeis Hill Marshall, Spelman College
1:35 PM
The Politics of Data
Meg Drouhard, University of Washington
2:05 PM
The Political Consequences of Repurposing Data
Meg Young, University of Washington
2:35 PM
Beyond Methodological Rigor: Widening the Scope of Ethics in Data Science
Anissa Tanweer, University of Washington
CS39 - Data and Society
Contributed
Fri, May 31, 3:40 PM - 5:15 PM
Grand Ballroom J
Chair(s): Heather Nolis, T-Mobile
3:45 PM
Using Convolutional Neural Networks to Automatically Classify Logos on Shopping Receipts
Émilie Mayer, Statistics Canada
4:00 PM
Using Topological Data Analysis to Assess Gerrymandering in Voting Districts
Courtney Thatcher, University of Puget Sound
4:15 PM
Predicting the Success of a Crowdfunding Campaign: Spatial Location-Based Trajectory Modeling
Han Yu, University of Northern Colorado
4:30 PM
Nurturing select customers using a state-space model (Investment Recommender / Resource allocation)
Eunice Kim, Microsoft
4:45 PM
Floor Discussion
CS44 - Science and the Environment
Contributed
Fri, May 31, 5:20 PM - 6:25 PM
Grand Ballroom J
Chair(s): Melanie Edwards, Exponent, Inc.
5:25 PM
Trend Assessment for Daily Snow Depths with Changepoints Considerations
Jaechoul Lee, Boise State University
5:40 PM
Yield Forecasting Based on Short Time Series with High Spatial Resolution Data
Yuzhen Zhou, University of Nebraska Lincoln
5:55 PM
Are Forest Communities Impacted by Climate Change?
Jonathan Andrew Knott, Purdue University
6:10 PM
Extracting Signal from the Noisy Environment of an Ecosystem
Pranita Pramod Patil, Harrisburg University of Science & Technology
Saturday, June 1
CS47 - Data Science for Fun
Invited
Sat, Jun 1, 10:00 AM - 11:35 AM
Grand Ballroom E
Organizer(s): David Smith, Microsoft
Chair(s): Ana Bertran, Salesforce
10:05 AM
Minecraft, R, and Containers
David Smith, Microsoft
10:35 AM
Using Deep Learning in R to Generate Offensive License Plates
Jacqueline Nolis, Nolis, LLC
CS56 - Data for Human Health
Contributed
Sat, Jun 1, 1:00 PM - 2:35 PM
Grand Ballroom E
Chair(s): Xiyue Liao, Department of Statistics and Applied Probability, University of California, Santa Barbara
1:05 PM
Multiple-target Robust Design of a Coronary Stent with Multiple Functional Outputs
Fan JIANG, City University of Hong Kong
1:20 PM
Multiple Hypotheses Testing for Discrete Data - "MHTdiscrete" R package
Yalin Zhu, Merck & Co., Inc.
1:35 PM
What Are the Comorbidities That Go with Asthma? Basket Analysis Approach
Tianyuan Guan, University of Cincinnati
1:50 PM
An Optimal Kernel-Based U-Statistic Method for Quantitative Gene-Set Association Analysis
Tao He, San Francisco State University
2:05 PM
A Nonlinear Hierarchical Modeling Approach to Estimating the BAT Curve Using Markov Chain Monte Carlo
Colin O'Rourke, Benaroya Research Institute
2:20 PM
Floor Discussion
CS58 - When Biomedical Data Gets Big: Challenges and Solutions in Biomedical Data Science
Invited
Sat, Jun 1, 2:45 PM - 3:50 PM
Grand Ballroom E
Organizer(s): James Eddy, Sage Bionetworks
Chair(s): Yalin Zhu, Merck & Co., Inc.
2:50 PM
Analysis of Whole Genome Sequence Analysis in >100k Individuals: Experience in the TOPMed Program
Ken Rice, University of Washington
3:20 PM
Biomedical Informatics and Precision Medicine Are Laying the Framework for the Next Generation of Data-Driven Clinical Research
Sean Mooney, University of Washington