All Times ET

Key:

Computational Statistics

Data Visualization

Education

Machine Learning

Practice and Applications

Software & Data Science Technologies

Thursday, June 3

SDSS Virtual Expo
SDSS Hours

Thu, Jun 3, 9:30 AM - 6:30 PM

CS06 - Shaping Human Health with Data
Refereed

Thu, Jun 3, 10:00 AM - 11:35 AM

Chair(s): Julia Buffinton, LMI

10:05 AM

Prescription Opioid Epidemic: Current Trends, Analysis, and Interpretation
Thomas Bryan, BGD

10:35 AM

Predicting Adverse Drug Reactions (ADR) Using Physiological Time Series Data Obtained in the Intensive Care Unit (ICU): A Case Study
Jason Lee, Johns Hopkins University Applied Physics Laboratory

11:05 AM

SVM-Based Models for Pill Shape Classification
William Franz Lamberti, George Mason University

CS07 - Estimation Techniques
Refereed

Thu, Jun 3, 10:00 AM - 11:35 AM

Chair(s): Jean Opsomer, Westat

10:05 AM

A New Sufficient Dimension Reduction Predictive Model Using Maximum Entropy Covariance Estimator with Information Complexity
Kabir Opeyemi Olorede, Kwara State University

10:35 AM

Quantile Shrinkage Covariance Estimation
Qiyu Wang, Hong Kong Polytechnic University

11:05 AM

Nonparametric Application of Functional Analysis of Generalized Linear Models Under Nonlinear Constraints
Kali Prasun Chowdhury, University of California, Irvine

CS08 - Network Analysis
Refereed

Thu, Jun 3, 10:00 AM - 11:35 AM

Chair(s): Katie Anne Bakewell, NLP Logix

10:05 AM

Estimation of the Mean Function of Functional Data via Deep Neural Networks
Shuoyang Wang, Auburn University

10:35 AM

Interpretation of Radiological Imaging Features Using Generative Adversarial Networks
Kyle Andrew Hasenstab, San Diego State University

11:05 AM

Multinomial Tensor Regression with Application to Whole-Brain Structural MRI Analysis
Fang Yang, University of Cincinnati

CS09 - Online/Hybrid Teaching in Statistics and Data Science
Special Session

Thu, Jun 3, 10:00 AM - 11:35 AM

Chair(s): Laura Le, University of Minnesota

10:05 AM

Lessons for the Future from a Year of COVID Teaching
Steven Foti, University of Florida; Adam Loy, Carleton; Douglas Whitaker, Mount Saint Vincent University; Laura Ziegler, Iowa State University

CS10 - Classification and Simulation: Methods, Analyses, and Applications
Lightning

Thu, Jun 3, 10:00 AM - 11:35 AM

Chair(s): Stanislav Kolenikov, Abt Associates

10:05 AM

Auto-classification of occupational data
Ning Chong, Ministry of Manpower

10:10 AM

Identifying different types of companies via their website text
Piet J. Daas, Eindhoven University of Technology

10:15 AM

Identifying the Tone of FOMC Statements Using a Natural Language Processing Tool
Taeyoung Doh, Federal Reserve Bank of Kansas City

10:20 AM

Using Data Visualization to Tell the Complex Story of Language Use Trends in United States
Heather Hisako Kitada Smalley, Willamette University

10:25 AM

How SciLine Solved its Multi-Label Classification Problem
Joshua Logan Colburn, SciLine, AAAS

10:30 AM

An Extension of DEIM for Class Identification
Emily Hendryx, University of Central Oklahoma

10:35 AM

Character-Level Representation Model for Domain-Specific Short Text
Xinming Li, Global Tech, Walmart

10:40 AM

An Algorithm for Discrete Logistic Classification for Sparse Tables
Yves Thibaudeau, U.S. Census Bureau

10:45 AM

Classification of Longitudinal Data with Irregularly Spaced Intervals: Mixture-Based Mixed Effects Models Versus Post-Hoc Mixture Models of the Best Linear Unbiased Predictors (BLUP) from Linear Mixed
Md Jobayer Hossain, Nemours Children Hospital System, A. I. Dupont Hospital for Children

10:50 AM

Robust Meta-Analysis for Large-Scale Genomic Experiments based on an Empirical Approach
Sinjini Sikdar, Old Dominion University

10:55 AM

Disease Associated Network Detection in Multi-Omic Single-Cell Experiments
Lorin Towle-Miller, University at Buffalo

11:00 AM

An empirical Bayes approach to estimating dynamic models of co-regulated gene expression
Sara Venkatraman, Cornell University

11:05 AM

Pathway and Gene Selection with Guided Regularized Random Forests
Tyler Cook, University of Central Oklahoma

11:10 AM

WITHDRAWN Prediction Interval of Air Pollutants Concentration by Nonparametric Regression Analysis

11:15 AM

Bayesian wavelet-packet historical functional linear models
Mark J Meyer, Georgetown University

11:20 AM

Using Simulation-Based Inference to mitigate instrumental biases in X-ray telescopes
Daniela Huppenkothen, SRON Netherlands Institute for Space Research

11:25 AM

Regional and Sectoral Structures and Their Dynamics of the Chinese Economy: A Network Perspective from Multi-Regional Input-Output Tables
Jun Yan, University of Connecticut

11:30 AM

Connecting the Popularity Adjusted Block Model to the Generalized Random Dot Product Graph
John Koo, Indiana University

CS11 - Administrative Data Analysis Shaping Decisions
Refereed

Thu, Jun 3, 1:10 PM - 2:45 PM

Chair(s): Mian Shams Adnan, Bowling Green State University

1:15 PM

Learning About Homelessness Using Linked Survey and Administrative Data
Angela Jean Wyse, University of Chicago

1:45 PM

Equitable Prediction of Suicide from Administrative Patient Records
Majerle Reeves, University of California, Merced

2:15 PM

Disambiguating Patent Inventors, Assignees, and Their Locations in PatentsView
Christina Jones, American Institutes for Research

CS12 - Software and Technology Shaping Data Science
Refereed

Thu, Jun 3, 1:10 PM - 2:45 PM

Chair(s): Tommy Jones, Data Community DC

1:15 PM

Synthetic Data Generation with Tidysynthesis
Aaron Robert Williams, Urban Institute

1:45 PM

Estimation and Clustering in the Sparse Popularity Adjusted Blockmodel
Ramchandra Rimal, Middle Tennessee State University

2:15 PM

The Importance of Good Coding Practices for Data Scientists
Randall Pruim, Calvin University

CS13 - Visual Analytics
Refereed

Thu, Jun 3, 1:10 PM - 2:45 PM

Chair(s): Brennan Bean, Utah State University

1:15 PM

VitalVis: Visual Analysis of Multivariate Time Series Data for Healthcare
Chad A. Steed, Oak Ridge National Laboratory

1:45 PM

Validating Visual Inference Methods by Use of Deep Learning
Anne Helby Petersen, University of Copenhagen

2:15 PM

Using Geographic Information Systems (GIS) and Spatial Statistics to Support the Education to Workforce Pipeline and Address Inequities
Caitlin Deal, American Institutes for Research

CS14 - Data-Driven Healthcare
Lightning

Thu, Jun 3, 1:10 PM - 2:45 PM

Chair(s): Wendy Martinez, Bureau of Labor Statistics

1:15 PM

Long-Term and High-Exposure Effects of NB-UVB Phototherapy in Relation to Increased Risks of Skin Cancers: A 30-Year Cancer Registry Linkage Study
Khaled Bedair, Faculty of Commerce, Tanta University

1:20 PM

CNVScope: Visually Exploring Copy Number Aberrations in Cancer Genomes
James L T Dalgleish, National Cancer Institute, Center for Cancer Research, Genetics Branch

1:25 PM

Temporal Prediction of Future Disease States using High-dimensional Covariates
Sandipan Dutta, Old Dominion University

1:30 PM

Creation of Breast Cancer Subtypes Through a Consensus-Based Network Approach
Christina Horr, University of Notre Dame

1:35 PM

Did Increasing Continuity of Care Protect Patients with Chronic Disease from Emergency and Hospitalization Readmission? A Cohort Spatial-Temporal Study in Mississippi
Phi Le, University of Mississippi Medical Center

1:40 PM

NCHS Data Linkage Program: Leveraging the Nation’s Health Data for Evidence-Based Decision-Making
Lisa B. Mirel, CDC/NCHS

1:45 PM

Real-Time Client Attrition Prediction in the Nurse Family Partnership Home Visiting Program
Kaushik Mohan, Two Sigma

1:50 PM

Empirical Calibration of a Simulation Model of Opioid Use Disorder
Anusha Madushani Rajapaksha Wasala Mudiyanselage, Boston Medical Center

1:55 PM

Random Survival Forests for Dynamic Prediction of a Time-to-event outcome using a Longitudinal Biomarker
Krithika Suresh, University of Colorado

2:00 PM

County-level Low Birth Weight Rates and Associated Contextual Factors in the United States, 2011-2016
Pallavi Dwivedi, University of Maryland College Park

2:05 PM

A Population-Based Study of Associations Between Attainment of Incentivized Primary Care Indicators and Emergency Hospital Admissions Among Those with Type 2 Diabetes in England
Laura H Gunn, University of North Carolina at Charlotte & Imperial College London

2:10 PM

Important factors to predict Anemia during the treatment of Malaria in HIV-infected population
Yein Jeon, Georgetown University

2:15 PM

Identification of latent relationships between disability rates and socio-geographic variables in veterans utilizing Machine Learning methods
Gina McKernan, University of Pittsburgh

2:20 PM

Survival Analysis Based on Statistical Modeling Versus Cox Proportional Hazard Model of Multiple Myeloma Cancer Patients
Lohuwa Mamudu, University of South Florida

2:25 PM

Sequential Pattern Mining of Electronic Health Record for Early Diagnosis of Amyotrophic Lateral Sclerosis
Lily Sun, Stanford OHS

2:30 PM

Trinary (+/0/-) Categorization for Tracing Step-Based Shifts over Time and Identifying Hot Spots in Big Data
Turkan Kumbaraci Gardenier, Teka Trends, Inc.

2:35 PM

Bayesian Estimation of Program-Specific Impacts in the HPOG Program
Stanislav Kolenikov, Abt Associates

2:40 PM

Image clustering of brain tumor patients using 3D convolutional auto-encoder
Seyed Mohammad Hadi Hosseini, St. Jude Children's Research Hospital

CS15 - Addressing Big Data Challenges: Topics in Deep Learning and Model Monitoring
Lightning

Thu, Jun 3, 1:10 PM - 2:45 PM

Chair(s): Shuoyang Wang, Auburn University

1:15 PM

WITHDRAWN Visual Similarity in Ranking E-Commerce Listings

1:20 PM

Automated Active Monitoring of Production Machine Learning Models
Katie Anne Bakewell, NLP Logix

1:25 PM

WITHDRAWN A Novel Machine Learning Approach for Humanitarian Relief Assistance After Natural Disasters

1:30 PM

A Brief Review of Quantum Computation, Quantum Algorithms, and Impact on Practice of Data Science
David Han, The University of Texas at San Antonio

1:35 PM

Score-Based Change Detection for Gradient-Based Learning Machines
Lang Liu, University of Washington

1:40 PM

Tolstoy Targets: An efficient niche graph.
Dennis Sweitzer, YPrime

1:45 PM

Monitoring of sunspot number observations based on neural networks
Sophie Mathieu, UCLouvain

1:50 PM

Utilizing stability criteria in choosing feature selection methods yields reproducible results in microbiome data
Lingjing Jiang, Johnson & Johnson

1:55 PM

Self-Supervised Learning for Robust Image Classification
Ladyna Wittscher, Friedrich-Schiller-Universität Jena

2:00 PM

Bayesian forward modeling of high-resolution radio interferometric gravitational lens observations
Devon Powell, Max Planck Institute for Astrophysics

2:05 PM

Modeling Implicit Feedback in Visual Recommendations for E-Commerce
Julia Zhou, Etsy, Inc

2:10 PM

Contextual Matching via Graph Representation Learning with Side Information
Chris Xu, Etsy

2:15 PM

Loss convergence in a causal Bayesian neural network of retail firm performance
F. Trevor Rogers, University of Hawaii, Manoa

2:20 PM

WITHDRAWN Statistically Structured Computations vs. Perceptron Computations: New Opportunities

SC1 - Data Visualization with R, Part 2
Short Course

Thu, Jun 3, 3:00 PM - 6:30 PM

Instructor(s): Aaron Robert Williams, Urban Institute

Data visualization plays a crucial role in the data science and statistics workflows. It is fundamental to everything from exploratory data analysis to communicating results. Data scientists and statisticians can better understand data and more effectively communicate their work by understanding how to better visualize their data. Too often, however, visualization is an afterthought.

In this course, attendees will learn the core principles of data visualization: how we perceive visual information, the layered grammar of graphics, and best practices for creating effective visualizations. To put these principles to work, attendees will learn practical R programming skills that improve the quality of their work and teach them to program away the mundane. The course will focus on the popular R package ggplot2 and the reproducible research framework R Markdown. All R instruction will begin with a clear motivation, followed by an explanation of the approach and code, and end with hands-on examples.

SC2 - Deep Learning in Statistics, Part 2
Short Course

Thu, Jun 3, 3:00 PM - 6:30 PM

Instructor(s): Edgar Dobriban, University of Pennsylvania; Annie Qu, University of California, Irvine; Xiao Wang, Purdue University

This short course is for those who are new to data science and interested in understanding cutting-edge machine learning and deep learning models. It is for those who want to become familiar with the core concepts behind these learning algorithms and their successful applications, and who want to start thinking about how machine learning and deep learning might be useful in their research, business, or career development. The course will provide a comprehensive overview of statistical machine learning and deep learning methods. Topics span classical methods and modern techniques, including basic machine learning tools, supervised and unsupervised learning, deep neural networks, computational algorithms and software for deep learning, and various applications of deep learning.

SC4 - Data Quality for Data Science and Statistics: A Survey and Practical Application
Short Course

Thu, Jun 3, 3:00 PM - 6:30 PM

Instructor(s): Henry Li, Bigeye

Data science and statistics become more important in society every year—as a prime example, consider the sudden influx of public interest in COVID-19 tracking projects such as the tracker from 1Point3Acres. From published research that guides policy to the online predictive systems that set prices and control what we read, high-quality and reliable input data is a necessary (but not sufficient!) condition for quality outcomes.

This half-day course will cover the impact of data quality issues on data science and statistics work, taxonomies of data quality issues that can occur, a survey of current techniques and tools for issue identification, and how to start including data quality techniques in one’s data science work process.

Online Program

American Statistical Association