Short Courses

SDSS 2026 will offer one full-day and four half-day courses on Tuesday. Short courses require an additional fee.

Full-Day Course
8:30 a.m. – 5:30 p.m.

Modern Machine Learning with Bayesian Additive Regression Trees (BART)
Robert E. McCulloch, Arizona State University, and Rodney Sparapani, Medical College of Wisconsin

Relatively cheap, off-the-shelf computing power has led to breakthroughs in our ability to learn high-dimensional, complex relationships from large, curated data sets. The two key modeling approaches in this area are deep learning with neural networks and ensemble learning with trees. Deep learning is an ideal choice for prediction when all the covariates are of the same type (e.g., pixels, words, audio waves). Conversely, ensemble learning is superior with respect to out-of-sample predictive performance for tabular data when the covariates are of different types (e.g., age, sex, weight). A collection of machines (trees in this case) is fit simultaneously, and the ensemble’s aggregate predictive performance is far beyond that of any single machine’s fit.

The focus of this workshop is machine learning with an ensemble of trees, specifically Bayesian Additive Regression Trees (BART). Although its machine learning foundation might seem daunting, BART is easy to use. We will demonstrate with user-friendly R packages and worked examples. The Bayesian approach allows for a Markov chain Monte Carlo stochastic exploration of the model space, uncertainty quantification, and posterior inference. BART is a modern nonparametric approach that exploits the elegance and convenience of the Bayesian paradigm. Handling outcomes of different types (continuous, dichotomous/categorical, and time-to-event) is one of BART’s strengths.

Furthermore, we will demonstrate BART’s effectiveness in a wide range of regression applications, including marginal effects, variable selection, monotonicity, outlier detection, and time-to-event extensions such as competing risks or recurrent events. This is a lot of material to cover in one day, so we have developed digital materials for attendees to self-explore afterward.

Morning Half-Day Courses
8:30 a.m. – 12:30 p.m.

Getting Started with Positron: A Next-Generation IDE for Data Science
Mine Çetinkaya-Rundel, Duke University

Positron is a next-generation data science IDE built by Posit PBC that combines the best features of RStudio and Visual Studio Code. This tutorial will introduce Python and R users to Positron’s core capabilities, with an emphasis on helping RStudio and VS Code users while highlighting its seamless integration with the respective language ecosystems. For R programmers coming from RStudio, Positron delivers a familiar yet enhanced environment for data analysis and package development, while offering a path to Python when needed. For Python programmers coming from VS Code, Positron offers enhanced, data science–specific capabilities for exploring and analyzing data, authoring, and publishing. Unlike traditional IDEs oriented toward software development, Positron provides first-class support for data science–specific workflows through its native support for R and Python, along with a designated, integrated UI for exploring variables (environment), connections, plots, help, and more.

Everyday Reproducibility
Gregory Hunt, William & Mary, and Johann Gagnon-Bartsch, University of Michigan

Reproducibility is essential to statistics in both academia and industry. Indeed, reproducibility is fundamental to science itself and to public confidence in it. Beyond reproducibility, analyses must also be easy to share, access, and explore. Yet building analyses that meet these standards is rarely simple. The “reproducibility crisis” has drawn attention from both popular science and professional societies. This short course introduces participants to the core ideas behind reproducibility in statistics and data science, along with practical tools and workflows that make analyses more reproducible, shareable, and accessible. Our goal is to highlight approaches that statisticians and data scientists across diverse fields and computing environments can readily adopt.

Afternoon Half-Day Courses
1:30 p.m. – 5:30 p.m.

Expanding the Statistician’s Toolkit: Building and Sharing Data Science Tools in R
Mehdi Maadooliat, Jaihee Choi, and Daniel Cirkovic, Marquette University

R is a primary platform for statistical computing. Many statisticians, however, do not fully engage with the broader data science community. This short course helps statisticians take the next step. Participants will learn tools that enable them to create, disseminate, and collaborate effectively beyond their own projects. The course builds on the Short Course on R Tools (http://mmadoliat.github.io/SCoRT). The focus is on hands-on, practical coding sessions and working examples that participants can adapt immediately.

Toward Trustworthy Statistical Inference with Black-Box AI Predictions
Jiwei Zhao, University of Wisconsin-Madison

The widespread adoption of AI and ML has reshaped modern data analysis. Predictions, embeddings, and synthetic data from black-box models, such as deep neural networks and large language models, are increasingly incorporated into downstream statistical workflows. For instance, predicted gene expression values or polygenic risk scores are often substituted for experimental assays, enabling researchers to enlarge cohorts and pursue hypotheses when direct measurement is infeasible, costly, or time-consuming. While AI/ML models can usually deliver strong predictive performance, their opaque mechanisms and potential biases introduce additional layers of uncertainty that can compromise the validity of classical inference. Treating black-box outputs as ground truth risks biased estimation, misleading confidence intervals, and invalid hypothesis tests.

 

Key Dates

  • October 15, 2025
    Refereed Abstract & Panel Submission Opens
  • December 10, 2025 11:59 PM
    Refereed Abstract Submission Closes
  • December 10, 2025 11:59 PM
    Panel Proposal Submission Closes
  • January 7, 2026
    Lightning Abstract Submission Opens
  • January 7, 2026
    Early Registration and Housing Open
  • February 5, 2026 11:59 PM
    Lightning Abstract Submission Closes
  • March 25, 2026
    Early & Speaker Registration Deadline
  • March 26, 2026
    Regular Registration (increased fees apply)
  • April 3, 2026 5:00 PM
    Housing Deadline (EXTENDED!)
  • April 28, 2026 – May 1, 2026
    SDSS 2026 in Milwaukee, WI