Online Program

Return to main conference page
Friday, May 31
Machine Learning
Machine Learning E-Posters, I
Fri, May 31, 9:45 AM - 10:45 AM
Grand Ballroom Foyer

Adaptively Stacked Ensembles for Influenza Forecasting with Incomplete Data (306223)


*Thomas Charles McAndrew, University of Massachusetts Amherst 
Nicholas G Reich, University of Massachusetts Amherst 

Keywords: Forecasting, Ensemble methods, Influenza, Bayesian, Machine learning

Seasonal influenza infects an average of 30 million people in the United States every year, overburdening hospitals during weeks of peak incidence. Named by the CDC as an important tool to fight the damaging effects of these epidemics, accurate forecasts of influenza and influenza-like illness (ILI) forewarn public health officials about when, and where, seasonal influenza outbreaks will hit hardest.

Ensemble forecasts have shown positive results, predicting 1- to 4-week-ahead ILI percentages better than any single model in the ensemble. But current ensemble forecasts are static within a season: they train on past ILI data before the season begins and generate optimal weights for each model that are then held constant throughout the season. We propose a novel adaptive ensemble forecast that can change model weights week by week throughout the flu season, needs only current-season data to make predictions, and can moderate ensemble weights with a prior chosen through cross-validation. At the start of the season, without any data, models are weighted equally; after observing how the ensemble's component models perform on week-ahead forecasts, a variational inference algorithm updates the model weights and forecasts the next 4 weeks of ILI percentages.
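The weighting scheme described above can be illustrated with a deliberately simplified sketch. This is not the authors' variational inference algorithm; it assumes a conjugate-style shortcut in which each week's per-model probabilistic scores act as soft "success counts" added to a symmetric Dirichlet prior, so that weights start equal and shift toward better-performing models as the season unfolds. The function name and the score values are hypothetical.

```python
import numpy as np

def update_weights(prior_counts, model_scores):
    """Simplified adaptive-ensemble weighting sketch (not the paper's
    variational inference): add each week's per-model scores to a
    symmetric Dirichlet prior and normalize to get model weights."""
    counts = prior_counts + model_scores.sum(axis=0)
    return counts / counts.sum()

# Hypothetical example: 3 models, symmetric prior (equal starting weights),
# 2 weeks of per-model probabilities assigned to the observed truth
# (higher = better forecast that week).
prior = np.full(3, 1.0)
scores = np.array([[0.6, 0.3, 0.1],
                   [0.7, 0.2, 0.1]])
w = update_weights(prior, scores)
# Model 1 scored best both weeks, so it receives the largest weight.
```

A larger symmetric prior pulls the weights back toward equality, which mirrors the abstract's point that the prior's strength (relative to the training data) moderates how quickly the ensemble commits to any one model.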

Our adaptive ensemble performs as well as the static ensemble at the US national and regional levels (mean % log-score difference between the adaptive and static ensembles = 1.1; upper 97.5% CI = 1.4) and better in sparser seasons (e.g., the 2011/2012 season; mean = -2.9; 95% CI = [-3.7, -2.1]; p < 0.01). The choice of prior (equal across models) also affects adaptive ensemble performance, with the best log-score achieved at a prior weight equal to 8% of the training data.

In settings without substantial past data (e.g., emerging pandemics), or when new models lack a long track record of performance, an adaptive ensemble approach will be the only option for performance-based weighting of models, enhancing the public health impact of ensemble forecasts.