Online Program

Saturday, May 19
Data Science
Data Science in Practice
Sat, May 19, 10:30 AM - 12:00 PM
Grand Ballroom G

Applied Techniques for Machine Learning with Limited Data (304449)

*Andrew Hoblitzell, IUPUI 
*Andrew Hoblitzell, Purdue University 

Keywords: deep learning, uncertainty quantification, online machine learning, data farming, few-shot learning, design of experiments, interactive optimization, decision support system, cybernetics, user modeling

Despite the recent renaissance in big-data infrastructure tools and machine learning algorithms, many use cases remain inherently limited in data set size because of the economic cost of procuring data. This work provides an overview of several techniques available for 'small data' data science, with a focus on implementation within a deep learning network. General techniques touched on include consulting domain experts who are familiar with small-sample experiments in the relevant area and reducing model complexity. We then focus on techniques more specific to our approach, including application-specific data augmentation and engineering, generative models, and few-shot learning. Because so little data is involved, we also discuss ways to communicate prediction uncertainty to stakeholders. An interactive decision support system in which stakeholders design user-centric environmental solutions is presented as a case study of one successful way of dealing with the limited data problem. The contributions of this work are to: i) describe current techniques for dealing with limited data, ii) outline current work in user modeling with limited data, and iii) describe our work in implementing accurate virtual user models for driving environmental outcomes.