Abstract Details

Activity Number: 352 - Recent Development in Imaging Data Analysis
Type: Contributed
Date/Time: Tuesday, July 30, 2019 : 10:30 AM to 12:20 PM
Sponsor: Section on Statistics in Imaging
Abstract #307038 Presentation
Title: Estimating the Amount of Training Data for a Deep Learning Algorithm to Detect Severe Burns
Author(s): Amy Nussbaum* and Jeffrey Thatcher and Faliu Yi and Ron Baxter and Aadeesh Shringarpure and Humberto Talavera and Kevin Plant
Companies: SpectralMD and SpectralMD and SpectralMD and SpectralMD and SpectralMD and SpectralMD and SpectralMD
Keywords: Deep Learning; Image Processing; Sample Size; Training; Validation
Abstract:

With a recent increase in deep learning and image processing applications in medical diagnosis, it is necessary to estimate the amount of data required to adequately train these algorithms. This is especially difficult when the ideal data does not exist in an accessible database.

We are designing a study to collect images to train a deep learning algorithm to detect severe burns on human skin. We will generate our own training database and estimate the sample size necessary for the algorithm to achieve an appropriate level of accuracy when validated on an independent dataset. One method for sample size estimation is a learning curve (Figueroa et al., 2012, Equation 1), which depends on three parameters: the minimum achievable error, the learning rate, and the decay rate.
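To illustrate how such a learning curve turns into a sample size, the sketch below assumes the inverse power law form usually associated with Figueroa et al. (2012), acc(n) = (1 - a) - b * n^c, where a is the minimum achievable error, b the learning rate, and c the (negative) decay rate. The function names and the parameter values are hypothetical and are not taken from this study.

```python
import math

def predicted_accuracy(n, a, b, c):
    """Inverse power law learning curve: accuracy expected at training size n."""
    return (1.0 - a) - b * n ** c

def required_sample_size(target, a, b, c):
    """Smallest integer n whose predicted accuracy reaches `target`.

    Returns None when the target is above the curve's ceiling (1 - a),
    when the parameters do not describe an increasing curve, or when the
    implied sample size is too large to represent.
    """
    gap = (1.0 - a) - target
    if gap <= 0 or b <= 0 or c >= 0:
        return None
    try:
        # Solve (1 - a) - b * n**c = target for n, then round up.
        return math.ceil((gap / b) ** (1.0 / c))
    except OverflowError:
        return None

# Hypothetical parameters, chosen only to exercise the formula.
n_needed = required_sample_size(target=0.88, a=0.05, b=0.5, c=-0.5)
print(n_needed, predicted_accuracy(n_needed, 0.05, 0.5, -0.5))
```

Because the curve flattens toward 1 - a, a target close to that ceiling makes the required sample size grow very quickly, and a target above it has no solution at all.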

Using the bootstrap, we apply this learning curve to our current small dataset. This method estimates that the algorithm would require a dataset of approximately 108 subjects to achieve 88% accuracy in detecting burned tissue. Limitations included outliers, non-convergent parameter estimates, and unrepresentative validation sets. Despite these limitations, we found that this method provides a practical estimate of training database sample size.
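The abstract does not describe the fitting procedure, so the following is only a minimal sketch of one way a bootstrap could be wrapped around a learning curve fit. It assumes the inverse power law form acc(n) = (1 - a) - b * n^c and substitutes an unweighted profile least squares fit over a grid of c values for whatever estimator the authors used. All names, the grid, and the data are hypothetical; the accuracies are simulated, not measurements from the burn study.

```python
import math
import random
import statistics

def required_sample_size(target, a, b, c):
    """Smallest n with (1 - a) - b * n**c >= target, or None if unreachable."""
    gap = (1.0 - a) - target
    if gap <= 0 or b <= 0 or c >= 0:
        return None
    try:
        return math.ceil((gap / b) ** (1.0 / c))
    except OverflowError:
        return None

def fit_curve(points, c_grid):
    """Fit acc = (1 - a) - b * n**c by profile least squares.

    For a fixed c the model is linear in (1 - a) and b, so both are solved
    in closed form; the c with the smallest squared error is kept.
    Returns (a, b, c), or None if there are fewer than 3 distinct sizes.
    """
    if len({n for n, _ in points}) < 3:
        return None
    y = [acc for _, acc in points]
    ybar = statistics.fmean(y)
    best = None
    for c in c_grid:
        z = [n ** c for n, _ in points]
        zbar = statistics.fmean(z)
        szz = sum((zi - zbar) ** 2 for zi in z)
        slope = sum((zi - zbar) * (yi - ybar) for zi, yi in zip(z, y)) / szz
        intercept = ybar - slope * zbar
        sse = sum((yi - intercept - slope * zi) ** 2 for zi, yi in zip(z, y))
        if best is None or sse < best[0]:
            best = (sse, 1.0 - intercept, -slope, c)
    return best[1:]

def bootstrap_sample_size(points, target, n_boot=500, seed=0):
    """Resample (size, accuracy) pairs, refit, and collect required sizes."""
    rng = random.Random(seed)
    c_grid = [-k / 20 for k in range(1, 41)]  # hypothetical grid: -0.05 .. -2.0
    estimates, failures = [], 0
    for _ in range(n_boot):
        resample = rng.choices(points, k=len(points))
        fit = fit_curve(resample, c_grid)
        n_req = None if fit is None else required_sample_size(target, *fit)
        if n_req is None:
            failures += 1  # unusable fit: target unreachable or degenerate
        else:
            estimates.append(n_req)
    return estimates, failures

# Simulated data from a = 0.05, b = 0.5, c = -0.5 plus Gaussian noise.
sim = random.Random(1)
points = [(n, 0.95 - 0.5 * n ** -0.5 + sim.gauss(0, 0.005))
          for n in (5, 8, 10, 15, 20, 25, 30, 40) for _ in range(3)]
estimates, failures = bootstrap_sample_size(points, target=0.88)
print("median required n:", statistics.median(estimates))
print("unusable resamples:", failures, "of", len(estimates) + failures)
```

Counting the unusable resamples separately, rather than discarding them silently, keeps the kind of non-convergence mentioned among the limitations visible in the output. Reporting the median also makes the summary less sensitive to outlying resamples than a mean would be.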


Authors who are presenting talks have a * after their name.

Back to the full JSM 2019 program