Online Program

Saturday, February 16
Sat, Feb 16, 11:00 AM - 12:30 PM
Magazine
Parametric-Independent Methods

On Missing Random Effects in Machine Learning (303826)

*Fabio D'Ottaviano, The Dow Chemical Company 

Keywords: Random Effect, Machine Learning, Mixed Effects Model

The wide availability of undesigned data, a by-product of chemical industrial research and manufacturing, makes the venturesome use of machine learning attractive: its plug-and-play appeal promises to extract value from this data. Often this type of data reflects not only the response to controlled variation but also variation caused by random effects. The literature search presented in this study corroborates the idea that, within the chemical industry, the notion of random effects is relatively unknown. Thus, machine-learning-based models in this industry may easily miss active random effects. This study shows by simulation the effect of missing a random effect (batch ID) via machine learning, versus including it via mixed effects modeling, for a set of experimental variables commonly encountered in the chemical industry and as a function of the total variance in the response, the proportion of this variance attributed to the batch random effect, the batch size relative to the data size, and the data size. The results also generalize to the effect of missing a random effect by means other than machine learning.
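
The kind of simulation described in the abstract can be illustrated with a minimal sketch. It is not the study's actual design: the single predictor, the linear response, the parameter names (`total_var`, `batch_share`, and so on), and their default values are all assumptions made for illustration. The batch-aware fit is a simple within-batch (centered) least-squares fit, swapped in as a stand-in for a full mixed effects model so that the example needs only the Python standard library.

```python
import random


def simulate(n_batches=30, batch_size=10, total_var=1.0,
             batch_share=0.5, slope=2.0, seed=0):
    """Simulate y = slope * x + batch effect + noise (hypothetical setup)."""
    rng = random.Random(seed)
    # Split the total response variance between batch effect and noise.
    sd_batch = (total_var * batch_share) ** 0.5
    sd_noise = (total_var * (1.0 - batch_share)) ** 0.5
    rows = []
    for batch_id in range(n_batches):
        u = rng.gauss(0.0, sd_batch)  # one random intercept per batch
        for _ in range(batch_size):
            x = rng.uniform(-1.0, 1.0)
            y = slope * x + u + rng.gauss(0.0, sd_noise)
            rows.append((batch_id, x, y))
    return rows


def _mean(values):
    return sum(values) / len(values)


def residual_variance_ignoring_batch(rows):
    """Pooled least squares of y on x; the batch ID is never used."""
    xs = [x for _, x, _ in rows]
    ys = [y for _, _, y in rows]
    mx, my = _mean(xs), _mean(ys)
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    sse = sum((y - my - b * (x - mx)) ** 2 for x, y in zip(xs, ys))
    return sse / (len(rows) - 2)  # intercept and slope estimated


def residual_variance_with_batch(rows):
    """Least squares after centering x and y within each batch."""
    groups = {}
    for batch_id, x, y in rows:
        groups.setdefault(batch_id, []).append((x, y))
    xc, yc = [], []
    for pairs in groups.values():
        mx = _mean([x for x, _ in pairs])
        my = _mean([y for _, y in pairs])
        xc.extend(x - mx for x, _ in pairs)
        yc.extend(y - my for _, y in pairs)
    b = sum(x * y for x, y in zip(xc, yc)) / sum(x * x for x in xc)
    sse = sum((y - b * x) ** 2 for x, y in zip(xc, yc))
    # One mean per batch plus one slope estimated.
    return sse / (len(rows) - len(groups) - 1)


rows = simulate()
print("residual variance, batch ignored:  ", residual_variance_ignoring_batch(rows))
print("residual variance, batch included: ", residual_variance_with_batch(rows))
```

Under these assumptions, the fit that ignores the batch ID absorbs the batch-to-batch variance into its residual, so its residual variance tends toward the total variance, while the batch-aware fit tends toward the noise share alone. Varying `batch_share`, `batch_size`, and `n_batches` mimics the factors listed in the abstract.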