Saturday, February 25
CS23 Latent Variable and Mixed Effects Models Sat, Feb 25, 11:00 AM - 12:30 PM
City Terrace 7

Optimization of Processes and Products from Historical (Un-Designed) Data (303299)

Mark-John Bruwer, ProSensus, Inc. 
*John F. MacGregor, ProSensus, Inc. 

Keywords: Historical data, process optimization, product development, latent variable models

R.A. Fisher’s famous quote that “all one can do with happenstance data is a post-mortem to see what they died of” has been a constant motivation for statisticians to promote the use of designed experiments (DOE). Unfortunately, the huge volume of historical data (“Big Data”) collected routinely by the process industries is happenstance data characterized by extreme correlation among the variables. This correlation results in regression models that are not unique and that lack causal information about the relationships among the variables. As a result, this plentiful and readily available historical data is often not used to optimize processes and products. This paper discusses how latent variable models built using methods such as PLS (Partial Least Squares, or Projection to Latent Structures) simultaneously provide reduced-dimensional models for both the X and Y spaces, thereby ensuring uniqueness. Furthermore, these models are causal in the reduced-dimensional latent variable space and hence allow one to optimize a process based on its historical data. These concepts are illustrated on several industrial problems and data sets: (1) optimization of batch and continuous chemical processes using historical data (a herbicide process and a polymerization process, respectively); (2) development of new and improved products through the simultaneous selection of raw materials, formulation ratios, and process conditions (e.g., functional polymers for golf ball cores, high-performance polymeric coatings, and food formulations). The optimizations are performed in the latent variable space, with all constraints that must be respected projected into that space.
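
The idea of optimizing in the latent variable space can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single response, mean-centred (unscaled) variables, a plain NIPALS PLS fit, and simulated data standing in for historical records. The function names (fit_pls1, predict, invert_for_target) and the target value are hypothetical. Given a desired response, the sketch picks the scores with the smallest Hotelling's T² that reach the target, then maps them back to process settings through the X loadings, so the proposed settings respect the correlation structure of the historical data.

```python
import numpy as np

def fit_pls1(X, y, n_components):
    """Fit a single-response PLS model by NIPALS (mean-centred, unscaled data)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean          # residuals, deflated one component at a time
    W, P, q = [], [], []
    for _ in range(n_components):
        w = E.T @ f
        w /= np.linalg.norm(w)             # weights: direction of maximum covariance with y
        t = E @ w                          # scores for this component
        p = E.T @ t / (t @ t)              # X loadings
        q_a = f @ t / (t @ t)              # y loading
        E = E - np.outer(t, p)             # deflate X
        f = f - q_a * t                    # deflate y
        W.append(w); P.append(p); q.append(q_a)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    R = W @ np.linalg.inv(P.T @ W)         # maps centred X directly to scores: T = Xc @ R
    T = (X - x_mean) @ R
    return {"x_mean": x_mean, "y_mean": y_mean, "P": P, "q": q, "R": R,
            "score_var": T.var(axis=0, ddof=1)}

def predict(model, X):
    """Project X onto the latent space and predict y from the scores."""
    T = (np.atleast_2d(X) - model["x_mean"]) @ model["R"]
    return model["y_mean"] + T @ model["q"]

def invert_for_target(model, y_target):
    """Scores with minimum Hotelling's T^2 whose predicted y equals y_target,
    plus the corresponding point in the original X space."""
    s2, q = model["score_var"], model["q"]
    t = s2 * q * (y_target - model["y_mean"]) / (q @ (s2 * q))
    x = model["x_mean"] + model["P"] @ t   # reconstruct settings from the X loadings
    return t, x

# Simulated "historical" data (an assumption for illustration):
# 10 highly correlated variables driven by 2 underlying factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 10)) + 0.05 * rng.normal(size=(200, 10))
y = latent @ np.array([1.0, -0.5]) + 0.05 * rng.normal(size=200)

model = fit_pls1(X, y, n_components=2)
y_fit = predict(model, X)
r2 = 1 - np.sum((y - y_fit) ** 2) / np.sum((y - y.mean()) ** 2)

y_target = y.mean() + 1.0                  # hypothetical target above the historical mean
t_new, x_new = invert_for_target(model, y_target)
print("R^2 on historical data:", round(r2, 3))
print("predicted y at proposed settings:", predict(model, x_new)[0])
```

In a real application the equality on the target would typically be replaced by a constrained optimization over the scores, with limits on the original variables and on T² expressed in the latent space; the closed-form solution here is used only to keep the sketch short.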