Abstract:
|
Reaction chemistry is poised to undergo the same high-throughput transformation that produced the omics revolution in biology. We propose an iterative data science workflow that allows quick, cheap, and effective analysis of chemical reactions. In particular, we identify the data-driven decisions required at multiple stages of the analysis, including the screening of plate output for successful reaction combinations, the prediction of new reactions for either exploitation of high-yield conditions or exploration of effective substrate pairings, and the design of new plates. Our setup relies heavily on bespoke chemical systems that allow many reactions to be run at once. This massively increases the scale of experimentation but also presents unique challenges, such as a short time horizon, the desire to perform branched inquiries, the presence of separate high-dimensional descriptor spaces, and design constraints on plates. In this talk, I will underscore these statistical challenges, sketch our pipeline, and discuss both the subtleties of removing plate effects and the reformulation of the design of a new plate as a constrained optimization problem. This is joint work with colleagues in BU Chemistry.
|