Talks and presentations

Bayesian optimization as an approach to drug development

July 06, 2021

Talk, Machine Learning and AI in Bio(Chemical) Engineering Conference, Invited Speaker, Cambridge, UK

Optimization is ubiquitous in pharmaceutical development, from tuning chemical structure to maximize potency to optimizing the yield of a chemical process. Likewise, parameter optimization is omnipresent in artificial intelligence, from tuning virtual personal assistants to training social media and product recommendation systems. Owing to the high cost of carrying out experiments, scientists in both areas set numerous (hyper)parameter values by evaluating only a small subset of the possible configurations. Bayesian optimization, an iterative response surface-based global optimization algorithm, has demonstrated exceptional performance in the tuning of machine learning models. Here we report the development of a framework for Bayesian reaction optimization and an open-source software tool that allows chemists to easily integrate state-of-the-art optimization algorithms into their everyday laboratory practices. We collect a large benchmark dataset for a palladium-catalysed direct arylation reaction, perform a systematic study of Bayesian optimization compared to human decision-making in reaction optimization, and apply Bayesian optimization to two real-world optimization efforts (Mitsunobu and deoxyfluorination reactions). Benchmarking is accomplished via an online game that links the decisions made by expert chemists and engineers to real experiments run in the laboratory. Our findings demonstrate that Bayesian optimization outperforms human decision-making in both average optimization efficiency (number of experiments) and consistency (variance of outcome against initially available data). Overall, our studies suggest that integrating Bayesian optimization methods into everyday laboratory practices could facilitate more efficient synthesis of functional chemicals by enabling better-informed, data-driven decisions about which experiments to run.
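For readers unfamiliar with the iterative loop the abstract refers to, the following is a minimal sketch of Bayesian optimization over a discrete set of candidate conditions. It is not the open-source tool described in the talk. The Gaussian process surrogate, the expected-improvement acquisition function, the function names, and the `toy_yield` objective (a made-up smooth surface over two normalized reaction parameters, not experimental data) are all assumptions chosen for illustration.

```python
import math
import numpy as np


def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two sets of row vectors."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)


def gp_posterior(x_train, y_train, x_query, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean, unit-variance GP."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_tq = rbf_kernel(x_train, x_query)
    mean = k_tq.T @ np.linalg.solve(k_tt, y_train)
    var = 1.0 - np.sum(k_tq * np.linalg.solve(k_tt, k_tq), axis=0)
    return mean, np.sqrt(np.clip(var, 1e-12, None))


def expected_improvement(mean, std, best):
    """Expected improvement over the best observed value (maximization)."""
    z = (mean - best) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    return (mean - best) * cdf + std * pdf


def bayes_opt(objective, candidates, n_init=3, n_iter=10, seed=0):
    """Pick experiments one at a time from a fixed list of candidate conditions."""
    rng = np.random.default_rng(seed)
    # Start from a few randomly chosen experiments.
    tried = [int(i) for i in rng.choice(len(candidates), size=n_init, replace=False)]
    yields = [objective(candidates[i]) for i in tried]
    for _ in range(n_iter):
        y = np.array(yields)
        y_std = (y - y.mean()) / (y.std() + 1e-9)  # standardize observations
        mean, std = gp_posterior(candidates[tried], y_std, candidates)
        ei = expected_improvement(mean, std, y_std.max())
        ei[tried] = -np.inf  # never repeat an experiment
        nxt = int(np.argmax(ei))
        tried.append(nxt)
        yields.append(objective(candidates[nxt]))
    return tried, yields


def toy_yield(x):
    """Hypothetical yield surface (percent) peaking near (0.7, 0.3)."""
    return float(100.0 * np.exp(-((x[0] - 0.7) ** 2 + (x[1] - 0.3) ** 2) / 0.05))


# 11 x 11 grid of two normalized parameters (e.g. temperature, concentration).
axis = np.linspace(0.0, 1.0, 11)
grid = np.array([[a, b] for a in axis for b in axis])

tried, yields = bayes_opt(toy_yield, grid)
best = int(np.argmax(yields))
print("best conditions:", grid[tried[best]], "yield:", round(yields[best], 1))
```

Each pass refits the surrogate to all results so far and then selects the untested condition with the highest expected improvement, so only 13 of the 121 candidate conditions are evaluated in this example. A practical tool would also need to handle categorical reaction variables and batches of parallel experiments, which this sketch does not attempt.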

Machine learning in methods development: From reaction outcome prediction to mechanistic understanding

May 01, 2019

Talk, Green Chemistry & Engineering, Invited Speaker, Reston, Virginia

Machine learning (ML), the development and study of computer algorithms that can learn from data, is increasingly important across a wide array of applications in chemistry. For example, ML has facilitated virtual screening of drug-like molecules for medical applications, rapid prediction of physical data, and computer-aided synthesis planning. While ML has become well-established in these areas, scientists have only just begun to advance tools for synthetic methods development (reaction optimization, prediction, mechanistic study). Though these burgeoning areas of research have already added to the synthetic chemist’s toolbox, average research practices have remained relatively unaffected. One approach to facilitating the adoption of ML in synthetic chemistry is to develop applications which integrate seamlessly with the typical methods of synthetic chemists. Here I will discuss approaches to some obstacles to incorporating ML into mainstream synthetic chemistry, including: (1) Interpretability: scientists may not trust a model because predictions appear to be unintelligible or derived randomly from regressors. This challenge could be overcome by using simple interpretable graphics and traditional physical organic chemistry to explain and experimentally probe ML results. (2) Data: current approaches to applying ML in synthetic chemistry have focused on mining the chemical literature or actively generating new datasets on a per-problem basis. However, mined data is sparse, noisy, and often incomplete, and dataset curation imposes a heavy experimental cost. An alternative approach is to draw from the success of ML in other areas which incorporate data endogenous to a given domain (e.g. product recommendation systems). Much of the data collected in synthetic chemistry laboratories is derived from the optimization of reactions. While this data is typically leveraged only towards the discovery of optimal conditions, a method which draws from optimization data, quantum chemical calculations, and ML could naturally integrate with synthetic research practices.