Learning Representations for Hyperparameter Transfer Learning (IS Colloquium)
Bayesian optimization (BO) is a model-based approach to gradient-free optimization of black-box functions, with hyperparameter optimization as a prominent example. Typically, BO relies on conventional Gaussian process regression, whose algorithmic complexity is cubic in the number of evaluations. As a result, Gaussian process-based BO cannot leverage large numbers of past function evaluations, for example, to warm-start related BO runs. After a brief introduction to BO and an overview of several use cases at Amazon, I will discuss a multi-task adaptive Bayesian linear regression model whose computational complexity is linear in the number of function evaluations and which can leverage information from related black-box functions through a shared deep neural net. Experimental results show that the neural net learns a representation suitable for warm-starting related BO runs, and that these runs can be accelerated further when the target black-box function (e.g., validation loss) is learned together with other related signals (e.g., training loss). The proposed method was found to be at least one order of magnitude faster than competing neural network-based methods recently published in the literature. This is joint work with Valerio Perrone, Rodolphe Jenatton, and Matthias Seeger.
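To illustrate why Bayesian linear regression on learned features scales linearly in the number of evaluations, here is a minimal single-task sketch. It is not the speakers' implementation: the feature map `features` is a fixed random tanh layer standing in for the shared deep neural net (which the model in the abstract learns across tasks), and the prior precision `alpha` and noise precision `beta` are hypothetical fixed values rather than learned ones.

```python
import numpy as np

def features(x, W, b):
    """Hypothetical stand-in for the shared neural net: one fixed tanh layer."""
    return np.tanh(x @ W + b)

def blr_fit(Phi, y, alpha=1.0, beta=100.0):
    """Posterior over weights for y = Phi @ w + noise, with w ~ N(0, I / alpha).

    Building Phi.T @ Phi costs O(N * D^2), so the cost is linear in the
    number of evaluations N; the Cholesky factorization is O(D^3) and
    does not depend on N.
    """
    D = Phi.shape[1]
    A = alpha * np.eye(D) + beta * Phi.T @ Phi      # posterior precision (D x D)
    L = np.linalg.cholesky(A)                       # A = L @ L.T
    m = beta * np.linalg.solve(L.T, np.linalg.solve(L, Phi.T @ y))  # posterior mean
    return m, L

def blr_predict(Phi_new, m, L, beta=100.0):
    """Predictive mean and variance at new feature vectors."""
    mean = Phi_new @ m
    V = np.linalg.solve(L, Phi_new.T)               # L^{-1} phi for each new point
    var = 1.0 / beta + np.sum(V ** 2, axis=0)       # noise + phi^T A^{-1} phi
    return mean, var

# Toy usage: 200 past evaluations of a synthetic black-box function.
rng = np.random.default_rng(0)
W, b = rng.normal(size=(2, 8)), rng.normal(size=8)
x = rng.uniform(-1.0, 1.0, size=(200, 2))
Phi = features(x, W, b)
y = Phi @ rng.normal(size=8) + 0.1 * rng.normal(size=200)

m, L = blr_fit(Phi, y)
mean, var = blr_predict(features(x[:5], W, b), m, L)
```

The predictive mean and variance are the quantities an acquisition function would consume in a BO loop. Doubling the number of evaluations only doubles the cost of forming `Phi.T @ Phi`, whereas exact Gaussian process regression would need to factorize an N x N kernel matrix.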
Biography: Cédric Archambeau is Principal Applied Scientist at Amazon, Berlin, and an associate member of the Department of Statistics at the University of Oxford. His work at Amazon aims to democratise machine learning, enabling engineering and data science teams across the company to deliver a wide range of machine learning-based products, including customer-facing services such as Amazon SageMaker (aws.amazon.com/sagemaker). He is interested in learning representations, meta-learning and, more generally, machine reasoning, exploring algorithms that learn to learn and avoid catastrophic forgetting. Prior to joining Amazon, he led the Machine Learning group at Research Centre Europe (now Naver Labs Europe) in Grenoble. His team conducted applied research in machine learning, computational statistics and mechanism design, with applications in customer care, transportation and governmental services.