Preferential Bayesian Optimisation for Protein Design with Fine-Tuned Protein Language Model Ensembles

Alex Hawkins-Hooker | Paul Duckworth | Oliver Bent

ABSTRACT

Recent work has observed that ranking-based loss functions improve the quality of fitness landscape predictions for both standard supervised deep learning models and fine-tuned protein language models. We consider the implications of this finding for protein design with Bayesian optimisation. We investigate a range of uncertainty quantification techniques applicable to protein language models fine-tuned with ranking losses, showing that they can offer calibration competitive with that of CNN ensembles while achieving superior predictive performance. Finally, we demonstrate how uncertainty-aware ranking-based models can be exploited for protein design within the framework of preferential Bayesian optimisation.
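To make the two ingredients named above concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a Bradley-Terry style pairwise logistic loss as one common example of a ranking loss, and uses disagreement between ensemble members as one simple uncertainty estimate over a pairwise preference. The abstract does not specify which losses or uncertainty techniques the paper uses, and the function names `pairwise_ranking_loss` and `ensemble_preference` are hypothetical.

```python
import math
from itertools import combinations


def pairwise_ranking_loss(scores, fitness):
    """Mean Bradley-Terry (pairwise logistic) loss over pairs with distinct fitness.

    scores:  model outputs, one per sequence (assumed scalar scores)
    fitness: measured fitness values, one per sequence
    """
    total, n_pairs = 0.0, 0
    for i, j in combinations(range(len(scores)), 2):
        if fitness[i] == fitness[j]:
            continue  # ties carry no preference information
        # Orient the pair so that w is the sequence with higher measured fitness.
        w, l = (i, j) if fitness[i] > fitness[j] else (j, i)
        margin = scores[w] - scores[l]
        # -log(sigmoid(margin)) written in a numerically stable form.
        total += max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
        n_pairs += 1
    return total / n_pairs if n_pairs else 0.0


def ensemble_preference(member_scores_a, member_scores_b):
    """Mean and standard deviation, across ensemble members, of P(a preferred to b).

    Each list holds one score per ensemble member for sequence a or b.
    """
    probs = [
        1.0 / (1.0 + math.exp(-(a - b)))
        for a, b in zip(member_scores_a, member_scores_b)
    ]
    mean = sum(probs) / len(probs)
    var = sum((p - mean) ** 2 for p in probs) / len(probs)
    return mean, math.sqrt(var)
```

Under these assumptions, scores that order sequences consistently with measured fitness give a lower loss than scores that reverse the order, and the spread of the per-member preference probabilities is the kind of quantity an acquisition function in preferential Bayesian optimisation could use.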

InstaDeep