LabGenius is engaged in AI-driven scientific research that could only have happened at this moment in time, and the cause couldn't be more important.
Today, the company is focused on accelerating the discovery of advanced treatments for cancer and inflammatory diseases, but the principles can be applied far more widely.
Using a combination of artificial intelligence, synthetic biology and laboratory automation, the London-based biotech company is developing next-generation antibody therapeutics.
The technologies and techniques involved have only recently reached the level of maturity needed for this ambitious undertaking.
So, when IPU systems halved the compute time needed to run crucial AI model training, LabGenius' researchers realised that they had found a new and important tool in the race to innovate. The team used an off-the-shelf PyTorch version of the Transformer model, BERT, with code freely available on Graphcore's GitHub site, making it easy to use.
"Previously we used GPUs and it took us about a month to have a functioning model of all the proteins that are out there. With Graphcore, we reduced the turnaround time to about two weeks, so we can experiment much more rapidly, and we can see the results quicker," said Dr Katya Putintseva, a Machine Learning Advisor to LabGenius.
The protein problem
Finding or designing proteins with precisely the right qualities to treat medical conditions is notoriously complex. Only in the last couple of years have we started to see the first AI-designed small molecules enter clinical trials, marking this new era of drug discovery.
Even with protein design technologies, knowing how to adjust a protein's constituent amino acids precisely to improve its function is a huge challenge: beyond the scope of humans on their own, extremely difficult even with the help of conventional computation, but a problem well suited to artificial intelligence.
To exploit this new technology, LabGenius is creating an automated, closed-loop system for managing experimental iteration and the back-and-forth between biological experimentation and machine learning-powered decision-making. Proteins are sequenced, intelligently analysed, modified and re-synthesized in the search for the perfect protein recipe.
Beautiful data
Visitors to LabGenius' laboratories can see the physical part of the process in action, as liquid handling machines fill sample trays, which are picked up by robot arms and whisked away to the next stage of experimentation.
It is here that wet lab experimentation meets data science.
"The biggest problem of any biological challenge within the AI space, if you compare it to natural language processing or image recognition, is the scarcity of high-quality data that is representative enough of the features of interest," explains Dr Putintseva.
"You can find a lot of data out there, but the devil is in the details. How was that dataset generated? What biases does it contain? How far can the signal extracted from it be extrapolated within the sequence space?"
LabGenius' robotic platform produces and characterises the right sort of data, at the quality needed for its machine learning models.
"We believe that now is the time for high quality, beautiful datasets to be generated in biology," says Dr Putintseva.
Optimise and suggest
Using its carefully curated, high-quality datasets, LabGenius is able to apply artificial intelligence to two of the great challenges of novel protein therapy development.
The first is a classic AI problem: how to optimise many variables within highly complex systems.
"We call [this] co-optimization or multi-objective optimization," says Tom Ashworth, LabGenius' Head of Technology.
"You might be trying to optimise potency, which might be about the molecule's affinity, how sticky it is to its target, but at the same time you don't want to destroy its safety or perhaps some other characteristic like its stability."
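To make the idea of trading off several properties concrete, here is a minimal sketch of one common way to frame multi-objective optimisation: keeping only the candidates that no other candidate beats on every objective (the Pareto front). This is an illustration, not LabGenius' method. The variant names and the (potency, safety, stability) scores are invented, and the sketch assumes higher is better on every axis.

```python
from typing import Dict, List, Tuple

# Hypothetical scores per candidate: (potency, safety, stability), higher is better.
Scores = Tuple[float, float, float]


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is at least as good as b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates: Dict[str, Scores]) -> List[str]:
    """Return the sorted names of candidates that no other candidate dominates."""
    return sorted(
        name
        for name, scores in candidates.items()
        if not any(
            dominates(other_scores, scores)
            for other, other_scores in candidates.items()
            if other != name
        )
    )


# Invented example data, for illustration only.
candidates = {
    "variant_a": (0.9, 0.4, 0.7),  # most potent, weaker on safety
    "variant_b": (0.6, 0.8, 0.7),  # safer, less potent
    "variant_c": (0.5, 0.3, 0.6),  # worse than variant_a on every axis
    "variant_d": (0.9, 0.4, 0.5),  # matches variant_a but less stable
}

front = pareto_front(candidates)
print(front)
```

Neither `variant_a` nor `variant_b` is better than the other on every axis, so both survive. Choosing between them is the kind of trade-off between potency and safety that the quote describes.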
AI also informs how LabGenius iterates its experiments.
"[The system] is looking across different features we could change about the molecule, from point mutations of simpler constructs to the overall composition and topology of multi-module proteins. It's making suggestions about what to design next... to learn about a change in the input and how that maps to a change in the output," said Tom.
Biological BERT
LabGenius uses Graphcore IPU compute in the Cirrascale IPU cloud to accelerate its training of BERT, the transformer model best known for natural language processing and now finding an increasingly broad range of applications, including in biotech.
LabGenius researchers take a large body of known proteins and ask BERT to predict masked amino acids from training data, effectively learning the basic biophysics of proteins, according to Dr Putintseva: "Because it does that, the hidden values of that model help us to generate meaningful representation of proteins that we subsequently use to map out the feature of interest."
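The masking step can be pictured with a small sketch. It hides a fraction of the residues in a protein sequence and records the originals as the targets a model would be trained to predict. This is a simplified illustration rather than the pipeline LabGenius ran: the sequence, the 15% mask rate and the `mask_sequence` helper are assumptions, and standard BERT pre-training also sometimes swaps in random tokens or leaves the chosen positions unchanged, which is omitted here.

```python
import random
from typing import List, Tuple

MASK = "[MASK]"


def mask_sequence(
    sequence: str, mask_rate: float = 0.15, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Hide a random subset of residues, one token per amino acid.

    Returns (inputs, labels). labels holds the original residue at each
    masked position and an empty string everywhere else.
    """
    rng = random.Random(seed)  # seeded so the example is repeatable
    tokens = list(sequence)
    # Mask at least one residue, but never more than the sequence holds.
    n_mask = min(len(tokens), max(1, round(len(tokens) * mask_rate)))
    positions = set(rng.sample(range(len(tokens)), n_mask))
    inputs = [MASK if i in positions else tok for i, tok in enumerate(tokens)]
    labels = [tok if i in positions else "" for i, tok in enumerate(tokens)]
    return inputs, labels


# Hypothetical 22-residue sequence, for illustration only.
sequence = "MKTAYIAKQRQISFVKSHFSRQ"
inputs, labels = mask_sequence(sequence)
print(" ".join(inputs))
```

A model trained on this objective sees `inputs` and is scored on how well it recovers the residues stored in `labels`. The hidden states it learns along the way are the protein representations that Dr Putintseva describes.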
LabGenius researchers used Graphcore's standard PyTorch implementations of BERT available on GitHub. With minimal need for code modifications, they were able to focus their attention on ensuring appropriateness of the dataset for the job in hand.
The fact that Graphcore IPUs were able to cut the training time so dramatically, on a model that needs to be repeatedly re-trained, hands LabGenius a substantial advantage in a competitive industry, according to Tom Ashworth.
"As a startup, how fast we can move, how fast we can iterate, is central to everything."
"Graphcore has changed what we're able to do, accelerating our model training time from weeks to days. For our data scientists, that's really transformative. They can move much more at the speed they think. For us, that's incredibly valuable," said Tom.
LabGenius is now looking to expand its use of Graphcore-trained BERT models, including further use within the discovery phase, as well as understanding the developability of its molecules. In addition, it is starting to explore building new AI models on Graphcore systems, including GNNs (Graph Neural Networks) where the IPU has an innate architectural advantage.