A person walks next to the Google Cloud logo at the Mobile World Congress (MWC) in Barcelona, Spain, on February 27, 2023. (Photo: Nacho Doce | Reuters)
One tool, called the Target and Lead Identification Suite, is designed to help companies predict and understand the structure of proteins, a fundamental part of drug development. Another, the Multiomics Suite, will help researchers ingest, store, analyze and share mass amounts of genomic data.
The new developments mark Google’s latest advancement in the red-hot AI arms race, where tech companies are competing to dominate a market that analysts believe could someday be worth trillions. The company has faced pressure to showcase its generative artificial intelligence technology since the public release of OpenAI’s ChatGPT late last year.
The two new Google Cloud suites help address a long-standing issue in the biopharma industry: the lengthy and costly process of bringing a new medicine to the U.S. market.
Drug companies can invest anywhere from a few hundred million dollars to more than $2 billion to launch a single drug, according to a recent Deloitte report. Their efforts aren’t always successful. Medicines that reach clinical trials have a 16% chance of being approved in the U.S., another Deloitte report says.
That hefty cost and bleak success rate are accompanied by an extensive and tedious research process that typically lasts about 10 to 15 years.
The new suites will save companies a “statistically significant” amount of time and money throughout the drug development process, said Shweta Maniar, Google Cloud’s global director of life sciences strategy and solutions. Google did not provide CNBC with specific figures.
“We’re helping organizations get medicines to the right people faster,” Maniar told CNBC in an interview. “I am personally very excited, this is something that myself and the team have been working on for a few years now.”
Both suites are widely available to customers starting Tuesday. Google said the cost will vary depending on the company. Several businesses, including Big Pharma’s Pfizer and the biotech companies Cerevel Therapeutics and Colossal Biosciences, have already been using the products.
Target and Lead Identification Suite
The Target and Lead Identification Suite aims to streamline the first key step of drug development, which is identifying a biological target that researchers can focus on and design a treatment around, according to Maniar.
A biological target is most commonly a protein, an essential building block of diseases and all other parts of life. Finding that target involves identifying the structure of a protein, which determines its function, or the role it plays in a disease.
“If you can understand the role, the protein structure and role, now you can start developing drugs around that,” Maniar said.
But that process is time-consuming and often unsuccessful.
Scientists can take around 12 months just to identify a biological target, according to a widely followed guidance manual for drugmakers posted in a database run by the federal National Library of Medicine. The two techniques researchers traditionally use to determine protein structures also have a high rate of failure, according to Maniar.
She also said traditional technologies struggle to scale their workloads up or down with demand.
Google Cloud’s suite has a three-pronged approach for making that process more efficient.
The suite allows scientists to ingest, share and manage molecular data on a protein using Google Cloud’s Analytics Hub, a platform that lets users securely exchange data across organizations.
Researchers can then use that data to predict the structure of a protein with AlphaFold2, a machine learning model developed by DeepMind, a subsidiary of Google.
AlphaFold2 runs on Google’s Vertex AI pipeline, a platform that allows researchers to build and deploy machine learning models faster.
In minutes, AlphaFold2 can predict the 3D structure of a protein with more accuracy than traditional technologies and at the scale researchers need. Predicting that structure is critical because it can help researchers understand a protein’s function in a disease.
The final component of Google Cloud’s suite helps researchers identify how the protein’s structure interacts with different molecules. A molecule can become the basis for a new drug if it changes that protein’s function and ultimately demonstrates the ability to treat the disease.
Researchers can use Google Cloud’s high-performance computing resources to find “the most promising” molecules that could lead to the development of a new drug, according to a press release on the new tools. Those services provide the infrastructure companies need to accelerate, automate and scale up their work.
Cerevel, which focuses on developing treatments for neuroscience diseases, typically has to screen a large library of 3 million different molecules to find one that will produce a positive effect against a disease, according to Chief Scientific Officer John Renger. He called that process “complicated and involved and expensive.”
But Renger said the company will be able to weed out molecules faster using Google Cloud’s suite. Computers will take care of screening molecules and help Cerevel “get to an answer really quickly,” he said.
Renger estimates Cerevel will save at least three years on average by using the suite to discover a new drug. He said it’s difficult to estimate how much money the company will save, but emphasized that the suite cuts down on the resources and manual labor typically required to screen molecules.
“What it means is we can get there faster, get there cheaper and we can get to drugs to patients much more quickly without as many failures,” he told CNBC.
Cerevel has been working with Google for more than a month to further understand the suite and determine how the company will use it. But Renger hopes Cerevel will “be at a place where we get some results” in the next month.
Multiomics Suite
Google Cloud’s second solution, the Multiomics Suite, aims to help researchers tackle another daunting challenge: genomic data analysis.
Colossal Biosciences, a biotechnology company that aims to use DNA and genetic engineering to reverse extinction, has been using the Multiomics Suite in its research.
As a startup, Colossal did not have the internal infrastructure necessary to organize or decipher massive quantities of genomic data. One human genome sequence alone requires more than 200 gigabytes of storage, and researchers believe that they will need 40 exabytes to store the world’s genomic data by 2025, according to the National Human Genome Research Institute.
The institute estimates that five exabytes could store every word ever spoken by humans, so building the technology to support genomic data analysis is not a small task.
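To put those storage figures in perspective, here is a rough back-of-the-envelope calculation in Python. It is only a sketch: it assumes decimal units (1 gigabyte = 10^9 bytes, 1 exabyte = 10^18 bytes) and treats the "more than 200 gigabytes" per genome as exactly 200, so the genome count is an upper bound. The variable names are illustrative and do not come from the institute.

```python
# Back-of-the-envelope check of the storage figures quoted in the article.
# Assumption: decimal (SI) units. The per-genome size is a floor
# ("more than 200 gigabytes"), so the genome count below is an upper bound.
GB = 10**9   # bytes in a gigabyte
EB = 10**18  # bytes in an exabyte

genome_bytes = 200 * GB        # one human genome sequence
projected_bytes = 40 * EB      # projected need for the world's genomic data
spoken_words_bytes = 5 * EB    # estimate for every word ever spoken

# How many 200 GB genome sequences fit in 40 exabytes?
genomes_that_fit = projected_bytes // genome_bytes

# How does the projection compare with "every word ever spoken"?
multiple_of_speech = projected_bytes / spoken_words_bytes

print(f"At most {genomes_that_fit:,} genome sequences fit in 40 exabytes")
print(f"That is {multiple_of_speech:.0f} times the 'every word ever spoken' estimate")
```

Under those assumptions, 40 exabytes would hold at most about 200 million genome sequences, and it is eight times the five-exabyte estimate for all human speech.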
As such, the Multiomics Suite aims to provide companies like Colossal with the infrastructure they need to make sense of large amounts of data so they can spend more time focusing on new scientific discoveries.
“If we had to do everything from scratch, I mean, that’s the power of Google Cloud, right?” Colossal’s vice president of strategy and computational sciences, Alexander Titus, told CNBC in an interview. “We don’t have to build that from scratch, so that definitely saves us time and money.”
Researchers’ ability to sequence DNA has historically outpaced their ability to decipher and analyze it. But as technology has improved in recent years, genomic data has unlocked new insights into areas like the genetic variations associated with disease.
Google Cloud’s Maniar said genomic data could ultimately aid in the development of more personalized drugs and treatments. In 2021 alone, two-thirds of drugs approved by the Food and Drug Administration were supported by human genetics research, according to a paper published in the journal “Nature.”
Maniar believes the Multiomics Suite will help encourage further innovation.
Ben Lamm, CEO of Colossal, said the Multiomics Suite is the reason the company has been able to carry out research on “any reasonable timeline.” Colossal started piloting Google’s technology late last year, and as a result, Lamm said the company is on target to produce a woolly mammoth by 2028.
Without the Multiomics Suite, Lamm said he thinks the company would have been set back by over a decade.
“We would not be anywhere near where we are today,” he said.
Prior to using Google Cloud’s suite, much of Colossal’s data management was done manually using spreadsheets, Lamm said.
He said it would have been a “massive burden” on the company to try to build the more complex tools it needed for research.
“We’re no longer in small data when it comes to biology,” said Colossal’s Titus. “We’re thinking on the scale of how do we get insights into 10,000, 20,000, 10 million years of evolutionary history? And those questions just aren’t answered without scalable computing infrastructure and tools like cloud computing and Multiomics.”
Correction: Scientists can take around 12 months just to identify a biological target, according to a widely followed guidance manual for drugmakers posted in a database run by the federal National Library of Medicine. An earlier version misstated the attribution.