New MIT Computer Model Helps Identify Mutations That Drive Cancer

An MIT-led team has built a new system that rapidly scans the genome of cancer cells and could help researchers find targets for new drugs. Credit: Dylan Burnette and Jennifer Lippincott-Schwartz, National Institutes of Health, edited by MIT News

The system rapidly scans the genome of cancer cells and could help scientists find targets for new drugs.

Cancer cells can have thousands of DNA mutations. However, only a small number of those actually drive the progression of cancer; the rest are just along for the ride.

Researchers could identify better drug targets if they are able to distinguish these harmful driver mutations from the neutral passengers. To boost those efforts, an

Maxwell Sherman, MIT graduate student Adam Yaari, and former MIT research assistant Oliver Priebe are the lead authors of the study, which was published recently in Nature Biotechnology. Bonnie Berger, the Simons Professor of Mathematics at MIT and head of the Computation and Biology group at the Computer Science and Artificial Intelligence Laboratory (CSAIL), is a senior author of the study, along with Po-Ru Loh, an assistant professor at Harvard Medical School and associate member of the Broad Institute of MIT and Harvard. Felix Dietlein, an associate professor at Harvard Medical School and Boston Children’s Hospital, is also an author of the paper.

A new tool

Since the human genome was sequenced two decades ago, scientists have been scouring the genome to try to find mutations that contribute to cancer by causing cells to grow uncontrollably or evade the immune system. This has successfully yielded targets such as epidermal growth factor receptor (EGFR), which is commonly mutated in lung tumors, and BRAF, a common driver of melanoma. Both of these mutations can now be targeted by specific drugs.

While those targets have proven useful, protein-coding genes make up only about 2% of the genome. The other 98% also contains mutations that can occur in cancer cells, but it has been much more difficult to figure out if any of those mutations contribute to cancer development.

“There has really been a lack of computational tools that allow us to search for these driver mutations outside of protein-coding regions,” Berger says. “That’s what we were trying to do here: design a computational method to let us look at not only the 2% of the genome that codes for proteins, but 100% of it.”

To do that, the researchers trained a type of computational model called a deep neural network to search cancer genomes for mutations that occur more frequently than expected. As a first step, they trained the model on genomic data from 37 different types of cancer, which allowed the model to determine the background mutation rates for each of those types.

“The really nice thing about our model is that you train it once for a given cancer type, and it learns the mutation rate everywhere across the genome simultaneously for that particular type of cancer,” Sherman says. “Then you can query the mutations that you see in a patient cohort against the number of mutations you should expect to see.”

The data used to train the models came from the Roadmap Epigenomics Project and an international collection of data called the Pan-Cancer Analysis of Whole Genomes (PCAWG). The model’s analysis of this data gave the researchers a map of the expected passenger mutation rate across the genome, such that the expected rate in any set of regions (down to the single base pair) can be compared to the observed mutation count anywhere across the genome.
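To make the idea of comparing observed mutation counts with an expected background concrete, here is a minimal sketch in Python. It is not the researchers' method: it assumes a simple Poisson null model, and the region names, counts, and the 0.001 cutoff are all hypothetical values chosen for illustration.

```python
import math


def poisson_upper_tail(observed: int, expected: float) -> float:
    """Return P(X >= observed) for X ~ Poisson(expected).

    Assumption: passenger mutations in a region follow a Poisson
    distribution. This is a simplification for illustration only.
    """
    if observed <= 0:
        return 1.0
    # Accumulate P(X = k) for k = 0 .. observed - 1, then take the complement.
    term = math.exp(-expected)  # P(X = 0)
    cdf = term
    for k in range(1, observed):
        term *= expected / k  # P(X = k) from P(X = k - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


# Hypothetical regions: (name, expected passenger count, observed count).
regions = [
    ("region_A", 2.0, 3),
    ("region_B", 2.0, 12),
]

CUTOFF = 1e-3  # arbitrary illustrative threshold, not taken from the study

for name, expected, observed in regions:
    p = poisson_upper_tail(observed, expected)
    label = "excess over background" if p < CUTOFF else "consistent with background"
    print(f"{name}: expected={expected}, observed={observed}, p={p:.2e} ({label})")
```

In this toy example, a region with three mutations where two are expected looks like ordinary background, while a region with twelve stands out. A real genome-wide scan would also need to correct for the very large number of regions being tested.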

Changing the landscape

Using this model, the MIT scientists were able to add to the known landscape of mutations that can drive cancer. Currently, when cancer patients’ tumors are screened for cancer-causing mutations, a known driver will turn up about two-thirds of the time. The new results of the MIT study offer possible driver mutations for an additional 5 to 10 percent of the pool of patients.

One type of noncoding mutation the researchers focused on is called “cryptic splice mutations.” Most genes consist of sequences of exons, which encode protein-building instructions, and introns, which are spacer elements that usually get trimmed out of messenger

Another region where the researchers found a high concentration of noncoding driver mutations is in the untranslated regions of some tumor suppressor genes. The tumor suppressor gene TP53, which is defective in many types of cancer, was already known to accumulate many deletions in these sequences, known as 5’ untranslated regions. The MIT team found the same pattern in a tumor suppressor called ELF3.

The scientists also used their model to investigate whether common mutations that were already known might also be driving different types of cancers. As one example, the researchers found that BRAF, previously linked to melanoma, also contributes to cancer progression in smaller percentages of other types of cancers, including pancreatic, liver, and gastroesophageal.

“That says that there’s actually a lot of overlap between the landscape of common drivers and the landscape of rare drivers. That provides opportunity for therapeutic repurposing,” Sherman says. “These results could help guide the clinical trials that we should be setting up to expand these drugs from just being approved in one cancer, to being approved in many cancers and being able to help more patients.”

Reference: “Genome-wide mapping of somatic mutation rates uncovers drivers of cancer” by Maxwell A. Sherman, Adam U. Yaari, Oliver Priebe, Felix Dietlein, Po-Ru Loh and Bonnie Berger, 20 June 2022, Nature Biotechnology.
DOI: 10.1038/s41587-022-01353-8

The research was funded, in part, by the National Institutes of Health and the National Cancer Institute.
