Many primary breast cancer genomes have been sequenced, including those that are part of the National Cancer Institute’s The Cancer Genome Atlas (TCGA). Sequencing individuals’ cancer genomes can help identify a particular cancer subtype and potentially important mutations that help a tumor cell survive, divide quickly, and evolve. Most of the known cancer-driving mutations are within the protein-coding regions of genes. Much less is known about cancer-driving mutations that are found outside of protein-coding sequences in the genome. A recent analysis of 360 primary breast cancer genomes published in Nature has identified some potentially important cancer-driving mutations found within noncoding genomic regions. Today we are speaking with one of the authors of this analysis, Gad Getz, PhD, of the Broad Institute and the Massachusetts General Hospital in Boston.
-Interviewed by Anna Azvolinsky
OncoTherapy Network: First, can you distinguish between potential cancer-driving mutations within protein-coding DNA regions vs noncoding regions? Are driver mutations in these noncoding regions more difficult to identify?
Dr. Getz: Yes, we can distinguish driver mutations, which are those that cause or promote cancer, from “passengers”. Passengers are the majority of mutations, which don’t really contribute to cancer. It is indeed easier to do that in coding regions because there we have help from what are called “silent” mutations, those mutations that don’t change the protein sequence and that can give us a sense of the local background mutation rate in that gene or region of the genome. Coding regions make this easier because we can compare the number of non-silent mutations with the number of silent mutations. This is a good way to distinguish the drivers from the passengers because genes that have drivers carry more non-silent mutations than the silent mutations would lead you to expect. In noncoding regions it is more complicated because we don’t have silent and non-silent mutations, and we don’t really know which mutations would have a large effect or a small effect, so we need to estimate the local background rate based on the neighboring mutations. Nevertheless, we can estimate the background rate and build a statistical model for every part of the genome, including how many mutations we expect to see across the cohort. If we see more than expected in a statistically significant manner, that suggests that there is potentially a driver mutation rather than a passenger.
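The idea of testing an observed mutation count against a background expectation can be illustrated with a minimal sketch. This is not the statistical model used in the study; it assumes a simple Poisson background, and the function names and all numbers in the usage note (rate, region length, counts) are hypothetical.

```python
import math

def poisson_sf(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)   # P(X = 0)
    cdf = term
    for i in range(1, k):   # accumulate P(X <= k - 1)
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)

def excess_mutation_pvalue(observed, background_rate, region_length, n_tumors):
    """Compare an observed mutation count with a background expectation.

    background_rate is an assumed per-base, per-tumor mutation rate
    (hypothetical; in practice it would be estimated from neighboring
    mutations). Returns (expected_count, p_value).
    """
    expected = background_rate * region_length * n_tumors
    return expected, poisson_sf(observed, expected)
```

For example, `excess_mutation_pvalue(6, 2e-6, 650, 360)` asks how surprising 6 mutations would be in a 650-base region across 360 tumors if the assumed background rate were 2 per million bases per tumor. A small p-value flags the region as a candidate for follow-up; it does not by itself show that the mutations are drivers.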
OncoTherapy Network: Can you talk about the design of the recent genomics study you conducted with colleagues?
Dr. Getz: We wanted to study breast cancer and to look for mutations outside of the coding regions; this came soon after the discovery of the TERT [telomerase reverse transcriptase] promoter mutation. The TERT promoter is mutated at high frequencies in several tumor types, and those mutations, again, are in a noncoding region. The gene TERT is responsible for maintaining the lengths of the telomeres of cells. In order to perform this breast cancer study, we designed a capture reagent that could capture DNA not just from the genes, but also from regulatory regions. Then we can take the captured DNA and sequence it; this provides us with sequencing information not only on the coding regions but also on the regulatory regions. We focused the analysis on the promoter regions, which we defined as the region around the transcription start site, from 400 bases before it to 250 bases after it. These are called the “promoters”, and many transcription factors bind in that region and control the expression of a gene. Therefore, mutations in those regions could potentially affect the expression of the nearby genes, and if those genes are important to drive cancer, those mutational changes could promote the cancer.
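The promoter definition above (400 bases before the transcription start site to 250 bases after) can be written down as a small sketch. It assumes that "before" means upstream relative to the direction of transcription, so the window flips on the minus strand, and that coordinates are inclusive; the function names are hypothetical and not taken from the study.

```python
def promoter_window(tss, strand, upstream=400, downstream=250):
    """Return inclusive (start, end) genomic coordinates of a promoter.

    Assumes 'upstream' is measured against the direction of transcription,
    so the window is mirrored for genes on the minus strand.
    """
    if strand == "+":
        return tss - upstream, tss + downstream
    if strand == "-":
        return tss - downstream, tss + upstream
    raise ValueError("strand must be '+' or '-'")

def in_promoter(position, tss, strand):
    """True if a mutation position falls inside the promoter window."""
    start, end = promoter_window(tss, strand)
    return start <= position <= end
```

With a hypothetical start site at position 10000, a plus-strand gene gets the window 9600 to 10250 and a minus-strand gene gets 9750 to 10400.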
OncoTherapy Network: What did this large sequencing effort find? Were there any major surprises for you in the results of the analyses?
Dr. Getz: After we developed this statistical model for estimating the background rate (ie, how many mutations you would expect when looking at 360 breast cancers), we looked at 25,000 promoters of 25,000 genes, and in the vast majority of them, the number of mutations was what we expected based on the background rate [of mutations]. Only 9 genes had more [mutations] than we would expect by chance, and in particular, one of those genes, FOXA1, caught our eye. FOXA1 is a known driver in breast cancer, and is also known to be mutated within the coding region of the gene and amplified in some breast cancers. FOXA1 has a special function in that it helps recruit estrogen receptor, which we know is very important in breast cancer because it binds to the genes that it regulates. So the fact that we found hotspots of mutations in the promoter of FOXA1 suggested to us that these are truly functional mutations. We then performed functional experiments beyond the statistical analysis of the [sequencing] data. We wanted to demonstrate that if you have this mutation in front of a reporter gene, you can increase the expression of that gene. Based on analysis of the sequence around [the gene], we predicted that this would be an E2F [transcription factor] binding site. Then, with functional experiments, we showed that it is indeed likely an E2F binding site and that the mutation increases the binding of E2F proteins to the promoter of FOXA1. That is predicted to increase the expression of FOXA1, which would then increase the activity of the estrogen receptor. So, even if some cancers or tumors are growing in low-estrogen conditions, either as a result of treatment or just low estrogen, the increased estrogen receptor activity would signal the cells to grow faster even at lower [estrogen] concentrations. FOXA1 promoter mutations are, we believe, truly driving the breast cancer.
There were other genes that we found, and we also did experiments showing that some of those mutations seem to be functional, but those genes have a less clear function in breast cancer than FOXA1. Additional experiments are needed to understand how those other genes are performing their function [in potentially aiding tumor growth].
OncoTherapy Network: Can we generalize the results of this breast cancer genome sequencing analysis to the patterns of mutations in cancer genomes in general?
Dr. Getz: Yes. First, this reagent for pulling down or capturing the promoters of genes is a good way to study the promoters of genes in all cancers. It is not necessarily just for breast cancer. The benefit of the pull-down compared with whole-genome sequencing is that typically you can get to a much higher sequencing depth and therefore also handle impure tumors or regions that are difficult to sequence. The other way of analyzing promoters or noncoding regions is by studying whole-genome sequences. There are several large-scale studies going on to look at many whole genomes, which my lab is also involved in. We are also part of efforts together with the TCGA and the International Cancer Genome Consortium to collect many hundreds of whole genomes and analyze them. So these kinds of approaches that we applied, both the analytical and follow-up approaches, are applicable to other cancer types.
OncoTherapy Network: Have there been similar studies of the noncoding regions of cancer genomes? What is the picture that is emerging of the cancer genome as a whole?
Dr. Getz: As we’ve seen in our breast cancer study, the frequencies of noncoding mutations vary between different genes, and the driver mutations that we found in FOXA1 are not present in a large proportion of patients. That means that we will need many patients to find these mutations. In our study of 360 breast cancer patients, we did power calculations suggesting how many genomes we will need to find mutations at different frequencies, and that approach is generalizable. We will need large cohorts to really understand what the drivers are in the noncoding regions: hundreds if not thousands, in some cases, of cancer genomes per tumor type. The community is not there yet because we don't have many hundreds for all tumor types; however, the goal is to reach those numbers and therefore have a more comprehensive view of the cancer genome.
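A power calculation of the kind described here can be sketched as follows. This is not the calculation from the study: it assumes a simple binomial model in which each patient independently carries a mutation in the region, and the function names, the background probability, and the significance threshold used in the usage note are all hypothetical.

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), computed in log space for stability."""
    if p <= 0.0:
        return 1.0 if k == 0 else 0.0
    if p >= 1.0:
        return 1.0 if k == n else 0.0
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)

def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return min(1.0, sum(binom_pmf(i, n, p) for i in range(k, n + 1)))

def detection_power(n_patients, driver_freq, background_prob, alpha):
    """Probability of calling a region significant in a cohort.

    background_prob: assumed chance that a patient carries a passenger
    mutation in the region; driver_freq: assumed fraction of patients
    carrying a driver mutation there; alpha: significance threshold.
    """
    # Smallest mutated-patient count that is significant under background alone.
    k_crit = next(k for k in range(n_patients + 2)
                  if binom_sf(k, n_patients, background_prob) <= alpha)
    # Chance that a patient is mutated when drivers are also present.
    p_mutated = 1.0 - (1.0 - background_prob) * (1.0 - driver_freq)
    return binom_sf(k_crit, n_patients, p_mutated)
```

For instance, comparing `detection_power(360, 0.02, 0.001, 2e-6)` with `detection_power(1000, 0.02, 0.001, 2e-6)` shows how the chance of detecting a driver present in 2% of patients grows with cohort size. The threshold 2e-6 here stands for 0.05 divided by 25,000 promoters, which is an assumed Bonferroni-style correction and not necessarily the one the authors used.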
OncoTherapy Network: Thank you so much for joining us today, Professor Getz.
Dr. Getz: Thank you!