The central focus of this blog, and its companion, the Breast Cancer Cell Line Knowledge Base, is the use of breast cancer cell lines in breast cancer research. The central scientific premise of the blog and Knowledge Base is that breast cancer cell lines, and indeed all carefully developed cell lines from human cancers, are currently underutilized. If breast cancer cell lines that have been developed and maintained with scientific rigor are used properly, they represent an important gateway to the development of truly personalized targeted therapeutics for breast and other cancers.
One of the great advantages of the large panel of breast cancer cell lines that are available to researchers is that virtually all breast cancer cell lines have now been deeply characterized at many levels. As a result of comparative genomic hybridization studies and, more recently, genomic sequencing studies, all cell lines have been characterized for gene copy number changes, including copy number gains and losses, and focal gene amplifications. In addition, all cell lines have been characterized by expression profiling at the mRNA and protein levels using several platforms, so that the expression levels of all genes are known for all cell lines. Exome sequence analysis has been used to identify all the SNPs and point mutations present in every cell line. More recently, the genomic data associated with each cell line have been functionalized as a result of the genome-scale shRNA or CRISPR screens that have been carried out in several labs. These screens have identified the most essential genes in each breast cancer cell line. This knowledge, coupled with the recent extensive analysis of drug sensitivity across hundreds of targeted drugs for all breast cancer cell lines, has yielded unparalleled data on the vulnerabilities of individual breast cancer cell lines. These vulnerabilities, when analyzed in the context of the genomic profiling studies described above, provide a gateway to the development of individualized approaches that can be used to reverse engineer individual cell lines, and these strategies can then be applied to individual breast cancer patients, particularly those with metastatic disease.
If we are to take full advantage of the wealth of descriptive and functional data that have been obtained using breast cancer cell lines, we must change the way we view them. We must stop trying to lump cell lines together into groups, and instead, we must analyze each cell line as a patient. This approach is the key to developing truly personalized therapeutic approaches for cancer therapy, because if we can learn how to effectively reverse engineer dozens to hundreds of individual cell lines, then computational and machine learning methods can be used that allow us to extrapolate those strategies to individual patients that share the same genomic features as the individual cell line. Thus, it is the goal of this blog to first convince the reader that more work, not less, is needed using cancer cell lines derived from patients, as the functional genomic data currently available for these cell lines is the gateway to novel and personalized therapeutic strategies for cancer, particularly metastatic disease. Given that, it is important to start this discussion with a general discussion about the usefulness of breast cancer cell lines, and whether they are good models of human breast cancer as it occurs in patients.
One may wonder why this topic even needs to be discussed, because after all, breast cancer cell lines have been a mainstay of breast cancer research for decades. But even though many thousands of papers have been published describing data derived from the use of breast cancer cell lines (nearly 10,000 using MCF-7 cells alone!), there are some who now feel that cell lines are obsolete because they are poor models of the human disease. As a result, several groups have argued that alternative strategies are now needed to move the field forward, and this has resulted in the promotion of organoid models and PDX models of breast cancer as superior ways to advance the field. And, while I would never argue that we shouldn’t always be trying to find the best tools and strategies to perform breast cancer research, I think it’s important that we not throw the baby out with the bath water and disregard or diminish the importance of cell line-based research based on scanty evidence that the lines don’t adequately model human disease. So, to begin, I want to start by summarizing the evidence in support of the use of breast cancer cell lines as valuable resources and argue that we have not done enough to fully leverage the power of breast cancer cell lines to understand the many types of human breast cancer and to identify their vulnerabilities.
One of the most common, and in my view unfounded, complaints about breast cancer cell lines is that they are unstable and, because of that, do not mimic human disease or simply can’t be trusted to provide reliable evidence. The evidence that is often cited to support this notion is that cell lines such as MCF-7 are said to have different characteristics in different laboratories around the world, suggesting that the inherent instability of cell lines is the cause of this. However, the recent use of STR profiling of cell lines is showing us that much of that “variability” is the result of poor cell culture technique and sloppy sharing of cell lines across laboratories that results in misidentification of cell lines. So, if MCF-7 cells in a laboratory no longer express the estrogen receptor, it is likely that they are actually HeLa cells, or some other common contaminant, rather than bona fide MCF-7 cells.
Over the years, evidence has mounted on many levels that supports the notion that cell lines, when properly cultured and maintained, are remarkably stable models of specific types of human breast cancers. This is particularly true at the level of the driving genomic alterations that are at the heart of the biology of the patient’s disease and determine the characteristics of the cell lines. Several decades ago, the SKBR3 cell line was developed. These cells have a focal, high-level amplification of the 17q12 genomic region in which the HER2 oncogene resides, and they overexpress HER2. These cells have been used by many investigators to study the biology of HER2 as a breast cancer oncogene and to develop the HER2 targeted therapies that are in routine use in the clinic today. In the late 1990s we developed the SUM-190 and SUM-225 cell lines, each of which also has a focal amplification of HER2 with overexpression at the message and protein level, and which are responsive to HER2 targeted drugs. The SUM-225 cell line was derived from a patient who had a modified radical mastectomy for extensive high grade (comedo) ductal carcinoma in situ and then had a chest wall recurrence. We developed the cell line from the chest wall nodule, and when these cells are grown as xenografts in immune-deficient mice, the tumors that develop exhibit the same comedo growth pattern as seen in the original patient specimen and chest wall nodule. Thus, SUM-225 is an excellent model of aggressive DCIS that maintains the genomic feature of the driving oncogene and the same in vivo growth pattern that was observed in the patient. These properties of the SUM-225 cell line were first reported some 20 years ago. More recently, a poster was presented at the 2019 San Antonio Breast Cancer Symposium that showed, among other things, that SUM-225 cells still retain this HER2 dependence and in vivo growth pattern.
The SUM-190 cell line, in addition to having a focal ERBB2 (HER2) amplification, has a point mutation in the PIK3CA oncogene, which results in sensitivity of these cells to the class I-alpha specific drug alpelisib. Interestingly, the PIK3CA mutation also results in reduced sensitivity of these cells to HER2-specific tyrosine kinase inhibitors. As with the SUM-225 cells, these characteristics have not changed over the 20 years that we and others have been working with them. The story of HER2 amplification in several breast cancer cell lines, and of their dependence on HER2 activity for growth and survival, is yet another indication of the remarkable stability of the phenotypes of human breast cancer cell lines.
This cell line stability can be observed at many levels in addition to the driving oncogenes characteristic of each line. First, the cytokeratin profiles of breast cancer cell lines don’t change over time. Indeed, Charles Perou and others have shown that breast cancer cell lines can be assigned to the same cluster groups as primary breast cancers based on unsupervised analysis of their expression profiles. Even the morphology of the cells does not change over time, as can be clearly seen from the very different morphologies of the SUM cells, which have not changed over the 20+ years I have been working with them. Their proteome profiles also are stable and reflect the biology of the disease of the individual from which they came. The SUM-149 and SUM-190 cell lines came from patients with inflammatory breast cancer, and both cell lines overexpress E-cadherin, which is a common characteristic of this type of breast cancer. SUM-44 cells came from a patient with invasive lobular breast cancer, and these cells have the classic inactivating mutation in the E-cadherin gene and overexpress the estrogen receptor gene at the message and protein levels. In SUM-44 cells, expression of the estrogen receptor is essential for their survival.
I could go on and on, but hopefully by now the point is made. Breast cancer cell lines, when properly obtained and properly cultured are highly stable models of the specific disease of the patient from which they came, and they maintain those characteristics over decades. Thus, breast cancer cell lines are people too! For this reason, I have done my best, for each of the SUM lines, to describe on the SLKBase the characteristics of the patients whose specimen gave rise to each of the cell lines, as this information is important to properly interpret results obtained with each of the lines. More recently, I have added individual cell line pages for more than 50 other human breast cancer cell lines.
By now, you may be wondering why I’m making such a big deal about this. The reason is this: if you accept that individual breast cancer cell lines are good and stable models of the disease that was experienced by the patient from which they came, then an enormous amount of omics-based data becomes available that can be used to enhance the meaning of cell line-based research. As a result of work from the Joe Gray lab, from Project Achilles at Dana Farber, from the MD Anderson reverse phase protein array lab, from the Cancer Cell Line Encyclopedia, and from the Genomics of Drug Sensitivity in Cancer database developed by the Wellcome Trust, there is an enormous amount of data for over 70 breast cancer cell lines and for hundreds of other cancer cell lines derived from individual patients with cancer. These data can be used to move the field forward, particularly toward improving our ability to deliver precision therapeutics to patients. And thus, it is an overarching goal of the SLKBase to make these data available to our users and readers in a way that yields deep biological insights into individual breast cancer cell lines as individual patients. Bringing these tools to our users will allow us to obtain a deeper understanding of cell lines, one patient at a time, and then use that knowledge to generate novel strategies to reverse engineer each of these cell lines. So, despite the emerging popularity of organoid models and PDX models of breast cancer, none of these systems come with anywhere near the amount and sophistication of the data we have for cell lines, nor are these systems amenable to the generation of these kinds of data sets. Moreover, they are not flexible enough to allow creative experimentation to be done to test predictions that result from the enormous amount of functional omics data we have for cell lines.
Apparent instability of breast cancer cell lines: In the paragraphs above, I’ve attempted to provide a brief summary of the evidence that points to the inherent stability of human breast cancer cell lines as models of human disease and the impact that this has had on breast cancer research. However, we now know that a case can be made for the instability of breast cancer cell lines. As a result, many researchers have turned away from using these models, and consider this instability to be a significant disadvantage. So, what are we to make of this apparent conflict?
There are two general explanations for the “instability” that has been observed by researchers using breast cancer cell lines. One explanation isn’t really an explanation at all, because the instability that some have observed is more apparent than real and is the result of sloppy cell culture technique that results in contamination of one cell line with another. We’ve all heard the stories about the ubiquity of the HeLa cell line, and many researchers have discovered to their horror that cells they’ve been using for years are not at all what they thought they were. Another, less famous example of cell line contamination invalidating the results of experiments is the development and use of the so-called MCF-7/ADR cells. This cell line was for many years thought to be a drug resistant version of MCF-7 cells. But of course, these cells were not derived from MCF-7 cells at all, nor do they bear any resemblance to MCF-7 cells. The use of experimental approaches to develop drug resistant variants by culturing cells for long periods of time in the presence of a drug and waiting for cells to emerge that can grow in the presence of the drug is ripe for the type of cell line contamination that occurred in the development of MCF-7/ADR. During the selection period, the vast majority of cells in the culture dish are killed by the treatment, and thus, if even a single clonogenic cell that happens to be more resistant to the drug than the original cell line contaminates any of those culture dishes, the cells that emerge will be the progeny of the contaminant, and not the result of selection of a drug resistant variant. So, if care is needed in the routine use of breast cancer cell lines, extreme care is needed if one is to use such a selection approach to develop variants from a parental cell line. STR profiling, therefore, should always be carried out to validate the origin of any drug-resistant variant cell line derived in culture.
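For readers who have not seen how an STR comparison works in practice, here is a minimal sketch in Python. It assumes the widely used Tanabe-style percent match (twice the number of shared alleles, divided by the total number of alleles in the two profiles) and a commonly quoted cutoff of 80% for calling two samples related. The locus names are standard STR markers, but the allele values, the function names, and the `RELATED_CUTOFF` constant are made up for illustration and are not the real profile of any cell line. A real validation should rely on a reference database and an authentication service.

```python
from typing import Dict, Set

# An STR profile maps a locus name to the set of alleles called at that locus.
Profile = Dict[str, Set[str]]

# Assumption: 0.80 is a commonly quoted cutoff for "same origin"; adjust to
# whatever standard your authentication service applies.
RELATED_CUTOFF = 0.80


def tanabe_score(query: Profile, reference: Profile) -> float:
    """Return 2 * shared alleles / (query alleles + reference alleles),
    counted only over loci that were typed in both profiles."""
    shared = 0
    total = 0
    for locus in query.keys() & reference.keys():
        q_alleles = query[locus]
        r_alleles = reference[locus]
        shared += len(q_alleles & r_alleles)
        total += len(q_alleles) + len(r_alleles)
    if total == 0:
        raise ValueError("profiles have no typed loci in common")
    return 2 * shared / total


def looks_related(query: Profile, reference: Profile) -> bool:
    """True when the match score meets the assumed cutoff."""
    return tanabe_score(query, reference) >= RELATED_CUTOFF


# Hypothetical allele calls, for illustration only.
parental = {
    "D5S818": {"11", "12"},
    "D13S317": {"11"},
    "TH01": {"6"},
    "TPOX": {"9", "12"},
}
# A "resistant variant" that is really a contaminant shares few alleles.
suspect_variant = {
    "D5S818": {"11", "12"},
    "D13S317": {"12", "13"},
    "TH01": {"7"},
    "TPOX": {"8", "12"},
}

print(f"parental vs itself:  {tanabe_score(parental, parental):.2f}")
print(f"variant vs parental: {tanabe_score(suspect_variant, parental):.2f}")
print("variant derived from parental?", looks_related(suspect_variant, parental))
```

In this made-up example the supposed variant shares only 3 alleles with the parental line out of 13 alleles counted across both profiles, so its score falls well below the cutoff and it would be flagged as not derived from the parental cells.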
Many years ago, I reviewed a paper for Breast Cancer Research and Treatment in which MCF-7 cells had been obtained from several laboratories around the world and tested in a single lab for characteristics of true MCF-7 cells. As one might have predicted, the characteristics observed in the cells obtained from different labs varied widely across the panel of cell lines. In my assessment of this paper, the question that I set out to answer was how the phenotypes of the cells related to the provenance of each of the cell lines. This was a natural question for me since I did my post-doctoral work at the Michigan Cancer Foundation (MCF!) and worked with Dr. Soule, the developer of that line (he developed MCF-10A too). What I discovered was that any cell line that could be traced directly back to the MCF exhibited the proper morphology, estrogen receptor expression, and other phenotypes of MCF-7 cells. By contrast, the cells that investigators had obtained from other laboratories varied widely in their phenotypes and many no longer expressed the estrogen receptor. The take home lesson from this is clear: never, ever obtain a cell line from anyone who isn’t the original source of the cells, or at the very least, a first-degree connection to that original source. And always STR validate any cell line that comes into your lab. To do otherwise is to put your own research at risk. And yes, this means investigators actually have to purchase the cells to be used in experiments. Isn’t the integrity of your research worth a few hundred dollars? Most labs happily pay hundreds to thousands of dollars to purchase a tube of an antibody that gets used up and has to be purchased again. A cell line purchase is a one-time thing and provides the lab with a stable and reliable resource for decades. So, please, STOP getting cells from a friend or neighboring lab.
This practice is one of the main reasons for the demonstrated lack of rigor and reproducibility that has been much publicized in recent years and has contributed mightily to the mistaken notion that cell lines are inherently unstable. This is an easy and inexpensive problem to solve.
The second explanation for the apparent “instability” of breast cancer cell lines is more interesting and more relevant to a key problem in cancer biology. This problem has to do not with the instability of the cell lines, but with their inherent heterogeneity, which is present in all human cancers. The notion of cellular heterogeneity of breast cancer specimens and cell lines was pioneered by Dr. Gloria Heppner, also of the Michigan Cancer Foundation. She and her colleagues at the MCF were the first to show that breast cancer specimens and cell lines are composed of sub-populations of cells that vary in their growth potential, their tumorigenic potential, and their metastatic potential. This idea has stood the test of time, and the study of heterogeneity of cancer is now one of the hottest topics in the field. So, in this regard, once again cell lines accurately mimic the real biology of breast cancer. As such, they can be used to identify, clone or isolate different sub-populations of cells within the cell line in order to understand the contribution that each sub-population makes to the overall malignant potential of the parental cell line. This is a feature and not a bug of breast cancer cell lines and allows for the study of the role of cancer cell heterogeneity in the biology of the disease. There are now several excellent papers in the literature in which sub-populations were cloned from cancer cell lines and their individual contributions to the overall biology and malignant potential of the parent cell line were demonstrated. Indeed, the study of breast cancer stem cells is founded on the demonstrated ability to sort the stem cell population from a parental cell line using cell surface or biochemical markers of stem cells. Those isolated stem cells have been shown to regenerate the heterogeneity observed in the parental cell line.
Thus, the so-called instability that some researchers worry about reflects the natural heterogeneity of breast cancers and breast cancer cell lines, and cell lines provide a powerful tool for the study of cellular heterogeneity in cancer. Of course, this complexity makes it even more critical that researchers work with cell lines with great care and rigor, both to avoid unwanted selection of sub-populations of cells and to avoid mistaking heterogeneity for instability.
In summary, I hope that I’ve convinced you that breast cancer cell lines are people too! As such, they continue to be relevant and powerful sources of knowledge on the biology of breast cancer. The recent elucidation of functional genomic data sets only deepens their importance and relevance to the human disease. In subsequent blog entries, I will show how a functional genomic analysis of a large panel of breast cancer cell lines can lay the groundwork for the next generation of targeted therapeutic strategies for breast cancer, particularly for patients with metastatic disease. Since overt metastatic breast cancer remains incurable, much work still needs to be done to improve survival of patients with metastatic breast cancer.