Algorithm identifies risk-stratifying glioblastoma tumor cells
An unsupervised, automated machine learning algorithm successfully identified glioblastoma tumor cells and stratified survival outcomes, according to study results published in eLife.
“A goal of cancer research is to reveal cell subsets linked to continuous clinical outcomes to generate new therapeutic and biomarker hypotheses,” Rebecca Ihrie, PhD, and Jonathan Irish, PhD, associate professors in the department of cell and developmental biology at Vanderbilt University, and colleagues wrote. “We introduce a machine learning algorithm, Risk Assessment Population IDentification (RAPID), that is unsupervised and automated, identifies phenotypically distinct cell populations, and determines whether these populations stratify patient survival.”
Ihrie and Irish told Healio what prompted this research, the implications of the findings and what future research should entail.
Question: What prompted this research?
Ihrie: Cancers are now being studied using single-cell approaches, through which we can learn about the presence and abundance of different subsets of cells within the sample. This project aimed to identify tumor cell subsets that are associated with poor outcomes. Over the last 5 years, our team has built specific expertise for this project, including:
- cryopreserving cells from brain tumor resections;
- measuring phosphorylated signaling molecules in individual cells; and
- machine learning analysis of the associated data — in our case, approximately 40 readouts for each of more than 2 million cells.
We chose to study glioblastoma because of the importance of cell signaling to the disease and the great need for new treatments. Despite many years of research, it has been extremely challenging to find biological features that are correlated with large differences in patient survival, or that help researchers identify new avenues for treatment. Ultimately, we aim to address these gaps and to develop precision medicine strategies for glioblastoma based on cell signaling biology. Our study differs from others in the field because we chose to measure features at the protein level, rather than at the DNA or RNA level. This meant we could identify cells based on features like post-translational modifications of these proteins, which are important to their function.
Q: What is unique about this algorithm?
Ihrie: Other algorithms usually do one of two things — identify subgroups of cells that are similar to known normal types or divide patients into “good” and “bad” outcomes and look for features that are more abundant in one group vs. the other. We designed RAPID to take the user all the way from the start of analysis (unprocessed single-cell data on cohorts of about 25 patients) to the finish (features that identify especially good or bad cells). RAPID is unique because it does not require prior knowledge about expected cell types or classification of patients in advance — instead, identification of biologically similar cell clusters across patients and testing of whether those clusters predict outcomes is done in an automated fashion. In other words, RAPID is fully unsupervised and uses statistical rules to reveal cells and determine their identity and significance. RAPID also creates human- and computer-readable descriptions of cell populations that can be used to design simpler tests, such as immunohistochemical stains, which are used more regularly in clinical practice and can be applied to large patient cohorts.
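The two automated steps described above (measuring how abundant each cell cluster is in each patient, then checking whether that abundance separates patients by survival) can be illustrated with a small Python sketch. This is not the RAPID implementation: the article does not state which clustering method or statistical rules RAPID uses, so the sketch assumes cells have already been assigned cluster labels and substitutes a simple median split on abundance. The function names, patient IDs, cluster labels and survival values are all hypothetical, and the comparison ignores censoring, which a real survival analysis would have to handle.

```python
from collections import Counter
from statistics import median


def cluster_abundance(cell_labels):
    """Return, per patient, the fraction of that patient's cells in each cluster.

    cell_labels maps a patient ID to a list with one cluster label per cell.
    """
    abundance = {}
    for patient, labels in cell_labels.items():
        counts = Counter(labels)
        total = len(labels)
        abundance[patient] = {c: n / total for c, n in counts.items()}
    return abundance


def stratify_by_cluster(abundance, survival_days, cluster):
    """Split patients at the median abundance of one cluster and compare
    median survival in the two groups (illustrative stand-in only)."""
    fractions = {p: abundance[p].get(cluster, 0.0) for p in abundance}
    cutoff = median(fractions.values())
    high = [survival_days[p] for p, f in fractions.items() if f > cutoff]
    low = [survival_days[p] for p, f in fractions.items() if f <= cutoff]
    return {
        "cutoff": cutoff,
        "median_survival_high": median(high) if high else None,
        "median_survival_low": median(low) if low else None,
    }


# Hypothetical toy data: four patients, four cells each, two clusters.
cells = {
    "P1": ["A", "A", "A", "B"],
    "P2": ["A", "B", "B", "B"],
    "P3": ["A", "A", "B", "B"],
    "P4": ["B", "B", "B", "B"],
}
survival = {"P1": 200, "P2": 600, "P3": 300, "P4": 700}

abundance = cluster_abundance(cells)
result = stratify_by_cluster(abundance, survival, "A")
print(result)
```

In this toy data, patients with more cluster "A" cells have shorter survival, so "A" would be flagged as a candidate negative-prognostic population. Repeating the check for every cluster is what makes the search automated rather than dependent on prior knowledge of cell types.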
Q: What did you find?
Irish: RAPID identified tumor cells whose abundance independently and continuously stratified patient survival in a pilot mass cytometry data set of 2 million cells from 28 glioblastomas. We confirmed the findings from the pilot cohort using an orthogonal platform for biological validation (immunohistochemistry) and a larger cohort of 73 patients with glioblastoma. We also validated RAPID by showing that it finds known risk-stratifying cells and features in published data from blood cancer.
Q: What are the clinical implications of your findings?
Irish: In glioblastoma, our findings suggest that patients whose tumors have a high fraction of the positive or negative phenotypes we identified may respond differently to investigational treatments. Patients whose tumors primarily have “positive-prognostic” cells also have a higher percentage of immune cells within their tumors, suggesting that they might benefit from immunotherapy more than patients with “negative-prognostic” tumors. These cells also were apparent in traditional histology in a larger validation cohort — so institutions that wish to identify patients with abundant negative- or positive-prognostic cells could do so using standard pathology techniques. More broadly, using a data set from another research group, we showed that RAPID can be used in many cancer types to identify cell subsets that correlate with patient survival or disease recurrence.
Q: What is next in terms of research?
Ihrie: The finding that specific combinations of proteins identify aggressive cancer cells raises many ideas that we are excited to explore. For example, is the aggressive phenotype more common in recurrent tumors? Which of the identified features are required for aggressive tumor cell growth or resistance to current therapy? Now that we know these cells exist, they can be studied in mechanistic experiments, developed into models and chemically targeted. We are especially interested in connecting the newly revealed mechanistic biology of the cells to features that can be measured in patients to guide treatment selection or application. In that vein, following publication of our work in eLife, we learned that the positive-prognostic cells identified by the RAPID algorithm are present in tumors that are located in different brain regions detectable by MRI. This result opens up the potential to use high-resolution MRI to translate the cellular findings from the lab into therapeutic use.
Q: Is there anything else that you would like to mention?
Irish: More and more studies are using single-cell methods to study a whole range of human diseases. We hope that RAPID will be used by many of these researchers to discover cell subpopulations that predict disease outcome.
For more information:
Rebecca Ihrie, PhD, can be reached at Vanderbilt University, 761B Preston Research Building, 2220 Pierce Ave., Nashville, TN 37232; email: email@example.com.
Jonathan Irish, PhD, can be reached at Vanderbilt University, 761B Preston Research Building, 2220 Pierce Ave., Nashville, TN 37232; email: firstname.lastname@example.org.