AI system outperforms pathologists in identifying prostate cancer aggressiveness
The use of artificial intelligence in cancer care holds promise in its ability to improve cancer diagnostics and optimize the oncology workforce.
Researchers at Radboud University Medical Center demonstrated the potential of AI by developing a deep learning system that performed better than many pathologists in determining the aggressiveness of prostate cancer, according to study results published in The Lancet Oncology.
The AI system “taught itself” to detect prostate cancer using data from more than 1,200 patients. It examined biopsies in a manner similar to that of a pathologist and graded them according to the Gleason grading standard.
Results of the study showed high agreement between the AI system and an expert reference standard (quadratic Cohen’s kappa = 0.91; 95% CI, 0.89-0.94). The system also performed well on measures relevant to clinical decision-making. These included distinguishing benign from malignant biopsies (area under the curve [AUC] = 0.99; 95% CI, 0.98-0.99), identifying Gleason grade group 2 or higher (AUC = 0.97; 95% CI, 0.96-0.98) and identifying Gleason grade group 3 or higher (AUC = 0.97; 95% CI, 0.96-0.98).
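For readers unfamiliar with the agreement statistic reported above, the following is a minimal sketch of how a quadratic weighted Cohen's kappa can be computed for two sets of ordinal grades. It is not the study's code. The function name, the 0-to-5 encoding (0 for benign, 1 to 5 for grade groups) and the example inputs are assumptions made for illustration.

```python
def quadratic_weighted_kappa(rater_a, rater_b, num_categories=6):
    """Quadratic weighted Cohen's kappa for two lists of ordinal labels.

    Labels are assumed (for this sketch) to be integers in
    range(num_categories), e.g. 0 = benign, 1-5 = grade groups.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(rater_a)
    k = num_categories

    # Observed confusion matrix: observed[i][j] counts cases that
    # rater A labeled i and rater B labeled j.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1

    # Marginal label counts for each rater.
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            # Quadratic penalty: disagreements further apart cost more.
            weight = (i - j) ** 2 / (k - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count if the two raters were independent.
            weighted_expected += weight * hist_a[i] * hist_b[j] / n

    if weighted_expected == 0:
        # Both raters used one identical category for every case; kappa is
        # undefined here, and this sketch returns 1.0 by convention.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate systematic disagreement. Because of the quadratic weights, confusing grade group 1 with grade group 5 is penalized far more heavily than confusing grade group 1 with grade group 2.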
“Although our system showed pathologist-level performance, we think there is room for improvement,” researcher Wouter Bulten, a PhD candidate in the computational pathology group of the department of pathology at Radboud University, said in an interview with Healio. “Moreover, our system was developed using data from a single center. We will organize a competition where AI researchers can build upon our work.”
Bulten spoke with Healio about the development of the system, its capabilities and the goal of the upcoming competition.
Question: Can you describe the AI system you developed for prostate cancer?
Answer: We developed an AI system that can automatically grade prostate biopsies using the Gleason grading standard. After the tissue specimen is stained with hematoxylin and eosin, it gets scanned and can then be presented to the algorithm. Each biopsy reviewed by the system is assigned a biopsy-level grade group. Additionally, the AI system marks every gland that it detects with the labels ‘benign,’ ‘Gleason 3,’ ‘Gleason 4’ or ‘Gleason 5.’ This glandular output allows for a detailed inspection of the AI’s prediction.
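As background on the grading standard mentioned above, the sketch below shows the standard mapping from a primary and secondary Gleason pattern to a grade group of 1 to 5. How the AI system derives the primary and secondary patterns from its gland-level labels is not described in the interview, so that step is left out. The function name is hypothetical.

```python
def grade_group(primary, secondary):
    """Map primary and secondary Gleason patterns (3, 4 or 5) to a grade group."""
    if primary not in (3, 4, 5) or secondary not in (3, 4, 5):
        raise ValueError("Gleason patterns must be 3, 4 or 5")

    score = primary + secondary
    if score == 6:
        return 1  # 3+3
    if score == 7:
        # 3+4 and 4+3 share a score of 7 but fall in different grade groups.
        return 2 if primary == 3 else 3
    if score == 8:
        return 4  # 4+4, 3+5 or 5+3
    return 5  # 4+5, 5+4 or 5+5
```

For example, `grade_group(3, 4)` gives grade group 2, whereas `grade_group(4, 3)` gives grade group 3, which is why the order of the two patterns matters.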
Q: How did you develop the system?
A: To develop the system, we collected almost 6,000 biopsies from more than 1,200 men. We retrieved the original diagnosis from the pathology report and showed this, along with the images, to the system. The system, using a technique called deep learning, learned on its own to detect and grade prostate cancer.
Q: How did the AI system perform compared with trained pathologists?
A: We presented a set of 100 biopsies to a panel of 15 pathologists and residents from different labs and countries. This same set was presented to the AI system. When we compared the predictions of the panel and AI system with the reference standard, we found that the system performed better than 10 of the panel members. On a group level, the system performed equally to the pathologists who had more than 15 years of experience. Additionally, we investigated whether the AI system would be able to group patients in relevant risk categories. There, we found that the system achieved a pathologist-level performance.
Q: How will this technology change the detection and treatment of prostate cancer?
A: Systems such as ours can be used in different ways. First, such a system can be used to screen biopsies and to filter out the easy (benign) cases. This could reduce the workload for pathologists. Second, the system can be used as a second opinion after the pathologist’s initial read. The system can flag a case if its opinion differs from that of the pathologist. It also can give feedback during the first read, showing the pathologist where to look. In this case, the pathologist needs only to confirm the opinion of the AI system.
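The screening and second-opinion uses described above could be sketched roughly as follows. This is an illustrative simplification, not part of the published system. The function name, the returned labels and the encoding (0 for benign, 1 to 5 for grade groups) are all assumptions, and a real deployment would likely also account for the system's confidence.

```python
from typing import Optional


def review_action(ai_grade: int, pathologist_grade: Optional[int] = None) -> str:
    """Suggest a workflow step from the AI grade and an optional human grade.

    Grades are assumed to be 0 (benign) or 1-5 (grade groups).
    """
    if pathologist_grade is None:
        # Screening mode: no human read yet, so separate likely benign
        # cases from those that need a pathologist's attention.
        if ai_grade == 0:
            return "screened_benign"
        return "needs_pathologist_read"

    # Second-opinion mode: flag the case only when the two grades differ.
    if ai_grade != pathologist_grade:
        return "flag_for_review"
    return "concordant"
```

In this sketch, a biopsy graded as grade group 3 by the system and grade group 2 by the pathologist would be returned as `"flag_for_review"`.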
Q: What will your upcoming competition entail?
A: The main task will be to build an AI system that can perform Gleason grading. All participants will be evaluated on a multicenter test set. With this competition, we aim to get more insight into the best approaches for developing these AI systems. As part of the competition, we will make our data public for research purposes. Additionally, we have made our algorithm available on our website, www.computationalpathologygroup.eu/software/automated-gleason-grading/.
Q: Will deep learning algorithms and AI make human pathologists obsolete?
A: We see our system as an additional tool that the pathologist can use. Although our system performs very well, it still makes mistakes. These mistakes are often different from those a human would make. We believe that when you merge the expertise of the pathologist with the second opinion of an AI system, you get the best of both worlds.
By Jennifer Byrne
For more information:
Wouter Bulten can be reached at Radboud University Medical Center, P.O. Box 9102, 6500 HB Nijmegen, the Netherlands; email: firstname.lastname@example.org.
Disclosures: The study was funded by the Dutch Cancer Foundation. Bulten reports grants from the Dutch Cancer Society. Please see the study for all other authors’ relevant financial disclosures.