Researchers Will Train AI to Diagnose Prostate Cancer
Himanshu Arora, Ph.D., oncology and urology researcher at Sylvester Comprehensive Cancer Center and Desai Sethi Urology Institute, both part of the University of Miami Miller School of Medicine, has received a $78,000 grant from the Scott R. MacKenzie Foundation to improve artificial intelligence-based diagnostic tools for prostate cancer.
The grant has an optional second year of funding, which would bring the total to $180,000.
Dr. Arora, an assistant professor at the Miller School, and his team will study a generative adversarial network (GAN), a type of artificial intelligence (AI) model that pits two neural networks against each other to generate synthetic data. The group will use that data to train machine-learning algorithms to better identify and grade prostate tumors.
“GAN technology is commonly used in video games, where there is a main player and an opponent,” said Dr. Arora. “As the main player develops new skills, the opponent adapts, learning the main player’s strengths and developing its own more advanced traits. For this work, we will be using clinical data to educate our own generative model. Validated synthetic data will train the AI models to make them more effective.”
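The adversarial setup described above can be illustrated with a deliberately tiny sketch. It is not the Sylvester team's model: it assumes a one-dimensional toy problem in which the "real" data are numbers drawn from a bell curve, the generator is a single linear function and the discriminator is a single logistic unit. The function names (`train_toy_gan`, `sample_synthetic`) and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def train_toy_gan(steps=2000, batch=32, lr=0.02,
                  real_mean=4.0, real_std=1.0, seed=0):
    """Alternate discriminator and generator updates on toy 1-D data."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator: G(z) = a * z + b
    w, c = 0.0, 0.0  # discriminator: D(x) = sigmoid(w * x + c)
    for _ in range(steps):
        # Stand-in for real data (assumption: a simple Gaussian).
        real = [rng.gauss(real_mean, real_std) for _ in range(batch)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [a * z + b for z in noise]

        # Discriminator step: minimize -log D(real) - log(1 - D(fake)).
        grad_w = grad_c = 0.0
        for xr, xf in zip(real, fake):
            d_real = sigmoid(w * xr + c)
            d_fake = sigmoid(w * xf + c)
            grad_w += -(1.0 - d_real) * xr + d_fake * xf
            grad_c += -(1.0 - d_real) + d_fake
        w -= lr * grad_w / batch
        c -= lr * grad_c / batch

        # Generator step: minimize -log D(fake), i.e. try to fool D.
        grad_a = grad_b = 0.0
        for z in noise:
            d_fake = sigmoid(w * (a * z + b) + c)
            grad_x = -(1.0 - d_fake) * w  # gradient of -log D(x) w.r.t. x
            grad_a += grad_x * z
            grad_b += grad_x
        a -= lr * grad_a / batch
        b -= lr * grad_b / batch
    return {"a": a, "b": b, "w": w, "c": c}


def sample_synthetic(params, n, seed=None):
    """Draw n synthetic values from the trained generator."""
    rng = random.Random(seed)
    return [params["a"] * rng.gauss(0.0, 1.0) + params["b"] for _ in range(n)]


if __name__ == "__main__":
    params = train_toy_gan()
    print(sample_synthetic(params, 5, seed=1))
```

Each round, the discriminator is nudged to tell real values from generated ones, and the generator is then nudged to produce values the discriminator accepts. Toy GANs like this one can oscillate rather than settle, so the sketch makes no claim about how closely its output matches the real distribution. Image-generating GANs use far larger networks, and their output would need the kind of validation the researchers describe.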
Better AI Tools for Prostate Cancer
Dr. Arora and his team are motivated by shortcomings in the current prostate cancer diagnostic flow. If prostate-specific antigen (PSA) and genomic tests indicate risk, surgeons biopsy the prostate. Pathologists evaluate the samples and assign a Gleason score between 6 and 10. The greater the number, the greater the severity. Further genomic tests assess tumor aggressiveness.
This approach can be slow and expensive, and even expert pathologists may disagree on the appropriate score assignment, which could affect care. AI systems can “read” biopsy images and automate this process, but they have trouble making fine distinctions in tumor severity.
The problem with these AI systems is that their training data fails to capture real-world scenarios. Most of this data comes from clinical trials, but these studies are narrowly focused and often don’t produce the information needed to teach AI systems effectively.
Trials have strict inclusion and exclusion criteria, so the resulting data may not be specific, randomized or relevant enough to educate the AI models about the disease. As a result, existing AI models for prostate tumors may generate false positives or false negatives.
“These models are not efficient because the training data is not efficient,” said Dr. Arora. “We are aiming to replace that training data with synthetic data from the GAN, and that gives us the added advantage of acquiring the data without the added cost of conducting a clinical trial.”
Synthetic Data from GANs
The Sylvester team will begin with relevant data from The Cancer Genome Atlas and other publicly available repositories that contain digital pathology images and anonymized patient information. As the GAN generates synthetic data from these sources, researchers will assess the output to ensure both accuracy and relevance. From there, researchers will use the validated data to train AI models to assign Gleason scores more reliably.
The interdisciplinary team behind this project includes:
- Derek Van Booven, director of research and bioinformatics at the John P. Hussman Institute for Human Genomics at the Miller School
- Cheng-Bang Chen, Ph.D., assistant professor of research of industrial and systems engineering at the University of Miami
- Oleksandr N. Kryvenko, M.D., clinical professor of pathology and laboratory medicine in the Division of Anatomic Pathology and director of the Genitourinary Pathology Service at the Miller School
- Sanoj Punnen, M.D., co-chair, Genitourinary Site Disease Group at Sylvester, vice chair of research, Desai Sethi Urology Institute, and associate professor of urologic oncology at the Miller School
The preliminary research that led to the MacKenzie Foundation grant holds promise. In that work, AI models trained with synthetic GAN images outperformed those trained with actual patient data. This approach could produce more accurate AI tools at a reduced cost.
The team’s goal is to create a platform that improves all AI diagnostic tools for prostate cancer.
“One of the biggest problems for current AI models is their limited diagnostic and prognostic capabilities,” said Dr. Arora. “With our synthetic data, we will be able to enhance the quality of diagnoses and prognoses for almost every AI model out there. And lastly, but importantly, our pipelines will be used to broaden the application to multiple cancer types in the near future.”