Any time a patient hears the word cancer as a potential diagnosis, it’s a highly stressful situation. For many people, the time it takes to get an official diagnosis can feel like an eternity. They may require more tests to get a clearer picture, or the location where those tests need to be performed may be far away, and the experts needed to examine scans or cells may be in short supply in their area. Utilizing NCSA’s DeltaAI supercomputer, a collaborative group of researchers has created OPENPROS, the first large-scale, public dataset designed to improve prostate cancer detection using a specialized imaging technique called Ultrasound Computed Tomography (USCT).
Youzuo Lin, an associate professor in the School of Data Science and Society at the University of North Carolina at Chapel Hill (UNC), spoke on behalf of the large research group. The work on OPENPROS is promising and could help solve problems such as limited access to high-end prostate imaging (e.g., MRI) and the “blind spots” that the pelvic bone causes in images. The OPENPROS project was co-led by Johns Hopkins University (JHU), and the team’s paper regarding this work was recently accepted for presentation at the 2026 International Conference on Learning Representations (ICLR).

Mapping the Body Through Sound
Imagine you are trying to take a high-quality photo of an object, but you can only see it from one or two specific angles because there are walls in the way. This is the challenge doctors face when trying to get a clear image of the prostate. The pelvic bone creates a barrier, making it difficult to get a clear picture.
The UNC and JHU research group approached this problem in a novel way. They created a “digital training ground” to teach computers how to create a high-resolution image of the prostate.
“Our methodology combines realistic medical imaging data, physics-based simulation and machine learning to create a large, reliable benchmark for prostate ultrasound computed tomography (USCT),” said Lin. “To our knowledge, this is the first large-scale dataset designed specifically for learning-based and physics-informed reconstruction of prostate USCT under clinically realistic imaging constraints.”
To train the computers, they needed to create ideal scans from “patients” for the machine to learn from. Hanchen Wang, the co-first author from UNC, explained, “We begin with clinically derived MRI and CT scans of the prostate, which are carefully annotated by medical experts to produce anatomically accurate 3D digital models. These models incorporate realistic tissue properties, including speed-of-sound measurements obtained from real ex vivo prostate samples.”

“From these 3D models,” Wang explained, “we systematically extract hundreds of thousands of clinically relevant 2D slices that reflect the limited-angle geometry imposed by real prostate imaging. For each slice, we simulate ultrasound wave propagation using high-fidelity physics solvers based on the acoustic wave equation, generating full waveform data that closely mimics clinical measurements.”
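For readers curious what "simulating ultrasound wave propagation" involves, here is a minimal sketch of the idea in one dimension. It is not the team's solver: the grid spacing, the 200 kHz source pulse, the 1500 and 1600 m/s speeds, and the function names are all illustrative assumptions. The sketch steps the acoustic wave equation forward in time with a simple finite-difference scheme and shows that a pulse reaches the receiver sooner when part of the path has a higher speed of sound.

```python
import math


def simulate_1d_acoustic(speed, dx, dt, n_steps, src_idx, rec_idx, freq):
    """Finite-difference solve of p_tt = c(x)^2 p_xx + s(t) on a 1D grid.

    speed: list of sound speeds (m/s), one per grid cell.
    Returns the pressure recorded at rec_idx for every time step.
    """
    n = len(speed)
    if max(speed) * dt / dx > 1.0:
        raise ValueError("time step violates the CFL stability condition")
    prev = [0.0] * n
    curr = [0.0] * n
    trace = []
    t0 = 1.5 / freq  # delay so the source pulse starts close to zero
    for step in range(n_steps):
        t = step * dt
        # Ricker wavelet, a common short test pulse
        a = (math.pi * freq * (t - t0)) ** 2
        src = (1.0 - 2.0 * a) * math.exp(-a)
        nxt = [0.0] * n  # end points stay at zero (reflecting boundaries)
        for i in range(1, n - 1):
            lap = curr[i + 1] - 2.0 * curr[i] + curr[i - 1]
            courant = speed[i] * dt / dx
            nxt[i] = 2.0 * curr[i] - prev[i] + courant * courant * lap
        nxt[src_idx] += dt * dt * src
        prev, curr = curr, nxt
        trace.append(curr[rec_idx])
    return trace


def pick_first_arrival(trace, fraction=0.1):
    """Index of the first sample reaching `fraction` of the peak amplitude."""
    peak = max(abs(v) for v in trace)
    if peak == 0.0:
        return None
    for i, v in enumerate(trace):
        if abs(v) >= fraction * peak:
            return i


# Illustrative setup: 400 cells of 0.5 mm, 0.1 microsecond time steps.
N_CELLS, DX, DT, N_STEPS = 400, 0.5e-3, 1.0e-7, 1300
uniform = [1500.0] * N_CELLS
with_inclusion = list(uniform)
for i in range(150, 250):
    with_inclusion[i] = 1600.0  # hypothetical faster region on the path

trace_uniform = simulate_1d_acoustic(uniform, DX, DT, N_STEPS, 50, 350, 2.0e5)
trace_inclusion = simulate_1d_acoustic(with_inclusion, DX, DT, N_STEPS, 50, 350, 2.0e5)
arrival_uniform = pick_first_arrival(trace_uniform)
arrival_inclusion = pick_first_arrival(trace_inclusion)
print("arrival step, uniform medium:", arrival_uniform)
print("arrival step, faster inclusion:", arrival_inclusion)
```

The recorded traces play the role of the "waveform data" Wang describes, and the speed list plays the role of the tissue map. The real dataset works with 2D slices and many sources and receivers, which is why generating it called for a supercomputer.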
The goal of this work is to train the AI model to use all the variables to produce a highly detailed, accurate version of the medical image. The bonus was that the HPC resources provided by NCSA enabled these images to be created in milliseconds.
“We used this paired dataset of ultrasound signals and ground-truth tissue maps to train and benchmark both traditional physics-based reconstruction methods and deep learning models, including convolutional neural networks and transformers,” said Lin. “We modeled prostate ultrasound tomography as an image-to-image learning problem where multi-channel ultrasound waveform data are mapped directly to quantitative tissue images. This end-to-end pipeline allows us to rigorously study accuracy, speed, robustness and generalization under realistic clinical constraints.”

Faster Results Equal Faster Detection and Treatment
With cancer, getting a timely diagnosis is often key to treatment and long-term positive health outcomes. Lin’s group created a dataset that can meaningfully affect the time between testing and treatment.
“OPENPROS is designed to accelerate clinically realistic prostate ultrasound computed tomography (USCT) specifically under the limited-angle constraints in real prostate imaging (transrectal and transabdominal access, nearby bones, heterogeneous tissue),” said Yixuan Wu, the co-first author from JHU.
For decades, doctors have relied on standard gray-scale ultrasounds, which can be like looking at a grainy, black-and-white photo where tumors are easily hidden in the shadows of objects like pelvic bones. The OPENPROS approach instead focuses on more than just the image. The AI measures how fast sound travels through tissue – a specific “biomarker” that can pinpoint cancer much more accurately, even in hard-to-reach areas where traditional tests often fail.
A major breakthrough is the speed at which these tests can be performed. While older, more complex methods could take hours, this system can reconstruct an image in a fraction of a second. “The baselines show learned reconstruction can be milliseconds per sample, vs. hours for iterative physics-based inversion,” said Wu. “It points toward real-time or near-real-time imaging that could fit biopsy/therapy workflows.”
By testing these tools against a massive, realistic database, researchers are ensuring that these algorithms aren’t just lab experiments – they are reliable, robust tools that clinicians can actually trust to make life-saving decisions on the spot.
“OPENPROS explicitly emphasizes benchmarking for generalization, robustness and uncertainty-aware reconstruction, all central to translating early-detection algorithms into tools clinicians can trust for treatment decisions,” said Wu.

Using HPC to Create More Affordable Care
It’s no secret that treating major conditions like cancer can be costly. While the work done on OPENPROS required the use of a large supercomputer like DeltaAI, Lin’s research group hopes their work helps alleviate some of the costlier aspects of diagnosis.
“A major barrier to widespread clinical adoption of Ultrasound Computed Tomography (USCT) is the computational cost of image reconstruction,” said Lin. “Traditional high-resolution USCT reconstruction relies on full-waveform inversion (FWI), an iterative physics-based optimization method that repeatedly solves large-scale wave equations. In practice, a single high-quality FWI reconstruction can require hundreds to thousands of forward and adjoint simulations, translating to days or even weeks of computation on GPUs or HPC clusters for one patient.”
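For readers who want the math behind Lin's description, FWI is commonly written as a least-squares fitting problem. The formulation below is a generic textbook version, not one taken from the OPENPROS paper, and the symbols are assumptions chosen for illustration: c is the speed-of-sound map being recovered, u_s is the simulated wavefield for source s, d_{s,r} is the signal measured at receiver r, and f_s is the source pulse.

```latex
\min_{c}\; J(c) = \frac{1}{2} \sum_{s=1}^{N_s} \sum_{r=1}^{N_r}
  \int_0^T \bigl| u_s(\mathbf{x}_r, t; c) - d_{s,r}(t) \bigr|^2 \, dt
\quad \text{subject to} \quad
\frac{1}{c(\mathbf{x})^2} \frac{\partial^2 u_s}{\partial t^2} - \nabla^2 u_s = f_s(\mathbf{x}, t)
```

In the standard adjoint approach, each update of c needs roughly one forward and one adjoint wave simulation per source, so K iterations cost on the order of 2·K·N_s simulations. That is how the count climbs into the hundreds or thousands Lin mentions.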
“Once a deep learning model is trained, reconstruction becomes a single forward pass through a neural network. In our benchmarks, deep learning inference takes seconds to a few minutes on a single GPU, and can even run on commodity hardware,” said Lin.
This means it’s possible that, at some point, clinicians will be able to use this resource at smaller health institutions as well.
“This dramatic reduction in computational cost has important implications for accessibility. Faster reconstruction lowers the need for expensive computing infrastructure, reduces operational costs and makes it feasible to deploy USCT systems in a wider range of clinical settings, including community hospitals and outpatient clinics,” explains Lin. “By enabling high-quality imaging without specialized hardware or long processing delays, our approach helps move advanced cancer screening closer to being affordable and accessible to the average patient.”

How DeltaAI Helps Researchers Develop Medical Breakthroughs

NCSA is first and foremost a research-supporting institution dedicated to making innovation accessible. As a partner in the U.S. National Science Foundation ACCESS program, HPC resources like DeltaAI help advance society at large by supporting important research. Lin’s research group was able to create OPENPROS, a dataset designed to improve prostate cancer detection, thanks to their allocation through the ACCESS program. Without access to HPC resources, this kind of research would take far longer to complete.
“Data generation itself was computationally intensive,” said Lin. “Each sample requires numerically solving acoustic wave equations under realistic anatomical and clinical constraints. We addressed this by leveraging the DeltaAI supercomputing cluster supported by the U.S. National Science Foundation, which provided large-scale GPU resources and high-throughput job scheduling. Efficient parallelization, careful memory management and robust job orchestration were essential to avoid node failures and wasted compute time.”
The payoff for using HPC resources was like going from a sketch to a three-dimensional painting in seconds. Without this technology, calculating even a single medical image could take a computer up to 24 hours because the physics involved are so complex. By training AI on the DeltaAI supercomputer, Lin’s research group created a system that can now analyze a patient’s ultrasound and produce a clear, accurate map of the prostate in just five to nine milliseconds.
By leveraging the power of a supercomputer, the researchers did more than just make a better map of the human body – they turned a day-long waiting game into an instant result, bringing all of us one step closer to life-saving, real-time cancer detection in every doctor’s office.

What Comes Next
The UNC and JHU research group will continue refining the OPENPROS dataset. For one, their model still struggles to depict fine internal structures or small lesions. Lin’s team notes that, for advancements in this area, more work is needed.
“The next major hurdle isn’t just ‘more resolution,’ it’s clinical-grade reliability under real-world variability, i.e., proving that USCT reconstructions remain quantitatively accurate and robust across unseen anatomies, acquisition differences and noise/artifacts, with calibrated uncertainty, so clinicians can trust the image for decisions,” said Wu.
What this means is that while this technology is incredibly fast, moving it from a computer simulation into a real hospital is the next major hurdle. Every human body is unique, and clinics don’t always have the same equipment. The images may look amazing and highly defined, but they also need to be mathematically precise so doctors can trust them for life-altering decisions, such as where to perform a biopsy or how to target a tumor.
As Wu explains, “OPENPROS is simulated from anatomically accurate models, but hospitals will introduce system-specific effects (probe variability, coupling, motion, attenuation/aberration, clutter, imperfect calibration). Models must hold up when those assumptions break. The paper explicitly notes that current deep learning methods still fall short of clinically acceptable high-resolution/accuracy despite strong performance on the benchmark.”
The final test for this type of research is to bring something like OPENPROS into the busy flow of a doctor’s office, ensuring it is robust enough to work reliably every time, even with the variability introduced in a real-world setting, before it is used to guide patient care.
To read more about this research, you can find related papers here:
OPENPROS: A large-scale dataset for limited-view prostate ultrasound computed tomography, a paper recently accepted for presentation at the 2026 International Conference on Learning Representations (ICLR).
Realistic digital phantoms for prostate ultrasound and photoacoustic imaging from the SPIE Medical Imaging proceedings.
Physics-Guided Data-Driven Seismic Inversion: Recent Progress and Future Opportunities in Full Waveform Inversion from the journal IEEE Signal Processing Magazine.
Survey of Deep Learning and Physics-Based Approaches in Computational Wave Imaging in arXiv.
Work on this project involved a large team of collaborators from multiple organizations. Team members include: Dr. Hanchen Wang, Research Scientist, School of Data Science and Society, University of North Carolina at Chapel Hill (currently Applied Scientist, Amazon); Dr. Yixuan Wu, Postdoc Fellow, Laboratory for Computational Sensing and Robotics, Johns Hopkins University; Yinan Feng, Graduate Student, School of Data Science and Society, University of North Carolina at Chapel Hill; Peng Jin, Graduate Student, College of Information Science and Technology, Penn State University (Incoming Research Scientist, University of North Carolina at Chapel Hill); Luoyuan Zhang, Graduate Student, School of Data Science and Society, University of North Carolina at Chapel Hill; Dr. Shihang Feng, Research Scientist, School of Data Science and Society, University of North Carolina at Chapel Hill; Prof. Songting Luo, Professor, Department of Mathematics, Iowa State University; Dr. Yinpeng Chen, Research Scientist, Google DeepMind; Prof. Emad Boctor, Associate Research Professor, Laboratory for Computational Sensing and Robotics, Johns Hopkins University; Prof. Youzuo Lin, Associate Professor, School of Data Science and Society, University of North Carolina at Chapel Hill.
ABOUT DELTA AND DELTAAI
NCSA’s Delta and DeltaAI are part of the national cyberinfrastructure ecosystem through the U.S. National Science Foundation ACCESS program. Delta (OAC 2005572) is a powerful computing and data-analysis resource combining next-generation processor architectures and NVIDIA graphics processors with forward-looking user interfaces and file systems. The Delta project partners with the Science Gateways Community Institute to empower broad communities of researchers to easily access Delta and with the University of Illinois Division of Disability Resources & Educational Services and the School of Information Sciences to explore and reduce barriers to access. DeltaAI (OAC 2320345) maximizes the output of artificial intelligence and machine learning (AI/ML) research. Tripling NCSA’s AI-focused computing capacity and greatly expanding the capacity available within ACCESS, DeltaAI enables researchers to address the world’s most challenging problems by accelerating complex AI/ML and high-performance computing applications running terabytes of data. Additional funding for DeltaAI comes from the State of Illinois.