
Mask embedding for realistic high-resolution image synthesis
Unmet Need
Generative adversarial networks (GANs) are machine learning models made of two neural networks, a generator and a discriminator, that compete with each other. In medical imaging they are used to generate training data for tasks such as denoising scan images, reconstructing MRIs, and supporting diagnostic assessments. Despite recent progress with diffusion and transformer-based models, most generative models still struggle with image realism. For high-stakes applications such as biomedical image diagnosis and analysis, synthesized images must accurately depict the anatomy, disease, and even the image artifacts that an algorithm would encounter when processing real data. There is a need for greater realism and control in generating biomedical images.
Technology
Duke inventors have developed a new algorithm that improves the realism of synthesized medical images. It is intended for diagnostic software developers who need better training data so that their software can detect and respond to key patterns in images. Specifically, the technology combines mask embedding with latent feature vectors to improve image realism. Mask embedding lets the user define the organs and other structures that should be present in the image. The latent feature vectors let the algorithm still create de novo images while maintaining realism. Compared with other generative models such as diffusion or transformer-based models, this approach is easier to train, more efficient to use, and offers more direct control over image realism. The technology has been demonstrated by generating synthetic mammogram images that are used to train a separate diagnostic algorithm to identify abnormal lesions in actual breast imaging scans. When tested against pix2pix, an existing GAN for generating medical images, this algorithm generated images that were more realistic and of higher resolution, indicating that mask embedding can help guide GANs in image generation. The technology has also been shown to synthesize facial images from embedded mask guidelines.
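As a rough illustration of how a mask embedding and a latent feature vector might be combined, the minimal sketch below builds the input to a hypothetical generator. This listing does not describe the actual architecture, so everything here is an assumption made for illustration: the function names `embed_mask` and `build_generator_input`, the single linear projection standing in for a learned embedding, and the toy dimensions.

```python
import numpy as np


def embed_mask(mask, weights):
    """Project a flattened mask to a compact embedding vector.

    Assumption: one linear layer with tanh stands in for whatever
    learned embedding network a real model would use.
    """
    flat = mask.reshape(-1).astype(np.float64)
    return np.tanh(weights @ flat)


def build_generator_input(mask, latent, weights):
    """Concatenate the mask embedding with a random latent vector."""
    return np.concatenate([embed_mask(mask, weights), latent])


rng = np.random.default_rng(0)

# Hypothetical 8x8 binary mask marking where a structure should appear.
mask = np.zeros((8, 8), dtype=np.int64)
mask[2:6, 3:7] = 1

embed_dim, latent_dim = 4, 6
# Random weights stand in for parameters that would be learned in training.
weights = rng.normal(size=(embed_dim, mask.size))

# Two different latent vectors with the same mask: the mask part of the
# input is identical, while the latent part varies between samples.
z1 = rng.normal(size=latent_dim)
z2 = rng.normal(size=latent_dim)
g1 = build_generator_input(mask, z1, weights)
g2 = build_generator_input(mask, z2, weights)
```

In this sketch the first `embed_dim` entries of the input depend only on the mask, and the remaining entries come from the latent vector. That split mirrors the idea described above: the mask constrains which structures appear, and the latent vector supplies variation between synthesized images.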
Other Applications
This technology could also be used in training other diagnostic algorithms, as the means of generating these images is not tied to any single biological application.
Advantages
- Generates more realistic, higher-resolution images than pix2pix, an existing GAN, in comparative testing
- Extends to other biomedical and real-world image types, as demonstrated with mammogram and facial images
- Use of latent feature vectors allows for greater detail in synthesized images than in those generated by competing technologies