Prompt2MedImage - Diffusion for Medical Images

Prompt2MedImage is a latent text-to-image diffusion model that has been fine-tuned on medical images from the ROCO dataset.

The weights here are intended to be used with the 🧨 Diffusers library.

This model was trained using Amazon SageMaker and the Hugging Face Deep Learning container.

Model Details

Developed by: Nihir Chadderwala
Model type: Diffusion-based text-to-medical-image generation model
Language: English
License: wtfpl
Model Description: This latent text-to-image diffusion model can be used to generate high-quality medical images from text prompts. It uses a fixed, pretrained text encoder (CLIP ViT-L/14), as suggested in the Imagen paper.
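
Because the text encoder is CLIP ViT-L/14, prompts pass through a CLIP tokenizer, which in standard Stable Diffusion pipelines has a context length of 77 tokens; longer prompts are truncated. The sketch below counts tokens so that long radiology captions can be checked before generation. It assumes the repository follows the usual Diffusers layout with a `tokenizer` subfolder, which this card does not confirm. `count_prompt_tokens`, `fits_context`, and `load_clip_tokenize` are hypothetical helper names.

```python
from typing import Callable, Sequence

# Assumption: the CLIP tokenizer used by Stable Diffusion pipelines has a
# 77-token context, counting the start-of-text and end-of-text tokens.
CLIP_CONTEXT_LENGTH = 77


def count_prompt_tokens(tokenize: Callable[[str], Sequence[int]], prompt: str) -> int:
    """Return how many token ids `tokenize` produces for `prompt`."""
    return len(tokenize(prompt))


def fits_context(tokenize, prompt: str, limit: int = CLIP_CONTEXT_LENGTH) -> bool:
    """True if the prompt would not be truncated at `limit` tokens."""
    return count_prompt_tokens(tokenize, prompt) <= limit


def load_clip_tokenize(model_id: str = "Nihirc/Prompt2MedImage"):
    """Build a tokenize function from the model repository (downloads files)."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import CLIPTokenizer

    # Assumption: the tokenizer lives in the "tokenizer" subfolder.
    tokenizer = CLIPTokenizer.from_pretrained(model_id, subfolder="tokenizer")
    # input_ids includes the start-of-text and end-of-text tokens.
    return lambda text: tokenizer(text).input_ids
```

With the real tokenizer, `fits_context(load_clip_tokenize(), prompt)` indicates whether a caption is likely to be cut off; shortening the caption is one way to keep the most relevant findings inside the window.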

Examples

The patient had residual paralysis of the hand after poliomyelitis. It was necessary to stabilize the thumb with reference to the index finger. This was accomplished by placing a graft from the bone bank between the first and second metacarpals. The roentgenogram shows the complete healing of the graft one year later.

A 3-year-old child with visual difficulties. Axial FLAIR image show a supra-sellar lesion extending to the temporal lobes along the optic tracts (arrows) with moderate mass effect, compatible with optic glioma. FLAIR hyperintensity is also noted in the left mesencephalon from additional tumoral involvement

Showing the subtrochanteric fracture in the porotic bone.
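
Prompts like the ones above can be rendered in a single pass. The following is a minimal sketch, not the author's script: `prompt_to_filename` and `generate_examples` are hypothetical helpers, the seed value is arbitrary, and the code assumes a CUDA GPU. Seeding a `torch.Generator` per prompt is intended to make each output repeatable on the same hardware and library versions.

```python
import re

EXAMPLE_PROMPTS = [
    "Showing the subtrochanteric fracture in the porotic bone.",
]


def prompt_to_filename(prompt: str, max_words: int = 6) -> str:
    """Turn a prompt into a short, filesystem-safe PNG filename."""
    words = re.findall(r"[a-z0-9]+", prompt.lower()) or ["image"]
    return "_".join(words[:max_words]) + ".png"


def generate_examples(prompts, model_id="Nihirc/Prompt2MedImage", seed=0):
    """Generate and save one image per prompt (downloads weights, needs a GPU)."""
    # Imported lazily so the filename helper has no heavy dependencies.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    pipe = pipe.to("cuda")
    for prompt in prompts:
        # Re-seed for every prompt so each image can be reproduced on its own.
        generator = torch.Generator(device="cuda").manual_seed(seed)
        image = pipe(prompt, generator=generator).images[0]
        image.save(prompt_to_filename(prompt))


# Preview the output filenames without loading the model.
for p in EXAMPLE_PROMPTS:
    print(prompt_to_filename(p))
```

Calling `generate_examples(EXAMPLE_PROMPTS)` would then write one PNG per prompt into the working directory.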

License

This model is open access and available to all, with the Do What the F*ck You Want To Public License (WTFPL) further specifying rights and usage.

You can't use the model to deliberately produce or share illegal or harmful outputs or content.
The author claims no rights on the outputs you generate; you are free to use them and are accountable for their use.
You may re-distribute the weights and use the model commercially and/or as a service.

Run using PyTorch

pip install diffusers transformers

Running pipeline with default PNDM scheduler:

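import torch
from diffusers import StableDiffusionPipeline

model_id = "Nihirc/Prompt2MedImage"
device = "cuda"  # the float16 weights loaded below assume a CUDA GPU

# Load the fine-tuned weights in half precision and move the pipeline to the GPU.
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to(device)

# Generate one image from a text prompt and save it as a PNG.
prompt = "Showing the subtrochanteric fracture in the porotic bone."
image = pipe(prompt).images[0]

image.save("porotic_bone_fracture.png")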
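
The snippet above uses the default PNDM scheduler and assumes a CUDA GPU. As a hedged sketch of two common Diffusers variations, the code below falls back to float32 on CPU when no GPU is present and swaps in `DPMSolverMultistepScheduler`, which often gives usable results in fewer denoising steps. The helper names, the 25-step setting, and the output filename are illustrative assumptions, not recommendations from the model author.

```python
def choose_device_and_dtype(cuda_available: bool):
    """Return (device, dtype name); half precision is only used on GPU."""
    return ("cuda", "float16") if cuda_available else ("cpu", "float32")


def build_pipeline(model_id: str = "Nihirc/Prompt2MedImage"):
    """Load the pipeline with a different scheduler (downloads weights)."""
    # Imported lazily so the device helper works without torch installed.
    import torch
    from diffusers import DPMSolverMultistepScheduler, StableDiffusionPipeline

    device, dtype_name = choose_device_and_dtype(torch.cuda.is_available())
    pipe = StableDiffusionPipeline.from_pretrained(
        model_id, torch_dtype=getattr(torch, dtype_name)
    )
    # Reuse the stored scheduler config so the noise schedule stays consistent.
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
    return pipe.to(device)


def generate(prompt: str, steps: int = 25, out_path: str = "sample.png") -> str:
    """Generate one image with the swapped scheduler and save it."""
    pipe = build_pipeline()
    image = pipe(prompt, num_inference_steps=steps).images[0]
    image.save(out_path)
    return out_path
```

Calling `generate("Showing the subtrochanteric fracture in the porotic bone.")` would write `sample.png`; on CPU, generation is considerably slower than on a GPU.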

Citation

O. Pelka, S. Koitka, J. Rückert, F. Nensa, C.M. Friedrich,
"Radiology Objects in COntext (ROCO): A Multimodal Image Dataset".
MICCAI Workshop on Large-scale Annotation of Biomedical Data and Expert Label Synthesis (LABELS) 2018, September 16, 2018, Granada, Spain. Lecture Notes in Computer Science (LNCS), vol. 11043, pp. 180-189, Springer Cham, 2018.
doi: 10.1007/978-3-030-01364-6_20

Papers for Nihirc/Prompt2MedImage

Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding (Imagen). arXiv 2205.11487, published May 23, 2022.

Learning Transferable Visual Models From Natural Language Supervision (CLIP). arXiv 2103.00020, published Feb 26, 2021.
