Prompt2MedImage is a latent text-to-image diffusion model that has been fine-tuned on medical images from the ROCO dataset.
The weights here are intended to be used with the 🧨 Diffusers library.
This model was trained using Amazon SageMaker and the Hugging Face Deep Learning container.
This model is open access and available to all, with a Do What the F*ck You Want To Public License (WTFPL) further specifying rights and usage.
pip install diffusers transformers
Running the pipeline with the default PNDM scheduler:
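import torch
from diffusers import StableDiffusionPipeline

model_id = "Nihirc/Prompt2MedImage"
device = "cuda"  # the weights are loaded in float16 below, so a CUDA GPU is expected

# Load the fine-tuned weights in half precision and move the pipeline to the GPU
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to(device)

# Generate one image from a radiology-style caption and save it
prompt = "Showing the subtrochanteric fracture in the porotic bone."
image = pipe(prompt).images[0]
image.save("porotic_bone_fracture.png")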
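The snippet above assumes a CUDA GPU and keeps the default PNDM scheduler. The following is a minimal sketch, not part of the original model card, of how one might fall back to CPU when no GPU is present, swap in a different scheduler, and fix the random seed. The helper names `pick_device_and_dtype` and `generate`, the choice of `DPMSolverMultistepScheduler`, and the default of 25 steps are illustrative assumptions rather than settings recommended by the model author.

```python
from typing import Tuple


def pick_device_and_dtype(cuda_available: bool) -> Tuple[str, str]:
    """Hypothetical helper: half precision on GPU, full precision on CPU."""
    if cuda_available:
        return "cuda", "float16"
    return "cpu", "float32"


def generate(prompt: str, out_path: str, seed: int = 0, steps: int = 25) -> None:
    """Hypothetical wrapper around the pipeline call shown in the model card."""
    # Heavy imports stay inside the function so the helper above can be
    # used without torch or diffusers installed.
    import torch
    from diffusers import DPMSolverMultistepScheduler, StableDiffusionPipeline

    device, dtype_name = pick_device_and_dtype(torch.cuda.is_available())
    pipe = StableDiffusionPipeline.from_pretrained(
        "Nihirc/Prompt2MedImage", torch_dtype=getattr(torch, dtype_name)
    )
    # Replace the default PNDM scheduler, reusing its configuration.
    # (Assumption: any scheduler compatible with Stable Diffusion could be used.)
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
    pipe = pipe.to(device)

    # A seeded generator makes the sampled image repeatable on the same setup.
    generator = torch.Generator(device=device).manual_seed(seed)
    image = pipe(prompt, num_inference_steps=steps, generator=generator).images[0]
    image.save(out_path)
```

Calling `generate("Showing the subtrochanteric fracture in the porotic bone.", "porotic_bone_fracture_dpm.png")` would then download the weights and write one image. On CPU this is expected to be considerably slower than on a GPU.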
O. Pelka, S. Koitka, J. Rückert, F. Nensa, C.M. Friedrich,
"Radiology Objects in COntext (ROCO): A Multimodal Image Dataset".
MICCAI Workshop on Large-scale Annotation of Biomedical Data and Expert Label Synthesis (LABELS) 2018, September 16, 2018, Granada, Spain. Lecture Notes in Computer Science (LNCS), vol. 11043, pp. 180-189, Springer Cham, 2018.
doi: 10.1007/978-3-030-01364-6_20