Image Superimpose V2
Superimpose V2 elevates image editing! Seamlessly layer images with background removal, precise positioning, and flexible resizing options. Explore 14 blending modes to create stunning effects.
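To illustrate what a blending mode does (the function below is a generic sketch of one classic mode, not Superimpose V2's implementation; the product's 14 modes are its own), a "multiply" blend darkens the base image by the top layer:

```python
def blend_multiply(base_px, top_px):
    """Classic 'multiply' blend for one 8-bit RGB pixel: each channel
    is scaled by the corresponding top-layer channel, so the result is
    never brighter than either input."""
    return tuple((b * t) // 255 for b, t in zip(base_px, top_px))

# A pure-white top layer leaves the base pixel unchanged;
# a pure-black top layer forces the result to black.
```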
Video Faceswap
Video Faceswap is a powerful tool for creators, filmmakers, and meme enthusiasts. With this innovative technology, you can effortlessly replace faces in videos.
Aura Flow
The largest fully open-sourced flow-based generation model capable of text-to-image generation.
Live Portrait
Live Portrait animates static images using a reference driving video through an implicit-keypoint-based framework, bringing a portrait to life with realistic expressions and movements. It identifies keypoints on the face (think eyes, nose, mouth) and manipulates them to create expressions and movements.
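The keypoint-driven idea can be sketched in a few lines: take the displacement each driving-frame keypoint makes relative to a neutral driving frame, and apply that displacement to the corresponding source keypoint. (A toy 2D sketch of the concept only; LivePortrait's actual implicit keypoints are learned 3D quantities with additional retargeting modules.)

```python
def retarget_keypoints(source_kp, driving_kp, neutral_kp):
    """Toy retargeting: move each source keypoint by the offset the
    driving frame's keypoint has made relative to its neutral pose."""
    return [
        (sx + dx - nx, sy + dy - ny)
        for (sx, sy), (dx, dy), (nx, ny) in zip(source_kp, driving_kp, neutral_kp)
    ]
```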
Dubbing
ElevenLabs Dubbing uses AI to translate your audio into multiple languages. Easily create multilingual versions of your content without studios or voice actors for each language.
Claude 3 Haiku
Claude 3 Haiku, the fastest and most cost-effective LLM from Anthropic, delivers instant responses and image analysis. Build interactive AI experiences that mimic human conversation. Perfect for a wide range of applications, from research to enterprise.
Claude 3 Opus
Claude 3 Opus is an LLM pushing the limits of language understanding. It excels at complex tasks, generates human-quality text, and remembers vast amounts of information.
Gemini PRO
Gemini 1.5 Pro represents a significant leap in large language model technology, offering exceptional understanding and performance across different modalities and contexts.
Gemini Flash
Gemini 1.5 Flash is a game-changer for developers and enterprises seeking a speedy and cost-effective large language model with exceptional long-context understanding.
Claude 3.5 Sonnet
Claude 3.5 Sonnet represents a significant advancement in AI language models, combining speed, accuracy, and visual reasoning capabilities. It excels at understanding and completing requests thoughtfully, and does so much faster than previous versions. Additionally, it boasts a stronger vision model, allowing it to analyze visual data like charts and images with exceptional accuracy.
Kolors
Kolors is a cutting-edge text-to-image model that bridges language and visual art. Transform your textual ideas into photorealistic images with semantic precision.
Playground V2.5
Playground V2.5 is a diffusion-based text-to-image generative model, designed to create highly aesthetic images based on textual prompts.
Image Superimpose
The Superimpose model lets you create captivating visuals by seamlessly overlaying one image on top of another. It streamlines your image-layering process, allowing you to bring your creative vision to life effortlessly.
SDXL Img2Img
SDXL Img2Img is used for text-guided image-to-image translation. This model uses the weights from Stable Diffusion to generate new images from an input image using the StableDiffusionImg2ImgPipeline from diffusers.
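In img2img pipelines, a "strength" parameter controls how much of the denoising schedule is actually run: the input image is noised to an intermediate timestep and only the remaining steps are denoised. A minimal sketch of that bookkeeping (mirroring how diffusers-style img2img pipelines derive the starting step; the function name is illustrative):

```python
def img2img_start_step(num_inference_steps, strength):
    """With strength=1.0 the full schedule runs (essentially pure
    text-to-image); with a small strength only the last few steps run,
    preserving the input image's overall structure."""
    init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
    return max(num_inference_steps - init_timestep, 0)
```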
SDXL Controlnet
SDXL ControlNet gives unprecedented control over text-to-image generation. SDXL ControlNet models introduce conditioning inputs, which provide additional information to guide the image generation process.
Story Diffusion
Story Diffusion turns your written narratives into stunning image sequences.
Elevenlabs Sound Generation
Eleven Labs' Sound Generation API provides a robust development tool for programmatically generating audio content using artificial intelligence. This API empowers developers and creators to integrate sound generation functionalities into their applications and workflows.
Elevenlabs Speech To Speech
Eleven Labs Speech-to-Speech offers AI-powered voice conversion for content creators, media professionals, and anyone seeking to modify or translate audio speech.
Elevenlabs Text To Speech
Eleven Labs Text-to-Speech (TTS) harnesses the power of deep learning to create realistic and engaging synthetic speech from written text.
Omni Zero
Omni-Zero: A diffusion pipeline for zero-shot stylized portrait creation.
LLAVA 1.6 7B
LLaVA translates images into text descriptions and captions.
LLaVA 13B
LLaVA 13B is a vision-language model that accepts both images and text as inputs.
Tooncrafter
Create videos from illustrated input images
V Express
V-Express lets you create portrait videos from single images.
SadTalker
Audio-based Lip Synchronization for Talking Head Video
Hallo
Hallo lets you create portrait videos from single images.
Relighting
Prompts to auto-magically relight your images.
Automatic Mask Generator
Automatic Mask Generator is a powerful tool that automates the creation of precise masks for inpainting.
Magic Eraser
LaMa object removal: an AI magic eraser.
Inpaint Mask Maker
Real-Time Open-Vocabulary Object Detection
Background Eraser
Background Eraser helps in flawless background removal with exceptional accuracy.
Clarity Upscaler
High-resolution creative image upscaler and enhancer; a free Magnific alternative.
Consistent Character
Create images of a given character in different poses
IDM VTON
Best-in-class virtual clothing try-on in the wild.
Stable Diffusion 3 Medium Text to Image
Stable Diffusion 3 Medium is a Multimodal Diffusion Transformer (MMDiT) text-to-image model from Stability AI. Compared with earlier Stable Diffusion versions, it offers improved image quality, typography, and complex-prompt understanding, while remaining efficient enough to run on consumer hardware.
Fooocus
Fooocus enables high-quality image generation effortlessly, combining the best of Stable Diffusion and Midjourney.
IPAdapter Style Transfer
Style & Composition Transfer with Stable Diffusion IP Adapter
Profile Photo Style Transfer
Turn any image of a face into artwork using Stable Diffusion Controlnet and IPAdapter
illusion-diffusion-hq
Monster Labs' QR Code ControlNet on top of SD Realistic Vision v5.1.
PuLID
Novel tuning-free ID customization method for text-to-image generation.
Yamer's Realistic SDXL
A fine-tuned checkpoint of SDXL, the official upgrade to the Stable Diffusion v1.5 model, released as open-source software.
GPT 4 turbo
GPT-4 outperforms both previous large language models and, as of 2023, most state-of-the-art systems (which often have benchmark-specific training or hand-engineering). On the MMLU benchmark, an English-language suite of multiple-choice questions covering 57 subjects, GPT-4 not only outperforms existing models by a considerable margin in English but also demonstrates strong performance in other languages. Currently points to gpt-4-turbo-2024-04-09.
GPT 4o
GPT-4o (“o” for “omni”) is OpenAI's most advanced model. It is multimodal (accepting text or image inputs and outputting text) and has the same high intelligence as GPT-4 Turbo while being much more efficient: it generates text 2x faster and is 50% cheaper. Additionally, GPT-4o has the best vision and non-English-language performance of any OpenAI model. GPT-4o is available in the OpenAI API to paying customers.
GPT 4
GPT-4 outperforms both previous large language models and, as of 2023, most state-of-the-art systems (which often have benchmark-specific training or hand-engineering). On the MMLU benchmark, an English-language suite of multiple-choice questions covering 57 subjects, GPT-4 not only outperforms existing models by a considerable margin in English but also demonstrates strong performance in other languages.
Mixtral 8x7b
Mistral's MoE 8x7B Instruct v0.1 model with a sparse Mixture of Experts architecture, fine-tuned for instruction following.
Mixtral 8x22b
Mistral's MoE 8x22B Instruct v0.1 model with a sparse Mixture of Experts architecture, fine-tuned for instruction following.
PuLID Lightning
Faster version of PuLID, a novel tuning-free face customization method for text-to-image generation
Fashion AI
This model is capable of editing clothing in an image using a premier clothing segmentation algorithm.
face-to-many
Turn a face into 3D, emoji, pixel art, video game, claymation or toy
face-to-sticker
Turn a face into a sticker
Llama 3 8b
Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes. The Llama 3 instruction-tuned models are optimized for dialogue use cases and outperform many available open-source chat models on common industry benchmarks.
material-transfer
Transfer a material from an image to a subject
Llama 3 70b
Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes. The Llama 3 instruction-tuned models are optimized for dialogue use cases and outperform many available open-source chat models on common industry benchmarks.
Faceswap V2
Take a picture or GIF and replace the face in it with a face of your choice. You only need one image of the desired face: no dataset, no training.
Insta Depth
InstantID aims to generate customized images with various poses or styles from only a single reference ID image while ensuring high fidelity.
Background Removal V2
This model removes the background from any image.
NewReality Lightning SDXL
NewReality Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.
DreamShaper Lightning SDXL
DreamShaper Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.
Colossus Lightning SDXL
Colossus Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.
Samaritan Lightning SDXL
Samaritan Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.
Realism Lightning SDXL
Realism Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.
ProtoVision Lightning SDXL
ProtoVision Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.
NightVis Lightning SDXL
NightVis Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.
WildCard Lightning SDXL
WildCard Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.
Dynavis Lightning SDXL
Dynavis Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.
Juggernaut Lightning SDXL
Juggernaut Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.
Realvis Lightning SDXL
Realvis Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.
Try-On Diffusion
Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on
Background Replace
This model is capable of generating photo-realistic images given any text input, with the extra capability of inpainting the pictures by using a mask.
InstantID
InstantID aims to generate customized images with various poses or styles from only a single reference ID image while ensuring high fidelity.
Samaritan 3D XL
Samaritan 3D XL leverages the robust capabilities of the SDXL framework, ensuring high-quality, detailed 3D character renderings.
Stable Video Diffusion
Takes an image as input and returns a video.
Segmind-Vega
The Segmind-Vega Model is a distilled version of the Stable Diffusion XL (SDXL), offering a remarkable 70% reduction in size and an impressive 100% speedup while retaining high-quality text-to-image generation capabilities.
Segmind-VegaRT
Segmind-VegaRT is a distilled consistency adapter for Segmind-Vega that reduces the number of inference steps to just 2-8.
IP-adapter Openpose XL
IP Adapter XL Openpose is built on the SDXL framework. This model integrates the IP Adapter and Openpose preprocessor to offer unparalleled control and guidance in creating context-rich images.
IP-adapter Canny XL
IP Adapter XL Canny is built on the SDXL framework. This model integrates the IP Adapter and Canny edge preprocessor to offer unparalleled control and guidance in creating context-rich images.
IP-adapter Depth XL
IP Adapter Depth XL is built on the SDXL framework. This model integrates the IP Adapter and Depth preprocessor to offer unparalleled control and guidance in creating context-rich images.
SDXL Inpaint
This model is capable of generating photo-realistic images given any text input, with the extra capability of inpainting the pictures by using a mask.
SSD Img2Img
This model uses SSD-1B to generate images by passing a text prompt and an initial image to condition the generation.
SDXL-Openpose
This model leverages SDXL to generate images with ControlNet conditioned on human pose estimation.
SSD-Depth
This model leverages SSD-1B to generate images with ControlNet conditioned on depth estimation.
SSD-Canny
This model leverages SSD-1B to generate images with ControlNet conditioned on Canny edge maps.
SSD-1B
The Segmind Stable Diffusion Model (SSD-1B) is a distilled 50% smaller version of the Stable Diffusion XL (SDXL), offering a 60% speedup while maintaining high-quality text-to-image generation capabilities. It has been trained on diverse datasets, including Grit and Midjourney scrape data, to enhance its ability to create a wide range of visual content based on textual prompts.
Copax Timeless SDXL
A fine-tuned checkpoint of SDXL, the official upgrade to the Stable Diffusion v1.5 model, released as open-source software.
Zavychroma SDXL
A fine-tuned checkpoint of SDXL, the official upgrade to the Stable Diffusion v1.5 model, released as open-source software.
Realvis SDXL
A fine-tuned checkpoint of SDXL, the official upgrade to the Stable Diffusion v1.5 model, released as open-source software.
Dreamshaper SDXL
A fine-tuned checkpoint of SDXL, the official upgrade to the Stable Diffusion v1.5 model, released as open-source software.
Stable Diffusion 2.1
Stable Diffusion is a type of latent diffusion model that can generate images from text. It was created by a team of researchers and engineers from CompVis, Stability AI, and LAION. Stable Diffusion v2 is a specific version of the model architecture. It utilizes a downsampling-factor 8 autoencoder with an 865M UNet and OpenCLIP ViT-H/14 text encoder for the diffusion model. When using the SD 2-v model, it produces 768x768 px images. It uses the penultimate text embeddings from a CLIP ViT-H/14 text encoder to condition the generation process.
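The downsampling factor determines the latent resolution the UNet actually operates at. Assuming the standard 4-channel Stable Diffusion latent space, the latent shape works out as follows (a small illustrative helper, not part of any library):

```python
def latent_shape(height, width, downsample_factor=8, latent_channels=4):
    """Shape of the latent tensor the diffusion UNet denoises, given
    the pixel-space output size and the autoencoder's downsampling factor."""
    assert height % downsample_factor == 0 and width % downsample_factor == 0
    return (latent_channels, height // downsample_factor, width // downsample_factor)

# The SD 2-v model's 768x768 images are denoised as 4x96x96 latents.
```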
Stable Diffusion XL 0.9
The SDXL model is the official upgrade to the v1.5 model and is released as open-source software.
Word2img
Create beautifully designed words for your marketing purposes using Segmind's word-to-image model.
Segmind Tiny-SD
Convert text into images with the latest distilled Stable Diffusion model.
Stable Diffusion Inpainting
Stable Diffusion Inpainting is a latent text-to-image diffusion model capable of generating photo-realistic images given any text input, with the extra capability of inpainting the pictures by using a mask.
Stable Diffusion img2img
This model uses the diffusion-denoising mechanism first proposed by SDEdit, applying Stable Diffusion to text-guided image-to-image translation. It uses the weights from Stable Diffusion to generate new images from an input image using the StableDiffusionImg2ImgPipeline from diffusers.
Segmind Small-SD
Convert text into images with Segmind's distilled Stable Diffusion model, Small-SD. Segmind offers the fastest serverless API deployment for Small-Stable-Diffusion inference.
Stable Diffusion XL 1.0
The SDXL model is the official upgrade to the v1.5 model and is released as open-source software.
Scifi
A highly versatile photorealistic model that blends multiple models to achieve amazingly realistic images.
Samaritan
A highly versatile photorealistic model that blends multiple models to achieve amazingly realistic images.
RPG
This model corresponds to the Stable Diffusion RPG checkpoint, which produces detailed images at the cost of requiring a very detailed prompt.
Reliberate
This model corresponds to the Stable Diffusion Reliberate checkpoint, which produces detailed images at the cost of requiring a very detailed prompt.
Realistic Vision
This model corresponds to the Stable Diffusion Realistic Vision checkpoint, which produces detailed images at the cost of requiring a very detailed prompt.
RCNZ - Cartoon
A highly versatile model that blends multiple models to achieve amazingly realistic images.
Paragon
This model corresponds to the Stable Diffusion Paragon checkpoint, which produces detailed images at the cost of requiring a very detailed prompt.
SD Outpainting
Stable Diffusion Outpainting can extend any image in any direction.
Manmarumix
A highly versatile photorealistic model that blends multiple models to achieve amazingly realistic images.
Majicmix
A highly versatile photorealistic model that blends multiple models to achieve amazingly realistic images.
Juggernaut Final
A highly versatile photorealistic model that blends multiple models to achieve amazingly realistic images.
Fruit Fusion
A highly versatile photorealistic model that blends multiple models to achieve amazingly realistic images.
Flat 2d
A highly versatile model that blends multiple models to achieve amazingly realistic images.
Fantassified Icons
A highly versatile model that blends multiple models to achieve amazingly realistic images.
Epic Realism
This model corresponds to the Stable Diffusion Epic Realism checkpoint, which produces detailed images at the cost of requiring a very detailed prompt.
Edge of Realism
This model corresponds to the Stable Diffusion Edge of Realism checkpoint, which produces detailed images at the cost of requiring a very detailed prompt.
DvArch
A highly versatile photorealistic model that blends multiple models to achieve amazingly realistic images.
Dream Shaper
Dreamshaper excels in delivering high-quality, detailed images. It is fine-tuned to understand and interpret a diverse range of artistic styles and subjects.
Disney
This model corresponds to the Stable Diffusion Disney checkpoint, which produces detailed images at the cost of requiring a very detailed prompt.
Deep Spaced Diffusion
A highly versatile photorealistic model that blends multiple models to achieve amazingly realistic space-themed images.
Cyber Realistic
A highly versatile photorealistic model that blends multiple models to achieve amazingly realistic images.
Cute Rich Style
A highly versatile model that blends multiple models to achieve amazingly realistic images.
Colorful
This model corresponds to the Stable Diffusion Colorful checkpoint for detailed images at the cost of a super detailed prompt
All In One Pixel
A highly versatile model that blends multiple models to achieve amazingly realistic images.
526mix
A highly versatile photorealistic model that blends multiple models to achieve amazingly realistic images.
QR Generator
Create beautiful and creative QR codes for your marketing campaigns.
Segmind Tiny-SD (Portrait)
Create realistic portrait images using the fine-tuned Segmind Tiny-SD (Portrait) model. Segmind offers the fastest serverless API deployment for Tiny-Stable-Diffusion inference.
Kandinsky 2.1
Kandinsky inherits best practices from DALL·E 2 and latent diffusion while introducing some new ideas.
ControlNet Soft Edge
This model corresponds to the ControlNet conditioned on Soft Edge.
ControlNet Scribble
This model corresponds to the ControlNet conditioned on Scribble images.
ControlNet Depth
This model corresponds to the ControlNet conditioned on Depth estimation.
ControlNet Canny
This model corresponds to the ControlNet conditioned on Canny edges.
Codeformer
CodeFormer is a robust face restoration algorithm for old photos or AI-generated faces.
Segment Anything Model
The Segment Anything Model (SAM) produces high quality object masks from input prompts such as points or boxes, and it can be used to generate masks for all objects in an image.
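SAM is ambiguity-aware: for a single point prompt it typically returns several candidate masks, each with a predicted IoU score. A common post-processing step is simply keeping the highest-scoring candidate (a generic sketch independent of the SAM library's actual API):

```python
def pick_best_mask(masks, iou_scores):
    """Given parallel lists of candidate masks and their predicted IoU
    scores, return the mask the model scores highest."""
    best_idx = max(range(len(iou_scores)), key=iou_scores.__getitem__)
    return masks[best_idx]
```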
Faceswap
Take a picture or GIF and replace the face in it with a face of your choice. You only need one image of the desired face: no dataset, no training.
Revanimated
This model corresponds to the Stable Diffusion Revanimated checkpoint, which produces detailed images at the cost of requiring a very detailed prompt.
Background Removal
This model removes the background from any image.
ESRGAN
AI-powered image super-resolution, upscaling, and enhancement, producing stunning, high-quality results.
ControlNet Openpose
This model corresponds to the ControlNet conditioned on Human Pose Estimation.