Latest Models

Google Veo 2 Image To Video

Discover Google Veo 2, an AI-powered image-to-video model with 4K resolution, realistic motion, and cinematic effects for creators and developers.

Segmind Relighting

Prompts to auto-magically relight your images.

Juggernaut Lightning Flux

Juggernaut Lightning Flux: Blazing fast (<300ms!) & powerful inference with enhanced visuals.

Juggernaut Pro Flux

Juggernaut Pro FLUX: Create stunningly realistic AI images with unprecedented detail and sharpness.

Seg Swap: Swap Objects Instantly

Transform images effortlessly with the Seg-Swap model! Replace, add objects, and transfer patterns seamlessly.

Minimax Music-01

Generate up to 60 seconds of music with both accompaniment and vocals in a single pass, with vocals from lyrics and a reference track.

3B Orpheus TTS (0.1)

Orpheus TTS is an open-source text-to-speech (TTS) system powered by the Llama 3B language model, designed for high-quality and customizable speech synthesis.

Video Loop

Effortlessly loop videos for engaging social media & storytelling with our Video Loop.

Segmind Faceswap v4

Effortlessly perform face swap with speed and precision. Choose head or face swaps with style options.

Hunyuan-3d 2mv

Hunyuan3D-2mv is finetuned from Hunyuan3D-2 to support multiview controlled shape generation.

Gemini 2 Flash

With Gemini 2 Flash, create consistent visuals, edit images conversationally, and render text accurately.

Luma Ray flash 2 (720p)

Generate stunning 720p videos from text with the Luma ray-flash-2-720p model. Faster & cheaper than Ray 2, offering realistic motion & detail.

Wan Video Effects

Transform your videos with diverse video effects. Start creating captivating videos today.

Veo-2

Create stunning, realistic videos with Veo 2, Google's state-of-the-art AI video generation model. Experience enhanced quality & cinematic control.

Minimax-image-01

Generate high-fidelity images from text with precise control & stunning quality with Minimax Image-01.

Elevenlabs Transcript

Experience unmatched accuracy with ElevenLabs Transcript, the leading model for AI speech-to-text.

Ideogram Describe

Ideogram Describe effortlessly generates detailed prompts from images. Perfect for refining creations or replicating styles.

Ideogram Reframe

Transform your images with Ideogram Reframe! Easily reframe square images to your chosen resolution.

Ideogram 2a Image to Image

Ideogram Image to Image: Transform your images with ease! Enhance, modify, or create entirely new visuals using advanced AI. Perfect for artists, designers, and anyone seeking creative inspiration.

Ideogram 2a Text To Image

Create captivating designs, realistic images & innovative logos with Ideogram 2a text-to-image.

Ideogram Turbo Text To Image

Create stunning images in seconds with Ideogram Turbo Text to Image. Fast AI model for quick ideation & text rendering.

Ideogram Turbo Image To Image

Transform images instantly with Ideogram Turbo Image to Image! Fast AI for quick edits & creative remixes.

SegFIT: Segmind Fashion and Immersive Try-on

SegFIT by Segmind is a cutting-edge virtual try-on (VTON) model that enables ultra-realistic clothing visualization on custom fashion models.

Grok 2 Vision

Grok-2, xAI's latest language model with vision understanding.

Grok 2

Grok-2, xAI's latest language model, boasts superior reasoning, coding, and chat capabilities, outperforming many popular LLMs.

Wan_2.1 Text to Video

Create visually impressive videos with varied, lifelike motion from text prompts using Wan2.1.

Wan 2.1 480p image to video

Create high-quality 480p videos with excellent visual quality and a broad spectrum of motion from static images.

Wan 2.1 720p image to video

Create high-quality 720p videos with excellent visual quality and a broad spectrum of motion from static images.

Claude 3.7 Sonnet

Claude 3.7 Sonnet is a large language model (LLM) launched by Anthropic. It is considered state-of-the-art, outperforming previous versions of Claude and competing models in a variety of tasks.

Segmind Beyond: Outpaint with Ease

Effortlessly expand your visuals with AI Image Extend. Intelligently add pixels to any side of your image.

Luma Ray Image to Video

With Luma's Ray2 image-to-video, transform your static images into cinematic dynamic videos.

Minimax AI Director

Minimax video-01-director: Create high-quality videos and precisely control camera movements using text prompts.

Luma Ray Text to Video

Luma Ray2 text-to-video creates realistic, coherent videos from your text prompts.

Imagen 3

Imagen 3 is Google DeepMind's highest quality text-to-image model. Generates detailed images with enhanced lighting, diverse styles, and improved text rendering.

Qwen2 VL 72B Instruct

Qwen2-VL-72B-Instruct is a state-of-the-art multimodal model excelling in image and video understanding, with advanced capabilities for text-based interaction.

Llama 3.2 90B Vision Instruct

Experience the cutting edge of AI with Llama 3.2-90B Vision-Instruct. This 90B parameter multimodal LLM excels at image understanding, reasoning, captioning, and more.

Llama 3.2 11B Vision Instruct

Instruction-tuned image reasoning model from Meta with 11B parameters. Optimized for visual recognition, image reasoning, captioning, and answering general questions about an image. The model can understand visual data such as charts and graphs, and it bridges the gap between vision and language by generating text that describes image details.

Hunyuan3D-2

Hunyuan3D 2.0 enables the creation of high-quality 3D models with intricate details. Produce assets that are visually appealing and suitable for professional use.

AI Face Swap (image and video)

AI Face Swap: Effortlessly replace faces online. Fine-tune swaps with advanced controls for age, gender, and resolution.

QWEN2-VL-7B-Instruct

Qwen2-VL-7B-Instruct is a cutting-edge vision-language model with 7 billion parameters, optimized for both performance and flexibility. It can recognize objects, analyze image content, perform visual localization, act as a visual agent, and generate structured outputs.

DeepSeek Chat

DeepSeek V3 combines cutting-edge AI technology with practical usability. Featuring a 671B parameter architecture, enhanced reasoning capabilities, and lightning-fast processing, it sets new standards for open-source AI models.

DeepSeek R1

DeepSeek-R1 is a cutting-edge AI reasoning model that combines reinforcement learning with supervised fine-tuning. Excels in complex problem-solving, mathematics, and coding tasks.

Minimax (Hailuo) Video-01-live

Create stunning animations with Minimax (Hailuo) video-01-live, an AI image-to-video model perfect for Live2D, anime, and more. Transform static images into dynamic videos with smooth motion, facial control, and style support for diverse use cases like art, character animation, and e-commerce.

Kling AI 1.6 Image to Video

Kling AI 1.6 Image-to-Video is a powerful AI tool that transforms static images into captivating, animated videos. Create high-quality content effortlessly with Kling AI's advanced capabilities.

Kling AI 1.6 Text to Video

Kling AI 1.6 Text-to-Video is a cutting-edge AI tool that transforms text into stunning, lifelike videos. Create professional-quality content effortlessly with Kling AI's advanced capabilities.

Furniture Staging

Furniture Staging

Ideogram Image To Image

Ideogram Image to Image: Transform your images with ease! Enhance, modify, or create entirely new visuals using advanced AI. Perfect for artists, designers, and anyone seeking creative inspiration.

LTX Video

LTX-Video is the first DiT-based video generation model capable of generating high-quality videos in real-time. It produces 24 FPS videos at a 768x512 resolution faster than they can be watched.

MiniMax AI (Hailuo)

With Video-01 by MiniMax, create high-definition videos at 720p resolution and 25fps, featuring cinematic camera movement effects based on text descriptions.

Hunyuan Video

Hunyuan AI Video is a new, state-of-the-art AI video generator that creates high-quality videos from text descriptions. With 13B parameters, it is billed as the most powerful open-source video generation model available.

Omini Control

OminiControl is an innovative framework that optimizes Diffusion Transformer models for versatile image generation tasks.

Luma Photon Flash Text to Image

Luma Photon Flash is a powerful and fast text-to-image model offering high-quality visuals with unmatched speed and precision. Ideal for creatives, it excels in instruction-following, composition, and aesthetic quality, transforming ideas into stunning images.

Luma Photon Text to Image

Luma Photon is a powerful AI-driven text-to-image model offering high-quality visuals with unmatched speed and precision. Ideal for creatives, it excels in instruction-following, composition, and aesthetic quality, transforming ideas into stunning images.

AI Product Photography

Elevate your product imagery with our AI-powered photography model. Create stunning, professional-quality photos that boost engagement and sales. Perfect for e-commerce and digital marketing.

Flux Fill Pro

Professional inpainting and outpainting model with state-of-the-art performance. Edit or extend images with natural, seamless results.

Flux Depth Pro

Professional depth-aware image generation. Edit images while preserving spatial relationships.

Flux Canny Pro

Professional edge-guided image generation. Control structure and composition using Canny edge detection.

Flux Depth Dev

Open-weight depth-aware image generation. Edit images while preserving spatial relationships.

Flux Canny Dev

Open-weight edge-guided image generation. Control structure and composition using Canny edge detection.

Flux Fill Dev

Open-weight inpainting model for editing and extending images. Guidance-distilled from FLUX.1 Fill Pro.

Flux Redux Schnell

Fast, efficient image variation model for rapid iteration and experimentation.

Flux Redux Dev

Open-weight image variation model. Create new versions while preserving key elements of your original.

Flux-1.1 Pro Ultra

Create stunning visuals effortlessly with Flux 1.1 Pro Ultra. Experience unparalleled image quality and speed.

Mochi 1

Mochi 1 is a cutting-edge, open-source AI model that transforms text prompts into high-fidelity videos. It offers high-fidelity motion and strong prompt adherence.

Recraft V3

Recraft V3, the latest iteration of Recraft AI, offers a significant advancement in AI-driven image generation. This state-of-the-art model is designed to produce high-quality, detailed vector graphics, catering to the needs of designers, artists, and content creators alike.

Recraft V3 Svg

Recraft V3 SVG generates high-quality, customizable vector graphics with precision and ease. Perfect for logos, infographics, illustrations, and more.

Stable Diffusion 3.5 Turbo Text to Image

Stable Diffusion 3.5 Turbo offers exceptional customizability, efficient performance on consumer hardware, and diverse image outputs that accurately represent different skin tones and features, all while maintaining high-quality results and strong prompt adherence.

Stable Diffusion 3.5 Large Text to Image

Stable Diffusion 3.5 Large offers exceptional customizability, efficient performance on consumer hardware, and diverse image outputs that accurately represent different skin tones and features, all while maintaining high-quality results and strong prompt adherence.

Faceswap V3

Face Swap V3 is a cutting-edge tool that empowers you to seamlessly swap faces in images. With customizable features and advanced technology, you can achieve professional-quality results.

Video Audio Merge

Effortlessly merge audio and video with our intuitive Video Audio Merge model. Create stunning multimedia content with precise timing, fade effects, and customizable audio options. Perfect for content creators, filmmakers, and marketers.

Runway Gen Alpha Turbo Image to Video

Runway Gen-3 Alpha Turbo is a cutting-edge AI tool that transforms static images into dynamic videos with exceptional fidelity and motion.

Kling AI Image to Video

Kling AI Image-to-Video is a powerful AI tool that transforms static images into captivating, animated videos. Create high-quality content effortlessly with Kling AI's advanced capabilities.

Kling AI Text to Video

Kling AI Text-to-Video is a cutting-edge AI tool that transforms text into stunning, lifelike videos. Create professional-quality content effortlessly with Kling AI's advanced capabilities.

Meta MusicGen Medium

MusicGen: Transform text into music with AI. Create unique, high-quality audio from simple descriptions. Experience the future of music generation with this innovative AI model.

Video Captioner

With Video Captioner, create accurate, customizable subtitles for your videos effortlessly.

Face Detailer

Restore characters' faces to their original glory with Face Detailer. Enhance facial details, eliminate distortion, and upscale images for stunning results.

flux-1.1-pro

Flux 1.1 Pro is a cutting-edge image generation tool offering exceptional speed, quality, and customization. Ideal for digital artists, designers, and content creators.

MyShell Text To Speech

MyShell's Voice Cloning and Text to Speech - Transform your audio content with realistic, personalized voices. Experience high-quality, efficient, and cost-effective audio synthesis.

Video Stitch

Revolutionize your video editing with the Video Stitch Model. Seamlessly stitch clips, add captivating audio, and create professional-looking videos in minutes.

Simple Vector Flux Lora

Flux is a 12-billion-parameter rectified flow transformer capable of generating images from text descriptions.

Ideogram Text To Image

Ideogram Text to Image: Turn your ideas into stunning visuals instantly with this powerful AI tool. Create captivating designs, realistic images, and more. Perfect for artists, designers, and anyone seeking creative inspiration.

Openvoice

OpenVoice is a versatile voice cloning model that supports multiple languages and offers precise tone replication, flexible style control, and zero-shot cross-lingual capabilities.

Cog videoX Image To Video

CogVideoX image-to-video is a cutting-edge AI model that converts static images into dynamic, high-quality videos. Suited to content creation, animation, and education, it offers high-resolution output and efficient inference.

Expression Editor

Expression Editor uses reference images to accurately generate new images with desired expressions. Perfect for digital art, memes, and marketing.

Esrgan Video Upscaler

ESRGAN Video Upscaler: Experience sharper, clearer 4k videos with ESRGAN. This AI-powered video upscaler boosts resolution and reduces artifacts, making your video content look its best. Best Topaz alternative.

Consistent Character With Pose

Create images of a given character in different poses

Consistent Character AI Neolemon V3

Create consistent characters in any pose with AI

Luma Text-to-Video

Luma Video (Text to Video) is an advanced AI model that turns text prompts into captivating videos. Designed for creators and marketers, it offers high-resolution outputs, rapid processing, and cinematic quality, making video production accessible and efficient.

Luma Image-to-Video

With Luma's Dream Machine, transform your static images into dynamic videos. It offers high-fidelity video generation, rapid processing, and cinematic quality, enabling users to enhance their content creation process effortlessly.

Flux Pulid

Flux PuLID: Customize AI-generated images with your unique identity. Seamlessly integrate faces into text-to-image models for realistic and customizable results. High fidelity, tuning-free customization, and versatile editing options.

Flux Ipadapter

Flux IP Adapter is a cutting-edge AI model that lets you create stunning, customized images. With its advanced style adaptation capabilities, Flux IP Adapter lets you seamlessly blend different artistic styles into your creations.

Flux Inpaint

Flux Inpainting is a powerful image editing tool designed to effortlessly edit and enhance your images. It's perfect for tasks like removing unwanted objects, restoring damaged photos, and creating artistic effects.

Flux Controlnets

Flux ControlNets is a collection of models that gives you precise control over image generation. By integrating ControlNet with Flux.1, these models enable you to create highly detailed and customized images with unprecedented accuracy.

OpenAI o1-mini

o1-mini by OpenAI provides high-performance reasoning and coding capabilities. Ideal for developers and businesses seeking advanced AI without the high costs.

OpenAI o1-preview

o1-preview by OpenAI is a powerful AI model that can tackle complex problems with exceptional accuracy and efficiency. Ideal for researchers, developers, and scientists seeking advanced AI capabilities.

Text Overlay

Elevate your visuals with the Text Overlay model. Easily add customized text to any image, perfect for social media, marketing, and blogs. Enjoy precise positioning, advanced styling, and seamless integration.

Cog Video X 5B

CogVideo is a groundbreaking AI model that turns text into high-quality videos. Create realistic scenes, animations, and more with ease. Ideal for content creators, educators, and businesses.

Fast Flux.1 Schnell

Fast Flux.1 Schnell by Segmind is an optimized text-to-image model designed for developers needing faster image generation. It offers high efficiency without compromising quality. Perfect for startups and engineers seeking quick, resource-efficient AI models.

Consistent Character AI Neolemon V2

Create consistent characters in any pose with AI

Flux Realism Lora with Upscale

Flux Realism Lora with upscale, developed by XLabs AI, is a cutting-edge model designed to generate realistic images from textual descriptions.

Sam V2 Image

SAM v2, the next-gen segmentation model from Meta AI, revolutionizes computer vision. Building on SAM's success, it excels at accurately segmenting objects in images, offering robust and efficient solutions for various visual contexts.

Sam V2 Video

SAM v2 Video by Meta AI allows promptable segmentation of objects in videos.

Consistent Character V1

Create images of a given character in different poses

Flux.1 Image To Image

Flux Image-To-Image model by Black Forest Labs is an advanced deep learning tool designed for transforming images based on specific textual prompts.

Easy Animate

Easy Animate is a state-of-the-art image-to-animation model that converts static images into dynamic animations with remarkable accuracy and fluidity.

Flux.1 Dev

Flux Dev is a 12-billion-parameter rectified flow transformer capable of generating images from text descriptions.

Flux.1 Schnell

Flux Schnell is a state-of-the-art text-to-image generation model engineered for speed and efficiency.

Flux .1 Pro

Flux Pro is a state-of-the-art image generation model with top-of-the-line prompt following, visual quality, image detail, and output diversity.

Text Embedding 3 Small

Text-embedding-3-small is a compact and efficient model developed for generating high-quality text embeddings. These embeddings are numerical representations of text data, enabling a variety of natural language processing (NLP) tasks such as semantic search, clustering, and text classification.

Text Embedding 3 Large

Text-embedding-3-large is a robust language model by OpenAI designed for generating high-dimensional text embeddings for a wide range of natural language processing (NLP) tasks including semantic search, text clustering, and classification.
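
As a rough illustration of the semantic search use case mentioned for the two embedding models above, the sketch below ranks documents by cosine similarity to a query. The three-dimensional vectors and document names are made-up placeholders, not real model output; the actual embedding call is omitted, and real embeddings have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vector, document_vectors):
    """Return document names ordered from most to least similar to the query."""
    return sorted(
        document_vectors,
        key=lambda name: cosine_similarity(query_vector, document_vectors[name]),
        reverse=True,
    )


# Hypothetical toy embeddings (assumption: stand-ins for vectors an embedding model would return).
documents = {
    "cat care guide": [0.9, 0.1, 0.0],
    "dog training tips": [0.7, 0.3, 0.1],
    "tax filing checklist": [0.0, 0.2, 0.9],
}
query_embedding = [0.8, 0.2, 0.0]  # placeholder vector for a pet-related query

ranking = rank_by_similarity(query_embedding, documents)
print(ranking)
```

The same similarity score can also feed clustering or a nearest-neighbor classifier; only the ranking step changes.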

Realdream Pony V9

Real Dream Pony V9 is an advanced image generation model based on the Stable Diffusion XL (SDXL) architecture, excelling in photorealism.

AI Product Photo Editor

AI Product Photo Editor leverages advanced image-based ML techniques to generate high-quality product visuals using text prompts, product images, and background images.

RealDream Lightning

RealDream is a sophisticated image generation model built on the SDXL Lightning architecture. It creates highly realistic images from text prompts and is especially good at generating human portraits from descriptive text.

Llama 3.1 405b

Meta developed and released the Meta Llama 3.1 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B, 70B, and 405B sizes. The Llama 3.1 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open-source chat models on common industry benchmarks.

Llama 3.1 70b

Meta developed and released the Meta Llama 3.1 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B, 70B, and 405B sizes. The Llama 3.1 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open-source chat models on common industry benchmarks.

Llama 3.1 8b

Meta developed and released the Meta Llama 3.1 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B, 70B, and 405B sizes. The Llama 3.1 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open-source chat models on common industry benchmarks.

Stable Diffusion 3 Medium Image to Image

Stable Diffusion 3 Medium image-to-image is a cutting-edge AI tool that uses advanced image-to-image technology to transform one image into another.

SD3 Medium Tile Controlnet

SD3 Medium Tile ControlNet is a large generative image model designed for generating detailed images based on textual prompts and tile-based input images.

SD3 Medium Canny Controlnet

Stable Diffusion 3 (SD3) Medium Canny ControlNet uses Canny edge detection to provide fine-grained control over the generated outputs.

SD3 Medium Pose Controlnet

Stable Diffusion 3 (SD3) Pose ControlNet is a large generative image model tailored for generating images based on text prompts while using pose information as guidance.

Motion Control SVD

Motion Control SVD is an innovative deep learning framework that breathes life into static images. By intelligently managing both camera and object motion, it empowers creators to achieve precise animation effects.

Live Portrait video to video

Experience the magic of Live Portrait’s Video-to-Video Model! Transform your static images into dynamic videos seamlessly.

Image Superimpose V2

Superimpose V2 elevates image editing! Seamlessly layer images with background removal, precise positioning, and flexible resizing options. Explore 14 blending modes to create stunning effects.

Video Faceswap

Video Faceswap is a powerful tool for creators, filmmakers, and meme enthusiasts that lets you effortlessly replace faces in videos.

Aura Flow

The largest completely open-source flow-based generation model capable of text-to-image generation.

Live Portrait

Live Portrait animates static images using a reference driving video through an implicit-keypoint-based framework, bringing a portrait to life with realistic expressions and movements. It identifies key points on the face (such as the eyes, nose, and mouth) and manipulates them to create expressions and movements.

ElevenLabs Dubbing

ElevenLabs Dubbing uses AI to translate your audio into multiple languages. Easily create multilingual versions of your content without studios or voice actors for each language.

Claude 3 Haiku

Claude 3 Haiku, the fastest and most cost-effective LLM from Anthropic, delivers instant responses and image analysis. Build interactive AI experiences that mimic human conversation. Perfect for various applications, from research to enterprise.

Claude 3 Opus

Claude 3 Opus is an LLM pushing the limits of language understanding. It excels at complex tasks, generates human-quality text, and remembers vast amounts of information.

Gemini PRO

Gemini 1.5 Pro represents a significant leap in large language model technology, offering exceptional understanding and performance across different modalities and contexts.

Gemini Flash

Gemini 1.5 Flash is a game-changer for developers and enterprises seeking a speedy and cost-effective large language model with exceptional long-context understanding.

Claude 3.5 Sonnet

Claude 3.5 Sonnet represents a significant advancement in AI language models, combining speed, accuracy, and visual reasoning capabilities. It excels at understanding and completing requests thoughtfully, and does so much faster than previous versions. Additionally, it boasts a stronger vision model, allowing it to analyze visual data like charts and images with exceptional accuracy.

Kolors

Kolors is a cutting-edge text-to-image model that bridges language and visual art. Transform your textual ideas into photorealistic images with semantic precision.

Playground V2.5

Playground V2.5 is a diffusion-based text-to-image generative model, designed to create highly aesthetic images based on textual prompts.

Image Superimpose

The Superimpose model lets you create captivating visuals by seamlessly overlaying one image on top of another. It streamlines your image layering process, allowing you to bring your creative vision to life effortlessly.

SDXL Img2Img

SDXL Img2Img is used for text-guided image-to-image translation. This model uses the weights from Stable Diffusion to generate new images from an input image using the StableDiffusionImg2ImgPipeline from diffusers.

SDXL Controlnet

SDXL ControlNet gives unprecedented control over text-to-image generation. SDXL ControlNet models introduce the concept of conditioning inputs, which provide additional information to guide the image generation process.

Story Diffusion

Story Diffusion turns your written narratives into stunning image sequences.

Elevenlabs Sound Generation

Eleven Labs' Sound Generation API provides a robust development tool for programmatically generating audio content using artificial intelligence. This API empowers developers and creators to integrate sound generation functionalities into their applications and workflows.

Elevenlabs Speech To Speech

Eleven Labs Speech-to-Speech offers AI-powered voice conversion for content creators, media professionals, and anyone seeking to modify or translate audio speech.

Elevenlabs Text To Speech

Eleven Labs Text-to-Speech (TTS) harnesses the power of deep learning to create realistic and engaging synthetic speech from written text.

Omni Zero

Omni-Zero: A diffusion pipeline for zero-shot stylized portrait creation.

LLAVA 1.6 7B

LLaVA translates images into text descriptions and captions.

Archived

LLaVA 13B

LLaVA 13B is a vision-language model that accepts both image and text inputs.

Tooncrafter

Create videos from illustrated input images

V Express

V-Express lets you create portrait videos from single images.

Hallo

Hallo lets you create portrait videos from single images.

Relighting

Prompts to auto-magically relight your images.

Automatic Mask Generator

Automatic Mask Generator is a powerful tool that automates the creation of precise masks for inpainting.

Magic Eraser

LaMa Object Removal: AI Magic Eraser

Inpaint Mask Maker

Real-Time Open-Vocabulary Object Detection

Background Eraser

Background Eraser helps in flawless background removal with exceptional accuracy.

Clarity Upscaler

High resolution creative image Upscaler and Enhancer. A free Magnific alternative.

Consistent Character

Create images of a given character in different poses

IDM VTON

Best-in-class virtual clothing try-on in the wild

Stable Diffusion 3 Medium Text to Image

Stable Diffusion is a type of latent diffusion model that can generate images from text. It was created by a team of researchers and engineers from CompVis, Stability AI, and LAION. Stable Diffusion v2 is a specific version of the model architecture. It utilizes a downsampling-factor 8 autoencoder with an 865M UNet and OpenCLIP ViT-H/14 text encoder for the diffusion model. When using the SD 2-v model, it produces 768x768 px images. It uses the penultimate text embeddings from a CLIP ViT-H/14 text encoder to condition the generation process.

Fooocus

Fooocus enables high-quality image generation effortlessly, combining the best of Stable Diffusion and Midjourney.

IPAdapter Style Transfer

Style & Composition Transfer with Stable Diffusion IP Adapter

Profile Photo Style Transfer

Turn any image of a face into artwork using Stable Diffusion Controlnet and IPAdapter

illusion-diffusion-hq

Monster Labs QrCode ControlNet on top of SD Realistic Vision v5.1

PuLID

Novel tuning-free ID customization method for text-to-image generation.

Yamer's Realistic SDXL

The SDXL model is the official upgrade to the v1.5 model. The model is released as open-source software.

GPT 4 turbo

GPT-4 outperforms both previous large language models and, as of 2023, most state-of-the-art systems (which often have benchmark-specific training or hand-engineering). On the MMLU benchmark, an English-language suite of multiple-choice questions covering 57 subjects, GPT-4 not only outperforms existing models by a considerable margin in English but also demonstrates strong performance in other languages. Currently points to gpt-4-turbo-2024-04-09.

GPT 4o

GPT-4o (“o” for “omni”) is OpenAI's most advanced model. It is multimodal (accepting text or image inputs and outputting text), and it has the same high intelligence as GPT-4 Turbo but is much more efficient: it generates text 2x faster and is 50% cheaper. Additionally, GPT-4o has the best vision and performance across non-English languages of any OpenAI model. GPT-4o is available in the OpenAI API to paying customers.

GPT 4

GPT-4 outperforms both previous large language models and, as of 2023, most state-of-the-art systems (which often have benchmark-specific training or hand-engineering). On the MMLU benchmark, an English-language suite of multiple-choice questions covering 57 subjects, GPT-4 not only outperforms existing models by a considerable margin in English but also demonstrates strong performance in other languages.

Mixtral 8x7b

Mistral MoE 8x7B Instruct v0.1 model with Sparse Mixture of Experts. Fine-tuned for instruction following.

Mixtral 8x22b

Mistral MoE 8x22B Instruct v0.1 model with Sparse Mixture of Experts. Fine-tuned for instruction following.

Archived

PuLID Lightning

Faster version of PuLID, a novel tuning-free face customization method for text-to-image generation

Archived

Fashion AI

This model is capable of editing clothing in an image using a premier clothing segmentation algorithm.

test

face-to-many

Turn a face into 3D, emoji, pixel art, video game, claymation or toy

test

face-to-sticker

Turn a face into a sticker

test

Llama 3 8b

Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes. The Llama 3 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open-source chat models on common industry benchmarks.

test

material-transfer

Transfer a material from an image to a subject

test

Llama 3 70b

Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes. The Llama 3 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open-source chat models on common industry benchmarks.

test

Faceswap V2

Take a picture or GIF and replace the face in it with a face of your choice. You only need one image of the desired face. No dataset, no training.

test

Insta Depth

InstantID aims to generate customized images with various poses or styles from only a single reference ID image while ensuring high fidelity

test

Background Removal V2

This model removes the background from any image.

test

NewReality Lightning SDXL

NewReality Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.

test

DreamShaper Lightning SDXL

DreamShaper Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.

test

Colossus Lightning SDXL

Colossus Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.

test

Samaritan Lightning SDXL

Samaritan Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.

test

Realism Lightning SDXL

Realism Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.

test

ProtoVision Lightning SDXL

ProtoVision Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.

test

NightVis Lightning SDXL

NightVis Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.

test

WildCard Lightning SDXL

WildCard Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.

test

Dynavis Lightning SDXL

Dynavis Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.

test

Juggernaut Lightning SDXL

Juggernaut Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.

test

Realvis Lightning SDXL

Realvis Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.

test

Try-On Diffusion

Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on

test

Background Replace

This model is capable of generating photo-realistic images given any text input, with the extra capability of inpainting the pictures by using a mask

test

Fooocus Inpainting

Fooocus Inpainting is a powerful image generation model that allows you to selectively edit and enhance images.

test

Fooocus Outpainting

Fooocus Outpainting transforms ordinary images into extraordinary works of art by seamlessly expanding their boundaries.

test

InstantID

InstantID aims to generate customized images with various poses or styles from only a single reference ID image while ensuring high fidelity

test

Samaritan 3D XL

Samaritan 3D XL leverages the robust capabilities of the SDXL framework, ensuring high-quality, detailed 3D character renderings.

test

Stable Video Diffusion

Takes an image as input and returns a video.

test

Segmind-Vega

The Segmind-Vega model is a distilled version of Stable Diffusion XL (SDXL), offering a 70% reduction in size and a 100% speedup while retaining high-quality text-to-image generation capabilities.

test

Segmind-VegaRT

Segmind-VegaRT is a distilled consistency adapter for Segmind-Vega that reduces the number of inference steps to between 2 and 8.

test

IP-adapter Openpose XL

IP Adapter XL Openpose is built on the SDXL framework. This model integrates the IP Adapter and Openpose preprocessor to offer unparalleled control and guidance in creating context-rich images.

test

IP-adapter Canny XL

IP Adapter XL Canny is built on the SDXL framework. This model integrates the IP Adapter and Canny edge preprocessor to offer unparalleled control and guidance in creating context-rich images.

test

IP-adapter Depth XL

IP Adapter Depth XL is built on the SDXL framework. This model integrates the IP Adapter and Depth preprocessor to offer unparalleled control and guidance in creating context-rich images.

test

SDXL Inpaint

This model is capable of generating photo-realistic images given any text input, with the extra capability of inpainting the pictures by using a mask

test

SSD Img2Img

This model uses SSD-1B to generate images by passing a text prompt and an initial image to condition the generation

test

SDXL-Openpose

This model leverages SDXL to generate the images with ControlNet conditioned on Human Pose Estimation.

test

SSD-Depth

This model leverages SSD-1B to generate the images with ControlNet conditioned on Depth Estimation

test

SSD-Canny

This model leverages SSD-1B to generate the images with ControlNet conditioned on Canny Images

test

SSD-1B

The Segmind Stable Diffusion Model (SSD-1B) is a distilled version of Stable Diffusion XL (SDXL) that is 50% smaller and offers a 60% speedup while maintaining high-quality text-to-image generation capabilities. It has been trained on diverse datasets, including Grit and Midjourney scrape data, to enhance its ability to create a wide range of visual content based on textual prompts.

test

Copax Timeless SDXL

The SDXL model is the official upgrade to the v1.5 model. The model is released as open-source software.

test

Zavychroma SDXL

The SDXL model is the official upgrade to the v1.5 model. The model is released as open-source software.

test

Realvis SDXL

The SDXL model is the official upgrade to the v1.5 model. The model is released as open-source software.

test

Dreamshaper SDXL

The SDXL model is the official upgrade to the v1.5 model. The model is released as open-source software.

Archived
test

Stable Diffusion 2.1

Stable Diffusion is a type of latent diffusion model that can generate images from text. It was created by a team of researchers and engineers from CompVis, Stability AI, and LAION. Stable Diffusion v2 is a specific version of the model architecture. It utilizes a downsampling-factor 8 autoencoder with an 865M UNet and OpenCLIP ViT-H/14 text encoder for the diffusion model. When using the SD 2-v model, it produces 768x768 px images. It uses the penultimate text embeddings from a CLIP ViT-H/14 text encoder to condition the generation process.
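The sizes quoted above can be related with a little arithmetic. The sketch below computes the latent tensor shape that a downsampling-factor 8 autoencoder would produce for a 768x768 image. The function name `latent_shape` is hypothetical, and the default of 4 latent channels is an assumption not stated in the description.

```python
def latent_shape(height: int, width: int, factor: int = 8, channels: int = 4) -> tuple[int, int, int]:
    """Return (channels, latent_height, latent_width) for an image of the given size.

    `factor` is the autoencoder downsampling factor (8 in the description above).
    `channels` is an assumed latent channel count, not taken from the description.
    """
    if height % factor or width % factor:
        raise ValueError("height and width must be multiples of the downsampling factor")
    return (channels, height // factor, width // factor)


# A 768x768 image maps to a 96x96 latent grid under a factor-8 autoencoder.
print(latent_shape(768, 768))  # (4, 96, 96)
```

Under these assumptions the diffusion UNet works on a grid with 64 times fewer spatial positions than the output image.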

Archived
test

Stable Diffusion XL 0.9

The SDXL model is the official upgrade to the v1.5 model. The model is released as open-source software.

test

Word2img

Create beautifully designed words using Segmind’s word to image for your marketing purposes

Archived
test

Segmind Tiny-SD

Convert Text into Images with the latest distilled stable diffusion model

test

Stable Diffusion Inpainting

Stable Diffusion Inpainting is a latent text-to-image diffusion model capable of generating photo-realistic images given any text input, with the extra capability of inpainting the pictures by using a mask

test

Stable Diffusion img2img

This model uses the diffusion-denoising mechanism first proposed by SDEdit, applying Stable Diffusion to text-guided image-to-image translation. It uses the Stable Diffusion weights to generate new images from an input image via StableDiffusionImg2ImgPipeline from diffusers.

Archived
test

Segmind Small-SD

Create realistic portrait images using the fine-tuned Segmind Tiny SD model. With the Segmind Tiny SD (Portrait) serverless APIs, Segmind offers the fastest deployment for Tiny-Stable-Diffusion inference.

test

Stable Diffusion XL 1.0

The SDXL model is the official upgrade to the v1.5 model. The model is released as open-source software

Archived
test

Scifi

The most versatile photorealistic model, blending various models to achieve amazingly realistic images.

Archived
test

Samaritan

The most versatile photorealistic model, blending various models to achieve amazingly realistic images.

Archived
test

RPG

This model corresponds to the Stable Diffusion RPG checkpoint for detailed images at the cost of a super detailed prompt

test

Reliberate

This model corresponds to the Stable Diffusion Reliberate checkpoint for detailed images at the cost of a super detailed prompt

test

Realistic Vision

This model corresponds to the Stable Diffusion Realistic Vision checkpoint for detailed images at the cost of a super detailed prompt

Archived
test

RCNZ - Cartoon

The most versatile photorealistic model, blending various models to achieve amazingly realistic images.

Archived
test

Paragon

This model corresponds to the Stable Diffusion Paragon checkpoint for detailed images at the cost of a super detailed prompt

test

SD Outpainting

Stable Diffusion Outpainting can extend any image in any direction

Archived
test

Manmarumix

The most versatile photorealistic model, blending various models to achieve amazingly realistic images.

Archived
test

Majicmix

The most versatile photorealistic model, blending various models to achieve amazingly realistic images.

test

Juggernaut Final

The most versatile photorealistic model, blending various models to achieve amazingly realistic images.

Archived
test

Fruit Fusion

The most versatile photorealistic model, blending various models to achieve amazingly realistic images.

Archived
test

Flat 2d

The most versatile photorealistic model, blending various models to achieve amazingly realistic images.

Archived
test

Fantassified Icons

The most versatile photorealistic model, blending various models to achieve amazingly realistic images.

test

Epic Realism

This model corresponds to the Stable Diffusion Epic Realism checkpoint for detailed images at the cost of a super detailed prompt

test

Edge of Realism

This model corresponds to the Stable Diffusion Edge of Realism checkpoint for detailed images at the cost of a super detailed prompt

Archived
test

DvArch

The most versatile photorealistic model, blending various models to achieve amazingly realistic images.

Archived
test

Dream Shaper

Dreamshaper excels in delivering high-quality, detailed images. It is fine-tuned to understand and interpret a diverse range of artistic styles and subjects.

Archived
test

Deep Spaced Diffusion

The most versatile photorealistic model, blending various models to achieve amazingly realistic space-themed images.

test

Cyber Realistic

The most versatile photorealistic model, blending various models to achieve amazingly realistic images.

Archived
test

Cute Rich Style

The most versatile photorealistic model, blending various models to achieve amazingly realistic images.

Archived
test

Colorful

This model corresponds to the Stable Diffusion Colorful checkpoint for detailed images at the cost of a super detailed prompt

Archived
test

All in one pixe

The most versatile photorealistic model, blending various models to achieve amazingly realistic images.

Archived
test

526mix

The most versatile photorealistic model, blending various models to achieve amazingly realistic images.

Archived
test

QR Generator

Create beautiful and creative QR codes for your marketing campaigns.

Archived
test

Segmind Tiny-SD (Portrait)

Convert text to images with Small-SD, the distilled Stable Diffusion model by Segmind. With the Segmind Small SD serverless APIs, Segmind offers the fastest deployment for Small-Stable-Diffusion inference.

Archived
test

Kandinsky 2.1

Kandinsky inherits best practices from Dall-E 2 and Latent diffusion, while introducing some new ideas.

test

ControlNet Soft Edge

This model corresponds to the ControlNet conditioned on Soft Edge.

test

ControlNet Scribble

This model corresponds to the ControlNet conditioned on Scribble images.

test

ControlNet Depth

This model corresponds to the ControlNet conditioned on Depth estimation.

test

ControlNet Canny

This model corresponds to the ControlNet conditioned on Canny edges.

test

Codeformer

CodeFormer is a robust face restoration algorithm for old photos or AI-generated faces.

test

Segment Anything Model

The Segment Anything Model (SAM) produces high quality object masks from input prompts such as points or boxes, and it can be used to generate masks for all objects in an image.

test

Faceswap

Take a picture or GIF and replace the face in it with a face of your choice. You only need one image of the desired face. No dataset, no training.

Archived
test

Revanimated

This model corresponds to the Stable Diffusion Revanimated checkpoint for detailed images at the cost of a super detailed prompt

test

Background Removal

This model removes the background from any image.

test

ESRGAN

AI-powered image super-resolution, upscaling, and enhancement that produces stunning, high-quality results.

test

ControlNet Openpose

This model corresponds to the ControlNet conditioned on Human Pose Estimation.