AI Model Documentation

Learn about the AI models and their developers on our platform


49 AI Models · 13 Free · 14 Image · 10 Video · 19 Text · 4 Audio · 2 Post-Prod

Model Developers

🖼️

Free Image Models

Use top AI image models at zero cost. No credits needed.

FLUX.2 Pro · Free
Black Forest Labs
Free · High-quality T2I · Rate limited
flux-2-pro · $0

FLUX.2 Max · Free
Black Forest Labs
Free · Maximum quality · Rate limited
flux-2-max · $0

FLUX.2 Flex · Free
Black Forest Labs
Free · Flexible styles · Rate limited
flux-2-flex · $0

FLUX.2 Klein 4B · Free
Black Forest Labs
Free · Lightweight · Rate limited
flux-2-klein-4b · $0

Seedream 4.5 · Free
ByteDance
Free · Bilingual · Rate limited · May queue
seedream-v4.5 · $0

Free models have rate limits (~10 req/min) and may queue during peak hours. For faster speed and higher reliability, use paid models.
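
Since free models are limited to roughly 10 requests per minute, a client can pace its own calls rather than rely on queueing. The following is a minimal sketch of a sliding-window limiter. The class name `SlidingWindowLimiter`, the 10-per-60-seconds default (taken from the approximate figure above), and the injectable clock are illustrative assumptions, not part of the platform's API.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side pacing: allow at most `max_calls` per `window` seconds."""

    def __init__(self, max_calls=10, window=60.0, clock=time.monotonic):
        # Defaults mirror the approximate "~10 req/min" figure (an assumption).
        self.max_calls = max_calls
        self.window = window
        self.clock = clock        # injectable so the logic can be tested
        self._stamps = deque()    # timestamps of recent calls, oldest first

    def wait_time(self):
        """Return the seconds to wait before another call is allowed."""
        now = self.clock()
        # Forget calls that have already left the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) < self.max_calls:
            return 0.0
        return self.window - (now - self._stamps[0])

    def acquire(self, sleep=time.sleep):
        """Block until a slot is free, then record the call."""
        while True:
            delay = self.wait_time()
            if delay <= 0:
                break
            sleep(delay)
        self._stamps.append(self.clock())
```

Calling `limiter.acquire()` before each request keeps the client under the assumed limit. The actual limit and the server's behavior when it is exceeded are not specified on this page.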
🖼️

Image Generation Models

Generate high-quality images from text descriptions with various styles and resolutions.

Seedream 4.5 · Standard
ByteDance
Latest flagship · Native bilingual · 4K ultra-HD
seedream-v4.5 · $0.40

Seedream 4 · Fast
ByteDance
High-quality image generation · Bilingual
seedream-v4 · $0.40

Dreamina 3.1 · Premium
ByteDance
High-fidelity aesthetics · Artistic style
dreamina-v3.1/text-to-image · $0.60

Qwen Image · Standard
Alibaba
20B parameters · Excellent Chinese text rendering
qwen-image/text-to-image · $0.50

Wan 2.6 Image · Fast
Alibaba
Wan series image model · High resolution
wan-2.6/text-to-image · $0.80

✏️

Image Editing Models

Upload existing images for editing, enhancement, or style transformation.

FLUX Kontext Pro · Premium
Black Forest Labs
Context-aware editing · Best for image & text editing
flux-kontext-pro · $0.80

FLUX Kontext Pro Multi · Premium
Black Forest Labs
Multi-image context editing · Style consistency
flux-kontext-pro/multi · $0.80

UNO · Standard
ByteDance
Universal image editing · Image + text
uno · $0.50

Real-ESRGAN · Fast
Xintao Wang et al.
Image super-resolution · Quality enhancement
real-esrgan · $0.50

🎬

Video Generation Models (Text-to-Video)

Auto-generate short videos from text descriptions. Some models support synchronized audio generation.

Wan 2.2 - 480p Ultra Fast · Fast
Alibaba
Ultra-fast generation · ~5s per video
wan-2.2/t2v-480p-ultra-fast · $0.10

Wan 2.2 - 720p · Standard
Alibaba
High-definition resolution
wan-2.2/t2v-720p · $0.60

Wan 2.6 · Audio · Standard
Alibaba
Latest Wan series · Audio support
wan-2.6/text-to-video · $0.80

Seedance 1.5 Pro · Audio · Premium
ByteDance
Cinematic quality · Audio support
seedance-v1.5-pro/text-to-video · $1.00

Kling Video O3 · Premium
Kuaishou
Best motion quality
kling-video-o3-std/text-to-video · $1.20

Seedance 2.0 · Audio · Premium
ByteDance
Latest · Audio + lock camera · Up to 12s
seedance-2.0/text-to-video · $1.20

🎞️

Video Generation Models (Image-to-Video)

Transform static images into dynamic videos, bringing images to life.

Wan 2.2 i2v - 480p Fast · Fast
Alibaba
Image-to-video · Fast
wan-2.2/i2v-480p-ultra-fast · $0.10

Wan 2.2 i2v - 720p · Standard
Alibaba
Image-to-video · HD
wan-2.2/i2v-720p · $0.60

Seedance 1.5 Pro i2v · Audio · Premium
ByteDance
Image-to-video · Cinematic · Audio
seedance-v1.5-pro/image-to-video · $1.00

Seedance 2.0 i2v · Audio · Premium
ByteDance
Image-to-video · Audio + lock camera · 12s
seedance-2.0/image-to-video · $1.20

🆓

Free Text Models

Use top AI language models at zero cost. No credits needed.

GPT-OSS 120B · Free
OpenAI
Free · 120B open-source · Rate limited
gpt-oss-120b · $0

Nemotron 3 Super · Free
NVIDIA
Free · 543B · Rate limited
nemotron-3-super · $0

Qwen3 Coder 480B · Free
Qwen
Free · 480B coding · Rate limited
qwen3-coder-480b · $0

Llama 3.3 70B · Free
Meta
Free · 70B · Rate limited
llama-3.3-70b-instruct · $0

Gemma 3 27B · Free
Google
Free · 27B · Rate limited
gemma-3-27b-it · $0

Mistral Small 3.1 24B · Free
Mistral
Free · 24B · Rate limited
mistral-small-3.1-24b · $0

DeepSeek V3 · Free
DeepSeek
Free · Great for Chinese · Rate limited
deepseek-chat-v3 · $0

Hermes 3 405B · Free
Nous Research
Free · 405B · Rate limited
hermes-3-llama-3.1-405b · $0

Free models have rate limits (~10 req/min) and may queue during peak hours. For faster speed and higher reliability, use paid models.
📝

Text Generation Models

Multiple leading AI language models for social content creation, rewriting, and optimization.

GPT-4o · Premium
OpenAI
Flagship · Most capable overall
openai/gpt-4o · $12.50/1M in · $50/1M out

GPT-4o Mini · Fast
OpenAI
Lightweight · Cost-effective
openai/gpt-4o-mini · $0.75/1M in · $3/1M out

GPT-5 · Premium
OpenAI
Latest flagship model
openai/gpt-5 · $6.25/1M in · $50/1M out

Claude Sonnet 4 · Premium
Anthropic
Excellent writing quality
anthropic/claude-sonnet-4 · $15/1M in · $75/1M out

Claude 3.5 Haiku · Fast
Anthropic
Fast · Cost-efficient
anthropic/claude-3.5-haiku · $4/1M in · $20/1M out

Gemini 2.5 Flash · Fast
Google
Ultra-fast · Low cost
google/gemini-2.5-flash · $1.50/1M in · $12.50/1M out

Gemini 2.5 Pro · Premium
Google
High performance reasoning
google/gemini-2.5-pro · $6.25/1M in · $50/1M out

Grok 3 · Premium
xAI
Real-time aware
xai/grok-3 · $15/1M in · $75/1M out

Grok 3 Mini · Fast
xAI
Lightweight and fast
xai/grok-3-mini · $1.50/1M in · $2.50/1M out

Mistral Small · Fast
Mistral
Efficient European model
mistral/mistral-small · $0.50/1M in · $1.50/1M out

Mistral Medium · Standard
Mistral
Balanced performance
mistral/mistral-medium · $2/1M in · $10/1M out
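
Paid text models are priced separately for input and output, so the cost of a request depends on both counts. The sketch below estimates that cost for a few of the listed models. It assumes "1M in" and "1M out" mean one million input and output tokens, which the page does not state explicitly. The names `PRICES_PER_MILLION` and `estimate_cost` are hypothetical.

```python
from decimal import Decimal

# (input price, output price) per 1M units, copied from the listing above.
PRICES_PER_MILLION = {
    "openai/gpt-4o": (Decimal("12.50"), Decimal("50")),
    "openai/gpt-4o-mini": (Decimal("0.75"), Decimal("3")),
    "google/gemini-2.5-flash": (Decimal("1.50"), Decimal("12.50")),
    "mistral/mistral-small": (Decimal("0.50"), Decimal("1.50")),
}

MILLION = Decimal(1_000_000)


def estimate_cost(model_id, input_tokens, output_tokens):
    """Estimate the dollar cost of one request.

    Assumes prices are per million tokens and billed linearly.
    Raises KeyError for a model that is not in the table.
    """
    price_in, price_out = PRICES_PER_MILLION[model_id]
    return (price_in * input_tokens + price_out * output_tokens) / MILLION


# A 2,000-token prompt with a 500-token reply on GPT-4o Mini:
cost = estimate_cost("openai/gpt-4o-mini", 2000, 500)  # Decimal("0.003")
```

`Decimal` is used instead of `float` so that small per-request amounts add up without rounding drift.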
🎙️

Voice Synthesis Models

Convert text to natural speech with multiple voice options and speed control.

TTS-1 · Standard
OpenAI
High-quality TTS · 6 voices (alloy, echo, fable, onyx, nova, shimmer)
openai/tts-1

Available voices: Alloy · Echo · Fable · Onyx · Nova · Shimmer
🎵

Background Music Generation Models

Automatically generate background music synchronized to video content and text descriptions. No extra assets are needed.

MMAudio V2 · Standard
Cheng et al.
Video-to-audio · Multimodal sync · BGM generation
mmaudio-v2
🗣️

Video Narration Models

AI automatically analyzes video content and generates voiced narration. This feature uses two models in tandem: Gemini 2.5 Flash analyzes the video frames, then TTS-1 converts the generated script to speech.

Gemini 2.5 Flash · Analysis · Fast
Google
Video analysis · Auto-generate narration
google/gemini-2.5-flash

TTS-1 · Synthesis · Standard
OpenAI
Narration synthesis · 6 voices
openai/tts-1

Narration styles: Professional · Casual · Dramatic · Documentary · Enthusiastic
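
The two-model flow described above (analysis first, then speech synthesis) can be sketched as a small pipeline. This is only an illustration of the ordering. The function `narrate_video`, the `Narration` type, and the `analyze` and `synthesize` callables are hypothetical stand-ins for whatever calls reach `google/gemini-2.5-flash` and `openai/tts-1`; the platform's real request format is not documented on this page.

```python
from dataclasses import dataclass
from typing import Callable

# Style and voice names are taken from the lists on this page.
NARRATION_STYLES = {"professional", "casual", "dramatic", "documentary", "enthusiastic"}
VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}


@dataclass
class Narration:
    script: str   # text produced by the analysis stage
    audio: bytes  # speech produced by the synthesis stage


def narrate_video(
    video_path: str,
    style: str,
    voice: str,
    analyze: Callable[[str, str], str],       # hypothetical: (video, style) -> script
    synthesize: Callable[[str, str], bytes],  # hypothetical: (script, voice) -> audio
) -> Narration:
    """Run analysis, then synthesis, validating the options first."""
    if style not in NARRATION_STYLES:
        raise ValueError(f"unknown narration style: {style!r}")
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    script = analyze(video_path, style)   # stage 1: video frames -> script
    audio = synthesize(script, voice)     # stage 2: script -> speech
    return Narration(script=script, audio=audio)
```

Passing the two stages in as callables keeps the ordering logic separate from the network calls, so it can be exercised with stubs.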
🎨

Post-Production Models

Video post-processing tools — object tracking, content replacement, and natural language editing.

SAM2 Video · Standard
Meta
Video object tracking · Click to track · Content replacement
meta/sam-2-video · ~$0.04/run

Wan 2.7 VideoEdit · Premium
Alibaba
Natural language video editing · AI smart modification
wan-2.7-videoedit · ~$0.50/run

Model Tier Guide

Free: Zero cost with rate limits. May queue during peak hours.
Fast: Fastest generation, lowest cost. Ideal for quick iteration.
Standard: Best balance of speed and quality. Recommended for most uses.
Premium: Highest quality. Best for professional and important content.
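
As a rough illustration of using tiers to choose a model, the sketch below filters the text-to-video listing by tier and picks the lowest listed price. The `Model` type and both helper functions are hypothetical, and the listing does not state the billing unit for these prices, so they are treated here as plain comparable numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    model_id: str
    tier: str
    price: float  # listed price in dollars; billing unit not stated on the page


# Text-to-video entries copied from the listing on this page.
TEXT_TO_VIDEO = [
    Model("wan-2.2/t2v-480p-ultra-fast", "Fast", 0.10),
    Model("wan-2.2/t2v-720p", "Standard", 0.60),
    Model("wan-2.6/text-to-video", "Standard", 0.80),
    Model("seedance-v1.5-pro/text-to-video", "Premium", 1.00),
    Model("kling-video-o3-std/text-to-video", "Premium", 1.20),
    Model("seedance-2.0/text-to-video", "Premium", 1.20),
]


def models_in_tier(catalog, tier):
    """Return the catalog entries that carry the given tier label."""
    return [m for m in catalog if m.tier == tier]


def cheapest_in_tier(catalog, tier):
    """Return the lowest-priced entry in a tier, or raise if the tier is empty."""
    candidates = models_in_tier(catalog, tier)
    if not candidates:
        raise ValueError(f"no models in tier {tier!r}")
    return min(candidates, key=lambda m: m.price)
```

For quick iteration one might start with `cheapest_in_tier(TEXT_TO_VIDEO, "Fast")` and move to a Standard or Premium entry for final output.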
