Open Source · Video Generation · by Shanghai AI Lab & CUHK MMLab

AnimateDiff

AnimateDiff is an open technique that turns text-to-image diffusion models into animators. By plugging a trained motion module into Stable Diffusion, it generates short animated clips from prompts while keeping the model's style and custom fine-tunes.

Tags: ai-animation, animatediff, open-source-ai, stable-diffusion, text-to-video, video-generation
Quick facts
License: Apache 2.0 (open source)
Type: Motion module (text-to-video)
Works with: Stable Diffusion + LoRAs
Runs on: Consumer GPU (self-hosted)

What is AnimateDiff?

AnimateDiff is an open technique for animating text-to-image diffusion models. It is not a standalone model but a plug-in motion module: inserting a trained motion component into an existing Stable Diffusion model turns a still-image generator into one that produces short animated clips from text prompts. Crucially, it works with the model's existing style, fine-tunes and LoRAs, so a custom image model can produce motion in its own aesthetic without being retrained.

How it works

AnimateDiff trains a motion module on video data to learn general motion priors, separate from any specific image model. At generation time, the module is inserted into the layers of a Stable Diffusion model, where it shares information across the frames of a clip (in the published design, through attention along the time axis) so that the frames form a coherent animation rather than a set of independent images. Because the motion module is decoupled from the base model, the same module can animate many different fine-tuned Stable Diffusion checkpoints and LoRAs built on the same base version, preserving their styles while adding movement.
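To make "sharing information across frames" concrete, here is a minimal pure-Python sketch of self-attention along the frame axis for a single spatial position. It is an illustration under simplifying assumptions, not AnimateDiff's actual code: the function names and the `out_scale` parameter are hypothetical, the learned query/key/value projections and positional encodings of a real motion module are omitted, and a real implementation would run on tensors for all positions at once.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_self_attention(frames, out_scale=1.0):
    """Mix features across frames for ONE spatial position.

    frames:    list of F feature vectors (one per frame), each of length C.
    out_scale: hypothetical stand-in for a learned output projection;
               0.0 leaves the base image model's features untouched.
    """
    dim = len(frames[0])
    mixed = []
    for query in frames:
        # Similarity of this frame to every frame at the same position.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in frames
        ]
        weights = softmax(scores)
        # Weighted average of all frames' features.
        attended = [
            sum(w * value[i] for w, value in zip(weights, frames))
            for i in range(dim)
        ]
        # Residual connection: base features plus the temporal update.
        mixed.append([x + out_scale * a for x, a in zip(query, attended)])
    return mixed


# Three frames, two feature channels each (toy values).
frames = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0]]
smoothed = temporal_self_attention(frames)
```

With `out_scale=0.0` the function returns its input unchanged, which illustrates why a decoupled module of this shape can sit inside an image model without disturbing what that model already does per frame.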

What it is good at

AnimateDiff excels at stylised, prompt-driven short animations: bringing illustrations, characters and artistic styles to life, animated loops, GIF-style clips, and creative motion in a specific aesthetic. Its big advantage is compatibility with the huge Stable Diffusion ecosystem: community checkpoints and LoRAs built on a supported base version can be animated as they are. Combined with ControlNet and prompt scheduling, it enables controlled, evolving animations. It is a favourite in open-source AI video tooling like ComfyUI.
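Prompt scheduling can be pictured as a mapping from keyframes to prompts. The sketch below is a hypothetical helper, not the API of any particular tool: it assumes a plain dictionary of `frame index -> prompt` and simply holds each prompt until the next keyframe. Real tools may instead blend the prompt embeddings between keyframes.

```python
def prompt_for_frame(schedule, frame_index):
    """Return the prompt of the latest keyframe at or before frame_index."""
    keyframes = sorted(k for k in schedule if k <= frame_index)
    if not keyframes:
        raise ValueError(f"no keyframe at or before frame {frame_index}")
    return schedule[keyframes[-1]]


# Hypothetical 16-frame schedule with three keyframes.
schedule = {
    0: "a cherry tree in spring",
    8: "a cherry tree in autumn",
    12: "a cherry tree in snow",
}
per_frame = [prompt_for_frame(schedule, i) for i in range(16)]
```

Here frames 0 to 7 use the spring prompt, frames 8 to 11 the autumn prompt, and frames 12 to 15 the snow prompt.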

Licensing & access

AnimateDiff is open source (Apache 2.0), with code and motion modules on GitHub and Hugging Face. It is supported in the Diffusers library and, through community custom nodes, in ComfyUI. It runs on a consumer GPU (more VRAM allows more frames). Since it builds on Stable Diffusion, the base model's licence terms also apply to how you use the outputs. Separate motion-module versions exist for different Stable Diffusion versions (1.5, SDXL), and a module must be paired with the base version it was trained for.

Practical considerations

AnimateDiff produces short clips (a few seconds), and motion can be limited or jittery depending on the prompt, base model and settings, so expect iteration and tuning (motion strength, frame count, schedulers). It pairs best with good Stable Diffusion checkpoints, and combining it with ControlNet improves control. Mind the base model's licence for commercial use, respect copyrights and likeness, and disclose AI-generated video where appropriate.
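The "few seconds" limit follows from simple arithmetic: clip length is frame count divided by playback rate. The helper names below are hypothetical, and 16 frames at 8 fps is only an illustrative setting, not a documented default.

```python
import math


def clip_seconds(num_frames, fps):
    """Playback duration of a clip, in seconds."""
    if num_frames <= 0 or fps <= 0:
        raise ValueError("num_frames and fps must be positive")
    return num_frames / fps


def frames_for(seconds, fps):
    """Smallest frame count that covers the requested duration."""
    if seconds <= 0 or fps <= 0:
        raise ValueError("seconds and fps must be positive")
    return math.ceil(seconds * fps)


duration = clip_seconds(16, 8)   # 2.0 seconds
needed = frames_for(3, 8)        # 24 frames
```

Doubling the duration doubles the frames to generate, which is where VRAM and generation time start to bite.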

How it compares

Stable Video Diffusion does image-to-video (animating a single still) with strong general motion; VideoGPT is an earlier token-based research model. AnimateDiff's distinct strength is text-to-video that taps the entire Stable Diffusion fine-tune/LoRA ecosystem, keeping custom styles. For animating in a specific artistic style from a prompt, AnimateDiff shines; for realistic motion from a photo, SVD fits better — and the two are often combined in advanced pipelines.

Getting started

Use the Diffusers library or ComfyUI: load a Stable Diffusion checkpoint plus an AnimateDiff motion module, write a prompt, and generate a short animated clip you can export as a GIF or video. Start with an SD 1.5 checkpoint and a matching motion module, tune frame count and motion strength, and add LoRAs or ControlNet for style and control — checking the base model's licence before commercial use.
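As a rough sketch of the Diffusers route, the function below loads an SD 1.5 checkpoint with a matching motion adapter and writes a GIF. Treat the specifics as assumptions: the two Hugging Face repository IDs are example choices rather than recommendations from this page, the scheduler settings and step counts are illustrative starting points, and `generate_clip` is a hypothetical wrapper. It needs `torch`, `diffusers` and `accelerate` installed plus a CUDA GPU, and it downloads several gigabytes of weights on first use.

```python
# Example repository IDs (assumptions; swap in your own checkpoint and module).
BASE_MODEL = "emilianJR/epiCRealism"                        # an SD 1.5 fine-tune
MOTION_ADAPTER = "guoyww/animatediff-motion-adapter-v1-5-2"  # SD 1.5 motion module


def generate_clip(prompt, output_path="animation.gif", base_model=BASE_MODEL,
                  motion_adapter=MOTION_ADAPTER, num_frames=16, steps=25, seed=42):
    """Generate a short animated GIF from a text prompt and return its path."""
    # Heavy imports are kept inside the function so the module loads quickly.
    import torch
    from diffusers import AnimateDiffPipeline, DDIMScheduler, MotionAdapter
    from diffusers.utils import export_to_gif

    # Load the motion module and plug it into the image checkpoint.
    adapter = MotionAdapter.from_pretrained(motion_adapter, torch_dtype=torch.float16)
    pipe = AnimateDiffPipeline.from_pretrained(
        base_model, motion_adapter=adapter, torch_dtype=torch.float16
    )
    # Illustrative scheduler settings; tune these for your checkpoint.
    pipe.scheduler = DDIMScheduler.from_pretrained(
        base_model,
        subfolder="scheduler",
        clip_sample=False,
        timestep_spacing="linspace",
        beta_schedule="linear",
        steps_offset=1,
    )
    # Memory savers that help on consumer GPUs.
    pipe.enable_vae_slicing()
    pipe.enable_model_cpu_offload()

    result = pipe(
        prompt=prompt,
        negative_prompt="low quality, blurry",
        num_frames=num_frames,
        guidance_scale=7.5,
        num_inference_steps=steps,
        generator=torch.Generator("cpu").manual_seed(seed),
    )
    frames = result.frames[0]          # list of PIL images for the first clip
    export_to_gif(frames, output_path)
    return output_path
```

Calling `generate_clip("a watercolour fox running through snow")` would write `animation.gif` to the working directory. Changing `seed`, `num_frames` and `steps` is the simplest way to start iterating.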

Capabilities

🔌
Plug-in motion
Inserts a motion module into Stable Diffusion to add animation.
🎨
Keeps style
Animates any fine-tuned checkpoint or LoRA in its own aesthetic.
🎬
Text-to-video
Generates short, coherent animated clips from prompts.
🧭
Controllable
Combines with ControlNet and prompt scheduling for guided motion.

Pros & Cons

Pros (6)
  • Animates any Stable Diffusion checkpoint/LoRA
  • Keeps the base model's style
  • Text-to-video from prompts
  • Pairs with ControlNet for control
  • Open source (Apache 2.0)
  • Runs on consumer GPUs, huge ecosystem
Cons (4)
  • Short clips; motion can be limited/jittery
  • Needs tuning (frames, motion strength)
  • Base model's licence applies to outputs
  • Best results need good SD checkpoints

Inspiration

AnimateDiff use cases & project ideas

Stylised animation

Animate in a custom style.

Animated loops

GIF-style clips.

Character motion

Bring art to life.

Creative clips

Prompt-driven short video.

FAQ

Frequently asked questions

What is AnimateDiff?

An open technique that inserts a trained motion module into Stable Diffusion, turning it into a text-to-video animator that keeps the model's style.