bytedance/dreamactor-m2.0

Animate any character (humans, cartoons, animals, even non-human creatures) from a single image and a driving video

DreamActor M2.0

Animate any character from a single image. Give it a video of someone moving, and DreamActor M2.0 will make your character do the same thing.

What it does

DreamActor M2.0 takes a still image of any character and brings it to life by copying movements from a driving video. Unlike earlier animation models that only work well with humans, this model handles humans, cartoons, animals, and pretty much any character you throw at it.

The model captures everything from subtle facial expressions to complex full-body movements. It works whether you’re animating a close-up portrait or a full-body shot, and it handles interactions with objects naturally.

How it works

The model treats animation as a learning problem. Instead of trying to extract skeletons or poses from your driving video (which breaks down for non-human characters), it learns to understand motion directly from the raw video pixels. This means it can animate characters that don’t have human-like bodies, like cartoon cats or fantasy creatures.

DreamActor M2.0 combines the appearance details from your reference image with motion patterns from the driving video in a unified way. This approach preserves the identity and look of your character while accurately copying the movements you want.

What makes it different

Most character animation models struggle with a trade-off: they either preserve the character’s identity well but lose motion accuracy, or they copy motion perfectly but the character stops looking like itself. DreamActor M2.0 addresses this by rethinking how motion information gets injected into the generation process.

The model also doesn’t depend on pose estimation systems that only work for humans. This makes it genuinely universal: you can animate realistic humans, stylized drawings, cartoon characters, and animals with the same model.

Good for

  • Animating portrait photos with facial expressions and head movements
  • Bringing illustrated characters to life
  • Transferring dance moves or gestures to any character
  • Creating videos where characters interact with objects
  • Animating animals or non-humanoid characters
  • Converting between different shot types (portrait to full-body, and vice versa)

Tips for best results

The model works best when your reference image shows the character clearly. Both portrait crops and full-body shots work well; the model adapts to different scales.

For the driving video, the clearer the motion, the better. The model can handle complex movements including subtle facial expressions, but cleaner inputs generally produce cleaner outputs.

If you’re animating multiple characters at once, make sure your reference image and driving video show the same number of subjects.
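
Light preprocessing of the inputs can also help. The sketch below is one optional way to do it locally with Pillow and ffmpeg; the file names are placeholders, and none of this is required by the model. It just saves a clean copy of the reference image and trims the driving video to the segment with the clearest motion.

```python
# Optional input-preparation sketch. Assumes Pillow and a local ffmpeg install;
# all file names below are placeholders.
import subprocess
from PIL import Image

def prepare_reference(path: str, out_path: str = "reference_prepped.png") -> str:
    # Save a clean RGB copy of the reference image without changing your framing.
    Image.open(path).convert("RGB").save(out_path)
    return out_path

def trim_driving_video(path: str, start: str, duration: str,
                       out_path: str = "driving_trimmed.mp4") -> str:
    # Cut the driving video down to the segment with the clearest motion
    # and drop the audio track to keep the upload small.
    subprocess.run(
        ["ffmpeg", "-y", "-ss", start, "-i", path, "-t", duration,
         "-c:v", "libx264", "-an", out_path],
        check=True,
    )
    return out_path

if __name__ == "__main__":
    ref = prepare_reference("my_character.png")
    drv = trim_driving_video("dance_clip.mp4", start="00:00:02", duration="6")
    print("prepared:", ref, drv)
```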

Technical details

DreamActor M2.0 uses a two-stage training approach. First, it learns from videos with skeleton-based motion guidance. Then, it transitions to learning directly from raw video through a self-bootstrapped data synthesis process. This progression lets it generalize beyond what skeleton estimators can capture.

The model maintains temporal consistency across frames, which means your animated videos won’t have flickering or jitter. It also preserves appearance details even in complex movements where parts of the character might be temporarily hidden.

For more technical information, see the original paper.

Try it out

You can try DreamActor M2.0 on the Replicate Playground at replicate.com/playground.
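
If you’d rather call the model from code, the Replicate Python client works as well. The snippet below is a sketch: `replicate.run` is the client’s real entry point, but the input field names shown here (`image` and `video`) are assumptions, so check the model’s API schema on its Replicate page for the actual parameter names.

```python
# Sketch of calling the model via the Replicate Python client.
# Requires `pip install replicate` and a REPLICATE_API_TOKEN environment variable.
# The input keys ("image", "video") are assumed; check the model's API schema
# on Replicate for the real parameter names.
import replicate

with open("reference_prepped.png", "rb") as image, open("driving_trimmed.mp4", "rb") as video:
    output = replicate.run(
        "bytedance/dreamactor-m2.0",
        input={
            "image": image,  # reference image of the character (assumed key)
            "video": video,  # driving video to copy motion from (assumed key)
        },
    )

print(output)  # the generated video, typically returned as a URL or file output
```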
