prunaai/p-video

Fast video generation with built-in draft mode for rapid creative iteration. Text-to-video, image-to-video, and audio-to-video in a single endpoint.



P-Video

P-Video is Pruna AI’s video generation model built for speed and creative iteration. It generates a 5-second 720p video in about 10 seconds, and includes a draft mode that’s 4× faster for quick previews before committing to a full render.

Features

  • All-in-one endpoint — text-to-video, image-to-video, and audio-to-video
  • Draft mode — 4× faster previews for rapid iteration
  • Built-in audio generation — native dialogue and sound, plus custom audio import
  • Up to 1080p at 48 FPS
  • Multi-aspect ratio support — 16:9, 9:16, 4:3, 3:4, 3:2, 2:3, 1:1
  • Prompt upsampling — automatic prompt enhancement with full user control

Pricing

Resolution   Draft OFF    Draft ON
720p         $0.02/sec    $0.005/sec
1080p        $0.04/sec    $0.01/sec
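
To make the rates above concrete, here is a minimal cost-estimation sketch. It assumes the per-second rates apply to seconds of output video, which the table does not state explicitly; `RATES` and `estimate_cost` are hypothetical names used only for illustration.

```python
from decimal import Decimal

# USD per second, copied from the pricing table.
# Keys are (resolution, draft_enabled).
RATES = {
    ("720p", False): Decimal("0.02"),
    ("720p", True): Decimal("0.005"),
    ("1080p", False): Decimal("0.04"),
    ("1080p", True): Decimal("0.01"),
}


def estimate_cost(seconds, resolution="720p", draft=False):
    """Estimate the cost in USD of one generation (hypothetical helper).

    Assumption: billing is per second of output video.
    """
    try:
        rate = RATES[(resolution, bool(draft))]
    except KeyError:
        raise ValueError(f"unsupported resolution: {resolution!r}") from None
    # Decimal avoids floating-point rounding noise in currency arithmetic.
    return rate * Decimal(str(seconds))


print(estimate_cost(5))                # 5 s at 720p, full render
print(estimate_cost(5, draft=True))    # same clip in draft mode
print(estimate_cost(10, "1080p"))      # 10 s at 1080p, full render
```

Under that assumption, a 5-second 720p clip comes to $0.10 as a full render and $0.025 in draft mode, so four draft previews cost about the same as one final render.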

Inputs

  • prompt (required) — text description of the video you want to generate
  • image — input image for image-to-video generation (jpg, jpeg, png, webp)
  • audio — input audio to condition video generation (flac, mp3, wav)
  • duration — video length in seconds, 1–10 (default: 5). Ignored when audio is provided
  • aspect_ratio — 16:9, 9:16, 4:3, 3:4, 3:2, 2:3, or 1:1 (default: 16:9). Ignored when an input image is provided
  • resolution — 720p or 1080p (default: 720p)
  • fps — 24 or 48 frames per second (default: 24)
  • draft — enable draft mode for faster, lower-quality previews (default: false)
  • prompt_upsampling — enhance the prompt automatically (default: true)
  • seed — set for reproducible generation
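
The inputs listed above can be assembled and checked before a request is sent. The sketch below is illustrative only: `build_input` is a hypothetical helper, the validation rules are taken from the list above, and leaving out `duration` or `aspect_ratio` when they would be ignored is a design choice of this example, not a requirement of the model.

```python
ASPECT_RATIOS = {"16:9", "9:16", "4:3", "3:4", "3:2", "2:3", "1:1"}
RESOLUTIONS = {"720p", "1080p"}
FPS_OPTIONS = {24, 48}


def build_input(prompt, *, image=None, audio=None, duration=5,
                aspect_ratio="16:9", resolution="720p", fps=24,
                draft=False, prompt_upsampling=True, seed=None):
    """Build an input dict for P-Video (hypothetical helper)."""
    if not prompt:
        raise ValueError("prompt is required")
    if not 1 <= duration <= 10:
        raise ValueError("duration must be between 1 and 10 seconds")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect_ratio: {aspect_ratio!r}")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if fps not in FPS_OPTIONS:
        raise ValueError(f"unsupported fps: {fps!r}")

    payload = {
        "prompt": prompt,
        "resolution": resolution,
        "fps": fps,
        "draft": draft,
        "prompt_upsampling": prompt_upsampling,
    }
    if image is not None:
        payload["image"] = image  # aspect_ratio is ignored with an input image
    else:
        payload["aspect_ratio"] = aspect_ratio
    if audio is not None:
        payload["audio"] = audio  # duration is ignored with input audio
    else:
        payload["duration"] = duration
    if seed is not None:
        payload["seed"] = seed
    return payload


# Image-to-video example; the URL is a placeholder, not a real asset.
payload = build_input(
    "slow rotation of the product on a white table",
    image="https://example.com/product.png",
    draft=True,
    seed=42,
)
print(payload)

# Assumption: with the Replicate Python client installed and an API token
# configured, the dict could be sent as
#   replicate.run("prunaai/p-video", input=payload)
```

Validating locally catches out-of-range values such as `fps=30` or `duration=12` before a paid request is made.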

What it’s good at

  • Talking avatars and lip sync — strong input-image consistency with reliable lip synchronization and native dialogue generation
  • Close-up subjects — particularly strong with foreground objects and close-up shots
  • Product animation — turn static product images into animated videos
  • Social ads and short-form content — fast iteration with multi-resolution output
  • Music videos — combine your own audio with generated visuals
  • Animating low-resolution assets — effective at bringing low-res images to life

Tips

  • Use draft mode for iteration. Start with draft mode on to quickly explore different prompts and compositions, then switch it off for the final render.
  • Vertical formats may work better at 1080p and 48 FPS.
  • Try different resolutions and FPS settings. Output quality can vary depending on the combination of resolution, FPS, and input framing.
  • Light prompt refinement helps. As with any generative model, a short round of experimentation with your prompts gives better results.
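
The draft-first workflow and the resolution/FPS experimentation described in these tips can be sketched as a small sweep. `draft_sweep` and `finalize` are hypothetical helpers that only build input dicts; whether a fixed `seed` keeps the composition consistent between a draft and its final render is an assumption, not something stated here.

```python
from itertools import product


def draft_sweep(prompt, resolutions=("720p", "1080p"),
                fps_options=(24, 48), seed=None):
    """Yield one draft-mode input dict per resolution/FPS combination."""
    for resolution, fps in product(resolutions, fps_options):
        payload = {
            "prompt": prompt,
            "resolution": resolution,
            "fps": fps,
            "draft": True,  # cheap, fast preview
        }
        if seed is not None:
            payload["seed"] = seed
        yield payload


def finalize(draft_payload):
    """Return a copy of the chosen draft input with draft mode switched off."""
    return {**draft_payload, "draft": False}


# Explore all four combinations in draft mode for a vertical-style prompt.
candidates = list(draft_sweep("close-up of a barista pouring latte art", seed=7))
for candidate in candidates:
    print(candidate)

# After reviewing the previews, re-render the preferred one at full quality.
final = finalize(candidates[-1])
print(final)
```

Each dict would be submitted as a separate generation; only the last step runs with draft mode off, so most of the exploration happens at the draft rate.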

Limitations

  • Not designed for extreme cinematic camera motion or complex multi-scene storytelling
  • No native 4K output
  • Sound effects (SFX) performance is limited; for premium voice realism or advanced sound design, dedicated audio providers can deliver higher fidelity, and their output can be passed to P-Video as the audio input
  • With more than two speakers, speaker separation can degrade
  • Speaker attribution drift can occur (e.g., one voice delivering multiple lines)