acappemin / deepaudio-v1

DeepAudio-V1: Towards Multi-Modal Multi-Stage End-to-End Video to Speech and Audio Generation

  • Public
  • 38 runs
  • L40S
  • GitHub
  • Weights
  • Paper

Input

file

Input Video

string

Video-to-Audio Text Prompt

Default: ""

integer

Video-to-Audio Num Steps

Default: 25

string

Video-to-Speech Transcription

Default: ""

file

Video-to-Speech Speech Prompt

string

Video-to-Speech Speech Prompt Transcription

Default: ""

integer

Video-to-Speech Num Steps

Default: 32
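
The inputs listed above can be collected into a single payload before a run. The following is a minimal sketch, not the model's documented schema: the key names (`video`, `v2a_prompt`, `v2a_num_steps`, `v2s_transcription`, `v2s_speech_prompt`, `v2s_speech_prompt_transcription`, `v2s_num_steps`) and the helper `build_inputs` are hypothetical, chosen to mirror the field labels on this page. Only the defaults (empty strings, 25 and 32 steps) come from the page; the check that step counts are at least 1 is an added assumption.

```python
from typing import Any, Dict, Optional

# Defaults copied from the input list on this page.
DEFAULT_V2A_NUM_STEPS = 25
DEFAULT_V2S_NUM_STEPS = 32


def build_inputs(
    video: str,
    v2a_prompt: str = "",
    v2a_num_steps: int = DEFAULT_V2A_NUM_STEPS,
    v2s_transcription: str = "",
    v2s_speech_prompt: Optional[str] = None,
    v2s_speech_prompt_transcription: str = "",
    v2s_num_steps: int = DEFAULT_V2S_NUM_STEPS,
) -> Dict[str, Any]:
    """Assemble an input payload; all key names are hypothetical."""
    if not video:
        raise ValueError("an input video path or URL is required")

    # Assumption: a step count below 1 is never meaningful.
    for name, steps in (
        ("v2a_num_steps", v2a_num_steps),
        ("v2s_num_steps", v2s_num_steps),
    ):
        if not isinstance(steps, int) or steps < 1:
            raise ValueError(f"{name} must be a positive integer")

    payload: Dict[str, Any] = {
        "video": video,
        "v2a_prompt": v2a_prompt,
        "v2a_num_steps": v2a_num_steps,
        "v2s_transcription": v2s_transcription,
        "v2s_speech_prompt_transcription": v2s_speech_prompt_transcription,
        "v2s_num_steps": v2s_num_steps,
    }
    # The speech prompt is a file input; include it only when supplied.
    if v2s_speech_prompt is not None:
        payload["v2s_speech_prompt"] = v2s_speech_prompt
    return payload


if __name__ == "__main__":
    print(build_inputs("clip.mp4", v2a_prompt="rain on a tin roof"))
```

A dictionary of this shape could be passed as the `input` argument when calling the model through an API client, once the key names have been checked against the model's actual schema.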

Output

Run time and cost

This model runs on Nvidia L40S GPU hardware. We don't yet have enough runs of this model to provide performance information.

Readme

This model doesn't have a readme.