turian / sgmse-speech-enhancement-deverb-replicate

Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation

Run turian/sgmse-speech-enhancement-deverb-replicate with an API

Use one of our client libraries to get started quickly.

Input schema

The fields you can use to run this model with an API. If you don't give a value for a field, its default value will be used.

Field            Type           Default    Description
audio            string         -          Speech audio file
checkpoint       string (enum)  EARS-WHAM  Model checkpoint to use. EARS-WHAM speech enhancement or EARS-Reverb dereverberation. Options: EARS-WHAM, EARS-Reverb
corrector        string (enum)  ald        Corrector class for the PC sampler. Options: ald, langevin, none
corrector_steps  integer        1          Number of corrector steps
snr              number         0.5        SNR value for (annealed) Langevin dynamics.
N                integer        30         Number of reverse steps
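
As a rough illustration of how these fields fit together, the sketch below builds an input dictionary from the defaults and enum options listed above. The helper name build_input and the example audio URL are hypothetical, and passing the audio as a URL string is an assumption; the schema only says the field is a string.

```python
from typing import Any, Dict

# Enum options and defaults copied from the input schema above.
CHECKPOINTS = ("EARS-WHAM", "EARS-Reverb")
CORRECTORS = ("ald", "langevin", "none")
DEFAULTS: Dict[str, Any] = {
    "checkpoint": "EARS-WHAM",
    "corrector": "ald",
    "corrector_steps": 1,
    "snr": 0.5,
    "N": 30,
}


def _is_int(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def build_input(audio: str, **overrides: Any) -> Dict[str, Any]:
    """Hypothetical helper: merge overrides into the defaults and check types."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    payload = {"audio": audio, **DEFAULTS, **overrides}
    if not isinstance(payload["audio"], str):
        raise ValueError("audio must be a string")
    if payload["checkpoint"] not in CHECKPOINTS:
        raise ValueError(f"checkpoint must be one of {CHECKPOINTS}")
    if payload["corrector"] not in CORRECTORS:
        raise ValueError(f"corrector must be one of {CORRECTORS}")
    for name in ("corrector_steps", "N"):
        if not _is_int(payload[name]):
            raise ValueError(f"{name} must be an integer")
    snr = payload["snr"]
    if isinstance(snr, bool) or not isinstance(snr, (int, float)):
        raise ValueError("snr must be a number")
    return payload


# Example: dereverberation with more reverse steps; other fields keep defaults.
# The URL is a placeholder, not a real file.
payload = build_input(
    "https://example.com/reverberant.wav",
    checkpoint="EARS-Reverb",
    N=50,
)
```

The resulting dictionary has the shape of the input object described by the schema. How it is submitted depends on the client library you choose.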

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema
{
  "type": "string",
  "title": "Output",
  "format": "uri"
}
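
Since the output is a single string in URI format, a caller may want to check the value before trying to fetch it. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and the assumption that the URI can be fetched over HTTP(S) is not stated by the schema.

```python
from urllib.parse import urlparse
from urllib.request import urlretrieve


def is_output_uri(value: object) -> bool:
    """Return True if value is a string that parses as a URI with a scheme."""
    if not isinstance(value, str):
        return False
    parsed = urlparse(value)
    # Require a scheme plus either a host or a path component.
    return bool(parsed.scheme) and bool(parsed.netloc or parsed.path)


def save_output(uri: str, destination: str) -> str:
    """Hypothetical helper: download the enhanced audio to a local file."""
    if not is_output_uri(uri):
        raise ValueError(f"not a valid output URI: {uri!r}")
    # Assumes the URI is directly downloadable; not guaranteed by the schema.
    path, _headers = urlretrieve(uri, destination)
    return path
```

Calling save_output with the string returned by the model and a local filename would write the audio to disk, provided the URI is reachable from your machine.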