skytells-research/lipsync

Enhanced lip-syncing of audio to video

Public

712 runs

Run skytells-research/lipsync with an API

Use one of our client libraries to get started quickly. The Playground tab lets you tweak different inputs, see the results, and copy the corresponding code to use in your own project.

Input schema

The fields you can use to run this model with an API. If you don't give a value for a field, its default value is used. The face and audio fields are required and have no default.

Field	Type	Default value	Description
face	string		Video or image that contains the face(s) to use
audio	string		Video or audio file to use as the raw audio source
pads	string	0 10 0 0	Padding for the detected face bounding box. Adjust it so the box includes at least the chin. Format: "top bottom left right"
smooth	boolean	true	Smooth face detections over a short temporal window
fps	number	25	Frames per second. Can be specified only if the input is a static image
resize_factor	integer	1	Reduce the resolution by this factor. Best results are sometimes obtained at 480p or 720p
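
As a rough illustration of how the defaults in the table apply, the sketch below assembles an input payload in Python. The helper names build_input and format_pads and the example.com URLs are hypothetical, not part of any client library; the field names and default values come from the table. Treating the four pads values as integers is an assumption based on the default "0 10 0 0".

```python
# Defaults copied from the input table; face and audio have none.
DEFAULTS = {
    "pads": "0 10 0 0",
    "smooth": True,
    "fps": 25,
    "resize_factor": 1,
}


def format_pads(top, bottom, left, right):
    """Build the pads string in the documented "top bottom left right" order."""
    values = (top, bottom, left, right)
    # Assumption: each value is an integer, as in the default "0 10 0 0".
    if any(isinstance(v, bool) or not isinstance(v, int) for v in values):
        raise ValueError("pads values are assumed to be integers")
    return " ".join(str(v) for v in values)


def build_input(face, audio, **overrides):
    """Hypothetical helper: merge caller overrides with the documented defaults."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    payload = {"face": face, "audio": audio, **DEFAULTS}
    payload.update(overrides)
    return payload


# Placeholder URLs; extend the bottom padding to cover the chin.
payload = build_input(
    "https://example.com/speaker.mp4",
    "https://example.com/speech.wav",
    pads=format_pads(0, 20, 0, 0),
    resize_factor=2,
)
print(payload)
```

Fields that are not overridden (here smooth and fps) keep their documented defaults, which matches what the API is described as doing when a field is omitted.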

{
  "type": "object",
  "title": "Input",
  "required": [
    "face",
    "audio"
  ],
  "properties": {
    "fps": {
      "type": "number",
      "title": "Fps",
      "default": 25,
      "x-order": 4,
      "description": "Can be specified only if input is a static image"
    },
    "face": {
      "type": "string",
      "title": "Face",
      "format": "uri",
      "x-order": 0,
      "description": "video/image that contains faces to use"
    },
    "pads": {
      "type": "string",
      "title": "Pads",
      "default": "0 10 0 0",
      "x-order": 2,
      "description": "Padding for the detected face bounding box.\nPlease adjust to include chin at least\nFormat: \"top bottom left right\""
    },
    "audio": {
      "type": "string",
      "title": "Audio",
      "format": "uri",
      "x-order": 1,
      "description": "video/audio file to use as raw audio source"
    },
    "smooth": {
      "type": "boolean",
      "title": "Smooth",
      "default": true,
      "x-order": 3,
      "description": "Smooth face detections over a short temporal window"
    },
    "resize_factor": {
      "type": "integer",
      "title": "Resize Factor",
      "default": 1,
      "x-order": 5,
      "description": "Reduce the resolution by this factor. Sometimes, best results are obtained at 480p or 720p"
    }
  }
}
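
A payload can be checked against this schema before it is sent. The sketch below is a minimal, hand-written check in Python that covers only the "required" list and the "type" keywords shown above. It is not a full JSON Schema validator, and the names SCHEMA, matches_type, and check_input are illustrative. Rejecting unknown fields is an added assumption, since the schema does not forbid extra properties.

```python
# Subset of the input schema: only the keys this check reads.
SCHEMA = {
    "required": ["face", "audio"],
    "properties": {
        "face": {"type": "string"},
        "audio": {"type": "string"},
        "pads": {"type": "string"},
        "smooth": {"type": "boolean"},
        "fps": {"type": "number"},
        "resize_factor": {"type": "integer"},
    },
}


def matches_type(value, json_type):
    """Map the four JSON types used by the schema onto Python types."""
    if json_type == "string":
        return isinstance(value, str)
    if json_type == "boolean":
        return isinstance(value, bool)
    # bool is a subclass of int in Python, so exclude it from numeric types.
    if isinstance(value, bool):
        return False
    if json_type == "integer":
        # Simplification: a float such as 2.0 is rejected here.
        return isinstance(value, int)
    if json_type == "number":
        return isinstance(value, (int, float))
    raise ValueError(f"unsupported type: {json_type}")


def check_input(payload, schema=SCHEMA):
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    for name in schema["required"]:
        if name not in payload:
            errors.append(f"missing required field: {name}")
    for name, value in payload.items():
        spec = schema["properties"].get(name)
        if spec is None:
            # Assumption: treat fields outside the schema as mistakes.
            errors.append(f"unknown field: {name}")
        elif not matches_type(value, spec["type"]):
            errors.append(f"{name}: expected {spec['type']}")
    return errors


print(check_input({"face": "https://example.com/a.mp4"}))
```

A dedicated JSON Schema library would also check the "format": "uri" annotations; this sketch leaves them unchecked.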

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema

{
  "type": "string",
  "title": "Output",
  "format": "uri"
}
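
Since the output is a single URI string, a caller will typically want to fetch the file it points to. The Python sketch below uses only the standard library. The helper names, the example URL, and the fallback name "output.mp4" are assumptions; the schema says only that the output is a URI and does not state the file type.

```python
import posixpath
import urllib.request
from urllib.parse import urlparse


def output_filename(output_uri, fallback="output.mp4"):
    """Derive a local file name from the URI path, ignoring any query string."""
    path = urlparse(output_uri).path
    # Assumption: fall back to a generic name when the path has no file part.
    return posixpath.basename(path) or fallback


def download_output(output_uri, dest=None):
    """Fetch the file behind the output URI and return the local path."""
    dest = dest or output_filename(output_uri)
    urllib.request.urlretrieve(output_uri, dest)
    return dest


# Hypothetical output value; no request is made by this line.
print(output_filename("https://example.com/files/result.mp4?token=abc"))
```

Only download_output touches the network, so the file name logic can be tested on its own.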