Example input:
{
  "ref_audio_file": "https://files.catbox.moe/be6df3.wav",
  "ref_audio_transcript": "We actually haven't managed to meet demand.",
  "text": "Hey there, this is a test."
}
Install Replicate's Node.js client library:
npm install replicate
Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=r8_KCY**********************************
This is your API token. Keep it to yourself.
Import and set up the client:
import Replicate from "replicate";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});
Run camb-ai/mars5-tts using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
const output = await replicate.run(
  "camb-ai/mars5-tts:7f80156013d82f86f3bcdf95eb9cebef346d83af75c0f50e9e901e3fcd8f7152",
  {
    input: {
      ref_audio_file: "https://files.catbox.moe/be6df3.wav",
      ref_audio_transcript: "We actually haven't managed to meet demand.",
      text: "Hey there, this is a test."
    }
  }
);
console.log(output);
To learn more, take a look at the guide on getting started with Node.js.
Install Replicate's Python client library:
pip install replicate
Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=r8_KCY**********************************
This is your API token. Keep it to yourself.
Import the client:
import replicate
Run camb-ai/mars5-tts using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
output = replicate.run(
    "camb-ai/mars5-tts:7f80156013d82f86f3bcdf95eb9cebef346d83af75c0f50e9e901e3fcd8f7152",
    input={
        "ref_audio_file": "https://files.catbox.moe/be6df3.wav",
        "ref_audio_transcript": "We actually haven't managed to meet demand.",
        "text": "Hey there, this is a test."
    }
)
print(output)
To learn more, take a look at the guide on getting started with Python.
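The example prediction at the end of this page returns its output as a plain URL pointing at the generated WAV file. Here is a minimal sketch for saving that audio locally using only the standard library; it assumes the output is that URL string (adapt if your client version wraps outputs in a file object), and cloned_speech.wav is an arbitrary filename chosen for this example.
import urllib.request

import replicate

# Same call as above. In the example prediction shown later on this page,
# the output is a plain URL string pointing at the generated WAV file.
output = replicate.run(
    "camb-ai/mars5-tts:7f80156013d82f86f3bcdf95eb9cebef346d83af75c0f50e9e901e3fcd8f7152",
    input={
        "ref_audio_file": "https://files.catbox.moe/be6df3.wav",
        "ref_audio_transcript": "We actually haven't managed to meet demand.",
        "text": "Hey there, this is a test."
    }
)

# Download the generated audio. If your version of the client returns a file
# object instead of a URL string, adapt this step accordingly.
urllib.request.urlretrieve(output, "cloned_speech.wav")
print("Saved cloned_speech.wav")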
To call the HTTP API directly with cURL, set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=r8_KCY**********************************
This is your API token. Keep it to yourself.
Run camb-ai/mars5-tts using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
curl -s -X POST \
-H "Authorization: Bearer $REPLICATE_API_TOKEN" \
-H "Content-Type: application/json" \
-H "Prefer: wait" \
-d $'{
  "version": "camb-ai/mars5-tts:7f80156013d82f86f3bcdf95eb9cebef346d83af75c0f50e9e901e3fcd8f7152",
  "input": {
    "ref_audio_file": "https://files.catbox.moe/be6df3.wav",
    "ref_audio_transcript": "We actually haven\'t managed to meet demand.",
    "text": "Hey there, this is a test."
  }
}' \
https://api.replicate.com/v1/predictions
To learn more, take a look at Replicate’s HTTP API reference docs.
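For comparison, here is a sketch of the same request made with Python's standard library only. It mirrors the cURL command above; the endpoint, headers, and JSON body are taken from that example, and nothing beyond the standard library is assumed.
import json
import os
import urllib.request

# Build the same JSON body as the cURL example above.
body = json.dumps({
    "version": "camb-ai/mars5-tts:7f80156013d82f86f3bcdf95eb9cebef346d83af75c0f50e9e901e3fcd8f7152",
    "input": {
        "ref_audio_file": "https://files.catbox.moe/be6df3.wav",
        "ref_audio_transcript": "We actually haven't managed to meet demand.",
        "text": "Hey there, this is a test."
    }
}).encode("utf-8")

# POST to the predictions endpoint. "Prefer: wait" asks the API to hold the
# connection open until the prediction finishes (or the wait times out),
# instead of returning a still-pending prediction immediately.
request = urllib.request.Request(
    "https://api.replicate.com/v1/predictions",
    data=body,
    method="POST",
    headers={
        "Authorization": f"Bearer {os.environ['REPLICATE_API_TOKEN']}",
        "Content-Type": "application/json",
        "Prefer": "wait"
    }
)

with urllib.request.urlopen(request) as response:
    prediction = json.load(response)

print(prediction["status"], prediction.get("output"))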
Example output:
https://files.catbox.moe/ulk1ao.wav
The full prediction object for this run:
{
"id": "c6vhp0rby5rgc0cfzr4t0ae304",
"model": "camb-ai/mars5-tts",
"version": "7f80156013d82f86f3bcdf95eb9cebef346d83af75c0f50e9e901e3fcd8f7152",
"input": {
"ref_audio_file": "https://files.catbox.moe/be6df3.wav",
"ref_audio_transcript": "We actually haven't managed to meet demand.",
"text": "Hey there, this is a test."
},
"logs": ">>>> Ref Audio file: /tmp/tmpdb0qyxi0be6df3.wav; ref_transcript: We actually haven't managed to meet demand.\n>>> Running inference\nNote: using deep clone. Assuming input `c_phones` is concatenated prompt and output phones. Also assuming no padded indices in `c_codes`.\nNew x: torch.Size([1, 990, 8]) | new x_known: torch.Size([1, 990, 8]) . Base prompt: torch.Size([1, 215, 8]). New padding mask: torch.Size([1, 990]) | m shape: torch.Size([1, 990, 8])\n>>>>> Done with inference\n[16384/717228] bytes |> |\n[32768/717228] bytes |> |\n[49152/717228] bytes |=> |\n[65536/717228] bytes |=> |\n[81920/717228] bytes |==> |\n[98304/717228] bytes |==> |\n[114688/717228] bytes |===> |\n[131072/717228] bytes |===> |\n[147456/717228] bytes |====> |\n[163840/717228] bytes |====> |\n[180224/717228] bytes |=====> |\n[196608/717228] bytes |=====> |\n[212992/717228] bytes |=====> |\n[229376/717228] bytes |======> |\n[245760/717228] bytes |======> |\n[262144/717228] bytes |=======> |\n[278528/717228] bytes |=======> |\n[294912/717228] bytes |========> |\n[311296/717228] bytes |========> |\n[327680/717228] bytes |=========> |\n[344064/717228] bytes |=========> |\n[360448/717228] bytes |==========> |\n[376832/717228] bytes |==========> |\n[393216/717228] bytes |==========> |\n[409600/717228] bytes |===========> |\n[425984/717228] bytes |===========> |\n[442368/717228] bytes |============> |\n[458752/717228] bytes |============> |\n[475136/717228] bytes |=============> |\n[491520/717228] bytes |=============> |\n[507904/717228] bytes |==============> |\n[524288/717228] bytes |==============> |\n[540672/717228] bytes |===============> |\n[557056/717228] bytes |===============> |\n[573440/717228] bytes |===============> |\n[589824/717228] bytes |================> |\n[606208/717228] bytes |================> |\n[622592/717228] bytes |=================> |\n[638976/717228] bytes |=================> |\n[655360/717228] bytes |==================> |\n[671744/717228] bytes |==================> |\n[688128/717228] bytes |===================> |\n[704512/717228] bytes |===================> |\n[717228/717228] bytes |====================>|\n[717228/717228] bytes |====================>|>>>> Output file url: https://files.catbox.moe/ulk1ao.wav",
"output": "https://files.catbox.moe/ulk1ao.wav",
"data_removed": false,
"error": null,
"source": "web",
"status": "succeeded",
"created_at": "2024-06-09T17:25:56.849Z",
"started_at": "2024-06-09T17:27:54.730683Z",
"completed_at": "2024-06-09T17:29:47.287469Z",
"urls": {
"cancel": "https://api.replicate.com/v1/predictions/c6vhp0rby5rgc0cfzr4t0ae304/cancel",
"get": "https://api.replicate.com/v1/predictions/c6vhp0rby5rgc0cfzr4t0ae304",
"web": "https://replicate.com/p/c6vhp0rby5rgc0cfzr4t0ae304"
},
"metrics": {
"predict_time": 112.556786,
"total_time": 230.438469
}
}
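If a prediction has not finished by the time the initial response arrives (for example, when the Prefer: wait window elapses and the status is still "starting" or "processing"), the urls.get field above is the endpoint to poll. Here is a minimal polling sketch; it assumes REPLICATE_API_TOKEN is set in the environment, and wait_for_prediction is a helper name invented for this example.
import json
import os
import time
import urllib.request

def wait_for_prediction(get_url, token, interval=2.0):
    """Poll a prediction's `urls.get` endpoint until it reaches a terminal state."""
    while True:
        request = urllib.request.Request(
            get_url,
            headers={"Authorization": f"Bearer {token}"}
        )
        with urllib.request.urlopen(request) as response:
            prediction = json.load(response)
        # "succeeded", "failed", and "canceled" are the terminal statuses.
        if prediction["status"] in ("succeeded", "failed", "canceled"):
            return prediction
        time.sleep(interval)

# The URL below is the `urls.get` value from the example prediction above.
final = wait_for_prediction(
    "https://api.replicate.com/v1/predictions/c6vhp0rby5rgc0cfzr4t0ae304",
    os.environ["REPLICATE_API_TOKEN"]
)
print(final["status"], final.get("output"))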