thlz998 / chat-tts

text: chat T T S 是一款强大的对话式文本转语音模型。它有中英混读和多说话人的能力。 chat T T S 不仅能够生成自然流畅的语音，还能控制[laugh]笑声啊[laugh]，停顿啊[uv_break]语气词啊等副语言现象[uv_break]。这个韵律超越了许多开源模型[uv_break]。请注意，chat T T S 的使用应遵守法律和伦理准则，避免滥用的安全风险。[uv_break]
top_k: 20
top_p: 0.7
voice: 2222
prompt: ""
skip_refine: 0
temperature: 0.3
custom_voice: 0

{
  "text": "chat T T S 是一款强大的对话式文本转语音模型。它有中英混读和多说话人的能力。\nchat T T S 不仅能够生成自然流畅的语音，还能控制[laugh]笑声啊[laugh]，\n停顿啊[uv_break]语气词啊等副语言现象[uv_break]。这个韵律超越了许多开源模型[uv_break]。\n请注意，chat T T S 的使用应遵守法律和伦理准则，避免滥用的安全风险。[uv_break]",
  "top_k": 20,
  "top_p": 0.7,
  "voice": 2222,
  "prompt": "",
  "skip_refine": 0,
  "temperature": 0.3,
  "custom_voice": 0
}
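The `text` input carries bracketed control tokens such as `[laugh]` and `[uv_break]` inline with the words to be spoken. As a rough sketch, the tokens can be counted or stripped with a regular expression. The helper names are hypothetical, and the token pattern (lowercase letters, digits, and underscores in square brackets) is an assumption inferred from this example, not taken from the model's schema.

```python
import re
from collections import Counter

# Assumption: control tokens look like [laugh] or [uv_break].
TOKEN_RE = re.compile(r"\[([a-z_0-9]+)\]")


def count_control_tokens(text: str) -> Counter:
    """Count each bracketed control token found in the text."""
    return Counter(TOKEN_RE.findall(text))


def strip_control_tokens(text: str) -> str:
    """Return the text with all bracketed control tokens removed."""
    return TOKEN_RE.sub("", text)


# A fragment of the input shown above.
sample = "还能控制[laugh]笑声啊[laugh]，停顿啊[uv_break]语气词啊等副语言现象[uv_break]。"
print(count_control_tokens(sample))
print(strip_control_tokens(sample))
```

Stripping the tokens gives the plain sentence, which can be handy for logging or for displaying the text next to the generated audio.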

Install Replicate’s Node.js client library:

npm install replicate

Import and set up the client:

import Replicate from "replicate";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});

Run thlz998/chat-tts using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

const output = await replicate.run(
  "thlz998/chat-tts:864cbf22ea816a82f3da143049cd5d4c94b95b58567c98536165a44c74b540d1",
  {
    input: {
      text: "chat T T S 是一款强大的对话式文本转语音模型。它有中英混读和多说话人的能力。\nchat T T S 不仅能够生成自然流畅的语音，还能控制[laugh]笑声啊[laugh]，\n停顿啊[uv_break]语气词啊等副语言现象[uv_break]。这个韵律超越了许多开源模型[uv_break]。\n请注意，chat T T S 的使用应遵守法律和伦理准则，避免滥用的安全风险。[uv_break]",
      top_k: 20,
      top_p: 0.7,
      voice: 2222,
      prompt: "",
      skip_refine: 0,
      temperature: 0.3,
      custom_voice: 0
    }
  }
);

console.log(output);

To learn more, take a look at the guide on getting started with Node.js.

Install Replicate’s Python client library:

pip install replicate

Import the client:

import replicate

Run thlz998/chat-tts using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

output = replicate.run(
    "thlz998/chat-tts:864cbf22ea816a82f3da143049cd5d4c94b95b58567c98536165a44c74b540d1",
    input={
        "text": "chat T T S 是一款强大的对话式文本转语音模型。它有中英混读和多说话人的能力。\nchat T T S 不仅能够生成自然流畅的语音，还能控制[laugh]笑声啊[laugh]，\n停顿啊[uv_break]语气词啊等副语言现象[uv_break]。这个韵律超越了许多开源模型[uv_break]。\n请注意，chat T T S 的使用应遵守法律和伦理准则，避免滥用的安全风险。[uv_break]",
        "top_k": 20,
        "top_p": 0.7,
        "voice": 2222,
        "prompt": "",
        "skip_refine": 0,
        "temperature": 0.3,
        "custom_voice": 0
    }
)
print(output)

To learn more, take a look at the guide on getting started with Python.

Run thlz998/chat-tts using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "thlz998/chat-tts:864cbf22ea816a82f3da143049cd5d4c94b95b58567c98536165a44c74b540d1",
    "input": {
      "text": "chat T T S 是一款强大的对话式文本转语音模型。它有中英混读和多说话人的能力。\\nchat T T S 不仅能够生成自然流畅的语音，还能控制[laugh]笑声啊[laugh]，\\n停顿啊[uv_break]语气词啊等副语言现象[uv_break]。这个韵律超越了许多开源模型[uv_break]。\\n请注意，chat T T S 的使用应遵守法律和伦理准则，避免滥用的安全风险。[uv_break]",
      "top_k": 20,
      "top_p": 0.7,
      "voice": 2222,
      "prompt": "",
      "skip_refine": 0,
      "temperature": 0.3,
      "custom_voice": 0
    }
  }' \
  https://api.replicate.com/v1/predictions

To learn more, take a look at Replicate’s HTTP API reference docs.
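The request above sends `Prefer: wait`, which asks the API to hold the connection open until the prediction finishes. The prediction on this page ran for more than three minutes, so it seems prudent to assume the response can come back while the status is still `starting` or `processing` (the wait appears to be capped at roughly a minute, but treat that as an assumption). Below is a minimal polling sketch against the `urls.get` endpoint that appears in the prediction JSON. `wait_for` and `fetch_prediction` are hypothetical names, and the demo swaps in a fake fetcher so it runs without a token or network access.

```python
import json
import os
import time
import urllib.request
from typing import Callable

# Statuses after which a prediction no longer changes.
TERMINAL = {"succeeded", "failed", "canceled"}


def fetch_prediction(url: str) -> dict:
    """GET a prediction; needs REPLICATE_API_TOKEN and network access."""
    req = urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {os.environ['REPLICATE_API_TOKEN']}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_for(
    prediction: dict,
    fetch: Callable[[str], dict] = fetch_prediction,
    interval: float = 5.0,
    timeout: float = 600.0,
) -> dict:
    """Poll the prediction's `urls.get` endpoint until it reaches a terminal status."""
    deadline = time.monotonic() + timeout
    while prediction["status"] not in TERMINAL:
        if time.monotonic() > deadline:
            raise TimeoutError("prediction did not finish in time")
        time.sleep(interval)
        prediction = fetch(prediction["urls"]["get"])
    return prediction


# Offline demo: a fake fetcher that reports "processing" once, then "succeeded".
_url = "https://api.replicate.com/v1/predictions/g2ks2d56edrgg0cfv0kbm9t9m8"
_responses = iter([
    {"status": "processing", "urls": {"get": _url}},
    {"status": "succeeded", "urls": {"get": _url}},
])
demo = wait_for(
    {"status": "starting", "urls": {"get": _url}},
    fetch=lambda url: next(_responses),
    interval=0,
)
print(demo["status"])
```

Passing the fetcher as a parameter keeps the polling loop testable; in real use the default `fetch_prediction` would be left in place.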

Output

{ "audio_files": [ { "filename": "https://storage.googleapis.com/replicate-files/xzMIhsaUUf1mRiSOpfXefBqRn5m689wtCaKiPeePnJuI0lnuE/20240602-08_53_33-113d70262afdac3689889619d465b2cd.wav", "audio_duration": 29.46, "inference_time": 195.29 } ] }
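The output is a JSON object whose `audio_files` list holds one entry per generated file, with the file URL under `filename`. A small sketch for pulling out the URLs and relating inference time to audio length follows. The `download` helper is hypothetical, assumes the URL is still reachable, and is defined but not called here because it needs network access.

```python
import urllib.request
from pathlib import Path
from urllib.parse import urlparse

# The output object shown above.
output = {
    "audio_files": [
        {
            "filename": "https://storage.googleapis.com/replicate-files/xzMIhsaUUf1mRiSOpfXefBqRn5m689wtCaKiPeePnJuI0lnuE/20240602-08_53_33-113d70262afdac3689889619d465b2cd.wav",
            "audio_duration": 29.46,
            "inference_time": 195.29,
        }
    ]
}


def audio_urls(output: dict) -> list[str]:
    """Collect the file URLs from the `audio_files` list."""
    return [item["filename"] for item in output.get("audio_files", [])]


def download(url: str, dest_dir: str = ".") -> Path:
    """Save one file locally, named after the last path segment of the URL."""
    dest = Path(dest_dir) / Path(urlparse(url).path).name
    urllib.request.urlretrieve(url, dest)
    return dest


for item in output["audio_files"]:
    # Seconds of compute spent per second of audio produced.
    rtf = item["inference_time"] / item["audio_duration"]
    print(Path(urlparse(item["filename"]).path).name, round(rtf, 2))
```

A ratio above 1 means the run took longer than the audio it produced, which is the case for this prediction.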

{
  "completed_at": "2024-06-02T08:56:50.190305Z",
  "created_at": "2024-06-02T08:52:42.739000Z",
  "data_removed": false,
  "error": null,
  "id": "g2ks2d56edrgg0cfv0kbm9t9m8",
  "input": {
    "text": "chat T T S 是一款强大的对话式文本转语音模型。它有中英混读和多说话人的能力。\nchat T T S 不仅能够生成自然流畅的语音，还能控制[laugh]笑声啊[laugh]，\n停顿啊[uv_break]语气词啊等副语言现象[uv_break]。这个韵律超越了许多开源模型[uv_break]。\n请注意，chat T T S 的使用应遵守法律和伦理准则，避免滥用的安全风险。[uv_break]",
    "top_k": 20,
    "top_p": 0.7,
    "voice": 2222,
    "prompt": "",
    "skip_refine": 0,
    "temperature": 0.3,
    "custom_voice": 0
  },
  "logs": "voice=2222,custom_voice=0\nstart_time=1717318413.4744363\nINFO:ChatTTS.core:All initialized.\n  0%|          | 0/384 [00:00<?, ?it/s]\n  0%|          | 1/384 [00:38<4:07:14, 38.73s/it]\n  1%|          | 2/384 [01:18<4:10:01, 39.27s/it]\n  1%|          | 3/384 [01:56<4:06:43, 38.85s/it]\n  3%|▎         | 10/384 [02:33<1:10:23, 11.29s/it]\n  5%|▌         | 21/384 [02:33<24:04,  3.98s/it]  \n  9%|▊         | 33/384 [02:33<11:36,  1.98s/it]\n11%|█         | 43/384 [02:33<20:20,  3.58s/it]\n  0%|          | 0/2048 [00:00<?, ?it/s]\n  0%|          | 2/2048 [00:37<10:38:51, 18.74s/it]\n  1%|          | 13/2048 [00:37<1:12:01,  2.12s/it]\n  1%|          | 24/2048 [00:37<31:46,  1.06it/s]  \n  2%|▏         | 35/2048 [00:37<17:42,  1.89it/s]\n  2%|▏         | 46/2048 [00:37<10:53,  3.06it/s]\n  3%|▎         | 57/2048 [00:37<07:04,  4.69it/s]\n  3%|▎         | 68/2048 [00:38<04:46,  6.92it/s]\n  4%|▍         | 79/2048 [00:38<03:17,  9.95it/s]\n  4%|▍         | 90/2048 [00:38<02:20, 13.98it/s]\n  5%|▍         | 101/2048 [00:38<01:41, 19.20it/s]\n  5%|▌         | 112/2048 [00:38<01:15, 25.72it/s]\n  6%|▌         | 123/2048 [00:38<00:57, 33.54it/s]\n  7%|▋         | 134/2048 [00:38<00:45, 42.47it/s]\n  7%|▋         | 145/2048 [00:38<00:36, 52.03it/s]\n  8%|▊         | 156/2048 [00:38<00:30, 61.83it/s]\n  8%|▊         | 168/2048 [00:39<00:26, 71.96it/s]\n  9%|▊         | 179/2048 [00:39<00:23, 80.05it/s]\n  9%|▉         | 190/2048 [00:39<00:21, 86.88it/s]\n 10%|▉         | 202/2048 [00:39<00:19, 93.18it/s]\n 10%|█         | 213/2048 [00:39<00:18, 97.53it/s]\n 11%|█         | 224/2048 [00:39<00:18, 100.63it/s]\n 11%|█▏        | 235/2048 [00:39<00:17, 103.13it/s]\n 12%|█▏        | 246/2048 [00:39<00:17, 105.03it/s]\n 13%|█▎        | 258/2048 [00:39<00:16, 106.52it/s]\n 13%|█▎        | 269/2048 [00:39<00:16, 107.17it/s]\n 14%|█▎        | 280/2048 [00:40<00:16, 107.31it/s]\n 14%|█▍        | 291/2048 [00:40<00:16, 107.40it/s]\n 15%|█▍        | 302/2048 [00:40<00:16, 107.45it/s]\n 15%|█▌        | 313/2048 [00:40<00:16, 107.73it/s]\n 16%|█▌        | 324/2048 [00:40<00:15, 107.77it/s]\n 16%|█▋        | 335/2048 [00:40<00:15, 108.04it/s]\n 17%|█▋        | 346/2048 [00:40<00:15, 107.96it/s]\n 17%|█▋        | 357/2048 [00:40<00:15, 108.20it/s]\n 18%|█▊        | 368/2048 [00:40<00:15, 108.13it/s]\n18%|█▊        | 369/2048 [00:40<03:05,  9.03it/s]\n/root/.pyenv/versions/3.10.12/lib/python3.10/site-packages/torch/nn/modules/conv.py:306: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:919.)\nreturn F.conv1d(input, weight, bias, self.stride,\n推理时长: 195.29 秒\n音频时长: 29.46 秒",
  "metrics": {
    "predict_time": 196.737931,
    "total_time": 247.451305
  },
  "output": {
    "audio_files": [
      {
        "filename": "https://storage.googleapis.com/replicate-files/xzMIhsaUUf1mRiSOpfXefBqRn5m689wtCaKiPeePnJuI0lnuE/20240602-08_53_33-113d70262afdac3689889619d465b2cd.wav",
        "audio_duration": 29.46,
        "inference_time": 195.29
      }
    ]
  },
  "started_at": "2024-06-02T08:53:33.452374Z",
  "status": "succeeded",
  "urls": {
    "get": "https://api.replicate.com/v1/predictions/g2ks2d56edrgg0cfv0kbm9t9m8",
    "cancel": "https://api.replicate.com/v1/predictions/g2ks2d56edrgg0cfv0kbm9t9m8/cancel"
  },
  "version": "864cbf22ea816a82f3da143049cd5d4c94b95b58567c98536165a44c74b540d1"
}
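The two `metrics` values can be reproduced from the timestamps in the prediction JSON: `predict_time` is the span from `started_at` to `completed_at`, and `total_time` is the span from `created_at` to `completed_at`. The gap between `created_at` and `started_at` is time spent before the model began running. This is a worked check of that arithmetic, not an official definition of the fields.

```python
from datetime import datetime


def parse_ts(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing 'Z' (UTC)."""
    # Older Python versions reject the 'Z' suffix, so swap in an explicit offset.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


created = parse_ts("2024-06-02T08:52:42.739000Z")
started = parse_ts("2024-06-02T08:53:33.452374Z")
completed = parse_ts("2024-06-02T08:56:50.190305Z")

queue_time = (started - created).total_seconds()      # waiting before the run
predict_time = (completed - started).total_seconds()  # compare with metrics.predict_time
total_time = (completed - created).total_seconds()    # compare with metrics.total_time

print(f"queue:   {queue_time:.6f} s")
print(f"predict: {predict_time:.6f} s")
print(f"total:   {total_time:.6f} s")
```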

Generated in 3 minutes 17 seconds


voice=2222,custom_voice=0
start_time=1717318413.4744363
INFO:ChatTTS.core:All initialized.
  0%|          | 0/384 [00:00<?, ?it/s]
  0%|          | 1/384 [00:38<4:07:14, 38.73s/it]
  1%|          | 2/384 [01:18<4:10:01, 39.27s/it]
  1%|          | 3/384 [01:56<4:06:43, 38.85s/it]
  3%|▎         | 10/384 [02:33<1:10:23, 11.29s/it]
  5%|▌         | 21/384 [02:33<24:04,  3.98s/it]  
  9%|▊         | 33/384 [02:33<11:36,  1.98s/it]
11%|█         | 43/384 [02:33<20:20,  3.58s/it]
  0%|          | 0/2048 [00:00<?, ?it/s]
  0%|          | 2/2048 [00:37<10:38:51, 18.74s/it]
  1%|          | 13/2048 [00:37<1:12:01,  2.12s/it]
  1%|          | 24/2048 [00:37<31:46,  1.06it/s]  
  2%|▏         | 35/2048 [00:37<17:42,  1.89it/s]
  2%|▏         | 46/2048 [00:37<10:53,  3.06it/s]
  3%|▎         | 57/2048 [00:37<07:04,  4.69it/s]
  3%|▎         | 68/2048 [00:38<04:46,  6.92it/s]
  4%|▍         | 79/2048 [00:38<03:17,  9.95it/s]
  4%|▍         | 90/2048 [00:38<02:20, 13.98it/s]
  5%|▍         | 101/2048 [00:38<01:41, 19.20it/s]
  5%|▌         | 112/2048 [00:38<01:15, 25.72it/s]
  6%|▌         | 123/2048 [00:38<00:57, 33.54it/s]
  7%|▋         | 134/2048 [00:38<00:45, 42.47it/s]
  7%|▋         | 145/2048 [00:38<00:36, 52.03it/s]
  8%|▊         | 156/2048 [00:38<00:30, 61.83it/s]
  8%|▊         | 168/2048 [00:39<00:26, 71.96it/s]
  9%|▊         | 179/2048 [00:39<00:23, 80.05it/s]
  9%|▉         | 190/2048 [00:39<00:21, 86.88it/s]
 10%|▉         | 202/2048 [00:39<00:19, 93.18it/s]
 10%|█         | 213/2048 [00:39<00:18, 97.53it/s]
 11%|█         | 224/2048 [00:39<00:18, 100.63it/s]
 11%|█▏        | 235/2048 [00:39<00:17, 103.13it/s]
 12%|█▏        | 246/2048 [00:39<00:17, 105.03it/s]
 13%|█▎        | 258/2048 [00:39<00:16, 106.52it/s]
 13%|█▎        | 269/2048 [00:39<00:16, 107.17it/s]
 14%|█▎        | 280/2048 [00:40<00:16, 107.31it/s]
 14%|█▍        | 291/2048 [00:40<00:16, 107.40it/s]
 15%|█▍        | 302/2048 [00:40<00:16, 107.45it/s]
 15%|█▌        | 313/2048 [00:40<00:16, 107.73it/s]
 16%|█▌        | 324/2048 [00:40<00:15, 107.77it/s]
 16%|█▋        | 335/2048 [00:40<00:15, 108.04it/s]
 17%|█▋        | 346/2048 [00:40<00:15, 107.96it/s]
 17%|█▋        | 357/2048 [00:40<00:15, 108.20it/s]
 18%|█▊        | 368/2048 [00:40<00:15, 108.13it/s]
18%|█▊        | 369/2048 [00:40<03:05,  9.03it/s]
/root/.pyenv/versions/3.10.12/lib/python3.10/site-packages/torch/nn/modules/conv.py:306: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:919.)
return F.conv1d(input, weight, bias, self.stride,
推理时长: 195.29 秒
音频时长: 29.46 秒
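The last two log lines are written in Chinese: 推理时长 means inference time, 音频时长 means audio duration, and 秒 means seconds. A small sketch for reading them back into numbers follows. The English key names are chosen to match the `inference_time` and `audio_duration` fields in the output, and the pattern assumes the log format stays exactly as shown here.

```python
import re

# Map the Chinese labels in the log to the field names used in the output JSON.
LABELS = {"推理时长": "inference_time", "音频时长": "audio_duration"}
SUMMARY_RE = re.compile(r"^(推理时长|音频时长):\s*([\d.]+)\s*秒$", re.MULTILINE)


def parse_summary(logs: str) -> dict:
    """Extract the inference time and audio duration (in seconds) from the logs."""
    return {LABELS[label]: float(value) for label, value in SUMMARY_RE.findall(logs)}


log_tail = "return F.conv1d(input, weight, bias, self.stride,\n推理时长: 195.29 秒\n音频时长: 29.46 秒"
summary = parse_summary(log_tail)
print(summary)
```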

Prediction

thlz998/chat-tts:864cbf22ea816a82f3da143049cd5d4c94b95b58567c98536165a44c74b540d1

Model: thlz998/chat-tts:864cbf22
ID: pz5wsbkr5drgc0cfv2f9pkfrv4
Status: Succeeded
Source: Web
Hardware: A100 (40GB)
Total duration: 5m 31s
Created: 12 months ago
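The total duration of 5m 31s lines up with the `total_time` metric of 330.999195 seconds reported in this prediction's JSON. A short sketch of that conversion follows. Rounding to the nearest second is an assumption about how the page formats the value, and `format_duration` is a hypothetical helper.

```python
def format_duration(seconds: float) -> str:
    """Format a duration in seconds as 'Xm Ys', rounded to the nearest second."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes}m {secs}s"


# metrics.total_time for this prediction
total_duration = format_duration(330.999195)
print(total_duration)
```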

Input

text: chat T T S is a text to speech model designed for dialogue applications. [uv_break]it supports mixed language input [uv_break]and offers multi speaker capabilities with precise control over prosodic elements [laugh]like like [uv_break]laughter[laugh], [uv_break]pauses, [uv_break]and intonation. [uv_break]it delivers natural and expressive speech,[uv_break]so please [uv_break] use the project responsibly at your own risk.[uv_break]
top_k: 20
top_p: 0.7
voice: 2222
prompt: ""
skip_refine: 0
temperature: 0.3
custom_voice: 0

{
  "text": "chat T T S is a text to speech model designed for dialogue applications. \n[uv_break]it supports mixed language input [uv_break]and offers multi speaker \ncapabilities with precise control over prosodic elements [laugh]like like \n[uv_break]laughter[laugh], [uv_break]pauses, [uv_break]and intonation. \n[uv_break]it delivers natural and expressive speech,[uv_break]so please\n[uv_break] use the project responsibly at your own risk.[uv_break]",
  "top_k": 20,
  "top_p": 0.7,
  "voice": 2222,
  "prompt": "",
  "skip_refine": 0,
  "temperature": 0.3,
  "custom_voice": 0
}
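The two predictions on this page use the same settings and differ only in `text`. One way to capture that is a small container whose defaults mirror the values shown here. `ChatTTSInput` is a hypothetical helper, and its defaults are copied from this example, not from the model's schema.

```python
from dataclasses import asdict, dataclass


@dataclass
class ChatTTSInput:
    """Hypothetical container; defaults mirror the example input on this page."""

    text: str
    top_k: int = 20
    top_p: float = 0.7
    voice: int = 2222
    prompt: str = ""
    skip_refine: int = 0
    temperature: float = 0.3
    custom_voice: int = 0


# Only the text changes between requests; the result is a plain dict
# suitable for the `input` argument of replicate.run.
payload = asdict(ChatTTSInput(text="chat T T S is a text to speech model.[uv_break]"))
print(payload)
```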

Install Replicate’s Node.js client library:

npm install replicate

Import and set up the client:

import Replicate from "replicate";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});

Run thlz998/chat-tts using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

const output = await replicate.run(
  "thlz998/chat-tts:864cbf22ea816a82f3da143049cd5d4c94b95b58567c98536165a44c74b540d1",
  {
    input: {
      text: "chat T T S is a text to speech model designed for dialogue applications. \n[uv_break]it supports mixed language input [uv_break]and offers multi speaker \ncapabilities with precise control over prosodic elements [laugh]like like \n[uv_break]laughter[laugh], [uv_break]pauses, [uv_break]and intonation. \n[uv_break]it delivers natural and expressive speech,[uv_break]so please\n[uv_break] use the project responsibly at your own risk.[uv_break]",
      top_k: 20,
      top_p: 0.7,
      voice: 2222,
      prompt: "",
      skip_refine: 0,
      temperature: 0.3,
      custom_voice: 0
    }
  }
);

console.log(output);

To learn more, take a look at the guide on getting started with Node.js.

Install Replicate’s Python client library:

pip install replicate

Import the client:

import replicate

Run thlz998/chat-tts using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

output = replicate.run(
    "thlz998/chat-tts:864cbf22ea816a82f3da143049cd5d4c94b95b58567c98536165a44c74b540d1",
    input={
        "text": "chat T T S is a text to speech model designed for dialogue applications. \n[uv_break]it supports mixed language input [uv_break]and offers multi speaker \ncapabilities with precise control over prosodic elements [laugh]like like \n[uv_break]laughter[laugh], [uv_break]pauses, [uv_break]and intonation. \n[uv_break]it delivers natural and expressive speech,[uv_break]so please\n[uv_break] use the project responsibly at your own risk.[uv_break]",
        "top_k": 20,
        "top_p": 0.7,
        "voice": 2222,
        "prompt": "",
        "skip_refine": 0,
        "temperature": 0.3,
        "custom_voice": 0
    }
)
print(output)

To learn more, take a look at the guide on getting started with Python.

Run thlz998/chat-tts using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "thlz998/chat-tts:864cbf22ea816a82f3da143049cd5d4c94b95b58567c98536165a44c74b540d1",
    "input": {
      "text": "chat T T S is a text to speech model designed for dialogue applications. \\n[uv_break]it supports mixed language input [uv_break]and offers multi speaker \\ncapabilities with precise control over prosodic elements [laugh]like like \\n[uv_break]laughter[laugh], [uv_break]pauses, [uv_break]and intonation. \\n[uv_break]it delivers natural and expressive speech,[uv_break]so please\\n[uv_break] use the project responsibly at your own risk.[uv_break]",
      "top_k": 20,
      "top_p": 0.7,
      "voice": 2222,
      "prompt": "",
      "skip_refine": 0,
      "temperature": 0.3,
      "custom_voice": 0
    }
  }' \
  https://api.replicate.com/v1/predictions

To learn more, take a look at Replicate’s HTTP API reference docs.

Output

{ "audio_files": [ { "filename": "https://storage.googleapis.com/r8-outputs-us-central1-long-term/zE2cHcpQeGXEPSKjCwJlczG9VZaiAKUgdP65E03cofflmA1lA/20240602-11_05_04-e69e86d99788cc217ec9ab0b3e156bc7.wav", "audio_duration": 30.31, "inference_time": 241.26 } ] }
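Once the file is downloaded, the reported `audio_duration` can be cross-checked against the WAV header. The sketch below uses the standard-library `wave` module and assumes the file is plain PCM WAV, which `wave` can read. To stay runnable offline, the demo measures a synthetic clip of silence instead of the real output, and its sample rate and length are arbitrary.

```python
import io
import wave


def wav_duration(fileobj) -> float:
    """Return the duration in seconds of a PCM WAV file or file-like object."""
    with wave.open(fileobj, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


# Build half a second of 16-bit mono silence in memory (arbitrary demo values).
buf = io.BytesIO()
with wave.open(buf, "wb") as writer:
    writer.setnchannels(1)
    writer.setsampwidth(2)       # 2 bytes per sample = 16-bit
    writer.setframerate(24000)
    writer.writeframes(b"\x00\x00" * 12000)
buf.seek(0)

duration = wav_duration(buf)
print(duration)
```

For the real file, pass the local path of the downloaded `.wav` to `wav_duration` and compare the result with `audio_duration`.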

{
  "completed_at": "2024-06-02T11:09:06.210195Z",
  "created_at": "2024-06-02T11:03:35.211000Z",
  "data_removed": false,
  "error": null,
  "id": "pz5wsbkr5drgc0cfv2f9pkfrv4",
  "input": {
    "text": "chat T T S is a text to speech model designed for dialogue applications. \n[uv_break]it supports mixed language input [uv_break]and offers multi speaker \ncapabilities with precise control over prosodic elements [laugh]like like \n[uv_break]laughter[laugh], [uv_break]pauses, [uv_break]and intonation. \n[uv_break]it delivers natural and expressive speech,[uv_break]so please\n[uv_break] use the project responsibly at your own risk.[uv_break]",
    "top_k": 20,
    "top_p": 0.7,
    "voice": 2222,
    "prompt": "",
    "skip_refine": 0,
    "temperature": 0.3,
    "custom_voice": 0
  },
  "logs": "voice=2222,custom_voice=0\nstart_time=1717326304.6768682\nINFO:ChatTTS.core:All initialized.\n  0%|          | 0/384 [00:00<?, ?it/s]\n  0%|          | 1/384 [00:59<6:17:42, 59.17s/it]\n  1%|          | 2/384 [02:00<6:25:22, 60.53s/it]\n  1%|          | 3/384 [02:58<6:15:42, 59.17s/it]\n  2%|▏         | 8/384 [03:53<2:20:21, 22.40s/it]\n  4%|▍         | 15/384 [03:53<54:29,  8.86s/it] \n  6%|▌         | 23/384 [03:54<26:51,  4.46s/it]\n7%|▋         | 26/384 [03:54<53:43,  9.00s/it]\n  0%|          | 0/2048 [00:00<?, ?it/s]\n  0%|          | 4/2048 [00:00<00:53, 38.01it/s]\n  1%|          | 12/2048 [00:00<00:35, 57.54it/s]\n  1%|          | 20/2048 [00:00<00:31, 63.59it/s]\n  1%|▏         | 28/2048 [00:00<00:30, 66.59it/s]\n  2%|▏         | 36/2048 [00:00<00:29, 68.27it/s]\n  2%|▏         | 44/2048 [00:00<00:28, 69.41it/s]\n  3%|▎         | 52/2048 [00:00<00:28, 70.02it/s]\n  3%|▎         | 59/2048 [00:01<00:48, 40.86it/s]\n  3%|▎         | 67/2048 [00:01<00:41, 47.45it/s]\n  4%|▎         | 75/2048 [00:01<00:37, 53.12it/s]\n  4%|▍         | 83/2048 [00:01<00:34, 57.77it/s]\n  4%|▍         | 91/2048 [00:01<00:31, 61.23it/s]\n  5%|▍         | 99/2048 [00:01<00:30, 64.09it/s]\n  5%|▌         | 107/2048 [00:01<00:29, 66.13it/s]\n  6%|▌         | 115/2048 [00:01<00:28, 67.23it/s]\n  6%|▌         | 122/2048 [00:02<00:28, 67.62it/s]\n  6%|▋         | 130/2048 [00:02<00:28, 68.40it/s]\n  7%|▋         | 138/2048 [00:02<00:27, 69.00it/s]\n  7%|▋         | 146/2048 [00:02<00:27, 69.42it/s]\n  8%|▊         | 154/2048 [00:02<00:27, 69.48it/s]\n  8%|▊         | 161/2048 [00:02<00:27, 69.45it/s]\n  8%|▊         | 169/2048 [00:02<00:26, 69.89it/s]\n  9%|▊         | 177/2048 [00:02<00:26, 70.12it/s]\n  9%|▉         | 185/2048 [00:02<00:26, 69.94it/s]\n  9%|▉         | 193/2048 [00:03<00:26, 69.65it/s]\n 10%|▉         | 201/2048 [00:03<00:26, 69.83it/s]\n 10%|█         | 208/2048 [00:03<00:26, 69.83it/s]\n 11%|█         | 216/2048 [00:03<00:26, 70.07it/s]\n 11%|█         | 224/2048 [00:03<00:25, 70.29it/s]\n 11%|█▏        | 232/2048 [00:03<00:25, 70.50it/s]\n 12%|█▏        | 240/2048 [00:03<00:25, 70.27it/s]\n 12%|█▏        | 248/2048 [00:03<00:25, 70.02it/s]\n 12%|█▎        | 256/2048 [00:03<00:25, 69.59it/s]\n 13%|█▎        | 263/2048 [00:04<00:25, 69.54it/s]\n 13%|█▎        | 271/2048 [00:04<00:25, 69.85it/s]\n 14%|█▎        | 278/2048 [00:04<00:25, 69.70it/s]\n 14%|█▍        | 286/2048 [00:04<00:25, 70.08it/s]\n 14%|█▍        | 294/2048 [00:04<00:24, 70.29it/s]\n 15%|█▍        | 302/2048 [00:04<00:24, 70.12it/s]\n 15%|█▌        | 310/2048 [00:04<00:24, 70.43it/s]\n 16%|█▌        | 318/2048 [00:04<00:24, 70.25it/s]\n 16%|█▌        | 326/2048 [00:04<00:24, 69.53it/s]\n 16%|█▋        | 333/2048 [00:05<00:24, 69.43it/s]\n 17%|█▋        | 341/2048 [00:05<00:24, 69.69it/s]\n 17%|█▋        | 349/2048 [00:05<00:24, 69.97it/s]\n 17%|█▋        | 356/2048 [00:05<00:24, 68.87it/s]\n 18%|█▊        | 364/2048 [00:05<00:24, 69.43it/s]\n 18%|█▊        | 372/2048 [00:05<00:23, 69.93it/s]\n 19%|█▊        | 380/2048 [00:05<00:23, 70.29it/s]\n 19%|█▉        | 388/2048 [00:05<00:23, 70.55it/s]\n 19%|█▉        | 396/2048 [00:05<00:23, 69.41it/s]\n20%|█▉        | 402/2048 [00:06<00:24, 66.65it/s]\n推理时长: 241.26 秒\n音频时长: 30.31 秒",
  "metrics": {
    "predict_time": 241.575889,
    "total_time": 330.999195
  },
  "output": {
    "audio_files": [
      {
        "filename": "https://storage.googleapis.com/r8-outputs-us-central1-long-term/zE2cHcpQeGXEPSKjCwJlczG9VZaiAKUgdP65E03cofflmA1lA/20240602-11_05_04-e69e86d99788cc217ec9ab0b3e156bc7.wav",
        "audio_duration": 30.31,
        "inference_time": 241.26
      }
    ]
  },
  "started_at": "2024-06-02T11:05:04.634306Z",
  "status": "succeeded",
  "urls": {
    "get": "https://api.replicate.com/v1/predictions/pz5wsbkr5drgc0cfv2f9pkfrv4",
    "cancel": "https://api.replicate.com/v1/predictions/pz5wsbkr5drgc0cfv2f9pkfrv4/cancel"
  },
  "version": "864cbf22ea816a82f3da143049cd5d4c94b95b58567c98536165a44c74b540d1"
}

Generated in 4 minutes


voice=2222,custom_voice=0
start_time=1717326304.6768682
INFO:ChatTTS.core:All initialized.
  0%|          | 0/384 [00:00<?, ?it/s]
  0%|          | 1/384 [00:59<6:17:42, 59.17s/it]
  1%|          | 2/384 [02:00<6:25:22, 60.53s/it]
  1%|          | 3/384 [02:58<6:15:42, 59.17s/it]
  2%|▏         | 8/384 [03:53<2:20:21, 22.40s/it]
  4%|▍         | 15/384 [03:53<54:29,  8.86s/it] 
  6%|▌         | 23/384 [03:54<26:51,  4.46s/it]
7%|▋         | 26/384 [03:54<53:43,  9.00s/it]
  0%|          | 0/2048 [00:00<?, ?it/s]
  0%|          | 4/2048 [00:00<00:53, 38.01it/s]
  1%|          | 12/2048 [00:00<00:35, 57.54it/s]
  1%|          | 20/2048 [00:00<00:31, 63.59it/s]
  1%|▏         | 28/2048 [00:00<00:30, 66.59it/s]
  2%|▏         | 36/2048 [00:00<00:29, 68.27it/s]
  2%|▏         | 44/2048 [00:00<00:28, 69.41it/s]
  3%|▎         | 52/2048 [00:00<00:28, 70.02it/s]
  3%|▎         | 59/2048 [00:01<00:48, 40.86it/s]
  3%|▎         | 67/2048 [00:01<00:41, 47.45it/s]
  4%|▎         | 75/2048 [00:01<00:37, 53.12it/s]
  4%|▍         | 83/2048 [00:01<00:34, 57.77it/s]
  4%|▍         | 91/2048 [00:01<00:31, 61.23it/s]
  5%|▍         | 99/2048 [00:01<00:30, 64.09it/s]
  5%|▌         | 107/2048 [00:01<00:29, 66.13it/s]
  6%|▌         | 115/2048 [00:01<00:28, 67.23it/s]
  6%|▌         | 122/2048 [00:02<00:28, 67.62it/s]
  6%|▋         | 130/2048 [00:02<00:28, 68.40it/s]
  7%|▋         | 138/2048 [00:02<00:27, 69.00it/s]
  7%|▋         | 146/2048 [00:02<00:27, 69.42it/s]
  8%|▊         | 154/2048 [00:02<00:27, 69.48it/s]
  8%|▊         | 161/2048 [00:02<00:27, 69.45it/s]
  8%|▊         | 169/2048 [00:02<00:26, 69.89it/s]
  9%|▊         | 177/2048 [00:02<00:26, 70.12it/s]
  9%|▉         | 185/2048 [00:02<00:26, 69.94it/s]
  9%|▉         | 193/2048 [00:03<00:26, 69.65it/s]
 10%|▉         | 201/2048 [00:03<00:26, 69.83it/s]
 10%|█         | 208/2048 [00:03<00:26, 69.83it/s]
 11%|█         | 216/2048 [00:03<00:26, 70.07it/s]
 11%|█         | 224/2048 [00:03<00:25, 70.29it/s]
 11%|█▏        | 232/2048 [00:03<00:25, 70.50it/s]
 12%|█▏        | 240/2048 [00:03<00:25, 70.27it/s]
 12%|█▏        | 248/2048 [00:03<00:25, 70.02it/s]
 12%|█▎        | 256/2048 [00:03<00:25, 69.59it/s]
 13%|█▎        | 263/2048 [00:04<00:25, 69.54it/s]
 13%|█▎        | 271/2048 [00:04<00:25, 69.85it/s]
 14%|█▎        | 278/2048 [00:04<00:25, 69.70it/s]
 14%|█▍        | 286/2048 [00:04<00:25, 70.08it/s]
 14%|█▍        | 294/2048 [00:04<00:24, 70.29it/s]
 15%|█▍        | 302/2048 [00:04<00:24, 70.12it/s]
 15%|█▌        | 310/2048 [00:04<00:24, 70.43it/s]
 16%|█▌        | 318/2048 [00:04<00:24, 70.25it/s]
 16%|█▌        | 326/2048 [00:04<00:24, 69.53it/s]
 16%|█▋        | 333/2048 [00:05<00:24, 69.43it/s]
 17%|█▋        | 341/2048 [00:05<00:24, 69.69it/s]
 17%|█▋        | 349/2048 [00:05<00:24, 69.97it/s]
 17%|█▋        | 356/2048 [00:05<00:24, 68.87it/s]
 18%|█▊        | 364/2048 [00:05<00:24, 69.43it/s]
 18%|█▊        | 372/2048 [00:05<00:23, 69.93it/s]
 19%|█▊        | 380/2048 [00:05<00:23, 70.29it/s]
 19%|█▉        | 388/2048 [00:05<00:23, 70.55it/s]
 19%|█▉        | 396/2048 [00:05<00:23, 69.41it/s]
20%|█▉        | 402/2048 [00:06<00:24, 66.65it/s]
推理时长: 241.26 秒
音频时长: 30.31 秒
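The logs contain two tqdm-style progress bars, one counting toward 384 and one toward 2048, and both end well short of their totals. That would be consistent with the totals being upper bounds and the loops stopping early, though the logs do not say so. A rough sketch for finding the last step reached in each loop follows. The regular expression is an assumption based on the line format shown here, and the sample is a short excerpt of the log above.

```python
import re

# Matches lines such as " 19%|█▉        | 396/2048 [00:05<00:23, 69.41it/s]".
TQDM_RE = re.compile(r"^\s*(\d+)%\|[^|]*\|\s*(\d+)/(\d+) \[", re.MULTILINE)


def last_step_per_loop(logs: str) -> dict[int, int]:
    """Map each progress bar's total to the last step it reported."""
    last: dict[int, int] = {}
    for _percent, current, total in TQDM_RE.findall(logs):
        last[int(total)] = int(current)
    return last


log_excerpt = (
    "  6%|▌         | 23/384 [03:54<26:51,  4.46s/it]\n"
    "7%|▋         | 26/384 [03:54<53:43,  9.00s/it]\n"
    "  0%|          | 0/2048 [00:00<?, ?it/s]\n"
    " 19%|█▉        | 396/2048 [00:05<00:23, 69.41it/s]\n"
    "20%|█▉        | 402/2048 [00:06<00:24, 66.65it/s]\n"
)
steps = last_step_per_loop(log_excerpt)
print(steps)
```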
