thlz998 / chat-tts
This is an implementation of ChatTTS as a Cog model.
Prediction
thlz998/chat-tts:864cbf22ea816a82f3da143049cd5d4c94b95b58567c98536165a44c74b540d1
Input
- text: chat T T S 是一款强大的对话式文本转语音模型。它有中英混读和多说话人的能力。 chat T T S 不仅能够生成自然流畅的语音,还能控制[laugh]笑声啊[laugh], 停顿啊[uv_break]语气词啊等副语言现象[uv_break]。这个韵律超越了许多开源模型[uv_break]。 请注意,chat T T S 的使用应遵守法律和伦理准则,避免滥用的安全风险。[uv_break]
- top_k: 20
- top_p: 0.7
- voice: 2222
- prompt: (empty)
- skip_refine: 0
- temperature: 0.3
- custom_voice: 0
{ "text": "chat T T S 是一款强大的对话式文本转语音模型。它有中英混读和多说话人的能力。\nchat T T S 不仅能够生成自然流畅的语音,还能控制[laugh]笑声啊[laugh],\n停顿啊[uv_break]语气词啊等副语言现象[uv_break]。这个韵律超越了许多开源模型[uv_break]。\n请注意,chat T T S 的使用应遵守法律和伦理准则,避免滥用的安全风险。[uv_break]", "top_k": 20, "top_p": 0.7, "voice": 2222, "prompt": "", "skip_refine": 0, "temperature": 0.3, "custom_voice": 0 }
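The input above combines plain text with bracketed control tokens such as [laugh] and [uv_break]. As a minimal sketch, the payload could be assembled and checked in Python before it is sent. The helper names `build_input` and `count_control_tokens` are hypothetical and not part of ChatTTS or Replicate; the default values are copied from the example input.

```python
import re
from collections import Counter

# Defaults copied from the example input; the helper names are hypothetical.
DEFAULTS = {
    "top_k": 20,
    "top_p": 0.7,
    "voice": 2222,
    "prompt": "",
    "skip_refine": 0,
    "temperature": 0.3,
    "custom_voice": 0,
}


def build_input(text, **overrides):
    """Return an input payload: the example defaults plus the given text."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unexpected parameters: {sorted(unknown)}")
    return {"text": text, **DEFAULTS, **overrides}


def count_control_tokens(text):
    """Count bracketed tokens such as [laugh] or [uv_break] in the text."""
    return Counter(re.findall(r"\[([a-z_]+)\]", text))


payload = build_input("停顿啊[uv_break]语气词啊[laugh]笑声[laugh]", temperature=0.5)
token_counts = count_control_tokens(payload["text"])
print(payload["temperature"], dict(token_counts))
```

Rejecting unknown parameter names is a design choice of this sketch; it catches typos locally instead of relying on the API to report them.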
Install Replicate’s Node.js client library:
npm install replicate
Import and set up the client:
import Replicate from "replicate";
const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});
Run thlz998/chat-tts using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
const output = await replicate.run(
  "thlz998/chat-tts:864cbf22ea816a82f3da143049cd5d4c94b95b58567c98536165a44c74b540d1",
  {
    input: {
      text: "chat T T S 是一款强大的对话式文本转语音模型。它有中英混读和多说话人的能力。\nchat T T S 不仅能够生成自然流畅的语音,还能控制[laugh]笑声啊[laugh],\n停顿啊[uv_break]语气词啊等副语言现象[uv_break]。这个韵律超越了许多开源模型[uv_break]。\n请注意,chat T T S 的使用应遵守法律和伦理准则,避免滥用的安全风险。[uv_break]",
      top_k: 20,
      top_p: 0.7,
      voice: 2222,
      prompt: "",
      skip_refine: 0,
      temperature: 0.3,
      custom_voice: 0
    }
  }
);
console.log(output);
To learn more, take a look at the guide on getting started with Node.js.
Install Replicate’s Python client library:
pip install replicate
Import the client:
import replicate
Run thlz998/chat-tts using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
output = replicate.run(
    "thlz998/chat-tts:864cbf22ea816a82f3da143049cd5d4c94b95b58567c98536165a44c74b540d1",
    input={
        "text": "chat T T S 是一款强大的对话式文本转语音模型。它有中英混读和多说话人的能力。\nchat T T S 不仅能够生成自然流畅的语音,还能控制[laugh]笑声啊[laugh],\n停顿啊[uv_break]语气词啊等副语言现象[uv_break]。这个韵律超越了许多开源模型[uv_break]。\n请注意,chat T T S 的使用应遵守法律和伦理准则,避免滥用的安全风险。[uv_break]",
        "top_k": 20,
        "top_p": 0.7,
        "voice": 2222,
        "prompt": "",
        "skip_refine": 0,
        "temperature": 0.3,
        "custom_voice": 0
    }
)
print(output)
To learn more, take a look at the guide on getting started with Python.
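The output of this model lists generated files under `audio_files`, each with a `filename` URL, an `audio_duration`, and an `inference_time`. The sketch below shows one way to pull out the URL and a local file name for each entry. It assumes the client returns a plain dict shaped like the Output section of this page, which may differ between client versions; `audio_entries` is a hypothetical helper and the sample URL is a placeholder.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def audio_entries(output):
    """Yield (url, local_name, duration_seconds) for each generated file."""
    for item in output.get("audio_files", []):
        url = item["filename"]
        # Use the last path segment of the URL as the local file name.
        name = PurePosixPath(urlparse(url).path).name
        yield url, name, item.get("audio_duration")


# Sample shaped like the Output on this page; the URL is a placeholder.
sample = {
    "audio_files": [
        {
            "filename": "https://example.com/files/20240602-demo.wav",
            "audio_duration": 29.46,
            "inference_time": 195.29,
        }
    ]
}
entries = list(audio_entries(sample))
print(entries)
```

Each URL could then be saved with `urllib.request.urlretrieve(url, name)` from the standard library.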
Run thlz998/chat-tts using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "thlz998/chat-tts:864cbf22ea816a82f3da143049cd5d4c94b95b58567c98536165a44c74b540d1",
    "input": {
      "text": "chat T T S 是一款强大的对话式文本转语音模型。它有中英混读和多说话人的能力。\\nchat T T S 不仅能够生成自然流畅的语音,还能控制[laugh]笑声啊[laugh],\\n停顿啊[uv_break]语气词啊等副语言现象[uv_break]。这个韵律超越了许多开源模型[uv_break]。\\n请注意,chat T T S 的使用应遵守法律和伦理准则,避免滥用的安全风险。[uv_break]",
      "top_k": 20,
      "top_p": 0.7,
      "voice": 2222,
      "prompt": "",
      "skip_refine": 0,
      "temperature": 0.3,
      "custom_voice": 0
    }
  }' \
  https://api.replicate.com/v1/predictions
To learn more, take a look at Replicate’s HTTP API reference docs.
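The same HTTP request can be assembled with Python's standard library. This is a sketch rather than the official client: `build_request` is a hypothetical helper, the endpoint, headers, and version string are copied from the curl command above, and the request is only constructed, not sent, so the example runs offline.

```python
import json
import os
import urllib.request

API_URL = "https://api.replicate.com/v1/predictions"
VERSION = "thlz998/chat-tts:864cbf22ea816a82f3da143049cd5d4c94b95b58567c98536165a44c74b540d1"


def build_request(model_input, token):
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({"version": VERSION, "input": model_input}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Prefer": "wait",
        },
    )


token = os.environ.get("REPLICATE_API_TOKEN", "<token>")
request = build_request({"text": "hello[uv_break]", "voice": 2222}, token)
print(request.get_method(), request.full_url)

# To send it, something like the following would be needed (requires a valid token):
# with urllib.request.urlopen(request) as response:
#     prediction = json.load(response)
```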
Output
{
  "audio_files": [
    {
      "filename": "https://storage.googleapis.com/replicate-files/xzMIhsaUUf1mRiSOpfXefBqRn5m689wtCaKiPeePnJuI0lnuE/20240602-08_53_33-113d70262afdac3689889619d465b2cd.wav",
      "audio_duration": 29.46,
      "inference_time": 195.29
    }
  ]
}
{
  "completed_at": "2024-06-02T08:56:50.190305Z",
  "created_at": "2024-06-02T08:52:42.739000Z",
  "data_removed": false,
  "error": null,
  "id": "g2ks2d56edrgg0cfv0kbm9t9m8",
  "input": {
    "text": "chat T T S 是一款强大的对话式文本转语音模型。它有中英混读和多说话人的能力。\nchat T T S 不仅能够生成自然流畅的语音,还能控制[laugh]笑声啊[laugh],\n停顿啊[uv_break]语气词啊等副语言现象[uv_break]。这个韵律超越了许多开源模型[uv_break]。\n请注意,chat T T S 的使用应遵守法律和伦理准则,避免滥用的安全风险。[uv_break]",
    "top_k": 20,
    "top_p": 0.7,
    "voice": 2222,
    "prompt": "",
    "skip_refine": 0,
    "temperature": 0.3,
    "custom_voice": 0
  },
  "logs": "voice=2222,custom_voice=0\nstart_time=1717318413.4744363\nINFO:ChatTTS.core:All initialized.\n 0%| | 0/384 [00:00<?, ?it/s]\n 0%| | 1/384 [00:38<4:07:14, 38.73s/it]\n 1%| | 2/384 [01:18<4:10:01, 39.27s/it]\n 1%| | 3/384 [01:56<4:06:43, 38.85s/it]\n 3%|▎ | 10/384 [02:33<1:10:23, 11.29s/it]\n 5%|▌ | 21/384 [02:33<24:04, 3.98s/it] \n 9%|▊ | 33/384 [02:33<11:36, 1.98s/it]\n11%|█ | 43/384 [02:33<20:20, 3.58s/it]\n 0%| | 0/2048 [00:00<?, ?it/s]\n 0%| | 2/2048 [00:37<10:38:51, 18.74s/it]\n 1%| | 13/2048 [00:37<1:12:01, 2.12s/it]\n 1%| | 24/2048 [00:37<31:46, 1.06it/s] \n 2%|▏ | 35/2048 [00:37<17:42, 1.89it/s]\n 2%|▏ | 46/2048 [00:37<10:53, 3.06it/s]\n 3%|▎ | 57/2048 [00:37<07:04, 4.69it/s]\n 3%|▎ | 68/2048 [00:38<04:46, 6.92it/s]\n 4%|▍ | 79/2048 [00:38<03:17, 9.95it/s]\n 4%|▍ | 90/2048 [00:38<02:20, 13.98it/s]\n 5%|▍ | 101/2048 [00:38<01:41, 19.20it/s]\n 5%|▌ | 112/2048 [00:38<01:15, 25.72it/s]\n 6%|▌ | 123/2048 [00:38<00:57, 33.54it/s]\n 7%|▋ | 134/2048 [00:38<00:45, 42.47it/s]\n 7%|▋ | 145/2048 [00:38<00:36, 52.03it/s]\n 8%|▊ | 156/2048 [00:38<00:30, 61.83it/s]\n 8%|▊ | 168/2048 [00:39<00:26, 71.96it/s]\n 9%|▊ | 179/2048 [00:39<00:23, 80.05it/s]\n 9%|▉ | 190/2048 [00:39<00:21, 86.88it/s]\n 10%|▉ | 202/2048 [00:39<00:19, 93.18it/s]\n 10%|█ | 213/2048 [00:39<00:18, 97.53it/s]\n 11%|█ | 224/2048 [00:39<00:18, 100.63it/s]\n 11%|█▏ | 235/2048 [00:39<00:17, 103.13it/s]\n 12%|█▏ | 246/2048 [00:39<00:17, 105.03it/s]\n 13%|█▎ | 258/2048 [00:39<00:16, 106.52it/s]\n 13%|█▎ | 269/2048 [00:39<00:16, 107.17it/s]\n 14%|█▎ | 280/2048 [00:40<00:16, 107.31it/s]\n 14%|█▍ | 291/2048 [00:40<00:16, 107.40it/s]\n 15%|█▍ | 302/2048 [00:40<00:16, 107.45it/s]\n 15%|█▌ | 313/2048 [00:40<00:16, 107.73it/s]\n 16%|█▌ | 324/2048 [00:40<00:15, 107.77it/s]\n 16%|█▋ | 335/2048 [00:40<00:15, 108.04it/s]\n 17%|█▋ | 346/2048 [00:40<00:15, 107.96it/s]\n 17%|█▋ | 357/2048 [00:40<00:15, 108.20it/s]\n 18%|█▊ | 368/2048 [00:40<00:15, 108.13it/s]\n18%|█▊ | 369/2048 [00:40<03:05, 9.03it/s]\n/root/.pyenv/versions/3.10.12/lib/python3.10/site-packages/torch/nn/modules/conv.py:306: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:919.)\nreturn F.conv1d(input, weight, bias, self.stride,\n推理时长: 195.29 秒\n音频时长: 29.46 秒",
  "metrics": {
    "predict_time": 196.737931,
    "total_time": 247.451305
  },
  "output": {
    "audio_files": [
      {
        "filename": "https://storage.googleapis.com/replicate-files/xzMIhsaUUf1mRiSOpfXefBqRn5m689wtCaKiPeePnJuI0lnuE/20240602-08_53_33-113d70262afdac3689889619d465b2cd.wav",
        "audio_duration": 29.46,
        "inference_time": 195.29
      }
    ]
  },
  "started_at": "2024-06-02T08:53:33.452374Z",
  "status": "succeeded",
  "urls": {
    "get": "https://api.replicate.com/v1/predictions/g2ks2d56edrgg0cfv0kbm9t9m8",
    "cancel": "https://api.replicate.com/v1/predictions/g2ks2d56edrgg0cfv0kbm9t9m8/cancel"
  },
  "version": "864cbf22ea816a82f3da143049cd5d4c94b95b58567c98536165a44c74b540d1"
}
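The timestamps and metrics in the prediction record are related: for this run, `predict_time` equals `completed_at` minus `started_at`, and `total_time` equals `completed_at` minus `created_at`. The sketch below recomputes those values from the record and adds two derived figures, the wait before the run started and the ratio of inference time to audio length. The variable names and the term "real-time factor" are this example's own, not fields of the API.

```python
from datetime import datetime


def parse_ts(value):
    """Parse an ISO 8601 timestamp ending in "Z" (UTC)."""
    # Replacing "Z" keeps this working on Python versions before 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


# Values copied from the prediction record above.
prediction = {
    "created_at": "2024-06-02T08:52:42.739000Z",
    "started_at": "2024-06-02T08:53:33.452374Z",
    "completed_at": "2024-06-02T08:56:50.190305Z",
    "output": {
        "audio_files": [{"audio_duration": 29.46, "inference_time": 195.29}]
    },
}

created = parse_ts(prediction["created_at"])
started = parse_ts(prediction["started_at"])
completed = parse_ts(prediction["completed_at"])

queue_seconds = (started - created).total_seconds()
predict_seconds = (completed - started).total_seconds()
total_seconds = (completed - created).total_seconds()

audio = prediction["output"]["audio_files"][0]
# Seconds of compute spent per second of generated audio.
real_time_factor = audio["inference_time"] / audio["audio_duration"]

print(queue_seconds, predict_seconds, total_seconds, round(real_time_factor, 2))
```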
Prediction
thlz998/chat-tts:864cbf22ea816a82f3da143049cd5d4c94b95b58567c98536165a44c74b540d1
ID: pz5wsbkr5drgc0cfv2f9pkfrv4
Status: Succeeded
Source: Web
Hardware: A100 (40GB)
Input
- text: chat T T S is a text to speech model designed for dialogue applications. [uv_break]it supports mixed language input [uv_break]and offers multi speaker capabilities with precise control over prosodic elements [laugh]like like [uv_break]laughter[laugh], [uv_break]pauses, [uv_break]and intonation. [uv_break]it delivers natural and expressive speech,[uv_break]so please [uv_break] use the project responsibly at your own risk.[uv_break]
- top_k: 20
- top_p: 0.7
- voice: 2222
- prompt: (empty)
- skip_refine: 0
- temperature: 0.3
- custom_voice: 0
{ "text": "chat T T S is a text to speech model designed for dialogue applications. \n[uv_break]it supports mixed language input [uv_break]and offers multi speaker \ncapabilities with precise control over prosodic elements [laugh]like like \n[uv_break]laughter[laugh], [uv_break]pauses, [uv_break]and intonation. \n[uv_break]it delivers natural and expressive speech,[uv_break]so please\n[uv_break] use the project responsibly at your own risk.[uv_break]", "top_k": 20, "top_p": 0.7, "voice": 2222, "prompt": "", "skip_refine": 0, "temperature": 0.3, "custom_voice": 0 }
Run thlz998/chat-tts using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
const output = await replicate.run(
  "thlz998/chat-tts:864cbf22ea816a82f3da143049cd5d4c94b95b58567c98536165a44c74b540d1",
  {
    input: {
      text: "chat T T S is a text to speech model designed for dialogue applications. \n[uv_break]it supports mixed language input [uv_break]and offers multi speaker \ncapabilities with precise control over prosodic elements [laugh]like like \n[uv_break]laughter[laugh], [uv_break]pauses, [uv_break]and intonation. \n[uv_break]it delivers natural and expressive speech,[uv_break]so please\n[uv_break] use the project responsibly at your own risk.[uv_break]",
      top_k: 20,
      top_p: 0.7,
      voice: 2222,
      prompt: "",
      skip_refine: 0,
      temperature: 0.3,
      custom_voice: 0
    }
  }
);
console.log(output);
Run thlz998/chat-tts using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
output = replicate.run(
    "thlz998/chat-tts:864cbf22ea816a82f3da143049cd5d4c94b95b58567c98536165a44c74b540d1",
    input={
        "text": "chat T T S is a text to speech model designed for dialogue applications. \n[uv_break]it supports mixed language input [uv_break]and offers multi speaker \ncapabilities with precise control over prosodic elements [laugh]like like \n[uv_break]laughter[laugh], [uv_break]pauses, [uv_break]and intonation. \n[uv_break]it delivers natural and expressive speech,[uv_break]so please\n[uv_break] use the project responsibly at your own risk.[uv_break]",
        "top_k": 20,
        "top_p": 0.7,
        "voice": 2222,
        "prompt": "",
        "skip_refine": 0,
        "temperature": 0.3,
        "custom_voice": 0
    }
)
print(output)
Run thlz998/chat-tts using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "thlz998/chat-tts:864cbf22ea816a82f3da143049cd5d4c94b95b58567c98536165a44c74b540d1",
    "input": {
      "text": "chat T T S is a text to speech model designed for dialogue applications. \\n[uv_break]it supports mixed language input [uv_break]and offers multi speaker \\ncapabilities with precise control over prosodic elements [laugh]like like \\n[uv_break]laughter[laugh], [uv_break]pauses, [uv_break]and intonation. \\n[uv_break]it delivers natural and expressive speech,[uv_break]so please\\n[uv_break] use the project responsibly at your own risk.[uv_break]",
      "top_k": 20,
      "top_p": 0.7,
      "voice": 2222,
      "prompt": "",
      "skip_refine": 0,
      "temperature": 0.3,
      "custom_voice": 0
    }
  }' \
  https://api.replicate.com/v1/predictions
Output
{
  "audio_files": [
    {
      "filename": "https://storage.googleapis.com/r8-outputs-us-central1-long-term/zE2cHcpQeGXEPSKjCwJlczG9VZaiAKUgdP65E03cofflmA1lA/20240602-11_05_04-e69e86d99788cc217ec9ab0b3e156bc7.wav",
      "audio_duration": 30.31,
      "inference_time": 241.26
    }
  ]
}
{
  "completed_at": "2024-06-02T11:09:06.210195Z",
  "created_at": "2024-06-02T11:03:35.211000Z",
  "data_removed": false,
  "error": null,
  "id": "pz5wsbkr5drgc0cfv2f9pkfrv4",
  "input": {
    "text": "chat T T S is a text to speech model designed for dialogue applications. \n[uv_break]it supports mixed language input [uv_break]and offers multi speaker \ncapabilities with precise control over prosodic elements [laugh]like like \n[uv_break]laughter[laugh], [uv_break]pauses, [uv_break]and intonation. \n[uv_break]it delivers natural and expressive speech,[uv_break]so please\n[uv_break] use the project responsibly at your own risk.[uv_break]",
    "top_k": 20,
    "top_p": 0.7,
    "voice": 2222,
    "prompt": "",
    "skip_refine": 0,
    "temperature": 0.3,
    "custom_voice": 0
  },
  "logs": "voice=2222,custom_voice=0\nstart_time=1717326304.6768682\nINFO:ChatTTS.core:All initialized.\n 0%| | 0/384 [00:00<?, ?it/s]\n 0%| | 1/384 [00:59<6:17:42, 59.17s/it]\n 1%| | 2/384 [02:00<6:25:22, 60.53s/it]\n 1%| | 3/384 [02:58<6:15:42, 59.17s/it]\n 2%|▏ | 8/384 [03:53<2:20:21, 22.40s/it]\n 4%|▍ | 15/384 [03:53<54:29, 8.86s/it] \n 6%|▌ | 23/384 [03:54<26:51, 4.46s/it]\n7%|▋ | 26/384 [03:54<53:43, 9.00s/it]\n 0%| | 0/2048 [00:00<?, ?it/s]\n 0%| | 4/2048 [00:00<00:53, 38.01it/s]\n 1%| | 12/2048 [00:00<00:35, 57.54it/s]\n 1%| | 20/2048 [00:00<00:31, 63.59it/s]\n 1%|▏ | 28/2048 [00:00<00:30, 66.59it/s]\n 2%|▏ | 36/2048 [00:00<00:29, 68.27it/s]\n 2%|▏ | 44/2048 [00:00<00:28, 69.41it/s]\n 3%|▎ | 52/2048 [00:00<00:28, 70.02it/s]\n 3%|▎ | 59/2048 [00:01<00:48, 40.86it/s]\n 3%|▎ | 67/2048 [00:01<00:41, 47.45it/s]\n 4%|▎ | 75/2048 [00:01<00:37, 53.12it/s]\n 4%|▍ | 83/2048 [00:01<00:34, 57.77it/s]\n 4%|▍ | 91/2048 [00:01<00:31, 61.23it/s]\n 5%|▍ | 99/2048 [00:01<00:30, 64.09it/s]\n 5%|▌ | 107/2048 [00:01<00:29, 66.13it/s]\n 6%|▌ | 115/2048 [00:01<00:28, 67.23it/s]\n 6%|▌ | 122/2048 [00:02<00:28, 67.62it/s]\n 6%|▋ | 130/2048 [00:02<00:28, 68.40it/s]\n 7%|▋ | 138/2048 [00:02<00:27, 69.00it/s]\n 7%|▋ | 146/2048 [00:02<00:27, 69.42it/s]\n 8%|▊ | 154/2048 [00:02<00:27, 69.48it/s]\n 8%|▊ | 161/2048 [00:02<00:27, 69.45it/s]\n 8%|▊ | 169/2048 [00:02<00:26, 69.89it/s]\n 9%|▊ | 177/2048 [00:02<00:26, 70.12it/s]\n 9%|▉ | 185/2048 [00:02<00:26, 69.94it/s]\n 9%|▉ | 193/2048 [00:03<00:26, 69.65it/s]\n 10%|▉ | 201/2048 [00:03<00:26, 69.83it/s]\n 10%|█ | 208/2048 [00:03<00:26, 69.83it/s]\n 11%|█ | 216/2048 [00:03<00:26, 70.07it/s]\n 11%|█ | 224/2048 [00:03<00:25, 70.29it/s]\n 11%|█▏ | 232/2048 [00:03<00:25, 70.50it/s]\n 12%|█▏ | 240/2048 [00:03<00:25, 70.27it/s]\n 12%|█▏ | 248/2048 [00:03<00:25, 70.02it/s]\n 12%|█▎ | 256/2048 [00:03<00:25, 69.59it/s]\n 13%|█▎ | 263/2048 [00:04<00:25, 69.54it/s]\n 13%|█▎ | 271/2048 [00:04<00:25, 69.85it/s]\n 14%|█▎ | 278/2048 [00:04<00:25, 69.70it/s]\n 14%|█▍ | 286/2048 [00:04<00:25, 70.08it/s]\n 14%|█▍ | 294/2048 [00:04<00:24, 70.29it/s]\n 15%|█▍ | 302/2048 [00:04<00:24, 70.12it/s]\n 15%|█▌ | 310/2048 [00:04<00:24, 70.43it/s]\n 16%|█▌ | 318/2048 [00:04<00:24, 70.25it/s]\n 16%|█▌ | 326/2048 [00:04<00:24, 69.53it/s]\n 16%|█▋ | 333/2048 [00:05<00:24, 69.43it/s]\n 17%|█▋ | 341/2048 [00:05<00:24, 69.69it/s]\n 17%|█▋ | 349/2048 [00:05<00:24, 69.97it/s]\n 17%|█▋ | 356/2048 [00:05<00:24, 68.87it/s]\n 18%|█▊ | 364/2048 [00:05<00:24, 69.43it/s]\n 18%|█▊ | 372/2048 [00:05<00:23, 69.93it/s]\n 19%|█▊ | 380/2048 [00:05<00:23, 70.29it/s]\n 19%|█▉ | 388/2048 [00:05<00:23, 70.55it/s]\n 19%|█▉ | 396/2048 [00:05<00:23, 69.41it/s]\n20%|█▉ | 402/2048 [00:06<00:24, 66.65it/s]\n推理时长: 241.26 秒\n音频时长: 30.31 秒",
  "metrics": {
    "predict_time": 241.575889,
    "total_time": 330.999195
  },
  "output": {
    "audio_files": [
      {
        "filename": "https://storage.googleapis.com/r8-outputs-us-central1-long-term/zE2cHcpQeGXEPSKjCwJlczG9VZaiAKUgdP65E03cofflmA1lA/20240602-11_05_04-e69e86d99788cc217ec9ab0b3e156bc7.wav",
        "audio_duration": 30.31,
        "inference_time": 241.26
      }
    ]
  },
  "started_at": "2024-06-02T11:05:04.634306Z",
  "status": "succeeded",
  "urls": {
    "get": "https://api.replicate.com/v1/predictions/pz5wsbkr5drgc0cfv2f9pkfrv4",
    "cancel": "https://api.replicate.com/v1/predictions/pz5wsbkr5drgc0cfv2f9pkfrv4/cancel"
  },
  "version": "864cbf22ea816a82f3da143049cd5d4c94b95b58567c98536165a44c74b540d1"
}
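The logs for each run end with two Chinese lines: 推理时长 ("inference time") and 音频时长 ("audio duration"), both in seconds (秒). As a rough sketch, they can be extracted with a regular expression. `parse_durations` and the English key names are hypothetical, and the log format is assumed to stay as shown on this page.

```python
import re

# Chinese log labels mapped to English keys chosen for this example.
LABELS = {
    "推理时长": "inference_seconds",  # "inference time"
    "音频时长": "audio_seconds",      # "audio duration"
}


def parse_durations(logs):
    """Extract the duration lines that close the prediction logs."""
    result = {}
    for label, key in LABELS.items():
        match = re.search(rf"{label}:\s*([0-9.]+)\s*秒", logs)
        if match:
            result[key] = float(match.group(1))
    return result


# Tail of the second run's logs, copied from this page.
logs = "voice=2222,custom_voice=0\n推理时长: 241.26 秒\n音频时长: 30.31 秒"
durations = parse_durations(logs)
print(durations)
```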