ttsds/fishspeech_1_1
The Fish Speech V1.1 model.
Prediction
ttsds/fishspeech_1_1:eba2a3e1e07cf38ac2a528d134fdac1cab4d222b8850ac98517635ddf5c1ca75
ID: fgdgr9djqxrme0cmnpx9bhp1em
Status: Succeeded
Source: Web
Hardware: L40S
Total duration: 83.6 seconds
Created: 2025-01-28T16:24:41.407000Z
Input
- text: With tenure, Suzie'd have all the more leisure for yachting, but her publications are no good.
- text_reference: and keeping eternity before the eyes, though much
- speaker_reference: https://replicate.delivery/pbxt/MNFXdPaUPOwYCZjZM4azsymbzE2TCV2WJXfGpeV2DrFWaSq8/example_en.wav
{ "text": "With tenure, Suzie'd have all the more leisure for yachting, but her publications are no good.", "text_reference": "and keeping eternity before the eyes, though much", "speaker_reference": "https://replicate.delivery/pbxt/MNFXdPaUPOwYCZjZM4azsymbzE2TCV2WJXfGpeV2DrFWaSq8/example_en.wav" }
Install Replicate’s Node.js client library:

npm install replicate
Import and set up the client:

import Replicate from "replicate";
import fs from "node:fs/promises";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});
Run ttsds/fishspeech_1_1 using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
const output = await replicate.run(
  "ttsds/fishspeech_1_1:eba2a3e1e07cf38ac2a528d134fdac1cab4d222b8850ac98517635ddf5c1ca75",
  {
    input: {
      text: "With tenure, Suzie'd have all the more leisure for yachting, but her publications are no good.",
      text_reference: "and keeping eternity before the eyes, though much",
      speaker_reference: "https://replicate.delivery/pbxt/MNFXdPaUPOwYCZjZM4azsymbzE2TCV2WJXfGpeV2DrFWaSq8/example_en.wav"
    }
  }
);

// To access the file URL:
console.log(output.url()); //=> "http://example.com"

// To write the generated audio to disk (the model outputs a .wav file):
await fs.writeFile("output.wav", output);
To learn more, take a look at the guide on getting started with Node.js.
Install Replicate’s Python client library:

pip install replicate
Import the client:

import replicate
Run ttsds/fishspeech_1_1 using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
output = replicate.run(
    "ttsds/fishspeech_1_1:eba2a3e1e07cf38ac2a528d134fdac1cab4d222b8850ac98517635ddf5c1ca75",
    input={
        "text": "With tenure, Suzie'd have all the more leisure for yachting, but her publications are no good.",
        "text_reference": "and keeping eternity before the eyes, though much",
        "speaker_reference": "https://replicate.delivery/pbxt/MNFXdPaUPOwYCZjZM4azsymbzE2TCV2WJXfGpeV2DrFWaSq8/example_en.wav"
    }
)

# To access the file URL:
print(output.url())  #=> "http://example.com"

# To write the generated audio to disk (the model outputs a .wav file):
with open("output.wav", "wb") as file:
    file.write(output.read())
To learn more, take a look at the guide on getting started with Python.
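If you would rather not block while the model runs, the Python client also exposes the lower-level predictions API. Below is a minimal polling sketch, assuming a recent replicate-python client; the version hash is taken from the prediction shown above, and the 2-second poll interval is an arbitrary choice:

import time
import replicate

# Create the prediction without waiting for it to finish. "version" is the
# hash portion of the model identifier used above.
prediction = replicate.predictions.create(
    version="eba2a3e1e07cf38ac2a528d134fdac1cab4d222b8850ac98517635ddf5c1ca75",
    input={
        "text": "With tenure, Suzie'd have all the more leisure for yachting, but her publications are no good.",
        "text_reference": "and keeping eternity before the eyes, though much",
        "speaker_reference": "https://replicate.delivery/pbxt/MNFXdPaUPOwYCZjZM4azsymbzE2TCV2WJXfGpeV2DrFWaSq8/example_en.wav",
    },
)

# Poll until the prediction reaches a terminal state.
while prediction.status not in ("succeeded", "failed", "canceled"):
    time.sleep(2)
    prediction.reload()

if prediction.status == "succeeded":
    print(prediction.output)  # URL of the generated audio file
else:
    print(prediction.error)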
Run ttsds/fishspeech_1_1 using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "ttsds/fishspeech_1_1:eba2a3e1e07cf38ac2a528d134fdac1cab4d222b8850ac98517635ddf5c1ca75",
    "input": {
      "text": "With tenure, Suzie\'d have all the more leisure for yachting, but her publications are no good.",
      "text_reference": "and keeping eternity before the eyes, though much",
      "speaker_reference": "https://replicate.delivery/pbxt/MNFXdPaUPOwYCZjZM4azsymbzE2TCV2WJXfGpeV2DrFWaSq8/example_en.wav"
    }
  }' \
  https://api.replicate.com/v1/predictions
To learn more, take a look at Replicate’s HTTP API reference docs.
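The Prefer: wait header above asks the API to hold the connection open until the prediction finishes, rather than returning immediately with a "starting" prediction. For illustration, here is a minimal sketch of the same request using only the Python standard library; the endpoint, headers, and payload mirror the curl example:

import json
import os
import urllib.request

payload = {
    "version": "ttsds/fishspeech_1_1:eba2a3e1e07cf38ac2a528d134fdac1cab4d222b8850ac98517635ddf5c1ca75",
    "input": {
        "text": "With tenure, Suzie'd have all the more leisure for yachting, but her publications are no good.",
        "text_reference": "and keeping eternity before the eyes, though much",
        "speaker_reference": "https://replicate.delivery/pbxt/MNFXdPaUPOwYCZjZM4azsymbzE2TCV2WJXfGpeV2DrFWaSq8/example_en.wav",
    },
}

request = urllib.request.Request(
    "https://api.replicate.com/v1/predictions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {os.environ['REPLICATE_API_TOKEN']}",
        "Content-Type": "application/json",
        "Prefer": "wait",  # hold the connection until the prediction completes
    },
    method="POST",
)

with urllib.request.urlopen(request) as response:
    prediction = json.load(response)

print(prediction["status"], prediction.get("output"))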
Output
{ "completed_at": "2025-01-28T16:26:05.034677Z", "created_at": "2025-01-28T16:24:41.407000Z", "data_removed": false, "error": null, "id": "fgdgr9djqxrme0cmnpx9bhp1em", "input": { "text": "With tenure, Suzie'd have all the more leisure for yachting, but her publications are no good.", "text_reference": "and keeping eternity before the eyes, though much", "speaker_reference": "https://replicate.delivery/pbxt/MNFXdPaUPOwYCZjZM4azsymbzE2TCV2WJXfGpeV2DrFWaSq8/example_en.wav" }, "logs": "2025-01-28 16:26:02.222 | INFO | tools.llama.generate:generate_long:491 - Encoded text: With tenure, Suzie'd have all\n2025-01-28 16:26:02.222 | INFO | tools.llama.generate:generate_long:491 - Encoded text: the more leisure for yachting,\n2025-01-28 16:26:02.222 | INFO | tools.llama.generate:generate_long:491 - Encoded text: but her publications are no\n2025-01-28 16:26:02.222 | INFO | tools.llama.generate:generate_long:491 - Encoded text: good.\n2025-01-28 16:26:02.223 | INFO | tools.llama.generate:generate_long:509 - Generating sentence 1/4 of sample 1/1\n 0%| | 0/1858 [00:00<?, ?it/s]/root/.pyenv/versions/3.11.10/lib/python3.11/site-packages/torch/backends/cuda/__init__.py:342: FutureWarning: torch.backends.cuda.sdp_kernel() is deprecated. In the future, this context manager will be removed. Please see, torch.nn.attention.sdpa_kernel() for the new context manager, with updated signature.\nwarnings.warn(\n 0%| | 6/1858 [00:00<00:31, 59.00it/s]\n 1%| | 12/1858 [00:00<00:32, 57.47it/s]\n 1%| | 19/1858 [00:00<00:30, 60.07it/s]\n 1%|▏ | 26/1858 [00:00<00:29, 61.43it/s]\n 2%|▏ | 33/1858 [00:00<00:29, 62.05it/s]\n 2%|▏ | 40/1858 [00:00<00:29, 62.57it/s]\n 3%|▎ | 47/1858 [00:00<00:29, 62.38it/s]\n3%|▎ | 49/1858 [00:00<00:29, 60.40it/s]\n2025-01-28 16:26:03.118 | INFO | tools.llama.generate:generate_long:565 - Generated 51 tokens in 0.90 seconds, 56.95 tokens/sec\n2025-01-28 16:26:03.118 | INFO | tools.llama.generate:generate_long:568 - Bandwidth achieved: 22.22 GB/s\n2025-01-28 16:26:03.118 | INFO | tools.llama.generate:generate_long:573 - GPU Memory used: 2.45 GB\n2025-01-28 16:26:03.119 | INFO | tools.llama.generate:generate_long:509 - Generating sentence 2/4 of sample 1/1\n 0%| | 0/1759 [00:00<?, ?it/s]\n 0%| | 7/1759 [00:00<00:27, 62.93it/s]\n 1%| | 14/1759 [00:00<00:27, 62.96it/s]\n 1%| | 21/1759 [00:00<00:27, 62.82it/s]\n 2%|▏ | 28/1759 [00:00<00:28, 61.75it/s]\n 2%|▏ | 35/1759 [00:00<00:27, 62.10it/s]\n2%|▏ | 40/1759 [00:00<00:28, 60.85it/s]\n2025-01-28 16:26:03.792 | INFO | tools.llama.generate:generate_long:565 - Generated 42 tokens in 0.67 seconds, 62.37 tokens/sec\n2025-01-28 16:26:03.793 | INFO | tools.llama.generate:generate_long:568 - Bandwidth achieved: 24.34 GB/s\n2025-01-28 16:26:03.793 | INFO | tools.llama.generate:generate_long:573 - GPU Memory used: 2.45 GB\n2025-01-28 16:26:03.793 | INFO | tools.llama.generate:generate_long:509 - Generating sentence 3/4 of sample 1/1\n 0%| | 0/1672 [00:00<?, ?it/s]\n 0%| | 7/1672 [00:00<00:26, 63.08it/s]\n 1%| | 14/1672 [00:00<00:26, 62.91it/s]\n 1%|▏ | 21/1672 [00:00<00:26, 62.97it/s]\n 2%|▏ | 28/1672 [00:00<00:26, 62.73it/s]\n 2%|▏ | 35/1672 [00:00<00:26, 62.46it/s]\n2%|▏ | 39/1672 [00:00<00:26, 61.00it/s]\n2025-01-28 16:26:04.454 | INFO | tools.llama.generate:generate_long:565 - Generated 41 tokens in 0.66 seconds, 62.06 tokens/sec\n2025-01-28 16:26:04.454 | INFO | tools.llama.generate:generate_long:568 - Bandwidth achieved: 24.21 GB/s\n2025-01-28 16:26:04.454 | INFO | tools.llama.generate:generate_long:573 - GPU Memory used: 2.45 GB\n2025-01-28 16:26:04.454 | 
INFO | tools.llama.generate:generate_long:509 - Generating sentence 4/4 of sample 1/1\n 0%| | 0/1608 [00:00<?, ?it/s]\n 0%| | 7/1608 [00:00<00:25, 63.11it/s]\n 1%| | 14/1608 [00:00<00:25, 63.00it/s]\n1%| | 14/1608 [00:00<00:27, 58.74it/s]\n2025-01-28 16:26:04.709 | INFO | tools.llama.generate:generate_long:565 - Generated 16 tokens in 0.25 seconds, 62.81 tokens/sec\n2025-01-28 16:26:04.709 | INFO | tools.llama.generate:generate_long:568 - Bandwidth achieved: 24.51 GB/s\n2025-01-28 16:26:04.710 | INFO | tools.llama.generate:generate_long:573 - GPU Memory used: 2.45 GB\n/root/.pyenv/versions/3.11.10/lib/python3.11/site-packages/torch/nn/modules/conv.py:306: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:919.)\nreturn F.conv1d(input, weight, bias, self.stride,\nNext sample", "metrics": { "predict_time": 3.097940238, "total_time": 83.627677 }, "output": "https://replicate.delivery/xezq/I77Wgm5GPZJcPBiGe9WeRYHDfLr5blb16hWlvxnkk3f3xtmQB/generated.wav", "started_at": "2025-01-28T16:26:01.936737Z", "status": "succeeded", "urls": { "stream": "https://stream.replicate.com/v1/files/bsvm-kkjxzcufksibrwqp56dqg3nxd4bsxbkgeiscttvpaqnc4nagf4wa", "get": "https://api.replicate.com/v1/predictions/fgdgr9djqxrme0cmnpx9bhp1em", "cancel": "https://api.replicate.com/v1/predictions/fgdgr9djqxrme0cmnpx9bhp1em/cancel" }, "version": "eba2a3e1e07cf38ac2a528d134fdac1cab4d222b8850ac98517635ddf5c1ca75" }
Generated in 3.1 seconds
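Once "status" is "succeeded", the "output" field holds a URL to the generated audio; note that "predict_time" (about 3.1 seconds) covers model execution only, while "total_time" also includes time spent queued and setting up. As a small illustrative sketch, assuming the JSON response above has been saved locally as prediction.json, the audio can be fetched with the Python standard library:

import json
import urllib.request

# Assumes the prediction response shown above was saved to prediction.json.
with open("prediction.json") as f:
    prediction = json.load(f)

if prediction["status"] == "succeeded":
    # "output" is a URL pointing at the generated .wav file.
    urllib.request.urlretrieve(prediction["output"], "generated.wav")
    print(f"Saved audio (predict time: {prediction['metrics']['predict_time']:.1f}s)")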