ttsds / fishspeech_1_1_large
(Updated 4 months, 2 weeks ago)
- Public
- 232 runs
- L40S
Prediction
ttsds/fishspeech_1_1_large:bf25b86020c83763b6727b138d7b0c3308dd210ee059accc00cb6f1971bbcd37
ID: z4ex31txjdrme0cmq0kv4mm6b0
Status: Succeeded
Source: Web
Hardware: L40S
Total duration: 115.05 seconds
Created: 2025-01-30T16:59:35Z
Input
- text: With tenure, Suzie'd have all the more leisure for yachting, but her publications are no good.
- text_reference: and keeping eternity before the eyes, though much.
- speaker_reference: https://replicate.delivery/pbxt/MNFXdPaUPOwYCZjZM4azsymbzE2TCV2WJXfGpeV2DrFWaSq8/example_en.wav (audio player in the original)
{ "text": "With tenure, Suzie'd have all the more leisure for yachting, but her publications are no good.", "text_reference": "and keeping eternity before the eyes, though much.", "speaker_reference": "https://replicate.delivery/pbxt/MNFXdPaUPOwYCZjZM4azsymbzE2TCV2WJXfGpeV2DrFWaSq8/example_en.wav" }
Install Replicate’s Node.js client library:

npm install replicate
Import and set up the client:

import Replicate from "replicate";
import { writeFile } from "node:fs/promises";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});
Run ttsds/fishspeech_1_1_large using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
const output = await replicate.run(
  "ttsds/fishspeech_1_1_large:bf25b86020c83763b6727b138d7b0c3308dd210ee059accc00cb6f1971bbcd37",
  {
    input: {
      text: "With tenure, Suzie'd have all the more leisure for yachting, but her publications are no good.",
      text_reference: "and keeping eternity before the eyes, though much.",
      speaker_reference: "https://replicate.delivery/pbxt/MNFXdPaUPOwYCZjZM4azsymbzE2TCV2WJXfGpeV2DrFWaSq8/example_en.wav"
    }
  }
);

// To access the file URL:
console.log(output.url()); //=> "http://example.com"

// To write the generated audio to disk (the model outputs a WAV file):
await writeFile("output.wav", output);
To learn more, take a look at the guide on getting started with Node.js.
Install Replicate’s Python client library:

pip install replicate
Import the client:

import replicate
Run ttsds/fishspeech_1_1_large using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
output = replicate.run(
    "ttsds/fishspeech_1_1_large:bf25b86020c83763b6727b138d7b0c3308dd210ee059accc00cb6f1971bbcd37",
    input={
        "text": "With tenure, Suzie'd have all the more leisure for yachting, but her publications are no good.",
        "text_reference": "and keeping eternity before the eyes, though much.",
        "speaker_reference": "https://replicate.delivery/pbxt/MNFXdPaUPOwYCZjZM4azsymbzE2TCV2WJXfGpeV2DrFWaSq8/example_en.wav"
    }
)
print(output)
To learn more, take a look at the guide on getting started with Python.
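The Python client returns the output as a URL (in newer client versions, a file-like FileOutput whose string form is that URL). A minimal sketch for saving the generated audio locally, assuming the URL form; output.wav is just an example filename:

import urllib.request

import replicate

output = replicate.run(
    "ttsds/fishspeech_1_1_large:bf25b86020c83763b6727b138d7b0c3308dd210ee059accc00cb6f1971bbcd37",
    input={
        "text": "With tenure, Suzie'd have all the more leisure for yachting, but her publications are no good.",
        "text_reference": "and keeping eternity before the eyes, though much.",
        "speaker_reference": "https://replicate.delivery/pbxt/MNFXdPaUPOwYCZjZM4azsymbzE2TCV2WJXfGpeV2DrFWaSq8/example_en.wav"
    }
)

# str(output) is the delivery URL of the generated WAV; fetch it to disk.
urllib.request.urlretrieve(str(output), "output.wav")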
Run ttsds/fishspeech_1_1_large using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "ttsds/fishspeech_1_1_large:bf25b86020c83763b6727b138d7b0c3308dd210ee059accc00cb6f1971bbcd37",
    "input": {
      "text": "With tenure, Suzie\'d have all the more leisure for yachting, but her publications are no good.",
      "text_reference": "and keeping eternity before the eyes, though much.",
      "speaker_reference": "https://replicate.delivery/pbxt/MNFXdPaUPOwYCZjZM4azsymbzE2TCV2WJXfGpeV2DrFWaSq8/example_en.wav"
    }
  }' \
  https://api.replicate.com/v1/predictions
To learn more, take a look at Replicate’s HTTP API reference docs.
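The same call can be made from any HTTP client. A minimal sketch in Python with the requests library: it creates the prediction without the Prefer: wait header and instead polls the prediction's "get" URL (shown in the response JSON further down) until a terminal status is reached. The two-second poll interval is an arbitrary choice for illustration:

import os
import time

import requests

headers = {
    "Authorization": f"Bearer {os.environ['REPLICATE_API_TOKEN']}",
    "Content-Type": "application/json",
}

# Create the prediction; without "Prefer: wait" the API returns immediately.
prediction = requests.post(
    "https://api.replicate.com/v1/predictions",
    headers=headers,
    json={
        "version": "ttsds/fishspeech_1_1_large:bf25b86020c83763b6727b138d7b0c3308dd210ee059accc00cb6f1971bbcd37",
        "input": {
            "text": "With tenure, Suzie'd have all the more leisure for yachting, but her publications are no good.",
            "text_reference": "and keeping eternity before the eyes, though much.",
            "speaker_reference": "https://replicate.delivery/pbxt/MNFXdPaUPOwYCZjZM4azsymbzE2TCV2WJXfGpeV2DrFWaSq8/example_en.wav",
        },
    },
).json()

# Poll the prediction's "get" URL until it reaches a terminal state.
while prediction["status"] not in ("succeeded", "failed", "canceled"):
    time.sleep(2)
    prediction = requests.get(prediction["urls"]["get"], headers=headers).json()

print(prediction["status"], prediction.get("output"))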
You can run this model locally using Cog. First, install Cog:

brew install cog
If you don’t have Homebrew, there are other installation options available.
Run this to download the model and run it in your local environment:
cog predict r8.im/ttsds/fishspeech_1_1_large@sha256:bf25b86020c83763b6727b138d7b0c3308dd210ee059accc00cb6f1971bbcd37 \
  -i $'text="With tenure, Suzie\'d have all the more leisure for yachting, but her publications are no good."' \
  -i 'text_reference="and keeping eternity before the eyes, though much."' \
  -i 'speaker_reference="https://replicate.delivery/pbxt/MNFXdPaUPOwYCZjZM4azsymbzE2TCV2WJXfGpeV2DrFWaSq8/example_en.wav"'
To learn more, take a look at the Cog documentation.
Run this to download the model and run it in your local environment:
docker run -d -p 5000:5000 --gpus=all r8.im/ttsds/fishspeech_1_1_large@sha256:bf25b86020c83763b6727b138d7b0c3308dd210ee059accc00cb6f1971bbcd37
curl -s -X POST \
  -H "Content-Type: application/json" \
  -d $'{
    "input": {
      "text": "With tenure, Suzie\'d have all the more leisure for yachting, but her publications are no good.",
      "text_reference": "and keeping eternity before the eyes, though much.",
      "speaker_reference": "https://replicate.delivery/pbxt/MNFXdPaUPOwYCZjZM4azsymbzE2TCV2WJXfGpeV2DrFWaSq8/example_en.wav"
    }
  }' \
  http://localhost:5000/predictions
To learn more, take a look at the Cog documentation.
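When no upload target is configured, a local Cog server typically returns file outputs inline as a base64 data URI (e.g. "data:audio/wav;base64,..."). A minimal sketch, assuming that form, for calling the local server from Python and decoding the audio:

import base64
import json
import urllib.request

req = urllib.request.Request(
    "http://localhost:5000/predictions",
    data=json.dumps({
        "input": {
            "text": "With tenure, Suzie'd have all the more leisure for yachting, but her publications are no good.",
            "text_reference": "and keeping eternity before the eyes, though much.",
            "speaker_reference": "https://replicate.delivery/pbxt/MNFXdPaUPOwYCZjZM4azsymbzE2TCV2WJXfGpeV2DrFWaSq8/example_en.wav",
        }
    }).encode(),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    result = json.load(resp)

# Assumed data-URI form: strip the "data:audio/wav;base64" prefix and decode the payload.
header, _, payload = result["output"].partition(",")
with open("output.wav", "wb") as f:
    f.write(base64.b64decode(payload))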
Output
https://replicate.delivery/xezq/cvDmYj7AkS7tCJ1L4xokpE0UPfq8qIr2Ey17B5YYk2fqJWKUA/generated.wav (audio player in the original)
{ "completed_at": "2025-01-30T17:01:30.241508Z", "created_at": "2025-01-30T16:59:35.187000Z", "data_removed": false, "error": null, "id": "z4ex31txjdrme0cmq0kv4mm6b0", "input": { "text": "With tenure, Suzie'd have all the more leisure for yachting, but her publications are no good.", "text_reference": "and keeping eternity before the eyes, though much.", "speaker_reference": "https://replicate.delivery/pbxt/MNFXdPaUPOwYCZjZM4azsymbzE2TCV2WJXfGpeV2DrFWaSq8/example_en.wav" }, "logs": "2025-01-30 17:01:26.111 | INFO | tools.llama.generate:generate_long:491 - Encoded text: With tenure, Suzie'd have all\n2025-01-30 17:01:26.111 | INFO | tools.llama.generate:generate_long:491 - Encoded text: the more leisure for yachting,\n2025-01-30 17:01:26.112 | INFO | tools.llama.generate:generate_long:491 - Encoded text: but her publications are no\n2025-01-30 17:01:26.112 | INFO | tools.llama.generate:generate_long:491 - Encoded text: good.\n2025-01-30 17:01:26.112 | INFO | tools.llama.generate:generate_long:509 - Generating sentence 1/4 of sample 1/1\n 0%| | 0/1857 [00:00<?, ?it/s]/root/.pyenv/versions/3.11.10/lib/python3.11/site-packages/torch/backends/cuda/__init__.py:342: FutureWarning: torch.backends.cuda.sdp_kernel() is deprecated. In the future, this context manager will be removed. Please see, torch.nn.attention.sdpa_kernel() for the new context manager, with updated signature.\nwarnings.warn(\n 0%| | 6/1857 [00:00<00:35, 51.79it/s]\n 1%| | 12/1857 [00:00<00:34, 53.93it/s]\n 1%| | 18/1857 [00:00<00:33, 54.19it/s]\n 1%|▏ | 24/1857 [00:00<00:33, 54.77it/s]\n 2%|▏ | 30/1857 [00:00<00:33, 55.09it/s]\n 2%|▏ | 36/1857 [00:00<00:32, 55.26it/s]\n 2%|▏ | 42/1857 [00:00<00:32, 55.42it/s]\n 3%|▎ | 48/1857 [00:00<00:32, 55.49it/s]\n 3%|▎ | 54/1857 [00:00<00:32, 55.56it/s]\n 3%|▎ | 60/1857 [00:01<00:32, 55.55it/s]\n 4%|▎ | 66/1857 [00:01<00:32, 55.52it/s]\n 4%|▍ | 72/1857 [00:01<00:32, 55.56it/s]\n 4%|▍ | 78/1857 [00:01<00:32, 55.21it/s]\n4%|▍ | 81/1857 [00:01<00:32, 54.48it/s]\n2025-01-30 17:01:27.699 | INFO | tools.llama.generate:generate_long:565 - Generated 83 tokens in 1.59 seconds, 52.28 tokens/sec\n2025-01-30 17:01:27.700 | INFO | tools.llama.generate:generate_long:568 - Bandwidth achieved: 53.67 GB/s\n2025-01-30 17:01:27.700 | INFO | tools.llama.generate:generate_long:573 - GPU Memory used: 4.01 GB\n2025-01-30 17:01:27.700 | INFO | tools.llama.generate:generate_long:509 - Generating sentence 2/4 of sample 1/1\n 0%| | 0/1726 [00:00<?, ?it/s]\n 0%| | 6/1726 [00:00<00:30, 55.57it/s]\n 1%| | 12/1726 [00:00<00:30, 55.47it/s]\n 1%| | 18/1726 [00:00<00:30, 55.52it/s]\n 1%|▏ | 24/1726 [00:00<00:30, 55.57it/s]\n 2%|▏ | 30/1726 [00:00<00:30, 55.51it/s]\n 2%|▏ | 36/1726 [00:00<00:30, 55.52it/s]\n 2%|▏ | 42/1726 [00:00<00:30, 55.54it/s]\n 3%|▎ | 48/1726 [00:00<00:30, 55.53it/s]\n3%|▎ | 50/1726 [00:00<00:30, 54.43it/s]\n2025-01-30 17:01:28.645 | INFO | tools.llama.generate:generate_long:565 - Generated 52 tokens in 0.94 seconds, 55.04 tokens/sec\n2025-01-30 17:01:28.645 | INFO | tools.llama.generate:generate_long:568 - Bandwidth achieved: 56.50 GB/s\n2025-01-30 17:01:28.645 | INFO | tools.llama.generate:generate_long:573 - GPU Memory used: 4.01 GB\n2025-01-30 17:01:28.646 | INFO | tools.llama.generate:generate_long:509 - Generating sentence 3/4 of sample 1/1\n 0%| | 0/1629 [00:00<?, ?it/s]\n 0%| | 6/1629 [00:00<00:29, 55.72it/s]\n 1%| | 12/1629 [00:00<00:29, 55.44it/s]\n 1%| | 18/1629 [00:00<00:29, 55.43it/s]\n 1%|▏ | 24/1629 [00:00<00:28, 55.44it/s]\n 2%|▏ | 30/1629 [00:00<00:28, 55.51it/s]\n 2%|▏ | 36/1629 
[00:00<00:28, 55.53it/s]\n 3%|▎ | 42/1629 [00:00<00:28, 55.52it/s]\n3%|▎ | 43/1629 [00:00<00:29, 54.22it/s]\n2025-01-30 17:01:29.459 | INFO | tools.llama.generate:generate_long:565 - Generated 45 tokens in 0.81 seconds, 55.34 tokens/sec\n2025-01-30 17:01:29.459 | INFO | tools.llama.generate:generate_long:568 - Bandwidth achieved: 56.81 GB/s\n2025-01-30 17:01:29.459 | INFO | tools.llama.generate:generate_long:573 - GPU Memory used: 4.01 GB\n2025-01-30 17:01:29.459 | INFO | tools.llama.generate:generate_long:509 - Generating sentence 4/4 of sample 1/1\n 0%| | 0/1561 [00:00<?, ?it/s]\n 0%| | 6/1561 [00:00<00:28, 54.37it/s]\n 1%| | 12/1561 [00:00<00:28, 55.05it/s]\n 1%| | 18/1561 [00:00<00:28, 54.63it/s]\n1%|▏ | 22/1561 [00:00<00:29, 52.42it/s]\n2025-01-30 17:01:29.899 | INFO | tools.llama.generate:generate_long:565 - Generated 24 tokens in 0.44 seconds, 54.53 tokens/sec\n2025-01-30 17:01:29.900 | INFO | tools.llama.generate:generate_long:568 - Bandwidth achieved: 55.97 GB/s\n2025-01-30 17:01:29.900 | INFO | tools.llama.generate:generate_long:573 - GPU Memory used: 4.01 GB\n/root/.pyenv/versions/3.11.10/lib/python3.11/site-packages/torch/nn/modules/conv.py:306: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:919.)\nreturn F.conv1d(input, weight, bias, self.stride,\nNext sample", "metrics": { "predict_time": 4.436514462, "total_time": 115.054508 }, "output": "https://replicate.delivery/xezq/cvDmYj7AkS7tCJ1L4xokpE0UPfq8qIr2Ey17B5YYk2fqJWKUA/generated.wav", "started_at": "2025-01-30T17:01:25.804994Z", "status": "succeeded", "urls": { "stream": "https://stream.replicate.com/v1/files/bsvm-oyur2w3altprrqoq2gmfap2fatjfjob7q4ccipws6k27y6glqlda", "get": "https://api.replicate.com/v1/predictions/z4ex31txjdrme0cmq0kv4mm6b0", "cancel": "https://api.replicate.com/v1/predictions/z4ex31txjdrme0cmq0kv4mm6b0/cancel" }, "version": "bf25b86020c83763b6727b138d7b0c3308dd210ee059accc00cb6f1971bbcd37" }
Generated in 4.44 seconds