cjwbw / voicecraft

Zero-Shot Speech Editing and Text-to-Speech in the Wild

Public
10.3K runs
L40S

Input

task

string

Choose a task

Default: "zero-shot text-to-speech"

voicecraft_model

string

Choose a model

Default: "giga330M_TTSEnhanced.pth"

orig_audio

*file

Original audio file

orig_transcript

string

Optionally provide the transcript of the input audio. Leave it blank to have the WhisperX model below generate the transcript. An inaccurate transcript may lead to errors in TTS or speech editing.

Default: ""

whisperx_model

string

If orig_transcript is not provided above, choose a WhisperX model for generating the transcript. An inaccurate transcript may lead to errors in TTS or speech editing. You can also correct the generated transcript and pass it directly to orig_transcript above.

Default: "base.en"

target_transcript

*string

I cannot believe that the same model can also do text to speech synthesis too!

Transcript of the target audio file

cut_off_sec

number

Only used for the zero-shot text-to-speech task. The number of seconds from the start of the original audio to use as the voice reference. About 3 seconds of reference is generally enough for high-quality voice cloning, but longer is generally better; try, e.g., 3 to 6 seconds.

Default: 3.01

kvcache

integer

Set to 0 to use less VRAM, but with slower inference

Default: 1

left_margin

number

Margin to the left of the editing segment

Default: 0.08

right_margin

number

Margin to the right of the editing segment

Default: 0.08

temperature

number

Adjusts the randomness of outputs: values greater than 1 increase randomness, and 0 is deterministic. Changing this value is not recommended.

Default: 1

top_p

number

Default value for TTS is 0.9, and 0.8 for speech editing

Default: 0.9

stop_repetition

integer

Default value for TTS is 3, and -1 for speech editing. -1 means the probability of silence tokens is not adjusted. If there are long silences or unnaturally stretched words, increase sample_batch_size to 2, 3, or even 4.

Default: 3

sample_batch_size

integer

Default value for TTS is 4, and 1 for speech editing. The higher the number, the faster the generated speech tends to be: under the hood, the model generates this many samples and chooses the shortest one.

Default: 4

seed

integer

Random seed. Leave blank to randomize the seed
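The parameter descriptions above give different recommended values for TTS and for speech editing (top_p, stop_repetition, sample_batch_size). A minimal Python sketch of how a caller might pick those values per task is shown below. The helper name build_voicecraft_input is hypothetical, and the sketch assumes that any task other than "zero-shot text-to-speech" is a speech-editing task; the exact editing task names are not listed here, so check the model's schema before relying on this.

```python
from typing import Any, Dict, Optional

TTS_TASK = "zero-shot text-to-speech"  # the default task named above


def build_voicecraft_input(
    orig_audio: str,
    target_transcript: str,
    task: str = TTS_TASK,
    orig_transcript: str = "",
    seed: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble an input payload using the task-dependent sampling defaults."""
    # Assumption: every task other than the TTS one is a speech-editing task.
    is_tts = task == TTS_TASK
    payload: Dict[str, Any] = {
        "task": task,
        "orig_audio": orig_audio,
        "target_transcript": target_transcript,
        # An empty transcript lets the WhisperX model transcribe the audio.
        "orig_transcript": orig_transcript,
        "top_p": 0.9 if is_tts else 0.8,
        "stop_repetition": 3 if is_tts else -1,
        "sample_batch_size": 4 if is_tts else 1,
    }
    # Omitting the seed leaves it randomized.
    if seed is not None:
        payload["seed"] = seed
    return payload


# Placeholder URL, for illustration only.
tts_input = build_voicecraft_input("https://example.com/reference.wav", "Hello there.")
print(tts_input["top_p"], tts_input["stop_repetition"], tts_input["sample_batch_size"])
```

The returned dictionary can be passed as the input argument of the client calls shown in the sections that follow.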

Run this model in Node.js with one line of code:

npx create-replicate --model=cjwbw/voicecraft

or set up a project from scratch

Install Replicate’s Node.js client library:

npm install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Import and set up the client:

import Replicate from "replicate";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});

Run cjwbw/voicecraft using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

const output = await replicate.run(
  "cjwbw/voicecraft:db97f6312d4c4d20e500e47fd95d8f14b00d8d28e046834faffb7999d83b6b30",
  {
    input: {
      task: "zero-shot text-to-speech",
      top_p: 0.8,
      kvcache: 1,
      orig_audio: "https://replicate.delivery/pbxt/Kh3PJuzs2xNgaaNOU6fD3jTz0Xx2dE1zpdXpT2k19fzsB8qE/84_121550_000074_000000.wav",
      cut_off_sec: 3.01,
      left_margin: 0.08,
      temperature: 1,
      right_margin: 0.08,
      whisperx_model: "base.en",
      orig_transcript: "",
      stop_repetition: 3,
      voicecraft_model: "giga330M_TTSEnhanced.pth",
      sample_batch_size: 4,
      target_transcript: "I cannot believe that the same model can also do text to speech synthesis too!"
    }
  }
);

console.log(output);

To learn more, take a look at the guide on getting started with Node.js.

Install Replicate’s Python client library:

pip install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Import the client:

import replicate

Run cjwbw/voicecraft using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

output = replicate.run(
    "cjwbw/voicecraft:db97f6312d4c4d20e500e47fd95d8f14b00d8d28e046834faffb7999d83b6b30",
    input={
        "task": "zero-shot text-to-speech",
        "top_p": 0.8,
        "kvcache": 1,
        "orig_audio": "https://replicate.delivery/pbxt/Kh3PJuzs2xNgaaNOU6fD3jTz0Xx2dE1zpdXpT2k19fzsB8qE/84_121550_000074_000000.wav",
        "cut_off_sec": 3.01,
        "left_margin": 0.08,
        "temperature": 1,
        "right_margin": 0.08,
        "whisperx_model": "base.en",
        "orig_transcript": "",
        "stop_repetition": 3,
        "voicecraft_model": "giga330M_TTSEnhanced.pth",
        "sample_batch_size": 4,
        "target_transcript": "I cannot believe that the same model can also do text to speech synthesis too!"
    }
)
print(output)

To learn more, take a look at the guide on getting started with Python.

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Run cjwbw/voicecraft using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "cjwbw/voicecraft:db97f6312d4c4d20e500e47fd95d8f14b00d8d28e046834faffb7999d83b6b30",
    "input": {
      "task": "zero-shot text-to-speech",
      "top_p": 0.8,
      "kvcache": 1,
      "orig_audio": "https://replicate.delivery/pbxt/Kh3PJuzs2xNgaaNOU6fD3jTz0Xx2dE1zpdXpT2k19fzsB8qE/84_121550_000074_000000.wav",
      "cut_off_sec": 3.01,
      "left_margin": 0.08,
      "temperature": 1,
      "right_margin": 0.08,
      "whisperx_model": "base.en",
      "orig_transcript": "",
      "stop_repetition": 3,
      "voicecraft_model": "giga330M_TTSEnhanced.pth",
      "sample_batch_size": 4,
      "target_transcript": "I cannot believe that the same model can also do text to speech synthesis too!"
    }
  }' \
  https://api.replicate.com/v1/predictions

To learn more, take a look at Replicate’s HTTP API reference docs.
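If a request returns before the prediction has finished (for example, when the Prefer: wait header is omitted), the prediction object can be fetched again until it settles. The Python sketch below shows one way to poll. It assumes a hypothetical fetch callable that performs a GET on the prediction's urls.get endpoint with the same Authorization header and returns the parsed JSON; the set of terminal statuses is also an assumption to verify against the HTTP API reference.

```python
import time
from typing import Any, Callable, Dict

# Assumed statuses after which a prediction no longer changes.
TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


def wait_for_prediction(
    fetch: Callable[[], Dict[str, Any]],
    interval_sec: float = 1.0,
    timeout_sec: float = 120.0,
) -> Dict[str, Any]:
    """Call fetch() repeatedly until the prediction reaches a terminal status."""
    deadline = time.monotonic() + timeout_sec
    while True:
        prediction = fetch()
        if prediction.get("status") in TERMINAL_STATUSES:
            return prediction
        if time.monotonic() >= deadline:
            raise TimeoutError("prediction did not finish in time")
        time.sleep(interval_sec)


# Stand-in for real HTTP calls: three canned responses.
responses = iter(
    [{"status": "starting"}, {"status": "processing"}, {"status": "succeeded"}]
)
result = wait_for_prediction(lambda: next(responses), interval_sec=0.0)
print(result["status"])
```

Passing the fetch step in as a callable keeps the loop independent of any particular HTTP library and makes it easy to exercise without network access.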

Output

generated_audio

whisper_transcript_orig_audio

But when I had approached so near to them, the common object, which the sense deceives, lost not by distance any of its marks.

{
  "completed_at": "2024-04-21T22:20:21.956533Z",
  "created_at": "2024-04-21T22:20:14.657000Z",
  "data_removed": false,
  "error": null,
  "id": "3wfg8mpzr5rgg0cf0azajq1x50",
  "input": {
    "task": "zero-shot text-to-speech",
    "top_p": 0.8,
    "kvcache": 1,
    "orig_audio": "https://replicate.delivery/pbxt/Kh3PJuzs2xNgaaNOU6fD3jTz0Xx2dE1zpdXpT2k19fzsB8qE/84_121550_000074_000000.wav",
    "cut_off_sec": 3.01,
    "left_margin": 0.08,
    "temperature": 1,
    "right_margin": 0.08,
    "whisperx_model": "base.en",
    "orig_transcript": "",
    "stop_repetition": 3,
    "voicecraft_model": "giga330M_TTSEnhanced.pth",
    "sample_batch_size": 4,
    "target_transcript": "I cannot believe that the same model can also do text to speech synthesis too!"
  },
  "logs": "Using seed: 34156\nSuppressing numeral and symbol tokens: [3, 4, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 352, 362, 405, 486, 513, 580, 604, 642, 657, 678, 718, 720, 767, 807, 830, 838, 860, 939, 940, 1065, 1105, 1120, 1129, 1157, 1160, 1238, 1248, 1264, 1270, 1314, 1315, 1367, 1415, 1433, 1467, 1478, 1485, 1495, 1507, 1511, 1542, 1558, 1584, 1594, 1596, 1679, 1731, 1795, 1802, 1821, 1828, 1853, 1899, 1946, 1954, 1959, 1983, 1987, 2026, 2075, 2078, 2079, 2091, 2154, 2167, 2177, 2211, 2231, 2242, 2310, 2319, 2321, 2327, 2388, 2414, 2425, 2481, 2534, 2548, 2579, 2598, 2608, 2623, 2624, 2670, 2681, 2682, 2713, 2718, 2757, 2780, 2791, 2808, 2813, 2816, 2857, 2864, 2919, 2920, 2931, 2996, 2998, 2999, 3023, 3050, 3064, 3070, 3104, 3126, 3132, 3134, 3261, 3270, 3312, 3324, 3365, 3388, 3439, 3459, 3510, 3553, 3559, 3571, 3648, 3682, 3695, 3717, 3720, 3829, 3865, 3901, 3933, 3980, 4019, 4051, 4059, 4064, 4089, 4101, 4153, 4248, 4304, 4309, 4310, 4317, 4343, 4349, 4353, 4407, 4521, 4524, 4531, 4570, 4626, 4747, 4751, 4761, 4764, 4790, 4793, 4846, 4869, 4967, 4974, 5014, 5066, 5075, 5125, 5214, 5237, 5304, 5323, 5332, 5333, 5433, 5441, 5472, 5534, 5539, 5598, 5607, 5705, 5774, 5816, 5824, 5846, 5867, 5878, 5892, 5946, 5996, 5999, 6052, 6073, 6135, 6200, 6244, 6298, 6303, 6337, 6390, 6420, 6469, 6640, 6659, 6740, 6885, 6957, 6999, 7029, 7169, 7175, 7192, 7198, 7225, 7265, 7337, 7358, 7388, 7410, 7441, 7600, 7618, 7632, 7643, 7724, 7769, 7795, 7816, 7863, 7908, 7930, 7982, 8054, 8069, 8093, 8190, 8235, 8257, 8269, 8275, 8298, 8309, 8454, 8487, 8541, 8576, 8628, 8644, 8646, 8684, 8699, 8702, 8735, 8753, 8784, 8854, 8870, 8915, 8949, 9031, 9130, 9162, 9166, 9193, 9225, 9415, 9507, 9508, 9656, 9661, 9698, 9768, 9773, 9796, 9804, 9849, 9879, 9907, 9919, 10048, 10053, 10083, 10111, 10163, 10190, 10232, 10249, 10261, 10333, 10460, 10495, 10531, 10535, 11024, 11104, 11245, 11323, 11442, 11445, 11470, 11509, 11528, 11546, 11623, 11645, 11785, 12113, 12122, 12131, 12279, 
12713, 12726, 12762, 12825, 12844, 12863, 12865, 12877, 12923, 12952, 13037, 13108, 13130, 13151, 13330, 13343, 13348, 13374, 13381, 13454, 13464, 13521, 13539, 13540, 13702, 13803, 14062, 14198, 14280, 14315, 14436, 14454, 14489, 14585, 14656, 14686, 14745, 14877, 14956, 14988, 15136, 15143, 15187, 15197, 15231, 15259, 15277, 15349, 15363, 15377, 15408, 15426, 15495, 15524, 15533, 15589, 15629, 15674, 15696, 15711, 15724, 15761, 15801, 15897, 15904, 15920, 15963, 15982, 16003, 16088, 16101, 16102, 16226, 16236, 16243, 16315, 16382, 16450, 16489, 16562, 16616, 16626, 16677, 16679, 16763, 16799, 16817, 16942, 16945, 16994, 17031, 17032, 17059, 17279, 17318, 17342, 17430, 17464, 17477, 17501, 17544, 17572, 17575, 17643, 17657, 17672, 17729, 17759, 17817, 17827, 17885, 17971, 18005, 18112, 18182, 18294, 18298, 18376, 18395, 18444, 18458, 18500, 18523, 18638, 18693, 18741, 18742, 18781, 18823, 18897, 18938, 18946, 19004, 19035, 19038, 19048, 19060, 19104, 19214, 19244, 19322, 19342, 19409, 19420, 19442, 19504, 19683, 19707, 19708, 19710, 19755, 19782, 19880, 19884, 19891, 19924, 20007, 20033, 20064, 20107, 20167, 20198, 20219, 20224, 20233, 20248, 20299, 20343, 20356, 20370, 20416, 20479, 20483, 20510, 20548, 20666, 20708, 20809, 20943, 20959, 20964, 20986, 21033, 21056, 21113, 21139, 21148, 21261, 21268, 21273, 21288, 21315, 21355, 21395, 21409, 21431, 21489, 21495, 21498, 21503, 21526, 21536, 21577, 21599, 21601, 21626, 21643, 21652, 21709, 21719, 21734, 21738, 21761, 21777, 21794, 21844, 21895, 21908, 21940, 22042, 22047, 22131, 22136, 22148, 22169, 22172, 22186, 22219, 22243, 22288, 22291, 22318, 22337, 22352, 22370, 22413, 22416, 22458, 22515, 22538, 22544, 22567, 22572, 22579, 22613, 22626, 22666, 22709, 22717, 22730, 22745, 22799, 22800, 22842, 22855, 22883, 22909, 22913, 22914, 22951, 22980, 22985, 22986, 22995, 22996, 23045, 23055, 23068, 23120, 23134, 23148, 23188, 23195, 23234, 23237, 23313, 23336, 23344, 23349, 23362, 23378, 23451, 23460, 23487, 23516, 
23539, 23601, 23628, 23664, 23666, 23679, 23721, 23726, 23734, 23753, 23756, 23815, 23847, 23859, 23871, 23906, 23924, 24038, 24041, 24045, 24063, 24096, 24136, 24137, 24168, 24214, 24217, 24235, 24294, 24309, 24339, 24356, 24369, 24403, 24409, 24414, 24465, 24529, 24555, 24591, 24598, 24648, 24652, 24669, 24693, 24718, 24760, 24793, 24839, 24840, 24848, 24894, 24909, 24938, 24940, 24943, 24970, 24977, 24991, 25022, 25054, 25061, 25090, 25096, 25150, 25177, 25181, 25190, 25191, 25240, 25257, 25264, 25270, 25272, 25307, 25325, 25326, 25399, 25429, 25475, 25500, 25508, 25540, 25597, 25600, 25643, 25644, 25645, 25667, 25674, 25707, 25710, 25764, 25816, 25829, 25836, 25838, 25859, 25870, 25948, 25964, 26007, 26050, 26063, 26073, 26115, 26118, 26143, 26200, 26250, 26259, 26276, 26279, 26352, 26422, 26427, 26429, 26481, 26492, 26514, 26525, 26539, 26561, 26582, 26598, 26607, 26660, 26704, 26709, 26717, 26753, 26780, 26826, 26833, 26881, 26895, 26912, 26937, 26956, 27019, 27033, 27037, 27057, 27121, 27137, 27191, 27192, 27203, 27211, 27228, 27230, 27253, 27260, 27277, 27301, 27310, 27326, 27367, 27368, 27371, 27408, 27412, 27550, 27551, 27559, 27621, 27641, 27649, 27653, 27693, 27696, 27712, 27720, 27728, 27778, 27790, 27791, 27795, 27800, 27824, 27829, 27877, 27936, 27937, 27956, 27970, 27988, 28011, 28017, 28041, 28054, 28072, 28119, 28174, 28256, 28262, 28277, 28296, 28324, 28362, 28369, 28460, 28481, 28551, 28555, 28560, 28567, 28581, 28592, 28598, 28645, 28658, 28669, 28676, 28684, 28687, 28688, 28694, 28714, 28727, 28771, 28815, 28817, 28857, 28872, 28878, 28896, 28933, 28947, 28977, 28978, 29022, 29041, 29059, 29088, 29101, 29110, 29119, 29143, 29159, 29173, 29211, 29217, 29228, 29279, 29300, 29326, 29331, 29334, 29414, 29416, 29524, 29558, 29568, 29626, 29637, 29691, 29703, 29769, 29796, 29807, 29903, 29953, 30005, 30057, 30110, 30120, 30123, 30179, 30206, 30272, 30273, 30290, 30299, 30336, 30368, 30435, 30453, 30460, 30483, 30484, 30505, 30557, 30607, 30610, 
30695, 30704, 30727, 30743, 30763, 30803, 30863, 30924, 30986, 30989, 30992, 30995, 31009, 31010, 31011, 31020, 31027, 31046, 31064, 31102, 31115, 31128, 31211, 31360, 31380, 31418, 31495, 31496, 31503, 31510, 31552, 31566, 31575, 31654, 31672, 31675, 31697, 31714, 31751, 31773, 31794, 31883, 31911, 31916, 31938, 31952, 31953, 31980, 31982, 32047, 32056, 32059, 32062, 32066, 32090, 32114, 32118, 32128, 32148, 32158, 32182, 32190, 32196, 32215, 32216, 32220, 32320, 32321, 32382, 32417, 32459, 32471, 32531, 32544, 32568, 32576, 32583, 32591, 32614, 32624, 32637, 32642, 32647, 32747, 32759, 32811, 32817, 32869, 32883, 32921, 32996, 33015, 33028, 33032, 33042, 33057, 33121, 33160, 33206, 33289, 33300, 33319, 33372, 33394, 33400, 33438, 33448, 33459, 33470, 33507, 33535, 33548, 33551, 33580, 33581, 33618, 33638, 33646, 33660, 33690, 33698, 33759, 33781, 33797, 33808, 33879, 33882, 33916, 33942, 33963, 33981, 34044, 34085, 34091, 34107, 34125, 34131, 34135, 34137, 34155, 34159, 34206, 34215, 34229, 34251, 34256, 34287, 34294, 34323, 34353, 34385, 34427, 34463, 34465, 34483, 34489, 34583, 34598, 34620, 34625, 34626, 34716, 34741, 34770, 34772, 34801, 34808, 34825, 34865, 34938, 34951, 35005, 35038, 35090, 35124, 35126, 35133, 35145, 35148, 35150, 35175, 35195, 35218, 35264, 35273, 35307, 35360, 35369, 35378, 35402, 35404, 35411, 35419, 35435, 35447, 35500, 35534, 35549, 35592, 35617, 35638, 35642, 35665, 35667, 35745, 35768, 35809, 35844, 35890, 35897, 35916, 35978, 35989, 36006, 36042, 36058, 36088, 36094, 36100, 36117, 36141, 36150, 36189, 36203, 36243, 36244, 36260, 36330, 36445, 36453, 36490, 36521, 36561, 36565, 36566, 36625, 36626, 36629, 36641, 36657, 36676, 36678, 36680, 36720, 36737, 36809, 36864, 36879, 36917, 36928, 36959, 36966, 36993, 37128, 37144, 37166, 37187, 37224, 37255, 37272, 37283, 37290, 37309, 37364, 37381, 37397, 37452, 37466, 37517, 37528, 37547, 37563, 37576, 37601, 37633, 37637, 37667, 37674, 37680, 37688, 37710, 37730, 37737, 37747, 37750, 
37781, 37804, 37831, 37841, 37856, 37864, 37950, 37967, 37988, 38055, 38056, 38073, 38089, 38107, 38108, 38123, 38147, 38158, 38172, 38190, 38205, 38210, 38219, 38249, 38314, 38326, 38339, 38369, 38380, 38384, 38391, 38431, 38446, 38449, 38472, 38503, 38525, 38547, 38549, 38565, 38569, 38595, 38605, 38612, 38634, 38652, 38703, 38721, 38783, 38819, 38831, 38850, 38867, 38892, 38902, 38905, 38907, 38956, 39064, 39084, 39088, 39093, 39101, 39103, 39111, 39118, 39121, 39132, 39135, 39166, 39174, 39188, 39195, 39226, 39251, 39254, 39260, 39277, 39280, 39320, 39322, 39357, 39380, 39449, 39466, 39506, 39509, 39570, 39595, 39647, 39658, 39667, 39697, 39710, 39761, 39768, 39850, 39861, 39882, 39885, 39923, 39925, 39937, 39997, 40022, 40035, 40064, 40090, 40111, 40149, 40173, 40179, 40215, 40220, 40248, 40256, 40271, 40286, 40350, 40353, 40384, 40385, 40393, 40400, 40401, 40403, 40417, 40427, 40454, 40463, 40486, 40523, 40538, 40554, 40585, 40639, 40643, 40652, 40654, 40660, 40675, 40736, 40761, 40828, 40839, 40873, 40884, 41019, 41023, 41060, 41103, 41172, 41208, 41234, 41235, 41241, 41247, 41263, 41287, 41289, 41290, 41292, 41322, 41417, 41423, 41435, 41507, 41531, 41544, 41561, 41569, 41580, 41583, 41612, 41625, 41647, 41655, 41706, 41717, 41734, 41739, 41761, 41810, 41813, 41820, 41853, 41874, 41879, 41922, 41931, 41934, 41948, 41977, 42018, 42032, 42060, 42117, 42141, 42163, 42199, 42215, 42224, 42240, 42246, 42250, 42294, 42313, 42321, 42334, 42363, 42444, 42479, 42489, 42520, 42534, 42548, 42622, 42671, 42691, 42716, 42751, 42752, 42759, 42780, 42802, 42819, 42830, 42875, 42877, 42947, 42980, 43019, 43134, 43147, 43155, 43184, 43193, 43234, 43239, 43240, 43284, 43292, 43313, 43336, 43356, 43364, 43367, 43379, 43434, 43452, 43489, 43509, 43526, 43550, 43564, 43571, 43587, 43610, 43637, 43641, 43665, 43686, 43690, 43697, 43704, 43722, 43785, 43798, 43864, 43916, 43918, 43927, 43950, 43977, 44063, 44084, 44085, 44087, 44093, 44103, 44169, 44183, 44214, 44215, 44218, 
44227, 44230, 44300, 44318, 44341, 44361, 44367, 44417, 44427, 44465, 44468, 44505, 44541, 44550, 44552, 44578, 44586, 44613, 44617, 44622, 44626, 44673, 44675, 44688, 44698, 44717, 44729, 44750, 44808, 44821, 44826, 44856, 44928, 44966, 44969, 44980, 44994, 45021, 45039, 45063, 45068, 45095, 45151, 45191, 45192, 45210, 45214, 45271, 45278, 45310, 45326, 45331, 45345, 45385, 45403, 45418, 45432, 45438, 45439, 45440, 45455, 45469, 45473, 45491, 45598, 45600, 45601, 45611, 45620, 45719, 45720, 45722, 45734, 45758, 45791, 45839, 45881, 45900, 45937, 45959, 45969, 45987, 46044, 46096, 46239, 46244, 46250, 46302, 46351, 46352, 46393, 46396, 46425, 46435, 46438, 46477, 46519, 46556, 46572, 46588, 46589, 46618, 46633, 46636, 46660, 46712, 46720, 46723, 46752, 46761, 46815, 46821, 46839, 46841, 46871, 46872, 46899, 46900, 46951, 46957, 47007, 47072, 47101, 47106, 47113, 47159, 47197, 47202, 47233, 47235, 47325, 47338, 47343, 47372, 47396, 47407, 47448, 47465, 47493, 47512, 47521, 47567, 47576, 47580, 47582, 47679, 47705, 47744, 47760, 47784, 47785, 47801, 47838, 47915, 47936, 47941, 47946, 47996, 48000, 48057, 48065, 48082, 48096, 48104, 48132, 48136, 48156, 48170, 48173, 48194, 48200, 48207, 48246, 48250, 48252, 48284, 48290, 48340, 48341, 48365, 48372, 48391, 48475, 48524, 48527, 48528, 48529, 48531, 48548, 48555, 48564, 48581, 48597, 48602, 48609, 48630, 48634, 48638, 48645, 48655, 48712, 48724, 48758, 48768, 48777, 48868, 48882, 48889, 48891, 48894, 48908, 48952, 48964, 49020, 49051, 49087, 49125, 49150, 49211, 49231, 49234, 49259, 49287, 49294, 49327, 49351, 49352, 49356, 49388, 49429, 49447, 49489, 49503, 49517, 49539, 49541, 49542, 49545, 49557, 49561, 49563, 49584, 49616, 49633, 49641, 49649, 49658, 49669, 49682, 49689, 49703, 49721, 49803, 49814, 49841, 49856, 49888, 49934, 49959, 49989, 49995, 50038, 50049, 50055, 50080, 50119, 50138, 50148, 50150, 50154, 50165, 50205, 50242]\nThe transcript from the Whisper model: But when I had approached so near to them, the 
common object, which the sense deceives, lost not by distance any of its marks.\nWARNING:phonemizer:words count mismatch on 200.0% of the lines (2/1)",
  "metrics": {
    "predict_time": 7.261739,
    "total_time": 7.299533
  },
  "output": {
    "generated_audio": "https://replicate.delivery/pbxt/YvDHf5LaFg2FKCZieFteAxnHeKYSdfMRjbeeIBFQYgfjlM0sSA/out.wav",
    "whisper_transcript_orig_audio": "But when I had approached so near to them, the common object, which the sense deceives, lost not by distance any of its marks."
  },
  "started_at": "2024-04-21T22:20:14.694794Z",
  "status": "succeeded",
  "urls": {
    "get": "https://api.replicate.com/v1/predictions/3wfg8mpzr5rgg0cf0azajq1x50",
    "cancel": "https://api.replicate.com/v1/predictions/3wfg8mpzr5rgg0cf0azajq1x50/cancel"
  },
  "version": "6e42571a17e0fbbb0d92baa8d73c2926329cf8c3be8eedcee79822f7187b3080"
}
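The timestamps in the prediction object above are consistent with its metrics field, which can be checked with a short Python sketch. The function names and the queue_time label are illustrative choices, not part of the API; the sketch assumes the timestamps are ISO 8601 strings in UTC with a trailing "Z", as in the example.

```python
from datetime import datetime
from typing import Any, Dict


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp that ends in 'Z' (UTC)."""
    # Replacing 'Z' keeps this working on Python versions before 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def timing_summary(prediction: Dict[str, Any]) -> Dict[str, float]:
    """Derive durations in seconds from a prediction's timestamps."""
    created = parse_timestamp(prediction["created_at"])
    started = parse_timestamp(prediction["started_at"])
    completed = parse_timestamp(prediction["completed_at"])
    return {
        "queue_time": (started - created).total_seconds(),
        "predict_time": (completed - started).total_seconds(),
        "total_time": (completed - created).total_seconds(),
    }


# Timestamps copied from the example prediction above.
example = {
    "created_at": "2024-04-21T22:20:14.657000Z",
    "started_at": "2024-04-21T22:20:14.694794Z",
    "completed_at": "2024-04-21T22:20:21.956533Z",
}
summary = timing_summary(example)
print(summary)
```

For this example the derived predict_time and total_time agree with the reported metrics (7.261739 and 7.299533 seconds), and the gap between creation and start is about 0.04 seconds.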

Generated in 7.3 seconds

This output was created using a different version of the model, cjwbw/voicecraft:6e42571a.

Run time and cost

This model costs approximately $0.0047 to run on Replicate, or 212 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.
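The relationship between the per-run cost and the runs-per-dollar figure is simple arithmetic, sketched below in Python. The helper names are illustrative, and the per-run cost is only the approximate figure quoted above, so real totals will differ with the inputs.

```python
import math

# Approximate per-run cost quoted above; actual cost varies with inputs.
COST_PER_RUN_USD = 0.0047


def runs_for_budget(budget_usd: float, cost_per_run: float = COST_PER_RUN_USD) -> int:
    """Whole number of runs that fit in a budget at a fixed per-run cost."""
    return math.floor(budget_usd / cost_per_run)


def cost_of_runs(n_runs: int, cost_per_run: float = COST_PER_RUN_USD) -> float:
    """Estimated total cost of n_runs at a fixed per-run cost."""
    return n_runs * cost_per_run


print(runs_for_budget(1.0))          # 212
print(round(cost_of_runs(1000), 2))  # 4.7
```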

This model runs on Nvidia L40S GPU hardware. Predictions typically complete within 5 seconds. The predict time for this model varies significantly based on the inputs.

Readme

VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild

TL;DR

VoiceCraft is a token-infilling neural codec language model that achieves state-of-the-art performance on both speech editing and zero-shot text-to-speech (TTS) on in-the-wild data, including audiobooks, internet videos, and podcasts.

To clone or edit an unseen voice, VoiceCraft needs only a few seconds of reference.

Acknowledgement

We thank Feiteng for his VALL-E reproduction, and we thank the audiocraft team for open-sourcing encodec.

Citation

@article{peng2024voicecraft,
  author    = {Peng, Puyuan and Huang, Po-Yao and Li, Daniel and Mohamed, Abdelrahman and Harwath, David},
  title     = {VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild},
  journal   = {arXiv},
  year      = {2024},
}

Disclaimer

Any organization or individual is prohibited from using any technology mentioned in this paper to generate or edit someone’s speech without his/her consent, including but not limited to government leaders, political figures, and celebrities. If you do not comply with this item, you could be in violation of copyright laws.