lucataco / csm-1b

CSM (Conversational Speech Model) is a speech generation model from Sesame that generates RVQ (residual vector quantization) audio codes from text and audio inputs.

  • Public
  • 353 runs
  • L40S
  • GitHub
  • Weights
  • License
  • Prediction

    lucataco/csm-1b:e5223446b8f887df531a46461280b9352ffd58bdda729cbbc65f8991e93564a3
    ID
    4ajtra748hrge0cnmtmtz25d68
    Status
    Succeeded
    Source
    Web
    Hardware
    T4
    Total duration
    Created

    Input

    text
    Hello, this is a test of the speech generation model
    speaker
    0
    max_audio_length_ms
    10000

    Output

    [Audio output]
    Generated in
  • Prediction

    lucataco/csm-1b:e5223446b8f887df531a46461280b9352ffd58bdda729cbbc65f8991e93564a3
    ID
    09r7ja5931rga0cnmtrsng0dx8
    Status
    Succeeded
    Source
    Web
    Hardware
    T4
    Total duration
    Created

    Input

    text
    This is CSM by Sesame, generate FVQ audio codes from text
    speaker
    0
    max_audio_length_ms
    10000

    Output

    [Audio output]
    Generated in
  • Prediction

    lucataco/csm-1b:3e59b10a9894c54ae5f2fc0347e3a2f5c82f0574407e53a7d9f76ec7c502ad03
    ID
    5njhzfer0nrma0cnq75v9x4g9g
    Status
    Succeeded
    Source
    Web
    Hardware
    L40S
    Total duration
    Created
    by @lucataco

    Input

    text
    This is CSM by Sesame, generate FVQ audio codes from text
    speaker
    0
    max_audio_length_ms
    10000

    Output

    [Audio output]
    Generated in
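
The predictions above all use the same three inputs (text, speaker, max_audio_length_ms). As a rough sketch, a similar request could be assembled for Replicate's HTTP predictions endpoint using only the Python standard library. The helper names build_payload and create_prediction are hypothetical, the default values are simply the ones shown in the examples above (not confirmed model defaults), and the token is assumed to be available in the REPLICATE_API_TOKEN environment variable.

```python
import json
import os
import urllib.request

# Version hash copied from the most recent prediction listed above.
MODEL_VERSION = "3e59b10a9894c54ae5f2fc0347e3a2f5c82f0574407e53a7d9f76ec7c502ad03"
API_URL = "https://api.replicate.com/v1/predictions"


def build_payload(text, speaker=0, max_audio_length_ms=10000):
    """Build the JSON body for a prediction request.

    Input names mirror the fields shown on the page; the defaults are
    the example values, assumed here for illustration only.
    """
    if not text:
        raise ValueError("text must be non-empty")
    return {
        "version": MODEL_VERSION,
        "input": {
            "text": text,
            "speaker": int(speaker),
            "max_audio_length_ms": int(max_audio_length_ms),
        },
    }


def create_prediction(payload, token=None):
    """POST the payload and return the decoded JSON response (needs network)."""
    token = token or os.environ["REPLICATE_API_TOKEN"]
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a valid token set, create_prediction(build_payload("Hello, this is a test of the speech generation model")) should return a JSON object describing the new prediction, including fields such as id and status like those listed above. The prediction may still be running when the call returns, so the audio output would typically be fetched afterwards.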
