Input schema
The fields you can use to run this model with an API. If you don’t give a value for a field, its default value will be used.
Field | Type | Default value | Description |
---|---|---|---|
audio_input | string | | An audio file to time-stretch. |
method | string (enum) | WSOLA | Name of the method to use. Options: OLA, WSOLA, PV-TSM, PV-TSM-int, TD-PSOLA. |
s_fixed | number | 1 | Time-stretching factor s as a constant value. |
s_ap | string | | Input/output anchor point pairs for dynamic time stretching. The time-stretching factor s is given as a 2 x n array of anchor points, formatted as a dict: [`input relative frame ratio`(0.0~1.0):`output relative frame ratio`]. For example, [0:0, 0.5:1, 1:1.7] means the first half (0~50%) of the audio is stretched 2x and the second half (50~100%) is stretched to 140%. Each input value must be between 0.0 and 1.0 and represents the relative position of the anchor (0: start of the audio, 1: end of the audio). When `s_ap` is given, `s_fixed` is ignored. |
td_psola_pitch_shift | string (enum) | None | Only for the `TD-PSOLA` method. If `key`, the pitch is shifted based on `td_psola_key_updown`. If `pitch`, the pitch is shifted based on `td_psola_pitch_ratio`. If `None`, only time stretching based on `s_fixed` is performed and no pitch shifting is applied. Options: key, pitch, None. |
td_psola_key_updown | integer | | Only for the `TD-PSOLA` method when `td_psola_pitch_shift` is `key`. Pitch shift amount in keys of the 12-key system (e.g. 3: 3 keys up, -5: 5 keys down, 12: +1 octave). |
td_psola_pitch_ratio | number | | Only for the `TD-PSOLA` method when `td_psola_pitch_shift` is `pitch`. Pitch shift amount as a ratio relative to the original pitch (e.g. 1.0: original pitch, 0.5: -1 octave, 2.0: +1 octave). |
td_psola_dynamic_key | string | | Only for the `TD-PSOLA` method when `td_psola_pitch_shift` is `key`. Overrides `td_psola_key_updown`. Dynamic pitch shift, formatted as a dict: [`relative frame ratio`(0.0~1.0):`key_shift_amount`]. For example, [0.3:1, 0.6:-2] means the first 0~30% of the audio keeps the original key, 30~60% is shifted +1 key, and 60~100% is shifted -2 keys. |
td_psola_dynamic_pitch | string | | Only for the `TD-PSOLA` method when `td_psola_pitch_shift` is `pitch`. Overrides `td_psola_pitch_ratio`. Dynamic pitch shift, formatted as a dict: [`relative frame ratio`(0.0~1.0):`pitch_shift_amount`]. For example, [0.5:2, 0.8:1.3] means the first 0~50% of the audio keeps the original pitch, 50~80% is shifted +1 octave, and 80~100% is shifted to 130% of the original pitch. |
absolute_second | boolean | False | If `True`, `s_ap` and `td_psola_dynamic_*` positions are given in absolute seconds. If `False`, they are given as relative ratios (0: start of the audio, 1: end of the audio). |
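The arithmetic behind the `s_ap` and `td_psola_key_updown` examples can be checked with a short sketch. The helper names below (`parse_anchor_points`, `segment_stretch_factors`, `key_to_pitch_ratio`) are hypothetical and not part of the model. The parsing is only a guess at how the `position:value` string format might be read, and the key-to-ratio conversion assumes equal temperament, which is consistent with the documented examples (12 keys and a ratio of 2.0 both mean +1 octave).

```python
def parse_anchor_points(s_ap):
    """Parse a string such as "[0:0, 0.5:1, 1:1.7]" into (position, value) float pairs."""
    body = s_ap.strip().strip("[]")
    pairs = []
    for item in body.split(","):
        position, value = item.split(":")
        pairs.append((float(position), float(value)))
    return pairs


def segment_stretch_factors(pairs):
    """Return (input_start, input_end, factor) for each pair of neighbouring anchors.

    The factor is the output span divided by the input span of the segment.
    """
    segments = []
    for (in0, out0), (in1, out1) in zip(pairs, pairs[1:]):
        segments.append((in0, in1, (out1 - out0) / (in1 - in0)))
    return segments


def key_to_pitch_ratio(keys):
    """Convert a key shift to a pitch ratio, assuming 12 equal steps per octave."""
    return 2.0 ** (keys / 12)


anchors = parse_anchor_points("[0:0, 0.5:1, 1:1.7]")
for start, end, factor in segment_stretch_factors(anchors):
    # For the documented example the factors come out at about 2.0 and 1.4.
    print(f"{start:.0%} to {end:.0%}: stretched by a factor of {factor:.2f}")

print(key_to_pitch_ratio(12))  # one octave up
```

Under these assumptions, the segment factors reproduce the "2x" and "140%" figures quoted for `s_ap`.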
Output schema
The shape of the response you’ll get when you run this model with an API.
Schema
{'format': 'uri', 'title': 'Output', 'type': 'string'}
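Taken together, the two schemas describe a request that sends the input fields and receives a single URI string. The following is a minimal sketch of assembling such an input and checking the output shape. `build_input` and `looks_like_uri` are hypothetical helpers, the example URL is a placeholder, and rejecting `td_psola_*` fields for other methods is a choice made here for illustration (the model itself may simply ignore them).

```python
from urllib.parse import urlparse

# Enum values copied from the input schema.
METHODS = {"OLA", "WSOLA", "PV-TSM", "PV-TSM-int", "TD-PSOLA"}
PITCH_SHIFT_MODES = {"key", "pitch", "None"}


def build_input(audio_input, method="WSOLA", s_fixed=1, **optional):
    """Assemble an input dict using the field names from the input schema."""
    if method not in METHODS:
        raise ValueError(f"unknown method: {method!r}")
    mode = optional.get("td_psola_pitch_shift", "None")
    if mode not in PITCH_SHIFT_MODES:
        raise ValueError(f"unknown td_psola_pitch_shift: {mode!r}")
    # Assumption of this sketch: treat td_psola_* fields with another method as an error.
    td_fields = [name for name in optional if name.startswith("td_psola_")]
    if td_fields and method != "TD-PSOLA":
        raise ValueError(f"{td_fields} only apply to the TD-PSOLA method")
    payload = {"audio_input": audio_input, "method": method, "s_fixed": s_fixed}
    payload.update(optional)
    return payload


def looks_like_uri(value):
    """Loose check that an output value is a string carrying a URI scheme."""
    return isinstance(value, str) and bool(urlparse(value).scheme)


payload = build_input(
    "https://example.com/input.wav",  # placeholder, not a real file
    method="TD-PSOLA",
    s_fixed=1.5,
    td_psola_pitch_shift="key",
    td_psola_key_updown=3,
)
print(payload)
```

The resulting dict would then be passed as the input of whichever API client is used to run the model, and the returned value could be checked with `looks_like_uri` before downloading it.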