audioscrape/whisperx:f2fa4577 | Run with an API on Replicate

These are the input fields you can use to run this model with an API. If you don’t give a value for a field, its default value is used.

Field	Type	Default value and limits	Description
audio	string		Audio file (supports up to 4 hours)
min_speakers	integer	Min: 1, Max: 20	Minimum number of speakers (None = auto-detect)
max_speakers	integer	Min: 1, Max: 20	Maximum number of speakers (None = auto-detect)
language	string		Language code (e.g., 'en'). Leave empty for auto-detect
huggingface_token	string		HuggingFace token for speaker diarization (required)
batch_size	integer	8 (Min: 1, Max: 32)	Batch size for transcription (lower for long audio)
enable_diarization	boolean	True	Enable speaker diarization
return_word_timestamps	boolean	True	Return word-level timestamps
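
The limits in the table can be checked on the client before a request is sent. The following is a minimal sketch, not part of the model or of any client library: `build_input` is a hypothetical helper, the example URL is a placeholder, and three of its checks are assumptions that the table does not state (that `audio` must be non-empty, that `min_speakers` may not exceed `max_speakers`, and that `huggingface_token` is only needed while `enable_diarization` is true).

```python
from typing import Any, Dict, Optional


def build_input(
    audio: str,
    huggingface_token: Optional[str] = None,
    min_speakers: Optional[int] = None,
    max_speakers: Optional[int] = None,
    language: Optional[str] = None,
    batch_size: int = 8,
    enable_diarization: bool = True,
    return_word_timestamps: bool = True,
) -> Dict[str, Any]:
    """Return an input payload that respects the limits in the field table."""
    # Assumption: an empty audio value is treated as an error.
    if not audio:
        raise ValueError("audio is required")
    if not 1 <= batch_size <= 32:
        raise ValueError("batch_size must be between 1 and 32")
    for name, value in (("min_speakers", min_speakers), ("max_speakers", max_speakers)):
        if value is not None and not 1 <= value <= 20:
            raise ValueError(f"{name} must be between 1 and 20")
    # Assumption: the table does not say so, but min > max is rejected here.
    if min_speakers is not None and max_speakers is not None and min_speakers > max_speakers:
        raise ValueError("min_speakers cannot exceed max_speakers")
    # Assumption: the token is only needed when diarization is enabled.
    if enable_diarization and not huggingface_token:
        raise ValueError("huggingface_token is required for speaker diarization")

    payload: Dict[str, Any] = {
        "audio": audio,
        "batch_size": batch_size,
        "enable_diarization": enable_diarization,
        "return_word_timestamps": return_word_timestamps,
    }
    optional = {
        "huggingface_token": huggingface_token,
        "min_speakers": min_speakers,
        "max_speakers": max_speakers,
        "language": language,
    }
    # Leave unset fields out so the model applies its defaults or auto-detection.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


if __name__ == "__main__":
    # Placeholder URL and token, for illustration only.
    print(build_input("https://example.com/meeting.mp3", huggingface_token="hf_placeholder"))
```

The returned dictionary is meant to be used as the input of a prediction request. The request itself is left out because it needs network access and an API token.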

This is the shape of the response you’ll get when you run this model with an API.

Schema

{"additionalProperties": true, "title": "Output", "type": "object"}
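
The schema only promises a JSON object with arbitrary properties, so client code cannot rely on any particular key being present. One cautious way to start is to inspect what actually came back before reading specific fields. The sketch below assumes the response has already been decoded into Python values. `describe_output` is a hypothetical helper, not something the model or Replicate provides.

```python
from typing import Any, Dict


def json_type_name(value: Any) -> str:
    """Return the JSON type name for a decoded Python value."""
    # bool is checked before int because bool is a subclass of int in Python.
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"not a JSON value: {type(value).__name__}")


def describe_output(output: Any) -> Dict[str, str]:
    """Map each top-level key of the response to the JSON type of its value."""
    # The schema says the top level is an object, so anything else is rejected.
    if not isinstance(output, dict):
        raise TypeError("expected a JSON object at the top level")
    return {key: json_type_name(value) for key, value in output.items()}
```

Logging the result of `describe_output` for a first real response shows which keys this version returns, and later code can then read those keys with `dict.get` and explicit fallbacks.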