openai / gpt-4o-transcribe
A speech-to-text model that uses GPT-4o to transcribe audio (Updated 3 weeks, 6 days ago)
- Public
- 995 runs
- Priced by multiple properties
- Commercial use
- License
Prediction
openai/gpt-4o-transcribe (Official model)
- ID: qj3j4z46hdrm80cpxs4ra9jsm8
- Status: Succeeded
- Source: Web
Input
- language
- en
- audio_file
- temperature
- 0
Output
So we just added GPT-4o transcribe to Replicate and thought you'd want to know. It's basically a speech-to-text model that uses GPT-4o to turn your audio into text. The cool thing is that it's noticeably better than the Whisper models we've been using, fewer errors, better at recognizing different languages, and just more accurate overall. If you've ever been frustrated with transcripts that mess up technical terms or struggle with different accents, you'll probably appreciate this upgrade. It just works better. Some quick tech specs if you're curious. It has a 16,000 token context window, which means it can handle longer audio clips in one go. And it can output up to 2,000 tokens, so you'll get nice complete transcripts. The model's knowledge is current up to June 2024, so it's pretty up-to-date with language and terminology.
- Input tokens: 910
- Output tokens: 170
- Tokens per second: 66.02 tokens / second
Prediction
openai/gpt-4o-transcribe
Input
- language
- en
- audio_file
- temperature
- 0
Output
I refuse to accept the idea that man is mere flotsam and jetsam in the river of life, unable to influence the unfolding events which surround him. I refuse to accept the view that mankind is so tragically bound to the starless midnight of racism and war that the bright daybreak of peace and brotherhood can never become a reality. I refuse to accept the cynical notion that nation after nation must spiral down a militaristic stairway into the hell of nuclear annihilation.
- Input tokens: 810
- Output tokens: 97
- Tokens per second: 21.70 tokens / second
Want to make some of these yourself?
Run this model
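To run it programmatically, here is a minimal sketch using Replicate's Python client. It assumes the `REPLICATE_API_TOKEN` environment variable is set and uses the input parameters shown in the predictions above (`language`, `audio_file`, `temperature`); the audio URL is a placeholder you'd replace with your own file.

```python
def build_input(audio_url, language="en", temperature=0.0):
    """Assemble the input payload matching the predictions shown above."""
    return {
        "audio_file": audio_url,
        "language": language,
        "temperature": temperature,
    }

if __name__ == "__main__":
    import replicate  # pip install replicate; reads REPLICATE_API_TOKEN

    # Hypothetical audio URL -- swap in a link to your own recording.
    transcript = replicate.run(
        "openai/gpt-4o-transcribe",
        input=build_input("https://example.com/speech.wav"),
    )
    print(transcript)
```

Keeping `temperature` at 0, as in the examples above, gives the most deterministic transcripts.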