
GPT-4o mini transcribe is better and actually real-time. Whisper is trained to encode the entire audio (or at least 30-second chunks) and then decode it.


So "GPT-4o mini transcribe" is not just Whisper v3 under the hood? Btw, it's $0.006 / minute.

For the Whisper API online (with v3 large), I've found "$0.00125 per compute second", which is the absolute cheapest I've ever found.


> So it's not just Whisper v3 under the hood?

Why should it be Whisper v3? They even released an open model: https://huggingface.co/mistralai/Voxtral-Mini-4B-Realtime-26...


DeepInfra offers Whisper v3 at $0.00045 / minute of transcribed audio.
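
The prices quoted in this thread use different units (per audio minute vs. per compute second), so they are not directly comparable. A rough sketch of putting them on one scale, USD per hour of audio, is below. The function names are made up for illustration, and the real-time factor (compute seconds spent per second of audio) is an assumption you would have to measure for a given provider and hardware; nothing in the thread states it.

```python
# Normalize the prices quoted in the thread to USD per hour of audio.
# RTF (real-time factor) = compute seconds spent per second of audio.
# RTF is NOT given in the thread; any value passed in is an assumption.

SECONDS_PER_HOUR = 3600
MINUTES_PER_HOUR = 60


def per_minute_to_hourly(usd_per_audio_minute: float) -> float:
    """Cost of one hour of audio when billed per minute of audio."""
    return usd_per_audio_minute * MINUTES_PER_HOUR


def per_compute_second_to_hourly(usd_per_compute_second: float, rtf: float) -> float:
    """Cost of one hour of audio when billed per second of compute."""
    return usd_per_compute_second * SECONDS_PER_HOUR * rtf


def break_even_rtf(usd_per_compute_second: float, usd_per_audio_minute: float) -> float:
    """RTF at which compute-second billing costs the same as per-minute billing."""
    usd_per_audio_second = usd_per_audio_minute / 60
    return usd_per_audio_second / usd_per_compute_second


if __name__ == "__main__":
    print("per-minute at $0.006:   ", per_minute_to_hourly(0.006))
    print("per-minute at $0.00045: ", per_minute_to_hourly(0.00045))
    # Hypothetical RTF of 0.05 (20x faster than real time), for illustration only.
    print("compute-second, RTF 0.05:", per_compute_second_to_hourly(0.00125, 0.05))
    print("break-even RTF vs $0.006/min:", break_even_rtf(0.00125, 0.006))
```

With these numbers, $0.006 / minute is $0.36 per audio hour and $0.00045 / minute is $0.027 per audio hour, while $0.00125 per compute second is $4.50 per audio hour multiplied by the RTF. So the compute-second price only matches $0.006 / minute if the model runs about 12.5x faster than real time (RTF of roughly 0.08).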


The linked article claims the average word error rate for Voxtral mini v2 is lower than that of GPT-4o mini transcribe.


GPT-4o mini transcribe is better than Whisper; the context is the parent comment.



