Discussion
konaraddi: That’s awesome! Do you know how it compares to Handy? Handy is open source and local only too. It’s been around a while and is what I’ve been using. https://github.com/cjpais/handy
mathis: If you don't feel like downloading a large model, you can also use `yap dictate`. Yap leverages the built-in models exposed through Speech.framework on macOS 26 (Tahoe). Project repo: https://github.com/finnvoor/yap
hyperhello: Feature request or beg: let me play a speech video and transcribe it for me.
swaptr: Handy is awesome! I used it for quite a while before Claude Code added voice support. Solid software, very good linux and mac integration. Shoutout to Parakeet models as well, extremely fast and solid models for their relatively modest memory requirements.
youniverse: I love Handy and have been using it for a while too. What we need is this for mobile. I don't think there are any free apps, and native dictation is not always fully local and not as good.
parhamn: I see a lot of Whisper stuff out there. Are these the same old OpenAI Whisper models, or have they been updated heavily? I've been using Parakeet v3, which is fantastic (and tiny). I'm still confused to see Whisper out there.
ericmcer: I see quite a few of these. The killer feature to me will be one that fine-tunes the model based on your own voice. E.g., if your name is `Donold` (pronounced like Donald), there is not a transcription model in existence that will transcribe your name correctly. That means forget ever inputting your name or email; it will never output it correctly. Combine that with any subtleties of speech you have, or industry jargon you frequently use, and you will have a much more useful tool. We have a ton of options for "predict the most common word that matches this audio data" but I haven't found any "predict MY most common word" setups.
MattHart88: I've found the "corrections" feature works well for most of the jargon and misspelling use cases. Can you give it a try and let me know edge cases?
daemonologist: Whisper is still old reliable - I find that it's less prone to hallucinations than newer models, easier to run (on AMD GPU, via whisper.cpp), and only ~2x slower than Parakeet. I even bothered to "port" Parakeet to NeMo-less PyTorch to run it on my GPU, and still went back to Whisper after a couple of days.