Show HN: Scriber Pro – Offline AI transcription for macOS
94 comments
October 15, 2025
pmarreck
Does it do separate speaker identification (diarization)?
What's the stack, if I may ask? (I believe Whisper-X does the diarization thing)
geerlingguy
I've been using MacWhisper for this, with a huge variety of transcription options and things like speaker detection. It works great for all the 1 hour and shorter videos I've fed it, but does this have more to offer?
I haven't tried a 4+ hour video with MacWhisper but I presume that would work the same.
rezivor
Please be my guest to test my claims. No tall tales here!
gcr
MacWhisper handles multiple-hour-long recordings just fine for me. I regularly process 4hrs on MacWhisper. Even whisper-cpp works fine these days for long recordings too.
Cool product, but it would be better if you stopped spreading misinformation to support it.
mattstudio
One thing that Rev and other online services have as well as MacWhisper is a good interface for editing the text to correct inevitable errors. Being able to click on the text and have it sync to the correct place in the audio is a must for my use case of transcribing interviews. Also speaker diarization.
rezivor
Scriber's iCloud system automatically backs up each transcription and organizes them in a three-pane folder view, somewhat inspired by Bars’ layout. This structure allows a surprising degree of customization for all your data needs, especially when transcribing interviews. It would probably make for a very comfortable workflow here.
torstenvl
You use the word "transcribe" but the page doesn't appear to support that claim? This looks like straightforward STT? Or does it actually support transcription (diarization, etc.)?
(Also, the text is completely illegible on your site.)
rezivor
r/#FF0000_rage
yewenjie
Does it support speaker diarization?
scilro
Seconding/thirding the request for diarization! I would use this as my main transcription app if it had that.
rezivor
I use it as my own transcription app, and I really do love it (biased, I know, but genuinely).
oasisbob
Timecode drift is an interesting issue. I think I faced this recently while translating a Google Meet transcript into an incident report timeline.
The elapsed-time timestamps didn't correlate well with other data sources. I figured it was a mistake on my end, and just brushed it off.
Telemakhos
What languages does this support? Does it support switching between multiple languages in one video?
For example, could it support a video that included spoken Latin, ancient Greek, German, and Italian?
rezivor
eng der fr de es it pt ru zh ko ar and ja
Telemakhos
So, can it handle multiple languages in one video, or do you need to segment the different languages using LID first? This has been a thorny issue for people working in multilingual audio (there are at least two or three of us).
rezivor
I haven't tested that specific edge case, I'm sorry. I tested two languages in a normal conversation and that worked fine. The "Auto" or "English" settings handle multiple languages the best.
CrazyCatDog
Question: can it discern (and label) different speakers? If so, could you kindly share the limit on speakers per video?
CharlesW
MacWhisper Pro supports this, if your need for this is time-sensitive. https://macwhisper.helpscoutdocs.com/article/32-automatic-sp...
rezivor
No, not yet! That will definitely be included in the next update next month. Thank you for reminding me of people's unique need for this use case.
oidar
You are looking for speaker diarization. No one is doing this well currently on device (in macOS land at least).
constantinum
I sort of use SuperWhisper, it is sort of good. https://superwhisper.com/
xnx
You can also run Whisper locally in your browser for free: https://ggml.ai/whisper.cpp/
rezivor
Great when you have time to kill and not a lot to process I suppose
nubg
How does it compare to MacWhisper?
rezivor
MacWhisper crashes at about an hour of context. This uses smart, invisible regex in the text generation pipe, which makes it fast. Bonus: there is no context limit.
barapa
Smart invisible regex makes it fast and prevents it from crashing? What does that mean?
grosswait
I've done 3+ hours with MacWhisper without issue? One downside is that the transcription is not real time. Can Scriber Pro do real time?
pmarreck
> Smart invisible regex
I've never heard a regex person speak this way of a regex.
Please tell me you didn't vibecode the regex... one of the areas it's still not good at
CharlesW
> MacWhisper crashes at about an hour of context.
This is not true. (I've been a MacWhisper user since 2023. I have hit two bugs during that time, which the author addressed quickly.)
fady0
I am a MacWhisper Pro user, and I successfully transcribed and translated a 15-hour course inside the app without any issues
fl_rn_st
"Smart, invisible regex" sounds like a lot of bs... could you give a more technical explanation?
Also, the Whisper model doesn't really have a context window; it already segments the audio with a certain amount of overlap between the chunks. I really have a hard time understanding what you are trying to say here.
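To make the chunking point concrete, here is a minimal sketch of splitting a long recording into fixed windows with overlap. The 30-second window matches the input length the Whisper model takes; the 5-second overlap and the `chunk_spans` helper are assumptions for illustration only, not how Whisper, MacWhisper, or Scriber Pro actually implement it.

```python
from typing import List, Tuple


def chunk_spans(
    total_s: float, window_s: float = 30.0, overlap_s: float = 5.0
) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds for overlapping windows.

    Hypothetical helper: window_s and overlap_s are illustrative defaults.
    """
    if window_s <= 0 or not 0 <= overlap_s < window_s:
        raise ValueError("need window_s > 0 and 0 <= overlap_s < window_s")

    step = window_s - overlap_s  # how far each new window advances
    spans: List[Tuple[float, float]] = []
    start = 0.0
    while start < total_s:
        end = min(start + window_s, total_s)  # clip the last window
        spans.append((start, end))
        if end >= total_s:
            break  # the recording is fully covered
        start += step
    return spans


if __name__ == "__main__":
    # A 4-hour recording is just more windows, not a bigger "context".
    print(len(chunk_spans(4 * 3600)))
```

The number of windows grows linearly with the length of the recording, so under this scheme a longer file costs more time but each model call sees the same amount of audio.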
rezivor
Whisper will fail >99%* of the time (edit: most of the time) at lengths over 90 minutes, and at a fairly high rate over one hour.
gcr
What do you mean context limit?
Neither Whisper nor MacWhisper has any context limit.
Hey HN! Built this because I was tired of waiting hours for transcription services and didn't want to upload sensitive recordings to the cloud.