Show HN: Mikey – No bot meeting notetaker for Windows

rs186

Microsoft Teams already provides similar built-in features, along with translation, and I have to say it is one of the rare AI tools from Microsoft that makes sense and actually works -- I had a good experience using it to review meetings held in a non-English language. It's not hard to imagine this becoming a standard feature of all mainstream video conferencing software. I wonder what place these tools will have.

darknavi

I've thoroughly enjoyed not having to anoint a "note taker" in my meetings in the last few months.

mijoharas

I was looking into something like this for Linux recently. Didn't find anything obviously simple.

(I considered hooking up whisper.cpp and a bit of audio magic to at least get transcription, but first, it seemed like a fair bit of a pain, and second, I couldn't think of a nice way to do speaker detection.)

utrack

https://github.com/m-bain/whisperX looks promising - I'm hacking away on an always-on transcriber for my notes, for later search and recall. It has support for diarization (the speaker detection you're looking for).

I'm currently hacking away on a mix of https://github.com/speaches-ai/speaches + https://github.com/ufal/whisper_streaming though - mostly because my laptop doesn't have a decent GPU, I stream the audio to a home server instead.

But overall it's pretty simple to do after you wrangle the Python dependencies - all you need is a sink for the text files (for example, create a new file for every Teams meeting, but that's another story...)
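A minimal sketch of the "sink for the text files" idea, assuming the transcriber hands over one segment at a time with a speaker label and a time offset. The function names, the file naming scheme, and the line format are hypothetical and not taken from whisperX, speaches, or whisper_streaming.

```python
from datetime import datetime
from pathlib import Path


def meeting_file(out_dir: Path, title: str, started: datetime) -> Path:
    """Build one transcript path per meeting, named by start time and title."""
    # Hypothetical naming scheme: non-alphanumeric characters become dashes.
    slug = "".join(c if c.isalnum() else "-" for c in title.lower()).strip("-")
    return out_dir / f"{started:%Y-%m-%d_%H%M}_{slug or 'untitled'}.txt"


def append_segment(path: Path, speaker: str, text: str, offset_s: float) -> None:
    """Append one transcribed segment as a single line, creating the file if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    minutes, seconds = divmod(int(offset_s), 60)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(f"[{minutes:02d}:{seconds:02d}] {speaker}: {text.strip()}\n")
```

Appending line by line keeps the file usable for search even if the transcriber is interrupted mid-meeting.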

dmantis

Looks cool. Is it possible to use a local model (like whisper) to avoid leaking conversations to the cloud-based AI?

hotrod46

That’s what’s planned next :)

bbor

What does “no bot” mean? I don’t see any elaboration, tho maybe I’m just blind!

simplemindedbot

There’s not a “bot” that needs to attend the meeting and show up in the list of attendees, thus giving away that the call is being recorded. Otter.ai, for instance, shows up as “Otter” (or another name) on a Zoom call when it is recording and taking notes.

Cheer2171

Oh, so it is for more "seamlessly" helping people commit the crime of wiretapping in two-party consent jurisdictions, like California?

If you don't like people knowing you are recording them, you probably have a consent issue.

stevenAthompson

You could have said this exact same thing without it sounding like a personal attack, but you chose to be unkind instead. I wonder why?

adewinter

Should your concern lie with individuals transcribing their own conversations, or with mass surveillance and wiretapping actively being executed by a broad range of official and corporate entities without your consent?

maccard

Not affiliated, but I'd guess it doesn't have a "bot" account join the Zoom/Meet call.

hotrod46

The other meeting note takers usually have a bot join the meeting to take notes, which seemed a bit strange to me.

alkonaut

Something I find annoying with automatic transcriptions and summaries, like the one built into Teams, is that they lack the context necessary to properly interpret what's being said. For example, if I have a meeting discussing products, abbreviations or systems with "internal" names, then it can't discern them or statistically rejects them, replacing them with its best guess for a dictionary word instead. So say we have a long call involving frequent mentions of a measure called pNet, pronounced "Peenet" in the meeting. Then you end up with a transcription of a bunch of guys having a discussion about penises. Hilarious, the first few times. OK, always hilarious, but not so useful.

Being able to set the system prompt for these transcriptions would be very useful. Like "You are a friendly bot transcribing meetings at a software company. Some common terms and abbreviations you'll encounter are...".
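A small sketch of what assembling such a prompt from a glossary could look like. The function name and output layout are hypothetical, and whether a given transcription backend accepts a text prompt at all is an assumption to check (Whisper's `transcribe` does take an `initial_prompt` argument, but the tool discussed here may not expose one).

```python
def build_glossary_prompt(terms: dict[str, str], context: str) -> str:
    """Format internal terms as a short context prompt for a transcription or summary model."""
    lines = [context.strip(), "Terms and abbreviations that may come up:"]
    for term, hint in sorted(terms.items()):
        # A hint is optional; bare terms are listed on their own.
        lines.append(f"- {term}: {hint}" if hint else f"- {term}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_glossary_prompt(
        {"pNet": "internal metric, pronounced 'pee-net'", "Kubernetes": ""},
        "Meeting at a software company.",
    ))
```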

_joel

My favourite was Kubernetes in our meeting being referred to as Cuban Eighties. ⎈

sys_64738

Perhaps these will be flagged for the CIA or DEA to investigate due to illegal importation of Cubans from the enemy!

collinmcnulty

Gong has such a feature. It’ll even expand out acronyms the first time they show up in the transcript.

jvanderbot

This should be trivially solvable with a glossary as context, as you suggest. I bet the above repo would love a PR, too!

sesm

But the error happens in the 'audio to text' part, so a text prompt won't solve it. The way to fix it is probably fine-tuning the underlying audio-to-text model.

alkonaut

Doing audio-to-text requires having a statistical model for what word or phrase a piece of sound is most likely to be. Without context, you can't do better than ranking the most likely candidates where a common word is more likely than an uncommon one. Having a task-specific dictionary at that point would help.

One could also imagine doing it at the summary step, where the AI could simply be asked to do phonetic analysis. "Here is a transcription of a meeting. Here is a list of terms/names/participants etc. Given the transcription, the meeting context/topics and assuming the transcriber has made errors, replace similarly sounding words and terms with more likely ones from the context."

sirjaz

Looks awesome, love that it is a local native app

ForHackernews

>transcribing it using the Groq API

It's not really local: it sends all the audio to some cloud AI API.

troyvit

I'm not familiar with Groq, but it looks like:

https://sdk.vercel.ai/providers/ai-sdk-providers/groq

Some open models support it. It seems that, in theory, you could use your own cloud AI then, right?

oersted

There's still a surprising lack of good video call recording services that can be controlled programmatically, unlike the end-to-end SaaS apps like Read.ai or Otter.ai.

The only open-source one I could find is Amurex, which looks promising. But it only supports Google Meet for now, it does it a bit differently with a Chrome extension, and it is generally rather immature, but I do wish them the best.

The only API services available are Recall.ai and MeetingBaaS. They both support the big three (Google Meet, Microsoft Teams and Zoom), but they are rather expensive at $0.50 to $1 per hour. The Calendar Syncing feature is also locked behind enterprise tiers with additional monthly fees in the hundreds, and it is rather important for real-world use.

ttul

Has anyone done this on the Mac? I hate sending audio to Otter; it creeps me out.

simplemindedbot

Spellar.ai does a great job. There’s others out there for Mac but I like Spellar’s calendar integration.

Interestingly, their initial raison d’être was to help with English pronunciation and speaking speed, giving you real-time feedback. They’ve downplayed this in recent releases, but the functionality is still there. That said, I’m a native English speaker and it always flagged me as pronouncing words incorrectly, even though I have little regional accent. (Others have told me this; it’s not just my opinion. My mother was a speech therapist, hence the lack of accent.)

simplemindedbot

As an additional note, Spellar does let you bring your own OpenAI key but does not allow for purely local processing. You’ve still got to send the audio out for transcription and interpretation.

Also, I have no affiliation with Spellar, just a user.

doug_life

https://speechpulse.com does fully local audio transcription. The UI and settings are not the most intuitive, but it works fairly well and they are making constant updates.
