
Launch HN: Recall.ai (YC W20) – API for meeting recordings and transcripts

25 comments

September 10, 2025

Hey HN, we're David and Amanda from Recall.ai (https://www.recall.ai). Today we’re launching our Desktop Recording SDK, a way to get meeting data without a bot in the meeting: https://www.recall.ai/product/desktop-recording-sdk. It’s our biggest release in quite a while so we thought we’d finally do our Launch HN :)

Here’s a demo that shows it producing a transcript from a meeting, followed by examples in code: https://www.youtube.com/watch?v=4croAGGiKTA . API docs are at https://docs.recall.ai/.

Back in W20, our first product was an API that lets you send a bot participant into a meeting. This gives developers access to audio/video streams and other data in the meeting. Today, this API powers most of the meeting recording products on the market.

Recently, meeting recording through a desktop form factor instead of a bot has become popular. Many products like Notion and ChatGPT have added desktop recording functionality, and LLMs have made it easier to work with unstructured transcripts. But it’s actually hard to reliably record meetings at scale with a desktop app, and most developers who want to add recording functionality don’t want to build all this infrastructure.

Doing a basic recording with just the microphone and system audio is fairly straightforward since you can just use the system APIs. But it gets a lot harder when you want to capture speaker names, produce a video recording, get real-time data, or run this in production at large scale:

- Capturing speaker names involves using accessibility APIs to screen-scrape the video conference window to monitor who is speaking at what time. When a video conferencing platform changes its UI, we have to ship a fix immediately so that this keeps working.

- Producing a clean video recording that doesn’t capture the video conferencing platform UI involves detecting the participant tiles, cropping them out, and compositing them together.

- Because the desktop recording code runs on end-user machines, we need to make it as efficient as possible. This means writing highly platform-optimized code, taking advantage of hardware encoders when available, and spending a lot of time doing profiling and performance testing.
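To make the first point concrete, here is a minimal sketch of how "who is speaking at what time" observations could be merged with transcript segments. It is an illustration under stated assumptions, not Recall.ai's implementation: the `SpeakerEvent` and `Segment` types and the `attribute_speakers` function are hypothetical names, and it assumes the speaker observations have already been reduced to (name, start, end) intervals in seconds.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SpeakerEvent:
    """Hypothetical record: `name` was shown as the active speaker from start to end (seconds)."""
    name: str
    start: float
    end: float


@dataclass
class Segment:
    """Hypothetical transcript segment with timestamps in seconds."""
    text: str
    start: float
    end: float
    speaker: Optional[str] = None


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length of the intersection of two time intervals, or 0.0 if they do not intersect."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def attribute_speakers(segments: List[Segment], events: List[SpeakerEvent]) -> List[Segment]:
    """Label each segment with the speaker whose active time overlaps it the most.

    Segments with no overlapping speaker event keep speaker=None.
    """
    for seg in segments:
        totals: Dict[str, float] = {}
        for ev in events:
            shared = overlap(seg.start, seg.end, ev.start, ev.end)
            if shared > 0:
                totals[ev.name] = totals.get(ev.name, 0.0) + shared
        seg.speaker = max(totals, key=totals.get) if totals else None
    return segments
```

Picking the speaker with the largest total overlap is only one possible rule; it tolerates small timing offsets between the UI observations and the transcript, but it will mislabel segments where two people talk over each other.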

Meeting recording has zero margin for failure: if anything breaks, you lose the data forever. That makes reliability especially important, which dramatically increases the amount of engineering effort required.

Our Desktop Recording SDK takes care of all this and lets developers build meeting recording features into their desktop apps, so they can record both video conferences and in-person meetings without a bot.

We built Recall.ai because we experienced this problem ourselves. At our first startup, we built a tool for product managers that included a meeting recording feature. 70% of our engineering time was taken up by just this feature! We ended up starting Recall.ai to solve this instead. Since then, over 2000 companies have come to use us to power their recording features, e.g. HubSpot for sales call recording and ClickUp for their AI note taker. Our users are engineering teams building commercial products for financial services, telehealth, incident management, sales, interviewing, and more. We also power internal tooling for large enterprises.

Running this sort of infrastructure has led to unexpected technical challenges! For example, we had to debug a 1 in 36 million segfault in our audio encoder (https://www.recall.ai/blog/debugging-a-1-in-36-000-000-segfa...), we encountered a Postgres lock-up that only occurs when you have tens of thousands of concurrent writers (https://news.ycombinator.com/item?id=44490510), and we saved over $1M a year on AWS by optimizing the way we shuffle data around between our processes (https://news.ycombinator.com/item?id=42067275).

You can try it here: https://www.recall.ai. It's self-serve with $5 of free credits. Pricing starts at $0.70 for every hour of recording, prorated to the second. We offer volume discounts with scale.
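As a rough illustration of what "prorated to the second" means at the starter rate, here is a small sketch. The function name `recording_cost` and the rounding to four decimal places are assumptions made for the example, and it ignores free credits and volume discounts.

```python
from decimal import Decimal

# Starter rate quoted above: $0.70 per hour of recording.
STARTER_RATE_PER_HOUR = Decimal("0.70")


def recording_cost(seconds: int, hourly_rate: Decimal = STARTER_RATE_PER_HOUR) -> Decimal:
    """Cost in dollars of a recording of `seconds` length, prorated to the second.

    Rounding to 4 decimal places is an assumption for display purposes only.
    """
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    cost = hourly_rate * Decimal(seconds) / Decimal(3600)
    return cost.quantize(Decimal("0.0001"))


if __name__ == "__main__":
    for label, secs in [("90 seconds", 90), ("30 minutes", 1800), ("1 hour", 3600)]:
        print(f"{label}: ${recording_cost(secs)}")
```

At this rate, the $5 of free credits covers roughly 7.1 hours of recording (5 / 0.70).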

All data recorded through Recall.ai is the property of our customers. We support 0-day retention, and we don’t train models on customer data.

We would love your feedback!

agwp

Have you explored using speaker diarization and speaker identification, given that pyannote etc. takes this approach?

I'm curious given your decision to capture speaker names from the screen. I see the merits during desktop recording, but I can also see how this limits utility when trying to offer the same functionality across desktop and other scenarios (e.g. in-person meetings, audio uploads etc.)

giveita

I have recorded a Zoom meeting with Loom, so I get it. I think for corporates, though, the integrated approaches are so convenient: have your meeting and get a summary email by doing nothing (or with a one-click opt-in). I feel like your solution is for edge cases where the mainstream ways are not possible.

apollopower

Congrats Recall team, I've been a customer for the past 1.5 years and the entire infra layer behind meetings has let our company really focus on "what makes our beer taste better" instead of having to worry about building universal support for different (and tedious) platforms like Teams and Zoom. Eager to give the desktop recording sdk a try soon.

davidgu

Thanks and love to hear this!

orliesaurus

Congrats on your launch. Amanda has the strongest LinkedIn game I have ever seen in my life. On the other hand, the product is IMHO at risk? Models like Whisper, DistilWhisper, TinyLlama, miniGPT-4, OpenHermes, Vosk, and Llama.cpp make Recall.ai meeting transcription easy to replicate. IMHO in 1 weekend you can build an open-source tech stack that can rival or EVEN surpass the value brought... or am I tripping?

chaos_emergent

Customer here, you're tripping. Recall provides transcription as an auxiliary service, not their core value prop.

Recall is, at its core, an API for bot recording. As someone building an application that relies heavily on conversational data, recording meetings is really important. Recall makes that process as easy as an API call, standardized across various meeting platforms. It's a huge PITA to set up infrastructure to get bots to join meetings, handle each platform's proclivities, encode and store video data, etc.

The transcription service is just something they do to make transcribing recordings - one of the most common first post-processing steps for any conversational data - easier and lower friction.

davidgu

Amanda says thank you so much!

I actually agree that it’s become incredibly easy to transcribe conversations using open-source models, and that’s not where Recall adds the most value. The hard part is building the infrastructure that allows you to get real-time access to the raw audio, video, and transcript data directly from the meeting platforms. We abstract all of that away and provide you with a clean interface to access that data. Once you get the data, you could use any of the models that you mentioned to do your own transcription, or transcribe using Recall’s transcription models.

bingemaker

Pardon my ignorance, but is recording a call without informing the other participants considered bad practice?

Congrats on the launch! :tada:

davidgu

You're right, and I agree that participants should be aware when they’re being recorded

Because consent laws are complex and vary by region and industry, we leave the consent flow to the developer and we provide the tools and guidance to do it correctly. As with our Meeting Bot API, we also urge teams to follow local laws and make recording clearly visible to users

bingemaker

Thanks for the clarification.

rsingel

It's illegal in some states, legal in others.

Consult Linda Tripp

bingemaker

Wish she was around!

monkeydust

Also wondering this.

Hansenq

Wow, congrats on finally using up your single Launch HN, David and Amanda! :wink:

No but seriously, y'all have built not only an incredible product that I had the chance to demo, but a great company as well, through your previous pivots and cofounder changes. You've been building the schlep tools that product companies _definitely_ don't want to build themselves, since years before it was clear there was a market here, and you do it well.

There's definitely demand for a native screen recorder, and I think it's the right move to be agnostic to privacy (the lower down the stack you go, the more permissive you should be about use-cases). Imagine how much competition in file storage there would have been had there been an API provider for Dropbox's Finder sync technology (though you could argue it just incentivizes large companies like HubSpot to build their own screen recording feature into their platform, rather than enabling new startups like Gong, but I digress).

Y'all deserve the success that you have, and wishing you all the best of luck with the new product launch!

davidgu

Thanks! Really appreciate the kind words

wferrell

Out of interest, what is the thinking behind sending a physical mailer to what feels like a large fraction of San Francisco?

Both why send it and why send it with very little info included on the page?

davidgu

Did you get one? :) This was a part of our Series B raise to help get our name out

iddan

Congrats on the launch! I'm working on a new tool for startup sales (https://closer.so) and in many customer interviews the point of not wanting the bot in the meeting kept coming up. I love how Recall keeps bringing frontier tech as APIs

berz01

70 cents per hour is a mountain of fees... basically $1 per meeting. Sheesh.

nduncan_hmc

It is a lot, but processing real-time video and audio streams inherently consumes a lot of CPU. So they may not be making as much profit on that price as you'd think.

I run an open source alternative to Recall (for meeting bots), and our costs are about 8 cents per hour.

davidgu

$0.70/hr is our starter rate for low-volume testing. In production, developers will see higher usage and choose to commit to volume and longer-term usage. Because of this, we've seen most teams don’t pay the starter price once they scale beyond early pilots

galaxy_gas

Is this only for active speech, or do you pay for every second of silence too?

davidgu

Usage includes silent time too as we are still processing the media streams