Kokoro WebGPU: Real-time text-to-speech 100% locally in the browser
21 comments
· February 7, 2025 · xenova
amelius
Now that's what I call "server-less" computing!
deivid
Amazing! I'm interested in models running locally, and Kokoro seems great. Are you aware of similar models for speech-to-text?
xenova
We have released a bunch of speech recognition demos (using Whisper, Moonshine, and others). For example:
- https://huggingface.co/spaces/Xenova/whisper-web
- https://huggingface.co/spaces/Xenova/whisper-webgpu
- https://huggingface.co/spaces/Xenova/realtime-whisper-webgpu
- https://huggingface.co/spaces/webml-community/moonshine-web
Ono-Sendai
whisper
reach-vb
Brilliant job! Love how fast it is. I'm sure that if the rapid pace of speech ML continues, we'll have speech-to-speech models running directly in our browser!
waynenilsen
Incredible work! I have listened to several TTS systems, and to have this be free and completely under the customer's control is absolutely incredible. This will unlock new use cases.
I made https://app.readaloudto.me/ as a hobby thing, and now it could be enhanced with a local TTS option!
djeastm
FYI, I tried this on my Galaxy S21 with both Brave and Chrome and just got screeching noises in the audio.
mewse-hn
The mere idea of voice software's error mode being uncontrollable screeching is the most hilarious thing to me.
bentt
Sounded perfect for me. Brave/Win11/3090
Asmod4n
Sounds horrible in Chrome with an AMD GPU. Why is that?
SubiculumCode
Kokoro gives pretty good voices and is quite light, making it useful despite its lack of voice cloning capability. However, I haven't figured out how to run it as a TTS server without homebrewing the server, which maybe is easy? IDK.
fallinditch
Brave browser on a Samsung Galaxy S22 Ultra gives horrible screeching noises.
magicalhippo
Firefox on a Samsung S21 worked fine, albeit slowly: around 20-25s for the demo text.
Quality sounded good compared to a lot of other small TTS models I've tried.
Guillaume86
Same with Chrome on a Zenfone 8.
shaneofalltrad
Same on an Intel Mac in Chrome.
rado
Crashes the iPad Safari tab
zamadatix
Mobile Safari (includes iPad) does not like to dish out large amounts of memory.
dindresto
Same on macOS Safari (Sequoia, Safari 18.3, M3 Pro, 18 GB RAM)
oliwary
Worked on my Pixel 6a, albeit quite slowly (~30s for 4s audio). Still really impressed.
null
It took some time, but we finally got Kokoro TTS (v1.0) running in-browser w/ WebGPU acceleration! This enables real-time text-to-speech without the need for a server. Looking forward to your feedback!
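Given the mixed reports above across browsers and GPUs, a page embedding something like this might want to check for WebGPU before relying on it. Below is a minimal sketch of that check, not the demo's actual code: `pickDevice`, `NavigatorLike`, and `GpuLike` are hypothetical names, and the "wasm" fallback is an assumption about what the surrounding app would support. It only detects whether an adapter is available; it cannot detect a backend that runs but produces garbled audio.

```typescript
// Minimal structural types so the sketch compiles without WebGPU type packages.
// These are illustrative stand-ins, not the real DOM typings.
interface GpuLike {
  requestAdapter(): Promise<unknown | null>;
}

interface NavigatorLike {
  gpu?: GpuLike;
}

type Device = "webgpu" | "wasm";

// Hypothetical helper: prefer WebGPU only when the browser exposes
// navigator.gpu AND actually hands back an adapter; otherwise fall back.
async function pickDevice(nav: NavigatorLike): Promise<Device> {
  if (!nav.gpu) {
    return "wasm"; // WebGPU is not exposed at all
  }
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "webgpu" : "wasm"; // a null adapter means no usable GPU
  } catch {
    return "wasm"; // treat any failure as "no WebGPU"
  }
}
```

In a page, one would pass the real `navigator` object (cast to `NavigatorLike` if the TypeScript DOM typings lack `gpu`) and hand the result to whatever model-loading call the app uses.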