Orpheus-3B – Emotive TTS by Canopy Labs

Metricon

A GGUF version created by "isaiahbjork", compatible with LM Studio and the llama.cpp server, is available at: https://github.com/isaiahbjork/orpheus-tts-local/

To run llama.cpp server: llama-server -m C:\orpheus-3b-0.1-ft-q4_k_m.gguf -c 8192 -ngl 28 --host 0.0.0.0 --port 1234 --cache-type-k q8_0 --cache-type-v q8_0 -fa --mlock
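Once the server above is running, you can talk to it over HTTP. A minimal sketch, assuming llama-server's OpenAI-compatible /v1/completions endpoint on port 1234; the exact Orpheus prompt and voice format is defined by the orpheus-tts-local repo, so the "tara:" prefix below is a placeholder, not the confirmed format:

```python
# Minimal sketch: send a prompt to a local llama-server instance.
# Assumes the server is listening on localhost:1234 and exposes
# llama.cpp's OpenAI-compatible /v1/completions endpoint.
import json
import urllib.request

def build_request(prompt: str, max_tokens: int = 1024) -> dict:
    """Build the JSON payload for a completion request."""
    return {
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.6,
        "stream": False,
    }

def complete(prompt: str, base_url: str = "http://localhost:1234") -> str:
    """POST the prompt and return the generated text."""
    payload = json.dumps(build_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-style responses put the generated text under choices[0].
    return body["choices"][0]["text"]

if __name__ == "__main__":
    # Placeholder prompt; see orpheus-tts-local for the real voice format.
    print(complete("tara: Hello from Orpheus!"))
```

For real-time use you'd set "stream": True and decode the server-sent events chunk by chunk instead of waiting for the full response.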

Zetaphor

I've been testing this out; it's quite good and especially fast. Crazy that this is working so well at Q4.

NetOpWibby

Having a NetNavi is gonna be possible at some point. This is nuts.

hadlock

I'm looking forward to having an end-to-end "docker compose up" solution for self hosted chatgpt conversational voice mode. This is probably possible today, with enough glue code, but I haven't seen a neatly wrapped solution yet on par with ollama's.

nickthegreek

You can glue it together with Home Assistant right now, but it's not a simple docker compose. Piper TTS and Kokoro are the two main voice engines people are using.

Orpheus would be great to get wired up. I'm wondering how well their smallest model will run and whether it will be fast enough for realtime.

Zetaphor

With some tweaking I was able to get the current 3B's "realtime" streaming demo running at BF16 on my 12GB 4070 Super, with about a second of latency.

tough

Harbor is a great Docker bedrock for LLM tools. It has some TTS stuff, though I haven't tried it: https://github.com/av/harbor/wiki/1.1-Harbor-App#installatio...

deet

Impressive for a small model.

Two questions / thoughts:

1. I stumbled for a while looking for the license on your website before finding the Apache 2.0 mark on the Hugging Face model. That's big! Advertising that on your website and the GitHub repo would be nice. Though what's the business model?

2. Given the Llama 3 backbone, what's the lift to make this runnable in other languages and inference frameworks? (Specifically asking about MLX, but also llama.cpp, Ollama, etc.)

mmoskal

I wonder how it can be Apache if it's based on Llama?

nico

> even on an A100 40GB for the 3 billion parameter model

Would any of the models run on something like a raspberry pi?

How about a smartphone?

Zetaphor

They're going to be releasing a few more smaller models, as small as 150M.

That said, if you want something to use today on a Pi, you should check out Kokoro.

hadlock

What kind of binary do you run Kokoro with for audio output?

Zetaphor

You can run it with Python, or in the browser via WASM.
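For the Python route, a rough sketch of synthesizing with Kokoro and writing the result to a WAV file. The kokoro_onnx package name, the model/voice filenames, and the Kokoro.create() signature are assumptions based on the community ONNX port; check that package's README for the current API. The WAV-writing helper uses only the standard library:

```python
# Sketch: synthesize speech with Kokoro (assumed API) and save it as WAV.
import struct
import wave

def write_wav(path: str, samples, sample_rate: int) -> None:
    """Write mono float samples in [-1, 1] as 16-bit PCM WAV."""
    pcm = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))
        for s in samples
    )
    with wave.open(path, "wb") as f:
        f.setnchannels(1)        # mono
        f.setsampwidth(2)        # 16-bit samples
        f.setframerate(sample_rate)
        f.writeframes(pcm)

if __name__ == "__main__":
    # Assumed package and call signature; verify against the port's docs.
    from kokoro_onnx import Kokoro
    kokoro = Kokoro("kokoro-v0_19.onnx", "voices.json")
    samples, sample_rate = kokoro.create("Hello world", voice="af", speed=1.0)
    write_wav("out.wav", samples, sample_rate)
```

The WASM route runs the same ONNX model through onnxruntime-web, so no local binary is needed at all.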

michaelgiba

Nice, I’m particularly excited for the tiny models.