Send Data with Sound

20 comments

·March 3, 2025

ASalazarMX

This post was probably inspired by all the fuss gibberlink made last week. It uses ggwave, another data-over-audio protocol.

https://github.com/PennyroyalTea/gibberlink

karmakaze

I don't feel great about gibberlink. LLMs have gotten AIs to interact the way humans do, and the same goes for multimodal models. gibberlink could evolve into a highly efficient form of machine communication that leaves humans out of the loop, for better or worse. We (or it) could make it even more efficient by applying AI.

littlekey

I had no idea this was real! I saw the video earlier and thought it was just faked for social media.

tdeck

This is a cool concept but it actually seems slower than if they'd just continued to speak words.

thamer

It's probably not slower than words: English is spoken at only about 150-200 words per minute.

That said, the "gibberlink" demo is definitely much slower than even a 28.8k modem (that's kilobit). It sounds cool because we can't understand it and it seems kinda fast, but this is a terribly inefficient way for machines to communicate. It's hard to say how fast they're exchanging data from just listening, but it can't be much more than ~100 bits/sec if I had to guess.

Even in the audible range you could absolutely go hundreds of times faster, but it's much easier to train an LLM that has some audio input capabilities if you keep this low rate and likely very distinct symbols, rather than implementing a proper modem.
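To put rough numbers on the comparison, here is a back-of-the-envelope sketch. The 150-200 words per minute, the ~100 bits/sec guess, and the 28.8 kbit/s modem come from the comment. The six characters per word (five letters plus a space) and the eight bits per character are assumptions added for illustration, and they ignore compression entirely.

```javascript
// ASSUMPTIONS (not from the thread): ~5 letters plus a space per English word,
// and 8 bits per character with no compression.
const CHARS_PER_WORD = 6;
const BITS_PER_CHAR = 8;

// Convert a speaking rate into an equivalent plain-text bit rate.
function speechBitsPerSecond(wordsPerMinute) {
  return (wordsPerMinute * CHARS_PER_WORD * BITS_PER_CHAR) / 60;
}

const speechLow = speechBitsPerSecond(150);
const speechHigh = speechBitsPerSecond(200);

// Figures quoted in the comment above.
const GIBBERLINK_GUESS_BPS = 100; // the commenter's rough guess
const MODEM_BPS = 28800;          // 28.8 kbit/s modem

const modemAdvantage = MODEM_BPS / GIBBERLINK_GUESS_BPS;

console.log(`speech as text: ${speechLow}-${speechHigh} bits/s`);
console.log(`28.8k modem vs. guessed rate: ${modemAdvantage}x`);
```

Under these assumptions, speech transcribed as plain text lands in the same range as the guessed rate, while the modem is a few hundred times faster, which fits the "hundreds of times faster" estimate.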

But why even have to use a modem though? Limiting communication to audio-only is a severe restriction. When AIs are going to "call" other AIs, they will use APIs… not ancient phone lines.

ASalazarMX

Text is incredibly efficient and compressible. Combine it with some of the other projects mentioned here, and it would be like:

- Shall we switch to audio data for more efficient communication?

- Yes. [MODEM NOISES START]

matja

There's also http://www.whence.com/minimodem/ which implements some standard methods:

> standard FSK protocols such as Bell103, Bell202, RTTY, TTY/TDD, NOAA SAME, and Caller-ID
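For readers unfamiliar with FSK (frequency-shift keying), here is a minimal sketch of the idea: each bit is sent as a short burst of one of two tones. The tone frequencies and 300 baud rate match the originate side of Bell 103, but this is an illustration only. It is not minimodem's code, it omits the start/stop framing a real modem adds, and the function names are hypothetical.

```javascript
// Illustrative values: Bell 103 originate-side tones at 300 baud.
const SAMPLE_RATE = 48000; // samples per second (assumed output rate)
const BAUD = 300;          // symbols (here: bits) per second
const MARK_HZ = 1270;      // tone for a binary 1
const SPACE_HZ = 1070;     // tone for a binary 0

// Hypothetical helper: turn an array of 0/1 bits into audio samples.
// The phase carries over between bits so the waveform has no jumps.
function fskModulate(bits) {
  const samplesPerBit = Math.round(SAMPLE_RATE / BAUD);
  const out = new Float32Array(bits.length * samplesPerBit);
  let phase = 0;
  bits.forEach((bit, i) => {
    const freq = bit ? MARK_HZ : SPACE_HZ;
    const step = (2 * Math.PI * freq) / SAMPLE_RATE;
    for (let n = 0; n < samplesPerBit; n++) {
      out[i * samplesPerBit + n] = Math.sin(phase);
      phase += step;
    }
  });
  return out;
}

// Hypothetical helper: split a byte into 8 bits, least significant bit first.
function byteToBits(byte) {
  return Array.from({ length: 8 }, (_, k) => (byte >> k) & 1);
}

const bits = byteToBits("H".charCodeAt(0));
const samples = fskModulate(bits);
console.log(`${bits.length} bits -> ${samples.length} samples`);
```

The resulting `Float32Array` could be copied into a Web Audio `AudioBuffer` for playback. A receiver would do the reverse and measure which of the two tones dominates in each bit period.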

deathanatos

I've never gotten minimodem to actually work.

E.g.,

  printf 'Hello, world\n' | minimodem --tx 440
  minimodem --rx 440
(you can choose any freq.) results in a lot of,

  ### CARRIER 440 @ 800.0 Hz ###
  �
  ### NOCARRIER ndata=1 confidence=1.507 ampl=0.060 bps=439.96 (0.0% slow) ###
  ### CARRIER 440 @ 800.0 Hz ###
  �
  ### NOCARRIER ndata=1 confidence=1.858 ampl=0.053 bps=439.96 (0.0% slow) ###
  ### CARRIER 440 @ 800.0 Hz ###
  �
  ### NOCARRIER ndata=1 confidence=1.832 ampl=0.063 bps=439.96 (0.0% slow) ###
and even when it does hit,

  ### CARRIER 440 @ 800.0 Hz ###
  Helln, world�
  ### NOCARRIER ndata=14 confidence=2.939 ampl=1.167 bps=438.67 (0.3% slow) ###
If I try something like the example where he cats a man page:

  ### CARRIER 1200 @ 1200.0 Hz ###
  ��-O���܇����������������������=����~`���|�����������������������������_��������=����??�����?�����oﯰ������������������|���������������������߿��������������������������������������~�����`�|�w������������-Ӱ��>��み����>�����
… I'm in a quiet room.

tanepiper

12 years ago, I worked on this prototype - https://github.com/tanepiper/adOn-soundlib

The original plan was to develop what were essentially "audio QR codes": short codes that could be transmitted, parsed by certain apps, and used to drive different interactions.

vbekkerm

I thought the MODEM days were behind us...

pdh

Cool to see this done with webaudio. Reminded me of https://github.com/ggerganov/ggwave

xnx

How much greater is the capacity over open air vs POTS lines that maxed out at 56K?

karmakaze

Sending ascending/descending ascii punctuation is fun.

knorker

Turning data into audio is a big thing nowadays with amateur radio.

Ironic that the author overlaps so much with that field, without noticing that they chose the same name as probably the most used amateur radio programmer in the world.

If you're interested, the state of the art is VARA. It's closed source though, so NinoTNC may be a more interesting choice.

jedimastert

I'm struggling to find the protocol for VARA, although maybe my Google abilities are just failing me. The protocol, at least, should be openly available according to the FCC.

knorker

It's unclear to me too.

I'm not a lawyer, nor is my ham license even in the US, but perhaps "you can decode it by using our software" satisfies the legal requirements?

It's not, to my knowledge, deliberately obscured. That would be a legal no-no, I think.

But yes, people have fought over VARA's state here.

1970-01-01

What's the baud?

pdh

  const CHARACTER_DURATION = 0.07; // seconds - balanced for accuracy while still fast (up from 0.055s)
  const CHARACTER_GAP = 0.03; // seconds - balanced for accuracy while still fast (up from 0.025s)

10 symbols per second
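The arithmetic behind that figure, as a small sketch using the two quoted constants. The eight bits per symbol is an assumption added here to get a bit rate. The thread does not say how many bits each character carries.

```javascript
// Constants quoted from the project's source in the comment above.
const CHARACTER_DURATION = 0.07; // seconds of tone per character
const CHARACTER_GAP = 0.03;      // seconds of silence between characters

// One symbol occupies its tone plus the gap that follows it.
const secondsPerSymbol = CHARACTER_DURATION + CHARACTER_GAP;
const symbolsPerSecond = 1 / secondsPerSymbol; // this is the baud rate

// ASSUMPTION (not from the thread): each symbol encodes one 8-bit character.
const BITS_PER_SYMBOL = 8;
const bitsPerSecond = symbolsPerSecond * BITS_PER_SYMBOL;

console.log(`${symbolsPerSecond.toFixed(1)} symbols/s`);
console.log(`${bitsPerSecond.toFixed(0)} bits/s under the 8-bit assumption`);
```

Strictly speaking, baud counts symbols per second, so the answer to the question is about 10 baud. The bit rate depends on how much information each symbol carries.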