Hyperpb: Faster dynamic Protobuf parsing

jsnell

See also the discussion on the technical description last week: https://news.ycombinator.com/item?id=44591605

(IMO that is a much more interesting article than this announcement, and it probably should have gotten more attention than it did.)

dang

Thanks! That one was recent enough that I think we can re-up it. I'll put a link to this thread in there, so people can read both.

ManBeardPc

Interesting approach using a JIT compiler. The article says compilation is slow. Is there a way to persist the compiled code and load it later (for example, for CLIs or faster redeployments)?

paulddraper

It's called AoT....

ManBeardPc

The key feature seems to be staying dynamic while still being fast. Sure, they could also build it as a compiler that does everything mentioned in the article and then dumps optimized Go code, maybe even using Go's PGO instead of their own. But this is a different approach. What I mean is caching the JIT-generated code to avoid repeating the expensive part, while still being dynamic and adapting to incoming messages.
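For what it's worth, the in-process half of that caching idea can be sketched in a few lines of Go. This is not Hyperpb's API: `compiledParser`, `parserCache`, and the `compile` callback are hypothetical stand-ins for whatever the expensive compile step produces. Whether the compiled form could also be written to disk and reloaded is not something this thread establishes.

```go
package main

import (
	"crypto/sha256"
	"sync"
)

// compiledParser is a hypothetical stand-in for the result of an
// expensive compile step. It is not a Hyperpb type.
type compiledParser struct {
	schemaLen int
}

// parserCache memoizes compile results, keyed by the SHA-256 of the
// serialized schema bytes.
type parserCache struct {
	mu      sync.Mutex
	entries map[[32]byte]*compiledParser
	compile func(schema []byte) *compiledParser // assumed expensive
}

func newParserCache(compile func([]byte) *compiledParser) *parserCache {
	return &parserCache{
		entries: make(map[[32]byte]*compiledParser),
		compile: compile,
	}
}

// get returns the cached parser for schema, compiling it on first use.
// The lock is held during compilation, which keeps the sketch simple but
// serializes concurrent compiles.
func (c *parserCache) get(schema []byte) *compiledParser {
	key := sha256.Sum256(schema)
	c.mu.Lock()
	defer c.mu.Unlock()
	if p, ok := c.entries[key]; ok {
		return p
	}
	p := c.compile(schema)
	c.entries[key] = p
	return p
}
```

Calling `get` twice with the same schema bytes runs `compile` once and returns the same pointer both times. Schemas can still arrive at runtime, so the dynamic part is preserved.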

jayd16

No, I think they want Profile-Guided Optimization. I think the C# AoT mode uses the results of a JIT first run.

the_duke

The delta to the performance of C++/Rust Protobuf implementations would be interesting.

nateb2022

Even before Hyperpb, Go was already very competitive, e.g. this article from last year: https://www.greptime.com/blogs/2024-04-09-rust-protobuf-perf...

jeffbee

My experience is that the practical performance achievable with Go is higher, because the C++ lifetime issues are too difficult to reason about and the developer is therefore forced to copy for safety. In Go you can fairly easily alias everything from the physical buffer into your parsed object. In the official C++ library, protobuf refuses to acknowledge even the possibility of aliasing. Even if you say that your string types are "view", there is an owned buffer inside the generated class into which your data is copied. This is exasperating because inside Google they have several different ways to avoid copying a string into a protobuf, and they're all patched out of the open source edition. You can read them and cry about it by looking through their git logs for "internal change" commits with baffling whitespace-only changes, which are symptomatic of where they are patching out the good stuff.
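To make the aliasing point concrete, here is a minimal Go sketch of reading one length-delimited payload (a varint length followed by that many bytes) without copying. The function name `readBytesField` is made up for illustration and this is not how any particular protobuf library is implemented. It only shows that the returned slice can share memory with the input buffer.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// readBytesField reads a varint length prefix from the front of buf and
// returns the payload that follows it, plus the remaining bytes.
// The payload is a subslice of buf, so no bytes are copied. The caller
// must keep buf alive and unmodified for as long as payload is in use.
func readBytesField(buf []byte) (payload, rest []byte, err error) {
	n, w := binary.Uvarint(buf)
	if w <= 0 {
		return nil, nil, errors.New("bad length varint")
	}
	if n > uint64(len(buf)-w) {
		return nil, nil, errors.New("truncated payload")
	}
	end := w + int(n)
	// The three-index slice caps the capacity so that an append on
	// payload cannot overwrite the bytes that follow it in buf.
	return buf[w:end:end], buf[end:], nil
}
```

Because the payload aliases the input, a later write to the input buffer is visible through the payload. That is the lifetime hazard the garbage collector makes tolerable in Go and that forces defensive copies elsewhere.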

reactordev

Oh, it’s worse: it’s a full-on marshal of the whole data. What we need is a no-allocation protobuf that binds to existing memory, knows about aliases, and can deal with a pointer. I love protobuf, but I’ve moved to other messaging implementations that provide a faster marshal/unmarshal. Maybe I’ll give this a try.

haberman

I think you can alias the input data using Cord fields? As long as the input is Cord.

mwigdahl

Really missed a great naming opportunity with "superpb" (pronounced as "superb").

cryptonector

Why not coin hyperb as the hyper equivalent of super's superb?

JoshTriplett

I'd expect the current name to be pronounced like the first part of "hyperbole", which doesn't have nearly the same positive connotations, yeah.
