
Java FFM zero-copy transport using io_uring

jeffreygoesto

27us roundtrip is not really state of the art for zero copy IPC, about 1us would be. What is causing this overhead?

jstimpfle

Asking for those who, like me, haven't yet taken the time to find technical information on that webpage:

What exactly does that roundtrip latency number measure (especially your 1us)? Does zero copy imply mapping pages between processes? Is there an async kernel component involved (like I would infer from "io_uring") or just two user space processes mapping pages?

rohanray

It's not exactly local IPC. The roundtrip benchmark figure is for a TCP server-client ping/pong call with a 2 KB payload, although the TCP connection runs over local loopback (127.0.0.1).

Source: https://github.com/mvp-express/myra-transport/blob/main/benc...
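For readers wondering what a number like this measures, here is a minimal sketch of a loopback TCP ping/pong with a 2 KB payload. It uses plain blocking `java.net` sockets, so it is neither zero-copy nor io_uring based, and it is not the benchmark from the linked repository; the class and method names are made up for illustration. It only shows the shape of the measurement: send a payload, wait for the echo, divide the elapsed time by the number of rounds.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class LoopbackPingPong {
    static final int PAYLOAD_BYTES = 2048; // 2 KB, matching the payload size quoted above

    /** Runs `rounds` ping/pong exchanges over loopback TCP and returns the mean round trip in nanoseconds. */
    static long meanRoundTripNanos(int rounds) throws Exception {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) { // port 0 = any free port
            Thread echo = new Thread(() -> {
                try (Socket s = server.accept()) {
                    s.setTcpNoDelay(true); // do not let Nagle's algorithm delay small writes
                    DataInputStream in = new DataInputStream(s.getInputStream());
                    OutputStream out = s.getOutputStream();
                    byte[] buf = new byte[PAYLOAD_BYTES];
                    for (int i = 0; i < rounds; i++) {
                        in.readFully(buf); // wait for the full ping
                        out.write(buf);    // echo it back as the pong
                        out.flush();
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            echo.start();

            try (Socket client = new Socket(loopback, server.getLocalPort())) {
                client.setTcpNoDelay(true);
                DataInputStream in = new DataInputStream(client.getInputStream());
                OutputStream out = client.getOutputStream();
                byte[] ping = new byte[PAYLOAD_BYTES];
                byte[] pong = new byte[PAYLOAD_BYTES];
                long start = System.nanoTime();
                for (int i = 0; i < rounds; i++) {
                    out.write(ping);
                    out.flush();
                    in.readFully(pong); // one round trip is complete once the echo is back
                }
                long elapsed = System.nanoTime() - start;
                echo.join();
                return elapsed / rounds;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        meanRoundTripNanos(10_000); // warm-up so the JIT compiles the hot loop
        System.out.println("mean RTT (ns): " + meanRoundTripNanos(100_000));
    }
}
```

A mean hides tail latency, so a serious benchmark would record a histogram of per-round times rather than a single average.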

znpy

It may or may not be good, depending on a number of factors.

I did read the original Linux zerocopy papers from Google, for example, and at the time (when using TCP) the juice was worth the squeeze when the payload was larger than 10 kilobytes (or 20? I don't remember right now and I'm on mobile).

Also a common technique is batching, so you amortise the round-trip time (this used to be the cost of sendmmsg/recvmmsg) over, say, 10 payloads.
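As a rough illustration of the amortisation argument, the per-message cost can be modelled as a fixed per-batch overhead divided by the batch size, plus a per-payload cost. The sketch below uses made-up numbers (20 us fixed, 1 us per payload) that are not measurements from this thread or from any benchmark; the class and method names are hypothetical.

```java
public class BatchAmortization {
    /**
     * Simple cost model: every batch pays `fixedNanos` once (round trip, syscall, wakeup),
     * and every payload in it pays `perPayloadNanos` on top.
     */
    static double perMessageNanos(double fixedNanos, double perPayloadNanos, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        return fixedNanos / batchSize + perPayloadNanos;
    }

    public static void main(String[] args) {
        double fixed = 20_000.0;     // assumed fixed overhead per batch, in ns (illustrative only)
        double perPayload = 1_000.0; // assumed cost per payload, in ns (illustrative only)
        System.out.println("batch of 1:  " + perMessageNanos(fixed, perPayload, 1) + " ns/msg");
        System.out.println("batch of 10: " + perMessageNanos(fixed, perPayload, 10) + " ns/msg");
    }
}
```

Under these assumed numbers a batch of 10 brings the per-message cost from 21 us down to 3 us, which is why a single round-trip figure says little without knowing whether batching was in play.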

So yeah that number alone can mean a lot or it can mean very little.

In my experience, people doing low-latency work have already built their own thing around MSG_ZEROCOPY, io_uring and so on :)

blibble

indeed, you can get a packet from one box to another in 1-2us

rohanray

It's not exactly local IPC. The roundtrip benchmark figure is for a TCP server-client ping/pong call with a 2 KB payload, although the TCP connection runs over local loopback (127.0.0.1).

On the server, the payload is encoded with myra-codec directly into an FFM MemorySegment, which is a pre-registered buffer referenced by the io_uring SQE. Similarly, on the client side, by the time the CQE arrives the encoded payload has been written directly into a client-provided MemorySegment. The whole process saves a few syscalls and is zero-copy.

Source: https://github.com/mvp-express/myra-transport/blob/main/benc...
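To make "encoding directly into a MemorySegment" concrete, here is a minimal sketch using only the standard `java.lang.foreign` API (final as of Java 22). It is not myra-codec and does not touch io_uring: the wire layout (two longs and an int at fixed offsets) and all names are assumptions made for illustration. The point is that fields are written straight into off-heap memory, with no intermediate `byte[]` or heap object to copy from later.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

public class SegmentEncodeSketch {
    // Hypothetical fixed layout: [0..8) sequence, [8..16) timestamp, [16..20) value.
    static final long SEQ_OFFSET = 0;
    static final long TS_OFFSET = 8;
    static final long VALUE_OFFSET = 16;
    static final long ENCODED_BYTES = 20;

    /** Writes the fields straight into `dst` and returns the number of bytes encoded. */
    static long encode(MemorySegment dst, long sequence, long timestampNanos, int value) {
        dst.set(ValueLayout.JAVA_LONG, SEQ_OFFSET, sequence);
        dst.set(ValueLayout.JAVA_LONG, TS_OFFSET, timestampNanos);
        dst.set(ValueLayout.JAVA_INT, VALUE_OFFSET, value);
        return ENCODED_BYTES;
    }

    /** Reads a field in place; nothing is copied out of the segment beyond the primitive itself. */
    static long readSequence(MemorySegment src) {
        return src.get(ValueLayout.JAVA_LONG, SEQ_OFFSET);
    }

    static int readValue(MemorySegment src) {
        return src.get(ValueLayout.JAVA_INT, VALUE_OFFSET);
    }

    public static void main(String[] args) {
        // A confined arena frees the off-heap memory when the try block exits.
        try (Arena arena = Arena.ofConfined()) {
            // 2 KB buffer, 8-byte aligned so the aligned long accesses above are valid.
            MemorySegment buffer = arena.allocate(2048, 8);
            long written = encode(buffer, 7L, System.nanoTime(), 42);
            System.out.println("bytes written: " + written);
            System.out.println("sequence: " + readSequence(buffer) + ", value: " + readValue(buffer));
        }
    }
}
```

In a real io_uring transport the same segment would presumably be registered with the ring once and reused across submissions; that part depends on the native bindings and is not shown here.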

P.S.: I had posted this as a reply to jeffrey but I am not able to see it. Hence, I am reposting it as a direct reply to the main post for visibility as well.

Disclaimer: I am the author of https://mvp.express. I would love feedback and critical suggestions/advice.

Thanks -RR