Show HN: Decentralized robots (and things) orchestration system
17 comments · January 14, 2025
wngr
Great idea combining batman with libp2p! You guys have the heart in the right place :-).
Currently, your project seems to be an opinionated wrapper on top of libp2p. For this to become a proper distributed toolkit, you lack an abstraction for apps to collaborate over shared state (incl. convergence after partition). Come up with a good abstraction for that, and make it work p2p (e.g. delta-state-based CRDTs, or op-based CRDTs based on a replicated log; event sourcing ..). Tangentially related, a consensus abstraction might also be handy for some applications.
Also check out [iroh](https://github.com/n0-computer/iroh) as a potentially awesome replacement for libp2p, as well as [Actyx](https://github.com/Actyx/Actyx) as inspiration from a similar (sadly failed) project using rust-libp2p.
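For readers unfamiliar with the idea, here is a minimal sketch of a state-based CRDT (a grow-only counter) whose replicas converge after a partition by exchanging deltas. It is plain Python and not tied to the project's SDKs; `GCounter` and its methods are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class GCounter:
    """Grow-only counter: one slot per node, merged by element-wise max."""
    node_id: str
    counts: Dict[str, int] = field(default_factory=dict)

    def increment(self, amount: int = 1) -> Dict[str, int]:
        """Bump this node's own slot and return the delta to ship to peers."""
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount
        return {self.node_id: self.counts[self.node_id]}

    def merge(self, other: Dict[str, int]) -> None:
        """Join a full state or a delta; order and duplicates do not matter."""
        for node, count in other.items():
            self.counts[node] = max(self.counts.get(node, 0), count)

    @property
    def value(self) -> int:
        return sum(self.counts.values())


# Two robots count events while partitioned ...
a, b = GCounter("robot-a"), GCounter("robot-b")
delta_a = a.increment(3)
b.increment(2)
delta_b = b.increment(1)

# ... and converge once the deltas are exchanged, in any order.
a.merge(delta_b)
b.merge(delta_a)
a.merge(delta_b)  # a redelivered delta is harmless (merge is idempotent)
```

Because `merge` is commutative, associative, and idempotent, the deltas could travel over an unreliable gossip channel without any coordination between the replicas.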
Oh, and you might want to give your docs a grammar review.
Kudos for showing!
matthewfcarlson
It's not clear what the hardware requirements for a system that can run this would be. Raspberry Pi is mentioned but it seems like an actual OS (not ESP32 for example) is a requirement.
NotAnOtter
Very fun. Is this primarily a passion project or are you hoping to get corporate sponsorship & adoption?
Can you provide some insight as to why this would be preferred over an orchestration server? In this context - Would a 'mothership'/Wheel-and-spoke drone responsible for controlling the rest of the hive be considered an orchestration server?
This isn't my area of expertise but I think "Hive mind drones" tickles every engineer.
lmeierhoefer
> Is this primarily a passion project or are you hoping to get corporate sponsorship & adoption?
We are in the current YC W25 batch and our vision is to build a developer framework for autonomous robotics systems from the system we already have.
> Can you provide some insight as to why this would be preferred over an orchestration server?
It heavily depends on your application: there are applications where it makes sense and others where it doesn’t. The main advantages are that you don’t need an internet connection, the system is more resilient against network outages, and most importantly, the resources on the robots, which are idle otherwise, are used. I think for hobbyists, the main upside is that it’s quick to set up: you only have to turn on the machines and it should work without having to care about networking or setting up a cloud connection.
> Would a 'mothership'/Wheel-and-spoke drone responsible for controlling the rest of the hive be considered an orchestration server?
If the mothership is static, in the sense that it doesn’t change over time, we would consider it an orchestration server. Our core services don’t need that and we envision that most of the decentralized algorithms running on our system also don’t rely on such central point of failure. However, there are some applications where it makes sense to have a “temporary mothership”. We are just currently working on a “group” abstraction, which continuously runs a leader election to determine a “mothership” among the group (which is fault-tolerant however, as the leader can fail anytime and the system will instantly determine another one).
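As a rough illustration of the "temporary mothership" idea, the sketch below elects a leader with a deterministic rule (lowest peer ID among the peers currently believed alive) and re-elects when the leader drops out. The thread does not say which election algorithm the project uses, so the rule and the name `elect_leader` are assumptions made for this example.

```python
from typing import Iterable, Optional


def elect_leader(alive_peers: Iterable[str]) -> Optional[str]:
    """Pick the lowest peer ID (an assumed rule, not the project's).

    Every node that applies this rule to the same membership view
    arrives at the same leader without exchanging extra messages.
    """
    return min(alive_peers, default=None)


group = {"peer-c", "peer-a", "peer-b"}
leader = elect_leader(group)

# The leader fails: the remaining peers drop it from their view and re-elect.
group.discard(leader)
new_leader = elect_leader(group)
```

The hard part in practice is keeping the membership views consistent, which needs failure detection and some handling of peers that temporarily disagree about who is alive; this sketch leaves that out.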
NotAnOtter
> The main advantages are that you don’t need an internet connection
To that end, I'm not clear on the benefit of this model. To solve that problem I would just take a centralized framework and stick it inside an oversized drone/vehicle capable of carrying the added weight (in CPU, battery, etc.). There are several centralized models that don't require an external data connection.
> the resources on the robots, which are idle otherwise, are used
But what's the benefit of this? I don't see the use case of needing the swarm to perform lots of calculations beyond the ones required for its own navigation & communication with others. I suppose I could imagine a chain of these 'idle' drones acting as a communication relay between two separate, active hives. But the benefit there seems marginal.
> our system also don’t rely on such central point of failure
This seems like the primary upside, and it's a big one. I'm imagining a disaster or military situation where natural or human forces could be trying to disable the hive. Now instead of knocking out a single mothership ATV, each and every drone needs to be removed to fully disable it. Big advantage.
> We are just currently working on a “group” abstraction
Makes sense to me. That's the 'value add', might as well really spec that out.
> leader election to determine a “mothership” among the group
This seems perfectly reasonable to me and doesn't remove the advantages of the disconnected "hive". But I do find it funny that the solution to decentralization seems to be simply having the centralization move around easily / flexibly. It's not a hive of peers, it's a hive of temporary kings.
lmeierhoefer
Thanks for the feedback!
> I would just take a centralized framework and stick it inside an oversized drone/vehicle capable of carrying the added weight
Makes sense. I think there are scenarios where such “base stations” are a priori available and “shielded,” so in this case, it might make more sense to just go with a centralized system. This could also be built on top of our system, though.
> But what’s the benefit of this?
I agree that, in many cases, the return on saving costs might be marginal. However, say you have a cluster of drones equipped with computing hardware capable enough to run all algorithms themselves—why spin up a cloud instance for running a centralized version of that algorithm? It is more of an engineering-ideological point, though ;)
> But I do find it funny that the solution to decentralization seems to be simply having the centralization move around easily / flexibly. It’s not a hive of peers, it’s a hive of temporary kings.
Most of our applications will not need this group leader. For example, the pubsub system does not work by aggregating and dispatching the messages at a central point (like MQTT) but employs a gossip mechanism (https://docs.libp2p.io/concepts/pubsub/overview/).
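To make the contrast with a central broker concrete, here is a deliberately simplified flooding sketch: each peer forwards a message to its direct neighbors once, and duplicates are dropped. The real libp2p gossipsub protocol is considerably more elaborate; the topology and the function name `gossip` are made up for this example.

```python
from collections import deque
from typing import Dict, List, Set


def gossip(neighbors: Dict[str, List[str]], origin: str) -> Set[str]:
    """Return the set of peers a message reaches when every peer
    forwards it to its direct neighbors exactly once."""
    seen = {origin}
    queue = deque([origin])
    while queue:
        peer = queue.popleft()
        for nxt in neighbors[peer]:
            if nxt not in seen:  # drop messages this peer already handled
                seen.add(nxt)
                queue.append(nxt)
    return seen


# A line-shaped mesh: "a" can only talk to "b", "b" to "a" and "c", and so on.
mesh = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
reached = gossip(mesh, "a")
```

No peer plays a special role here: the message still reaches everyone connected to the origin even though "a" and "d" never talk to each other directly.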
What I meant is that, in some situations, it might be more efficient (and it’s easier to reason about) to elect a leader. For example, say you have an algorithm that needs to do a matching between neighboring nodes —i.e., each node has some data point, and the algorithm wants to compute a pairwise similarity metric and share all computed metrics back to all nodes. You could do some kind of “ring-structure” algorithm, where you have an ordering among the nodes, and each node receives data points from the predecessor, computes its own similarity against the incoming data point, and forwards the received data point to its successor. If one node fails, the neighboring nodes in the ring will switch to the successor. This would be truly decentralized, and there is no single point of failure. However, in most cases, this approach will have a higher computation latency than just electing a temporary leader (by letting the leader compute the matchings and send them back to everyone). So someone caring about efficiency (and not resiliency) will probably want such a leader mechanism.
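The trade-off described above can be simulated in a few lines. This is only a sketch: `similarity` is a placeholder metric, nodes are modeled as list indices, and failures and the final sharing of results back to all nodes are left out.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Pair = Tuple[int, int]


def similarity(x: float, y: float) -> float:
    """Placeholder symmetric metric (hypothetical): negative distance."""
    return -abs(x - y)


def ring_matching(points: List[float]) -> Dict[Pair, float]:
    """Decentralized variant: data points travel around the ring."""
    n = len(points)
    # in_flight[i] is the (origin node, data point) currently held by node i.
    in_flight = list(enumerate(points))
    results: Dict[Pair, float] = {}
    for _ in range(n - 1):  # n - 1 sequential hops
        # Every node forwards what it holds to its successor.
        in_flight = [in_flight[(i - 1) % n] for i in range(n)]
        for node, (origin, point) in enumerate(in_flight):
            pair = (min(node, origin), max(node, origin))
            results[pair] = similarity(points[node], point)
    return results


def leader_matching(points: List[float]) -> Dict[Pair, float]:
    """Leader variant: one node gathers all points and computes every pair."""
    return {
        (i, j): similarity(points[i], points[j])
        for i, j in combinations(range(len(points)), 2)
    }


points = [1.0, 4.0, 6.0, 10.0]
ring = ring_matching(points)
central = leader_matching(points)
```

Both variants produce the same pairwise results. In this model the ring needs n - 1 sequential hops before any node has seen every data point, whereas the leader variant needs one gather step and one broadcast, which is where the latency difference comes from.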
jfantl
This is awesome stuff, I'm going to look into getting this running on my Pis this weekend. How hard would it be to add in custom services? I like to play with decentralized algorithms such as Size Estimation and Clock Synchronization (https://jasonfantl.com/) and have always wanted to get them running on real hardware.
hannesfur
Awesome! From what I see, the clock synchronization can be implemented with our SDKs (mainly pub-sub).
I think the size estimation could also be implemented within the provided abstractions (mainly request-response) but might require you to keep track of neighbors. I think you could implement both algorithms by using our SDKs (none for Go yet).
If you need more control or performance, beyond what we expose through our SDKs, you might need to write a custom libp2p behavior and add it to our daemon. The libp2p part is fairly involved, but I would love to help you with that. Either way I would love to help you out :)
I'm so disappointed that I've never seen your blog before. The stuff you write about is so interesting and actually addresses some issues we are facing. I just sent you an email :)
rgbrgb
this is so cool, congrats on launching. what kind of biz model are you guys going after?
lmeierhoefer
We will most likely go with an open-core model. The main part will stay open source (the Core OS extension is under GPL3, and everything SDK-related is MIT).
For paid features, we have several ideas: a hosted management plane to configure and control the swarm (with company RBAC integration) when one of the nodes is connected to the internet; advanced security (currently no access management or authentication is happening); sophisticated orchestration primitives; and LoRa connectivity (to scale the mesh radius to miles).
Appreciate your feedback on this!
iugtmkbdfil834
I am a neophyte in this realm, but I like what I am seeing so far. Since I want to get into robotics for my own fun, I will be looking at it more closely this weekend :D
hannesfur
Please do and let us know if you have any questions!
diggan
> We’d love to hear your thoughts! :)
Have you ever played any of the Horizon (Zero Dawn/Forbidden West) games? :)
Jokes aside, it looks pretty cool. What kind of hardware have you tested it with so far? Is this using WiFi only?
hannesfur
Actually, just very briefly at a friend’s ;)
Thank you! So far, we have tested it with Raspberry Pi 4/5. Jetson boards are on backorder. We have some Intel WiFi chips (since they support some stuff we want), and we will get around to trying them next.
The binaries were also tested on x86 machinery.
In general, I'm not too worried about hardware support since batman-adv is quite widely deployed on a diverse set of hardware and the rest is hardware agnostic.
matthewfcarlson
I've been thinking about building a little tiny SLAM robot to have something to drive around the house when I'm out of town (I don't want always on cameras everywhere but having a camera that can move around sounds useful). The ideas here are awesome and I'm looking forward to the tutorials being more fleshed out.
lmeierhoefer
Yeah, SLAM also seems like a natural showcase for us. I am currently working on a decentralised collaborative SLAM package on top of our system, where multiple robots can drive around and continuously merge their maps without a coordination server, using the Mesh integration and PubSub system. Should be out in about a week.
matthewfcarlson
Now that sounds absolutely fascinating. I'll look forward to that
Hi HN, we built an open-source operating system extension for orchestrating fully decentralized robot swarms.
This first beta version allows you to create fully decentralized robot swarms. The system will set up a wireless mesh network and run a p2p networking stack on top of it, such that nodes can interact with each other through various abstractions using our SDKs (Rust, Python, TypeScript) or a CLI.
We hope this is a step toward better inter-robot communication (and a fun project if you have a few Raspberry Pis lying around).
Our mesh network is created by B.A.T.M.A.N.-adv and we’ve combined this with optimized decentralized algorithms. To a user, it becomes very easy to write decentralized applications involving several peers since we’ve abstracted away much of the complexity. Our system currently offers several orchestration primitives (Key-Value Store, Pub-Sub, Discovery, Request-Response, Mesh Inspection, Debug Services, etc.)
Internally, everything except the SDKs is written in Rust, building on top of libp2p. We use gRPC to communicate between the SDKs and the CLI, so libraries for other languages are possible, and we welcome contributions (or feedback).
A C++ SDK and a ROS package that should feel natural to roboticists are in the works. Soon we also want to support collaborative SLAM and a distributed task queue.
We’d love to hear your thoughts! :)