FoundationDB: From idea to Apple acquisition [video]
35 comments · July 25, 2025
romanhn
Posted about this in the past, but what really got FoundationDB on my radar was a demo at a developer conference, back in 2014-ish. They had the database running across a bunch of machines, with a visual showing their health and data distribution. One team member would then be turning machines on and off (or maybe unplugging them from the network) and you could see FDB effortlessly rebalancing the data across the available nodes. It was a very striking, impressive presentation (especially as we were dealing with the challenges of distributed Cassandra at the time).
The beginning of this video has some of that: https://youtu.be/Nrb3LN7X1Pg
AtlasBarfed
So... Cassandra does that? I get that the FDB demo probably made it look better and easier.
But data doesn't teleport except in demos. Rebalancing means streaming data across a network, consuming total network I/O, regardless of the distributed database.
Did you actually implement FDB, and was it better?
jwr
Comparing Cassandra to FoundationDB is like comparing a spreadsheet in Google Sheets to PostgreSQL.
I mean, both kind of store data, and multiple users can change the data that is being stored. The story of what you'll get back and when (if ever), however, is rather different.
I would respectfully suggest that anyone that wants to comment in distributed database discussions should be familiar with https://jepsen.io/consistency/models and https://antithesis.com/resources/reliability_glossary/ and use the wording found there.
If your eyes glaze over because there is a lot of complex stuff there, it is likely that your comments will not have much value.
rapsey
Cassandra only became reliable many years later. FDB was the gold standard that set the bar. They did not need Jepsen tests to implement it properly.
vlovich123
What a great story and really interesting courage to double-down on improving the testing even when a critical flaw that testing should have found was found. Wish that they had managed long enough for Snowflake to keep them alive, but then we wouldn't have Antithesis as a service so silver lining.
tptacek
By "keep them alive", you mean the team, right? People are definitely still using FDB!
vlovich123
The team pushing forward the vision. Using FDB is a fraction of the vision if you listen to them.
tptacek
Makes sense, thanks! Antithesis is pretty neat, though (we use it for a distributed system thing here).
pjmlp
Nowadays being rewritten into Swift.
"Swift as C++ Successor in FoundationDB" by Konrad Malawski (Strange Loop 2023)
jen20
That was an experiment the team didn’t end up committing to - it’s been backed out. That said, it was a fascinating dive into the flexibility of Swift, and Konrad’s talk is excellent and worth watching.
https://github.com/apple/foundationdb/commit/e52fc3621fd5e41...
pjmlp
Interesting, thanks for sharing, is there a rationale somewhere?
As someone who enjoys using C++ despite all its warts, I can imagine a few reasons, but it would nonetheless be an interesting read, in case that is public.
I guess that experience might also have had an impact on ongoing Swift 6+ features.
TobbenTM
The Ladybird project started a similar journey, and indeed they are mainly waiting on Swift 6+ features as documented in their blockers issue: https://github.com/LadybirdBrowser/ladybird/issues/933
msy
Does anyone know how widely FoundationDB is now being used at Apple? I know they run a huge Cassandra cluster, does this serve a different use case?
ethan_smith
Apple uses FoundationDB extensively for iCloud services including CloudKit, with public documentation confirming it handles billions of operations per second across their infrastructure.
ntqz
My understanding is that CloudKit runs on it.
minitoar
iCloud uses both.
gregoriol
So if it has been acquired by Apple, it's a failure, isn't it? Most things acquired by Apple get unmaintained or change completely, or disappear. Being "open-source" here doesn't bring any guarantees to any third-party user about maintenance or long-term life. It should be a serious no-go indicator for anyone willing to build something with it.
dialup_sounds
It was acquired ten years ago.
Nican
FoundationDB has been growing on me as my favorite database lately, even though it is only a key-value store.
Out of curiosity: what are the scale limits of FoundationDB? What kind of issues would it start to have? For example, being able to store all of Discord messages on it?
I see blog posts of Discord moving to Scylla and ElasticSearch, but I wonder if there would be any difficulties here.
hardwaresofton
Note that FDB can support other paradigms on top of KV
https://foundationdb.github.io/fdb-record-layer/SQL_Referenc...
Also IIRC Apple uses FDB at tremendous scale:
https://read.engineerscodex.com/p/how-apple-built-icloud-to-...
piokoch
I've looked at FoundationDB and on paper it looks great. But it never got momentum like, say, MongoDB. Is this just a matter of hype, or is it not as great as advertised?
jwr
It is difficult to use by itself: the "foundation" in the name describes it quite well. It is a foundation that you build a database on. It fits my use case very well, for example, because I know my data model and usage patterns very well and I can integrate deeply with the database, but it's not a good match by itself for quick-and-dirty apps.
It provides fantastic (strict serializable) consistency guarantees in a distributed database, which is extremely rare. It is a huge advantage, but sadly most people do not understand how badly most distributed databases are broken and don't even understand the concepts (https://antithesis.com/resources/reliability_glossary/) well enough to talk about the issues involved. See every discussion where someone mentions ACID.
It's hard to compete for mindshare when the concepts are difficult and every other database has a warm-and-fuzzy-feeling website saying that everything will be great (it usually won't).
Personally, I hope more people will start using it, and I hope to see more easy-to-use databases built on top of it (that's what it was designed for, really). In my experience with it, working with a fast distributed database that gives you strict serializable semantics right in your code is fantastic.
chrischen
I think it wasn't as easy to use or get started with. There was a MongoDB compatibility layer but it wasn't maintained.
qcnguy
Many reasons.
FoundationDB started development in the same year MongoDB launched but took nearly four years to reach the market. It's the rarely discussed dark side of great testing - you can end up with robust code nobody cares about because it arrives years after people decided they wanted it. Everyone went with what existed and learned to deal with its quirks. In this case they got lucky I guess that Apple saw the potential for iCloud and bought them out, but the people who had bet on FDB before then kinda lost. You really don't want your database to be bought and made fully private tech. MongoDB was open source at the start and went closed later but never disappeared, so whilst the license switch pissed people off it didn't fundamentally wreck MongoDB as a viable tech.
Database tech has a chicken and egg problem. Most people don't want to run their own infrastructure anymore. No clouds offer hosted FoundationDB, so people don't want to use it for that reason, which means there's no demand, so clouds don't offer it, ad infinitum. MongoDB was released around the start of the cloud era, just three years after AWS first launched, so that was less of an issue. Back then "cloud" just meant VMs and storage. And later Mongo built their own cloud offering.
FoundationDB does full strict serializability checks, which is expensive. One trick it uses to get acceptable performance is by imposing a difficult programming model on the user. Keys and values must be small. Think individual fields of a JSON object, not objects themselves. Transactions also have very small limits in lifespan and size. You can't open a transaction and run a computation against your entire dataset in FoundationDB unless it's tiny. Everything has to complete in five seconds or else your transaction dies.
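To make the "individual fields of a JSON object" point concrete, here is a minimal sketch in plain Python of splitting a nested document into one small key-value pair per leaf field. The tuple keys, the `("user", "42")` prefix and the `flatten`/`unflatten` helpers are assumptions made up for illustration; this is not FoundationDB's API, and a plain dict stands in for the database.

```python
from typing import Any, Dict, Iterator, Tuple

Key = Tuple[str, ...]


def flatten(prefix: Key, value: Any) -> Iterator[Tuple[Key, Any]]:
    """Yield one (key, value) pair per leaf field of a nested dict."""
    if isinstance(value, dict):
        for field, child in value.items():
            yield from flatten(prefix + (field,), child)
    else:
        yield prefix, value


def unflatten(pairs: Dict[Key, Any], prefix: Key) -> Dict[str, Any]:
    """Rebuild the nested dict whose leaves are stored under `prefix`."""
    doc: Dict[str, Any] = {}
    for key, value in pairs.items():
        # Skip keys that are not strictly below the prefix.
        if len(key) <= len(prefix) or key[:len(prefix)] != prefix:
            continue
        path = key[len(prefix):]
        node = doc
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = value
    return doc


# Hypothetical document; a plain dict plays the role of the key-value store.
user = {"name": "Ada", "address": {"city": "London", "zip": "N1"}}
store = dict(flatten(("user", "42"), user))
```

Each stored value is now a single field, so updating the city touches one small key instead of rewriting the whole object. Note that this sketch silently drops empty nested dicts.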
Their website used to claim this timeout isn't even configurable. It's hard to know if that has changed, because the FoundationDB team at Apple don't care about marketing. Probably Apple don't care if anyone else uses it and only made it open source to make the team happy. Even quite average open source projects have better marketing. Their blog consists only of release announcements and the last one was in 2022. A casual visitor who didn't know better would think it had been abandoned years ago.
The scalability story is unclear. It doesn't matter for most people but the biggest FDB clusters are about 100T in size. Apple say they use it for iCloud but really they use a large fleet of FDB clusters with lots of in-house tooling for balancing and moving data between those clusters. Effectively they built another scaling layer on top of core FDB.
Even if you work through all of that, what you get is a key value store. Not really a database, it's more like the bottom layer of a database. That's why it's called FoundationDB. It's not meant to be used directly. There are layers that turn your actual data into key/value pairs in a way that offers features like schema handling, object serialization etc but they are language specific and not so well documented. Most devs on the backend will have ORMs or frameworks they already want to use, and Apple server-side is mostly a Java shop so there's a Java layer, but you can't just point Spring at an FDB cluster and go. For instance, there's no notion of a query, or a query planner or even indexes. You're expected to handle all that stuff using libraries in your app.
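As a rough illustration of "handle indexes in your app", the sketch below keeps a secondary index by hand on top of an ordered key-value store. `OrderedKV`, `save_user`, `users_in_city` and the `("idx", "city", ...)` key layout are all hypothetical names invented for this example; the store is a toy in-memory stand-in, not a FoundationDB client.

```python
import bisect
from typing import Any, Dict, Iterator, List, Optional, Tuple

Key = Tuple[str, ...]


class OrderedKV:
    """Tiny in-memory ordered key-value store (illustration only)."""

    def __init__(self) -> None:
        self._keys: List[Key] = []          # kept sorted
        self._vals: Dict[Key, Any] = {}

    def set(self, key: Key, value: Any = "") -> None:
        if key not in self._vals:
            bisect.insort(self._keys, key)
        self._vals[key] = value

    def clear(self, key: Key) -> None:
        if key in self._vals:
            del self._vals[key]
            del self._keys[bisect.bisect_left(self._keys, key)]

    def get(self, key: Key) -> Optional[Any]:
        return self._vals.get(key)

    def range_prefix(self, prefix: Key) -> Iterator[Tuple[Key, Any]]:
        """Scan all keys that start with `prefix`, in key order."""
        i = bisect.bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i][:len(prefix)] == prefix:
            yield self._keys[i], self._vals[self._keys[i]]
            i += 1


def save_user(kv: OrderedKV, user_id: str, city: str) -> None:
    """Write the record and keep the city index in step with it."""
    old_city = kv.get(("user", user_id))
    if old_city is not None:
        kv.clear(("idx", "city", old_city, user_id))   # drop stale index entry
    kv.set(("user", user_id), city)
    kv.set(("idx", "city", city, user_id))              # index entry, empty value


def users_in_city(kv: OrderedKV, city: str) -> List[str]:
    """The 'query': a range scan over the index prefix."""
    return [key[-1] for key, _ in kv.range_prefix(("idx", "city", city))]


kv = OrderedKV()
save_user(kv, "1", "London")
save_user(kv, "2", "Paris")
save_user(kv, "3", "London")
save_user(kv, "1", "Paris")   # user 1 moves, index must follow
```

In a real database the record write and the index writes would have to happen in the same transaction, otherwise the index can drift from the data. That is the kind of work a layer or library does for you.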
So overall it's a highly solid bit of tech that solved a very small, very specific problem very well but years too late for anyone to care. Except for Apple. Good work, whichever Apple executive sponsored that deal!
iangregson
+1 really enjoyed this
philosopher1234
Does anyone know of cool things built with fdb? I’ve been aware of it for a while and it seems very cool but I haven’t seen a lot of details about how folks are using it.
jwr
I am moving my SaaS from RethinkDB to FoundationDB. It's a long-term project that needs to be done very carefully (thousands of people using the app), but the rewards are significant. Thanks to FoundationDB versionstamps, I'll be able to replace changefeeds with polling, simplifying the system, and also make things much faster along the way.
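For readers wondering how polling can replace changefeeds, here is a minimal sketch of the general idea in plain Python: every change is appended under a strictly increasing version, and each consumer remembers the last version it saw and asks for everything after it. The `ChangeLog` and `Poller` classes are hypothetical, and an in-process counter stands in for versionstamps; none of this is the FoundationDB API or the actual design of the app described above.

```python
import bisect
import itertools
from typing import Any, List, Tuple


class ChangeLog:
    """Append-only log keyed by a strictly increasing version number."""

    def __init__(self) -> None:
        self._counter = itertools.count(1)   # stand-in for a versionstamp
        self._versions: List[int] = []
        self._entries: List[Any] = []

    def append(self, change: Any) -> int:
        version = next(self._counter)
        self._versions.append(version)
        self._entries.append(change)
        return version

    def read_after(self, last_seen: int, limit: int = 100) -> List[Tuple[int, Any]]:
        """Return up to `limit` entries with version > last_seen, in order."""
        start = bisect.bisect_right(self._versions, last_seen)
        end = start + limit
        return list(zip(self._versions[start:end], self._entries[start:end]))


class Poller:
    """Consumer that tracks its own position instead of holding a feed open."""

    def __init__(self, log: ChangeLog) -> None:
        self.log = log
        self.last_seen = 0

    def poll(self) -> List[Any]:
        batch = self.log.read_after(self.last_seen)
        if batch:
            self.last_seen = batch[-1][0]
        return [change for _, change in batch]


log = ChangeLog()
poller = Poller(log)
log.append("created doc 1")
log.append("updated doc 1")
first = poller.poll()    # both changes
second = poller.poll()   # nothing new
log.append("deleted doc 1")
third = poller.poll()    # only the new change
```

Because the consumer's position is just a number, a client that disconnects can resume from where it left off without the server tracking any per-client feed state.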
The consistency guarantees are phenomenal and writing software is much easier when you have strict serializability. Most people do not appreciate this because they do not understand the anomalies that you can get without strict serializable consistency.
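One such anomaly is write skew. The sketch below is a toy optimistic-concurrency model in plain Python, not FoundationDB code: `Store`, `Txn` and the on-call example are hypothetical, and the `check_reads` flag switches between a weaker check (write-write conflicts only, roughly snapshot isolation) and a stricter one that also aborts if anything the transaction read was changed after it started.

```python
from typing import Any, Dict, Set, Tuple


class Store:
    def __init__(self, data: Dict[str, Any]) -> None:
        self.data = dict(data)
        self.version = 0
        self.last_write: Dict[str, int] = {k: 0 for k in data}


class Txn:
    def __init__(self, store: Store) -> None:
        self.store = store
        self.read_version = store.version
        self.snapshot = dict(store.data)   # reads see the state at start
        self.reads: Set[str] = set()
        self.writes: Dict[str, Any] = {}

    def get(self, key: str) -> Any:
        self.reads.add(key)
        return self.writes.get(key, self.snapshot[key])

    def set(self, key: str, value: Any) -> None:
        self.writes[key] = value

    def commit(self, check_reads: bool) -> bool:
        # Keys whose concurrent modification should abort this transaction.
        keys = set(self.writes) | (self.reads if check_reads else set())
        if any(self.store.last_write.get(k, 0) > self.read_version for k in keys):
            return False
        self.store.version += 1
        for k, v in self.writes.items():
            self.store.data[k] = v
            self.store.last_write[k] = self.store.version
        return True


def go_off_call(txn: Txn, me: str) -> None:
    """Rule: a doctor may go off call only if someone else stays on call."""
    on_call = [d for d in ("alice", "bob") if txn.get(d)]
    if len(on_call) >= 2:
        txn.set(me, False)


def run(check_reads: bool) -> Tuple[Tuple[bool, bool], Dict[str, Any]]:
    store = Store({"alice": True, "bob": True})
    t1, t2 = Txn(store), Txn(store)      # both start from the same state
    go_off_call(t1, "alice")
    go_off_call(t2, "bob")
    results = (t1.commit(check_reads), t2.commit(check_reads))
    return results, store.data
```

With `run(False)` both transactions commit and nobody is on call, even though each one checked the rule. With `run(True)` the second transaction aborts because a key it read was overwritten, so the application can retry it against fresh data and the invariant holds.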
mannyv
From what I understand one of the big IP ad tracking services (El Toro) is built on FoundationDB.
bpicolo
Apple uses it for CloudKit. I'd say that's pretty cool. Snowflake uses it for their metadata layer. Datadog uses it for their system called Husky (https://www.datadoghq.com/blog/engineering/introducing-husky...)
pjd7
https://www.youtube.com/watch?v=oYiFTBO67uU
https://innovation.ebayinc.com/stories/graphload-a-framework...
It looks like people are using it to build graph related models on it.
I am looking at it & considering doing something similar for graph data sets. As well as a transactionally safe key value store to store roaring bitmaps.
pstuart
An abandoned project I'd love to see resurrected is SQLite on fdb: https://github.com/losfair/mvsqlite
majestik
I can't put my finger on it but there's a weird tension between the two Daves in this video. Almost like Rosenthal is trying to impress or earn the praise of Scherer.
Is there a backstory between these guys / FDB?
Dave_Rosenthal
Ha, well I met Scherer ~30 years ago in a high school math class and we’ve done three companies together, so you could say we’ve known each other for a bit :)
Nice having this backstory (fantastic production value too, impressive start to this podcast). Disaggregating the responsibilities of the DB into multiple pieces just feels so logical, and helps make sure each piece can scale. Deterministic Simulation Testing gets mentioned in the video and was way ahead of its time here. https://apple.github.io/foundationdb/testing.html
Hacker News is here too! From July 2012 (78 points, 72 comments): https://news.ycombinator.com/item?id=4294719
For a general introduction, I enjoyed the recent submission How FoundationDB works and why it works: https://news.ycombinator.com/item?id=37552085 https://uvdn7.github.io/notes-on-the-foundationdb-paper/
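For anyone unfamiliar with deterministic simulation testing, the core idea fits in a few lines: drive every "random" event (writes, crashes, restarts) from a single seeded generator, check invariants after each step, and rely on the fact that the same seed replays the exact same run. The toy below is only a sketch of that idea in Python; the three-replica model, the `simulate` function and its invariant are invented for illustration and have nothing to do with FoundationDB's actual simulator.

```python
import random
from typing import Dict, List, Tuple


def simulate(seed: int, steps: int = 50) -> Tuple[List[str], Dict[str, int]]:
    """Run a toy replicated counter under seeded fault injection."""
    rng = random.Random(seed)            # the only source of randomness
    replicas = {"a": 0, "b": 0, "c": 0}  # value held by each replica
    up = {name: True for name in replicas}
    committed = 0
    trace: List[str] = []

    for step in range(steps):
        event = rng.choice(["write", "crash", "restart"])
        node = rng.choice(sorted(replicas))

        if event == "crash":
            up[node] = False
        elif event == "restart":
            up[node] = True
            replicas[node] = committed   # restarted node catches up
        else:
            live = [n for n in sorted(replicas) if up[n]]
            if len(live) >= 2:           # a write needs a majority
                committed += 1
                for n in live:
                    replicas[n] = committed

        trace.append(f"{step}:{event}:{node}")

        # Invariant checked after every step: no live replica is stale.
        for n in replicas:
            if up[n]:
                assert replicas[n] == committed, (seed, step, n)

    return trace, replicas


# A failing seed can be replayed as often as needed while debugging.
run_one = simulate(7)
run_two = simulate(7)
```

The payoff is that a bug found after millions of randomized runs is not a flaky one-off: the seed in the assertion message is enough to reproduce it step by step.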