Sync Engines Are the Future

210 comments · March 18, 2025

codeulike

I've been thinking about this a lot - nearly every problem these days is a synchronisation problem. You're regularly downloading something from an API? That's a sync. You've got a distributed database? Sync problem. Cache invalidation? Basically a sync problem. You want online and offline functionality? Sync problem. Collaborative editing? Sync problem.

And 'synchronisation' as a practice gets very little attention or discussion. People just start with naive approaches like 'download what's marked as changed' and then get stuck in the quagmire of known problems and known edge cases (handling deletions, handling transport errors, handling changes that didn't get marked with a timestamp, how to repair after a bad sync, dealing with conflicting updates, etc.).

The one piece of discussion or attempt at a systematic approach to 'synchronisation' I've seen recently is Conflict-free Replicated Data Types https://crdt.tech which essentially restrict your data, and the rules for dealing with conflicts, to situations that are known to be resolvable, and then package it all up into an object.

klabb3

> The one piece of discussion or attempt at a systematic approach to 'synchronisation' I've seen recently is Conflict-free Replicated Data Types https://crdt.tech

I will go against the grain and say CRDTs have been a distraction, and the overfocus on them has been delaying real progress. They are immature and highly complex, and thus hard to debug and understand, and have extremely limited cross-language support in practice - let alone any indexing or storage engine support.

Yes, they are fascinating, and yes, they solve real problems, but they are absolute overkill for your problems (except collab editing), at least currently. Why? Because they are all about conflict resolution. You can get very far without addressing this problem: for instance a cache, like you mentioned, has no need for conflict resolution. The main data store owns the data, and the cache follows. If you can have single ownership (single writer), or last write wins, or similar, you can drop a massive pile of complexity on the floor and not worry about it. (In the rare cases it's necessary, like Google Docs or Figma, I would be very surprised if they use off-the-shelf CRDT libs - I would bet they have extremely bespoke and domain-specific data structures that are inspired by CRDTs.)
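
A minimal sketch of that last-write-wins idea in TypeScript (illustrative names, no particular library): the register needs only a timestamp and a deterministic tie-break, no CRDT machinery.

  // Last-write-wins register: a replica accepts an incoming value only
  // if it is newer; ties are broken deterministically by replica id.
  interface Lww<T> {
    value: T;
    updatedAt: number; // e.g. a server-assigned timestamp
    replicaId: string;
  }

  function applyUpdate<T>(local: Lww<T>, incoming: Lww<T>): Lww<T> {
    if (incoming.updatedAt !== local.updatedAt) {
      return incoming.updatedAt > local.updatedAt ? incoming : local;
    }
    return incoming.replicaId > local.replicaId ? incoming : local;
  }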

Instead, what I believe we need is end-to-end bidirectional stream-based data communication, simple patch/replace data structures to efficiently notify of updates, and standard algorithms and protocols for processing it all. Basically, adding async reactivity on the read path of existing data engines like SQL databases. I believe even this is a massive undertaking, but it is feasible, and it delivers lasting tangible value.
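
The "simple patch/replace data structures" could be as small as a discriminated union of update messages; a sketch with invented shapes:

  // Updates on the read path: replace a row wholesale, patch some
  // fields, or delete. A client folds these over its local store.
  type Update<Row> =
    | { kind: 'replace'; id: string; row: Row }
    | { kind: 'patch'; id: string; fields: Partial<Row> }
    | { kind: 'delete'; id: string };

  function applyToStore<Row>(store: Map<string, Row>, u: Update<Row>): void {
    if (u.kind === 'replace') {
      store.set(u.id, u.row);
    } else if (u.kind === 'patch') {
      const row = store.get(u.id);
      if (row) store.set(u.id, { ...row, ...u.fields });
    } else {
      store.delete(u.id);
    }
  }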

mweidner

Indeed, the simple approach of "send your operations to the server and it will apply them in the order it receives them" gives you good-enough conflict resolution in many cases.

It is still tempting to turn to CRDTs to solve the next problem: how to apply server-side changes to a client when the client has its own pending local operations. But this can be solved in a fully general way using server reconciliation, which doesn't restrict your operations or data structures like a CRDT does. I wrote about it here: https://mattweidner.com/2024/06/04/server-architectures.html...
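
A sketch of the client side of that server-reconciliation idea (illustrative types, not the post's actual code): keep the last confirmed server state plus a queue of pending local ops, and rebuild the view by replaying the queue on top of each new server state.

  type State = Record<string, unknown>;
  interface Op { id: string; apply: (s: State) => State }

  let serverState: State = {};
  let pending: Op[] = [];

  // Called whenever the server sends its latest authoritative state,
  // along with the ids of our ops it has already applied.
  function onServerUpdate(authoritative: State, confirmed: Set<string>): void {
    serverState = authoritative;
    pending = pending.filter(op => !confirmed.has(op.id)); // drop acked ops
    // Restore + replay: pending local ops re-applied on the new base.
    render(pending.reduce((s, op) => op.apply(s), serverState));
  }

  function render(view: State): void { /* update the UI from the view */ }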

klabb3

Just got to reading this.

> how to apply server-side changes to a client when the client has its own pending local operations

I liked the option of restore and replay on top of the updated server state. I'm wondering when this causes perf issues? Local changes should propagate fast after e.g. a network partition, even if the person has queued up a lot of them (say, during a flight).

Anyway, my thinking is that you can avoid many consensus problems by just partitioning data ownership. The like example is interesting in this way. A like count is an aggregate based on multiple data owners, and everyone else just passively follows with read replication. So thinking in terms of shared write access is the wrong problem description, imo, when in reality "liked posts" is data exclusively owned by all the different nodes doing the liking (subject to a limit of one like per post). A server aggregate could exist, but it is owned by the server, so no shared write access is needed.
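
A sketch of that partitioning (hypothetical shapes): each user exclusively owns the set of posts they liked, and the count is a read-side aggregate over the replicated per-user sets, so no shared write access exists anywhere.

  // userId -> set of postIds that user has liked (owned by that user).
  type LikesByUser = Map<string, Set<string>>;

  function likeCount(replicas: LikesByUser, postId: string): number {
    let n = 0;
    for (const liked of replicas.values()) {
      if (liked.has(postId)) n++; // at most one like per user per post
    }
    return n;
  }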

Similarly, say you have a messaging service. Each participant owns their own messages and others follow. No conflict resolution is needed. However, you can still break the protocol (say, liking twice). Those updates can be considered malformed and e.g. ignored. In some cases, you can copy someone else's data and make it your own, for instance to protect against impersonation: say that you can change your own nickname, and others follow. This can be exploited to impersonate, but you can keep a local copy of the last seen nickname and then display a "changed name" warning.

Anyway, I’m just a layman who wants things to be simple. It feels like CRDTs have been the ultimate nerd-snipe, and when I did my own evaluations I was disappointed with how heavyweight and opaque they were a few years ago (and probably still).

ochiba

> Yes, they are fascinating, and yes, they solve real problems, but they are absolute overkill for your problems (except collab editing), at least currently. Why? Because they are all about conflict resolution. You can get very far without addressing this problem: for instance a cache, like you mentioned, has no need for conflict resolution. The main data store owns the data, and the cache follows. If you can have single ownership (single writer), or last write wins, or similar, you can drop a massive pile of complexity on the floor and not worry about it. (In the rare cases it's necessary, like Google Docs or Figma, I would be very surprised if they use off-the-shelf CRDT libs - I would bet they have extremely bespoke and domain-specific data structures that are inspired by CRDTs.)

I agree with this. CRDTs are cool tech, but I think in practice most folks would be surprised by the high percentage of use cases that can be solved with much simpler conflict resolution mechanisms (perhaps combined with server reconciliation, as Matt mentioned). I also agree that collaborative document editing is a niche where CRDTs are indeed very useful.

satvikpendem

You might not need a CRDT [0]. But also, CRDTs are the future [1].

[0] https://news.ycombinator.com/item?id=33865672

[1] https://news.ycombinator.com/item?id=24617542

halfcat

> What I believe we need is end-to-end bidirectional stream-based data communication

I suspect the generalized solution is much harder to achieve, and looks more like batch-based reconciliation of full snapshots than streaming or event-driven.

The challenge comes if you aim to sync data sources where the parties managing each data source are not incentivized to provide robust sync. Consider Dropbox or similar, where a single party manages the data set and all software (server and clients); or ecosystems like Salesforce and Mulesoft, which have this as a stated business goal; or blockchains, where independent parties are still highly incentivized to coordinate and have technically robust mechanisms to accomplish it, like Merkle trees and similar. You can achieve sync in those scenarios because independent parties are incentivized to coordinate (or there is only one party).

But if you have two or more independent systems, all of which provide some kind of API or import/export mechanism, you can never guarantee those systems will stay in sync using a streaming or event-driven approach. Worse, those systems will inevitably drift out of sync, or, worse still, will propagate incorrect data across multiple systems, which can then only be reconciled by batch-like point-in-time snapshots - which begs the question of why use streaming if you ultimately need batch to make it work reliably.

Put another way, people say batch is a special case of streaming, so just use streaming. But you could also say streaming is a fragile form of sync, so just use sync. But sync is a special case of batch, so just use batch.

9rx

> In the rare cases it's necessary, like Google Docs or Figma, I would be very surprised if they use off-the-shelf CRDT libs

Or CRDTs at all. Google Docs is based on operational transforms and Figma on what they call multiplayer technology.

josephg

I agree! Lots more things are sync problems. Also: the state of my source files -> my compiler (in watch mode), and about 20 different APIs in the kernel - from keyboard state to filesystem watching to process monitoring to connected USB devices.

Also, HTTP caching is sort of a special case of sync - the cache (say, nginx) is trying to keep a synchronised copy of a resource from the backend web server. But because there's no way for the web server to notify nginx that the resource has changed, you get both stale reads and unnecessary polling. Doing fan-out would be way more efficient than a keep-alive header, if we had a way to do it!
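
A toy sketch of what that fan-out path could look like on the origin side (invented interfaces; nothing nginx actually supports today):

  // Instead of caches re-polling the origin every TTL window, the
  // origin pushes one invalidation per change to every subscriber.
  const subscribers = new Set<(path: string) => void>();

  function subscribe(onInvalidate: (path: string) => void): void {
    subscribers.add(onInvalidate);
  }

  function resourceChanged(path: string): void {
    for (const notify of subscribers) notify(path); // fan-out
  }

  // A cache subscribes once, then serves from memory until told:
  subscribe(path => { /* evict `path` and refetch lazily */ });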

CRDTs are cool tech. (I would know - I've been playing with them for years.) But I think it's worth dividing data interfaces into two types: owned data and shared data. Owned data has a single owner (eg the database, the kernel, the web server), and other devices live downstream of that owner. Shared data sources are more complex systems - eg everyone in the network has a copy of the data and can make changes, and then it's all eventually consistent. Or raft / paxos. Think git, or a distributed database. And they can be combined - eg an app server is downstream of a distributed database, and GitHub Actions is downstream of a git repo.

I’ve been meaning to write a blog post about this for years. Once you realise how ubiquitous this problem is, you see it absolutely everywhere.

miki123211

And then there's the third super-special category of shared data with no central server, and where only certain users should be allowed to perform certain operations. This comes up most often in p2p networks, censorship resistance etc.

In most cases, the easiest approach there is just "slap a blockchain on it", as a good and modern (think Ethereum, not Bitcoin) blockchain essentially "abstracts away" the decentralization and mostly acts like a centralized computer to higher layers.

That is certainly not the only viable approach, and I wish we looked at others more. For example, a decentralized DNS-like system, without an attached cryptocurrency but with global consensus on what a given name points to, would be extremely useful. I'm not convinced that such a thing is possible - you need some way of preventing one bad actor from grabbing all the names, and monetary cost seems like the easiest one - but we should be looking in this direction a lot more.

josephg

> And then there's the third super-special category of shared data with no central server, and where only certain users should be allowed to perform certain operations. This comes up most often in p2p networks, censorship resistance etc.

In my mind, this is just the second category again. It’s just a shared data system, except with data validation & Byzantine fault tolerance requirements.

It’s a surprisingly common and thorny problem. For example, I could change my local git client to generate invalid / wrong hashes for my commits. When I push my changes, other peers should - in some way - reject them. PVH (of Ink&Switch) has a rule when thinking about systems like this. He says you’re free to deface your own copy of the US constitution. But I don’t have to pull your changes.

Access control makes the BFT problem much worse. The classic problem is that if two admins concurrently remove each other, it's not clear what happens. In a CRDT (or git), peers are free to backdate their changes to any arbitrary point in the past. If you try to implement user roles on top of a CRDT, it's a nightmare. I think CRDTs are just the wrong tool for thinking about access control.

jkaptur

I can't wait to read that blog post. I know you're an expert in this and respect your views.

One thing I think is missing in the discussion about shared data (and maybe you can correct me) is that there are two ways of looking at the problem:

* The "math/engineering" way, where once state is identical you are done!

* The "product manager" way, where you have reasonable-sounding requests like "I was typing in the middle of a paragraph, then someone deleted that paragraph, and my text was gone! It should be its own new paragraph in the same place."

Literally having identical state (or even identical state that adheres to a schema) is hard enough, but I'm not aware of techniques to ensure 1) identical state, 2) adherence to a schema, and 3) logic that anyone on the team can easily modify in response to "PM-like" demands without being a sync expert.

ochiba

> And 'synchronisation' as a practice gets very little attention or discussion. People just start with naive approaches like 'download what's marked as changed' and then get stuck in the quagmire of known problems and known edge cases (handling deletions, handling transport errors, handling changes that didn't get marked with a timestamp, how to repair after a bad sync, dealing with conflicting updates, etc.).

I've spent 16 years working on a sync engine and have worked with hundreds of enterprises on sync use cases during this time. I've seen countless cases of developers underestimating the complexity of sync. In most cases it happens exactly as you said: start with a naive approach and then the fractal complexity spiral starts. Even if the team is able to do the initial implementation, maintaining it usually turns into a burden that they eventually find too big to bear.

danielvaughn

CRDTs work well for linear data structures, but there are known issues with hierarchical ones. For instance, if you have a tree, two clients could concurrently send transactions that cause a node to become an ancestor of itself.

That said, there’s work that has been done towards fixing some of those issues.

Evan Wallace (I think he’s the CTO of Figma) has written about a few solutions he tried for Figma’s collaborative features. And then Martin Kleppmann has a paper proposing a solution:

https://martin.kleppmann.com/papers/move-op.pdf
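
The crux of the tree problem is easy to state; here is a simplified sketch of the safety check a move operation has to pass (the paper's actual algorithm additionally totally orders moves and undoes/redoes them so that all replicas converge):

  // parent.get(child) = parent node id; roots have no entry.
  type Tree = Map<string, string>;

  function isAncestor(tree: Tree, maybeAncestor: string, node: string): boolean {
    for (let cur = tree.get(node); cur !== undefined; cur = tree.get(cur)) {
      if (cur === maybeAncestor) return true;
    }
    return false;
  }

  function applyMove(tree: Tree, node: string, newParent: string): void {
    // Refuse moves that would make a node its own ancestor (a cycle).
    if (node === newParent || isAncestor(tree, node, newParent)) return;
    tree.set(node, newParent);
  }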

rapnie

Martin Kleppmann, in one of his recent talks about the future of local-first, mentions the need for a generic sync service for the 'local-first end-game' [0], as he calls it. Standardization is needed. Right now everyone and their mother is doing sync differently and building production platforms around their own protocols and mechanisms.

[0] https://www.youtube.com/watch?v=NMq0vncHJvU&t=1016s

tmpfs

The problem is that the requirements can be vastly different. A collaborative editor is very different from, say, syncing encrypted blobs. Perhaps there is a one-size-fits-all solution, but I doubt it.

I've been working on sync for the latter use case for a while and CRDTs would definitely be overkill.

layer8

Automatic conflict resolution will always be limited. For example, who seriously believes that we'll ever be able to fully automate the handling of merge conflicts in version control? (Even if we recorded every single edit operation at the syntax-tree level.) And in regular documents the situation is worse, because you don't have formal parsers and type checkers and unit tests for them. Even for schematized structured data, there are similar issues on the semantic level that a mere "it conforms to the schema" doesn't solve.

lifty

Indeed. So conflict resolution that takes input from the user needs to be part of the protocol. Just like in Git.

jdvh

As long as all clients agree on the order of CRDT operations, cycles are no problem. It's just an invalid transaction that can be dropped. Invalid or contradictory updates can always happen (regardless of sync mechanism), and the resolution is a UX issue. In some cases you might want to inform the user, in other cases the user can choose how to resolve the conflict, and in other cases quiet failure is fine.

jakelazaroff

Unfortunately, a hard constraint of (state-based) CRDTs is that merging causally concurrent changes must be commutative. I.e. it is possible that clients will not be able to agree on the order of CRDT operations, and they must be able to arrive at the same state after applying them in any order.

mrkeen

I've looked at CRDTs, and the concept really appeals to me in the general case, but in the specific cases, my design always ends up being "keep-all-the-facts" about a particular item. But then you defer the problem of 'which facts can I throw away?'. It's like inventing a domain-specific GC.

I'd love to hear about any success cases people have had with CRDTs.

FjordWarden

There was an article on this website not so long ago about using CRDTs for collaborative editing, and there was this silly example to show how leaky this abstraction can be. What if you have the word "color", and one user replaces it with "colour" and another deletes the word - what does the CRDT do in this case? Well, it merges these two edits into "u". This sort of makes me skeptical of using CRDTs for user-facing applications.

jakelazaroff

There isn’t a monolithic “CRDT” in the way you’re describing. CRDTs are, broadly, a kind of data structure that allows clients to eventually agree on a final state without coordination. An integer `max` function is a simple example of a CRDT.
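
A sketch of that example: the whole CRDT is one number plus a merge function, and because `max` is commutative, associative, and idempotent, replicas converge no matter how updates are ordered or duplicated.

  // Max-register: local writes raise the value; merge takes the max.
  function merge(a: number, b: number): number {
    return Math.max(a, b);
  }

  // Delivery order and duplication don't matter:
  //   merge(merge(3, 7), 3) === merge(3, merge(3, 7)) === 7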

The behavior the article found is peculiar to the particular CRDT algorithms they looked at. But they’re probably right that it’s impossible for all conflicting edits to “just work” (in general, not just with CRDTs). That doesn’t mean CRDTs are pointless; you could imagine an algorithm that attempts to detect such semantic conflicts so the application can present some sort of resolution UI.

Here’s the article, if interested (it’s very good): https://www.moment.dev/blog/lies-i-was-told-pt-1

jdvh

It's still early, but we have a checkpointing system that works very well for us. And once you have checkpoints, you can start dropping inconsequential transactions in between checkpoints, which, you're right, can be considered GC. However, checkpointing is desirable anyway - otherwise new users have to replay the transaction log from T=0 when they join, and that's impractical.
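
A sketch of that checkpoint shape (invented names): a snapshot plus the log position it covers, so new clients bootstrap from the snapshot and the log prefix becomes droppable.

  interface LogEntry<Tx> { seq: number; tx: Tx }
  interface Checkpoint<S> { seq: number; state: S }

  // Snapshot the materialized state at `seq`, then GC the log prefix.
  function compact<S, Tx>(
    log: LogEntry<Tx>[],
    stateAtSeq: S,
    seq: number,
  ): { checkpoint: Checkpoint<S>; log: LogEntry<Tx>[] } {
    return {
      checkpoint: { seq, state: stateAtSeq },
      log: log.filter(e => e.seq > seq),
    };
  }

  // A joining client starts from checkpoint.state and replays only
  // entries with seq > checkpoint.seq, instead of the log from T=0.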

dtkav

I've also had success with this method. "domain-specific GC" is a fitting term.

yccs27

For me the main issue with CRDTs is that they have a fixed merge algorithm baked in - if you want to change how conflicts get resolved, you have to change the whole data structure.

WorldMaker

I feel like the state of the art here is slowly starting to change. I think for too many years CRDTs got too caught up in "conflict-free" as a "manifest destiny" sort of thing rather than a "hope and a prayer", and thought they'd keep finding the right fixed merge algorithm for every situation. I started watching CRDTs from the perspective of source control, with a strong inkling that "data is always messy" and "conflicts are human" (conflicts are kind of inevitable in any structure trying to encode data made by people).

I've been thinking for a bit that it is probably about time the industry renamed that first C to something other than "conflict-free". There is no freedom from conflicts. There's conflict resistance, sure, and CRDTs can provide a lot of conflict resistance in their various data structures. But at the end of the day, if the data structure is meant to encode an application for humans, it needs every merge tool and review tool and audit tool it can offer to deal with those conflicts.

I think we're finally starting to see some light at the end of the tunnel in the major CRDT efforts, and we're finally leaving the detour of "no, it must be conflict-free, we named it that so it must be true". I don't think any one library is delivering it at a good high level yet, but I have a feeling that "one of the next libraries" is maybe going to start getting the ergonomics of conflict handling right.

dtkav

I've been running into this with automated regex edits. Our product (Relay [0]) makes Obsidian real-time collaborative using yjs, but I've been fighting with the automated process that rewrites markdown links within notes.

The issue happens when a file is renamed by one client, and then all other clients pick up the rename and make the change to the local files on disk. Since every edit is broken down into delete/keep/insert runs, the automated process runs rapidly in all clients and can break the links.

I could limit the edits to just one client, but it feels clunky. Another thought I've had is to use ytext annotations, or just also store a ymap of the link metadata and only apply updates if they can meet some kind of check (kind of like schema validation for objects).

If anyone has a good mental model for modeling automated operations (especially find/replace) in ytext please let me know! (email in bio).

[0] https://system3.md/relay

jbmsf

Absolutely. My current product relies heavily on a handful of partner systems, adds an opinionated layer on top of these systems, and propagates data to CRM, DW, and other analytical systems.

One early insight was that we needed a representation of partner data in our database (and the downstream systems need a representation of our opinionated view as well). This is clearly an (eventually consistent) synchronization problem.

We also realized that we often fail to sync (due to bugs, timing, or whatever) and need a regular process to resync data.

We've ended up with a homegrown framework that does both things, such that the same business logic gets used in both cases. This also makes it easy to backfill data if a chosen representation changes.

We're now on the third or fourth iteration of this system and I'm pretty happy with it.

delusional

Once you add a periodic resync you have moved the true synchronization away from the online "(eventually consistent) synchronization" and into the batch resync. At that point the online synchronization is just a performance optimization on top of the batch resync.

I've been in that situation a lot, and I'd always carefully consider if you even need the online synchronization at that point. It's pretty rarely required.

jbmsf

In our case it absolutely is. There are user facing flows that require data from partner systems to complete. Waiting for the next sync cycle isn't a good UX.

pwdisswordfishz

> Cache Invalidation? Basically a sync problem.

Do naming things and off-by-one errors also count?

cyanydeez

foo(){ //does bar }

mattnewport

UI is also a sync problem if you squint a bit. React-like systems are an attempt to be a sync engine between model and view, in a sense.

Multiplayer games too.

mackopes

I'm not convinced that there is one generalised solution to sync engines. To make them truly performant at large scale, engineers need to have a deep understanding of the underlying technology, their query performance, database, and networking, and build a custom sync engine around their product and their data.

Abstracting all of this complexity away in one general tool/library and pretending that it will always work is snake oil. There are no shortcuts to building truly high quality product at a large scale.

wim

We've built a sync engine from scratch. Our app is a multiplayer "IDE" but for tasks/notes [1], so it's important to have a fast local-first/offline experience like other editors, and have changes sync in the background.

I definitely believe sync engines are the future as they make it so much easier to enable things like no-spinners browsing your data, optimistic rendering, offline use, real-time collaboration and so on.

I'm also not entirely convinced yet, though, that it's possible to get away with something that's not custom-built, or at least large parts of it. There were so many micro-decisions and trade-offs going into the engine: What is the granularity of updates (characters, rows?) that we need, and how does that affect performance? Do we need a central server for things like permissions and real-time collaboration? If so, do we want just deltas, or also state snapshots for speedup? How much versioning do we need, and what are the implications of that? Is there end-to-end encryption, and how does that affect what the server can do? What kind of data structure is being synced - a simple list/map, or a graph with potential cycles? What kind of conflict resolution business logic do we need, and where does that live?

It would be cool to have something general purpose so you don’t need to build any of this, but I wonder how much time it will save in practice. Maybe the answer really is to have all kinds of different sync engines to pick from and then you can decide whether it's worth the trade-off not having everything custom-built.

[1] https://thymer.com

mentalgear

Optimally, a sync engine could be configured with the best settings for the project (e.g. central server or completely decentralised). It'd be great if one engine were that performant/configurable, but having a lot of sync engines to choose from for your project is the best alternative.

btw: excellent questions to ask / insights - about the same ones I came across in my lo-fi ventures.

Would be great if someone could assemble all these questions in a step-by-step "walkthrough" interface where, at the end, the user gets a list of the best-matching engines.

Edit: Mh ... maybe something small enough to vibe code ... if someone is interested in helping, let me know!

jdvh

Completely decentralized is cool, but I think there are two key problems with it.

1) in a decentralized system who is responsible for backups? What happens when you restore from a backup?

2) in a decentralized system who sends push notifications and syncs with mobile devices?

I think that in an age of $5/mo cloud VMs and free SSL, having a single coordination server has all the advantages and none of the downsides.

tonsky

- You can have many sync engines

- Sync engines might only solve small and medium scale, and that would be a huge win even without large scale

thr0w

> Abstracting all of this complexity away in one general tool/library and pretending that it will always work is snake oil.

Remember Meteor?

xg15

That might be true, but you might not have those engineers or they might be busy with higher-priority tasks:

> It’s also ill-advised to try to solve data sync while also working on a product. These problems require patience, thoroughness, and extensive testing. They can’t be rushed. And you already have a problem on your hands you don’t know how to solve: your product. Try solving both, fail at both.

Also, you might not have that "large scale" yet.

(I get that you could also make the opposite case, that the individual requirements for your product are so special that you cannot factor out any common behavior. I'd see that as a hypothesis to be tested.)

tbrownaw

> decoupled from the horrors of an unreliable network

The first rule of network transparency is: the network is not transparent.

> Or: I’ve yet to see a code base that has maintained a separate in-memory index for data they are querying

Is boost::multi_index_container no longer a thing?

Also there's SQLite with the :memory: database.

And this ancient 4GL we use at work has in-memory tables (as in database tables, with typed columns and any number of indexes, unique or not) as a basic language feature.

anonyfox

In Elixir/Erlang that's quite common I think; at least I do this when performance matters. Put the specific subset of commonly used data into an ETS table (an in-memory cache allowing concurrent reads) and have a GenServer (which owns that table) listen to certain database change events to update the data in the table as needed.

Helps a lot with high read situations and takes considerable load off the database with probably 1 hour of coding effort if you know what you're doing.

TeMPOraL

> Is boost::multi_index_container no longer a thing?

Depends on the shop. I haven't seen one in production so far, but I don't doubt some people use it.

> Also there's SQLite with the :memory: database.

Ah, now that's cheating. I know, because I did that too. I did it because of the realization that half the members I'm stuffing into classes to store my game state are effectively a poor man's hand-rolled tables, indices and spatial indices, so why not just use a proper database for this?
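
A sketch of that move, assuming the better-sqlite3 Node package (the table layout is made up):

  import Database from 'better-sqlite3';

  // Game state as real tables with real indexes, instead of
  // hand-rolled maps-of-maps buried inside classes.
  const db = new Database(':memory:');
  db.exec(`
    CREATE TABLE units (id INTEGER PRIMARY KEY, kind TEXT, x REAL, y REAL);
    CREATE INDEX units_pos ON units (x, y);
  `);

  db.prepare('INSERT INTO units (kind, x, y) VALUES (?, ?, ?)')
    .run('goblin', 10.0, 4.2);

  // A crude spatial query the index can serve.
  const nearby = db
    .prepare('SELECT * FROM units WHERE x BETWEEN ? AND ? AND y BETWEEN ? AND ?')
    .all(0, 20, 0, 20);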

> And this ancient 4GL we use at work has in-memory tables (as in database tables, with typed columns and any number of indexes, unique or not) as a basic language feature.

Which one is this? I've argued in the past that this is a basic feature missing from 4GL languages, and a lot of work in every project is wasted on hand-rolling in-memory databases left and right, without realizing it. It would seem I've missed a language that recognized this fact?

(But then, so did most of the industry.)

tbrownaw

> Which one is this? I've argued in the past that this is a basic feature missing from 4GL languages, and a lot of work in every project is wasted on hand-rolling in-memory databases left and right, without realizing it. It would seem I've missed a language that recognized this fact?

https://en.wikipedia.org/wiki/OpenEdge_Advanced_Business_Lan...

Dates back to 1981, called "Progress 4GL" until 2006.

https://docs.progress.com/bundle/abl-reference/page/DEFINE-T...

phyrex

ABAP, the SAP language, has that, if I remember correctly.

aiono

> The first rule of network transparency is: the network is not transparent.

That's precisely why the current request model is painful.

ximm

> I have a theory that every major technology shift happened when one part of the stack collapsed with another.

If that were true, we would ultimately end up with a single layer. Instead, I would say that major shifts happen when we move the boundaries between layers.

The author here proposes to replace servers by synced client-side data stores.

That is certainly a good idea for some applications, but it also comes with drawbacks. For example, it would be easier to avoid stale data, but it would be harder to enforce permissions.

worthless-trash

I feel like this is the "serverless" discussion all over again.

There was still a server, it's just not YOUR server. In this case, there will still be servers, just maybe not something that you need to manage state on.

This misnaming creates endless conflict when trying to communicate this to hyper-excited management who want to get on the latest trend.

Can't wait to be in a meeting and hear: "We don't need servers when we migrate to client-side data stores".

TeMPOraL

I think the management isn't hyper-excited about naming - in fact, they couldn't care less what the name means (it's just a buzzword). What they're excited about is what the thing does - which is, turn more CapEx into OpEx. With "cloud", we can subscribe to servers instead of owning them. With "serverless", we can subscribe directly to what servers do, without managing the servers themselves. Etc.

Diederich

Recently, something quite rare happened. I needed to Xerox some paper documents. Well, such actions are rare today, but years ago, it was quite common to Xerox things.

Over time, the meaning of the word 'Xerox' changed. More specifically, it gained a new meaning. For a long time, Xerox only referred to a company named in 1961. Some time in the late 60s, it started to be used as a verb, and as I was growing up in the 70s and 80s, the word 'Xerox' was overwhelmingly used in its verb form.

Our society decided as a whole that it was OK for the noun Xerox to be used as a verb. That's a normal and natural part of language development.

As others have noted, management doesn't care whether the serverless thing you want to use is running on servers or not. They care that they don't have to maintain servers themselves. CapEx vs OpEx and all that.

I agree that there could be some small hazard with the idea that, if I run my important thing in a 'serverless' fashion, then I don't have to associate all of the problems/challenges/concerns I have with 'servers' to my important thing.

It's an abstraction, and all abstractions are leaky.

If we're lucky, this abstraction will, on average, leak very little.

philsnow

> Over time, the meaning of the word 'Xerox' changed. More specifically, it gained a new meaning. For a long time, Xerox only referred to a company named in 1961. Some time in the late 60s, it started to be used as a verb, and as I was growing up in the 70s and 80s, the word 'Xerox' was overwhelmingly used in its verb form.

https://www.youtube.com/watch?v=PZbqAMEwtOE#t=5m58s I don't think this dramatization (of court proceedings from 2010) is related to Xerox's plight with losing their trademark, but said dramatization is brilliant nonetheless.

mentalgear

Honourable mentions of some more excellent fully open-source sync engines:

- Zero Sync: https://github.com/rocicorp/mono

- Triplit: https://github.com/aspen-cloud/triplit

ochiba

Useful directory of tools here: https://localfirstweb.dev/

guappa

> - Zero Sync: https://github.com/rocicorp/mono

Doesn't even have a readme :D Raise the bar a bit maybe.

thruflo

https://zero.rocicorp.dev/docs/introduction

Hard to raise the bar on Zero. It’s a brilliant system.

profstasiak

Can you share how you are using this? Production / side projects?

Would you recommend it for side projects?

daveguy

It does have a readme. Click the "View all files" button.

But you don't have to. GitHub shows the readme just below the partial file list. That's what all the same-page docs on GitHub/GitLab repositories are.

Full docs are linked from the readme.

thunderbong

"Website and Docs" is the second line I see

mentalgear

if you know of other honourable mentions, reply with their source link!

aboodman

sergioisidoro

I've been very curious about Electric - the idea of giving your application a replicated subset of your database, using your API as a proxy, is quite interesting for apps where the business layer between the DB and the client is thin (our case).

Edit: Also, their decision to make it just one-way sync makes a LOT of sense. Write access brings a lot of scary cases, so making it read-only sync eases some of my anxieties. I can still use REST / RPC for updating the data.

mentalgear

Convex I didn't know yet - looks really crisp (even has Svelte support)! Do you have experience with it? Does it support (decentralized) E2E?

hop_n_bop

More the basis for building your own backend sync solution in Go than a complete product, this library does an rsync-like protocol to minimize the data transferred to sync up two filesystems; it's a very general building block:

https://github.com/glycerine/jcp

zx8080

> decoupled from the horrors of an unreliable network

There's no such thing as a reliable network in the world. The world is network-connected; there are almost no local-only systems anymore (and haven't been for a long, long time).

Some engineers dream that there are some cases where the network is reliable, like when a system lives entirely in the same region and a single AZ. But even then it's actually not reliable, and can have glitches quite frequently (like once per month or so, depending on luck).

01HNNWZ0MV43FF

True. Even the network between the CPU and an SD card or USB drive is not reliable.

jimbokun

I believe the point is that given an unreliable network, it's nice to have access to all the data available locally up to the point when you had a network issue. And then when the network is working again, your data comes up to date with no extra work on the application developer's part.

tonsky

> There's no such thing as a reliable network in the world

I’m not saying there is

myflash13

Locally synced databases seem to be a new trend. Another example is Turso, which works by maintaining a sort of SQLite-DB-per-tenant architecture. Couple that with WASM and we’ve basically come full circle back to old school desktop apps (albeit with sync-on-load). Fat client thin client blah blah.

PaulHoule

Lotus Notes was a product far ahead of its time (nearly forgotten today): an object database with synchronization semantics. They made a lot of decisions that seem really strange today, like building an email system around it, but that empowered it for long-running business workflows. It's something everybody in the low-code/no-code space really needs to think about.

ddrdrck_

No one who has ever had to work with Lotus Notes could forget it. It was atrocious. Maybe the sync engine was great, but I really do not know what it was used for ...

skybrian

This is also a tricky UI problem. Live updates, where web pages move around on you while you’re reading them, aren’t always desirable. When you’re collaborating with someone you know on the same document, you want to see edits immediately, but what about a web forum? Do you really need to see the newest responses, or is this a distraction? You might want a simple indicator that a reload will show a change, though.

A white paper showing how Instant solves synchronization problems might be nice.

slifin

I'm surprised to see Tonsky here

Mostly because I consider the state of the art on this to be Clojure Electric, and he presumably is aware of it at least to some degree but does not mention it.

tonsky

Clojure Electric is different. It's not really sync, it's more of a thin client. It relies on having a fast connection to the server at all times, and re-fetches everything all the time. Their innovation is that they found a really, really ergonomic way to do it.

dustingetz

Electric's network state distribution is fully incremental; I'm not sure what you mean by "re-fetches everything all the time", but that is not how I would describe it.

If you are referring to virtual scroll over large collections - yes, we use the persistent connection to stream the window of visible records from the server in realtime as the user scrolls, affording approximately realtime virtual scroll over arbitrarily large views (we target collections of 500-50,000 records and test at 100ms artificial RT latency; my actual prod latency to the Fly edge network is 6ms RT ping). The Electric client retains in memory precisely the state needed to materialize the current DOM state, no more, no less. Which means the client process performance is decoupled from the size of the dataset - which is NOT the case for sync engines, which put high memory and compute pressure on the end user device for enterprise-scale datasets.

It also inherits the traditional backend-for-frontend security model, which all enterprise apps require - including consumer apps like Notion that make the bulk of their revenue from enterprise citizen devs and are therefore exposed to enterprise data security compliance. And this is in an AI-focused world where companies want to defend against AI scrapers so they can sell their data assets to foundation model providers for use in training!

Which IMO is the real problem with sync engines: they are not a good match for enterprise applications, nor are they a good match for hyper-scale consumer SaaS that aspire to sell into enterprise. So what market are they for exactly?

quotemstr

Clojure Electric is proprietary software, which disqualifies it immediately no matter its other purported benefits

mananaysiempre

I'm also surprised, but more because I remember very vividly his previous post on sync [1], which described a much more user-friendly (and much less startup-friendly) system.

[1] https://tonsky.me/blog/crdt-filesync/

profstasiak

thank you for mentioning! I have been reading a lot about sync engines and never saw Clojure Electric being mentioned here on HN!

ForTheKidz

> You’ll get your data synced for you

How does this happen without an interface for conflict resolution? That's the hard part.

phito

Right, the first thing I did after opening the article was CTRL-F for "conflict", and got zero results. How are they not talking about the only real problem with the local-first approach? The rest is just boilerplate code.

Sammi

All this recent hype about sync engines and local-first applications completely disregards conflict resolution. It's the reason syncing isn't mainstream already: it isn't solved, and arguably cannot be.

Imagine if git just on its own picked what to keep and what to throw away when there's a conflict. You fundamentally need the user to make the choice.

aboodman

Zero (zerosync.dev) uses transactional conflict resolution, which is what our prior products Replicache and Reflect both used. It is very similar to what multiplayer games have done for decades.

It is described here:

https://rocicorp.dev/blog/ready-player-two

It works really well and we and our customers have found it to be quite general.

It allows you to run an arbitrary transaction on the server side to decide what to do in case of conflicts. It is the software equivalent of git asking the user what to do. Zero asks your code what to do.

But it asks it in the form of the question "please run the function named x with these inputs on the current backend db state". Which is a much more ergonomic way to ask it than "please do a 3-way merge between these three states".
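
A sketch of that general pattern (not Zero's actual API): the client sends named mutations, and the server replays each one as a transaction against the authoritative state, so "conflict resolution" is just whatever the mutator decides to do given the current data.

  interface Db {
    get(key: string): unknown;
    set(key: string, value: unknown): void;
  }
  type Mutators = Record<string, (db: Db, args: unknown) => void>;

  function applyMutation(
    db: Db, // assume this wraps a server-side transaction
    mutators: Mutators,
    mutation: { name: string; args: unknown },
  ): void {
    const fn = mutators[mutation.name];
    if (!fn) throw new Error(`unknown mutator: ${mutation.name}`);
    // Re-run the client's optimistic mutation on the server's state;
    // the outcome may differ from what the client computed locally.
    fn(db, mutation.args);
  }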

Conflict resolution is not the reason why there has not been a general-purpose sync engine. None of our customers have ~ever complained about conflict resolution.

The reason there has not been a general-purpose sync engine is actually on the read side:

  - Previous sync engines really want you to sync all data. This is impractical for most apps.

  - Previous sync engines do not have practical approaches to permissions.

These problems are being solved in the next generation of sync engines.

For more on this, I talk about it some here:

https://www.youtube.com/watch?v=rqOUgqsWvbw

probabletrain

I think with good presence (being able to see what other users are doing) and an app that isn't used offline, conflicts are essentially not a problem. As long as whatever is resolving the conflicts resolves them in a way that doesn't break the app - e.g. making sure there aren't cycles in some multiplayer app with a tree data structure. Sounds like Zero has the right idea here; I'll build something on it imminently to try it out.

Sammi

"It is the software equivalent of git asking the user what to do. Zero asks your code what to do."

You are asking the dev what to do. You are _not_ asking the user what to do. This is akin to the git devs baking a choice into git about what to keep in a merge conflict.

It's hard to trust you guys when you misrepresent like this. I thought long and hard on whether to respond confrontationally like this, but decided you really need to hear the pushback on this.

probabletrain

> Previous sync engines really want you to sync all data

Linear had to do all sorts of shenanigans to be able to sync all data, for orgs with lots of it – there's a talk on that here:

https://www.youtube.com/watch?v=Wo2m3jaJixU&t=1473s

lifty

Is Zero based on prolly trees?

Jyaif

> All this recent hype about sync engines and local-first applications completely disregards conflict resolution

The main concern of sync engines is precisely the conflict resolution! Everything else is simple in comparison.

The good news is that under some circumstances it is possible to resolve conflicts without user intervention. The simplest example is a counter that can only be incremented. More advanced data structures that automatically resolve conflicts exist - for example for strings - and those are good enough for a text editor.
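
That increment-only counter, sketched as the classic grow-only counter: one slot per replica, merge by element-wise max, value is the sum, so concurrent increments never conflict.

  // G-Counter: each replica only ever increments its own slot.
  type GCounter = Map<string, number>; // replicaId -> count

  function increment(c: GCounter, replicaId: string): void {
    c.set(replicaId, (c.get(replicaId) ?? 0) + 1);
  }

  function merge(a: GCounter, b: GCounter): GCounter {
    const out = new Map(a);
    for (const [id, n] of b) out.set(id, Math.max(out.get(id) ?? 0, n));
    return out;
  }

  function value(c: GCounter): number {
    let sum = 0;
    for (const n of c.values()) sum += n;
    return sum;
  }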

I agree that there will be conflicts that are resolved in a way that yields nonsensical text, for example if there are 2 edits of the sentence "One cat":

One cat => Two cats

One cat => One dog

The resulting merge may be something like "Two cats dog". Something else (the user, an LLM...) will then have to fix it.

But that's totally OK, because in practice this will happen extremely rarely - only when the user has been offline for a long time. That user will be happy to have been able to work offline, largely compensating for the fact that they have to proofread the text again.

SkiFire13

This doesn't "solve" conflict resolution, it just picks one of the possible answers and then doesn't care whether it was the correct one or not.

It can be acceptable for some use cases, but not for others where you're still concerned about stuff that happens "extremely rarely" and is not under your direct control.

> Something else (the user, an LLM...) will then have to fix it.

This assumes that the user/LLM knows the conflict was automatically resolved and might need to be fixed, so the conflict is still there! You've just made the manual part delayed and non-mandatory, but if you want correctness it will still have to be there.

brulard

> in practice this will happen extremely rarely, only when the user would have been offline for a long time.

I don't think it would happen "extremely rarely". Drops in connectivity happen a lot, especially on cellular connection and this can absolutely happen a lot for some applications. Especially when talking about "offline first" apps.

sgt

> All this recent hype about sync engines and local-first applications completely disregards conflict resolution.

Not really true, though. I've used a couple of local sync engines - one internally built, and another, PowerSync [1], which is both commercial and now open source. Conflict resolution is definitely on the agenda, and a developer is definitely going to be mindful of conflicts when designing the application.

[1] https://www.powersync.com/

Sammi

My unfortunate point is that the dev cannot know what the user is doing, and so cannot in principle know what choice to make on behalf of the user in case of a conflict. This is not a code problem. It cannot be solved with code.

porridgeraisin

Precisely. The hype articles write all about the journey to The Wall, and then leave out the bit where you smash headfirst into it.

lifty

Very good point. The local-sync ecosystem is still in a young phase, and conflict resolution hasn't been tackled or solved yet. Most systems have a "last write wins" approach.

jamil7

> All this recent hype about sync engines and local-first applications

Kind of, but only really in the web world; it was the default on desktop for a long time and is pretty common on mobile.

avodonosov

They elaborate on the conflicts in the "80/20 for Multiplayer" section of this essay: https://www.instantdb.com/essays/next_firebase

(make sure to also read the footnote [28] there).

tonsky

Ah, no. Not really. People sometimes think about conflict resolution as a problem that needs to be solved. But it’s not solvable, not really. It’s part of the domain, it’s not going anywhere, it’s irreducible complexity.

You _will_ have conflicts (because your app is distributed and there are concurrent writes). They will happen at the semantic level, so only you (the app developer) _will_ be able to solve them. A database (or any other magical tool) can't do it for you.

Another misconception is that conflict resolution needs to be "solved" perfectly before any progress can be made. That is not true either. You might have unhandled conflicts in your system and still have a working, useful, successful product. Conflicts might be rare, insignificant, or people (your users) will just correct for/work around them.

I am not saying “drop data on the floor”, of course, if you can help it. But try not to overthink it, either.

DaiPlusPlus

> But it's not solvable, not really. It's part of the domain, it's not going anywhere, it's irreducible complexity. You _will_ have conflicts (because your app is distributed and there are concurrent writes). [...] Another misconception is that conflict resolution needs to be "solved" perfectly before any progress can be made. That is not true either. You might have unhandled conflicts in your system and still have a working, useful, successful product. Conflicts might be rare, insignificant, or people (your users) will just correct for/work around them.

I can't speak for whatever application-level problems you were trying to solve, but many problem cases can be massaged into being conflict-free by adding constraints (or rather, discovering constraints inherent in the business domain you can use). One example (and the best one, too) is to use an append-only logical model: then the synchronization problem reduces to a merge sort. Another kind of constraint might be to simply disallow "edit" access to local data when working offline (without a prior lock or lease being taken) but still allow "create".
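
A sketch of that reduction (assuming each log is already sorted by a total (ts, author) order and free of internal duplicates): synchronization becomes a deduplicating merge of sorted logs.

  interface Entry { ts: number; author: string; body: string }

  function entryKey(e: Entry): string {
    return `${e.ts}:${e.author}`;
  }

  // Merge two sorted append-only logs, suppressing duplicates.
  function mergeLogs(a: Entry[], b: Entry[]): Entry[] {
    const out: Entry[] = [];
    let i = 0, j = 0;
    while (i < a.length || j < b.length) {
      let next: Entry;
      if (j >= b.length) next = a[i++];
      else if (i >= a.length) next = b[j++];
      else {
        const takeA =
          a[i].ts < b[j].ts ||
          (a[i].ts === b[j].ts && a[i].author <= b[j].author);
        next = takeA ? a[i++] : b[j++];
      }
      if (out.length === 0 || entryKey(out[out.length - 1]) !== entryKey(next)) {
        out.push(next);
      }
    }
    return out;
  }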

> A database (or any other magical tool) can't do it for you.

Yes-and-no.

While I'm no fan of CORBA and COM+ (...or SOAP, or WS-OhGodMakeItStop), being "enterprise-y" meant they brought distributed transactions to any application, and that includes RDBMS-mediated distributed transactions (let's agree, an RDBMS is in a far better position to be the canonical transaction server than an application server running in front of it). For distributed systems needing transient distributed locks to prevent conflicts in the first place (so really only used by interactive users on the same LAN), this worked just as well as a local-only solution - and made it fault-tolerant too.

...so it is unfortunate that with the (absolutely justified) back-to-basics approach of REST [1] we lose built-in support for distributed transactions (even some of the more useful and legitimate parts of WebDAV - and so, piggy-backing on our web servers' built-in support for WebDAV verbs - seem to be going away). This all raises the barrier to entry for doing distributed transactions _right_, which means the next set of college hires won't have been exposed to them, which means they won't be a standard expected feature in the next major internal application they'll write for your org, which means you'll either have a race condition impacting a multi-billion-dollar business thing that no one knows how to fix or, more likely, just a crappy UX where you have to tell your users not to reload the page too quickly "just in case". Yes, I see advisories like that in the Zendesk pages of the next line-of-business SaaS you'll be voluntold to integrate into your org.

(I think today the "best" way to handle distributed locking between interactive users in a web app would necessitate a ServiceWorker using WebRTC, SSE, or a highly-reliable WebSocket - which itself is a load of work right there - and don't forget to do all your JS feature-checks, because eventually someone will try to use your app on an old Safari edition because they want to keep on using their vintage Mac - or anyone using Incognito mode, _gah_.)

[1]: https://devblast.com/b/calling-your-web-api-restful-youre-do...

iansinnott

Have been using Instant for a few side projects recently and it has been a phenomenal experience. 10/10, would build with it again. I suspect this is also at least partially true of client-server sync engines in general.

kenrick95

I concur with this. Been using it on my side project that only has a front-end. The "back-end" is 100% InstantDB. Although I found the permissions part a bit hard to understand, especially when it involves linking to other namespaces. Haven't checked them in a while; maybe they've improved on this...