Local-first software: You own your data, in spite of the cloud
80 comments
· July 5, 2025
DataDaoDe
Yes a thousand percent! I'm working on this too. I'm sick of everyone trying to come up with a use case to get all my data in everyone's cloud so I have to pay a subscription fee to just make things work. I'm working on a fitness tracking app right now that will use the Sublime model - just buy it, get updates for X years, sync with all your devices and use it forever. If you want updates after X years buy the newest version again. If it's good enough as is - and that's the goal - just keep using it forever.
This is the model I want from 90% of the software out there, just give me a reasonable price to buy it, make the product good, and don't marry it to the cloud so much that it's unusable without it.
There are also a lot of added benefits to this model in general beyond the data privacy (most are mentioned in the article), but not all the problems are solved here. This is a big space that still needs a lot of tooling to make things really easy going but the tech to do it is there.
Finally, the best part (IMHO) about local-first software is it brings back a much healthier incentive structure - you're not monetizing via ads or tracking users or maxing "engagement" - you're just building a product and getting paid for how good it is. To me it feels like it's software that actually serves the user.
charcircuit
>you're not monetizing via ads
Yes, you are. You can find tons of purely local apps that monetize themselves with ads.
DataDaoDe
Sure you could. I'm not, I don't think it's in the spirit of local first. And I wouldn't pay money for that, but if you or someone else wants to build that kind of software - it's a free world :)
criddell
It’s easy to say you wouldn’t do that, but if it gets to the point where you have an employee helping you out and in a downturn you have to choose between laying them off or pushing an ad to keep paying them one more quarter, you might reconsider.
thaumasiotes
> You can find tons of purely local apps tha[t] monetize themselves with a[d]s.
How do they do that without hitting the internet?
kid64
It's "local first", not "local only".
samwillis
There is now a great annual Local-first Software conference in Berlin (https://www.localfirstconf.com/) organised by Ink and Switch, and it's spawned a spin out Sync Conf this November in SF (https://syncconf.dev/)
There was a great panel discussion this year from a number of the co-authors of the paper linked, discussing what local-first software is in the context of dev tools and what they have learnt since the original paper. It's very much worth watching: https://youtu.be/86NmEerklTs?si=Kodd7kD39337CTbf
The community is very much settling on "sync" being a component of local-first, but applicable much more widely. Likewise, local-first is seen as a characteristic of end-user software, with dev tools - such as sync engines - being enabling tools rather than "local-first" themselves.
The full set of talks from the last couple of years are online here: https://youtube.com/@localfirstconf?si=uHHi5Tsy60ewhQTQ
It's an exciting time for the local-first / sync engine community. We've been working on tools that enable realtime collaborative and async collaborative experiences, and now with the onset of AI the market for this is exploding. Every AI app is inherently multi-user collaborative, with the agents as actors within the system. This requires the tech that the sync engine community has been working on.
2color
It's a very exciting moment for this movement. A lot of the research and tech for local-first is nearing the point that it's mature, efficient, and packaged into well designed APIs.
Moreover, local-first —at least in theory— enables less infrastructure, which could reignite new indie open source software with less vendor lock-in.
However, despite all my excitement about embracing these ideas in the pursuit of better software, there's one hurdle that is preventing more widespread adoption amongst developers, and that is the Web platform.
The Web platform lacks building blocks for distributing hashed and/or signed software that isn't tied to origins. In other words, it's hard to decouple web-apps from the same-origin model which requires you set up a domain and serve requests dynamically.
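To make the "hashed software" idea concrete, here is a minimal sketch of checking a downloaded app bundle against a digest obtained out of band. This is not something the Web platform does for top-level apps today (that is the complaint above); the bundle contents and the names `sha256_hex` and `verify_bundle` are made up for illustration.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_bundle(bundle: bytes, expected_hex: str) -> bool:
    """Check a downloaded bundle against a digest published by its author."""
    # compare_digest does a constant-time comparison of the two hex strings.
    return hmac.compare_digest(sha256_hex(bundle), expected_hex.lower())


# Hypothetical app bundle and the digest its author would publish.
bundle = b"console.log('hello from a local-first app');"
published_digest = sha256_hex(bundle)

print(verify_bundle(bundle, published_digest))                # True
print(verify_bundle(bundle + b"tampered", published_digest))  # False
```

The point of the sketch is that the identity of the app is its content, not the domain it was served from. A signature scheme would add "who published this" on top of "is this the exact bundle", but it is left out here.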
Service Workers and PWAs do help a bit in terms of building offline experiences, but if you want users to download once, and upgrade when they want (and internet is available), you can't use the Web. So you end up breaking out of the browser, and start using Web technologies outside of the browser with better OS functionality, like Electron, React Native, Tauri et al (the https://userandagents.com/ community is doing some cool experiments in this space).
Jtsummers
Worth a read, and it's had some very active discussions in the past:
https://news.ycombinator.com/item?id=19804478 - May 2019, 191 comments
https://news.ycombinator.com/item?id=21581444 - Nov 2019, 241 comments
https://news.ycombinator.com/item?id=23985816 - Jul 2020, 9 comments
https://news.ycombinator.com/item?id=24027663 - Aug 2020, 134 comments
https://news.ycombinator.com/item?id=26266881 - Feb 2021, 90 comments
https://news.ycombinator.com/item?id=31594613 - Jun 2022, 30 comments
https://news.ycombinator.com/item?id=37743517 - Oct 2023, 50 comments
the_snooze
Anything with online dependencies will necessarily require ongoing upkeep and ongoing costs. If a system is not local-first (or ideally local-only), it’s not designed for long-term dependability.
Connected appliances and cars have got to be the stupidest bit of engineering from a practical standpoint.
api
The entire thing is because of subscription revenue.
It’s self reinforcing because those companies that get subscription revenue have both more revenue and higher valuations enabling more fund raising, causing them to beat out companies that do not follow this model. This is why local first software died.
tikhonj
I remember seeing somebody summarize this as "SaaS is a pricing model" or "SaaS is financialization" and it totally rings true. Compared to normal software pricing, a subscription gives you predictable recurring revenue and a natural sort of price discrimination (people who use your system more, pay more). It's also a psychological thing: folks got anchored on really low up-front prices for software, so paying $2000 for something up-front sounds crazy even if you use it daily for years, but paying $25/month feels reasonable. (See also how much people complain about paying $60 for video games which they play for thousands of hours!)
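As a quick worked version of the numbers above (the $2000 and $25/month figures are the ones from this comment; the function name is just for illustration):

```python
def break_even_months(upfront_price: float, monthly_fee: float) -> float:
    """Months after which total subscription payments equal the one-time price."""
    if monthly_fee <= 0:
        raise ValueError("monthly_fee must be positive")
    return upfront_price / monthly_fee


months = break_even_months(2000, 25)
print(f"{months:.0f} months (about {months / 12:.1f} years)")
# prints: 80 months (about 6.7 years)
```

So anyone who uses the tool daily for more than roughly six and a half years pays more under the subscription, even though the up-front price is the one that "sounds crazy".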
It's sad because the dynamics and incentives around clear, up-front prices seem generally better than SaaS (more user control, less lock-in), but almost all commercial software morphs into SaaS thanks to a mix of psychology, culture and market dynamics.
There are other advantages to having your software and data managed by somebody else, but they are far less determinative than structural and pricing factors. In a slightly different world, it's not hard to imagine relatively expensive software up-front that comes with a smaller, optional (perhaps even third-party!) subscription service for data storage and syncing. It's a shame that we do not live in that world.
danjl
Correct. SaaS is a business model, not a technical concept. But the real problem is that there is no equivalent business model for selling local first software. Traditional desktop apps were single purchase items. Local first is not because you just navigate to a website in your browser and blammo you get the software. What we need is a way to make money off of local first software.
api
SaaS is a business model. Cloud is DRM. If you run the software in the cloud it can't be pirated and there is perfect lock-in. Double if the data can't be exported.
Related: I've been incubating an idea for a while that open source, as it presently stands, is largely an ecosystem that exists in support of cloud SaaS. This is quite paradoxical because cloud SaaS is by far the least free model for software -- far, far less free than closed source commercial local software.
bboygravity
The root cause of the problem is that it's easier to make personalized stuff with server/backend (?cloud?) than without maybe?
Example: I made a Firefox extension that automatically fills forms using an LLM. It's fully offline except (OPTIONALLY) the LLM part; optionally, because it also supports Ollama locally.
Now the issue is that it's way too hard for most people to use: find the LLM to run, acquire it somehow (pay to run it online or download it to run in Ollama), configure your API URL, enter your API key, and save all of your details for form filling locally in text files, which you then have to back up and synchronize to other devices yourself.
The alternative would be: create account, give money, enter details and all is synced and backed up automatically across devices, online LLM pre-selected and configured. Ready to go. No messing around with Ollama or OpenRouter, just go.
I don't know how to solve it in a local way that would be as user friendly as the subscription way would be.
Now things like cars and washing machines are a different story :p
tshaddox
> The root cause of the problem is that it's easier to make personalized stuff with server/backend (?cloud?) than without maybe?
That, and also there are real benefits to the end user of having everything persisted in the cloud by default.
okr
Can the LLM not help with setting up the local part? (Sorry, was just the first thought i had.)
montereynack
Cool to see the principles behind this, although I think it’s definitely geared towards the consumer space. Shameless self plug, but related: we’re doing this for industrial assets/industrial data currently (www.sentineldevices.com), where the entire training, analysis and decision-making process happens on customer equipment. We don’t even have any servers they can send data to; our model is explicitly geared toward everything happening on-device (so the network principle the article discussed I found really interesting). This is to support use cases in SCADA/industrial automation where you just can’t bring data to the outside world. There’s imo a huge customer base and set of use cases that are just casually ignored by data/AI companies because actually providing a service where the customer/user is is too hard, and they’d prefer to have the data come to them while keeping vendor lock-in. The funny part is, in discussions with customers we actually have to lean in and be very clear on the “no, this is local, there’s no external connectivity” piece, because they really don’t hear that anywhere and sometimes we have to walk them through it step by step to help them understand that everything is happening locally. It also tends to break the brains of software vendors. I hope local-first software starts taking hold more in the consumer space so we can see people start getting used to it in the industrial space.
spauldo
It doesn't help that all the SCADA vendors are jumping on the cloud wagon and trying to push us all in that direction. "Run your factory from your smartphone!" Great, now I'm one zero-day away from some script kiddie playing around with my pumps.
codybontecou
An exciting space and I'm glad you and your team are working in it.
I looked over your careers page and see all of your positions are non-remote. Is this because working on local-first software requires you to be in person? Or is this primarily a management issue?
hemant6488
I've been building exactly this with SoundLeaf [0] - an iOS client for the excellent open-source Audiobookshelf server. No data collection, no third-party servers, just your audiobooks syncing directly with your own instance.
The user-friendliness challenge is real though. Setting up Audiobookshelf [1] is more work than "just sign up," but once you have it running, the local-first client becomes much cleaner to build. No user accounts, no subscription billing, no scaling concerns. Simple pricing too: buy once, own forever. No monthly fees to access your own audiobooks.
Existenceblinks
Tried to adopt this last month at work, and it failed. E.g. the mentioned Automerge has poor docs https://automerge.org/docs/reference/library_initialization/... that leave a lot of questions open. It seems backend agnostic, but we had to figure out how to store and how to broadcast ourselves.
GMoromisato
Personally, I disagree with this approach. This is trying to solve a business problem (I can't trust cloud-providers) with a technical trade-off (avoid centralized architecture).
The problems with closed-source software (lack of control, lack of reliability) were solved with a new business model: open source development, which came with new licenses and new ways of getting revenue (maintenance contracts instead of license fees).
In the same way, we need a business model solution to cloud-vendor ills.
Imagine we create standard contracts/licenses that define rights so that users can be confident of their relationship with cloud-vendors. Over time, maybe users would only deal with vendors that had these licenses. The rights would be something like:
* End-of-life contracts: cloud-vendors should contractually spell out what happens if they can't afford to keep the servers running.
* Data portability guarantees: Vendors must spell out how data gets migrated out, and all formats must be either open or (at minimum) fully documented.
* Data privacy transparency: Vendors must track/audit all data access and report to the user who/what read their data and when.
I'm sure you can think of a dozen other clauses.
The tricky part is, of course, adoption. What's in it for the cloud-vendors? Why would they adopt this? The major fear of cloud-vendors is, I think, churn. If you're paying lots of money to get people to try your service, you have to make sure they don't churn out, or you'll lose money. Maybe these contracts come only with annual subscription terms. Or maybe the appeal of these contracts is enough for vendors to charge more.
AnthonyMouse
> This is trying to solve a business problem (I can't trust cloud-providers) with a technical trade-off (avoid centralized architecture).
Whenever it's possible to solve a business problem or political problem with a technical solution, that's usually a strong approach, because those problems are caused by an adversarial entity and the technical solution is to eliminate the adversarial entity's ability to defect.
Encryption is a great example of this if you are going to use a cloud service. Trying to protect your data with privacy policies and bureaucratic rules is a fool's errand because there are too many perverse incentives. The data is valuable, neither the customer nor the government can easily tell if the company is selling it behind their backs, it's also hard to tell if the provider has cheaped out on security until it's too late, etc.
But if it's encrypted on the client device and you can prove with math that the server has no access to the plaintext, you don't have to worry about any of that.
The trouble is sometimes you want the server to process the data and not just store it, and then the technical solution becomes, use your own servers.
hodgesrm
> * Data portability guarantees: Vendors must spell out how data gets migrated out, and all formats must be either open or (at minimum) fully documented.
This is not practical for data of any size. Prod migrations to a new database take months or even years if you want things to go smoothly. In a crisis you can do it in weeks but it can be really ugly. That applies even when moving between the same version of an open source database, because there's a lot of variation between the cloud services themselves.
The best solution is to have the data in your own environment to begin with and just unplug. It's possible with bring-your-own-cloud management combined with open source.
My company operates a BYOC data product which means I have an economic interest in this approach. On the other hand I've seen it work, so I know it's possible.
GMoromisato
I'd love to know more about BYOC. Does that apply to the raw data (e.g., the database lives inside the enterprise) or the entire application stack (e.g., the enterprise is effectively self-hosting the cloud)?
It seems like you'd need the latter to truly be immune to cloud-vendor problems. [But I may not understand how it works.]
WarOnPrivacy
> End-of-life contracts: cloud-vendors should contractually spell out what happens if they can't afford to keep the servers running.
I'm trying to imagine how this would be enforced when a company shutters and its principals walk away.
GMoromisato
It's a good question--I am not a lawyer.
But that's the point of contracts, right? When a company shuts down, the contracts become part of the liabilities. E.g., if the contract says "you must pay each customer $1000 if we shut down" then the customers become creditors in a bankruptcy proceeding. It doesn't guarantee that they get all (or any) money, but their interests are negotiated by the bankruptcy judge.
Similarly, I can imagine a contract that says, "if the company shuts down, all our software becomes open source." Again, this would be managed by a bankruptcy judge who would mandate a release instead of allowing the creditors to gain the IP.
Another possibility is for the company to create a legal trust that is funded to keep the servers running (at a minimal level) for some specified amount of time.
WarOnPrivacy
> When a company shuts down, the contracts become part of the liabilities.
The asset in the contract is their customer's data; it is becoming stale by the minute. It could be residing in debtor-owned hardware and/or in data centers that are no longer getting their bills paid.
It takes time to get a trustee assigned and I think we need an immediate response - like same day. (NAL but prep'd 7s & 13s)
WarOnPrivacy
(cont. thinking...) One possibility. A 3rd party manages a continually updating data escrow. It'd add some expense and complexity to the going concern.
al_borland
Does this really solve the problem? Let's say I'm using a cloud provider for some service I enjoy. They have documents that spell out that if they have to close their doors they will give X months of notice and allow for a data export. Ok, great. Now they decide to shut their doors and honor those agreements. What am I left with? A giant JSON file that is effectively useless unless I decide to write my own app, or some nice stranger does? The thought is there, it's better than nothing, but it's not as good as having a local app that will keep running, potentially for years or decades, after the company shuts their doors or drops support.
samwillis
> This is trying to solve a business problem (I can't trust cloud-providers) with a technical trade-off (avoid centralized architecture).
I don't think that's quite correct. I think the authors fully acknowledge that the business case for local-first is not completely solved and is a closely related problem. These issues need both a business and technical solution, and the paper proposes a set of characteristics of what a solution could look like.
It's also incorrect to suggest that local-first is an argument for decentralisation - Martin Kleppmann has explicitly stated that he doesn't think decentralised tech solves these issues in a way that could become mass market. He is a proponent of centralised standardised sync engines that enable the ideals of local-first. See his talk from Local-first conf last year: https://youtu.be/NMq0vncHJvU?si=ilsQqIAncq0sBW95
GMoromisato
I'm sure I'm missing a lot, but the paper is proposing CRDTs (Conflict-free Replicated Data Types) as the way to get all seven checkmarks. That is fundamentally a distributed solution, not a centralized one (since you don't need CRDTs if you have a central server).
And while they spend a lot of time on CRDTs as a technical solution, I didn't see any suggestions for business model solutions.
In fact, if we had a business model solution--particularly one where your data is not tied to a specific cloud-vendor--then decentralization would not be needed.
I get that they are trying to solve multiple problems with CRDTs (such as latency and offline support) but in my experience (we did this with Groove in the early 2000s) the trade-offs are too big for average users.
Tech has improved since then, of course, so maybe it will work this time.
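For anyone who hasn't met CRDTs before, here is about the smallest one there is, a grow-only counter. It is only meant to show the property being discussed (replicas can merge in any order, with no central server arbitrating, and still agree). It is not how Automerge or Yjs work internally; those handle text, maps and deletions with far more machinery. The class and replica names are made up.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class GCounter:
    """Grow-only counter: one slot per replica, merged by element-wise max."""

    counts: Dict[str, int] = field(default_factory=dict)

    def increment(self, replica_id: str, amount: int = 1) -> None:
        """Record local increments in this replica's own slot only."""
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        self.counts[replica_id] = self.counts.get(replica_id, 0) + amount

    def merge(self, other: "GCounter") -> "GCounter":
        """Combine two replicas; commutative, associative and idempotent."""
        keys = self.counts.keys() | other.counts.keys()
        return GCounter(
            {k: max(self.counts.get(k, 0), other.counts.get(k, 0)) for k in keys}
        )

    @property
    def value(self) -> int:
        return sum(self.counts.values())


# Two devices edit while offline, then sync in either direction.
laptop, phone = GCounter(), GCounter()
laptop.increment("laptop", 2)
phone.increment("phone", 3)

a = laptop.merge(phone)
b = phone.merge(laptop)
print(a.value, b.value, a.merge(a).value)  # 5 5 5
```

Because merging is just a max per slot, it doesn't matter who syncs with whom or how many times. With a central server you could get the same result by letting the server serialize the increments, which is the point being made above.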
prmoustache
> Personally, I disagree with this approach. This is trying to solve a business problem (I can't trust cloud-providers)
It is not only a business problem. I stay away from cloud based services not only because of subscription model, but also because I want my data to be safe.
When you send data to a cloud service, and that data is not encrypted locally before being sent to the cloud (a rare feature), it is not a question of if but when that data will be pwned.
mumbisChungo
A good contract can help you to seek some restitution if wrongdoing is done and you become aware of it and you can prove it. It won't mechanically prevent the wrongdoing from happening.
maccard
> Vendors must spell out how data gets migrated out, and all formats must be either open or (at minimum) fully documented.
Anecdotally, I’ve never worked anywhere where the data formats are documented in any way other than a schema in code.
davepeck
In theory, I love the local-first mode of building. It aligns well with “small tech” philosophy where privacy and data ownership are fundamental.
In practice, it’s hard! You’re effectively responsible for building a sync engine, handling conflict resolution, managing schema migration, etc.
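To give a flavour of the schema migration part: in a local-first app, old data sits on devices you can't reach, so each client typically has to upgrade documents itself when it loads them. A minimal sketch of that pattern, with an entirely hypothetical document shape and version history:

```python
from typing import Any, Callable, Dict

Doc = Dict[str, Any]


# Hypothetical migrations: each one upgrades a document by exactly one version.
def v1_to_v2(doc: Doc) -> Doc:
    # v2 splits the single "name" field into "first" and "last".
    first, _, last = doc["name"].partition(" ")
    return {"schema_version": 2, "first": first, "last": last}


def v2_to_v3(doc: Doc) -> Doc:
    # v3 adds a "tags" list with an empty default.
    return {**doc, "schema_version": 3, "tags": []}


MIGRATIONS: Dict[int, Callable[[Doc], Doc]] = {1: v1_to_v2, 2: v2_to_v3}
LATEST_VERSION = 3


def migrate(doc: Doc) -> Doc:
    """Upgrade a locally stored document step by step to LATEST_VERSION."""
    version = doc.get("schema_version", 1)
    if version > LATEST_VERSION:
        # Data written by a newer client: refuse rather than guess.
        raise ValueError(f"document version {version} is newer than this app")
    while version < LATEST_VERSION:
        doc = MIGRATIONS[version](doc)
        version = doc["schema_version"]
    return doc


old = {"schema_version": 1, "name": "Ada Lovelace"}
print(migrate(old))
# {'schema_version': 3, 'first': 'Ada', 'last': 'Lovelace', 'tags': []}
```

The hard part this sketch skips is what happens when a v1 client and a v3 client sync with each other, which is where it collides with the conflict resolution problem.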
This said, tools for local-first software development seem to have improved in the past couple years. I keep my eye on jazz.tools, electric-sql, and Rocicorp’s Zero. Are there others?
rzzzt
CouchDB on the server and PouchDB on the client was an attempt at making such an environment:
Also some more pondering on local-first application development from a "few" (~10) years back can be found here: https://unhosted.org/
sroussey
And RxDB. https://rxdb.info/
samwillis
Along with the others mentioned, it's worth highlighting Yjs. It's an incredible CRDT toolkit that enables many of the realtime and async collaborative editing experience you want from local-first software.
thorum
I’ve built several apps on yjs and highly recommend it. My only complaint is that storing user data as a CRDT isn’t great for being able to inspect or query the user data server-side (or outside the application). You have to load all the user’s data into memory via the yjs library before you can work with any part of it. There are major benefits to CRDTs but I don’t think this trade-off is worth it for all projects.
zdragnar
I think I saw someone point out automerge not long ago:
Rust and JavaScript implementations, a handful of network strategies. It doesn't come with the free or paid offering that jazz.tools does, but it's pretty nice.
3036e4
I use local software and sync files using git or sometimes fossil (both work fine in Android with termux for instance, for stuff I want to access on my phone). I don't host servers or use any special software that requires syncing data in special ways.
ofrzeta
Do you know that website? https://www.localfirst.fm
EDIT: actually I wanted to point to the "landscape" link (in the top menu) but that URL is quite unergonomic.
davepeck
No, I didn't know about it -- thank you! (EDIT: and the landscape page has lots of libraries I hadn't run across before. Neat.)
sgt
There's also PowerSync: https://www.powersync.com/
It's also open source and has bindings for Dart, JS, Swift, C#, Kotlin, etc
ochiba
This site also has a directory of devtools: https://lofi.so/
ibizaman
That’s essentially what I’m trying to make widely available through my projects https://github.com/ibizaman/selfhostblocks and https://github.com/ibizaman/skarabox. Their shared goal is to make self-hosting more approachable to the masses.
It’s based on NixOS to provide as much as possible out of the box and declaratively: https, SSO, LDAP, backups, ZFS w/ snapshots, etc.
It’s a competitor to cloud hosting because it packages Vaultwarden and Nextcloud to store most of your data. It does provide more services than that though, home assistant for example.
It’s a competitor to YUNoHost but IMO better (or aims to be) because you can use the building blocks provided by SelfHostBlocks to self-host any packages you want. It’s more of a library than a framework.
It’s a competitor to NAS but better because everything is open source.
It still requires the user to be technical but I’m working on removing that caveat. One of my goals is to allow installing it on your hardware without needing nix or touching the command line.
pastaheld
Love it! I've been thinking about this a lot lately. It's crazy how many great FOSS alternatives are out there to everything – and while they might be relatively easy to install for tech-people ("docker compose up"), they are still out of reach for non-tech people.
Also, so many of these selfhostable apps are web applications with a db, server and frontend, but for a lot of use cases (at least for me personally) you just use it on one machine and don't even need a "hosted" version or any kind of sync to another device. A completely local desktop program would suffice. For example I do personal accounting once a month on my computer – no need to have a web app running 24/7 somewhere else. I want to turn on the program, do my work, and then turn it off. While I can achieve that easily as a developer, most of the people can't. There seems to be a huge misalignment (for lack of a better word) between the amount of high-quality selfhostable FOSS alternatives and the amount of people that can actually use them. I think we need more projects like yours, where the goal is to close that gap.
I will definitely try to use selfhostblocks for a few things and try to contribute, keep it up!
virgoerns
I love that you include hledger! It's an amazing piece of software, even if a little obscure for people unfamiliar with plaintext accounting!
voat
Looks really neat! Thanks for building this
> "we have gone further than other projects down the path towards production-ready local-first applications based on CRDTs"
This seems like a bold claim, but IMHO Ink & Switch have earned their solid reputation and it wouldn't surprise me if it's true. I agree w/ their analysis and am philosophically aligned w/ their user-centric worldview. So who's going to build "Firebase for CRDTs"?