Replace PostgreSQL with Git for your next project

_alternator_

This is a bad idea. Every use case they mention has a simpler, more performant option that’s easier to maintain. It reads like someone asked an LLM if they can use git as a database, then decided to post that to the web because being controversial gets clicks.

luhn

That's exactly what they're doing; it's just driving engagement for their sales:

> While Git makes an interesting database alternative for specific use cases, your production applications deserve better. Upsun provides managed PostgreSQL, MySQL, and other database services

_alternator_

This should do the opposite: I would not trust, as my database service provider, anyone who thinks this is a solution worth considering for the use cases they identified.

9rx

Why is it a bad idea? They only mention one use case, but I have been in a similar situation where git was already being used and, like them, I simply needed to connect into the same functionality with my service. Sure, I could have come up with all kinds of complex solutions to synchronize git with other databases... Or I could just use git. It works fine. Why wouldn't it? That's the type of workload git is designed for.

daxfohl

Don't do this. I did this many years ago for a small internal "parts" database for our small EE team, since we needed an audit history of changes.

It was just awkward to use. Diffs were weird and hard to wrap a UI around, search wasn't great, it was hard to make ids sequential (the EE team hated uuids), etc., and conflict resolution usually still had to be done in a code editor. Even the killer app, the audit trail, didn't work quite the way the EE team wanted. Code to work around the disparities was probably half the codebase.

I ended up migrating to a normal database after a few months and everything was a lot better.
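For illustration only, here is a minimal sketch of how an audit history with sequential ids might look in a "normal database". It assumes SQLite through Python's `sqlite3` module; the table names (`parts`, `parts_audit`), columns, and helper functions are hypothetical and are not the schema of the project described above.

```python
import sqlite3

# Hypothetical schema: one data table plus an append-only audit table
# that is filled in by triggers, so application code cannot forget to log.
SCHEMA = """
CREATE TABLE parts (
    id    INTEGER PRIMARY KEY AUTOINCREMENT,  -- sequential ids, no UUIDs
    name  TEXT NOT NULL,
    value TEXT NOT NULL
);
CREATE TABLE parts_audit (
    audit_id   INTEGER PRIMARY KEY AUTOINCREMENT,
    part_id    INTEGER NOT NULL,
    action     TEXT NOT NULL,
    old_value  TEXT,
    new_value  TEXT,
    changed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TRIGGER parts_insert AFTER INSERT ON parts BEGIN
    INSERT INTO parts_audit (part_id, action, new_value)
    VALUES (NEW.id, 'insert', NEW.value);
END;
CREATE TRIGGER parts_update AFTER UPDATE ON parts BEGIN
    INSERT INTO parts_audit (part_id, action, old_value, new_value)
    VALUES (NEW.id, 'update', OLD.value, NEW.value);
END;
CREATE TRIGGER parts_delete AFTER DELETE ON parts BEGIN
    INSERT INTO parts_audit (part_id, action, old_value)
    VALUES (OLD.id, 'delete', OLD.value);
END;
"""


def open_db(path=":memory:"):
    """Open (or create) the database and install the schema and triggers."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def history(conn, part_id):
    """Return the audit rows for one part, oldest first."""
    rows = conn.execute(
        "SELECT action, old_value, new_value FROM parts_audit "
        "WHERE part_id = ? ORDER BY audit_id",
        (part_id,),
    )
    return rows.fetchall()
```

Every insert, update, and delete on `parts` then leaves a row in `parts_audit`, and `history(conn, part_id)` returns the trail as plain tuples that are easy to put a UI around.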

datadrivenangel

For a small project, this would be fine. Maybe a slight step up from just storing data in a single local file. For a more serious small project, SQLite is the way to go, as git's 'atomic' commits are still going to be relatively slow.

I assume you'd struggle to get a few hundred commits per second even on good hardware?

porridgeraisin

Some 4k per second on my laptop if you use git non-porcelain commands directly.
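A rough sketch of what "non-porcelain" means here, assuming a `git` binary on the PATH and driving the plumbing commands (`hash-object`, `update-index`, `write-tree`, `commit-tree`, `update-ref`) from Python. The helper names (`init_repo`, `put`, `get`), the `refs/heads/data` ref, the `kv-index` file, and the demo identity are made up for illustration; this is not a benchmark and says nothing about the throughput figure above.

```python
import os
import subprocess

# Hypothetical identity so commit-tree works without any user git config.
IDENTITY = {
    "GIT_AUTHOR_NAME": "demo",
    "GIT_AUTHOR_EMAIL": "demo@example.invalid",
    "GIT_COMMITTER_NAME": "demo",
    "GIT_COMMITTER_EMAIL": "demo@example.invalid",
}


def _git(repo, *args, stdin=None):
    """Run one git command in `repo` and return its raw stdout bytes."""
    # A private index file keeps this store separate from the normal staging area.
    index = os.path.join(os.path.abspath(repo), ".git", "kv-index")
    env = dict(os.environ, GIT_INDEX_FILE=index, **IDENTITY)
    result = subprocess.run(
        ["git", "-C", repo, *args],
        input=stdin, capture_output=True, check=True, env=env,
    )
    return result.stdout


def init_repo(path):
    """Create an empty repository at `path`."""
    subprocess.run(["git", "init", "--quiet", path], check=True)


def put(repo, key, value, ref="refs/heads/data"):
    """Store bytes `value` under path `key` as one new commit on `ref`."""
    blob = _git(repo, "hash-object", "-w", "--stdin", stdin=value).decode().strip()
    # Register the blob in the private index without touching the working tree.
    _git(repo, "update-index", "--add", "--cacheinfo", f"100644,{blob},{key}")
    tree = _git(repo, "write-tree").decode().strip()
    # for-each-ref prints nothing (and exits 0) when the ref does not exist yet.
    parent = _git(repo, "for-each-ref", "--format=%(objectname)", ref).decode().strip()
    parent_args = ["-p", parent] if parent else []
    commit = _git(repo, "commit-tree", tree, *parent_args, "-m", f"put {key}")
    commit = commit.decode().strip()
    _git(repo, "update-ref", ref, commit)
    return commit


def get(repo, key, ref="refs/heads/data"):
    """Read back the bytes stored under `key` at the tip of `ref`."""
    return _git(repo, "cat-file", "blob", f"{ref}:{key}")
```

Spawning a process per command, as this sketch does, is the slow way to use the plumbing; it only shows which steps a commit consists of once checkout and the staging workflow are skipped.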

kragen

If this sounds interesting, you might want to check out Dolt: https://github.com/dolthub/dolt “Git for Data! (...) a SQL database that you can fork, clone, branch, merge, push and pull just like a Git repository.”

With respect to querying Git repos, I was pleasantly surprised with how usable git cat-file --batch was as a programmatic way to walk around the Git graph in http://canonical.org/~kragen/sw/dev3/threepowcommit.py, which reads about 8000 commits per second on my laptop—not fast, but not as slow as you'd expect.
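As a hedged illustration of the approach (not the code in the linked script), walking the commit graph through a single long-lived `git cat-file --batch` process might look roughly like this. The function name `walk_commits` is hypothetical, and a `git` binary on the PATH is assumed.

```python
import subprocess


def walk_commits(repo, start="HEAD"):
    """Yield (commit_id, parent_ids) for each commit reachable from `start`.

    All objects are read through one `git cat-file --batch` process: we write
    an object name per line and read back "<sha> <type> <size>", the object
    body, and a trailing newline.
    """
    proc = subprocess.Popen(
        ["git", "-C", repo, "cat-file", "--batch"],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE,
    )
    try:
        pending = [start]
        seen = set()
        while pending:
            name = pending.pop()
            proc.stdin.write(name.encode() + b"\n")
            proc.stdin.flush()
            header = proc.stdout.readline().split()
            if len(header) != 3:
                continue  # e.g. "<name> missing": no body follows
            sha, kind, size = header[0].decode(), header[1], int(header[2])
            body = proc.stdout.read(size)
            proc.stdout.read(1)  # consume the newline after the body
            if kind != b"commit" or sha in seen:
                continue
            seen.add(sha)
            # Commit headers end at the first blank line; parents are listed there.
            head = body.split(b"\n\n", 1)[0]
            parents = [
                line.split()[1].decode()
                for line in head.split(b"\n")
                if line.startswith(b"parent ")
            ]
            yield sha, parents
            pending.extend(p for p in parents if p not in seen)
    finally:
        proc.stdin.close()
        proc.wait()
```

Something like `for sha, parents in walk_commits("."): print(sha, parents)` then lists every commit reachable from HEAD without starting a new process per object.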

farhanhubble

I've used Dolt briefly and loved it. I didn't need real time perf but being able to see diffs and checkout branches was phenomenal.

fschuett

I did once write a system like that using libgit2 for the German Grundbuch (land registry):

https://github.com/projekt-dgb/dgb-server/blob/master/API.md...

In the current system, rights, owners and debts of any land parcel in Germany are simply recorded in PDF files, with an ID for each parcel. So, when adding a "record", the govt employees literally just open the PDF file in a PDF editor and draw lines in the PDF, then save it again. Some PDF files are simply scanned pages of typewriter text (often even pre-WW2), so the lines are just added on top. It's a state-of-the-art "digital" workflow for our wonderful, modern country.

Anyway, so I wrote an entire system to digitize it properly (using libtesseract + a custom verifying tool, one PDF file => one JSON file) and track changes to parcels using git. The system was also able to generate Änderungsmitteilungen (change notices) using "git diff" and automatically send them via E-Mail (or invoke a webhook to notify banks or other legal actors, that the parcel had changed - currently this is done using paper and letters).
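To illustrate the idea of diff-based change notices (this is not the dgb-server implementation), here is a small sketch that uses Python's `difflib` over canonically formatted JSON as a stand-in for `git diff`. The function name `change_notice`, the field names, and the parcel id in the usage note are invented for the example.

```python
import difflib
import json


def change_notice(parcel_id, old, new):
    """Return a plain-text notice describing what changed, or None if nothing did.

    `old` and `new` are the parcel records as dicts. Dumping them with sorted
    keys and fixed indentation keeps the diff stable across key reordering.
    """
    old_lines = json.dumps(old, indent=2, sort_keys=True, ensure_ascii=False).splitlines()
    new_lines = json.dumps(new, indent=2, sort_keys=True, ensure_ascii=False).splitlines()
    diff = list(
        difflib.unified_diff(
            old_lines, new_lines,
            fromfile=f"{parcel_id}.json (before)",
            tofile=f"{parcel_id}.json (after)",
            lineterm="",
        )
    )
    if not diff:
        return None
    return "\n".join([f"Change notice for parcel {parcel_id}", *diff])
```

Calling `change_notice("parcel-1", {"owner": "A", "debts": []}, {"owner": "B", "debts": []})` yields a short unified diff that could go into an e-mail body or a webhook payload; identical records yield `None`, so no notice is sent.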

It was a really cool system (desktop digitizing tool + web server for data management), but the German government didn't care, although some called it "quite impressive". So it's now just archived on GitHub. Their current problem with "digitization" was that every state uses different formats and extra fields, and they are still (after 20+ years) debating what the "standardized" database schema should be (I tried to solve that with an open, extensible JSON schema, but nah [insert guy flying out of window meme]). I'm a one-man show, not a multi-billion dollar company, so I didn't have the "power" to change much.

Instead, their "digital Grundbuch" (dabag) project has been a "work in progress" for 20+ years: https://www.grundbuch.eu/nachrichten/ because 16 states cannot standardize on a unified DB schema. So it's back to PDF files. Why change a working system, I guess. Germans, this is what your taxes are spent on. Oh well, the project was still very cool.

fibers

What would be the downsides of running this as a self-hosted instance? I would imagine GitHub would not take kindly to this use case.

9dev

GitHub wouldn't care for the most part, because they have solid rate limits in place that would foil your plan to use this for a production app immediately.

zimbatm

Gerrit is doing this with NoteDB. Backups are just one git clone away.

See https://gerrit-review.googlesource.com/Documentation/note-db...

bitmasher9

Imagine the possibilities of using GitHub Actions as a way to trigger events based on new data being saved. This could be used to sync data across multiple git databases.

_alternator_

This is built into git as “git hooks”; GitHub not required. But don’t do it. Postgres has a system to trigger commands based on SQL patterns (e.g. on commit, when updating a row in this table, etc.). Much more powerful and maintainable, since it is designed for this.

The fact is that basically every data structure can be abused as a database (Python dicts, flat files, waves in mercury traveling down a long tube). Don’t reinvent the wheel; learn to use a power tool like Postgres.

stuartjohnson12

do not do this lol

mxuribe

"Your Scientists Were So Preoccupied With Whether Or Not They Could, They Didn’t Stop To Think If They Should" (quote by Jeff Goldblum's character Ian Malcolm in the original Jurassic Park film). /s

More seriously, I agree that making a use case fit seems like too much of a stretch. Yes, it's cool that git can be used in this fashion, as a neat experiment! But even for non-serious needs, I'm not sure I would ever do this. Still, very clever to think of this and actually do it; kudos!