Immutability Changes Everything (2016) [pdf]
43 comments
January 25, 2025 · LeftHandPath
Immutability is a fantastic tool, especially when working with enterprise data. It's relatively easy to implement your own temporal tables on most existing databases, no special libraries or tools required. It seems really trivial/obvious, but I'll admit I first stumbled into the concept using the AS400 at work. If you make a mistake on payroll in IBM's old MAPICS program, you don't overwrite or delete it. You introduce a new "backout record" to nullify it, then (maybe) insert another record with the correct data. It seems obvious once you've seen the pattern.
I've made a few non-technical eyes go wide by explaining A) that this is done and B) how it is done. The non-tech crypto/blockchain enthusiasts I've met get really excited when they learn you can make a set of data immutable without blockchain / merkle trees. Actually, explaining that is a good way to introduce the concept of a merkle tree / distributed ledger, and why "blockchain" is specifically for systems without a central authority.
(Bi)temporal and immutable tables are especially useful for things like HR, PTO, employee clock activity, etc. They help keep things auditable and correct.
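A minimal sketch of the backout-record pattern described above (Python, with invented payroll rows; real MAPICS records obviously look different):

```python
from dataclasses import dataclass
from decimal import Decimal

@dataclass(frozen=True)
class PayrollRecord:
    employee: str
    amount: Decimal          # negative amounts are backout records
    backs_out: int | None    # index of the record this one nullifies

ledger: list[PayrollRecord] = []

def post(employee: str, amount: Decimal) -> int:
    """Append a record; nothing is ever updated or deleted."""
    ledger.append(PayrollRecord(employee, amount, None))
    return len(ledger) - 1

def back_out(index: int) -> int:
    """Nullify a mistaken record by appending its negation."""
    orig = ledger[index]
    ledger.append(PayrollRecord(orig.employee, -orig.amount, index))
    return len(ledger) - 1

# Mistaken payment, its backout, then the corrected entry:
i = post("alice", Decimal("1000.00"))   # oops, should have been 100.00
back_out(i)
post("alice", Decimal("100.00"))

balance = sum(r.amount for r in ledger if r.employee == "alice")
assert balance == Decimal("100.00")     # history preserved, total correct
```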
layer8
Without specific support from the RDBMS, bitemporal schemas are difficult with regard to cross-table references, such as foreign keys. Rows that need to be consistent between tables aren’t necessarily 1:1 anymore, but instead each row in one table needs to be consistent with all corresponding rows in the other table having an intersecting time interval. You then run into problems with transaction isolation and visibility.
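To make the interval problem concrete, here's a toy check (Python, with hypothetical department validity intervals) of what a temporal foreign key has to verify: every instant of the referencing row's validity must be covered by some version of the referenced row:

```python
from datetime import date

# (value, valid_from, valid_to) — half-open intervals, hypothetical data
dept_versions = [
    ("Sales", date(2020, 1, 1), date(2022, 1, 1)),
    ("Sales", date(2022, 6, 1), date(9999, 1, 1)),  # gap: dept closed for 5 months
]

def covers(parent_rows, child_from, child_to):
    """True if the union of parent validity intervals covers [child_from, child_to)."""
    intervals = sorted((f, t) for _, f, t in parent_rows)
    cursor = child_from
    for f, t in intervals:
        if f > cursor:
            return False          # uncovered gap before this parent version
        cursor = max(cursor, t)
        if cursor >= child_to:
            return True
    return cursor >= child_to

# An employee row referencing "Sales" across the gap violates the temporal FK:
print(covers(dept_versions, date(2021, 1, 1), date(2021, 6, 1)))  # True
print(covers(dept_versions, date(2021, 1, 1), date(2023, 1, 1)))  # False
```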
pyrale
> bitemporal schemas are difficult with regard to cross-table references
Who needs more than one table? >:)
More complex models can be built and stored separately. The great benefit of this method is that, once you're unhappy with your table model, you can trash it and rebuild it from scratch without worrying about data migration.
layer8
Your last sentence sounds more like event sourcing than bitemporal databases, which are quite different concepts. I don’t see how bitemporal schemas simplify schema migration.
hobs
Pretty much, you want triggers to store things in a schemaless fashion in an audit format so that you are free to migrate tables.
This does require either knowing the schema at the point in time or recording enough information to do a schema on read.
The other option, of course, is to run a table like an API: always adding, never removing.
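A sketch of the schema-on-read half of that (Python, with invented audit rows; a real system would write these from triggers):

```python
import json

# Append-only audit rows: each carries the schema version it was written under.
audit_log = [
    {"schema": 1, "data": json.dumps({"name": "Ada Lovelace"})},
    {"schema": 2, "data": json.dumps({"first": "Ada", "last": "Lovelace"})},
]

def read_name(row):
    """Schema-on-read: interpret each row per the schema it was written with."""
    d = json.loads(row["data"])
    if row["schema"] == 1:
        first, _, last = d["name"].partition(" ")
        return first, last
    return d["first"], d["last"]

assert [read_name(r) for r in audit_log] == [("Ada", "Lovelace"), ("Ada", "Lovelace")]
```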
refset
> It's relatively easy to implement your own temporal tables on most existing databases
It gets tricky when you need to change the schema without breaking historical data or queries. SQL databases could do a lot more to make immutability easier and widespread.
jiggawatts
One fundamental issue I’ve noticed is that typical SQL databases have a single schema per table defining both the logical and physical aspects, usually with a strong correlation between the two.
Databases could treat the columns as the fundamental unit with tables being not much more than a view of a bunch of columns that can change over both space (partitioning) and time (history).
bobnamob
That’s effectively how datomic works. Datoms are the fundamental unit, with attributes being analogous to a column name and views being the 4 indexes that datomic keeps
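A toy illustration (Python; loosely modeled on Datomic's datoms, not its actual API): facts are (entity, attribute, value, tx) tuples where attributes play the role of column names, and both "tables" and as-of snapshots are just views over them:

```python
# Facts as (entity, attribute, value, transaction) tuples, loosely after Datomic.
datoms = [
    (1, "name",  "alice", 100),
    (1, "title", "engineer", 100),
    (1, "title", "manager", 200),   # a later tx supersedes the attribute
]

def as_of(tx):
    """Latest value per (entity, attribute) visible at transaction tx."""
    view = {}
    for e, a, v, t in sorted(datoms, key=lambda d: d[3]):
        if t <= tx:
            view[(e, a)] = v
    return view

assert as_of(100)[(1, "title")] == "engineer"
assert as_of(200)[(1, "title")] == "manager"   # nothing was overwritten
```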
teleforce
>Actually, explaining that is a good way to introduce the concept of a merkle tree / distributed ledger, and why "blockchain" is specifically for systems without a central authority
This is a very important point: whatever systems or solutions you build, don't overengineer, and always remember that premature optimization is the root of all evil.
It used to be blockchain, and now ML/AI seems to be the new fad. Most likely the majority of solutions being designed with ML/AI today don't need it, and using it anyway just makes them expensive/slow/complex/non-deterministic/etc.
People need to wake up and smell the coffee: ultimately ML/AI is just one tool among many in the toolbox.
gatane
My main gripe with immutability is that updating data requires building a full copy of the data with the changes. Sure, you could have zippers to aid in the updating process by acting as a kind of cursor/pointer, but raw access to data beats them any time (even if you optimize for cache).
So if you had to optimize for raw speed, why not choose mutable data?
dsQTbR7Y5mRHnZv
> My main gripe with immutability is that updating data requires building a full copy of the data with the changes.
Conceptually yes, but the implementation doesn't always necessarily need to work that way under the hood: https://www.roc-lang.org/functional#opportunistic-mutation
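A rough sketch of that opportunistic-mutation idea (Python, with a hand-rolled reference count standing in for the runtime's; Roc's real analysis is more subtle):

```python
class Ref:
    """A list plus an explicit reference count, standing in for the runtime's."""
    def __init__(self, items, refs=1):
        self.items, self.refs = items, refs

def set_item(r: Ref, i: int, value) -> Ref:
    """Semantically a pure update; physically in-place when nobody else can see r."""
    if r.refs == 1:
        r.items[i] = value               # opportunistic in-place mutation, O(1)
        return r
    r.refs -= 1                          # shared: leave the original intact
    return Ref(r.items[:i] + [value] + r.items[i + 1:])

unique = Ref([1, 2, 3])
assert set_item(unique, 0, 9) is unique  # mutated in place, no copy

shared = Ref([1, 2, 3], refs=2)
copy = set_item(shared, 0, 9)
assert shared.items == [1, 2, 3] and copy.items == [9, 2, 3]
```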
munchler
> My main gripe with immutability is that updating data requires building a full copy of the data with the changes.
That is not true in general. There are plenty of data structures that can be updated without forcing a full copy. Lists, trees, sets, maps, etc. All of these are common in functional programming. This is discussed in the article (e.g. "Append-Only Computing").
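For example, a persistent binary tree copies only the path from the root to the changed node and shares everything else (a minimal Python sketch):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def insert(node: Optional[Node], key: int) -> Node:
    """Returns a new tree; only the root-to-leaf path is copied, O(log n) nodes."""
    if node is None:
        return Node(key)
    if key < node.key:
        return Node(node.key, insert(node.left, key), node.right)
    return Node(node.key, node.left, insert(node.right, key))

v1 = insert(insert(insert(None, 5), 3), 8)
v2 = insert(v1, 4)           # v1 is untouched and still fully usable
assert v2.right is v1.right  # the right subtree is shared, not copied
```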
sarchertech
If you really care about performance, iterating over any of those is going to be much, much slower than iterating over an array.
munchler
If you really care about multi-threading, mutating array elements is going to be much buggier than using an immutable data structure.
KingMob
> My main gripe with immutability is that updating data requires building a full copy of the data with the changes.
That's not generally true. Many immutable-by-default languages use "persistent" data structures, where "persist" here means that much of the original structure persists in the new one.
For more, see:
- Purely Functional Data Structures by Okasaki: https://www.cs.cmu.edu/~rwh/students/okasaki.pdf
- Phil Bagwell's research, e.g. https://infoscience.epfl.ch/record/64398/files/idealhashtree...
mrkeen
Someone should try it with postgres. Make a raw speed branch that gets rid of the overhead of mvcc:
> while querying a database each transaction sees a snapshot of data (a database version) as it was some time ago, regardless of the current state of the underlying data
https://www.postgresql.org/docs/7.1/mvcc.html
ahoka
That’s not exactly how PostgreSQL works. This is true only at certain isolation levels.
dang
Related:
Immutability Changes Everything (2016) - https://news.ycombinator.com/item?id=27640308 - June 2021 (94 comments)
Immutability Changes Everything - https://news.ycombinator.com/item?id=10953645 - Jan 2016 (4 comments)
Immutability Changes Everything [pdf] - https://news.ycombinator.com/item?id=8955130 - Jan 2015 (25 comments)
(Reposts are fine after a year or so; links to past threads are just to satisfy extra-curious readers)
gleenn
I love the quote "accountants don't use erasers". So many things should be modeled over time and keep track of change right out of the gate. Little things like Ruby on Rails always adding timestamps to model tables were super helpful, but also a little code smell: if this is obvious enough to be useful everywhere, what is the next level? One more reason Datomic is so cool: nothing is overwritten. It is overlaid with a newer record, so you can always look back, and you can always take a slice of the DB at a specific time and get a complete and consistent view of the universe at that moment. Immutability!
cowsandmilk
The “right to be forgotten” has caused a lot of conflicts with certain immutable data stores. If I can reconstruct a snapshot with a user’s data, have I actually “forgotten” them? Having a deadline by which merges fully occur and old data is rendered inaccessible is sometimes a legal necessity.
hcarvalhoalves
You can always "redact" previous data. You can treat the sensitive entries themselves as mutable, without it breaking the system design around immutable data.
I have also seen a scheme where you store the hash and have a separate lookup table for the sensitive data, which you can redact more easily without messing with the log.
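A sketch of that hash-plus-lookup scheme (Python; all names invented): the append-only log keeps only digests, while the side table holds the sensitive values and is the one place where deletion happens:

```python
import hashlib

log = []        # append-only, safe to keep forever
pii = {}        # hash -> sensitive value; the only mutable, deletable store

def record(event: str, sensitive_value: str) -> None:
    # Real systems use a salted/keyed hash so low-entropy values (like emails)
    # can't be brute-forced back from the digest.
    digest = hashlib.sha256(sensitive_value.encode()).hexdigest()
    pii[digest] = sensitive_value
    log.append((event, digest))          # the log never stores the raw value

def forget(sensitive_value: str) -> None:
    """Right-to-be-forgotten: drop the value; log entries stay intact."""
    pii.pop(hashlib.sha256(sensitive_value.encode()).hexdigest(), None)

record("signup", "alice@example.com")
forget("alice@example.com")
event, digest = log[0]
assert digest not in pii                 # value gone, history's shape preserved
```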
mrkeen
Likewise with database backups.
prydt
One of my favorite papers! This reminds me of Martin Kleppmann's work on Apache Samza and the idea of "turning the database inside out" by hosting the write-ahead log on something like Kafka and then having many different materialized views consume that log.
Seems like a very powerful architecture that is both simple and decouples many concerns.
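A toy version of that architecture (Python, with a plain list and functions standing in for Kafka and the view maintainers): one append-only log, many independently derived views:

```python
events = [                       # the log, the single source of truth
    {"type": "deposit",  "account": "a1", "amount": 100},
    {"type": "withdraw", "account": "a1", "amount": 30},
    {"type": "deposit",  "account": "a2", "amount": 50},
]

def balances(log):
    """One materialized view: current balance per account."""
    view = {}
    for e in log:
        sign = 1 if e["type"] == "deposit" else -1
        view[e["account"]] = view.get(e["account"], 0) + sign * e["amount"]
    return view

def activity_count(log):
    """Another view over the same log, maintained independently."""
    view = {}
    for e in log:
        view[e["account"]] = view.get(e["account"], 0) + 1
    return view

assert balances(events) == {"a1": 70, "a2": 50}
assert activity_count(events) == {"a1": 2, "a2": 1}
```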
082349872349872
In their 1992 Transaction Processing book*, Gray and Reuter extrapolate h/w and s/w trends forward and predict that the DBMS of their far future would look like a tape robot for backing store with materialised views in main memory.
Substitute streams for tape i/o, and this description of Samza sounds like it could be very similar to that vision.
* as far as I know, their exposition of the WAL and tradeoffs in its implementation has aged well. Any counter opinions?
skybrian
Editors and form validation are where this gets tricky. The user isn't just reporting new, independent observations to append to a log; they're looking at existing state and deciding how to react to it. Sometimes they also need to avoid constraint violations against other state they're not looking at.
It often works out, but if you're not looking at the right version, then you're risking a merge conflict.
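One common mitigation is to make the version the user was looking at part of the write, a compare-and-set sketch (Python; the Conflict error and Document class are invented for illustration):

```python
class Conflict(Exception):
    pass

class Document:
    def __init__(self, body: str):
        self.body, self.version = body, 1

    def save(self, new_body: str, based_on_version: int) -> None:
        """Reject the edit if it was based on a stale version of the state."""
        if based_on_version != self.version:
            raise Conflict(f"you edited v{based_on_version}, server is at v{self.version}")
        self.body, self.version = new_body, self.version + 1

doc = Document("draft")
seen = doc.version          # what the user's editor loaded
doc.save("edit by someone else", based_on_version=seen)   # bumps to v2
try:
    doc.save("my edit", based_on_version=seen)            # still based on v1
except Conflict as e:
    print(e)                # surfaced as a merge conflict, not silent data loss
```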
niuzeta
Semi-related, but are there any repositories that collect technical white papers like this? I'm fascinated by these papers whenever they show up in my feed and I gorge on them, and I'd love more. I can't be the only one who thinks this way.
ahoka
I can recommend Adrian Colyer's excellent The Morning Paper blog: https://blog.acolyer.org/
lbj
I have to say, I really love the title :)
cacozen
I guess “Immutability changes nothing” wouldn’t have the same impact
sstanfie
Needs more exclamation points!