SQLite (with WAL) doesn't do `fsync` on each commit under default settings
56 comments
·August 24, 2025charleslmunger
nh2
From this (linked https://sqlite.org/pragma.html#pragma_synchronous), does anybody understand EXTRA?
> EXTRA provides additional durability if the commit is followed closely by a power loss.
means?
How can one have "additional" durability, if FULL already "ensures that an operating system crash or power failure will not corrupt the database"?
Is it that FULL only protects against "corruption" as stated, but will still lose committed transactions?
It seems so from the points on https://stackoverflow.com/questions/58113560/during-power-lo...
Which is also quite nasty. I want my databases to be fully durable by default, and not lose anything once they have acknowledged a transaction. The typical example for ACID DBs are bank transactions; imagine a bank accidentally undoing a transaction upon server crash, after already having acknowledged it to a third-party over the network.
agwa
This is what the documentation (https://sqlite.org/pragma.html#pragma_synchronous) says about EXTRA:
> EXTRA synchronous is like FULL with the addition that the directory containing a rollback journal is synced after that journal is unlinked to commit a transaction in DELETE mode
So it only has an effect in DELETE mode; WAL mode doesn't use a rollback journal.
That said, the documentation about this is pretty confusing.
nh2
Yes, I'm talking about the fact that sqlite in its default (journal_mode = DELETE) is not durable.
Which in my opinion is worse than whatever may apply to WAL mode, because WAL is something a user needs to explicitly enable.
If it is true as stated, then I also don't find it very confusing, but would definitely appreciate if it were more explicit, replacing "will not corrupt the database" by "will not corrupt the database (but may still lose committed transactions on power loss)", and I certainly find that a very bad default.
mrkeen
> The typical example for ACID DBs are bank transactions; imagine a bank accidentally undoing a transaction upon server crash
That's why they don't try to do it that way! But it's still an informative way to think about it.
Also, while we're discussing defaults, your ACID db is probably running at READ COMMITTED by default, meaning that your bank transactions are vanishing/creating money:
* You read accounts A and B($30) in order to move $5 between them. The new balance for B should be $35. Just before you write the $35, someone else's transaction sets B to $100. Your transaction will proceed and blindly set it to $35 anyway.
But to your overall point, I'm also frustrated that these systems aren't as safe as they look on the box.
nh2
> your ACID db is probably running at READ COMMITTED by default
You're probably refering to PostgreSQL. Yes, I am also frustrated that that doesn't default to SERIALIZABLE.
I do wish the top two open-source "ACID" DBs (Postgres and SQLite) used guaranteed-safe, zero-surprise defaults.
charleslmunger
>EXTRA synchronous is like FULL with the addition that the directory containing a rollback journal is synced after that journal is unlinked to commit a transaction in DELETE mode. EXTRA provides additional durability if the commit is followed closely by a power loss.
It depends on your filesystem whether this is necessary. In any case I'm pretty sure it's not relevant for WAL mode.
avinassh
hey, I just tested and `NORMAL` is default:
$ sqlite3 test.db
SQLite version 3.43.2 2023-10-10 13:08:14
Enter ".help" for usage hints.
sqlite> PRAGMA journal_mode=wal;
wal
sqlite> PRAGMA synchronous;
1
sqlite>
edit: fresh installation from homebrew shows default as FULL: /opt/homebrew/opt/sqlite/bin/sqlite3 test.db
SQLite version 3.50.4 2025-07-30 19:33:53
Enter ".help" for usage hints.
sqlite> PRAGMA journal_mode=wal;
wal
sqlite> PRAGMA synchronous;
2
sqlite>
I will update the post, thanks!mediumsmart
Same with macports here - 2 (opt/local/bin/sqlite3)
and /usr/bin/sqlite3 is 1
larschdk
Just checked debian/ubuntu/alpine/fedora/arch docker images. All are FULL by default.
eatonphil
Is this sqlite built from source or a distro sqlite? It's possible the defaults differ with build settings.
supriyo-biswas
The one which avinassh shows is MacOS's SQLite under /usr/bin/sqlite3. In general it also has some other weird settings, like not having concat() method, last I checked.
jmull
The takeaway should be:
Don't assume sqlite is doing an fsync on each commit.
While that's the "default default", you may be using a sqlite3 compiled by someone else or using non-default build options, or using a wrapper that sets its own defaults.
More generally, be careful about assuming any of the various levers sqlite3 has are set in a specific way. That is, take control of the ones that are important to your use case (well, and before that, review the levers so you know which ones those are).
dangoodmanUT
SQLite defaults have many, many weird choices that make unsuspecting users choose dangerous configurations.
And it's not reasonable to expect them to read through all the docs to figure out that the defaults are not safe.
Defaults should be safe, tune for performance. Not the other way around.
jabwd
Sqlite defaults in many ways perfectly fine, you get the foot guns when you need the performance. Read the article rather than commenting on HN because WAL is not default.
btown
There's some nuance here. The compiler flags SQLITE_DEFAULT_SYNCHRONOUS and SQLITE_DEFAULT_WAL_SYNCHRONOUS are set to FULL by default, which does fsync on each commit.
https://sqlite.org/compile.html
But there is a thing called NORMAL mode which, in WAL and non-WAL mode, does not fsync on each commit. In WAL mode, at least this doesn't cause corruption, but it can still lose data.
https://www.sqlite.org/pragma.html#pragma_synchronous is very explicit that the thing called NORMAL does have these risks. But it's still called NORMAL, and I'd think that's something of a foot-slingshot for newcomers, if not a full-fledged footgun.
nh2
> Sqlite defaults in many ways perfectly fine
In the sibling comment we are discussing how the default of sqlite is not durable, so it's only ACI, not ACID.
https://news.ycombinator.com/item?id=45005866
So you do get the foot guns automatically.
> rather than commenting on HN
I appreciate the parent commenting on HN, because they seem to be ... right.
tobias3
The default with the build in sqlite in macos. Means Apple engineers made this choice.
conradev
This is perhaps my favorite one. By default, SQLite doesn’t actually delete data:
The default setting for secure_delete is determined by the SQLITE_SECURE_DELETE compile-time option and is normally off. The off setting for secure_delete improves performance by reducing the number of CPU cycles and the amount of disk I/O.
https://www.sqlite.org/pragma.html#pragma_secure_deletercxdude
That isn't much unlike most filesystems (in fact, on CoW filesystems, even with this setting the data is likely to still be hanging around on disk)
sgarland
Yes, yes it does [0]. I fully understand the need for backwards-compatibility given the sheer number of SQLite installations, I find their attitude towards flexible types [1] appalling. At least they’ve added STRICT.
Similarly, the fact that FKs aren’t actually enforced by default is a horrifying discovery if you’re new to it, but tbf a. MySQL until quite recently allowed CHECK constraints to be declared, which it then ignored b. The FK enforcement decision is for backwards-compatibility.
lupusreal
I swear I think people choose WAL mode because they read something about it online, where that something obviously isn't the documentation. This behavior shouldn't be catching any engineer by surprise.
conradev
One Apple platforms, you also want to use F_FULLFSYNC instead of fsync if you need durability:
https://www.sqlite.org/pragma.html#pragma_checkpoint_fullfsy...
baruz
Is there a performance hit for f_fullfsync?
mananaysiempre
Yes, that’s why those systems can cheat on benchmarks by defaulting to half-arsed fsync() in the first place.
londons_explore
Half arsed fsync is all I want.
I am happy to lose 5 or 10 seconds of data in a power failure. However I'm not okay with a file becoming so corrupted that it is unmountable when the power recovers.
Half arsed fsync provides exactly that - and considering you get way more performance this seems like a good tradeoff.
chasil
There is enough going on with WAL mode that it really shouldn't be enabled unless all the limitations are understood.
In addition to the need for all clients to see shared memory, it disables acid transactions on attached databases (2nd to last paragraph below):
bawolff
> In addition to the need for all clients to see shared memory, it disables acid transactions on attached databases
It only disables global cross-db transactions. It does not disable transactions in the attached db.
The fact that global transactions are a thing at all is the surprising bit to me. I think the WAL mode is the more expected behaviour.
boris
The main reason you would attach a database and then jump through hoops like qualifying tables is to have transactions cover all the attached databases. If you don't need that, then you can just open separate connections to each database without needing to jump through any hoops. So the fact that WAL does not provide that is a big drawback.
SoftTalker
This is true for any database, for any concurrency or durability settings. You must understand the implications of the defaults and your choices if you change them.
liuliu
Does it matter? For all we know, it keeps the serializability. At this point (of computer hardware history), you would care more about serializability than making sure data written to disk after power loss, the latter would now depend on so many layers of drivers doing correct things (for things that is hard to test correctly).
nolist_policy
Without the "making sure data written to disk after power loss" part you won't get serialization either in modern storage stacks.
wellpast
Wouldn’t it depend on use case?
If the app confirms to me my crypto transaction has been reliably queued, I probably don’t want to hear that it was unqueued because a node using SQLite in the cluster had died at an inconvenient specific time.
bawolff
If you had a power failure between when the transaction was queued and the sqlite transaction was comitted, no amount of fsync will save you.
If that is the threat you want to defend against this is not the right setting. Maybe it would reduce the window for it a little bit, but power failures are basically a non existent threat anyways, does a solution that mildly reduces but not eliminate the risk really matter when the risk is negligible?
kevincox
A better example is probably
1. I general a keypair and commit it.
2. I send the public key to someone.
I *really* want to be sure that 1 is persisted. Because if they for example send me $1M worth of crypto it will really suck if I don't have the key anymore. There are definitely cases where it is critical to know that data has been persisted.
This is also assuming that what you are syncing to is more than one local disc, ideally you are running the fsync on multiple geographically distant discs. But there are also cryptography related applications where you must never reuse state otherwise very bad things happen. This can apply even for one local disc (like a laptop). In this case if you did something like 1. Encrypt some data. 2. Commit that this nonce, key, OTP, whatever has been used. 3. Send that datasome where. Then You want to be sure that either that data was comitted or the disc was permanently destroyed (or at least somehow wouldn't be used accidentally to be encrypt more data).
wellpast
Of course it will because same programmers don’t ack their customers until their (distributed, replicated) db says ack.
throwawaymaths
if you are doing crypto you really ought to have a different way of checking that your tx has gone though that is the actual source of truth, like, for exple, the blockchain.
wellpast
I knew I shouldn’t have said crypto, but it is why I said queued. I knew a pedant was going to nitpick. Probably subconsciously was inviting it. I think my point still stands.
null
sethev
One of the things I like about SQLite is how easy it is to understand its behavior if you read the docs. This is one of the potentially surprising defaults - but it seems reasonable for an embedded database, where a power loss is likely to cause the application to "forget" that something was committed at the same time that SQLite does.
However, it depends on the application - hence the need for clear docs and an understandable model.
null
wiradikusuma
Does anyone, preferably from Cloudflare, know the settings used in D1, since it's a SQLite dialect?
koakuma-chan
Is D1 a dialect? I thought they actually run SQLite.
wiradikusuma
I hit some issues where they worked in localhost but not in production. Fortunately, they're not dealbreakers. Unfortunately, my mind is very good at forgetting bad experiences (especially if they have a workaround), so I don't remember the issues.
I do know the transaction is handled "differently".
weinzierl
I don't know why SQLite has this default, I suspect primarily performance, but I think even with modern SSD's there is a fsync frequency where it starts to hurt lifetime.
Also, you probably do not gain much with synchronization=FULL if the SSD does not really honor flushes correctly. Some SSDs historically lied - acknowledging flushes before actually persisting them. I don't know if this is still true.
arpinum
This is popping up because SurrealDB was found to turn fsync off by default. But there are important differences:
- SurrealDB provides poor documentation about this default
- SQLite is typically run client side, while SurrealDB is typically run as a remote server
- SQLite is actually full sync by default, but distros may package it with other defaults - SurrealDB explicitly did this for benchmarking reasons (for comparison fairness) while SQLite distros turn off fsync for typically practical reasons as it's run pure client side.
hoppp
Calling fsync on each commit makes it slower, doesn't it?
Its an extra syscall. It depends on the use-case if its needed or not
praptak
That's why big DBs do group commits, trading latency for throughput.
sgammon
yeah it says that in the docs
The default is FULL
https://sqlite.org/compile.html#default_synchronous
>SQLITE_DEFAULT_SYNCHRONOUS=<0-3> This macro determines the default value of the PRAGMA synchronous setting. If not overridden at compile-time, the default setting is 2 (FULL).
>SQLITE_DEFAULT_WAL_SYNCHRONOUS=<0-3> This macro determines the default value of the PRAGMA synchronous setting for database files that open in WAL mode. If not overridden at compile-time, this value is the same as SQLITE_DEFAULT_SYNCHRONOUS.
Many wrappers for sqlite take this advice and change the default, but the default is FULL.