
35% Faster Than the Filesystem (2017)

25 comments · October 26, 2024

dale_glass

Out of curiosity, would cutting the filesystem out of the equation entirely improve things further still? Just put the database on /dev/sda2 or an LV? Also, how much overhead does LVM have?

sgbeal

> would cutting the filesystem out of the equation entirely improve things further still?

This recent forum post on that subject:

<https://sqlite.org/forum/forumpost/66b4d47514b442af>

doesn't answer your question but may be interesting to you.

dang

Related. Others?

SQLite: 35% Faster Than the Filesystem - https://news.ycombinator.com/item?id=41085376 - July 2024 (193 comments)

SQLite is 35% Faster Than The Filesystem (2017) - https://news.ycombinator.com/item?id=27897427 - July 2021 (127 comments)

35% Faster Than The Filesystem (2017) - https://news.ycombinator.com/item?id=20729930 - Aug 2019 (164 comments)

SQLite small blob storage: 35% Faster Than the Filesystem - https://news.ycombinator.com/item?id=14550060 - June 2017 (202 comments)

Since the July 2024 thread had a lot of attention, the current repost counts as a dupe. Reposts are ok after a year or so; this is in the FAQ: https://news.ycombinator.com/newsfaq.html.

dehrmann

Sounds like we should all just be doing sqlite3 /dev/sda2

Filesystems and databases solve similar problems, so putting one on top of the other is a bit redundant.

sgbeal

> Sounds like we should all just be doing sqlite3 /dev/sda2

A recent forum post on that topic: <https://sqlite.org/forum/forumpost/66b4d47514b442af>

Vecr

I don't think so. I'm pretty sure sqlite relies on fsync barriers, something block devices don't have.

wmf

Block devices have barriers and sync; fsync is built on those primitives.
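A minimal sketch of that point, assuming root access and a hypothetical /dev/sdX2 partition: the same fsync() that filesystems rely on can be issued against a raw block-device file descriptor.

    import os

    # Hypothetical partition; opening a raw block device needs root and
    # bypasses any filesystem that may be on it.
    DEV = "/dev/sdX2"

    fd = os.open(DEV, os.O_RDWR)
    try:
        os.write(fd, b"\x00" * 512)  # write one sector at offset 0
        os.fsync(fd)                 # flush the device's write cache (barrier/sync)
    finally:
        os.close(fd)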

Vecr

I'm pretty sure it wouldn't work. SQLite commits by deleting a file in normal operation, and with WAL you need an ordering that I'm not sure Linux on standard hardware can provide without running both the database and the WAL on one filesystem.
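For reference, the two commit mechanisms being contrasted here are selectable per connection. A minimal sketch using Python's standard sqlite3 module, with a made-up database file:

    import sqlite3

    conn = sqlite3.connect("example.db")

    # Default rollback-journal mode: a commit is signalled by deleting
    # (or truncating) the -journal file that sits next to the database.
    print(conn.execute("PRAGMA journal_mode=DELETE").fetchone())

    # WAL mode: commits append to a separate -wal file instead, which is
    # why write ordering between the two files matters.
    print(conn.execute("PRAGMA journal_mode=WAL").fetchone())

    conn.close()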

sgc

Is anybody depending on this for mission critical data? Can I throw 1TB of images, pdfs, and miscellanea in there, delete the originals, and keep humming along in practical perpetuity? I would presume using Litestream for backup on top of other, more general backup solutions, and using a few attached databases instead of one monolithic one for a bit more flexibility / safety?
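A rough sketch of the attached-databases idea, using Python's standard sqlite3 module with made-up file names:

    import sqlite3

    conn = sqlite3.connect("images.db")

    # Keep pdfs and misc blobs in separate files but query them together.
    conn.execute("ATTACH DATABASE 'pdfs.db' AS pdfs")
    conn.execute("ATTACH DATABASE 'misc.db' AS misc")

    conn.execute("CREATE TABLE IF NOT EXISTS main.blobs (name TEXT PRIMARY KEY, data BLOB)")
    conn.execute("CREATE TABLE IF NOT EXISTS pdfs.blobs (name TEXT PRIMARY KEY, data BLOB)")

    with conn:  # one transaction can span the attached files
        conn.execute("INSERT INTO pdfs.blobs VALUES (?, ?)", ("manual.pdf", b"%PDF-1.7 ..."))

    conn.close()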

turbocon

Well "SQLite is likely used more than all other database engines combined. Billions and billions of copies of SQLite exist in the wild."[1]

And

"SQLite has exhibited very high reliability in the field and a very low defect rate, especially considering how rapidly it is evolving. The quality of SQLite is achieved in part by careful code design and implementation. But extensive testing also plays a vital role in maintaining and improving the quality of SQLite."[2]

[1] https://www.sqlite.org/mostdeployed.html

[2] https://www.sqlite.org/testing.html#summary

ozim

I don't know about 1TB of data, but I think it was used in some missile systems, so I would say yes for mission critical data. Not sure if we think about mission critical in the same terms, though.

Edit: by pure chance I hit that just a moment ago: https://fractaledmind.github.io/2024/10/16/sqlite-supercharg...


samatman

> Is anybody depending on this for mission critical data?

There may be the occasional entity with mission critical data which does not, sooner or later, entrust some of it to SQLite. It would be difficult to populate such a list, however.

> Can I throw 1TB

Subject to this[0] list of constraints, yes. The theoretical upper bound is about 281 TB, but this is untested due to a lack of the combination of hardware and filesystem that can create an SQLite file of that magnitude. What happens when the limit is reached is extensively tested, just not at the theoretical maximum, yet.

As for its general reliability, this too is documented[1]. SQLite sets the bar for high-assurance practices in open source software, at a level which more programs should aspire to.

[0]: https://www.sqlite.org/limits.html [1]: https://www.sqlite.org/testing.html

sgc

From that limits page, it looks like as shipped it can have a blob of at most 1 GB, and a bit over 2 GB if compiled from source. That's rather limiting for my data, since I have some static images, as well as some pdfs and of course videos that are larger than those limits. I guess every library has an oversized-books shelf, but a higher limit would make it easier to use in "everything but the kitchen sink" mode.
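One common workaround for that cap, sketched here on the assumption that splitting large files into chunks is acceptable, is to store oversized blobs across multiple rows (names and sizes are illustrative):

    import sqlite3

    CHUNK = 256 * 1024 * 1024  # 256 MB per row, comfortably under the 1 GB default limit

    conn = sqlite3.connect("media.db")
    conn.execute("""CREATE TABLE IF NOT EXISTS chunks
                    (name TEXT, seq INTEGER, data BLOB, PRIMARY KEY (name, seq))""")

    def store(path):
        # Stream the file in pieces so no single BLOB exceeds the length limit.
        with open(path, "rb") as f, conn:
            for seq, piece in enumerate(iter(lambda: f.read(CHUNK), b"")):
                conn.execute("INSERT INTO chunks VALUES (?, ?, ?)", (path, seq, piece))

    def load(name):
        rows = conn.execute("SELECT data FROM chunks WHERE name=? ORDER BY seq", (name,))
        return b"".join(r[0] for r in rows)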

James_K

I feel like some kind of compression is the way to go if you are interested in fast reads/writes. Depending on the algorithm used, you could probably decompress the data faster than it can be read off the disk. In an archive file, you would also have the same benefit of fewer read/write calls.
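A rough sketch of that idea layered on SQLite, with zlib standing in for whatever faster codec you would actually pick:

    import sqlite3
    import zlib

    conn = sqlite3.connect("archive.db")
    conn.execute("CREATE TABLE IF NOT EXISTS files (name TEXT PRIMARY KEY, data BLOB)")

    def put(name, payload):
        # Store compressed: the disk read is smaller, the CPU pays to inflate it.
        with conn:
            conn.execute("INSERT OR REPLACE INTO files VALUES (?, ?)",
                         (name, zlib.compress(payload, level=1)))

    def get(name):
        row = conn.execute("SELECT data FROM files WHERE name=?", (name,)).fetchone()
        return zlib.decompress(row[0]) if row else None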

wmf

Some filesystems will do this for you.

bubblesnort

ITT: people who never installed an RDBMS that used a storage device instead of a filesystem, such as Db2.


unwind

Meta: (2017), it seems?

wongogue

It includes retests from 2022.

kardos

If the speedup is due to fewer open/close syscalls and space savings from avoiding block padding, tar would achieve the same, yes?

LtWorf

I believe the people downvoting you didn't open the article.

kardos

Possible. I'm not positing that tar replaces SQLite here, just that tar should achieve the same space and syscall savings.
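For comparison, a sketch of reading one member straight out of an uncompressed tar with Python's standard library, so only the archive itself is opened (file names are illustrative):

    import tarfile

    # One open() for the archive instead of one per image; tarfile seeks to
    # the member's offset and reads it in place.
    with tarfile.open("images.tar", "r:") as tar:
        with tar.extractfile("cat001.jpg") as member:
            data = member.read()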

LtWorf

It's talking about comparing to reading the data from individual files. If you keep them in a single file and seek, the 35% no longer holds.

Read the post :)