In S3 simplicity is table stakes

141 comments

·March 14, 2025

CobrastanJorji

> When we moved S3 to a strong consistency model, the customer reception was stronger than any of us expected.

This feels like one of those Apple-like stories about inventing and discovering an amazing, brand new feature that delighted customers but not mentioning the motivating factor of the competing products that already had it. A more honest sentence might have been "After years of customers complaining that the other major cloud storage providers had strong consistency models, customers were relieved when we finally joined the party."

progbits

I mainly use GCP but keep hearing how great AWS is in comparison.

Imagine my surprise when porting some GCS code to S3 last year and realizing there is no way to get consistency guarantees without external lock service.

thiht

> I mainly use GCP but keep hearing how great AWS is in comparison.

Where do you keep hearing this?

Having used both, AWS is trash in comparison. It's way to complicated to do anything simple. At work I wish we could migrate to GCP (or just something that's not AWS, really).

progbits

All my coworkers who mainly used AWS before say that. I agree with you, but my AWS experience is quite limited, so what do I know.

But I feel like being used to something + grass being greener on other side both play a big role in these opinions.

klysm

Distributed locks are a lie

snoman

That would surprise me too considering read-after-write consistency came to S3 like 5 years ago?

jchw

IIRC (it has been a while) the difference is that on Amazon it can only be consistent within a region whereas on GCS I believe even multi-region buckets offer strong consistency.

antonvs

You get this same pattern with a lot of stories about software. Features are often implemented in a way that’s simple for the developers, but not really a great fit for what’s actually needed. Then typically some story is given to justify why the resulting limitations or usability issues are actually a good thing.

_fat_santa

S3 is up there as one of my favorite tech products ever. Over the years I've used it for all sorts of things but most recently I've been using it to roll my own DB backup system.

One of the things that shocks me about the system is the level of object durability. A few years ago I was taking an AWS certification course and learned that their durability number means that one can expect to loose data about once every 10,000 years. Since then anytime I talk about S3's durability I bring up that example and it always seems to convey the point for a layperson.

And it's "simplicity" is truly elegant. When I first started using S3 I thought of it as a dumb storage location but as I learned more I realized that it had some wild features that they all managed to hide properly so when you start it looks like a simple system but you can gradually get deeper and deeper until your S3 bucket is doing some pretty sophisticated stuff.

Last thing I'll say is, you know your API is good when "S3 compatable API" is a selling point of your competitors.

victorp13

> Last thing I'll say is, you know your API is good when "S3 compatable API" is a selling point of your competitors.

Counter-point: You know that you're the dominant player. See: .psd, .pdf, .xslx. Not particularly good file types, yet widely supported by competitor products.

pavlov

Photoshop, PDF and Excel are all products that were truly much better than their competitors at the time of their introduction.

Every file format accumulates cruft over thirty years, especially when you have hundreds of millions of users and you have to expand the product for use cases the original developers never imagined. But that doesn’t mean the success wasn’t justified.

jalk

PDF is not a product. I get what you are day but I can’t say that I’ve ever liked Adobe Acrobat

eternityforest

Most people use libraries to read and write the files, and judge them pretty much entirely by popularity.

A very popular file format pretty much defines the semantics and feature set for that category in everyone's mind, and if you build around those features, then you can probably expect good compatibility.

Nobody thinks about the actual on disk data layout, they think about standardization and semantics.

I rather like PDF, although it doesn't seem to be well suited for 500MB scans of old books and the like, they really seem to bog down on older mobile devices.

laluser

It’s designed for that level of durability, but it’s only as good as a single change or correlated set of hardware failures that can quickly change the theoretical durability model. Or even corrupting data is possible too.

huntaub

You're totally correct, but these products also need to be specifically designed against these failure cases (i.e. it's more than just MTTR + MTTF == durability). You (of course) can't just run deployments without validating that the durability property is satisfied throughout the change.

laluser

Yep! There’s a lot of checksum verification, carefully orchestrated deployments, hardware diversity, erasure code selection, the list goes on and on. I help run a multi-exabyte storage system - I’ve seen a few things.

TheNewsIsHere

This is true. While I prefer non-SaaS solutions generally, S3 is something that’s hard to cost effectively replace. I can setup an AWS account, create an S3 bucket, and have a system that can then persist at least one copy of my data to at least two data centers each within a goal of 1 second. And then layer cross-region replication if I need.

It’s by no means impossible to do that yourself, but it costs a lot more in time and upfront expense.

Cthulhu_

I've used it for server backups too, just a simple webserver. Built a script that takes the webserver files, config files and makes a database dump, packages it all into a .tar.gz file on monday mornings, and uploads it to S3 using a "write only into this bucket" access key. In S3 I had it set up so it sends me an email whenever a new file was added, and that anything older than 3 weeks is put into cold storage.

Of course, I lost that script when the server crashed, the one thing I didn't back up properly.

TheNewsIsHere

If you haven’t already, make sure that versioning is enabled on that bucket!

cruffle_duffle

> but as I learned more I realized that it had some wild features that they all managed to hide properly so when you start it looks like a simple system but you can gradually get deeper and deeper until your S3 bucket is doing some pretty sophisticated stuff.

Over the years working on countless projects I’ve come to realize that the more “easy” something looks to an end user, the more work it took to make it that way. It takes a lot of work to create and polish something to the state where you’d call it graceful, elegant and beautiful.

There are exceptions for sure, but often times hidden under every delightful interface is an iceberg of complexity. When something “just works” you know there was a hell of a lot of effort that went into making it so.

angulardragon03

I did a GCP training a while back, and the anecdote from one of the trainers was that the Cloud Storage team (GCP’s S3-compatible product) hadn’t lost a single byte of data since GCS had existed as a product. Crazy at that scale.

mannyv

Well, google cloud has destroyed entire accounts, but I suppose that's not a storage failure per se.

marsovo

Here's the link to that https://news.ycombinator.com/item?id=40304666

Moral of the story: the "technical part" of things is not the end of the story

Alternative moral: The 3-2-1 backup rule of thumb is still alive and well (is your cloud account a single point of failure?)

hobo_in_library

Eh, they have lost a bit

Gys

> their durability number means that one can expect to loose data about once every 10,000 years

What does that mean? If I have 1 million objects, I loose 100 per year?

lukevp

What it means is in any given year, you have a 1 in 10,000 chance that a data loss event occurs. It doesn’t stack like that.

If you had light bulbs that lasted 1,000 hrs on average, and you had 10k light bulbs, and turned them all on at once, then they would all last 1,000 hours on average. Some would die earlier and some later, but the top line number does not tell you anything about the distribution, only the average (mean). That’s what MTTF is; the mean time for a given part to where it has a greater likelihood to have failed by then vs not. It doesn’t tell you if the distribution of light bulbs burning out is 10 hrs or 500 hrs wide. it’s the latter, you’ll start seeing bulbs out within 750 hrs, but if the former it’d be 995 hrs before anything burned out.

8organicbits

Isn't it just a marketing number? I didn't think durability was part of the S3 SLA, for example.

TheNewsIsHere

Object integrity isn’t part of the S3 SLA. I assume that is mostly because object integrity is something AWS can’t know about per se.

You could unknowingly upload a corrupted file, for example. By the time you discover that, there may not be a clear record of operations on that object. (Yes, you can record S3 data plane events but that’s not the point.)

Only the customer would know if their data is intact, and only the customer can ensure that.

The best S3 (or any storage system) can do is say “this is exactly what was uploaded”.

And you can overwrite files in S3 with the appropriate privileges. S3 will do what you ask if you have the proper credentials.

Otherwise, S3 is designed to be self-healing with erasure encoding and storing copies in at least two data centers per region.

ceejayoz

Amazon claims 99.999999999% durability.

If you have ten million objects, you should lose one every 10k years or so.

graemep

How does that compare to competitors and things like distributed file systems?

toolslive

it's really not that impressive, but you have to use erasure coding (chop the data D in X parts, use these to generate Y extra pieces, and store all X+Y of them) iso replication (store D n times)

wodenokoto

I’ve never worked with AWS, but have a certification from GCP and currently use Azure.

What do you see as special for S3? Isn’t it just another bucket?

merb

The durability is not so good when you have a lot of objects

achierius

Why not? I don't work with web-apps or otherwise use object stores very often, but naively I would expect that "my objects not disappearing" would be a good thing.

oblio

I think their point is that you'd need even higher durability. With millions of objects, even 5+ nines means that you lose objects relatively constantly.

waiwai933

> I’ve seen other examples where customers guess at new APIs they hope that S3 will launch, and have scripts that run in the background probing them for years! When we launch new features that introduce new REST verbs, we typically have a dashboard to report the call frequency of requests to it, and it’s often the case that the team is surprised that the dashboard starts posting traffic as soon as it’s up, even before the feature launches, and they discover that it’s exactly these customer probes, guessing at a new feature.

This surprises me; has anyone done something similar and benefitted from it? It's the sort of thing where I feel like you'd maybe get a result 1% of the time if that, and then only years later when everyone has moved on from the problem they were facing at the time...

easton

Maybe this is a faster way of getting AWS feature requests heard.

I'm going to write a script that keeps trying to call ecs:MakeFargateCacheImages.

ajb

It could also be hackers, as when a new service launches is exactly when it will be most buggy. And the contents of S3 are a big payoff.

bentobean

It's funny—S3 started as a "simple" storage service, and now it's handling entire table abstractions. Reminds me how SQL was declared dead every few years, yet here we are, building ever more complex data solutions on top of supposedly simple foundations.

imiric

I instinctively distrust any software or protocol that implies it is "simple" in its name: SNMP, SMTP, TFTP, SQS, etc. They're usually the cause of an equal or more amount of headaches than alternatives.

Maybe such solutions are a reaction to previous more "complex" solutions, and they do indeed start simple, but inevitably get swallowed by the complexity monster with age.

great_wubwub

TFTP is probably the exception to that rule. All the other protocols started out easily enough and added more and more cruft. TFTP stayed the way it's always been - minimalist, terrifyingly awful at most things, handy for a few corner cases. If you know when to use it and when to use something like SCP, you're golden.

If TFTP had gone the way of SNMP, we'd have 'tftp <src> <dest> --proto tcp --tls --retries 8 --log-type json' or some horrendous mess like that.

yencabulator

TFTP's usefulness in the modern day is strictly for things that don't have a TCP stack. Anything with a TCP stack is better off with HTTP. That doesn't leave much on the table except legacy & inertia.

mannyv

The true hero in AWS is its authentication and accounting infrastructure.

Most people don't even think about it. But authenticating trillions of operations per second is why AWS works. And the accounting and billing. Anyone that does authentication knows how hard it is. At AWS' scale it's well, the pinnacle of distributed systems.

neom

little history: When we were getting ready to do an API at DigitalOcean I got asked "uhm... how should it feel?" I thought about that for about 11 seconds and said "if all our APIs feel as good as S3, it should be fine" - it's a good API.

flessner

The API, absolutely.

It's only sad that the SDKs are often on the heavy side, I remember that the npm package used to be multiple megabytes as it bundled large parts of the core AWS SDK. Nowadays, I believe it's still around half a megabyte.

rglover

The footprint of the JS SDK is much better as they split all of it into service-specific packages, but the SDK APIs are a bit confusing (everything is a class that has to be instantiated—even config).

malfist

I don't know if the npm is the same way, but the java sdk now has options by service. So you can include just s3 instead of all of aws

alberth

S3 is the simplest CRUD app you could create.

It's essentially just the 4 functions of C.R.U.D done to a file.

Most problems in tech are not that simple.

Note: not knocking the service. just pointing out not all things are so inherently basic (and valuable at the same time).

icedchai

Now add versioning, replication, logging, encryption, ACLs, CDN integration, event triggering (Lambda). I could go on. These are just some other features I can name off the top of my head. And it all has to basically run with zero downtime, 24x7...

bdcravens

That's the public interface. The underlying architecture is where the power is.

riv991

Isn't that the most impressive part? That the abstraction makes it seem so simple

shepherdjerred

Anyone can create a CRUD API. It takes a _lot_ of work to make a CRUD API that scales with high availability and a reasonable consistency model. The vast majority of engineers would take months or years to write demo.

If you don't believe me, you might want to reconsider how skilled the average developer _really_ is.

o10449366

These comments are so uniquely "HN" cringe-worthy.

rglover

I used to have the same opinion until I built my own CDN. Scaling something like that is no joke, let alone ensuring you handle consistency and caching properly.

A basic implementation is simple, but at S3 scale, that's a whole different ball game.

cruffle_duffle

That’s when you really know you hid all the complexity well. When people call your globally replicated data store with granular permissions, sophisticated data retention policies, versioning, and manage to have, what, seven (ten?) nines or something, “simple”.

No problem. I’m sure ChatGPT could cook up a replacement in a weekend. Like Dropbox it’s just rsync with some scripts that glue it together. How hard could it possibly be?

I mean people serve entire websites right out of s3 buckets. Using it as a crude CDN of sorts.

It’s a modern marvel.

sebastiansm

I could build a netflix in a weekend.

jimbokun

The point of the article is that making a service at S3's scale with such a simple API exposed to the end user, without exposing the provisioning and durability and security and performance to make it all work...

...is very difficult.

myflash13

I have a feeling that economies of scale have a point of diminishing returns. At what point does it become more costly and complicated to store your data on S3 versus just maintaining a server with RAID disks somewhere?

S3 is an engineering marvel, but it's an insanely complicated backend architecture just to store some files.

sjsdaiuasgdia

That's going to depend a lot on what your needs are, particularly in terms of redundancy and durability. S3 takes care of a lot of that for you.

One server with a RAID array can survive, usually, 1 or maybe 2 drive failures. The remaining drives in the array will have to do more work when a failed drive is replaced and data is copied to the new array member. This sometimes leads to additional failures before replacement completes, because all the drives in the array are probably all the same model bought at the same time and thus have similar manufacturing quality and materials. This is part of why it's generally said that RAID != backup.

You can make a local backup to something like another server with its own storage, external drives, or tape storage. Capacity, recovery time, and cost varies a lot across the available options here. Now you're protected against the original server failing, but you're not protected against location-based impacts - power/network outages, weather damage/flooding, fire, etc.

You can make a remote backup. That can be in a location you own / control, or you can pay someone else to use their storage.

Each layer of redundancy adds cost and complexity.

AWS says they can guarantee 99.999999999% durability and 99.99% availability. You can absolutely design your own system that meets those thresholds, but that is far beyond what one server with a RAID array can do.

vb-8448

How many businesses or applications really need 99.999999999% durability and 99.99% availability? Is your whole stack organized to deliver the forementioned durability and availability?

huntaub

I think that this is, to Andy's point, basically about simplicity. It's not that your business necessarily needs 11 9s of durability for continuity purposes, but it sure is nice that you never have to think about the durability of the storage layer (vs. even something like EBS where 5 9s of durability isn't quite enough to go from "improbable" to "impossible").

hadlock

There are a lot of companies who their livelihood depends on their proprietary data, and loss of that data would be a company-ending-event. I'm not sure how the calculus works out exactly, but having additional backups and types of backups to reduce risk is probably one of the smaller business expenses one can pick up. Sending a couple TB of data to three+ cloud providers on top of your physical backups is in the tens of dollars per month.

sjsdaiuasgdia

Different people and organizations will have different needs, as indicated in the first sentence of my post. For some use cases one server is totally fine, but it's good to think through your use cases and understand how loss of availability or loss of data would impact you, and how much you're willing to pay to avoid that.

I'll note that data durability is a bit of a different concern than service availability. A service being down for some amount of time sucks, but it'll probably come back up at some point and life moves on. If data is lost completely, it's just gone. It's going to have to be re-created from other sources, generated fresh, or accepted as irreplaceable and lost forever.

Some use cases can tolerate losing some or all of the data. Many can't, so data durability tends to be a concern for non-trivial use cases.

Asraelite

> One server with a RAID array can survive, usually, 1 or maybe 2 drive failures.

Standard RAID configurations can only handle 2 failures, but there are libraries and filesystems allowing arbitrarily high redundancy.

sjsdaiuasgdia

As long as it's all in one server, there's still a lot of situations that can immediately cut through all that redundancy.

As long as it's all in one physical location, there's still fire and weather as ways to immediately cut through all that redundancy.

Tanjreeve

Probably never. The complexity is borne by Amazon. Even before any of the development begins if you want a RAID setup with some sort of decent availability you've already multiplied your server costs by the number of replicas you'd need. It's a Sisyphean task that also has little value for most people.

Much like twitter it's conceptually simple but it's a hard problem to solve at any scale beyond a toy.

zerd

One interesting thing about S3 is the vast scale of it. E.g. if you need to store 3 PB of data you might need 150 HDDs + redundancy, but if you store it on S3 it's chopped up and put on tens of thousands of HDDs, which helps with IOPS and throughput. Of course that's shared with others, which is why smart placement is key, so that hot objects are spread out.

Some details in https://www.allthingsdistributed.com/2023/07/building-and-op... / https://www.youtube.com/watch?v=sc3J4McebHE

mangamadaiyan

What's the difference between "IOPS" and "throughput"?

Cthulhu_

There are a few stories from companies that moved away from S3, like Dropbox, and who shared their investments and expenses.

The long and short of it is that getting anywhere near the redundancy, reliability, performance etc of S3, you're spending A Lot of money.

hylaride

There is a diminishing return of what percentage you save, sure. But amazon will always be at that edge. They already have amortized the equipment, labour, administration, electricity, storage, cooling, etc.

They also already have support for storage tiering, replication, encryption, ACLs, integration with other services (from web access to sending notifications of storage events to lambda, sqs, etc). Uou get all of this whether you're saving 1 eight bit file or trillions of gigabyte sized ones.

There are reasons why you may need to roll your own storage setup (regulatory, geographic, some other unique reason), but you'll never be more economical than S3, especially if the storage is mostly sitting idle.

timewizard

> At what point does it become more costly and complicated to store your data on S3 versus just maintaining a server with RAID disks somewhere?

It's more costly immediately. S3 storage prices are above what you would pay even for triply redundant media and you have to pay for data transfer at a very high rate to both send and receive data to the public internet.

It's far less complicated though. You just create a bucket and you're off to the races. Since the S3 API endpoints are all public there's not even a delay for spinning up the infrastructure.

Where S3 shines for me is two things. Automatic lifecycle management. Objects can be moved in between storage classes based on the age of the object and even automatically deleted after expiration. The second is S3 events which are also _durable_ and make S3 into an actual appliance instead of just a convenient key/value store.

snoman

Care to elaborate? What you’re saying doesn’t match my experience.

I’ve paid pennies a year to store data in s3 for the better part of 5 years. Can’t even buy a hdd with what I’ve spent on s3.

timewizard

The per GB price on S3 is higher than on bulk HDDs. This is easily observed. What you are saying is your data storage needs don't even justify a single HDD. This is a scaling issue and not a pricing issue.

arnath

I found out last year that you can actually run a full SPA using S3 and a CDN. It’s kind of a nuts platform

jorams

Since everything you need to run "a full SPA" is to serve some static files over an internet connection I'm not sure how that tells you anything interesting about the platform. It's basically the simplest thing a web server can do.

ellisv

I use S3+Cloudfront for static sites and Cloudflare workers if it needed.

It's always crazy to me that people will run a could be static site on Netlify/Vercel/etc.

Cthulhu_

We've used Netlify at previous projects, we used it because it was easy. No AWS accounts or knowledge needed, just push to master, let the CI build (it was a Gatsby site) and it was live.

ellisv

I think Netlify is great but to me it's overkill if you just have a static site.

I understand that Netlify is much simpler to get started with and setting up an AWS account is somewhat more complex. If you have several sites, it's worth spending the time to learn.

rook1

"I think one thing that we’ve learned from the work on Tables is that it’s these _properties of storage_ that really define S3 much more than the object API itself."

Between the above philosophy, S3 Tables, and Express One Zone (SSD perf), it makes me really curious about what other storage modalities S3 moves towards supporting going forward.

diroussel

It's great that they added iceberg support I guess, but it's a shame that they also removed S3 Select. S3 Select wasn't perfect. For instance the performance was no where near as good as using DuckDB to scan a parquet file, since duck is smart, and S3 Select does a full table scan.

But S3 Select is nearly way cheaper that the new iceberg support. So if your needs are only for reading one parquet snapshot, we no need to do updates, then this change is not welcome.

Great article though, and I was pleased to see this at the end:

> We’ve invested in a collaboration with DuckDB to accelerate Iceberg support in Duck,

StratusBen

For those interested in S3 Tables which is referenced in this blog post, we literally just published this overview on what they are and cost considerations of them that people might find interesting: https://www.vantage.sh/blog/amazon-s3-tables

1a527dd5

https://www.vantage.sh/blog/amazon-s3-tables#s3-tables-cost

I can't make head or tails of the beginning of this sentence:-

> Pricing for S3 Tables is all and all not bad.

Otherwise lovely article!

shawabawa3

"all and all" is a typo for "all in all" which means "overall", or "taking everything into consideration"

So they are saying the pricing is not bad considering everything it does

dangoodmanUT

> There are 1 million PUT requests and 10 million GET requests that month

> + 1,000,000 GET requests x ($0.004 / 1,000 requests) = $9