1,145 pull requests per day
78 comments · May 22, 2025
darth_avocado
The comments so far are surprising. Yes, counting PRs and lines of code isn't impressive, and yes, you may also do this at your own company. But any engineer will tell you: if you push code often and continuously move it to production, regression is inevitable. In finance, at the scale Stripe operates at, not making mistakes is critical. Being able to do what the article describes is impressive in any engineering organization. Being able to do it as Stripe is even more impressive.
arghwhat
In finance, not making mistakes is not at all critical, and mistakes happen all the time. Regulatory compliance is what's critical, as it provides a thorough legal defense even in case of failure.
The lack of efficiency in finance (or pharma, for that matter) is not driven by a wish for quality, but purely by fear of stepping outside regulatory compliance, with no individual wanting to be responsible for a process optimization that could later be seen as non-compliant.
Younger companies, on the other hand, may realize that compliance controls are in fact something the company defines itself and can structure and optimize however it likes, allowing for both high throughput and compliance. It's just hard to implement in older companies, where half the staff would fight back because their roles exist purely due to poor processes and the overhead those create.
eviks
If it's not impressive why is the article so impressed (it's even in the headline)?
And no, not making any mistakes isn't critical... only some are. You can have a million UI mistakes and regressions (where is the stat for how many regressions there are?) in your Stripe Dashboard app without any of them being critical. The "finance" aura doesn't permeate literally everything a finance company does, raising it all to the critical level.
netsharc
I thought the article was going somewhere; it ended up being a very vapid "Impressive, right? Reflect on what you can do to accomplish this in your organization".
Well the actual ending is, "Subscribe to my newsletter!".
dakiol
Fuck the mission. Fuck the culture. You are doing nothing but shoveling money into founders, investors and shareholders pockets.
danpalmer
My previous company averaged 2 PRs (and deploys) per engineer per day across a small team. At my current company I'm averaging about 2.5 CLs per day (they're a bit smaller changes). Stripe is good at this, but this is very achievable.
Often the problem is that we put too much into a single change. Smaller changes mean easier reviews, better reviews, and less risky deploys; they force better tooling for deployments and change management, often lower latency, and often even higher throughput, because there is less WIP taking up head space and requiring context switching. The benefits compound so much that I think this is very undervalued in some orgs.
polishdude20
I think better tooling for deployments allows small changes. Not the other way around.
danpalmer
That's sort of what I mean by small changes being a forcing function. The tooling we have available rarely makes this level of small change untenable; it's just clunky. When you send 1k PRs a day, though, you'll notice the things that are too clunky and fix them, and that in turn makes it easier to reach and maintain that level of productivity.
smadge
> CL
Googler identified.
Scramblejams
Maybe! Perforce (standard in AAA gamedev) speak is littered with CLs, too.
riffraff
The article seems surprised that you can make so many changes at scale, but IMO that's the wrong perspective: the larger your scale, the easier it must be to ship a change.
Yes, regressions will be more painful if you manage trillions of dollars, but it also means shipping a fix for such regressions needs to be easy and fast, which you can only do if you have a good and frequent CI/CD pipeline.
See also "The Infallible five-tier system for measuring the maturity of your Continuous Delivery pipeline". You should live on "Friday".
AstralStorm
Incorrect: shipping a fix quickly is relatively unimportant; being able to roll back a busted change to a working configuration is what's critical.
The exception is security issues. But those usually require actual thinking to fix, so no, you're not getting volume there in the first place.
Preferably, you avoid breaking things while making your mostly cosmetic or preparatory changes, rather than patching afterwards; that limits the scope of this kind of fix churn. And how do you know you didn't break anything? Proper functional and integration tests, that's how.
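To make the rollback idea concrete, here is a minimal sketch; the command names (deployctl, make smoke-test) are invented, not any particular deploy system:

    # Sketch of "roll back, don't fix forward": ship a candidate, run the
    # checks, and restore the known-good version on failure. All command
    # names here are hypothetical.
    import subprocess

    def checks_pass() -> bool:
        # Stand-in for the functional and integration test suite.
        return subprocess.run(["make", "smoke-test"]).returncode == 0

    def deploy(candidate: str, last_good: str) -> str:
        subprocess.run(["deployctl", "release", candidate], check=True)
        if checks_pass():
            return candidate  # this becomes the new known-good version
        subprocess.run(["deployctl", "release", last_good], check=True)
        return last_good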
xeromal
Counting PRs merged is like counting LOC. A completely useless measure of anything, really.
cadamsdotcom
Something to question though: does the A/B test system generate a changeset when an engineer flips a feature flag or changes the cohorts or percentage it’s rolled out to? Those are very safe changes generated by a template and with easy (maybe even automated) rollback, so they can happen all the time without really risking downtime.
There’ll still be plenty of changes made by humans. But some of those 1145 per day are so low-risk that they’re almost better off making more of them.
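A minimal sketch of what such a templated flag change might look like (all names hypothetical; this is not Stripe's actual tooling):

    # Hypothetical auto-generated flag-rollout changeset. The point is that
    # the change is produced from a template and carries its own rollback.
    from dataclasses import dataclass

    @dataclass
    class FlagRollout:
        flag: str
        old_percent: int
        new_percent: int

        def to_changeset(self) -> dict:
            return {
                "template": "flag-rollout-v1",  # generated, not hand-written
                "apply": {self.flag: {"rollout_percent": self.new_percent}},
                "rollback": {self.flag: {"rollout_percent": self.old_percent}},
            }

    print(FlagRollout("checkout_v2", 5, 25).to_changeset())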
hoppp
Maybe they make a lot of very small PRs? Each one should be reviewed by more than one person, so keeping them very small is the only way, imho.
AdamJacobMuller
There are diminishing returns at some point, but I think the counterpoint to this is companies doing a single huge manual deployment every month (or less often), scheduled into a 4-hour outage window during which many if not all services are down.
I do agree there isn't a lot of delta between a company doing 2 or 10 deploys a day and one doing 1,200, but there's a huge engineering gap between companies doing approximately 0.03 deploys per day and those doing multiple.
SchemaLoad
Open source projects can be the worst at this stuff. I realise it's all volunteer run, so I'm not complaining too much. But so often they end up pushing a versioned release once a year. So you find a bug, go to report it, and see it was fixed 8 months ago but is still broken in the published package. And then they're afraid to push a new version, because so much has changed since the last one.
metalrain
Having a minute of downtime per year is quite a big tradeoff.
It makes sense for Stripe; they would lose a lot of money when not operating. But at smaller companies you can choose to take more downtime to reduce engineering time.
Instead of skirting around with gradual data transformations on live data, you could take the service offline and do it all at once.
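As a sketch of the difference, with an invented schema (a one-shot offline pass versus the multi-step live dance):

    # One-shot offline migration: with the service down, a single transaction
    # can rewrite the whole table. Table and column names are invented.
    import sqlite3

    def offline_migration(db_path: str) -> None:
        conn = sqlite3.connect(db_path)
        with conn:  # one transaction, applied while no traffic is flowing
            conn.execute("ALTER TABLE accounts ADD COLUMN balance_cents INTEGER")
            conn.execute("UPDATE accounts SET balance_cents = CAST(balance * 100 AS INTEGER)")
            conn.execute("ALTER TABLE accounts DROP COLUMN balance")
        conn.close()

    # The zero-downtime alternative is the usual multi-deploy dance: add the
    # new column, dual-write from the application, backfill in batches, then
    # drop the old column, with each step shipped and verified separately.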
nitwit005
How many of those people are actually working on the core payments flow that they're measuring the uptime of?
I'm sure most people are working on some settings UI, fraud or risk tools, etc.
sverhagen
It is undoubtedly very impressive. But once you're set up for it, it's probably easier than saving up the changes and doing a big release at the end of the month, because then the amount of change, and the amount of risk, per deployment is also a lot higher.
Like other commenters here have said, it doesn't mean I can say "(scoff) we're doing the same" just because I'm doing the same relative number of releases with my tiny team. But it is validating for a small team like mine to see that this approach works at large scale, as it does for us.
coolcase
Yeah I get itchy if the pipeline to prod ain't working for more than 24h for one microservice. I love continuous deployment.
cuttothechase
Is counting the number of pull requests a useful measure of engineering performance, and therefore of product and company performance?
Isn't it more like a BS counter that keeps incrementing, indicative of churn but not reliably of anything else?
It's one of the most low-effort, easily gamed metrics, and it can be skewed to show anything the user wants to show.
qznc
There is research (DORA / Accelerate / the State of DevOps reports) that makes a good case that throughput (like the number of pull requests) contributes positively to company performance. More precisely, the DORA metric is deployment frequency.
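Deployment frequency is also cheap to compute from deploy timestamps; a minimal sketch with made-up data:

    # DORA deployment frequency: deploys per day over a window, computed from
    # a list of deploy dates. The dates below are made up.
    from datetime import date

    deploys = [date(2025, 5, 19), date(2025, 5, 19), date(2025, 5, 20),
               date(2025, 5, 21), date(2025, 5, 21), date(2025, 5, 21)]

    window_days = (max(deploys) - min(deploys)).days + 1
    print(f"{len(deploys) / window_days:.1f} deploys/day")  # 2.0 deploys/day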
JimDabell
> With some napkin math assuming a similar distribution today, that would mean on average each engineer ships at least 1 change to production every 3 days.
This is the important metric. It means there is very little divergence between what’s being worked on and what’s in production. The smaller the difference, the quicker you deliver value to users and the less risky it is to deploy.
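Spelling the napkin math out (the PR figure is from the article; the implied headcount is just arithmetic, not a reported number):

    prs_per_day = 1145        # figure from the article
    days_per_change = 3       # "at least 1 change to production every 3 days"
    implied_engineers = prs_per_day * days_per_change
    print(implied_engineers)  # 3435, roughly an upper bound on engineer count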
chhs
In my org they count both the number of pull requests and the number of comments you add to reviews. Easily gamed, but those are the performance metrics they now use to compare every engineer.
darkmarmot
good god, i would quit in a heartbeat.
netdevphoenix
It will be gamed soon if it isn't already
panstromek
By itself, no, but combined with the rest of the DORA metrics it's a pretty good indicator.
> few nuggets scattered on the internet regarding how Stripe does things (ex. #1, #2, #3) and in general the conclusion is that they have a very demanding but very advanced engineering culture
#3 is "What I Miss About Working at Stripe" (https://every.to/p/what-i-miss-about-working-at-stripe) reminiscing about 15-hour days, missing vacations, and crying at work.
Discussed here: https://news.ycombinator.com/item?id=32159752 (131 comments)