It's OK to hardcode feature flags
110 comments
· February 1, 2025
simonw
The single biggest value add of feature flags is that they de-risk deployment. They make it less frightening and difficult to turn features on and off, which means you'll do it more often. This means you can build more confidently and learn faster from what you build. That's worth a lot.
I think there's a reasonable middle ground point between having feature flags in a JSON file that you have to redeploy to change and using an (often expensive) feature flags as a service platform: roll your own simple system.
A relational database lookup against primary keys in a table with a dozen records is effectively free. Heck, load the entire collection at the start of each request - through a short-lived cache if your profiling says that would help.
Once you start getting more complicated (flags enabled for specific users etc.) you should consider build-vs-buy more seriously, but for the most basic version you really can have no-deploy changes at minimal cost with minimal effort.
There are probably good open source libraries you can use here too, though I haven't gone looking for any in the last five years.
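A minimal sketch of that table-plus-short-lived-cache approach (the table name, columns, and 30-second TTL are all illustrative, not anything the comment specifies):

```python
import sqlite3
import time

_cache = {"flags": {}, "loaded_at": 0.0}
CACHE_TTL_SECONDS = 30  # staleness you can tolerate between flag flips

def get_flags(conn: sqlite3.Connection) -> dict:
    """Load all flags, through a short-lived in-process cache."""
    now = time.monotonic()
    if now - _cache["loaded_at"] > CACHE_TTL_SECONDS:
        rows = conn.execute("SELECT name, enabled FROM feature_flags")
        _cache["flags"] = {name: bool(enabled) for name, enabled in rows}
        _cache["loaded_at"] = now
    return _cache["flags"]

def flag_enabled(conn, name: str, default: bool = False) -> bool:
    # Unknown flags fall back to a safe default.
    return get_flags(conn).get(name, default)
```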
tdumitrescu
Seriously. This is one of those cases where rolling your own really does make sense. Flags in a DB table, flags in a JSON file - all super simple to build and maintain, and 100x faster and more reliable than making the critical paths of your application's request cycle depend on an external provider.
joshmanders
You know what I would find worse than telling my customers that they can't access the application they paid for because I farmed my auth out to a 3rd party that is having an outage?
Telling them that my auth provider isn't down, but the thing I use to show them a blue button vs a red button is.
Oof.
gboss
Has this actually been a problem? We've been using LaunchDarkly for years, and if they do have an outage (which is really rare) the flag will be set to the default value. It's also very cheap, maybe $500 a month.
twisteriffic
We did this. Two tables. One for feature flags, with name, desc, id, enum (none, defaultToEnabled, overrideToDisabled). One for user flag overrides, with flagId, userId, enum (enabled, disabled).
The combination of these two has been all we've ever needed. User segmentation, A/B testing, pilot soft launch etc are all easy.
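A minimal sketch of that scheme in SQLite (the names mirror the comment; the resolution order is my reading of the enums, not something the comment spells out):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE feature_flags (
    id    INTEGER PRIMARY KEY,
    name  TEXT UNIQUE NOT NULL,
    desc  TEXT,
    -- 0 = none, 1 = defaultToEnabled, 2 = overrideToDisabled
    state INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE user_flag_overrides (
    flagId INTEGER NOT NULL REFERENCES feature_flags(id),
    userId INTEGER NOT NULL,
    -- 0 = disabled, 1 = enabled
    state  INTEGER NOT NULL,
    PRIMARY KEY (flagId, userId)
);
""")

def is_enabled(flag_name: str, user_id: int) -> bool:
    row = conn.execute(
        """SELECT f.state, o.state
           FROM feature_flags f
           LEFT JOIN user_flag_overrides o
             ON o.flagId = f.id AND o.userId = ?
           WHERE f.name = ?""",
        (user_id, flag_name),
    ).fetchone()
    if row is None:
        return False                 # unknown flag: fail closed
    flag_state, override = row
    if flag_state == 2:              # overrideToDisabled wins outright
        return False
    if override is not None:         # then the per-user override
        return override == 1
    return flag_state == 1           # else the flag's default
```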
uutangohotel
Would you mind expanding on the usage of enums for the feature flags table? Why not use a boolean?
PaulHoule
In years of trying to sell things I've found that one of the best selling points to management is "susceptible to vendor lock-in", "you don't own your customer database", etc.
I have no idea why that is.
dasil003
I'm confused. Are you saying this ironically or have you literally pitched management with the risks of using your product?
Kwpolska
> using an (often expensive) feature flags as a service platform
I have no idea why anyone would actually do that in real life. Feature flags are something so trivial that you can implement them from scratch in a few hours, tops — and that includes some management UI.
jitl
Often these 3rd party offerings are feature flags PLUS experimentation with user segmenting. Depending on the style of software you build, this can be extremely valuable; it’s very popular in the SaaS market for a reason.
Early on at Notion we used simple percent rollout in Redis, then we built our own flag & experimentation system, but as our needs got more complex we ended up switching to a 3rd party rather than dedicating a team to keep building out the internal system.
We will probably hit a scale in a few years where it makes sense to bring this back in house but there’s certainly a sweet spot for the 3rd party version between the 50-500 engineer mark for SaaS companies.
boulos
That's a reasonable path! You probably learned to appreciate and value the complexity, but you wouldn't have from the start. Which service do you use?
simonw
That path sounds very sensible to me.
cogman10
Happens when you do the flags wrong :)
We have a FF as a service platform and a big "value add" is that we can turn on and off features at the client level with it.
But unfortunately it's not the only mechanism for this, and it's also being used for actual feature flags, not just client-specific configuration.
I'm personally a MUCH bigger fan of putting feature flags in a configuration file that you deploy either with the application or through some mechanism like kubernetes configs. It's faster, easier to manage, and really easy to quickly answer the question "What's turned on, why, and for how long?". Because a core part of managing feature flags is deleting them, and the old code path, once you are confident things are "right".
The biggest headache of our FF-as-a-service is that that's really not clear, and we OFTEN end up with years-old feature flags that are on, with the old code path still existing even though it's unexercised.
hirsin
You'll still be building a management UI over their system (it doesn't understand or validate actor types, tenants, etc., so you have to do that).
But at high throughput, you might want something with dedicated professional love. Ten thousand feature flags, being checked at around 2 (or 200) million RPS from multiple deployments... I don't want to be the team with that as their side project. And once you're talking a team of three to six engineers to build all this out, maybe it makes sense to just buy something for half a million a year. Assuming it can actually fit your model.
Spivak
But it's not a side project; in most implementations it's part of the app itself.
fizx
The scale is easy in practice, because you outsource to a CDN. But everything takes time and has opportunity cost.
fizx
If I was a bootstrapped startup, I'd do a json file and then when I've outgrown, I'd hand write something that long-polls a CDN for updates, with a tiny rails or react app behind the CDN.
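A rough sketch of the long-polling piece (the URL is made up; conditional GETs with ETags keep the polling cheap, and the last-known flags survive fetch errors):

```python
import threading
import time

import requests

FLAGS_URL = "https://cdn.example.com/feature-flags.json"  # assumption
_flags = {}
_etag = None

def _poll_forever(interval: float = 15.0) -> None:
    global _flags, _etag
    while True:
        try:
            headers = {"If-None-Match": _etag} if _etag else {}
            resp = requests.get(FLAGS_URL, headers=headers, timeout=10)
            if resp.status_code == 200:
                _flags = resp.json()
                _etag = resp.headers.get("ETag")
            # 304 Not Modified: keep what we already have
        except requests.RequestException:
            pass  # serve last-known flags on any fetch error
        time.sleep(interval)

threading.Thread(target=_poll_forever, daemon=True).start()
```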
But these approaches are insane for companies above a certain size, where individuals are being hired and fired regularly, security matters, and feature flags are in the critical path of revenue.
Last time I looked at LaunchDarkly Enterprise licensing, it started at $50k/year, and included SAML.
Now that sounds like a lot, but if you're well past the startup stage, you need a tiny team to manage your homegrown platform. Maybe you have other things for them to do as well, but you probably need 3 people devoting at least 25% of their time to this, in order to maintain. So that's at least $175k/year in the USA, and if your company is growing, then probably the opportunity cost is higher.
ozim
Add to that: ideally, feature flags should be removed after the feature is released, and you shouldn't have more than a handful of them at any one time.
Permanent per-customer configuration is not a feature flag. It's also best not to have too many per-customer configurations.
Supermancho
Feature flags are often initially for features and later left in as dependency flags. Even within a large organization, individual components and services owned by other teams will have outages.
echelon
> build-vs-buy
Roll your own. Seriously.
Feature flags are such an easy thing that there should be a robust and completely open source offering not tied to B2B SaaS. Until then, do it in house.
My team built a five nines feature flag system that handled 200k QPS from thousands of services, active-active, local client caching, a robust predicate DSL for matching various conditions, percent rollout, control plane, ACLs, history, everything. It was super robust and took half an engineer to maintain.
We ultimately got roped into the "build vs buy" / "anti-weirdware" crosshairs from above. Being tasked with migrating to LaunchDarkly caused more outages, more headache, and more engineering hours spent. We were submitting fixes to LaunchDarkly's code, fixing the various language client integrations, and writing our own Ruby batching and multiprocessing. And they charged us way more for the pleasure.
Huge failure of management.
I've been out of this space for some years now, but someone should "Envoy" this whole problem and be done with it. One service, optional sidecars, all the language integrations. Durable failure and recovery behavior. Solid UX. This shouldn't be something you pay for. It should be a core competency and part of your main tooling.
rav
I don't understand what a dedicated "completely open source offering" provides or what your "five nines feature flag system" provides. If you're running on a simple system architecture, then you can sync some text files around, and if you have a more scalable distributed architecture, then you're probably already handling some kind of slowly-changing, centrally-managed system state at runtime (e.g. authentication/authorization, or in-app news updates, ...) where you can easily add another slowly-changing, centrally-managed bit of data to be synchronised. How do you measure the nines on a feature flag system, if you're not just listing the nines on your system as a whole?
foobazgt
> If you're running on a simple system architecture,
His point was that even a feature flag system in a complex environment with substantial functional and system requirements is worth building vs buying. If your needs are even simpler, then this statement is even more true!
I'm having a hard time making sense out of the rest of your comment, but in larger businesses the kinds of things you're dealing with are:
- low latency / staleness: You flip a flag, and you'll want to see the results "immediately", across all of the services in all of your datacenters. Think on the order of one second vs., say, 60s.
- scalability: Every service in your entire business will want to check many feature flags on every single request. For a naive architecture this would trivially turn into ungodly QPS. Even if you took a simple caching approach (say cache and flush on the staleness window), you could be talking hundreds of thousands of QPS across all of your services. You'll probably want some combination of pull and push. You'll also need the service to be able to opt into the specific sets of flags that it cares about. Some services will need to be more promiscuous and won't know exactly which flags they need to know in advance.
- high availability: You want to use these flags everywhere, including your highest availability services. The best architecture for this is that there's not a hard dependency on a live service.
- supports complex rules: Many flags will have fairly complicated rules requiring local context from the currently executing service call. Something like: "If this customer's preferred language code is ja-JP, and they're using one of the following devices (Samsung Android blah, iPhone blargh), and they're running versions 1.1-1.4 of our app, then disable this feature". You don't want to duplicate this logic in every individual service, and you don't want to make an outgoing service call (remember, H/A), so you'll be shipping these rules down to the microservices, and you'll need a rules engine that they can execute locally (see the sketch below).
- supports per-customer overrides: You'll often want to manually flip flags for specific customers regardless of the rules you have in place. These exclusion lists can get "large" when your customer base is very large, e.g. thousands of manual overrides for every single flag.
- access controls: You'll want to dictate who can modify these flags. For example, some eng teams will want to allow their PMs to flip certain flags, while others will want certain flags hands off.
- auditing: When something goes wrong, you'll want to know who changed which flags and why.
- tracking/reporting: You'll want to see which feature flags are being actively used so you can help teams track down "dead" feature flags.
This list isn't exhaustive (just what I could remember off the top of my head), but you can start to see why they're an endeavor in and of themselves and why products like LaunchDarkly exist.
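To make the "evaluate rules locally" point concrete, here's a toy sketch - the rule format is invented, but the idea is that rules ship to services as data and are evaluated against per-request context with no outgoing call:

```python
RULE = {
    "all": [
        {"attr": "language", "op": "eq", "value": "ja-JP"},
        {"attr": "device", "op": "in",
         "value": ["samsung-android-blah", "iphone-blargh"]},
        {"attr": "app_version", "op": "between", "value": ["1.1", "1.4"]},
    ],
    "result": False,  # users matching every condition get the feature disabled
}

def _version_key(v: str) -> tuple:
    return tuple(int(part) for part in v.split("."))

def _matches(cond: dict, ctx: dict) -> bool:
    actual = ctx.get(cond["attr"])
    if actual is None:
        return False
    if cond["op"] == "eq":
        return actual == cond["value"]
    if cond["op"] == "in":
        return actual in cond["value"]
    if cond["op"] == "between":
        lo, hi = cond["value"]
        return _version_key(lo) <= _version_key(actual) <= _version_key(hi)
    return False

def evaluate(rule: dict, ctx: dict, default: bool = True) -> bool:
    """Evaluate one rule against local request context - no service call."""
    if all(_matches(c, ctx) for c in rule["all"]):
        return rule["result"]
    return default

# evaluate(RULE, {"language": "ja-JP", "device": "iphone-blargh",
#                 "app_version": "1.2"})  -> False (feature disabled)
```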
echelon
> if you're not just listing the nines on your system as a whole
At scale the nines of your feature flagging system become the nines of your company.
We have a massive distributed systems architecture handling billions in daily payment volume, and flags are critical infra.
Teams use flags for different things. Feature rollout, beta test groups, migration/backfill states, or even critical control plane gates. The more central a team's services are as common platform infrastructure, the more important it is that they handle their flags appropriately, as the blast radius of outages can spiral outwards.
Teams have to be able to competently handle their own flags. You can't be sure what downstream teams are doing: if they're being safe, practicing good flag hygiene, failing closed/open, keeping sane defaults up to date, etc.
Mistakes with flags can cause undefined downstream behavior. Sometimes state corruption (eg. with complicated multi-stage migrations) or even thundering herds that take down systems all at once. You hope that teams take measures to prevent this, but you also have to help protect them from themselves.
> slowly-changing, centrally-managed system state at runtime
With flags being so essential, we have to be able to service them with near-perfect uptime. We must be able to handle application / cluster restart and make sure that downstream services come back up with the correct flag states for every app that uses flags. In the case of rolling restarts with a feature flag outage, the entire infrastructure could go hard down if you can't do this robustly. You're never given the luxury of knowing when the need might arise, so you have to engineer for resiliency.
An app can't start serving traffic with the wrong flags, or things could go wrong. So it's a hard critical dependency to make sure you're always available.
Feature flags sit so closely to your overall infrastructure shape that it's really not a great idea to outsource it. When you have traffic routing and service discovery listening to flags, do you really want LaunchDarkly managing that?
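One common building block for that restart story (a sketch, with invented paths): persist a last-known-good snapshot of the flags, so a process can come back up with correct state even when the flag service is unreachable:

```python
import json
import os
import tempfile

SNAPSHOT_PATH = "/var/lib/myapp/flags-snapshot.json"  # assumption

def save_snapshot(flags: dict) -> None:
    # Write-then-rename so a crash mid-write can't corrupt the snapshot.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(SNAPSHOT_PATH))
    with os.fdopen(fd, "w") as f:
        json.dump(flags, f)
    os.replace(tmp, SNAPSHOT_PATH)

def load_flags_at_startup(fetch_from_service) -> dict:
    try:
        flags = fetch_from_service()    # the live flag service
        save_snapshot(flags)
        return flags
    except Exception:
        with open(SNAPSHOT_PATH) as f:  # fall back to last-known-good
            return json.load(f)
```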
the_mitsuhiko
> I think there's a reasonable middle ground point between having feature flags in a JSON file that you have to redeploy to change and using an (often expensive) feature flags as a service platform: roll your own simple system.
The middle ground is a JSON file that is copied up and periodically refreshed. We (Sentry) moved from managed software to just a YAML file with feature flags that is pushed to all containers.
The benefit of just changing a file is that you have a lot of freedom in how you deal with it (e.g. leave comments), and you have the history of who flipped it and for which reason.
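A minimal sketch of the consuming side, assuming the file is pushed or mounted into the container and re-read when its mtime changes (the path and the PyYAML usage are illustrative):

```python
import os

import yaml  # PyYAML

FLAGS_PATH = "/etc/myapp/feature-flags.yaml"  # assumption
_state = {"mtime": 0.0, "flags": {}}

def current_flags() -> dict:
    # Re-read only when the pushed file actually changes.
    mtime = os.path.getmtime(FLAGS_PATH)
    if mtime != _state["mtime"]:
        with open(FLAGS_PATH) as f:
            _state["flags"] = yaml.safe_load(f) or {}
        _state["mtime"] = mtime
    return _state["flags"]
```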
maccard
How do you push the files to all of your containers? I’ve done this in the past with app specific endpoints but never found a solution I liked with containers.
the_mitsuhiko
We currently persist the feature flag config in a database where the containers pull it from. Not the optimal solution but that was a natural evolution from a system we already had in place.
superb_dev
We keep a JSON blob in Google Secret Manager for our flags. The service running in the container will reload the secret anytime it changes
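Roughly, the reading side might look like this (the secret name is illustrative, and this shows a simple re-read rather than the change detection the comment describes):

```python
import json

from google.cloud import secretmanager  # google-cloud-secret-manager

SECRET_NAME = "projects/my-project/secrets/feature-flags/versions/latest"
client = secretmanager.SecretManagerServiceClient()

def load_flags() -> dict:
    # Flags live as one JSON blob in a single secret version.
    response = client.access_secret_version(name=SECRET_NAME)
    return json.loads(response.payload.data.decode("utf-8"))
```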
mattmanser
I've been doing this a long time and have seen a few different apps keep config in the database. There are different levels of config you're talking about here, but general app config should generally not go in a DB.
No-one ever changes the bloody things and it's just an extra thing to go wrong. If it only loads on startup, it achieves nothing over a bog standard config file. If it loads every request you've just incurred a 5% overhead on every call.
And it ALWAYS ends up filled with crap that doesn't work anymore. Because, unlike config files, no one clears it up.
Worse still is when people haven't made it injectable and then it means unit tests rely on a real database, or it blocks getting a proper CI/CD pipeline working.
I end up having to pick the damn thing out of the app.
Use a config file like everyone else that's probably built into the framework you're using.
To be honest, most of the time I've seen it, the app was written by people who clearly did not know their language/framework.
I'm not saying it's you, but that's been my honest experience of config in the db, it's generally been a serious code smell that the whole app will be bad.
mjr00
There are differences between the kind of configuration you'd want in a config file (or environment variables, or some other "system level" management tooling) and what belongs in a feature flagging system.
In my experience, feature flagging is more application-level than system-level. What I mean by that is, feature flagging is for stuff like: roll this feature out to 10% of users, or to users in North America, or to users who have opted into beta features; enable this feature and report conversion metrics (aka A/B testing); enable this experimental speedup for 15 minutes so we can measure the performance increase. It's stuff that you want to change at runtime, through centralized tooling with e.g. auditing and alerting, without restarting all of your application servers. It's a bit different than config for like "what's the database host and user", stuff that you don't want to change after initialization (generally).
Regarding the article though, early on your deployment pipeline should be fast enough that updating a hardcoded JSON file and redeploying is just as easy as updating a feature flag, so I agree it's not something to invest in if you're still trying to get your first 1000 users.
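For the percent-rollout case, the usual trick is a stable hash of user id plus flag name, so each user gets a consistent answer and widening the rollout doesn't reshuffle who is already enabled. A sketch:

```python
import hashlib

def in_rollout(flag: str, user_id: str, percent: float) -> bool:
    # Hash flag + user into a stable bucket, uniform in [0, 1).
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < percent / 100.0

# in_rollout("new-checkout", "user-123", 10) gives the same answer on
# every call; raising 10 -> 25 only adds users, never removes them.
```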
marcosdumay
For some kinds of software, another call to the DB is the best way to add bog-standard functionality without adding complexity and failure modes.
Granted, not for all software. And there's something to be said about a config file that you can just replace at deployment. But that's something that varies a lot from one environment to another.
secondcoming
> feature flags in a JSON file that you have to redeploy to change
Our config files are stored in their own repo. Pushes to the master branch trigger a Jenkins job that copies the config files to a GCP bucket.
On startup, each machine pulls this config from GCS and everything just works.
It's not a 'redeployment' in the sense that we don't push new images on each config change.
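The startup pull can be very small (a sketch; the bucket and object names are made up):

```python
import json

from google.cloud import storage  # google-cloud-storage

def load_flags_from_gcs() -> dict:
    # Pull the config that the Jenkins job copied into the bucket.
    client = storage.Client()
    blob = client.bucket("my-config-bucket").blob("feature-flags.json")
    return json.loads(blob.download_as_bytes())
```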
j-krieger
We do the same thing but slightly differently. If a new docker image is built, we deploy that image. If the config changes, an ansible job moves that config to the target host and the service is restarted with the new config file. Configs are mounted inside containers. It all runs on GitLab CI/CD.
j45
Great summary.
Just starting with them, and improving how you apply them over time, is the best way to learn.
There is one book on feature flags that was written early on; some of the independently published books by experienced tech folks out there are a goldmine.
Feature Flags by Ben Nadel is one such book for me. There is an online version that is free as well. Happy to learn about others.
adamtaylor_13
Heck, if your user system is just a Users table, you don't even really need to consider build-vs-buy for them either.
If you start doing it for sub-groups, hard agree, but this is a space where it almost always pays dividends to roll your own first. The size of company that needs to consider adding feature flags (versus one that already has them) is typically one for which building your own is quicker, cheaper, and most importantly: simpler.
vijayer
The call-out on premature optimization is valid. However, this article misses the mark on a couple of fronts.
One, as others have called out, is the ability to control rollout (and rollback) without needing a deployment. Think mobile apps and the rollout friction. If something goes wrong, you need a way to turn off the offending functionality quickly without having to go through another deployment or a mobile app review.
Second is being able to understand the impact of the rollout. Feature flags can easily measure how the rollout of one feature affects the rest of the system - whether it is usage, crash rates, engagement, or, further down the funnel, revenue. It's a cheat code for quickly turning every rollout into an experiment. And you don't need a large sample size for catching some of these.
By having this power, you will find yourself doing more of it, which I believe is good.
dave4420
If you have enough traffic then you'll want to roll out new features gradually, and revert them quickly if, despite your testing, they cause trouble in production.
If you don't have much traffic, and can live with having to redeploy to flip the switch, then fine, stick it in a config file.
But I clicked through expecting a defence of hard-coding feature flags in the source code (`if true` or `if customerEmail.endsWith("@importantcustomer.com")`). I very much don't approve of this.
3eb7988a1663
That specific example feels like it might be ok? Presumably you have a very slow process by which customers are identified as VIP white-glove whales. Hard-coding the account representing X% of revenues is not going to experience a lot of churn. Just make it a collection variable, so you do not repeat yourself in multiple places.
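The "collection variable" version, for illustration (the domains are invented):

```python
# One place to edit the VIP list; every call site shares it.
VIP_DOMAINS = {"importantcustomer.com", "otherwhale.example"}

def is_vip(customer_email: str) -> bool:
    return customer_email.rsplit("@", 1)[-1].lower() in VIP_DOMAINS
```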
keybored
One sense of feature flag that I am familiar with (not from experience) is in trunk based development, where they are used to integrate new code (behind a feature flag) which is relatively untested, or just not fully developed. That's an alternative to longer-lived feature branches, which are only merged once the work is either fully finished or (going further) fully tested. Hard-coding that kind of feature flag makes sense, because a later revision will delete the feature flags outright (removing the branches).
There also seems to be feature flags in the sense of toggling on and off features. Then hard-coding makes less sense.
CRConrad
> One sense of feature flag that I am familiar with (not from experience) is in trunk based development, where they are used to integrate new code (behind a feature flag) which is relatively untested, or just not fully developed. That's an alternative to longer-lived feature branches, which are only merged once the work is either fully finished or (going further) fully tested.
That's actually the only sense of "feature flag" I was aware of before this discussion.
> Hard-coding that kind of feature flag makes sense. Because a later revision will delete the feature flags outright (removing the branches).
Yup. And, AFAIK, is what "feature flag" means.
> There also seems to be feature flags in the sense of toggling on and off features. Then hard-coding makes less sense.
So "feature flag" has now taken on -- taken over? -- the meaning of just plain "flag" (or "switch" or "toggle" or whatever), as in ordinary everyday run-time configuration? What is this development supposed to be good for? We used to have two distinct distinguishable terms for two distinct distinguishable things; now we apparently don't any more. So we've lost a bit of precision from the language we use to discuss this stuff. Have we, in exchange, gained anything?
forinti
I've had an issue with gitlab feature flags when gitlab became unavailable. I couldn't fire a new deploy and the system wouldn't work until gitlab came back to life.
That was a stupid dependency.
fiddlerwoaroof
This sounds like an integration issue: systems like LaunchDarkly typically allow you to specify a default value for when the feature flag server can’t be reached.
jitl
And/or build a near cache so you treat the 3rd party as a control layer, but actually serve requests from your near cache as data layer. Then when 3rd party goes down, your app doesn’t notice at all, and you can still manually update/override values in the cache in emergencies.
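A sketch of that near-cache shape (the vendor fetch function is a stand-in for whatever SDK you use):

```python
import threading
import time

class NearCache:
    def __init__(self, fetch_remote, refresh_seconds: float = 30.0):
        self._fetch = fetch_remote   # control plane: the vendor SDK/API
        self._flags = {}             # data plane: what requests read
        self._overrides = {}         # manual emergency values
        threading.Thread(target=self._refresh_loop,
                         args=(refresh_seconds,), daemon=True).start()

    def _refresh_loop(self, interval: float) -> None:
        while True:
            try:
                self._flags = self._fetch()
            except Exception:
                pass                 # vendor down? keep serving old data
            time.sleep(interval)

    def get(self, name: str, default: bool = False) -> bool:
        if name in self._overrides:  # emergencies win
            return self._overrides[name]
        return self._flags.get(name, default)

    def override(self, name: str, value: bool) -> None:
        self._overrides[name] = value
```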
ourmandave
Also be sure to use descriptive names so the guys who disassemble your code can write articles about upcoming features.
andix
It's even okay to hardcode them into code (not a config/json file). Depending on the build pipeline this is similar to preprocessor flags, and the code will be removed during build.
It might be enough to test new features with a limited audience (beta build, test deployments for stakeholders/qa).
If done correctly this solution can be easily extended to use a feature flag management tool, or a config file.
PS: removing new features during build/tree-shaking/etc adds some additional security. In some cases even disabled features could pose a security risk. Disabled features are often not perfectly tested yet.
CRConrad
> It's even okay to hardcode them into code (not a config/json file).
Yes, as I understood it that was what the article was all about.
andix
The article suggests putting them into a config file, and considers that hardcoding. That's how I understood it, at least.
CRConrad
Ah, yes indeed, seems I'd misread it; sorry.
(Sheesh, WTF is that guy talking about??? Now not only "feature flag" doesn't mean anything any more, but "hardcoded" doesn't either!)
Narciss
My team just had an issue where a new feature caused our live app to grind to a halt. One of the key reasons it took so long to fix is that the dev in charge of the feature had removed the remote feature flag earlier that day.
Redeploying takes time. Sometimes you want to disable something quick. Having a way to disable that feature without deploys is amazing in those cases.
That being said, there's really no need to rely on a dedicated service for this. We use our in-house CRM, but we also have Amplitude for more complex cases (like progressive rollout).
jdwyah
There is something to this, though jumping all the way to DIY is unnecessary.
Context: I run a FF company (https://prefab.cloud/)
There are multiple distinct benefits to be had from feature flagging. Because it's the "normal" path, most FF products bundle them all together, but it's useful to split them out.
- The code / libraries for evaluating rules.
- The UI for creating rules, targeting & rollouts.
- The infrastructure for hosting the flags and providing real-time updates.
- Evaluation tracking / debugging to help you verify what's happening.
If you don't need #1 and #2 there, you might decide to DIY and build it yourself, but I think you shouldn't have to. Most feature flag tools today are usable in an offline mode. For Prefab that's https://docs.prefab.cloud/docs/how-tos/offline-mode - you can just run a CLI command to download the flags, then boot the client off the downloaded file. With our pricing model that's totally free, because we're really hardly doing anything for you. Most people use this functionality for CI environments, but I think it's a reasonable way to go for some orgs. It has 100% reliability, and that's tough to beat.
You can do that if you DIY too, but there are so many nice-to-haves in a tool / UI that has had some real effort put into it that I would encourage people not to go down the DIY route.
whoknowsidont
There's a typo in the article:
>Hardoced feature flags
Think the author obviously meant "hardcoded" here.
Anyways, recently, this has been really hard to sell teams on in my experience. At some point "feature flag" became equivalent to having an entire SaaS platform involved (even for systems where interacting with another SaaS platform makes little sense). I can't help but wonder if this problem is "caused" by the upcoming generation of developers' lived experience with everything always being "online" or having an external service for everything.
In my opinion, your feature flag "system" (at least in aggregate) needs to be layered. Almost to act as "release valves."
Some rules or practices I do:
* Environment variables (however you want to define or source them) can and should act as feature flags.
* Feature flag your feature flag systems. Use an environment variable (or other sourced metadata, even an HTTP header) to control where your program is reading from.
* The environment variables should both take priority if they're defined AND act as a fallback in case of detected or known service disruption with more configurable feature flag systems (such as an internal DB or another SaaS platform). See the sketch below this list.
* Log the hell out of feature flags; telemetry will keep things clean (how often flags are read, and how often they're changed).
* Categorize your feature flags. Is this a "behavioral" feature flag or a functional one (i.e., there to help keep the system stable)? Use whatever qualifiers make sense for your team and system.
* Remove "safety" flags for new features/releases after you have enough data to prove the release is stable.
* Remove unused "behavior" flags once a year.
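A sketch of that env-var layering rule (the FF_ naming convention and the lookup function are invented for illustration):

```python
import os

def flag_enabled(name: str, service_lookup, default: bool = False) -> bool:
    # An env var, if set, wins outright, e.g. FF_NEW_CHECKOUT=1.
    env_value = os.environ.get(f"FF_{name.upper()}")
    if env_value is not None:
        return env_value.lower() in ("1", "true", "on", "yes")
    try:
        return service_lookup(name)  # internal DB, SaaS platform, etc.
    except Exception:
        return default               # known/detected service disruption
```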
My $0.02
mcdoh
There's a typo in the post:
> Anyways
I think you obviously meant "anyway".
whoknowsidont
"Anyways" is not a typo. It's a well used term in informal contexts.
cluckindan
Just put your flags in environment variables.
Depending on your infra, that can already make them toggleable without a redeployment: a restart of the apps/containers with the new envvars is enough.
Having them in a separate file would be useful if you need to be able to reload the flags upon receiving SIGUSR1 or something.
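That reload-on-signal variant is only a few lines (the path is invented):

```python
import json
import signal

FLAGS_PATH = "/etc/myapp/flags.json"  # assumption
flags = {}

def reload_flags(signum=None, frame=None) -> None:
    global flags
    with open(FLAGS_PATH) as f:
        flags = json.load(f)

signal.signal(signal.SIGUSR1, reload_flags)
reload_flags()  # initial load at startup
# later: kill -USR1 <pid> to reload the flags without a restart
```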
eqvinox
> Simply start with a simple JSON file, read it in at application startup,
That's not what I'd call hardcoding, it's a startup-time configuration option. Hardcoding is, well, "hard coding", as in changing something in the source code of the affected component, particularly with compiled languages (with interpreted languages the distinction is a bit mushy).
And then for compilation there is the question whether it is a build system option (with some kind of build user interface) or "actual" hardcoding buried somewhere.
Also, there is a connection to be drawn here to loadable/optional software components. Loading or not loading something can be implemented both as startup-time or runtime decision.