Global variables are not the problem
164 comments
·January 31, 2025
Jtsummers
Izkata
> it is the wrong place to put the count information.
I'd argue this is the case regardless of lifetime. It's trying to squash two unrelated things into one object and should have been two different arguments.
Way more obvious if "obj" is replaced with some example object instead of an empty one:
let person = { name: "Foo Bar", age: 30, counter: counter };
roenxi
I like the diagnosis.
My JS is terrible, but it seems like once you make the counter a global variable it is just better to change it to have an atomic dedicated count function. So instead of incrementing the counter in simple, a globalCount() function gets called that isolates the state. Something like
{
  let i = 0;
  var counter = function () {
    console.log(++i);
  };
}
Then call counter() to count & log and document that something horrible and stateful is happening. I wouldn't call that a global variable, although the article author disagrees.
bandrami
How is a globally-scoped closure not a global variable?
roenxi
Because there is no intent, need or reason to vary it.
Nearly everything in RAM is technically a global variable to someone who is keen enough. Programming depends on a sort-of honour system not to treat things like variables if it is hinted that they shouldn't and it doesn't make sense to. Otherwise we can all start making calls to scanmem & friends.
robertlagrant
A function isn't a variable.
levodelellis
Could you tell me where this was posted? I thought no one would see this after I got no comments the first day
No one I showed this to complained about the first example but online many people did. I wrote a completely different article which I think is much better that uses examples I would have used in the follow up. I'll post that article next week
Jtsummers
Second chance pool. This post is, per your submission history, from 2 days ago. HN has a second chance pool that lets articles continue collecting upvotes (more easily than digging through the full history). Some of those articles will get their timestamp updated to artificially inflate their ranking. This brings them to the front page again and gives them their "second chance". After a few hours or whatever time, the timestamp is reverted and they'll start falling back into their natural place in the rankings.
https://news.ycombinator.com/submitted?id=levodelellis
SpaceNoodled
The counter should be declared as static inside the function, thus limiting its scope and avoiding pollution of the global namespace.
Jtsummers
In this case, yes. Its scope should be the lowest necessary scope. Does JS provide static variables in functions? If not, then that forces you to lift it to some other scope like file or module scope or the surrounding function or class if that's viable.
porridgeraisin
> Does JS provide static variables
const f = (() => {
  let cnt = 0;
  return () => console.log("cnt", ++cnt);
})();
:-)
khana
[dead]
billforsternz
From the article:
> "Static Function Variable: In C inside a function, you can declare a static variable. You could consider this as a local variable but I don't since it's on the heap, and can be returned and modified outside of the functions. I absolutely avoid this except as a counter whose address I never return."
These variables are not on the heap. They are statically allocated. Their visibility is the only thing that differentiates them from global variables and static variables defined outside of functions.
I think such variables can be useful, if you need a simple way of keeping some persistent state within a function. Of course it's more state you are carrying around, so it's hard to defend in a code review as best practice.
Amusingly, you can modify such variables from outside the function, simply by getting your function to provide the modifying code with a pointer to the variable, eg by returning the variable's address. If you do that though you're probably creating a monster. In contrast I think returning the value of the static variable (which the author casts shade on in the quote above) seems absolutely fine.
Edit: I should have stated that the number one problem with this technique is that it's absolutely not thread safe for obvious reasons.
Someone
> These variables are not on the heap. They are statically allocated. Their visibility is the only thing that differentiates them from global variables and static variables defined outside of functions.
In C++, there is a difference: function-static variables get initialized when control first passes into the function. That difference matters in cases where the initial value (partly) comes from a function call or from another global. For example, in C++ one can do:
#include <cstdlib>

int bar = -1;
void foo() {
    // malloc returns void*, which C++ will not convert implicitly
    static char *baz = (char *)std::malloc(bar);
    …
}
void quux() {
    bar = 13;
}
That’s fine as long as the code calls quux before ever calling foo. They added that feature not to support such monstrosities, but because they wanted to support having static variables of class type.
If they were to design that today, I think they would require the initialization expression to be constexpr.
j16sdiz
> If they were to design that today, I think they would require the initialization expression to be constexpr.
Why would they?
This (non-constexpr) semantic would be useful as lazy initialization... and C++ loves tiny little features like these.
tcoff91
Just in case anyone still doesn't understand what it means to be statically allocated: it means that they are allocated in the Data Segment, which is a separate area of virtual memory from the stack and the heap.
levodelellis
Only if it's const. Otherwise the data is copied into memory you can write to
tcoff91
Isn’t there a part of the data segment that is writable? Initialized vs uninitialized data segment right?
billforsternz
Sorry, I owe the author an apology for saying that they "cast shade" on the idea of returning the value of the static variable. Actually they, quite correctly, cast shade on the idea of returning the address of the static variable. I'd edit the original message, but I am (far) too late. I just noticed my mistake. All I can do is add this apology, for the record.
levodelellis
I didn't mind. I knew I'd hear a lot of disagreements and incorrect thoughts. My next article has a lot of examples which should make things easier to digest.
One thing I dislike about compsci is everyone has different definitions for everything. If people say static variables are not on the heap, fine, but you can easily see the address of a static variable and a global variable being in the same 4K page.
ijustlovemath
static variable addresses are an extremely important tool in static analysis, for proving certain program invariants hold. In C, a const static often will be able to tell you more about the structure of your code at runtime than a #define macro ever could!
though unless you're programming extremely defensively (eg trying to thwart a nation state), I see no reason why you would use them at runtime
Panzer04
Could you expand on this? I don't understand the point you're making, or how it's useful.
ijustlovemath
static consts in C carry their identity through their (fixed, unchanging) pointer address. Let's say you have a business rules engine that's only meant to ingest threshold values from a certain module. You want to know if the 3.0 you're using is coming from the correct place in the code, or if there's a programming error. With a define, there's not enough additional information to be able to tell. With a static const, you can just have a const static array of pointers to the valid threshold constants, and use that in your tests for the rules engine.
I work in a highly regulated field, so often this level of paranoia and proof helps us make our case that our product works exactly the way we say it works.
whitten
Perhaps a distinction between static analysis (at compile time) and dynamic analysis (at run time) is useful?
hansvm
Global variables (in languages where they otherwise make sense and don't have footguns at initialization and whatnot) have two main problems:
1. They work against local reasoning as you analyze the code
2. The semantic lifetime for a bundle of data is rarely actually the lifetime of the program
The second of those is easy to guard against. Just give the bundle of data a name associated with its desired lifetime. If you really only need one of those lifetimes then globally allocate one of them (in most languages this is as cheap as independently handling a bunch of globals, baked into the binary in a low-level language). If necessary, give it a `reset()` method.
The first is a more interesting problem. Even if you bundle data into some sort of `HTTPRequest` lifetime or whatever, the fact that it's bundled still works against local reasoning as you try to use your various counters and loggers and what have you. It's the same battle between implicit and explicit parameters we've argued about for decades. I don't have any concrete advice, but anecdotally I see more bugs from biggish collections of heterogeneous data types than I do from passing everything around manually (just the subsets people actually need).
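A minimal sketch of the "name the bundle after its lifetime and give it a `reset()`" idea above. The names `RequestStats` and `handleRequest` and the fields are made up for illustration; nothing here comes from the article.

```javascript
// Hypothetical bundle of per-request data; the name and fields are illustrative.
class RequestStats {
  constructor() {
    this.reset();
  }

  // Return the bundle to its initial state at the start of each lifetime.
  reset() {
    this.dbQueries = 0;
    this.warnings = [];
  }
}

// One globally allocated instance, reused for every request.
const requestStats = new RequestStats();

function handleRequest(queries) {
  requestStats.reset(); // the semantic lifetime is one request, not the program
  for (const q of queries) {
    requestStats.dbQueries += 1;
    if (q.length > 20) requestStats.warnings.push(`long query: ${q}`);
  }
  return { dbQueries: requestStats.dbQueries, warnings: [...requestStats.warnings] };
}

const first = handleRequest(["select 1", "select 2"]);
const second = handleRequest(["select 3"]);
console.log(first.dbQueries, second.dbQueries); // 2 1
```

The storage is still global, but the name and the `reset()` call make the intended lifetime visible. It does nothing for the local-reasoning problem: `handleRequest` still reads and writes state that its signature does not mention.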
Spivak
I don't think #1 is necessarily true. Take a common case for a global variable, a metric your app is exposing via prometheus. You have some global variable representing its value. Libraries like to hide the global variable sometimes with cuteness like MetricsRegistry.get_metric("foo") but they're globally readable and writable state. And in your function you do a little metric.observe_event() to increment your counter. I think having this value global helps reasoning because the alternative is going to be a really clunky plumbing of the variable down the stack.
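Roughly what that looks like, as a sketch. The registry below is hypothetical and is not the API of any real Prometheus client; `getMetric`, `observeEvent` and `chargeCard` are invented names.

```javascript
// Hypothetical registry; this is not the API of any real Prometheus client.
const metrics = new Map(); // process-wide state: metric name -> count

function getMetric(name) {
  if (!metrics.has(name)) metrics.set(name, 0);
  return {
    observeEvent() {
      metrics.set(name, metrics.get(name) + 1);
    },
    value() {
      return metrics.get(name);
    },
  };
}

// Business logic records an event without threading a counter through its callers.
function chargeCard(amount) {
  getMetric("charges_total").observeEvent();
  return amount > 0;
}

chargeCard(10);
chargeCard(25);
console.log(getMetric("charges_total").value()); // 2
```

The lookup by name hides the `Map`, but any code in the process can still read or bump `charges_total`, which is the point being made about it being global state.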
hansvm
It helps with reasoning in some sense, and the net balance might be positive in how much reasoning it enables, but it definitely hurts local reasoning. You need broader context to be able to analyze whether the function is correct (or even what it's supposed to be doing if you don't actively work to prevent entropic decay as the codebase changes). You can't test that function without bringing in that outer context. It (often) doesn't naturally compose well with other functions.
Joker_vD
Of course #1 is not necessarily true, it depends on one's coding style, and using globals for application-scoped services like logging/metrics is tentatively fine... although I also think that if we're going to dedicate globals almost exclusively to this use, they probably should have dynamic scoping.
On the other hand, I have seen quite a lot of parsing/compilers' code from the seventies and eighties and let me tell you: for some reason, having interfaces between lexer and parser, or lexer and buffered reader, or whatever else to be "call void NextToken(void), it updates global variables TokenKind, TokenNumber, TokenText to store the info about the freshly produced token" was immensely popular. This has gotten slightly less popular but even today e.g. Golang's scanner has method next() that updates scanner's "current token"-related fields and returns nothing. I don't know why: I've written several LL(1) recursive-descent parsers that explicitly pass the current token around the parsing functions and it works perfectly fine.
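For comparison, a tiny sketch of the explicit style: each parsing function takes the current position (and so the current token) as an argument and returns the new one. The grammar (sums of integers) and every name here are assumptions chosen to keep the example short.

```javascript
// Split the input into number and "+" tokens; anything else is rejected.
function tokenize(text) {
  const tokens = [];
  for (const m of text.matchAll(/\s*(\d+|\+|\S)/g)) {
    const t = m[1];
    if (/^\d+$/.test(t)) tokens.push({ kind: "num", value: Number(t) });
    else if (t === "+") tokens.push({ kind: "plus" });
    else throw new Error(`unexpected character: ${t}`);
  }
  tokens.push({ kind: "eof" });
  return tokens;
}

// Each parsing function receives the current position and returns the new one,
// so no shared "current token" variable is updated behind the caller's back.
function parseNumber(tokens, pos) {
  const tok = tokens[pos];
  if (tok.kind !== "num") throw new Error(`expected number at token ${pos}`);
  return { value: tok.value, pos: pos + 1 };
}

function parseSum(tokens, pos) {
  let { value, pos: next } = parseNumber(tokens, pos);
  while (tokens[next].kind === "plus") {
    const rhs = parseNumber(tokens, next + 1);
    value += rhs.value;
    next = rhs.pos;
  }
  return { value, pos: next };
}

const result = parseSum(tokenize("1 + 2 + 39"), 0);
console.log(result.value); // 42
```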
darioush
Global variables are a programming construct, which like other programming constructs is neither bad nor good. Except, due to the takeover of workplaces by the best practices cult, instead of reasoning about tradeoffs on a case by case basis (which is the core of coding and software engineering), we ascribe a sort of morality to programming concepts.
For example, global variables have drawbacks, but so does re-writing every function in a call-stack (that perhaps you don't control and get callbacks from).
Or if you are building a prototype, the magic of being able to access "anything from anywhere" (either via globals or context, which is effectively a global that's scoped to a callstack), increases your speed by 10x (since you don't have to change all the code to keep passing its own abstractions to itself as arguments!)
Functions with long signatures are tedious to call, create poor horizontal (which then spills over to vertical) code density. This impacts your ability to look at code and follow the logic at a glance, and perhaps spot major bugs in review. There's also fewer stuff for say copilot to fill in incorrectly, increasing your ability to use AI.
At the end, every program has global state, and use of almost every programming construct from function calls (which may stack overflow) or the modulus operator (which can cause division by zero), or sharing memory between threads (which can cause data races) requires respecting some invariants. Instead, programmers will go to lengths to avoid globals (like singletons or other made up abstractions -- all while claiming the abstractions originate in the problem domain) to represent global state, because someone on the internet said it's bad.
cogman10
Depends a bit on the language.
A global variable in a language with parallel operation is often a terrible idea. The problem with globals and parallel operations is they are an inherent risk for race conditions that can have wild consequences.
In some languages, for example, a parallel write to a field is not guaranteed to be consistent. Let's assume in the above example `counter` was actually represented with 2 bytes. If two threads write an increment to it without a guard, there is no guarantee which thread will win the upper byte and which will win the lower byte. Most of the time it will be fine, but 1 in 10k there can be a bizarre counter leap (forwards or backwards) that'd be almost impossible to account for.
Now imagine this global is tucked away in a complex library somewhere and you've got an even bigger problem. Parallel calls to the library will just sometimes fail in ways that aren't easy to explain and, unfortunately, can only be fixed by the caller wrapping the library in a synchronization construct. Nobody wants to do that.
All of these problems are masked by a language like Javascript. Javascript is aggressively single threaded (Everything is run in a mutex!). Sure you can do concurrent things with callbacks/async blocks, but you can't mutate any application state from 2 threads. That makes a global variable work in most cases. It only gets tricky if you are dealing with a large amount of async while operating on the global variable.
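A small sketch of the "tricky with async" case, assuming nothing beyond standard promises and `setTimeout`. The function names are invented. There is no data race in the threaded sense, but a read-modify-write that spans an `await` can still lose updates.

```javascript
let counter = 0; // global shared by every task on the single JS thread

// Stand-in for real I/O; resolves on a later turn of the event loop.
const pause = () => new Promise((resolve) => setTimeout(resolve, 0));

// Read-modify-write split across an await: another task can run in the gap.
async function unsafeIncrement() {
  const seen = counter;
  await pause();
  counter = seen + 1;
}

// No await between the read and the write, so nothing can interleave.
async function safeIncrement() {
  await pause();
  counter += 1;
}

async function main() {
  counter = 0;
  await Promise.all([unsafeIncrement(), unsafeIncrement(), unsafeIncrement()]);
  const unsafeTotal = counter; // 1: all three tasks read 0 before any wrote

  counter = 0;
  await Promise.all([safeIncrement(), safeIncrement(), safeIncrement()]);
  const safeTotal = counter; // 3

  console.log(unsafeTotal, safeTotal);
  return { unsafeTotal, safeTotal };
}

const done = main();
```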
darioush
Yes, mixing some concepts in programming is a terrible idea.
Perhaps this is also widely unpopular, but it's the parallelism that needs to be treated with care and the additional caution, as often the parallelism itself is the terrible idea.
Concurrent code often has unpredictable performance due to cache behavior and NUMA, unpredictable lock contention, and the fact that often there is no measure of whether the bottleneck is CPU or I/O.
What most people want from concurrency (like computing the response to independent HTTP requests) can be done by separate processes, and the OS can abstract the issues away. As another reference, the entire Go language is designed around avoiding shared memory (and using message passing -- even though it doesn't use processes for separation, it encourages coding as if it did).
But also sharing memory between processes can be handled with care via mappings and using the OS.
Tainnor
> What most people want from concurrency (like computing the response to independent HTTP requests) can be done by separate processes
OS processes are way too heavyweight for many use cases.
Tainnor
> Except, due to the takeover of workplaces by the best practices cult, instead of reasoning about tradeoffs on a case by case basis (which is the core of coding and software engineering), we ascribe a sort of morality to programming concepts.
You're just strawmanning here. Maybe some of the people who say that global variables should be avoided ("should be avoided" never means "absolutely can't be used ever", btw) are people who have experience working on large projects where the use of implicit state routinely makes code hard to reason about, causes concurrency issues and introduces many opportunities for bugs.
> There's also fewer stuff for say copilot to fill in incorrectly, increasing your ability to use AI.
That argument makes no sense to me. If some piece of code is relying on implicit global state to have been set, why would copilot be any better at figuring that out than if it had to pass the state as an argument, something which is clearly stated in the function signature?
robertlagrant
I think this article broadens the definition of global variable and then says "Look, the things I added to the definition aren't bad, so global variables aren't always bad."
If you just look at what people normally mean by global variable, then I don't think the article changes minds on that.
berkes
To make it worse: "Look, the things I added to the definition aren't bad in this specific language and use-case, so global variables are not the problem".
To me, the author either has a very narrow field of focus and honestly forgets about all the other use-cases, practicalities and perspectives, or they choose to ignore it just to fire up a debate.
In any case, these constructs are only true for JavaScript (in Node.js, whose setup avoids the common issues with threads), and fall flat in a multithreaded setup in just about every other language.
If I were to port this to Rust, first, the borrow checker would catch the described bugs and not allow me to write them in the first place. But secondly, if I really insist on something global that I need to mutate or share between threads, I can do so, but would be explicitly required to choose a type (Mutex, RwLock, with Arc or something) so that a) I have thought about the problem and b) chose something that I know to work for my case.
robertlagrant
> If I were to port this to Rust, first, the borrow checker would catch the described bugs and not allow me to write them in the first place. But secondly, if I really insist on something global that I need to mutate or share between threads, I can do so, but would be explicitly required to choose a type (Mutex, RwLock, with Arc or something) so that a) I have thought about the problem and b) chose something that I know to work for my case.
Agreed. Not the specifics, as I don't know Rust, but it makes sense.
serbuvlad
I find the concept of a context structure passed as the first parameter to all your functions with all your "globals" to be very compelling for this sort of stuff.
sublinear
https://en.wikipedia.org/wiki/Dependency_injection
This is very similar to dependency injection. Separating state and construction from function or method implementation makes things a lot easier to test. In my opinion it's also easier to comprehend what the code actually does.
bee_rider
That just seems like globals with extra steps. Suddenly if your context structure has a weird value in it, you’ll have to check every function to see who messed it up.
bigcat12345678
That's 2 parts: 1. Global variable (mutable) 2. Local function with context argument (mutations)
You have clear tracking of when and how functions change the global variable
Joker_vD
First, that's true for globals as well. Second, with "context structure" pattern, the modifications to it are usually done by copying this structure, modifying some fields in the copy and passing the copy downwards, which severely limits the impact radius and simplifies tracking down which function messed it up: it's either something above you in the call stack or one of the very few (hopefully) functions that changes this context by-reference, with intent to apply such changes globally.
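A sketch of that copy-and-pass-down pattern, with a made-up context (`tenant`, `logLevel` and `depth` are illustrative fields, and the function names are invented). Freezing the object is an extra guard of my own, not something the pattern requires.

```javascript
// Hypothetical context; the field names are made up for illustration.
const rootCtx = Object.freeze({ tenant: "acme", logLevel: "info", depth: 0 });

function describe(ctx) {
  return `${ctx.tenant}/${ctx.logLevel}/${ctx.depth}`;
}

// Wants verbose logging for its callees only: copy, change one field, pass down.
function importJob(ctx) {
  const childCtx = Object.freeze({ ...ctx, logLevel: "debug", depth: ctx.depth + 1 });
  return describe(childCtx);
}

function handle(ctx) {
  const inner = importJob(ctx);
  const outer = describe(ctx); // unaffected by what importJob did
  return [inner, outer];
}

console.log(handle(rootCtx).join(" | ")); // acme/debug/1 | acme/info/0
```

A weird value in `ctx` can then only have come from a caller further up the stack, which is the narrowing of the search space described above.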
dugmartin
This plus immutable data is what makes doing web apps in Elixir using Phoenix so nice. There is a (demi-)god "%Conn" structure passed as the first parameter that middleware and controller actions can update (by returning a new struct). The %Conn structure is then used in the final step of the request cycle to return data for the request.
For non-web work genservers in Elixir have immutable state that is passed to every handler. This is "local" global state and since genservers guarantee ordering of requests via the mailbox handlers can update state also by returning a new state value and you never have race conditions.
levodelellis
That's exactly why I used this specific example. I've seen many code bases that use clone to avoid mutation problems so I wrote this specifically to show it can become a problem too.
I wrote a better article on globals. I plan on posting it next week
dkersten
This seems more an issue with not understanding structuredClone, than one of understanding globals or lack thereof. There’s nothing wrong with the example, it does exactly what the code says it should: if you want counter to be “global” then structuredClone isn’t the function you want to call. The bug isn’t in how counter was in obj, the bug is in calling structuredClone when its behaviour wasn’t wanted.
With that said, it seems obvious that if you want to globally count the calls, then that count shouldn’t live in an argument where you (the function) don’t control its lifetime or how global it actually is. Simple has no say over what object obj.counter points to, it could trivially be a value type passed into that particular call, so if you know you want a global count then of course storing it in the argument is the wrong choice.
Global has two conflated meanings: global lifetime (ie lifetime of the whole program) and global access (which the article states). Simple needs global lifetime but not global access.
You rarely ”need” global access, although for things like a logger it can be convenient. Often you do need global lifetime.
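To make the distinction concrete, here is a sketch. The first half is a guess at the shape of the article's example, not a copy of it; the second half (`countedSimple`) is an invented illustration of "global lifetime without global access". It assumes a runtime that provides `structuredClone` (recent Node.js versions and current browsers do).

```javascript
// A guess at the shape of the article's example, not a copy of it.
function simple(obj) {
  obj.counter += 1;
  return obj.counter;
}

const original = { counter: 0 };
const seen = [];
seen.push(simple(original)); // 1
seen.push(simple(original)); // 2

// structuredClone makes a deep copy, so the copy carries its own counter.
const copy = structuredClone(original);
seen.push(simple(copy));     // 3, but only the copy was updated
seen.push(simple(original)); // 3 again: the original never saw that call
console.log(seen.join(" ")); // 1 2 3 3

// Global lifetime without global access: the count lives in a closure
// that only countedSimple can reach.
const countedSimple = (() => {
  let calls = 0;
  return (obj) => {
    calls += 1;
    return calls;
  };
})();

const counts = [original, copy, original].map((o) => countedSimple(o));
console.log(counts.join(" ")); // 1 2 3
```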
leetrout
The "god object"
caspper69
Hard disagree.
If I have 500 functions, I don't want to extrapolate out the overhead of passing a state object around to all of them. That's a waste of effort, and frankly makes me think you want to code using an FP paradigm even in imperative languages.
Module-level and thread-level "globals" are fine. You gain nothing (other than some smug ivory tower sense of superiority) by making your functions pure and passing around a global state object to every single method invocation.
rafaelmn
You get functions that are easily testable in isolation with all state provided in parameters.
You also get explicit dependencies and scoping controlled by caller.
I don't mind globals but saying you get nothing for avoiding them is :/
Gibbon1
I tend to use getter and setter functions to access globals and manage state.
Advantage: only the function that depends on the global needs to bring in the dependency.
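One possible shape for that in JavaScript, as a sketch. `retryLimit` and the accessor names are hypothetical; in a real project the enclosing scope would more likely be a module than an immediately invoked function.

```javascript
// The variable itself is private to this scope; only the accessors escape.
const { getRetryLimit, setRetryLimit } = (() => {
  let retryLimit = 3; // hypothetical setting, used only for illustration

  return {
    getRetryLimit: () => retryLimit,
    setRetryLimit(value) {
      // One place to validate every write to the shared state.
      if (!Number.isInteger(value) || value < 0) {
        throw new RangeError(`invalid retry limit: ${value}`);
      }
      retryLimit = value;
    },
  };
})();

setRetryLimit(5);
console.log(getRetryLimit()); // 5

let rejected = false;
try {
  setRetryLimit(-1);
} catch (err) {
  rejected = err instanceof RangeError;
}
console.log(rejected, getRetryLimit()); // true 5
```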
wruza
If that’s so useful, make your language support the concept of lexical environments instead. Otherwise it’s just manual sunsetting every day of week. Our craft is full of this “let’s pretend we’re in a lisp with good syntax” where half of it is missing, but fine, we’ll simulate it by hand. Dirt and sticks engineering.
(To be clear, I’m just tangentially ranting about the state of things in general, might as well post this under somewhere else.)
Tainnor
I got into this argument with my former coworkers. Huge legacy codebase. Important information (such as the current tenant of our multi-tenant app) was hidden away in thread-local vars. This made code really hard to understand for newcomers because you just had to know that you'd have to set certain variables before calling certain other functions. Writing tests was also much more difficult and verbose. None of these preconditions were of course documented. We started getting into more trouble once we started using Kotlin coroutines which share threads between each other. You can solve this (by setting the correct coroutine context), but it made the code even harder to understand and more error-prone.
I said we should either abolish the thread-local variables or not use coroutines, but they said "we don't want to pass so many parameters around" and "coroutines are the modern paradigm in Kotlin", so no dice.
caspper69
You know what helps manage all this complexity and keep the state internally and externally consistent?
Encapsulation. Provide methods for state manipulation that keep the application state in a known good configuration. App level, module level or thread level.
Use your test harness to control this state.
If you take a step back I think you’ll realize it’s six of one, half dozen of the other. Except this way doesn’t require manually passing an object into every function in your codebase.
levodelellis
Ouch, I partially address this in the next article
billmcneale
Yes, you gain testability.
If your global state contains something that runs in prod but should not run in a testing environment (e.g. a database connection), your global variable based code is now untestable.
Dependency Injection is popular for a very good reason.
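A minimal sketch of the testability argument. The `db` object, its `query` method and the SQL are all assumptions made up for the example; no real database library is involved.

```javascript
// The database handle is a parameter rather than a global the function reaches for.
async function countActiveUsers(db) {
  const rows = await db.query("SELECT id FROM users WHERE active = 1");
  return rows.length;
}

// Test double: same shape as the hypothetical db object, but no real connection.
const fakeDb = {
  queries: [],
  async query(sql) {
    this.queries.push(sql);
    return [{ id: 1 }, { id: 2 }];
  },
};

const pending = countActiveUsers(fakeDb).then((count) => {
  console.log(count, fakeDb.queries.length); // 2 1
  return count;
});
```

In production the caller would pass the real connection; the function itself never learns which one it got.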
caspper69
This sounds like a design deficiency.
If you have something that should only run in testing, perhaps your test harness should set the global variable appropriately, no?
Panzer04
Just need to make sure your module doesn't get too big or unwieldy. I work in a codebase with some "module" C files with a litany of global statics and it's very difficult to understand the possible states it can be in or test it.
I agree that so long as overall complexity is limited these things can be OK. As soon as you're reading and writing a global in multiple locations though I would be extremely, extremely wary.
serbuvlad
It's one register. You gain that much performance from the single optimization of not updating rbp in release builds.
rurban
That's only needed if you use multiple threads. In a single thread global vars are just fine, and much simpler than passing around ctx
jerf
Even in a single-threaded environment, passing a struct around containing values allows you to implement dynamic scoping, which in this case means, you can easily call some functions with some overridden values in the global struct, and that code can also pass around modified versions of the struct, and so on, and it is all cleanly handled and scoped properly. This has many and sundry uses. If you just have a plain global variable this is much more difficult.
Although Perl 5 has a nice feature where all global variables can automatically be treated this way by using a "local" keyword. It makes the global variables almost useful, and in my experience, makes people accidentally splattering globals around do a lot less damage than they otherwise would because they don't have to make explicit provision for scoped values. I don't miss many features from Perl but this is on the short list.
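The Perl `local` behaviour mentioned above can be loosely approximated in JavaScript with a helper that overrides some values for the duration of a call and then restores them. This is only a sketch: `settings`, `withSettings` and `report` are invented names, and it is not a faithful port of Perl's semantics.

```javascript
// Hypothetical settings object standing in for a set of globals.
const settings = { verbose: false, indent: 0 };

// Run fn with some settings overridden, then restore the previous values,
// even if fn throws. This loosely mimics what Perl's `local` does.
function withSettings(overrides, fn) {
  const saved = {};
  for (const key of Object.keys(overrides)) {
    saved[key] = settings[key];
    settings[key] = overrides[key];
  }
  try {
    return fn();
  } finally {
    Object.assign(settings, saved);
  }
}

function report() {
  return `verbose=${settings.verbose} indent=${settings.indent}`;
}

const inside = withSettings({ verbose: true, indent: 2 }, () => report());
const after = report();
console.log(inside); // verbose=true indent=2
console.log(after);  // verbose=false indent=0
```

This only holds for synchronous callbacks. If `fn` awaits, other tasks can observe the overridden values in the meantime, which is one reason passing a struct explicitly scales better.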
Tainnor
And do you always know beyond any reasonable doubt that your code will be single-threaded for all time? Because the moment this changes, you're in for a world of pain.
rurban
Wrapping the globals into a struct context under #ifdef MULTI-THREADED and adding this ctx to each call as the first argument is a matter of minutes. I've done this multiple times.
Much worse is protecting concurrent writes, eg to an object or hash table
christophilus
If JavaScript ever gets real shared memory multithreading, it will be opt in so that the entire web doesn’t break. So, yes.
immibis
Or every occurrence of the singleton pattern (except for when it's the flyweight pattern)
SpicyLemonZest
> The problem is data access. Nothing more, nothing less.
I agree with this, but the problem with global variables is precisely that they make bad data access patterns look easy and natural. Speaking from experience, it’s a lot easier to enforce a “no global variables” rule than explain to a new graduate why you won’t allow them to assign a variable in module X even though it’s OK in module Y.
levodelellis
You might like the article I wrote for next week. Could you tell me where this post is linked from? I didn't think anyone would see this when no one commented the first day
PeterStuer
The one where the author almost rediscovers the singleton pattern.
cauefcr
It can also be an almost rediscovery of closures.
bb88
Please no.
Singletons if you must. At least you can wrap a mutex around access if you're trying to make it thread safe.
grandempire
Why are singletons better? That's just a global object.
> wrap a mutex
What if my program has one thread? Or the threads have clearly defined responsibilities?
zabzonk
how do you know that? just about every serious bug i have ever written was when i thought i understood multi-threaded code, but didn't.
grandempire
Because as a programmer I have responsibility for the technical soundness of the program, and I don't create threads haphazardly.
> when i thought i understood multi-threaded code, but didn't.
All the more reason to carefully plan and limit shared state among threads. It's hard enough to get right when you know where the problems are and impossible if you spray and pray with mutexes.
dragontamer
Global objects need to be initialized. And if two are ever initialized you run into problems like the above.
Singleton is a pattern to ensure that a global object is only ever initialized once.
unscaled
Most programming languages written after 1990 let you initialize global variables lazily. The main problem is that the initialization order might be unexpected or you may run into conflicts. Singletons make the order slightly more predictable (based on first variable access), although it is still implicit (lazy).
But singletons are still a terrible idea. The issue with global variables is not just initialization. I would argue it's one of the more minor issues.
The major issues with global variables are:
1. Action at distance (if the variable is mutable)
2. Tight coupling (global dependencies are hard-coded, you cannot use dependency injection)
3. Hidden dependency. The dependency is not only tightly-coupled, it is also hidden from the interface. Client code that calls your function doesn't know that you rely on a global variable and that you can run into conflict with other code using that variable or that you may suddenly start accessing some database it didn't even know about.
Singleton does not solve any of the above. The only thing it ensures is lazy initialization on access.
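A sketch of that last point: a lazily initialized singleton that is created exactly once but still allows action at a distance and hides the dependency from the caller. `Config` and `formatPrice` are hypothetical names used only for illustration.

```javascript
// Hypothetical singleton: created on first access, then reused.
class Config {
  static #instance = null;
  static created = 0;

  static get() {
    if (Config.#instance === null) {
      Config.#instance = new Config();
      Config.created += 1;
    }
    return Config.#instance;
  }

  constructor() {
    this.currency = "USD";
  }
}

// The signature says nothing about Config, yet the result depends on it.
function formatPrice(amount) {
  return `${amount.toFixed(2)} ${Config.get().currency}`;
}

const before = formatPrice(5);
Config.get().currency = "EUR"; // action at a distance
const afterChange = formatPrice(5);

console.log(before, "|", afterChange, "|", Config.created);
// 5.00 USD | 5.00 EUR | 1
```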
rileymat2
You know it is /currently/ accessed on one thread. These are little landmines that add up over time.
grandempire
The burden is on the programmer adding a new thread to know what they can safely access.
The conclusion of your argument looks like 2000s Java - throw a mutex on every property because you never know when it will need to be accessed on a thread.
Designs that spread complexity rather than encapsulate it are rarely a good idea.
bb88
In complex systems trying to hunt down intractable bugs on global variables is a terrible developer experience. I've done it before.
bknight1983
For every example of a bug caused by not using a global variable I’m sure I could find 10 caused by a global variable
levodelellis
You may hate my article next week, it's meant to replace this article. If you want you can email me for early access and tell me how I can improve the article. Let's say you can guess my email if you're emailing the right domain
mrkeen
You say "thread safe", I say "dead lock".
Tainnor
> If you run the code you'll see 1 2 3 4 3 printed instead of 5.
I'm really confused, as this behaviour appears to be completely obvious to me.
gbalduzzi
Yeah in the example it was obvious.
Of course in a real life program, it may be lost in other code logic and, most importantly, the function performing the clone may not be so explicit about it (e.g. an "update" function that returns a different copy of the object).
qalmakka
Mutable global variables are intrinsically incompatible with a multithreaded environment. Having mutable shared state is never the right solution, unless you basically can live with data races. And that's before taking maintainability into consideration too.
Puts
I think the author forgot the most useful use case for globals, and that is variables that have to do with the context the program is running under such as command line arguments and environment variables (properly validated and if needed escaped).
peanut-walrus
Do those change during program runtime though? I don't think many people have problems with global constants.
The bug in the program reveals a poor understanding of object lifecycles by whoever wrote it. The `obj` argument to `simple` is not globally unique and so it makes a poor location to store global state information (a count of how often `simple` is called, in this example).
Never tie global state information to ephemeral objects whose lifetime may be smaller than what you want to track. In this case, they want to know how many times `simple` is called across the program's lifetime. Unless you can guarantee the `obj` argument or its `counter` member exists from before the first call to `simple` and through the last call to `simple` and is the only `obj` to ever be passed to `simple`, it is the wrong place to put the count information. And with those guarantees, you may as well remove `obj` as a parameter to both `simple` and `complex` and just treat it as a global.
State information needs to exist in objects or locations that last as long as that state information is relevant, no more, no less. If the information is about the overall program lifecycle, then a global can make sense. If you only need to know how many times `simple` was invoked with a particular `obj` instance, then tie it to the object passed in as the `obj` argument.
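A sketch of the two placements, side by side. The names are invented, and using a `WeakMap` for the per-object count is my own choice rather than anything from the article: it keeps the count tied to the object's lifetime without adding a field to the object.

```javascript
let totalCalls = 0;                   // lives as long as the program
const callsPerObject = new WeakMap(); // each entry lives as long as its key object

function simple(obj) {
  totalCalls += 1;
  callsPerObject.set(obj, (callsPerObject.get(obj) ?? 0) + 1);
}

const a = { name: "a" };
const b = { ...a }; // a distinct object with its own per-object count

simple(a);
simple(a);
simple(b);

console.log(totalCalls, callsPerObject.get(a), callsPerObject.get(b)); // 3 2 1
```

Copying `a` leaves the program-wide total unaffected, while the per-object count correctly starts over for the copy.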