Better Error Handling

103 comments

April 20, 2025

jackjeff

Checking for errors after every line (like in Go) is the worst. I used to do that in C/C++ when calling Win32 APIs. Know what happens when sloppy developers come along? They don’t bother checking, and you get really bizarre, impossible-to-debug problems because things fail in mysterious ways. At least with an exception, if you “forget” to catch it, it blows up in your face and the failure is obvious.

Sure, monads are cool and I’d be tempted to use them. They make it impossible to forget to check for errors, and if you don’t care you can panic.

But JS is not Rust. And the default is obviously to use exceptions.

You’ll have to rewrap every API under the moon. So for Monads in JS to make sense you need a lot of weird code that’s awkward to write with exceptions to justify the costs.

I’m not sure the example of doing a retry in the API is “enough” to justify the cost. Also in the example, I’m not sure you should retry. Retries can be dangerous especially if you pile them on top of other retries: https://devblogs.microsoft.com/oldnewthing/20051107-20/?p=33...

stickfigure

Monadic style or not, the `if err != nil return err` pattern destroys critical information for debugging. `try/catch` gives you a complete stacktrace. That stacktrace is often more valuable than the error message itself.

9rx

> the `if err != nil return err` pattern destroys critical information for debugging.

Funny enough, your code looks like it is inspired by Go, and Go experimented with adding stack traces automatically. Upon inspecting real-world usage, they discovered that nobody ever used the stack traces, and came to learn that good errors already contain everything you'd want to know, making stack traces redundant.

Whether or not you think they made the right choice, it is refreshing that the Go project applies the scientific method before making a choice. The cool part is that replication is the most important piece of the scientific method, so anyone who does think they got it wrong can demonstrate it!

RadiozRadioz

Please can you provide the article/mailing list where this was discussed along with their methodology?

zaphirplane

I think you omitted the ergonomics of wrapping an err in another err as it bombs out of nested if statements, like a poor person's stack trace.

And of using those utilities to test whether an err is of a certain kind once it has been wrapped a few times.

peterashford

I code Go professionally. I like the language. However, I vehemently disagree with the stance that error messages > stack traces. Error messages are in no way as helpful as stack traces. Ideally, you'd have both.

stickfigure

This is the arrogance of language designers with limited experience developing real-world applications. Maybe it works as a replacement for C building low level apps, but it won't fly in enterprise codebases.

sethammons

That is why everyone says to wrap your errors in Go. %w ftw

Naked err returns can be a source of pain.
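Since much of the thread is about JS, here is a rough TypeScript sketch of the same wrapping idea, so each layer adds context instead of returning the naked error. `WrappedError`, `errorChain`, `parsePort` and `loadServerConfig` are made-up names for illustration, not an existing library API.

```typescript
// Hypothetical helper: an error that keeps the error it wraps, similar in
// spirit to Go's fmt.Errorf("...: %w", err).
class WrappedError extends Error {
  readonly inner: unknown;

  constructor(context: string, inner: unknown) {
    const innerMessage = inner instanceof Error ? inner.message : String(inner);
    super(`${context}: ${innerMessage}`);
    this.name = "WrappedError";
    this.inner = inner;
  }
}

// Walk the chain of wrapped errors, outermost first.
function errorChain(err: unknown): unknown[] {
  const chain: unknown[] = [];
  let current: unknown = err;
  while (current !== undefined) {
    chain.push(current);
    current = current instanceof WrappedError ? current.inner : undefined;
  }
  return chain;
}

// Hypothetical low-level step that fails.
function parsePort(raw: string): number {
  const port = Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new RangeError(`invalid port "${raw}"`);
  }
  return port;
}

// Each layer adds its own context instead of rethrowing the naked error.
function loadServerConfig(settings: { port: string }): { port: number } {
  try {
    return { port: parsePort(settings.port) };
  } catch (err) {
    throw new WrappedError("loading server config", err);
  }
}
```

Calling `loadServerConfig({ port: "http" })` throws a `WrappedError` whose message names both layers, and `errorChain` still reaches the original `RangeError`, which is roughly what testing a wrapped error for a certain kind needs.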

WorldMaker

An advantage to the Monad approach is that it sugars to the try/catch approach and vice-versa (try/catch desugars to monads). JS Promises are also already "Either<reject, resolve>". In an async function you are writing try/catch, but it desugars to monadic code. You could write an alternative to a library like "neverthrow" that just wraps everything in a Promise and get free desugaring from the async and await keywords (including boundary conditions like auto-wrapping synchronous methods that throw into Promise rejections). You could similarly write everything by hand monadically/pseudo-monadically directly with `return Promise.reject(new Error())` and `return Promise.resolve(returnValue)` and everything just works with a lot of existing code and devs are quite familiar with Promise returns.

It might be nice for JS to have a more generic sounding/seeming "do-notation"/"computation expression" language than async/await, but it is pretty powerful as-is, and kind of interesting seeing people talk about writing Monadic JS error handling and ignoring the "built-in" one that now exists.

This is also where I see it as a false dichotomy between Monads and try/catch. One is already a projection of the other in existing languages today (JS Promise, C# Task, Python Future/Task sometimes), and that's probably only going to get deeper and in more languages. (It's also why I think Go being intentionally "anti-Monadic" feels like such a throwback to bad older languages.)
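A small sketch of that equivalence, with a made-up `findUserName` lookup: the same logic written once with async/await and try/catch, and once by hand as a Promise chain.

```typescript
// Hypothetical lookup that rejects instead of throwing.
function findUserName(id: number): Promise<string> {
  return id === 1
    ? Promise.resolve("ada")
    : Promise.reject(new Error(`no user with id ${id}`));
}

// Written with async/await: reads like ordinary try/catch.
async function greetWithAwait(id: number): Promise<string> {
  try {
    const userName = await findUserName(id);
    return `hello, ${userName}`;
  } catch (err) {
    return `fallback: ${(err as Error).message}`;
  }
}

// The same logic written by hand in the chained, monadic style.
function greetWithChain(id: number): Promise<string> {
  return findUserName(id)
    .then((userName) => `hello, ${userName}`)
    .catch((err: Error) => `fallback: ${err.message}`);
}
```

Both functions resolve to the same string for the same `id`; the rejection plays the role of the "left" side and the resolved value the "right" side.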

sethammons

Moving from try:catch to errors as values was so refreshing. Same company, same developers, but suddenly people were actually _thinking_ of their errors. Proper debugging details and structured logging became default.

I assert that try:catch encourages lazy error handling leading to a worse debugging experience and longer mean time to problem discovery.

peterashford

Checked exceptions also force people to think of their errors

jayy-lmao

Nice thing about Monads in JS with tools like neverthrow is that you can create the Monad boundary where you like.

It becomes very similar to try-catch exception handling at the place you draw the boundary, then within the boundary it’s monad land.

If you haven’t wrapped it in a monad, chances are you wouldn’t have wrapped it in a try-catch either!
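To make the boundary idea concrete, here is a minimal sketch with a hand-rolled `Result` type (an assumption for illustration; it is not neverthrow's actual API). `tryParseJson` is the boundary where the exception is caught, and everything after it stays in "monad land".

```typescript
// A minimal hand-rolled Result type, for illustration only.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

// The boundary: the only place where a thrown exception is caught.
function tryParseJson(text: string): Result<unknown, string> {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    return { ok: false, error: `invalid JSON: ${(err as Error).message}` };
  }
}

// Inside the boundary, functions take and return Results; nothing throws.
function mapResult<T, U, E>(r: Result<T, E>, f: (value: T) => U): Result<U, E> {
  return r.ok ? { ok: true, value: f(r.value) } : r;
}

// Hypothetical consumer that never needs its own try/catch.
function countKeys(text: string): Result<number, string> {
  return mapResult(tryParseJson(text), (value) =>
    typeof value === "object" && value !== null ? Object.keys(value).length : 0
  );
}
```

Moving the boundary is just a matter of which function contains the `try`; callers of `countKeys` only ever see a `Result`.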

rad_gruchalski

Don’t accept sloppy development practices regardless of what programming language you’re going to use.

myvoiceismypass

Just FYI, the go linter has `errcheck`, which would catch your sloppy developer checkins: https://golangci-lint.run/usage/linters/#errcheck

geocar

> An interesting debate emerged about the necessity of checking every possible error:

> In JS world this could be true, but for Rust (and statically typed compiled languages in general) this is actually not the case… GO pointers are the only exceptions to this. There are no nil check protection at compile level. But Rust, kotlin etc are solid.

Yes it actually is the case. You cannot check/validate for every error, not even in rust. I recommend getting over it.

For a stupid-simple example: You can't even check if disk is going to be full!

The disk being full is a real error you have to deal with, and it could happen at any line in your code through no fault of your own. And no, it doesn't always happen at write(); it can also happen when you allocate pages for writing (e.g. SIGSEGV). You cannot really do anything about this with code, since aborting or unwinding will only ever annoy users, but you can do something.

We live in a multitasking world, so our users can deal with out-of-disk and out-of-memory errors by deleting files, adding more storage, closing other (lower priority) processes, paging/swapping, and so on. So you can wait: maybe alert the user/operator that there is trouble but then wait for the trouble to clear.

Also: Dynamic-wind is a useful general-purpose programming technique awkward to emulate, and I personally dislike subclassing BackTrack from Error because of what can only be a lack of imagination.

astrobe_

> We live in a multitasking world, so our users can deal with out-of-disk and out-of-memory errors by deleting files, adding more storage, closing other (lower priority) processes, paging/swapping, and so on. So you can wait: maybe alert the user/operator that there is trouble but then wait for the trouble to clear.

That's a weird take. I've been working for multiple decades now with systems that have no UI to speak of; their end-users are barely aware that there's a whole system behind what they can see, and that's a good thing because they become aware of it when it causes them trouble.

I take from my mentor in programming this stance for many things, including error handling: the best solution to a problem is to avoid it. That's something everybody knows actually, but we can forget that when designing/programming because one has so many things to deal with and worry about. Making the thing barely work can be a challenge in itself.

For errors, this usually means: don't let them happen. E.g. avoid OOM by avoiding dynamic allocation as much as possible; statically pre-allocate everything, even if it means megabytes of unused reserved space. Don't design your serialization format with quotes around your keys just to allow "weird" key names, a feature that nobody will ever use and that creates opportunities for errors.

Of course it is not always possible, but don't miss the opportunity when it is.

geocar

> That's a weird take

I appreciate that, but...

> I've been working for multiple decades now with systems that have no UI to speak of; their end-users are barely aware that there's a whole system behind what they can see, and that's a good thing because they become aware of it when it causes them trouble.

Notice I said "user" not "end-user" or "customer".

This was not an accident.

In your system (as in mine) the "user" is the operator.

> the best solution to a problem is to avoid it.

That's your opinion man. I don't know if you can avoid everything (I certainly can't).

Something to consider is why Erlang people have been trying to get people to "let it crash" and just deal with that, because enumerating the solutions is sometimes easier than enumerating the problems.

eska

That’s not his opinion, that’s the standard technique in systems programming. It’s why there’s software out there that does in fact never crash and shows consistent performance.

astrobe_

> Something to consider is why Erlang people have been trying to get people to "let it crash" and just deal with that

Yes, if you can afford it, I would say it is a way to avoid the problem of handling errors in a bug-free way. But it is more than yet another error handling tactic, it is a design strategy.

koolba

> For a stupid-simple example: You can't even check if disk is going to be full!

Isn’t this addressed by preallocating data files in advance of writing application data? It’s pretty common practice for databases for both ensuring space and sometimes performance (by ensuring a contiguous extent allocation).

Someone

I don’t think it’s possible to get that to work 100% of the time on typical modern hardware.

As an example, a disk block may be bad, requiring the OS to find another one to store that pre-allocated disk space. If you try to prevent that by writing to the preallocated space after you allocated it, you still can hit a case where the block goes bad after you did that.

geocar

> Isn’t this addressed by preallocating data files in advance of writing application data?

Allocation isn't the only thing that can fail: Actually writing to the blocks can fail, and just because you can write zeros doesn't mean you can write anything else.

You really can't know until you try. This is life.

beng-nl

You’re not wrong, but you are moving the goalposts a little; GP is responding to your “disk is going to be full” scenario, and that is well handled, I’d say, by preallocation. Then of course other things can go wrong too.

anacrolix

This. There are errors and states you cannot predict. As a grandchild comment says: It's easier to provide solutions than to list all the errors. Find your happy path and write code that steers you back on to it. The code will be shorter, less surprising, and actually describable. It's also testable because you treat whole classes of errors consistently so your error combinations count is smaller.

im3w1l

There is in fact a common strategy for dealing with those errors. Shut the process down. That relies on another strategy. Reliable persisted state. Best practice here is to use mechanisms that ensure that at every moment the persisted state is valid. Some databases can guarantee this. You can also write out the new state to a temp file and atomically replace the old state with the new one.
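The temp-file variant might look roughly like this under Node.js. It is a sketch that assumes a POSIX-style filesystem where a rename within one directory replaces the target atomically; `saveStateAtomically` and `loadState` are made-up names.

```typescript
import * as fs from "node:fs";

// Write the new state next to the target, then rename it over the old file.
// Assumption: rename within one directory is atomic (true on POSIX
// filesystems), so readers see either the old state or the new one.
function saveStateAtomically(targetPath: string, state: unknown): void {
  const tempPath = `${targetPath}.${process.pid}.tmp`;
  const fd = fs.openSync(tempPath, "w");
  try {
    fs.writeSync(fd, JSON.stringify(state));
    fs.fsyncSync(fd); // flush file contents to disk before the rename
  } finally {
    fs.closeSync(fd);
  }
  fs.renameSync(tempPath, targetPath);
}

function loadState(targetPath: string): unknown {
  return JSON.parse(fs.readFileSync(targetPath, "utf8"));
}
```

If the process is shut down at any point before the rename, the old file is untouched and only a stray `.tmp` file is left to clean up. A stricter version would also fsync the containing directory after the rename.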

zeroq

JS aside, I recently tried my very best to introduce proper logging and error handling to an otherwise "look ma, no handlebars" codebase.

Call it a thought experiment. We start with a clean implementation that satisfies requirements. It makes a bold assumption that every star in the universe will align to help us achieve the goal.

Now we add logging and error handling.

Despite my best intentions and years of experience, starting with clean code, the outcome was a complete mess.

It brings back memories of 2006, when I was implementing deep linking for Wikia. I started with a "true to the documentation" implementation which was roughly 10 lines of code. After handling all the edge cases and browser incompatibilities I ended up with a whopping 400 lines.

Doing exactly the same as the original lines did, but cross compatible.

dullcrisp

I guess I’ll ask, did you try using exceptions?

9rx

> We start with a clean implementation that satisfies requirements ... Now we add logging and error handling.

If error handling and logging isn't necessary to satisfy requirements, why bother with them at all?

01HNNWZ0MV43FF

Handlebars like on a bike, or like the templating language?

RadiozRadioz

They mean the bike analogy

ivanjermakov

The errors-as-values approach suffers from a similar problem as async/await: it's leaky. Once a function is altered to possibly return an error, its signature changes and every caller needs to be updated (potentially all the way to main(), if the error is not handled before that).

This approach is great when:

* program requirements are clear

* correctness is more important than prototyping speed, because every error has to be handled

* no need for concise stack trace, which would require additional layer above simple tuples

* language itself has a great support for binding and mapping values, e.g. first class monads or a bind operator

Good job by the author on acknowledging that this error handling approach is not a silver bullet and has tradeoffs.

frumplestlatz

It’s only leaky if you do not consider failure cases to be as equally intrinsic to an interface’s definition as its happy-path return value :-)

whatsakandr

Like most things in C++, I wish the default was `noexcept`, and you added a throw annotation to a function that throws. There are so many functions that don't throw but aren't marked `noexcept`.

In my experience I've used exceptions for things that really should never fail, and optional for things that are more likely to.

deschutes

If you squint hard enough, any potentially allocating function is fallible. This observation has motivated decades of pointless standards work defending against copy or initialization failure and is valuable to the people who participate in standardization for that reason alone.

For practitioners it serves mainly as a pointless gotcha. In safety-critical domains the batteries that come with C++ are useless, and so while they are right to observe that this would be a major problem there, they offer no real relief.

kikimora

Common Lisp has restarts in addition to exceptions (conditions). A restart works almost the same way as an exception, except that it allows the exception handler to resume execution from the place where the error happened. I wish we had this in modern widespread languages.
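For readers who have not seen the idea, here is a very rough TypeScript imitation. It is only a sketch: the handler stack, `withHandler`, `signalBadInput` and `parseAll` are all invented for illustration, and Common Lisp's real condition system is far more general.

```typescript
// Choices the low-level code offers to whoever handles the problem.
type Restart =
  | { kind: "use-value"; value: number }
  | { kind: "skip" }
  | { kind: "abort" };

type RestartHandler = (badInput: string) => Restart;

// Dynamically scoped handler stack (hypothetical; Common Lisp has this built in).
const handlerStack: RestartHandler[] = [];

function withHandler<T>(handler: RestartHandler, body: () => T): T {
  handlerStack.push(handler);
  try {
    return body();
  } finally {
    handlerStack.pop();
  }
}

// Ask the innermost handler what to do. The call stack is NOT unwound here,
// so the caller can continue from the exact place the problem occurred.
function signalBadInput(badInput: string): Restart {
  const handler: RestartHandler | undefined = handlerStack[handlerStack.length - 1];
  return handler === undefined ? { kind: "abort" } : handler(badInput);
}

function parseAll(inputs: string[]): number[] {
  const parsed: number[] = [];
  for (const raw of inputs) {
    const n = Number(raw);
    if (!Number.isNaN(n)) {
      parsed.push(n);
      continue;
    }
    const restart = signalBadInput(raw);
    if (restart.kind === "use-value") parsed.push(restart.value);
    else if (restart.kind === "abort") throw new Error(`cannot parse "${raw}"`);
    // "skip": nothing to do, continue with the next input
  }
  return parsed;
}
```

A caller far up the stack can wrap `parseAll` in `withHandler` and decide per bad input whether to substitute a value, skip it, or abort, while the loop itself keeps running where it was.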

pmontra

It's strange that they didn't write about the Erlang/Elixir approach of

1. returning a tuple with an ok or fail value (so errors as values) plus

2. pattern matching on return values (which makes error values bearable) possibly using the with do end macro plus

3. failing on unmatched errors and trying again to execute the failed operation (fail fast) thanks to supervision trees.

Maybe that's because the latter feature is not available nearly for free in most runtimes and because Erlang style pattern matching is also uncommon.

The approach requires a language that's built on those concepts and not one in which they are added unnaturally as an afterthought (the approach becomes burdensome.)

Pattern matching: https://hexdocs.pm/elixir/pattern-matching.html

With: https://hexdocs.pm/elixir/1.18.1/Kernel.SpecialForms.html#wi...

Supervisors: https://hexdocs.pm/elixir/1.18.1/supervisor-and-application....
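For comparison, the first two points can be approximated in TypeScript with a tagged union and an exhaustive `switch`. This is only a sketch with invented names (`Fetched`, `fetchPage`, `describeFetch`), and it does not model point 3: there is no supervisor to restart anything.

```typescript
// Rough analogue of Elixir's {:ok, value} / {:error, reason} tuples.
type Fetched =
  | { tag: "ok"; body: string }
  | { tag: "error"; reason: "not_found" | "timeout" };

// Hypothetical operation that reports failure as a value.
function fetchPage(pagePath: string): Fetched {
  if (pagePath === "/") return { tag: "ok", body: "<h1>home</h1>" };
  if (pagePath === "/slow") return { tag: "error", reason: "timeout" };
  return { tag: "error", reason: "not_found" };
}

// The switch plays the role of pattern matching. The `never` assignment
// makes the compiler complain if a new tag is added but not handled.
function describeFetch(result: Fetched): string {
  switch (result.tag) {
    case "ok":
      return `200 (${result.body.length} bytes)`;
    case "error":
      return result.reason === "timeout" ? "504" : "404";
    default: {
      const unreachable: never = result;
      return unreachable;
    }
  }
}
```

It works, but it shows the point about languages where this is bolted on: the matching is more verbose than in Elixir, and the "fail fast and let a supervisor retry" part has no counterpart here.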

eximius

The three things I wish were more standardized in the languages I use are

1. Stacktraces with fields/context besides a string

2. Wrapping errors

3. Combining multiple errors

pyfon

Observability tools give you this (as long as it can be handled and isn't a straight up panic).

Animats

Most of these proposals miss the point. Errors need a useful taxonomy, based on what to do about them. The question is what do you do with an error after you caught it. A breakdown like this is needed:

- Program is broken. Probably need to abort program. Example: subscript out of range.

- Data from an external source is corrupted. Probably need to unwind transaction but program can continue. Example: bad UTF-8 string from input.

- Connection to external device or network reports a problem.

-- Retryable. Wait and try again a few times. Example: HTTP 5xx errors.

-- Non-retryable. Give up now. Example: HTTP 4xx errors.

Python 2 came close to that, but the hierarchy for Python 3 was worse. They tried; all errors are subclasses of a standard error hierarchy, but it doesn't break down well into what's retryable and what isn't.

Rust never got this right, even with Anyhow.
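A sketch of what such a breakdown could look like as types, using the categories listed above. The names `RecoveryAction`, `ClassifiedFailure` and `actionFor` are invented, and the 5xx/4xx rule is just the rule of thumb from the list, not a universal truth.

```typescript
// What the caller should do, rather than what went wrong.
type RecoveryAction = "abort-program" | "abort-transaction" | "retry" | "give-up";

type ClassifiedFailure =
  | { kind: "bug"; detail: string }        // e.g. subscript out of range
  | { kind: "bad-input"; detail: string }  // e.g. bad UTF-8 string from input
  | { kind: "http"; statusCode: number };  // connection or network problem

function actionFor(failure: ClassifiedFailure): RecoveryAction {
  switch (failure.kind) {
    case "bug":
      return "abort-program";
    case "bad-input":
      return "abort-transaction";
    case "http":
      // Rule of thumb from the list above: 5xx is retryable, everything else is not.
      return failure.statusCode >= 500 && failure.statusCode <= 599 ? "retry" : "give-up";
  }
}
```

The value of the taxonomy is that the catch site switches on a small set of actions instead of on dozens of concrete error types.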

mirekrusin

Severity in majority of library functions is undecidable, it’s decidable at the call site instead. That’s why language should be providing sugar to pick behaviour - exceptions (propagate as is, optionally decorate/wrap), refute (error value, result type), mute/predicate-like (use zero value, ie undefined in js/ts).
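Without language sugar, the closest thing today is a few small helpers applied at the call site. A sketch, with invented names (`Outcome`, `parseIntStrict`, `orThrow`, `orDefault`):

```typescript
type Outcome<T> = { ok: true; value: T } | { ok: false; error: Error };

// Library function: reports the failure but does not decide how severe it is.
function parseIntStrict(raw: string): Outcome<number> {
  return /^-?\d+$/.test(raw)
    ? { ok: true, value: Number(raw) }
    : { ok: false, error: new Error(`not an integer: "${raw}"`) };
}

// Call-site choice 1: propagate as an exception.
function orThrow<T>(outcome: Outcome<T>): T {
  if (outcome.ok) return outcome.value;
  throw outcome.error;
}

// Call-site choice 2: mute the error and use a fallback (zero) value.
function orDefault<T>(outcome: Outcome<T>, fallback: T): T {
  return outcome.ok ? outcome.value : fallback;
}

// Call-site choice 3: keep the Outcome and inspect it as an ordinary value.
```

So `orThrow(parseIntStrict(input))` treats bad input as exceptional, while `orDefault(parseIntStrict(input), 0)` treats the same failure as harmless, and the library function is identical in both cases.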

9rx

> optionally decorate/wrap

If you are using exceptional handlers for transmitting errors instead of exceptions (i.e. what should have been a compiler error but wasn't detected until runtime), wrapping should be mandatory, else you'll invariably leak implementation details, which is a horrid place to end up. Especially if you don't have something like checked exceptions to warn you that the implementation has changed.

dwattttt

There's no universal taxonomy of "this error is retryable, this one non-recoverable"; it's context dependent.

As a boring example, I might write something that detects when a resource gets hosted, e.g. goes from 404 -> 200.

The best I imagine you can do is be able to easily group each error and handle them appropriately.

01HNNWZ0MV43FF

Well you don't usually want double retry loops, and sometimes that subscript error is because the subscript came from input.

What to do with an error depends on who catches it. That's probably why Python got it wrong and then Rust said worse is better

nlitened

An interesting development in this direction is Clojure’s anomalies taxonomy: nine outcomes (two retriable, two maybe-retriable, five non-retriable; nine respective ways to fix)

See table here: https://github.com/cognitect-labs/anomalies

Lord_Zero

This is called the "result pattern". I would not call this a novel concept. In C# we use this: https://github.com/ardalis/Result

karmakaze

Yes, I stopped reading at:

> The most common approach is the traditional try/catch method.

wavemode

Weird to stop reading at a statement that is factually true.

hamstergene

Returning error codes was actually the first approach to error handling. Exceptions (try/catch) became widespread much later. The article got it backwards calling try/catch "traditional" and Go's approach "modern".

karmakaze

Not in a codebase I develop or maintain. Nothing to see here, moving along.

akoboldfrying

Pretty sure this line: https://meowbark.dev/Better-error-handling#:~:text=return%20...

will immediately throw if b == 0, because

    a / b

is evaluated immediately, so execution never makes it into fromThrowable(). Does it need to be

    () => a / b

instead?

Similarly, withRetry()'s argument needs to have type "() => ResultAsync<T, ApiError>" -- at present, it is passed a result, and if that result is a RateLimit error, it will just return the same error again 1s later.
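To illustrate both points without guessing at the article's or neverthrow's exact code, here is a hand-rolled synchronous sketch. `Attempt`, `attempt`, `retry` and `divideOrThrow` are invented names, and the delay between retries is left out.

```typescript
type Attempt<T> = { ok: true; value: T } | { ok: false; error: string };

// Takes a thunk, so the risky work runs inside the try block.
function attempt<T>(work: () => T): Attempt<T> {
  try {
    return { ok: true, value: work() };
  } catch (err) {
    return { ok: false, error: String(err) };
  }
}

// Retrying needs a thunk too: each attempt must re-run the work rather
// than re-inspect a result that was computed once.
function retry<T>(work: () => Attempt<T>, maxAttempts: number): Attempt<T> {
  let last: Attempt<T> = { ok: false, error: "no attempts made" };
  for (let i = 0; i < maxAttempts && !last.ok; i++) {
    last = work();
  }
  return last;
}

// Hypothetical operation that throws for a zero divisor.
function divideOrThrow(a: number, b: number): number {
  if (b === 0) throw new RangeError("division by zero");
  return a / b;
}
```

With this shape, `attempt(() => divideOrThrow(1, 0))` returns an error value, whereas `attempt(divideOrThrow(1, 0))` would not even type-check and, in plain JS, would throw before `attempt` is entered.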

ChrisMarshallNY

I’m of the opinion that the best error handling, is to not encounter the error, in the first place.

That means good UX, intuitive interfaces, good affordances, user guidance (often, without requiring them to read text), and simplicity.

When an error is encountered, then it needs to be reported to the user in as empathetic and useful manner as possible. It also needs to be as “bare bones” simple as can reasonably be managed.

Designing for low error rates, starts from requirements. Good error reporting requires a lot of [early] input from non-technical stakeholders.

pyfon

Errors often come from the fact that we build on an unreliable medium.

Lost packets, high latency, crashed disks, out of memory etc.

You can talk to your users sure but you need to handle this stuff at some level either way. Shit happens!

ChrisMarshallNY

Absolutely.

But we need to plan for it from Day One, and that can also include things like choosing good technology stacks.

Like I said, when inevitable errors happen, how we communicate (or, if possible, mitigate silently) the condition, is crucial.

[EDITED TO ADD] Note how any discussion of improving Quality of software is treated, hereabouts. Bit discouraging.

pyfon

Correct. What errors can happen PLUS how we communicate (and what we do: roll-back transaction? Etc.) PLUS how do we ensure correctness of both (sane programming language, good idioms, testing, proofs etc.)