
Show HN: Unsure Calculator – back-of-a-napkin probabilistic calculator

roughly

I like this!

In the grand HN tradition of being triggered by a word in the post and going off on a not-quite-but-basically-totally-tangential rant:

There are (at least) three areas here that are footguns with these kinds of calculations:

1) 95% is usually a lot wider than people think - people take 95% as “I’m pretty sure it’s this,” whereas it’s really closer to “it’d be really surprising if it were not this” - by and large people keep their mental error bars too close.

2) uncertain quantities are rarely truly uncorrelated - call this the “Mortgage Derivatives” maxim. In the family example, rent is very likely to be correlated with food costs - so, if rent is high, food costs are also likely to be high. This skews the distribution - modeling the inputs as independent will lead to you being surprised at how improbable the actual outcome was.

3) In general, normal distributions are rarer than people think - they tend to require some kind of constraining factor on the values to arise. We see them a bunch in nature because there tend to be negative feedback loops all over the place, but once you leave the relatively tidy garden of Mother Nature for the chaos of human affairs, normal distributions get pretty abnormal.

I like this as a tool, and I like the implementation, I’ve just seen a lot of people pick up statistics for the first time and lose a finger.

btilly

I strongly agree with this, and particularly point 1. If you ask people to provide estimated ranges for answers that they are 90% confident in, people on average produce roughly 30% confidence intervals instead. Over 90% of people don't even get to 70% confidence intervals.

You can test yourself at https://blog.codinghorror.com/how-good-an-estimator-are-you/.

Nevermark

From link:

> Heaviest blue whale ever recorded

I don't think estimation errors regarding things outside of someone's area of familiarity say much.

You could ask a much "easier" question from the same topic area and still get terrible answers: "What percentage of blue whales are blue?" Or just "Are blue whales blue?"

Estimating something often encountered but uncounted seems like a better test. Like how many cars pass in front of my house every day. I could apply arithmetic, soft logic and intuition to that. But that would be a difficult question to grade, given it has no universal answer.

kqr

I have no familiarity with blue whales but I would guess they're 1--5 times the mass of lorries, which I guess weigh like 10--20 cars which I in turn estimate at 1.2--2 tonnes, so primitively 12--200 tonnes for a normal blue whale. This also aligns with it being at least twice as large as an elephant, something I estimate at 5 tonnes.

The question asks for the heaviest, which I think cannot be more than three times the normal weight, and probably no less than 1.3. That lands me at 15--600 tonnes using primitive arithmetic. The calculator in OP suggests 40--320.
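
For what it's worth, the same chain of guesses runs fine as a quick Monte Carlo sketch (treating each range as a uniform interval purely for illustration; that's an assumption, not necessarily how the calculator models it):

```
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Treat each guessed range as uniform (an assumption for illustration).
lorries_per_whale = rng.uniform(1, 5, n)      # whale mass in lorry masses
cars_per_lorry    = rng.uniform(10, 20, n)    # lorry mass in car masses
tonnes_per_car    = rng.uniform(1.2, 2.0, n)  # car mass in tonnes
heaviest_factor   = rng.uniform(1.3, 3.0, n)  # heaviest vs. typical whale

heaviest = lorries_per_whale * cars_per_lorry * tonnes_per_car * heaviest_factor
# The 95% band lands well inside the 15--600 tonne worst-case arithmetic.
print(np.percentile(heaviest, [2.5, 50, 97.5]))
```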

The real value is apparently 170, but that doesn't really matter. The process of arriving at an interval that is as wide as necessary but no wider is the point.

Estimation is a skill that can be trained. It is a generic skill that does not rely on domain knowledge beyond some common sense.

yen223

I guess people didn't realise they are allowed to, and in fact are expected to, put very wide ranges for things they are not certain about.

peeters

So the context of the quiz is software estimation, where I assume it's an intentional parable of estimating something you haven't seen before. It's trying to demonstrate that your "5-7 days" estimate probably represents far more certainty than you intended.

For some of these, your answer could span orders of magnitude. E.g. my answer for the heaviest blue whale would probably be 5-500 tons because I don't have a good concept of things that weigh 500 tons. The important point is that I'm right around 9 times in 10, not that I had a precise estimate.

MichaelDickens

It shouldn't matter how familiar you are with the question. If you're pretty familiar, give a narrow 90% credence interval. If you're unfamiliar, give a wide interval.

pertdist

I did a project with non-technical stakeholders modeling likely completion dates for a big Gantt chart. Business stakeholders wanted probabilistic task completion times because some of the tasks were new and impractical to quantify with fixed times.

Stakeholders really liked specifying work times as t_i ~ PERT(min, mode, max) because it mimics their thinking and handles typical real-world asymmetrical distributions.

[Background: PERT is just a re-parameterized beta distribution that's more user-friendly and intuitive https://rpubs.com/Kraj86186/985700]
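
For anyone curious, here is a minimal sketch of sampling t_i ~ PERT(min, mode, max) via that beta reparameterization (assuming the standard shape parameter lambda = 4; the task numbers are made up):

```
import numpy as np

def sample_pert(minimum, mode, maximum, size=100_000, lam=4.0, rng=None):
    # PERT(min, mode, max) as a scaled beta distribution.
    if rng is None:
        rng = np.random.default_rng()
    alpha = 1 + lam * (mode - minimum) / (maximum - minimum)
    beta = 1 + lam * (maximum - mode) / (maximum - minimum)
    return minimum + (maximum - minimum) * rng.beta(alpha, beta, size)

# e.g. a task estimated as "probably 5 days, at best 3, could blow out to 14"
samples = sample_pert(3, 5, 14)
print(np.percentile(samples, [5, 50, 95]))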

kqr

This looks like a much more sophisticated version of PERT than I have seen used. When people around me have claimed to use PERT, they have just added together all the small numbers, all the middle numbers, and all the big numbers. That results in a distribution that is too extreme in both lower and upper bound.
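
A quick sketch of why that is too extreme, assuming ten identical tasks estimated as 3/5/14 days (made-up numbers): adding the optimistic figures and the pessimistic figures gives 30 and 140 days, while the distribution of the actual total is far tighter.

```
import numpy as np

rng = np.random.default_rng(1)
n_tasks, n_sims = 10, 100_000

def sample_pert(a, m, b, size, lam=4.0):
    alpha = 1 + lam * (m - a) / (b - a)
    beta = 1 + lam * (b - m) / (b - a)
    return a + (b - a) * rng.beta(alpha, beta, size)

# Sum ten independent task durations per simulation.
totals = sum(sample_pert(3, 5, 14, n_sims) for _ in range(n_tasks))
# 5th and 95th percentiles of the total: much narrower than 30--140 days.
print(np.percentile(totals, [5, 95]))
```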

baq

that... is not PERT. it's 'I read a tweet about three point estimates' and I'm using a generous interpretation of read

baq

arguably this is how it should always be done, fixed durations for any tasks are little more than wishful thinking.

jrowen

This jibes with my general reaction to the post, which was that the added complexity and difficulty of reasoning about the ranges actually made me feel less confident in the result of their example calculation. I liked the $50 result: you can tack on a plus-or-minus range but generally feel like you're about break-even. On the other hand, "95% sure the real balance will fall into the -$60 to +$220 range" feels like it's creating a false sense of having more concrete information when you've really just added compounding uncertainties at every step (if we don't know that each one is definitely 95%, or the true min/max, we're just adding more guesses to be potentially wrong about). That's why I don't like the Drake equation: every step is just compounding wild-ass guesses - is it really producing a useful number?

kqr

It is producing a useful number. As more truly independent terms are added, the error grows with the square root of the number of terms while the point estimate grows linearly. In the aggregate, the error makes up a smaller fraction of the point estimate.
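
A small numerical sketch of that, assuming n independent terms each with mean 10 and standard deviation 3 (arbitrary numbers):

```
import numpy as np

rng = np.random.default_rng(2)
mu, sigma = 10.0, 3.0

for n in (1, 4, 16, 64):
    totals = rng.normal(mu, sigma, (100_000, n)).sum(axis=1)
    # Relative error shrinks roughly as 1/sqrt(n): ~0.30, 0.15, 0.075, 0.038
    print(n, totals.std() / totals.mean())
```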

This is the reason Fermi estimation works. You can test people on it, and almost universally they get more accurate with this method.

If you got less certain of the result in the example, that's probably a good thing. People are default overconfident with their estimated error bars.

jrowen

I read a bit on Fermi estimation, and I'm not quite sure what the "method" is in contrast to a less accurate method - is it basically just getting people to think in terms of dimensional analysis? This passage from Wikipedia is interesting:

> By contrast, precise calculations can be extremely complex but with the expectation that the answer they produce is correct. The far larger number of factors and operations involved can obscure a very significant error, either in mathematical process or in the assumptions the equation is based on, but the result may still be assumed to be right because it has been derived from a precise formula that is expected to yield good results.

So the strength of it is in keeping it simple and not trying to get too fancy, with the understanding that it's just a ballpark/sanity check. I still feel like the Drake equation in particular has too many terms for which we don't have enough sample data to produce a reasonable guess. But I think this is generally understood and it's seen as more of a thought experiment.

pests

> People are default overconfident with their estimated error bars.

You say this, yet roughly in a top-level comment mentions that people keep their error bars too close.

roughly

I think the point is to create uncertainty, though, or to at least capture it. You mention tacking a plus/minus range to $50, but my suspicion is that people's expected plus/minus would be narrower than the actual - I think the primary value of the example is that it makes it clear there's a very real possibility of the outcome being negative, which I don't think most people would acknowledge when they got the initial positive result. The increased uncertainty and the decreased confidence in the result is a feature, not a bug.

dawnofdusk

>rent is very likely to be correlated with food costs - so, if rent is high, food costs are also likely to be high

Not sure I agree with this. It's reasonable to have a model where the mean rent may be correlated with the mean food cost, but given those two parameters we can model the fluctuations about the mean as uncorrelated. In any case at the point when you want to consider something like this you need to do proper Bayesian statistics anyways.

>In general normal distributions are rarer than people think - they tend to require some kind of constraining factor on the values to enforce.

I don't know where you're getting this from. One needs uncorrelated errors, but this isn't a "constraint" or "negative feedback".

roughly

The family example is a pat example, but take something like project planning - two tasks, each one takes between 2 and 4 weeks - except that they’re both reliant on Jim, and if Jim takes the “over” on task 1, what’s the odds he takes the “under” on task 2?

This is why I joked about it as the mortgage derivatives maxim - what happened in 2008 (mathematically, at least - the parts of the crisis that aren’t covered by the famous Upton Sinclair quote) was that mortgage-backed derivatives were modeled as an aggregate of a thousand uncorrelated outcomes (a mortgage going bust), without taking into account that at least a subset of the conditions leading to one mortgage going bust would also lead to a separate, unrelated mortgage going bust. The results were not uncorrelated, and treating them as such meant the “1 in a million” outcome was substantially more likely in reality than the model allowed.
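
A toy sketch of that effect, with made-up numbers: 1,000 loans that each default 2% of the time on average, with and without a shared "bad year" driving the defaults.

```
import numpy as np

rng = np.random.default_rng(3)
n_loans, n_sims = 1_000, 100_000

# Independent model: each loan defaults with probability 2%.
indep_defaults = rng.binomial(n_loans, 0.02, n_sims)

# Correlated model: same 2% average, but in a "bad year" (10% of the time)
# the default rate jumps to 11%, and otherwise it is 1%.
bad_year = rng.random(n_sims) < 0.10
rate = np.where(bad_year, 0.11, 0.01)
corr_defaults = rng.binomial(n_loans, rate)

threshold = 50  # "1 in a million"-ish under the independent model
print((indep_defaults >= threshold).mean())  # ~0: essentially never happens
print((corr_defaults >= threshold).mean())   # ~0.10: happens in basically every bad year
```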

Re: negative feedback - that’s a separate point from the uncorrelated errors problem above, and a critique of using the normal distribution at all for modeling many different scenarios. Normal distributions rely on some kind of, well, normal scattering of the outcomes, which means there’s some reason why they’d tend to clump around a central value. We see it in natural systems because there are constraints on things like the height and weight of an organism, etc., but without some form of constraint, you can’t rely on a normal distribution - the classic examples being wealth, income, sales, etc., where the outliers tend to be so much larger than average that they’re effectively precluded by a normal distribution, and yet there they are.

To be clear, I’m not saying there are not statistical methods for handling all of the above, I’m noting that the naive approach of modeling several different uncorrelated normally distributed outcomes, which is what the posted tool is doing, has severe flaws which are likely to lead to it underestimating the probability of outlier outcomes.

youainti

> I’ve just seen a lot of people pick up statistics for the first time and lose a finger.

I love this. I've never thought of statistics like a power tool or firearm, but the analogy fits really well.

ninalanyon

Unfortunately it's usually someone else who loses a finger, not the person wielding the statistics.

rssoconnor

Normal distributions are the maximum entropy distributions for a given mean and variance. Therefore, in accordance with the principle of maximum entropy, unless you have some reason to not pick a normal distribution (e.g. you know your values must be non-negative), you should be using a normal distribution.

tgv

At least also accept a log-normal distribution. Sometimes you need a factor like .2 ~ 5, but that isn't the same as N(2.6, 1.2).

kqr

> you should be using a normal distribution.

...if the only things you know about an uncertain value are its expectation and variance, yes.

Often you know other things. Often you don't know expectation and variance with any certainty.

jbjbjbjb

I think to do all that you’d need a full-on DSL rather than something pocket-calculator-like. I think adding a triangular distribution would be good though.

gamerDude

Great points. I think the idea of this calculator could just be simply extended to specific use cases to make the statistical calculation simple and take into account additional variables. Moving being one example.

NunoSempere

I have written similar tools

- for command line, fermi: https://git.nunosempere.com/NunoSempere/fermi

- for android, a distribution calculator: https://f-droid.org/en/packages/com.nunosempere.distribution...

People might also be interested in https://www.squiggle-language.com/, which is a more complex version (or possibly <https://git.nunosempere.com/personal/squiggle.c>, which is a faster but much more verbose version in C)

NunoSempere

Fermi in particular has the following syntax

```
5M 12M # number of people living in Chicago
beta 1 200 # fraction of people that have a piano
30 180 # minutes it takes to tune a piano, including travel time
/ 48 52 # weeks a year that piano tuners work for
/ 5 6 # days a week in which piano tuners work
/ 6 8 # hours a day in which piano tuners work
/ 60 # minutes to an hour
```

Multiplication is implied as the default operation; fits are lognormal.

NunoSempere

Here is a thread with some fun fermi estimates made with that tool, e.g. the number of calories NK gets from Russia: https://x.com/NunoSempere/status/1857135650404966456

```
900K 1.5M # tonnes of rice per year NK gets from Russia
* 1K # kg in a tonne
* 1.2K 1.4K # calories per kg of rice
/ 1.9K 2.5K # daily caloric intake
/ 25M 28M # population of NK
/ 365 # years of food this buys
/ 1% # as a percentage
```

kqr

Oh, this is very similar to what I have with Precel, less syntax. Thanks for sharing!

NunoSempere

Another tool in this spirit is <https://carlo.app/>, which allows you to do this kind of calculation on google sheets.

joshlemer

Their pricing is absolutely out of this world though. Their BASIC plan is $2990 USD per year, the pro plan is $9990/year. https://carlo.app/pricing

NunoSempere

They have a free tier as well, just with fewer samples, and they aren't in the zero-marginal-cost regime.

notpushkin

Would be a nice touch if Squiggle supported the `a~b` syntax :^)

antman

I tried the unsure calc and the android app and they seem to produce different results?

NunoSempere

The android app fits lognormals, and 90% rather than 95% confidence intervals. I think lognormals are a more parsimonious distribution for these kinds of estimates. One hint is that, per the central limit theorem, sums of independent variables tend to normals, which means that products tend to lognormals - and in the quick decompositions where these estimates are most useful, multiplication is the more common operation.
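
A small sketch of that argument: take the log of a product of independent positive factors and you are back to a sum, which the CLT pushes toward normal - so the product itself is pushed toward lognormal. (Illustrative numbers only.)

```
import numpy as np

rng = np.random.default_rng(4)

def skew(x):
    return ((x - x.mean()) ** 3).mean() / x.std() ** 3

# Product of 12 independent positive factors, each uniform on [0.5, 2].
product = rng.uniform(0.5, 2.0, (100_000, 12)).prod(axis=1)

print(skew(np.log(product)))  # near 0: the log looks normal
print(skew(product))          # clearly positive: lognormal-style right tail
```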

OisinMoran

This is neat! If you enjoy the write-up, you might be interested in the paper “Dissolving the Fermi Paradox”, which goes even more in-depth into actually multiplying the probability density functions instead of the common point estimates. It has the somewhat surprising result that we may just be alone.

https://arxiv.org/abs/1806.02404

drewvlaz

This was quite a fun read, thanks!

baq

a bit depressing TBH... but ~everyone on this site should read this for the methodology

kqr

I have made a similar tool but for the command line[1] with similar but slightly more ambitious motivation[2].

I really like that more people are thinking in these terms. Reasoning about sources of variation is a capability not all people are trained in or develop, but it is increasingly important.[3]

[1]: https://git.sr.ht/~kqr/precel

[2]: https://entropicthoughts.com/precel-like-excel-for-uncertain...

[3]: https://entropicthoughts.com/statistical-literacy

gregschlom

The ASCII art (well technically ANSI art) histogram is neat. Cool hack to get something done quickly. I'd have spent 5x the time trying various chart libraries and giving up.

Retr0id

On a similar note, I like the crude hand-drawn illustrations a lot. Fits the "napkin" theme.

smartmic

Here [1] is a nice implementation written in Awk. A bit rough around the edges, but could be easily extended.

[1] https://github.com/stefanhengl/histogram

trieloff

https://www.getguesstimate.com/ is this, as a spreadsheet

peeters

Is there a way to do non-scalar multiplication? E.g if I want to say "what is the sum of three dice rolls" (ignoring the fact that that's not a normal distro) I want to do 1~6 * 3 = 1~6 + 1~6 + 1~6 = 6~15. But instead it does 1~6 * 3 = 3~18. It makes it really difficult to do something like "how long will it take to complete 1000 tasks that each take 10-100 days?"
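
A hedged sketch of the difference, treating 10~100 as a uniform range purely for illustration: scaling one uncertain task time by 1,000 versus summing 1,000 independent task times.

```
import numpy as np

rng = np.random.default_rng(5)
n_sims, n_tasks = 100_000, 1_000

# "10~100 * 1000": one uncertain task time, scaled by 1,000.
scaled = rng.uniform(10, 100, n_sims) * n_tasks

# Sum of 1,000 independent task times: much narrower, clustered near 55,000.
summed = rng.uniform(10, 100, (n_sims, n_tasks)).sum(axis=1)

print(np.percentile(scaled, [2.5, 97.5]))  # roughly 12,000 to 98,000 days
print(np.percentile(summed, [2.5, 97.5]))  # roughly 53,000 to 57,000 days
```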

ttoinou

Would be nice to retransform the output into an interval / gaussian distribution

   Note: If you're curious why there is a negative number (-5) in the histogram, that's just an inevitable downside of the simplicity of the Unsure Calculator. Without further knowledge, the calculator cannot know that a negative number is impossible
The Drake equation, or any equation that multiplies probabilities, can also be worked in log space, where the uncertainty is on the scale of each log probability and the final probability is the exponential of the sum of the log probabilities. Then we wouldn't have this negative-value issue.
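
A minimal sketch of that idea, assuming we treat a low~high range as a 95% interval on a lognormal quantity, so everything stays positive by construction (the helper name and numbers here are made up):

```
import numpy as np

rng = np.random.default_rng(6)

def lognormal_from_interval(low, high, size=100_000):
    # Treat low..high as a central 95% interval in log space (+/- 2 sigma).
    mu = (np.log(low) + np.log(high)) / 2
    sigma = (np.log(high) - np.log(low)) / 4
    return rng.lognormal(mu, sigma, size)

# e.g. an uncertain cost somewhere between 20 and 200:
cost = lognormal_from_interval(20, 200)
print(cost.min() > 0)                      # True: never negative
print(np.percentile(cost, [2.5, 97.5]))    # roughly 20 and 200 again
```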

hatthew

The default example `100 / 4~6` gives the output `17~25`

ttoinou

Amazing, thank you !

krick

It sounds like a gimmick at first, but looks surprisingly useful. I'd surely install it if it was available as an app to use alongside my usual calculator, and while I cannot quite recall a situation when I needed it, it seems very plausible that I'll start finding use cases once I have it bound to some hotkey on my keyboard.

PennRobotics

I just threw numbers into there (population x ownership percent x replacement frequency x unit cost) to estimate the annual revenue of the smartphone market and got a few percentage points away from what the internet reports is the true value.

Because there are trig functions, this would also be nice for reverse engineering of complex parts from simple measurements (feature volumes, corner angles, cross-sectional areas)

Multimodal would be nice but not necessary.

Inverse trig functions would be interesting but complicated and not necessary.

In any case, this tool is more convenient than opening Python every time I want to estimate a range of answers.


hyperbolablabla

They use Dart as their primary language, so it should be easy to make a Flutter app from it...

thih9

Feature request: allow specifying the probability distribution. E.g.: ‘~’: normal, ‘_’: uniform, etc.

tgv

I think they should be functions: G(50, 1) for a Gaussian with µ=50, σ=1; N(3) for a negative exponential with λ=3; U(0, 1) for a uniform distribution between 0 and 1; UI(1, 6) for a uniform integer distribution from 1 to 6; etc. Seems much more flexible, and easier to remember.

pyfon

Not having this feature is a feature—they mention this.

thih9

Not really, or at least not permanently; uniform distribution is mentioned in a github changelog, perhaps it’s an upcoming feature:

> 0.4.0

> BREAKING: x~y (read: range from x to y) now means "flat distribution from x to y". Every value between x and y is as likely to be emitted.

> For normal distribution, you can now use x+-d, which puts the mean at x, and the 95% (2 sigma) bounds at distance d from x.

https://github.com/filiph/unsure/blob/master/CHANGELOG.md#04...

djoldman

I perused the codebase but I'm unfamiliar with Dart:

https://github.com/filiph/unsure/blob/master/lib/src/calcula...

I assume this is a Monte Carlo approach? (Not to start a flamewar, at least for us data scientists :) ).

kccqzy

Yes it is.

porridgeraisin

Can you explain how? I'm an (aspiring)

kccqzy

I didn't peruse the source code. I just read the linked article in its entirety and it says

> The computation is quite slow. In order to stay as flexible as possible, I'm using the Monte Carlo method. Which means the calculator is running about 250K AST-based computations for every calculation you put forth.

So therefore I conclude Monte Carlo is being used.

hawthorns

It's dead simple. Here is the simplified version that returns the quantiles for '100 / 2 ~ 4'.

  import numpy as np

  def monte_carlo(formula, iterations=100000):
    res = [formula() for _ in range(iterations)]
    return np.percentile(res, [0, 2.5, *range(10, 100, 10), 97.5, 100])

  def uncertain_division():
    return 100 / np.random.uniform(2, 4)

  print(monte_carlo(uncertain_division, iterations=100000))

constantcrying

Lines 19 to 21 should be the Monte Carlo sampling algorithm. The implementation is maybe a bit unintuitive, but apparently he creates a function from the expression in the calculator; calling that function returns one random sample of that expression.

nritchie

Here (https://uncertainty.nist.gov/) is another similar Monte Carlo-style calculator designed by the statisticians at NIST. It is intended for propagating uncertainties in measurements and can handle various different assumed input distributions.

filiph

I think I was looking at this and several other similar calculators when creating the linked tool. This is what I mean when I say "you'll want to use something more sophisticated".

The problem with similar tools is the very high barrier to entry. This is what my project was trying to address, though imperfectly (the user still needs to understand, at the very least, the concept of probability distributions).