The Missing 11th of the Month
12 comments · June 18, 2025
djoldman
This is why one of my principles is to be skeptical of outliers. Often they are not real and therefore misrepresent the true data.
It's one reason median is preferred over mean, at the outset, as well as throwing out outliers just to see what things look like.
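To make the median-versus-mean point concrete, here is a minimal Python sketch. The `daily_counts` numbers are made up for illustration (one day with an implausibly low count), and `drop_outliers` is a hypothetical helper that uses a median-absolute-deviation cutoff, which is just one of several reasonable ways to "throw out outliers and see what things look like."

```python
from statistics import mean, median

def drop_outliers(values, k=5.0):
    """Drop values more than k median-absolute-deviations from the median.

    Hypothetical helper for illustration; the cutoff k=5.0 is an arbitrary choice.
    """
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        # All values (or at least half) are identical; nothing sensible to trim.
        return list(values)
    return [v for v in values if abs(v - center) <= k * mad]

# Made-up counts: five ordinary days and one suspiciously low one.
daily_counts = [100, 102, 98, 101, 99, 5]

print("mean:", mean(daily_counts))          # dragged down by the single low value
print("median:", median(daily_counts))      # barely affected by it
print("trimmed mean:", mean(drop_outliers(daily_counts)))
```

The gap between the mean and the median is the cue to go look at where the odd value came from, rather than a reason to trust either number blindly.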
dustincoates
Similar to Twyman's Law: “Any figure that looks interesting or different is usually wrong.”
yen223
The lesson I took from this is that it is useful and important to dig into how any piece of data was sourced.
throwaway173738
You can tell how much they cared about data quality because they never took the time to look at context-dependent glyph equivalencies. And some context-sensitive algorithms might not make the same mistakes as a naive “guess what characters are here” algorithm that just uses glyph shapes. You run into this a LOT with ALPR systems because some of the presses excluded some characters. O and 0 are the most common character equivalency. But only in certain places.
OCR is actually complicated if you’re trying to rely on the data for something.
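A rough sketch of what position-dependent glyph equivalence could look like, in Python. The `LLL-DDD` plate format (three letters, a dash, three digits) and the function name `fix_by_position` are assumptions made up for this example, not how any particular ALPR system works; real plate formats vary by jurisdiction.

```python
# Confusable glyphs, resolved differently depending on what the slot expects.
TO_DIGIT = str.maketrans({"O": "0", "I": "1", "l": "1"})
TO_LETTER = str.maketrans({"0": "O", "1": "I"})

def fix_by_position(raw, pattern):
    """Resolve look-alike characters using a positional pattern.

    pattern uses 'L' for a letter slot and 'D' for a digit slot; any other
    character is treated as a literal and passed through unchanged.
    """
    if len(raw) != len(pattern):
        # Unknown layout: leave the reading alone rather than guess.
        return raw
    out = []
    for ch, slot in zip(raw, pattern):
        if slot == "D":
            out.append(ch.translate(TO_DIGIT))
        elif slot == "L":
            out.append(ch.translate(TO_LETTER))
        else:
            out.append(ch)
    return "".join(out)

# The same glyph shape is read as a letter in one slot and a digit in another.
print(fix_by_position("B0B-1O7", "LLL-DDD"))
```

The point is that "O equals 0" is only applied where the format says which one is expected; a reading that does not fit the expected layout is returned untouched.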
mensetmanusman
Naming an event after its date will have a limited run.
esafak
tl;dr: It's an OCR error
dahart
Or, sometimes, not; one of the more interesting takeaways was typewritten lowercase ells instead of ones: “When the algorithm read October llth, it was far more correct than we have been giving it credit.”
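For what it's worth, handling the "llth" case after the fact could look something like this Python sketch. It assumes English month names followed by an ordinal day, and only rewrites `l` or `I` to `1` in that narrow context; the name `normalize_day` and the regex are illustrative, not anything from the article.

```python
import re

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)

# A month name, whitespace, one or two day characters that may be digits
# or typewriter-style stand-ins (l, I), then an ordinal suffix.
ORDINAL = re.compile(rf"\b({MONTHS})\s+([0-9lI]{{1,2}})(st|nd|rd|th)\b")

def normalize_day(text):
    """Rewrite 'October llth' as 'October 11th', leaving other text alone."""
    def repl(match):
        day = match.group(2).replace("l", "1").replace("I", "1")
        return f"{match.group(1)} {day}{match.group(3)}"
    return ORDINAL.sub(repl, text)

print(normalize_day("on October llth, 1923"))
print(normalize_day("the llama arrived April 1st"))
```

Because the substitution only fires right after a month name and right before an ordinal suffix, ordinary words containing "ll" are not touched.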
strogonoff
The latent font designer in me balks at the thought of taking a typeface and intentionally making one character look more like another character.
Was it some technical constraint of the typewriter that caused “1” to become more like “l” come the 20th century?
thedufer
Typewriter keys cost money, and dropping the 1 allowed them to drop a key without significantly affecting the use of it. As far as I can tell, that's effectively the entire rationale.
This wasn't meaningfully the case prior; the printing press would've just needed more copies of 'l' if they'd dropped the 1s, and letters weren't as significant a portion of the cost of the machine, anyway. And afterwards came computers, which need to distinguish between the characters even if they're displayed the same way.
hidingfearful
Was it that in prior years a reader could usually distinguish 1 from l by context? Even today, very few things cause me to need to te11 a 1 from a l.
(typo 0n purpose)
It matters when reading code and random strings (what we now call passwords, though back then passwords were things you could pronounce, unlike, say, ywtr466Nh%vX).
It doesn't matter for much else.
Though it did make for an interesting plot twist in The Miocene Arrow.
bediger4000
My parents had a typewriter without a 1 or a 0. I always thought it was to provide room for two other valuable characters like the old "cents" c with a bar through it.
Interesting! Be sure to follow the link to the second post about what happened to the 2nd, 3rd, 22nd, and 23rd. It's simpler but still worth the read:
https://drhagen.com/blog/the-missing-23rd-of-the-month/