Elementary Functions and Not Following the IEEE754 Floating-Point Standard (2020)
10 comments · February 11, 2025
zokier
That Zimmermann paper is far more useful than the article. Notably LLVM libm is correctly rounded for most single precision ops.
A notable omission is the crlibm/rlibm/core-math etc. libraries, which claim to be more correct, but I suppose we can already be pretty confident about them.
lifthrasiir
CORE-MATH is working directly with LLVM devs to get their routines into the LLVM libm, so no additional column is really required.
lifthrasiir
> All the standard Maths libraries which claim to be IEEE compliant I have tested are not compliant with this constraint.
It is possible to read the standard in a way that they still remain compliant. In my understanding, the standard, as of IEEE 754-2019, does not require recommended operations to be implemented with the standard's semantics; implementations are merely recommended to ("should") define recommended operations with the required ("shall") semantics. So if an implementation doesn't claim that a given recommended operation is compliant, the implementation itself remains compliant.
One reason I think this might be the case is that not all recommended operations have a known correctly rounded algorithm, in particular bivariate functions like pow (in fact, pow is the only remaining one at the moment IIRC). Otherwise no implementation would ever be compliant as long as those operations are defined!
adgjlsfhk1
It's really hard for me to see this as a good solution. Normal users should be using Float64 (where no similar fall-back to a wider type exists), and Float32 should only be used when Float64 computation is too expensive (e.g. on GPUs). In those cases, it's hard to believe that doing the math in Float64 and converting will make anyone happy.
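For reference, the trick under discussion amounts to something like the sketch below (the helper name is illustrative). It is easy to write but, because of double rounding, not provably correct for every input:

    #include <math.h>

    /* Evaluate the float function via the double routine, then round once
       back to float. Usually gives the correctly rounded float result,
       but double rounding can spoil it, and the double-precision sin()
       is typically not correctly rounded itself. */
    static float sinf_via_double(float x) {
        return (float)sin((double)x);
    }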
winocm
Off topic but tangentially related, here's a fun fact: DEC Alpha actually transparently converts IEEE single-precision floats (S-float) to double-precision floats (T-float, the register format) when performing register loads and operations.
stncls
Floating-point is hard, and standards seem like they cater to lawyers rather than devs. But a few things are slightly misleading in the post.
1. It correctly quotes the IEEE754-2008 standard:
> A conforming function shall return results correctly rounded for the applicable rounding direction for all operands in its domain
and even points out that the citation is from "Section 9. *Recommended* operations" (emphasis mine). But then it goes on to describe this as a "*requirement*" of the standard (it is not). This is not just a typo; the post actually implies that implementations not following this recommendation are wrong:
> [...] none of the major mathematical libraries that are used throughout computing are actually rounding correctly as demanded in any version of IEEE 754 after the original 1985 release.
or:
> [...] ranging from benign disregard for the standard to placing the burden of correctness on the user who should know that the functions are wrong: “It is following the specification people believe it’s following.”
As far as I know, IEEE 754 mandates correct rounding only for the basic operations (add, subtract, multiply, divide, fma, conversions) and sqrt().
2. All the mentions of 1 ULP in the beginning are a red herring. As the article itself mentions later, the standard never says anything about 1 ULP. Some people do care about 1 ULP, just because it is something that can be achieved at a reasonable cost for transcendentals, so why not do it. But not the standard. (Error in ULPs is also easy to measure; see the bit-pattern sketch after this list.)
3. The author seems to believe that 0.5 ULP would be better than 1 ULP for numerical accuracy reasons:
> I was resoundingly told that the absolute error in the numbers was too small to be a problem. Frankly, I did not believe this.
I would personally also tell that to the author. But there is a much more important reason why correct rounding would be a tremendous advantage: reproducibility. There is always only one correct rounding. As a consequence, with correct rounding, different implementations return bit-for-bit identical results. The author even mentions falling victim to FP non-reproducibility in another part of the article.
4. This last point is excusable because the article is from 2020, but "solving" the fp32 incorrect-rounding problem by computing in fp64 is naive (not guaranteed to always work, although it will with high probability) and inefficient. It also says nothing about what to do for fp64. We can do correct rounding much faster now [1, 2]. So much faster that it is getting really close to non-correctly-rounded performance, so some libm may one day decide to switch to that.
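On point 2, for concreteness: for finite floats of the same sign, the distance in ULPs is simply the gap between the raw bit patterns, so a measurement helper can be tiny (a sketch, not any library's API):

    #include <stdint.h>
    #include <string.h>

    /* Number of representable floats between two finite, same-signed
       floats: equal to the difference of their raw bit patterns. */
    static uint32_t ulp_distance(float a, float b) {
        uint32_t ua, ub;
        memcpy(&ua, &a, sizeof ua);
        memcpy(&ub, &b, sizeof ub);
        return ua > ub ? ua - ub : ub - ua;
    }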
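And on point 4: the fast correctly rounded implementations are built around Ziv's rounding test. Here is a minimal sketch of the idea with exp as the example; the 2^-50 relative error bound assumed for the double fast path is illustrative, not a proven bound:

    #include <math.h>

    /* Ziv's rounding test: evaluate with extra precision plus an error
       bound. If the whole interval [y-err, y+err] rounds to a single
       float, that float is the correctly rounded result; otherwise a
       slower, more precise path must decide. */
    static float cr_expf_sketch(float x, int *resolved) {
        double y = exp((double)x);       /* extra-precision fast path */
        double err = fabs(y) * 0x1p-50;  /* assumed, unproven bound   */
        float lo = (float)(y - err);
        float hi = (float)(y + err);
        *resolved = (lo == hi);          /* rounding determined?      */
        return lo;                       /* valid only if *resolved   */
    }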
SuchAnonMuchWow
3. >> I was resoundingly told that the absolute error in the numbers was too small to be a problem. Frankly, I did not believe this.
> I would personally also tell that to the author. But there is a much more important reason why correct rounding would be a tremendous advantage: reproducibility.
This is also what the author wants, judging from his own experiences, but failed to realize/state explicitly: "People on different machines were seeing different patterns being generated which meant that it broke an aspect of our multiplayer game."
So yes, the reasons mentioned as a rationale for more accurate functions are in fact a rationale for reproducibility across hardware and platforms. For example, going from 1 ulp errors to 0.6 ulp errors would not help the author at all, but having reproducible behavior would (even with an increased worst-case error).
Correctly rounded functions mean the rounding error is the smallest possible, and, as a consequence, every implementation will always return exactly the same results: this is the main reason why people (and the author) advocate for correctly rounded implementations.
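To see why merely accurate functions break bit-level reproducibility: a faithful (<= 1 ulp) implementation may legitimately return either of two adjacent floats, and adjacent floats never share a bit pattern, whereas the correctly rounded result is unique. A tiny demonstration:

    #include <inttypes.h>
    #include <math.h>
    #include <stdio.h>
    #include <string.h>

    /* Print a float and the float one ulp above it, with raw bit
       patterns: two 1-ulp-accurate libms could return either one. */
    int main(void) {
        float a = 0.1f;                 /* stand-in for a computed value */
        float b = nextafterf(a, 2.0f);  /* the adjacent float above it   */
        uint32_t ua, ub;
        memcpy(&ua, &a, sizeof ua);
        memcpy(&ub, &b, sizeof ub);
        printf("%.9g -> %08" PRIx32 "\n%.9g -> %08" PRIx32 "\n",
               a, ua, b, ub);
        return 0;
    }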
jcranmer
It's worth noting that the C standard explicitly disclaims correct rounding for these IEEE 754 functions (C23 §F.3 ¶20).
Also, there's a group of people who have been running tests on common libms, reporting their current accuracy states here: https://members.loria.fr/PZimmermann/papers/accuracy.pdf (that paper is updated ~monthly).
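For anyone wanting to reproduce such measurements, here is a sketch of an exhaustive single-precision check against a correctly rounded reference, assuming GNU MPFR is installed (link with -lmpfr -lgmp); it tests expf and glosses over subnormal double-rounding subtleties:

    #include <inttypes.h>
    #include <math.h>
    #include <stdio.h>
    #include <string.h>
    #include <mpfr.h>

    int main(void) {
        mpfr_t t;
        mpfr_init2(t, 24);  /* float has a 24-bit significand */
        uint64_t mismatches = 0;
        for (uint64_t u = 0; u <= UINT32_MAX; u++) {
            uint32_t bits = (uint32_t)u;
            float x;
            memcpy(&x, &bits, sizeof x);
            if (!isfinite(x)) continue;
            mpfr_set_flt(t, x, MPFR_RNDN);  /* exact: x fits in 24 bits */
            mpfr_exp(t, t, MPFR_RNDN);      /* correctly rounded exp    */
            float want = mpfr_get_flt(t, MPFR_RNDN);
            if (expf(x) != want) mismatches++;
        }
        printf("expf mismatches: %" PRIu64 "\n", mismatches);
        mpfr_clear(t);
        return 0;
    }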