
Why Does Integer Addition Approximate Float Multiplication?

HPsquared

The most significant bits of a floating-point representation are basically a logarithm of the number, and logarithms turn multiplication into addition.
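A minimal Python sketch of that idea (helper and function names are my own, not from the thread): reinterpret each float's bits as an integer, add the bit patterns, and subtract the bias offset 0x3F800000 (the bit pattern of 1.0) to recenter.

```python
import struct

def f2i(x: float) -> int:
    """Reinterpret a float32's bits as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def i2f(n: int) -> float:
    """Reinterpret an unsigned 32-bit integer's bits as a float32."""
    return struct.unpack("<f", struct.pack("<I", n))[0]

def approx_mul(a: float, b: float) -> float:
    # Adding bit patterns adds the log-like (exponent + mantissa) fields;
    # subtract 1.0's bit pattern so the bias isn't counted twice.
    return i2f(f2i(a) + f2i(b) - 0x3F800000)

print(approx_mul(3.0, 5.0))   # → 14.0 (true product is 15.0)
print(approx_mul(2.0, 8.0))   # exact for powers of two
```

Powers of two come out exact; in general the result lands within roughly 11% of the true product.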


sethaurus

This is the core trick in the Fast Inverse Square Root[0], as made famous by the Quake III source code.

It uses a shift (equivalent to dividing by two) and a subtraction in integer-land to estimate x^(-0.5) in float-land.

[0]: https://en.m.wikipedia.org/wiki/Fast_inverse_square_root

Animats

Neat. Of course it works for the exponent, but that it's not that far off for the mantissa is unexpected. It helps that the mantissa is normalized.

guyomes

For the mantissa, the insight is probably that:

(1+e1)*(1+e2) = 1+e1+e2+(e1*e2)

If e1 and e2 are small, then e1*e2 is negligible.
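A quick numeric check of that identity (values chosen arbitrarily): the gap between the exact product and the additive approximation is exactly the e1*e2 cross term.

```python
e1, e2 = 0.05, 0.03
exact = (1 + e1) * (1 + e2)   # 1.0815
approx = 1 + e1 + e2          # 1.08
print(exact - approx)         # the neglected cross term, e1*e2 = 0.0015
```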

thequux

Huh, and in the case that e1 and e2 are large, the mantissa overflows into the exponent, multiplying the result by 2.

amelius

I guess we're lucky that the IEEE 754 committee put the exponent in front of the mantissa. Or maybe they were aware of this trick all along ...

SushiHippie

Previous discussion of the paper:

Addition is all you need for energy-efficient language models - https://news.ycombinator.com/item?id=41784591 - Oct 2024 - 126 comments

nabla9

It's math, and not just for integers. That's also how a slide rule works.

a×b = 10^(log(a×b))

log(a×b) = log(a)+log(b)

thus a×b = 10^(log(a)+log(b))
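The same identity in code, using the plain math library rather than any float-bit tricks:

```python
import math

a, b = 3.0, 5.0
# Logs turn the multiplication into an addition; exponentiate to get back.
product = 10 ** (math.log10(a) + math.log10(b))
print(product)  # ≈ 15.0
```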

Sharlin

Yes, but what is nontrivial here is how good the approximation is, despite the mantissa part being linear (much better than the upper bound of being within 2x of the real value).

shiandow

Nontrivial yes, but it's not that complicated to figure out that floats are a piecewise linear approximation of the base 2 logarithm.

Edit: or base 2^k, depends a bit how you view it.

fatuna

Would subtraction also approximate division?
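It does: subtracting the bit patterns (and adding the 1.0 bias back) approximates division, by the same log-domain argument. A sketch under the same assumptions as above (helper names mine; ignores underflow when the bit difference would go negative):

```python
import struct

def f2i(x: float) -> int:
    return struct.unpack("<I", struct.pack("<f", x))[0]

def i2f(n: int) -> float:
    return struct.unpack("<f", struct.pack("<I", n))[0]

def approx_div(a: float, b: float) -> float:
    # Subtracting bit patterns subtracts the log-like fields;
    # add 1.0's bit pattern back so the bias isn't lost.
    return i2f(f2i(a) - f2i(b) + 0x3F800000)

print(approx_div(15.0, 3.0))  # → 5.5 (true quotient is 5.0)
print(approx_div(16.0, 2.0))  # exact for powers of two
```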


rowanG077

This can be very worth it in circuit design for custom accelerators. Floating-point operations are often multicycle; if you can get close enough with addition, it will save a ton of resources and probably also simplify the design.
