Gradients Are the New Intervals
53 comments
· May 31, 2025
mkeeter
Interesting post, thanks for the link!
You may also enjoy "Spelunking the Deep: Guaranteed Queries on General Neural Implicit Surfaces via Range Analysis", which uses interval arithmetic (ish) to raymarch neural implicit surfaces:
kragen
That's fantastic, thanks! I didn't know about neural implicit surfaces at all.
guyomes
A generalisation of this idea became known as the Taylor model in 1998 [1]. It may even have been known in 1984 as neighborhood arithmetic [2]. The generalisation works by taking a Taylor expansion of the function up to order n and then bounding the remainder using bounds on the partial derivatives of order n+1 [3].
[1]: https://www.bmtdynamics.org/cgi-bin/display.pl?name=rdaic
[2]: https://books.google.fr/books?id=2zDUCQAAQBAJ
[3]: https://en.wikipedia.org/wiki/Taylor%27s_theorem#Taylor's_th...
sfpotter
Worth pointing out that these ideas were already well known by Moore, the founder of interval arithmetic. Chapter 3 of his monograph "Methods and Applications of Interval Analysis" has the basic ideas worked out.
The folks working on the Taylor model were coming from physics where they had some nasty (very ill-conditioned) nonlinear systems that they needed to solve and developed a package called COSY which implements it.
My understanding is that the Taylor model is effective in practice, but there might have been some confusion around what it was actually capable of. I believe the Taylor model people claimed that their interval inclusions had better than quadratic excess, but this turned out not to be the case. Other people were able to push past the quadratic limit of the well known interval inclusions using other techniques. There are some recent papers by Hormann and Yap along these lines, although I think the first interval inclusion that is better than quadratic dates back further...
yorwba
I assume "recent papers by Hormann and Yap" is a reference to e.g. Range functions of any convergence order and their amortized complexity analysis https://www.inf.usi.ch/hormann//papers/Hormann.2023.RFO.pdf ? Cool stuff.
yorwba
The suggested normalization procedure, even with the ad-hoc fix for gradient discontinuities, doesn't actually ensure that the resulting function is 1-Lipschitz unless the gradient of the gradient magnitude vanishes. The signed-distance functions considered in the article seem to have piecewise constant gradient magnitudes (so are L-Lipschitz, just with L > 1) except for inside the "r", but for less well-behaved functions, higher order derivatives might start to matter.
meindnoch
Sure, but signed distance fields by definition have constant gradient magnitude, don't they? They measure the distance from the surface, which grows linearly in the normal direction.
But it is true that general implicit surfaces don't necessarily have constant-magnitude gradients. E.g. F(x,y) = x^2 + y^2 - 1 vs. F(x,y) = sqrt(x^2 + y^2) - 1
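A quick finite-difference check (a sketch, not from the article; the helper name is made up) makes the distinction concrete: the implicit form of the unit circle has gradient magnitude 2r at radius r, while the true SDF has magnitude 1 everywhere away from the center.

```python
import math

def grad_mag(F, x, y, h=1e-6):
    # Central-difference estimate of |grad F| at (x, y).
    dx = (F(x + h, y) - F(x - h, y)) / (2 * h)
    dy = (F(x, y + h) - F(x, y - h)) / (2 * h)
    return math.hypot(dx, dy)

implicit = lambda x, y: x**2 + y**2 - 1        # implicit form, not an SDF
sdf      = lambda x, y: math.hypot(x, y) - 1   # true SDF of the unit circle

for r in (0.5, 1.0, 2.0):
    # implicit: magnitude 2r; sdf: magnitude 1
    print(r, grad_mag(implicit, r, 0.0), grad_mag(sdf, r, 0.0))
```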
kragen
This is a good point!
diabllicseagull
I worked on a patent some years ago about SDF CSG tree pruning and constant-radius filleted blends. Sadly, patents don't get the same visibility journals enjoy.
https://patentimages.storage.googleapis.com/7a/73/2d/8d2eeca...
jacobolus
In theory, the purpose of a patent is to encourage sharing inventions so people can build on each other's work. In practice, the modern purpose of a patent is to claim ownership over ideas and block further innovation so the author can extract rents via the court system; they are typically written to be as vague and inscrutable as possible, to cover a wider range of possible alternative inventions someone else might come up with, with no incentive for clarity. There's generally little reason for someone who isn't a lawyer to read a patent that hasn't expired yet (still a decade away in this case).
A paper is usually better: the goal is very explicitly sharing knowledge, and there are peer reviewers and editors whose job is to make sure the writing is clear.
mitthrowaway2
One reason for reading patents that haven't expired yet is if you're trying to evaluate an offer from a startup which has patents on their core technology. It can be worth understanding how strong or weak their technology position is.
marcosdumay
If you are after visibility, you can always do both. But any patent will push people away from your work.
3abiton
It started interestingly, but then:
> This blog post assumes a vague understanding of implicit surface rasterization, and how interval arithmetic is used to both skip regions of space and simplify complex expressions.
Can anyone give me a quick rundown of the article?
meindnoch
Which part do you not understand?
An implicit surface is defined by F(x,y,z) = 0. Implicit surfaces can be rendered by checking whether the camera ray going through pixel x,y intersects the surface or not. An axis-aligned bounding box (AABB) in 3D is the product of intervals [x_min, x_max]x[y_min, y_max]x[z_min, z_max]. The interval extension of F(x,y,z) gives you an interval [F_min, F_max] for such an AABB. If [F_min, F_max] doesn't contain 0, then you can discard the whole AABB for rendering purposes, because the surface is guaranteed not to be there. This can speed up rendering substantially.
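As a toy sketch of that pruning test (the `Interval` class and the choice of F here are illustrative, not the article's code; a rigorous version would also round endpoints outward):

```python
# Minimal interval arithmetic, enough to evaluate F(x,y,z) = x^2 + y^2 + z^2 - 1
# (the unit sphere) over an axis-aligned box and discard boxes that provably
# don't touch the surface.

class Interval:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
    def __add__(self, o):
        o = o if isinstance(o, Interval) else Interval(o, o)
        return Interval(self.lo + o.lo, self.hi + o.hi)
    def __sub__(self, o):
        o = o if isinstance(o, Interval) else Interval(o, o)
        return Interval(self.lo - o.hi, self.hi - o.lo)
    def sq(self):
        # Tight enclosure of x^2: never negative, even when the interval
        # straddles zero.
        if self.lo >= 0: return Interval(self.lo**2, self.hi**2)
        if self.hi <= 0: return Interval(self.hi**2, self.lo**2)
        return Interval(0.0, max(self.lo**2, self.hi**2))
    def contains_zero(self):
        return self.lo <= 0.0 <= self.hi

def F(x, y, z):
    return x.sq() + y.sq() + z.sq() - 1

# Box far from the unit sphere: [F_min, F_max] excludes 0, so discard it.
out = F(Interval(2, 3), Interval(2, 3), Interval(2, 3))
print(out.contains_zero())   # False -> discard this AABB

# Box straddling the surface: can't be discarded, must subdivide.
near = F(Interval(0.5, 1.5), Interval(-0.5, 0.5), Interval(-0.5, 0.5))
print(near.contains_zero())  # True -> recurse
```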
3abiton
Kudos for replying, that's already quite helpful.
sfpotter
Cool post.
The Lipschitz trick here relies on the assumption that the gradient has magnitude 1 everywhere. Evaluating the SDF in a box and using the Lipschitz property lets you get some quick estimates on the range of the SDF over the box.
This is a close cousin of the monotonicity interval inclusion: e.g., if f'([a, b]) > 0, then f([a, b]) is a subset of [f(a), f(b)]. Rounded interval arithmetic makes this rigorous on a computer; you can also do it in multiple dimensions, with higher derivatives, successively tightening inclusions for lower derivatives, etc.
ajkjk
Don't know much about this kind of thing, but it seems like it would be useful, if possible, to store the actual displacement vector (dx, dy) to the surface instead of just its magnitude, the SDF. I can't evaluate the tradeoff of storing an extra value, but it seems like it would make the recursion in the first section a lot easier: when you test whether a particular box contains the object, if [v - d, v + d] contains 0, you would know much more about where to look: you need only continue the iteration on subdivisions that lie in the correct direction.
fph
One of the benefits of intervals is that you can ensure your results are correct irrespective of floating-point errors, if you carefully round all computations in the correct direction.
I don't think you can ensure that with gradients though: if f and f' are computed in machine arithmetic, cancellation errors might pile up.
kragen
It's true; this is a different application of interval arithmetic, not the usual application, which is, as you say, to avoid numerical stability issues. By and large, graphics researchers don't give a shit about numerical stability issues. They're the Ed Wood of numerical computation.
sfpotter
The point of interval arithmetic isn't really to deal with numerical stability, per se... it's to make it possible to do rigorous and validated computations on a computer. Numerical stability is a different concept that is somewhat orthogonal. I could carry out a numerically unstable computation using interval arithmetic, but using interval arithmetic wouldn't magically make the computation stable, it would just give you error bounds (which may indeed be quite loose and unhelpful if the algorithm is unstable).
constantcrying
>I don't think you can ensure that with gradients though: if f and f' are computed in machine arithmetic, cancellation errors might pile up.
Yes, you can. You need to do automatic differentiation with interval arithmetic. This gives you a mathematically correct result for the derivative.
Always keep in mind with interval arithmetic that (-inf, inf) is also a mathematically correct result. Keeping intervals small is the challenge with interval arithmetic and means that every algorithm not specifically developed for it is likely going to lead to useless results.
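A minimal sketch of what that combination might look like in forward mode (the `IDual` class is made up for illustration, and outward rounding of endpoints is omitted, so this is not yet rigorous):

```python
# Forward-mode automatic differentiation where both the value and the
# derivative are carried as intervals, so an enclosure of f' is produced
# alongside the enclosure of f.

class IDual:
    def __init__(self, v, d):
        self.v = v  # (lo, hi) enclosure of the value
        self.d = d  # (lo, hi) enclosure of the derivative

    def __add__(self, o):
        return IDual((self.v[0] + o.v[0], self.v[1] + o.v[1]),
                     (self.d[0] + o.d[0], self.d[1] + o.d[1]))

    def __mul__(self, o):
        # Interval product: min/max over all endpoint combinations.
        def imul(a, b):
            ps = [a[0]*b[0], a[0]*b[1], a[1]*b[0], a[1]*b[1]]
            return (min(ps), max(ps))
        # Product rule (uv)' = u'v + uv', each term as an interval.
        t1, t2 = imul(self.d, o.v), imul(self.v, o.d)
        return IDual(imul(self.v, o.v), (t1[0] + t2[0], t1[1] + t2[1]))

def var(lo, hi):
    # Independent variable ranging over [lo, hi]; its derivative is exactly 1.
    return IDual((lo, hi), (1.0, 1.0))

# f(x) = x*x over x in [1, 2]: value in [1, 4], derivative 2x in [2, 4].
x = var(1.0, 2.0)
y = x * x
print(y.v, y.d)   # (1.0, 4.0) (2.0, 4.0)
```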
helltone
I think this could perhaps be done in other ways that don't require interval arithmetic for the autodiff, only that the gradient is computed conservatively; in other words, carrying the numerical error from f into f'.
fph
> You need to do automatic differentiation with interval arithmetic.
But that kinda defeats the point of replacing interval arithmetic with gradients, though.
kragen
This doesn't help if your computation for f is numerically unstable, unless you compute f with interval arithmetic too.
constantcrying
>In this case, "the Lipschitz property" means that the gradient of the distance value is bounded
This is total nonsense. The point of Lipschitz continuity is that it is more than continuity and less than differentiability. If you assert that the function is differentiable, the concept loses all meaning. It is specifically interesting because you do not have to assume differentiability.
mkeeter
(OP here)
This is taken directly from the paper's introduction, which admittedly uses the more specific terminology of "1-Lipschitz signed distance bounds".
The paper cites the original Hart '96 paper on sphere tracing; quoth Hart, "a function is Lipschitz if and only if the magnitude of its derivative remains bounded".
https://graphics.stanford.edu/courses/cs348b-20-spring-conte...
I wonder if there's a terminology schism here between computer graphics and numerical analysis folks.
sfpotter
The concept of a Lipschitz function comes from mathematical analysis; neither computer graphics nor numerical analysis. It's straightforward to find the definition of a Lipschitz function online, and it is not in terms of its derivative. If a function is differentiable, then your quote applies; but again, it isn't the definition of a Lipschitz function.
I'd say this is a little pedantic, save for the fact that your function of interest (an SDF) isn't a differentiable function! It has a big, crucially important subset of points (the caustic sets) where it fails to be differentiable.
constantcrying
>I wonder if there's a terminology schism here between computer graphics and numerical analysis folks.
The first group just pretends every function has a derivative (even when it clearly does not), the other doesn't.
The linked Wikipedia article gets it exactly right, I do not know why you would link to something which straight up says your definition is incorrect.
There is no point in talking about Lipschitz continuity while assuming that there is a derivative; you assume Lipschitz continuity because it is a weaker assumption. The key reason Lipschitz continuity is interesting is that it allows you to talk about functions without a derivative almost as if they had one. It is the actual thing which makes any of this work.
kragen
There are functions that are differentiable but not Lipschitz, e.g. sin(1/x) on (0, ∞). In this context the Lipschitz property allows you to do things that mere differentiability doesn't, and, in the context of assuming that your function is differentiable, the definition given for the Lipschitz property is correct. It's true that there are also functions that are Lipschitz-continuous but not differentiable, but in this context they aren't interesting.
constantcrying
> It's true that there are also functions that are Lipschitz-continuous but not differentiable, but in this context they aren't interesting.
They absolutely are, even in the article. E.g. the max function or the 1-norm are non-differentiable, but Lipschitz continuous.
There is literally no point in talking about Lipschitz continuity for differentiable functions, because it is equivalent to the derivative being bounded.
kragen
Hmm, I guess you're right. But those functions (highly relevant here!) are differentiable almost everywhere; is that enough for Lipschitz continuity to be equivalent to the derivative being bounded?
I still think there's a point in talking about Lipschitz continuity for differentiable functions because the algorithms they're talking about depend on the Lipschitz property, not the differentiability.
ajkjk
The article is talking about functions whose change over a displacement dx is bounded not just by K |dx| for some K, but specifically with K = 1. So it's not quite the same thing as the Lipschitz property in analysis, although I guess it's inspired by it; but as a result it's concrete and useful for algorithms.
qbit42
If a function is Lipschitz, then it is differentiable almost everywhere by Rademacher's theorem. Moreover, if you want to prove things about Lipschitz functions, you can often (though not always) prove things about continuously differentiable functions with bounded gradients, and then lift to all Lipschitz functions via an appropriate compactness argument.
sfpotter
This is a great point... and Rademacher's theorem is one of my favorites. But this is in the context of a numerical algorithm which proposes to replace interval arithmetic with a direct evaluation of a gradient. For an SDF, the gradient fails to exist in many places; and since we're working with it on a computer and not doing math, this algorithm will eventually evaluate the gradient at a point where it is undefined (finite number of points, p > 0). On the other hand, it's straightforward to come up with a useful interval inclusion for the gradient which is defined even in these places (it should contain the subgradient of the function at each point). So, I am personally not convinced of the value of the proposed approach.
qbit42
Yeah in context I somewhat agree, though the utility for graphics applications probably comes down to some more empirical aspects that I won't conjecture about. I imagine there is some stuff you could do in this setting by incorporating autograd derivatives from many slightly perturbed points in a neighborhood of the input point (which together act as a coarse approximation of the subdifferential set).
dataflow
> This is total nonsense. The point of Lipschitz continuity is that it is more than continuity and less than differentiability. If you assert that it is differentiable the concept loses all meaning.
How is Lipschitz less than differentiability? e^x is differentiable everywhere but not Lipschitz continuous everywhere, right?
constantcrying
>How is Lipschitz less than differentiability? e^x is differentiable everywhere but not Lipschitz continuous everywhere, right?
You should compare the local properties.
dataflow
Nobody said local anywhere though, right? You just said the statement was nonsense where it clearly makes sense at least globally.
But now that we've moved the goalposts to local: what about f(x) = x^(3/2) sin(1/x), f(0) = 0? It's differentiable yet not locally Lipschitz-continuous on (0, 1]. (Taken straight from Wikipedia)
srean
> This is total nonsense... If you assert that it is differentiable the concept loses all meaning.
Citation please.
Yes Lipschitz functions need not be differentiable but when they are that bound holds and is very useful. Using that bound is bread and butter in convex analysis of differentiable convex functions.
Curb the snark.
constantcrying
>Citation please.
The article links to the Wikipedia page, which gets it right.
"An everywhere differentiable function g : R → R is Lipschitz continuous (with K = sup |g′(x)|) if and only if it has a bounded first derivative"
Lipschitz continuity for differentiable functions is just having a bounded derivative. The Lipschitz property suddenly becomes uninteresting, as it just falls out of the assumptions; the interesting fact, which allows you to use non-differentiable functions, is that not assuming differentiability but assuming Lipschitz continuity is enough.
>Using that bound is bread and butter in convex analysis of differentiable convex functions.
It also is the bread and butter in the analysis of PDEs, but it is the bread and butter because Lipschitz continuity is a weaker property than differentiability. In the context of the article you want to talk about non-differentiable functions, e.g. max, which you couldn't if you assumed differentiability.
The reason this is important is that choosing Lipschitz over differentiable is what makes all this work.
srean
Yes we know that. I object to 'loses all meaning' when the function is differentiable. It certainly doesn't.
It gives the special property that the derivative is bounded, a property that's not true for arbitrary differentiable functions.
porridgeraisin
Yes.
I think the author meant that the rate of change is bounded [without the function needing to be differentiable] but used the term "gradient" anyway. You probably shouldn't use the term "rate of change" either. I guess
> The Lipschitz property means the function's output doesn't change faster than linearly with input
is precise in that context.
constantcrying
He even gives the definition for the L=1 Lipschitz condition afterwards.
Lipschitz continuity means the difference quotient is globally bounded (where it exists). Differentiability, by definition, means the difference quotient converges at every point.
These are somewhat related, but different things.
kragen
This is very exciting! It seems like a lot of the interval stuff is bringing to fruition my idle speculations from 6 years ago in https://dercuano.github.io/notes/interval-raymarching.html. I'll have to read this carefully to see if it's actually the same approach or a different one that uses the same algebras.