Kalman Filter Tutorial

30 comments

· January 18, 2025

rsp1984

I always say this whenever the topic of Kalman Filters comes up:

If you're learning the Kalman Filter in isolation, you're kind of learning it backwards and missing out on huge "aha" moments that the surrounding theory can unlock.

To truly understand the Kalman Filter, you need to study Least Squares (aka linear regression), then recursive Least Squares, then the Information Filter (which is a different formulation of the KF). Then you'll realize the KF is just recursive Least Squares, reformulated to prioritize efficiency in the update step.

This PDF gives a concise overview:

[1] http://ais.informatik.uni-freiburg.de/teaching/ws13/mapping/...
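To give a flavor of the connection, here's a recursive Least Squares update sketched in Python (notation and function names are mine, not from the PDF). The Kalman filter puts a prediction step in front of exactly this kind of update:

```python
import numpy as np

# Recursive least squares for y = h @ x + noise, one observation at a time.
# Structurally this is already the Kalman measurement update, minus the
# motion/prediction model.
def rls_update(x, P, h, y, r=1.0):
    # x: current estimate; P: its covariance;
    # h: regressor row; y: scalar observation; r: observation noise variance
    h = h.reshape(1, -1)
    S = (h @ P @ h.T).item() + r       # innovation variance
    K = (P @ h.T) / S                  # gain: trust in this observation
    x = x + (K * (y - (h @ x).item())).ravel()
    P = P - K @ h @ P                  # uncertainty shrinks with each row
    return x, P
```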

bradly

I appreciate you taking the time to help people understand higher level concepts.

From a different perspective... I have no traditional background in mathematics or physics. I do not understand the first line of the PDF you posted, nor do I understand the process for obtaining the context to understand it.

But I have intellectual curiosity. So the best path forward for my understanding is one that can maintain that curiosity while making progress. I can reread Six Not-So-Easy Pieces without understanding any of it and still find value in it. I can play with Arnold's cat and, slowly, through no scientific rigor other than the curiosity of the naked ape, experience these concepts that have traditionally been behind gates of context I do not possess keys to.

http://gerdbreitenbach.de/arnold_cat/cat.html

mtizim

With no mathematical rigor there is no mathematical understanding. You are robbing yourself, as the concepts are meaningless without the context.

Truly appreciate the power of linear approximations by going through algebra, appreciate the tricks of calculus, marvel at the inherent tradeoffs of knowledge with estimator theory, and see the joy of the central limit theorem being true. All of this knowledge is free, and much more interesting than a formal restatement of "it was not supposed to rain, but I see clouds outside, I guess I'll expect light rain instead of a big thunderstorm".

bradly

> With no mathematical rigor there is no mathematical understanding. You are robbing yourself, as the concepts are meaningless without the context.

I will think more about this, but I'm not sure I agree. I have enjoyed reading Feynman talk about twins, one of whom goes on a near-light-speed vacation, without understanding the math. Verisimilitude allows modeling understanding with only a scalar representation of scientific knowledge, so why not?

Of course I would like to understand the math in its purest form, just the same as I wanted to read 1Q84 in Japanese to fully experience it in its purest form, but my life isn't structured in a way where that is realistic, even if knowledge of the Japanese language is free.

> Truly appreciate the power of linear approximations by going through algebra, appreciate the tricks of calculus, marvel at the inherent tradeoffs of knowledge with estimator theory, and see the joy of the central limit theorem being true.

I can't even FOIL, so the journey toward understanding can feel unattainable with the time resources I have. This absolutely may be a limiting belief, but the notion that knowledge is free ignores the time cost for those exploring it outside of an academic or professional setting.

keithalewis

[flagged]

bradly

> Just stop whining about it in public.

I'm curious if this is how my reply came across?

jbullock35

I found this article invaluable for understanding the Kalman filter from a Bayesian perspective:

Meinhold, Richard J., and Nozer D. Singpurwalla. 1983. "Understanding the Kalman Filter." The American Statistician 37 (May): 123–27.

jampekka

I think the easiest way depends on your background knowledge. If you understand linearity of the Gaussian distribution and the Bayesian posterior of Gaussians, the Kalman filter is almost trivial.

For the 1D case we get the prior from the linear prediction X'1 = a*X0 + b, for which mean(X'1) = a*mean(X0) + b and var(X'1) = a^2*var(X0), where a and b give the assumed dynamics.

The posterior for Gaussians is the precision-weighted mean of the prior and the observation: X1 = (1 - K)*X'1 + K*Y, where the weighting K = (1/var(Y))/(1/var(X'1) + 1/var(Y)), with Y being the Gaussian observation. The posterior variance is var(X1) = 1/(1/var(X'1) + 1/var(Y)), which is what lets the recursion continue.

Iterating this gives the Kalman filter. Generalizing this to multiple dimensions is straightforward given the linearity of multidimensional Gaussians.

This is what makes it really simple to me (after having understood it), but things like the linearity of (multidimensional) Gaussians and the Gaussian posterior probably aren't simple in themselves.
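Transcribed directly into Python, one step of that recursion is tiny (a sketch; I've added an optional process-noise term q that the noiseless prediction above omits):

```python
# One step of the 1D Kalman filter, exactly the recursion above:
# predict with x' = a*x + b, then precision-weight x' against the observation y.
def kalman_1d_step(mean, var, y, a=1.0, b=0.0, r=1.0, q=0.0):
    # Predict: push the current belief through the linear dynamics.
    mean_p = a*mean + b
    var_p = a*a*var + q              # q = 0 recovers the noiseless case above
    # Update: the gain weights the observation by its precision 1/r.
    K = (1/r) / (1/var_p + 1/r)
    mean = (1 - K)*mean_p + K*y
    var = 1 / (1/var_p + 1/r)        # posterior precision = sum of precisions
    return mean, var
```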

jtrueb

You can keep telling this, but this “esoteric” math is often too much for the people actually implementing the filters.

jampekka

FWIW, I think I understand Kalman filters quite well, but the linked PDF is hard for me to follow, and I'd really struggle to understand it if I didn't already know what it's saying.

I think the lesson there is that the Kalman filter is simpler in the "information form" where the Gaussian distribution is parameterized using the inverse of the covariance matrix.

If you don't already know what that means, you likely won't get much out of it. I think the more intuitive way is to first understand the 1D case, where the filter result is a weighted average of the prediction and the observation, and the weights are the multiplicative inverses of the respective variances (the less uncertainty/"imprecision", the more weight you give).

In the multidimensional case the inverse is the matrix inverse, but the logic is the same.
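In code the multidimensional fusion might look like this (a sketch of the information-form update, assuming the observation lives in the same space as the state):

```python
import numpy as np

# Precision-weighted fusion of a prediction (x_p, P_p) with an
# observation (y, R): the matrix analogue of the 1D weighted average.
def fuse(x_p, P_p, y, R):
    P_p_inv = np.linalg.inv(P_p)         # precision of the prediction
    R_inv = np.linalg.inv(R)             # precision of the observation
    P = np.linalg.inv(P_p_inv + R_inv)   # posterior covariance
    x = P @ (P_p_inv @ x_p + R_inv @ y)  # precision-weighted mean
    return x, P
```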

More generally, the idea is to statistically predict the next step from the previous one and then balance the prediction against the noisy observation, based on the confidence you have in each. This intuition covers all Bayesian filters. The Kalman filter is the special case of the Bayesian filter where the prediction is linear and all uncertainties are Gaussian, although it was understood this way only well after Kalman invented the eponymous filter.

Not sure how intuitive that is either, but don't be too worried if these things aren't obvious; they aren't until you know all the previous steps. To implement or use a Kalman filter you don't really need this statistical understanding.

If you prefer to understand things more "procedurally", check out the particle filter. It's conceptually the Bayesian filter, but it doesn't require the mathematical analysis. That's how I really understood the underlying logic.
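To make "procedural" concrete, a bootstrap particle filter step really is this short (a toy sketch for a 1D random walk; the dynamics and noise values are placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_step(particles, y, q=0.1, r=1.0):
    # Predict: push every particle through the (here: random-walk) dynamics.
    particles = particles + rng.normal(0.0, np.sqrt(q), size=particles.shape)
    # Weight: how well does each particle explain the observation y?
    w = np.exp(-(y - particles)**2 / (2*r))
    w /= w.sum()
    # Resample: keep particles in proportion to their weights.
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx]
```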

defrost

It's bread-and-butter math for physics, engineering (trad. engineering), geophysics, signal processing, etc.

Why would anyone have people implementing Kalman filters who find the math behind them "esoteric"?

Back in the day, in my wet-behind-the-ears phase, the first time I implemented a Kalman Filter from scratch the application was magnetic heading normalisation on mag data from an airborne geophysical survey: 3-axis nanotesla sensor inputs on each wing and the tail boom, requiring a per-survey calibration pattern to normalise the readings over a fixed location regardless of heading.

This was buried in a suite requiring calculation of the geomagnetic reference field (a big parameterised spherical harmonic equation), upward and downward continuation and reduction-to-pole of the magnetic field, raw GPS post-processing corrections, etc.

("etc" here goes on for a shelf full of books dense with applied mathematics.)

IgorPartola

I understood it as re-estimation with a dynamic weight factor based on the perceived error. I know it's more complex than that, but this simplified version was what I needed at one point, and it worked.

dr_kiszonka

You are probably right, but many folks following your advice will give up halfway through and never get to the KF.

raincom

That's how one should learn any subject, be it physics, chemistry, math, etc. However, textbooks don't follow that approach.

ryan-duve

I strongly recommend Elements of Physics by Millikan and Gale for anyone who wants to learn pre-quantum physics this way.

jvanderbot

Are you me? I feel like I say this every time too! Perfectly captured.

jtrueb

Every time that one comes up, this one comes up: https://github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Pyt... (and vice versa)

david_draco

As far as I am aware, there is no symbolic computing tool yet for probability distributions. For example, multiplying two multivariate Gaussian PDFs together and getting the covariance matrix out, or defining all the ingredients of a Kalman filter (prediction model and observation process) and getting the necessary formulas out (as with sympy's lambdify).
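You can at least grind through the scalar case with sympy by working on log-densities directly. A sketch of my own (not an existing distribution-algebra package) that recovers the product-of-Gaussians precision and mean:

```python
import sympy as sp

x, m1, m2 = sp.symbols('x m1 m2', real=True)
v1, v2 = sp.symbols('v1 v2', positive=True)

# Log of the product of two 1D Gaussian PDFs (normalisation constants dropped).
logp = sp.expand(-(x - m1)**2/(2*v1) - (x - m2)**2/(2*v2))

prec = -2*logp.coeff(x, 2)     # combined precision: 1/v1 + 1/v2
mean = logp.coeff(x, 1)/prec   # precision-weighted mean

print(sp.simplify(prec - (1/v1 + 1/v2)))                  # 0
print(sp.simplify(mean - (m1/v1 + m2/v2)/(1/v1 + 1/v2)))  # 0
```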

pmarreck

Something occurred to me a while back: can we somehow treat events that have only eyewitness testimony with a Kalman filter, in order to strengthen the evidential value of the observations after encoding them into vectors of some sort?

This would treat both lying and inaccuracy as "error".

I'm thinking of things like: reports of Phoenix lights or UFOs in general, ghosts, NDEs, and more prosaically, claims of rape

plasticchris

Only if you can make a linear model of those things…

bradly

Why does the model need to be linear?

pinkmuffinere

“Kalman filter” usually refers to the “linear quadratic estimator”, which assumes a linear model in its derivation. This affects the “predict” step at the very least, and I believe also the way the uncertainty propagates. There are nonlinear estimators as well, though they usually have less-nice guarantees (e.g., the particle filter and the extended Kalman filter).

Edit: in fact, I see part three of the book in tfa is devoted to nonlinear Kalman filters. I suspect some of the crowd (myself included) just assumed we were talking about linear Kalman filters
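Concretely, the linearity shows up in the predict step, where the state and its uncertainty are pushed through the same matrix (a generic sketch, with F and Q being whatever your model supplies):

```python
import numpy as np

def predict(x, P, F, Q):
    # Linear dynamics: the state moves by F, and the covariance propagates
    # exactly as F P F^T plus process noise Q. A nonlinear model has no
    # single F, which is what breaks the plain (linear) Kalman filter.
    x = F @ x
    P = F @ P @ F.T + Q
    return x, P
```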

dpflan

Anyone else watch the Michel van Biezen (with the bow tie) lectures on Kalman Filters while learning this topic?

- https://www.youtube.com/watch?v=CaCcOwJPytQ&list=PLX2gX-ftPV...

dang

Related. Others?

Kalman filter from the ground up - https://news.ycombinator.com/item?id=37879715 - Oct 2023 (150 comments)

(also what's the best year to put in the title above?)

blharr

The first example, tracking: is this the same thing as dead reckoning? I've always been confused by the term "tracking", since it is used a lot in common speech but seems to mean some specific type of 'tracking' here.

hansvm

Kind of.

"Tracking", here, means providing some kind of `f(time) -> space` API.

Dead reckoning is a mechanism for incorporating velocity and whatnot into a previously estimated position to estimate a new position (and is also one possible way to implement tracking, usually with compounding errors).

The Kalman filter example is better than just dead reckoning. For a simple example, imagine you're standing still but don't know exactly where. You have an API (like GPS) that can estimate your current position within some tolerance. If you're able to query that API repeatedly and the errors aren't correlated, you can pinpoint your location much more precisely.
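A toy version of that stationary case (the numbers are invented, and I'm assuming uncorrelated Gaussian errors on each fix):

```python
import random

true_pos, meas_var = 12.0, 4.0   # made-up ground truth and GPS noise variance
est, est_var = 0.0, 1e6          # start out nearly ignorant

for _ in range(50):
    z = random.gauss(true_pos, meas_var**0.5)  # one noisy GPS fix
    k = est_var / (est_var + meas_var)         # gain: trust in the new fix
    est += k * (z - est)
    est_var *= 1 - k                           # uncertainty shrinks every fix

print(est, est_var)  # est -> ~12.0, est_var -> ~meas_var/50
```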

Back to tracking with non-zero velocity: every new position estimate (e.g., from GPS) can be combined with all the information you've seen so far, adjusting your estimates of velocity, acceleration, and position, giving you not only a much more accurate current estimate but also better data for dead-reckoning estimates while you wait for the next external signal.

The technique (Kalman Filter) is pretty general. It's just merging all your noisy sources of information according to some ruleset (real-world physics being a common ruleset). You can tack on all sorts of other interesting information, like nearby wifi signals or whatever, and even very noisy signals can aggregate to give precise results.

Another application I threw it at once was estimating my true weight, glycogen reserves, ..., from a variety of noisy measurements. The sky's the limit. You just need multiple measurements and a rule for how they interact.

defrost

Dead reckoning is a form of prediction: based on past evidence that indicates where you were then, you reckon (best-guess) a current position and determine a direction to move forward to reach some target.

"Past evidence that indicates" is deliberate phrasing, in the majority of these examples we are looking at acquired data with noise; errors, instrument noise, missing returns, etc.

"Tracking" is multi-stage, there's a desired target to be found (or to be declared absent) in noisy data .. that's pattern search and locking, the trajectory (the track) of that target must be best guessed, and the best guess forward prediction can be used to assist the search for the target in a new position.

This is not all that can be done with a Kalman filter, but it's typical of a class of common applications.

einpoklum

The one sentence you really need to know:

"The filter is named after Rudolf E. Kálmán (May 19, 1930 – July 2, 2016). In 1960, Kálmán published his famous paper describing a recursive solution to the discrete-data linear filtering problem."

magic_hamster

Not sure why, but I get this vague notion that the author might have written a book.