Computing the Kalman Filter: connecting the pieces
The Kalman filter is an algorithm for updating probability distributions with new observations made under noisy conditions. It is used in everything from smartphone GPS navigation to large-scale climate simulations, and it is implemented on hardware ranging from embedded devices to supercomputers. It is important and widely used.
In this post I will
- Show how to compute the Kalman Filter with BLAS/LAPACK
- Relate this to old work on sympy.stats
- Analyze the computation graph for some flaws and features
Results
The Kalman filter can be defined as a pair of matrix expressions. We convert these matrix expressions into a computation, a DAG of BLAS/LAPACK calls, and from there down to Fortran code. This is a useful result.
History with stats
Two years ago I wrote two blogposts about the Kalman filter, one on the univariate case and one on the multivariate case. At the time I was working on sympy.stats, a module that enabled uncertainty modeling through the introduction of random variables into the SymPy language.
SymPy stats was designed with atomicity in mind. It tried to be as small and thin a layer as possible.
- It turned queries on continuous random expressions into integral expressions.
- It turned queries on discrete random expressions into iterators and sums.
- It also turned queries on multivariate normal random expressions into matrix expressions.
The goal was to turn a specialized problem (uncertainty quantification) into general ones (integrals, sums, matrix expressions), under the assumption that tools for solving general problems are much richer than tools for specific ones.
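As a rough illustration of that reduction, here is a sketch using the current sympy.stats interface (not code from the original posts):

```python
from sympy import Symbol
from sympy.stats import Normal, E, variance

# A univariate normal random variable in the SymPy language
mu = Symbol('mu')
X = Normal('X', mu, 1)

# Queries on random expressions reduce to ordinary symbolic problems:
# expectations and variances become integrals that SymPy then solves.
print(E(X))          # mu
print(variance(X))   # 1
```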
The first blogpost on the univariate Kalman filter produced integral expressions that were then solved by SymPy’s integration routines. The second blogpost on the multivariate Kalman filter generated the following matrix expressions:
\(\mu' = \mu + \Sigma H^T \left( R + H \Sigma H^T \right)^{-1} \left(H\mu - \textrm{data} \right)\)
\(\Sigma' = \left( \mathbb{I} - \Sigma H^T \left(R + H \Sigma H^T \right)^{-1} H \right) \Sigma\)
That blogpost finished with this result, claiming that the job of sympy.stats was done and that any of the popular numerical linear algebra packages could pick up from that point.
These two equations are the Kalman filter and are our starting point for today. Matrix expressions are an intermediate representation layer between sympy.stats and sympy.computations.matrices.
By composing these projects we compile high-level statistical expressions of sympy.stats to intermediate-level matrix expressions, then to the DAGs of sympy.computations, and even down to low-level Fortran code. If we add a traditional Fortran compiler we can build working binaries directly from sympy.stats.
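The two update equations above can be written down directly as SymPy matrix expressions. Here is a minimal sketch (the symbolic dimensions n and k and the variable names are illustrative, not the original source code):

```python
from sympy import MatrixSymbol, Identity, Symbol

n, k = Symbol('n'), Symbol('k')  # state and observation dimensions (symbolic)
mu    = MatrixSymbol('mu', n, 1)
Sigma = MatrixSymbol('Sigma', n, n)
H     = MatrixSymbol('H', k, n)
R     = MatrixSymbol('R', k, k)
data  = MatrixSymbol('data', k, 1)

# The Kalman filter as a pair of matrix expressions
new_mu    = mu + Sigma*H.T*(R + H*Sigma*H.T)**-1*(H*mu - data)
new_Sigma = (Identity(n) - Sigma*H.T*(R + H*Sigma*H.T)**-1*H)*Sigma
```

Because everything stays symbolic, shapes are checked at construction time and the expressions can be handed off to any backend that understands matrix expressions.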
Features and Flaws in the Kalman graph
Let’s investigate the Kalman computation. It contains a few notable features.
First, unlike previous examples it has two outputs, new_mu and new_Sigma. These two have large common subexpressions like H*Sigma*H' + R. You can see that these were computed once and shared.
Second, let’s appreciate that H*Sigma*H' + R is identified as symmetric positive definite, allowing the more efficient POSV routine. I’ve brought this up before in artificial examples. It’s nice to see that it occurs in practice.
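The same specialization is available from SciPy, which dispatches to ?POSV when told the matrix is positive definite. A sketch (the matrices here are made up for illustration):

```python
import numpy as np
from scipy.linalg import solve

rng = np.random.default_rng(0)
H = rng.standard_normal((5, 3))
R = np.eye(5)
Sigma = np.eye(3)

# H*Sigma*H' + R is symmetric positive definite by construction
A = H @ Sigma @ H.T + R
b = rng.standard_normal(5)

x_lu       = solve(A, b)                  # general solve (LU, ?GESV)
x_cholesky = solve(A, b, assume_a='pos')  # SPD solve (Cholesky, ?POSV)
```

Both calls return the same solution; the second simply exploits the structure for roughly half the factorization work.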
Third, there is an unfortunately common motif. This instance was taken from the upper right of the image, but the GE/SYMM -> AXPY motif occurs four times in this graph.
GEMM/SYMM are matrix multiply operations used to break down expressions like alpha*A*B. AXPY is a matrix addition used to break down expressions like alpha*A + B. They are both used properly here.
This motif is unfortunate because GEMM is also capable of breaking down a larger expression like alpha*A*B + beta*C, doing a fused matrix multiply and add all in one pass. The AXPY would be unnecessary if the larger GE/SYMM patterns had matched correctly.
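For concreteness, here is what the fused form looks like through SciPy’s low-level BLAS wrappers (a sketch; sizes and values are arbitrary):

```python
import numpy as np
from scipy.linalg.blas import dgemm

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))
C = rng.standard_normal((4, 4))

# Two-pass motif: a GEMM for the product, then a separate addition pass
two_pass = dgemm(1.0, A, B) + C

# Fused: a single GEMM call computes alpha*A*B + beta*C in one pass
fused = dgemm(1.0, A, B, beta=1.0, c=C)
```

The fused call avoids a full extra sweep over the output matrix, which is exactly the savings the AXPY motif throws away.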
Canonicalization
The alpha*A*B + beta*C patterns did not match here for a silly reason: there wasn’t anything to match for the scalars alpha and beta. Instead the expressions looked like A*B - C. One solution to this problem is to make more patterns covering all the possibilities. Alternatively we could change how we canonicalize MatMul objects so that they always have a scalar coefficient, even if it defaults to 1.
We don’t want to change MatMul to behave like this throughout all of SymPy though; the extra 1 is usually unwanted. This is a case where there are multiple correct definitions of canonical. Fortunately MatMul is written with this eventuality in mind.
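To see why the scalar slot matters for pattern matching, compare the argument structure of the two forms (a sketch against current SymPy; the internals described in this post may differ in detail):

```python
from sympy import MatrixSymbol, Symbol

n = Symbol('n')
A = MatrixSymbol('A', n, n)
B = MatrixSymbol('B', n, n)

# With an explicit coefficient, the scalar appears as a leading argument,
# so a pattern with an alpha wildcard has something to bind to.
print((2*A*B).args)  # (2, A, B)

# Without one there is no scalar slot, and an alpha*A*B pattern misses.
print((A*B).args)    # (A, B)
```

Canonicalizing every MatMul to carry a coefficient, even a default 1, makes both forms look the same to the matcher.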
Links
- A Lesson in Data Assimilation using SymPy
- Multivariate Normal Random Variables
- Source for this post: kalman.py, kalman_comp.py