$$ \newcommand{\RR}{\mathbb{R}} \newcommand{\GG}{\mathbb{G}} \newcommand{\PP}{\mathbb{P}} \newcommand{\PS}{\mathcal{P}} \newcommand{\SS}{\mathbb{S}} \newcommand{\NN}{\mathbb{N}} \newcommand{\ZZ}{\mathbb{Z}} \newcommand{\CC}{\mathbb{C}} \newcommand{\HH}{\mathbb{H}} \newcommand{\ones}{\mathbb{1\hspace{-0.4em}1}} \newcommand{\alg}[1]{\mathfrak{#1}} \newcommand{\mat}[1]{ \begin{pmatrix} #1 \end{pmatrix} } \renewcommand{\bar}{\overline} \renewcommand{\hat}{\widehat} \renewcommand{\tilde}{\widetilde} \newcommand{\inv}[1]{ {#1}^{-1} } \newcommand{\eqdef}{\overset{\text{def}}=} \newcommand{\block}[1]{\left(#1\right)} \newcommand{\set}[1]{\left\{#1\right\}} \newcommand{\abs}[1]{\left|#1\right|} \newcommand{\trace}[1]{\mathrm{tr}\block{#1}} \newcommand{\norm}[1]{ \left\| #1 \right\| } \newcommand{\argmin}[1]{ \underset{#1}{\mathrm{argmin}} } \newcommand{\argmax}[1]{ \underset{#1}{\mathrm{argmax}} } \newcommand{\st}{\ \mathrm{s.t.}\ } \newcommand{\sign}[1]{\mathrm{sign}\block{#1}} \newcommand{\half}{\frac{1}{2}} \newcommand{\inner}[1]{\langle #1 \rangle} \newcommand{\dd}{\mathrm{d}} \newcommand{\ddd}[2]{\frac{\partial #1}{\partial #2} } \newcommand{\db}{\dd^b} \newcommand{\ds}{\dd^s} \newcommand{\dL}{\dd_L} \newcommand{\dR}{\dd_R} \newcommand{\Ad}{\mathrm{Ad}} \newcommand{\ad}{\mathrm{ad}} \newcommand{\LL}{\mathcal{L}} \newcommand{\Krylov}{\mathcal{K}} \newcommand{\Span}[1]{\mathrm{Span}\block{#1}} \newcommand{\diag}{\mathrm{diag}} \newcommand{\tr}{\mathrm{tr}} \newcommand{\sinc}{\mathrm{sinc}} \newcommand{\cat}[1]{\mathcal{#1}} \newcommand{\Ob}[1]{\mathrm{Ob}\block{\cat{#1}}} \newcommand{\Hom}[1]{\mathrm{Hom}\block{\cat{#1}}} \newcommand{\op}[1]{\cat{#1}^{op}} \newcommand{\hom}[2]{\cat{#1}\block{#2}} \newcommand{\id}{\mathrm{id}} \newcommand{\Set}{\mathbb{Set}} \newcommand{\Cat}{\mathbb{Cat}} \newcommand{\Hask}{\mathbb{Hask}} \newcommand{\lim}{\mathrm{lim}\ } \newcommand{\funcat}[1]{\left[\cat{#1}\right]} \newcommand{\natsq}[6]{ \begin{matrix} & #2\block{#4} & \overset{#2\block{#6}}\longrightarrow & #2\block{#5} & \\ {#1}_{#4} \hspace{-1.5em} &\downarrow & & \downarrow & \hspace{-1.5em} {#1}_{#5}\\ & #3\block{#4} & \underset{#3\block{#6}}\longrightarrow & #3\block{#5} & \\ \end{matrix} } \newcommand{\comtri}[6]{ \begin{matrix} #1 & \overset{#4}\longrightarrow & #2 & \\ #6 \hspace{-1em} & \searrow & \downarrow & \hspace{-1em} #5 \\ & & #3 & \end{matrix} } \newcommand{\natism}[6]{ \begin{matrix} & #2\block{#4} & \overset{#2\block{#6}}\longrightarrow & #2\block{#5} & \\ {#1}_{#4} \hspace{-1.5em} &\downarrow \uparrow & & \downarrow \uparrow & \hspace{-1.5em} {#1}_{#5}\\ & #3\block{#4} & \underset{#3\block{#6}}\longrightarrow & #3\block{#5} & \\ \end{matrix} } \newcommand{\cone}[1]{\mathcal{#1}} $$

Kalman Filter

A geometric take on Kalman filtering. In the absence of process noise, Kalman filtering simply boils down to the Recursive Least Squares algorithm.

  1. Recursive Least Squares
    1. Forgetting Factor
  2. Non-stationary Process
    1. Affine update
  3. TODO Process Noise
  4. Extended Kalman Filter
    1. Variant
  5. Lie Groups
  6. Appendix

Recursive Least Squares

Our goal is to solve the following least-squares problem:

\[\argmin{x} \quad \norm{Ax - b}_M^2\]

where \(M\) is positive definite, and where the size of the system will grow over time. Assuming that \(A\) has full column rank (so that \(A^T M A\) is invertible), the normal equations for the above are:

\[\underbrace{A^T M A}_K \ x = A^T M b\]

and the solution is \(x = \block{A^T M A}^{-1} A^T M b\). Assuming we computed the solution \(x_k\) at step \(k\), we now add extra rows to our (overconstrained) system:

\[A_{k+1} = \mat{A_k \\ H_{k+1}}, \qquad b_{k+1} = \mat{b_k \\ z_{k+1}}\]

At this point, it is convenient to rewrite the current system in terms of the solution update \(\delta_{k+1}\) such that \(x_{k+1} = x_k + \delta_{k+1}\) in order to express the current normal equations in terms of the previous ones:

\[\delta_{k+1} = \argmin{\delta} \quad \norm{ \mat{A_k \\ H_{k+1}} \block{x_k + \delta} - \mat{b_k \\ z_{k+1}} }_{M_{k+1}}^2\]

where we assumed \(M_{k+1} = \mat{M_k & \\ & R_{k+1}}\). Equivalently:

\[\delta_{k+1} = \argmin{\delta} \quad \norm{ \mat{A_k \\ H_{k+1}} \delta - \mat{ b_k - A_k x_k \\ z_{k+1} - H_{k+1} x_k} }_{M_{k+1}}^2\]

The normal equations become:

\[\underbrace{\block{A_k^T M_k A_k + H_{k+1}^T R_{k+1} H_{k+1}}}_{K_{k+1}} \delta = \underbrace{A_k^T M_k\block{b_k - A_k x_k}}_{= 0} + H_{k+1}^T R_{k+1} \underbrace{\block{z_{k+1} - H_{k+1}x_k}}_{w_{k+1}}\]

where the first part in the right-hand side is zero since \(x_k\) solves the problem at step \(k\). The Woodbury formula provides a practical way to update the inverse \(C_{k+1}\) of \(K_{k+1}\) from the previously computed \(C_k\) as follows:

\[\begin{align} C_{k+1} &= K_{k+1}^{-1} \\ &= \inv{\block{K_k + H_{k+1}^T R_{k+1} H_{k+1}}} \\ &= C_k - C_k H_{k+1}^T {\underbrace{\block{ R_{k+1}^{-1} + H_{k+1} C_k H_{k+1}^T }}_{S_{k+1}}}^{-1} H_{k+1} C_k \\ \end{align}\]
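
As a quick sanity check, this rank-update can be verified numerically. The following sketch (plain numpy, arbitrary test matrices; all names are illustrative) compares the Woodbury update against a direct inversion:

    import numpy as np

    rng = np.random.default_rng(0)
    n, m = 4, 2
    K_k = rng.normal(size=(n, n))
    K_k = K_k @ K_k.T + n * np.eye(n)        # previous information matrix, positive definite
    H = rng.normal(size=(m, n))              # new measurement rows
    R = np.diag(rng.uniform(0.5, 2.0, m))    # new measurement weights

    C_k = np.linalg.inv(K_k)
    S = np.linalg.inv(R) + H @ C_k @ H.T
    C_woodbury = C_k - C_k @ H.T @ np.linalg.solve(S, H @ C_k)
    C_direct = np.linalg.inv(K_k + H.T @ R @ H)
    assert np.allclose(C_woodbury, C_direct)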

With \(S_{k+1} = R_{k+1}^{-1} + H_{k+1} C_k H_{k+1}^T\) as above, the whole update process becomes:

\[\begin{align} w_{k+1} &= z_{k+1} - H_{k+1} x_k \\ S_{k+1} &= R_{k+1}^{-1} + H_{k+1} C_k H_{k+1}^T \\ C_{k+1} &= C_k - C_k H_{k+1}^T S_{k+1}^{-1} H_{k+1} C_k \\ &= \block{I - C_k H_{k+1}^T S_{k+1}^{-1} H_{k+1}} C_k \\ x_{k+1} &= x_k + C_{k+1} H_{k+1}^T R_{k+1} w_{k+1} \\ \end{align}\]

Alternatively, the appendix shows that \(C_{k+1} H_{k+1}^T R_{k+1} = C_k H_{k+1}^T S_{k+1}^{-1}\), so that the solution update can also be obtained as:

\[x_{k+1} = x_k + C_k H_{k+1}^T S_{k+1}^{-1} w_{k+1}\]

which may be more convenient to use in practice.
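
For reference, here is a minimal numpy sketch of one such recursive step; the function name `rls_update` is illustrative, and \(R_{k+1}\) is treated as a weight (information) matrix as in the derivation above:

    import numpy as np

    def rls_update(x, C, H, R, z):
        """Fold the new rows H x ~ z (weight matrix R) into the estimate x and the inverse C = K^-1."""
        w = z - H @ x                          # innovation w_{k+1}
        S = np.linalg.inv(R) + H @ C @ H.T     # S_{k+1}
        K = C @ H.T @ np.linalg.inv(S)         # gain C_k H^T S^-1
        x_new = x + K @ w                      # x_{k+1}
        C_new = C - K @ H @ C                  # C_{k+1} = (I - C H^T S^-1 H) C
        return x_new, C_new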

Forgetting Factor

One can easily incorporate a geometrically decreasing weight for previous measurements by scaling \(K_k\) by \(0 < \lambda < 1\), which corresponds to scaling \(C_k\) by \(\frac{1}{\lambda}\) before computing the next iterate.
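
In code this amounts to a single rescaling of \(C_k\) before the update; a sketch reusing the hypothetical `rls_update` above:

    lam = 0.98                 # forgetting factor, 0 < lam < 1
    C = C / lam                # equivalent to scaling K_k by lam
    x, C = rls_update(x, C, H, R, z)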

Non-stationary Process

Let us now assume that the state \(x\) changes between steps according to a linear map, for instance as a result of some dynamic process:

\[x^{(k)} \overset{F_k}{\longmapsto} x^{(k+1)}\]

for some invertible linear mapping \(F_k\). One can also think of \(F_k\) as a change of coordinates occurring after each step, and we need to express the previous system in terms of the new coordinates \(x^{(k+1)}\). The incremental problem becomes:

\[\argmin{x} \quad \norm{ \mat{A_k \inv{F}_k \\ H_{k+1}} x - \mat{b_k \\ z_{k+1}} }^2_{M_{k+1}}\]

i.e. we fit previous observations by reverting to the previous coordinate system. The normal equations become:

\[\underbrace{\block{F_k^{-T} A_k^T M_k A_k \inv{F}_k + H_{k+1}^T R_{k+1} H_{k+1}}}_{K_{k+1}} x = F_k^{-T} A_k^T M_k b_k + H_{k+1}^T R_{k+1} z_{k+1}\]

which in terms of the solution update \(\delta = x - F_k x_k\) gives:

\[K_{k+1} \delta = F_k^{-T} \underbrace{A_k^T M_k \block{b_k - A_k x_k}}_{=0} + H_{k+1}^T R_{k+1} \underbrace{\block{z_{k+1} - H_{k+1}F_kx_k}}_{w_{k+1}}\]

since once again, \(x_k\) solves the problem at step \(k\). Using the Woodbury formula as before, we obtain:

\[C_{k+1} = K_{k+1}^{-1} = D_k - D_k H_{k+1}^T S_{k+1}^{-1} H_{k+1} D_k\]

where \(D_k = F_k C_k F_k^T\) and \(S_{k+1} = R_{k+1}^{-1} + H_{k+1} D_k H_{k+1}^T\). The solution update is traditionally decomposed into two prediction/update phases:

Prediction

\[\begin{align} \tilde{x}_k &= F_k x_k \\ \tilde{C}_k &= F_k C_k F_k^T \end{align}\]

Update

\[\begin{align} w_{k+1} &= z_{k+1} - H_{k+1} \tilde{x}_k \\ S_{k+1} &= R_{k+1}^{-1} + H_{k+1} \tilde{C}_k H_{k+1}^T \\ C_{k+1} &= \tilde{C}_k - \tilde{C}_k H_{k+1}^T S_{k+1}^{-1} H_{k+1} \tilde{C}_k \\ &= \block{I - \tilde{C}_k H_{k+1}^T S_{k+1}^{-1} H_{k+1}} \tilde{C}_k \\ x_{k+1} &= \tilde{x}_k + C_{k+1} H_{k+1}^T R_{k+1} w_{k+1} \\ &= \tilde{x}_k + \tilde{C}_k H_{k+1}^T S_{k+1}^{-1} w_{k+1} \\ \end{align}\]
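
A numpy sketch of the two phases, with \(R_{k+1}\) again treated as a weight matrix as in the derivation (function names are illustrative):

    import numpy as np

    def kf_predict(x, C, F):
        """Prediction: re-express the estimate and the matrix C = K^-1 in the new coordinates."""
        return F @ x, F @ C @ F.T

    def kf_update(x_pred, C_pred, H, R, z):
        """Update: fold the new measurement z (weight matrix R) into the predicted estimate."""
        w = z - H @ x_pred                          # innovation
        S = np.linalg.inv(R) + H @ C_pred @ H.T
        K = C_pred @ H.T @ np.linalg.inv(S)         # gain
        return x_pred + K @ w, C_pred - K @ H @ C_pred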

Affine update

We now suppose that the state \(x\) changes affinely as follows:

\[x^{(k)} \longmapsto x^{(k+1)} = F_k x^{(k)} + u_k\]

The incremental problem becomes:

\[\argmin{x} \quad \norm{ \mat{A_k \inv{F}_k \\ H_{k+1}} x - \mat{b_k + A_k \inv{F}_k u_k \\ z_{k+1}} }^2_{M_{k+1}}\]

Terms once again cancel each other in the normal equations, this time for the solution update \(\delta = x - \block{F_k x_k + u_k}\):

\[K_{k+1} \delta = H_{k+1}^T R_{k+1} \underbrace{\block{z_{k+1} - H_{k+1}\block{F_k x_k + u_k}}}_{w_{k+1}}\]

So the prediction phase simply changes to:

\[\tilde{x}_k = F_k x_k + u_k\]

and all the rest remains the same.
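
The corresponding change to the hypothetical `kf_predict` above is just the added offset:

    def kf_predict_affine(x, C, F, u):
        """Prediction with an affine transition x -> F x + u; the covariance part is unchanged."""
        return F @ x + u, F @ C @ F.T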

TODO Process Noise

Not quite sure how to obtain this one, looks like some kind of dual regularization:

\[\argmin{x, y} \ \norm{Ax - b}^2_M + \norm{Fx - y}_{Q^{-1}}^2 + \norm{Hy - z}_R^2\]

with the limit case where the prediction is certain (compliance \(Q=0\)).

Extended Kalman Filter

Consider the following incremental non-linear least-squares problem:

\[x_{k+1} = \argmin{x} \ \sum_i \norm{f_i(x) - y_i}^2_{R_i}\]

Starting from an initial estimate \(x_0\), we linearize each new measurement \(f_{k+1}\) at the current estimate \(x_k\) to obtain:

\[x_{k+1} = \argmin{x} \ \sum_{i=0}^k \norm{f_{i+1}\block{x_i} + \dd f_{i+1}\block{x_i}.\block{x - x_i} - y_{i+1}}^2_{R_{i+1}}\]

A straightforward adaptation of the linear Kalman filter to the linearized problem gives:

\[\begin{align} H_{k+1} &= \dd f_{k+1} \block{x_k} \\ z_{k+1} &= y_{k+1} + \dd f_{k+1}\block{x_k}.x_k - f_{k+1}\block{x_k} \\ \end{align}\]

In particular, we obtain the following update for \(w\):

\[w_{k+1} = y_{k+1} - f_{k+1}\block{x_k}\]

and the rest is the same as before. A similar linearization can be obtained for non-stationary processes, again by linearizing the non-linear transition function at the most up-to-date state estimate. See Wikipedia for details.

Prediction

\[\begin{align} \tilde{x}_k &= F_k x_k + u_k\\ \tilde{C}_k &= F_k C_k F_k^T \end{align}\]

Update

\[\begin{align} H_{k+1} &= \dd f_{k+1} \block{x_k} \\ z_{k+1} &= y_{k+1} + H_{k+1}x_k - f_{k+1}\block{x_k} \\ w_{k+1} &= z_{k+1} - H_{k+1} \tilde{x}_k \\ S_{k+1} &= R_{k+1}^{-1} + H_{k+1} \tilde{C}_k H_{k+1}^T \\ C_{k+1} &= \tilde{C}_k - \tilde{C}_k H_{k+1}^T S_{k+1}^{-1} H_{k+1} \tilde{C}_k \\ &= \block{I - \tilde{C}_k H_{k+1}^T S_{k+1}^{-1} H_{k+1}} \tilde{C}_k \\ x_{k+1} &= \tilde{x}_k + C_{k+1} H_{k+1}^T R_{k+1} w_{k+1} \\ &= \tilde{x}_k + \tilde{C}_k H_{k+1}^T S_{k+1}^{-1} w_{k+1} \\ \end{align}\]
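
A sketch of one full step following the equations above, linearizing the measurement function at the previous estimate \(x_k\) as written; `f` and its Jacobian `df` are supplied by the caller, and the remaining names are illustrative:

    import numpy as np

    def ekf_step(x, C, F, u, f, df, R, y):
        """One EKF step: linearize f at x_k, predict through F, u, then apply the linear update."""
        H = df(x)                                   # H_{k+1} = d f_{k+1}(x_k)
        z = y + H @ x - f(x)                        # equivalent linear measurement z_{k+1}
        x_pred, C_pred = F @ x + u, F @ C @ F.T     # prediction
        w = z - H @ x_pred                          # innovation
        S = np.linalg.inv(R) + H @ C_pred @ H.T
        K = C_pred @ H.T @ np.linalg.inv(S)
        return x_pred + K @ w, C_pred - K @ H @ C_pred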

Variant

Alternatively, one can consider the linearized least-squares problem expressed purely in terms of the solution update \(\delta x = x - x_k\), which will help us derive the equations for the Lie group case. Assuming \(x_k\) is our latest estimate, we get \(x - x_i = \delta x + x_k - x_i\), which we plug into the Extended Kalman Filter equation above to obtain:

\[\delta x_{k+1} = \argmin{\delta} \ \sum_{i=0}^k \norm{f_{i+1}\block{x_i} + \dd f_{i+1}\block{x_i}.\block{\delta x + x_k - x_i} - y_{i+1}}^2_{R_{i+1}}\]

In this version, the state at each iteration is the displacement \(\delta x\), which will be predicted/corrected. The underlying position \(x_k\) is maintained separately only to update matrices/vectors accordingly. Each change of linearization point can be seen as a coordinate change on the solution update \(\delta x\):

\[x = \delta x^{(k)} + x_k = \delta x^{(k-1)} + x_{k-1}\]

where the same \(x\) is expressed in two coordinate systems \(\delta x^{(k)}\) and \(\delta x^{(k-1)}\), related by:

\[\delta x^{(k)} = \delta x^{(k-1)} - \block{x_k - x_{k-1}}\]

and now we can exploit the affine prediction update as seen before.

Prediction

\[\begin{align} \tilde{\delta x}_k &= \delta x_k - \block{x_k - x_{k-1}} \\ \tilde{C}_k &= F_k C_k F_k^T \end{align}\]

Update

\[\begin{align} H_{k+1} &= \dd f_{k+1} \block{x_k} \\ z_{k+1} &= y_{k+1} - f_{k+1}\block{x_k} \\ w_{k+1} &= z_{k+1} - H_{k+1} \tilde{\delta x}_k \\ S_{k+1} &= R_{k+1}^{-1} + H_{k+1} \tilde{C}_k H_{k+1}^T \\ C_{k+1} &= \tilde{C}_k - \tilde{C}_k H_{k+1}^T S_{k+1}^{-1} H_{k+1} \tilde{C}_k \\ &= \block{I - \tilde{C}_k H_{k+1}^T S_{k+1}^{-1} H_{k+1}} \tilde{C}_k \\ \delta x_{k+1} &= \tilde{\delta x}_k + C_{k+1} H_{k+1}^T R_{k+1} w_{k+1} \\ &= \tilde{\delta x}_k + \tilde{C}_k H_{k+1}^T S_{k+1}^{-1} w_{k+1} \\ x_{k+1} &= x_k + \delta x_{k+1} \end{align}\]

The first prediction line seems a bit silly as it appears to always evaluate to zero, but it comes in handy when incorporating state constraints into the filter. In that case, \(x_{k+1} = x_k + \delta x_{k+1} + \ldots\), so the predicted \(\delta x\) is not always zero.
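
A sketch of the bookkeeping in this variant, with the displacement \(\delta x\) as the filter state and the linearization point \(x_k\) carried along on the side; the coordinate change has \(F = I\) here, so the prediction leaves the matrix \(C\) untouched (names are illustrative):

    import numpy as np

    def ekf_delta_step(x, x_prev, dx, C, f, df, R, y):
        """Delta-state EKF step: predict the coordinate change, update, then move the base point."""
        dx_pred = dx - (x - x_prev)                 # coordinate change; zero unless constrained
        H = df(x)                                   # H_{k+1} = d f_{k+1}(x_k)
        w = (y - f(x)) - H @ dx_pred                # innovation
        S = np.linalg.inv(R) + H @ C @ H.T
        K = C @ H.T @ np.linalg.inv(S)
        dx_new = dx_pred + K @ w
        C_new = C - K @ H @ C
        return x + dx_new, x, dx_new, C_new         # new base point, old base point, state, C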

Lie Groups

We now consider an incremental non-linear least-squares problem on a Lie group \(G\):

\[\argmin{g \in G} \ \sum_k \norm{f_k(g) - y_k}^2_{R_k}\]

where each \(f_k: G \to E\) maps to a Euclidean space \(E\). Instead of solving for \(g\) directly (which may involve non-linear constraints), we express it as a body-fixed update from the current estimate \(g_k\) using the group exponential:

\[g = g_k.\exp\block{\delta g^{(k)}}\]

and reformulate our problem in terms of the solution update:

\[\delta g_{k+1} = \argmin{\delta g^{(k)} \in \alg{g}} \ \sum_k \norm{f_{k+1}\block{g_k.\exp\block{\delta g^{(k)}}} - y_{k+1}}_{R_{k+1}}^2\]

As before, we linearize the above at the current estimate \(g_k\), i.e. at \(\delta g^{(k)} = 0\), to obtain:

\[\delta g_{k+1} = \argmin{\delta g^{(k)} \in \alg{g}} \ \sum_k \norm{f_{k+1}\block{g_k} + \db f_{k+1} \block{g_k}.\delta g^{(k)} - y_{k+1}}_{R_{k+1}}^2\]

Coordinates at steps \(k-1\) and \(k\) are related by:

\[g = g_k.\exp\block{\delta g^{(k)}} = g_{k-1}.\exp\block{\delta g^{(k-1)}}\]

Therefore, the coordinate change is:

\[\delta g^{(k)} = \log\block{g_k^{-1} g_{k-1}.\exp\block{\delta g^{(k-1)}}}\]

which we linearize at \(\delta g^{(k-1)} = 0\) to obtain:

\[\begin{align} \delta g^{(k)} &\approx \underbrace{-\delta g_{k}}_{u_k} + \underbrace{\db \log\block{g_k^{-1} g_{k-1}}}_{F_k}.\delta g^{(k-1)}\\ \end{align}\]

Another convenient alternative is to linearize the coordinate change at \(\delta g_{k}\), i.e. the previously computed solution update:

\[\begin{align} \delta g^{(k)} &\approx \log\block{I} + \db \log\block{I}.\db \exp\block{\delta g_{k}}.\block{\delta g^{(k-1)} - \delta g_{k}}\\ &= \underbrace{\db \exp\block{\delta g_{k}}}_{F_k}.\delta g^{(k-1)} \underbrace{\ -\ \delta g_{k}}_{u_k}\\ \end{align}\]
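
For concreteness, a minimal sketch of the multiplicative state update and of the zeroth-order term \(u_k\) of the first linearization, for \(G = SO(3)\) using scipy's rotation-vector exponential and logarithm (the use of scipy here is an illustrative assumption, not part of the notes):

    import numpy as np
    from scipy.spatial.transform import Rotation

    g_prev = Rotation.from_rotvec([0.3, -0.1, 0.2])   # estimate g_{k-1}
    delta_g = np.array([0.01, -0.02, 0.03])           # computed solution update (body-fixed)

    g = g_prev * Rotation.from_rotvec(delta_g)        # g_k = g_{k-1} . exp(delta_g)

    # zeroth-order term of the coordinate change: u_k = log(g_k^{-1} g_{k-1}) = -delta_g
    u = (g.inv() * g_prev).as_rotvec()
    assert np.allclose(u, -delta_g)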

Appendix

Consider the following linear system:

\[\block{\inv{C} + H^T R H} x = H^T R z\]

One can verify that the solution \(x\) for the above system is also that of the following two augmented KKT systems:

  1. \[\mat{ \inv{C} & H^T \\ H & -R^{-1} } \mat{x \\ \lambda} = \mat{H^T R z \\ 0 }\]
  2. \[\mat{ \inv{C} & H^T \\ H & -R^{-1} } \mat{x \\ \lambda} = \mat{0 \\ z}\]

Now the solution for the first one is \(x = \block{C - CH^T\inv{S}HC}H^TRz\), where \(S = HCH^T + \inv{R}\) is the Schur complement, and the solution for the second is \(x = CH^T\inv{S}z\).
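
Since both expressions agree for every \(z\), this also gives the gain identity \(C_{k+1} H^T R = C H^T \inv{S}\) used in the main text. A quick numerical check (sketch with random test matrices, names illustrative):

    import numpy as np

    rng = np.random.default_rng(1)
    n, m = 5, 3
    C = rng.normal(size=(n, n))
    C = C @ C.T + n * np.eye(n)                                # C, positive definite
    H = rng.normal(size=(m, n))
    R = np.diag(rng.uniform(0.5, 2.0, m))

    S = np.linalg.inv(R) + H @ C @ H.T
    C_next = C - C @ H.T @ np.linalg.solve(S, H @ C)           # C_{k+1}
    assert np.allclose(C_next @ H.T @ R, C @ H.T @ np.linalg.inv(S))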