
Kalman Filters

Setup

  1. Linear state space system:
\begin{align} x_k &= Ax_{k-1} + Bu_{k-1} + w_{k-1} \tag{1} \\ z_k &= Hx_k + v_k \tag{2} \end{align}
  1. The state at time \(k\) is \(x_k\) and the measurement at that time is \(z_k\).
  2. The inputs \(u_k\) are known.
  3. Initial state \(x_0 \sim \mathcal{N}(\hat{x}_0, \Sigma_0)\)
  4. Gaussian process noise \(w_k \sim \mathcal{N}(0, Q)\)
    • Noise at each step is independent and identically distributed (i.i.d.).
  5. Gaussian sensor noise \(v_k \sim \mathcal{N}(0, R)\)
    • (i.i.d. noise)
  6. Noise is independent of the state.
    • The correlation between noise and state could instead be known and nonzero, but we assume independence here to simplify. (A Python sketch simulating this system appears below.)
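
As a concrete illustration, here is a minimal Python sketch that simulates equations (1) and (2). The specific matrices (a 1D constant-velocity model with a position sensor) are invented for illustration:

  import numpy as np

  rng = np.random.default_rng(0)

  # Hypothetical system: state is [position, velocity].
  dt = 0.1
  A = np.array([[1.0, dt], [0.0, 1.0]])  # state transition matrix
  B = np.array([[0.0], [dt]])            # input (acceleration) matrix
  H = np.array([[1.0, 0.0]])             # measurement matrix: position only
  Q = 0.01 * np.eye(2)                   # process noise covariance
  R = np.array([[0.25]])                 # sensor noise covariance

  x = np.array([0.0, 1.0])               # true initial state
  u = np.array([0.5])                    # known, constant input

  states, measurements = [], []
  for k in range(50):
      w = rng.multivariate_normal(np.zeros(2), Q)  # process noise w_{k-1}
      x = A @ x + B @ u + w                        # equation (1)
      v = rng.multivariate_normal(np.zeros(1), R)  # sensor noise v_k
      z = H @ x + v                                # equation (2)
      states.append(x)
      measurements.append(z)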

The Results

Initialize

An initial Gaussian distribution representing your knowledge of the initial state. The Gaussian is \(\mathcal{N}(\hat{x}_0, \Sigma_0)\).
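
In code, initialization is just choosing this prior. Continuing the hypothetical system from the Setup section (the numbers are made up):

  xhat = np.array([0.0, 0.8])  # initial mean, our guess of x_0
  Sigma = np.eye(2)            # initial covariance, our uncertainty in that guess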

Predict

Propagate the state and uncertainty forward (a code sketch of this step follows the derivation):

\begin{align} \hat{x}_k^- &= A\hat{x}_{k-1} + Bu_{k-1} \\ \Sigma_k^- &= A\Sigma_{k-1}A^T + Q \end{align}
  • The prediction random variable is Gaussian: \(X_k^- \sim \mathcal{N}(A\hat{x}_{k-1} + Bu_{k-1}, A\Sigma_{k-1}A^T + Q)\)
Derivation
  1. For \(X \sim \mathcal{N}(\mu, \Sigma)\), and \(Y = AX + b\), \(Y \sim \mathcal{N}(A\mu +b, A \Sigma A^T)\) (See Lecture on Gaussians)
  2. In this case, the state update equation can be written as:

    \begin{align} x_k = \begin{bmatrix} A & I \end{bmatrix} \begin{bmatrix} x_{k-1} \\ w_{k-1} \end{bmatrix} + B u_{k-1} \end{align}
  3. Applying the formula for a linear transformation of Gaussian random variables reveals the mean to be:

    \begin{align} \hat{x}_k^- &= \begin{bmatrix} A & I \end{bmatrix} \begin{bmatrix}\hat{x}_{k-1} \\ 0 \end{bmatrix} + Bu_{k-1} \\ &= A \hat{x}_{k-1} + Bu_{k-1} \end{align}
  4. Applying the formula for a linear transformation of Gaussian random variables reveals the covariance to be:

    \begin{align} \Sigma_k^- &= \begin{bmatrix} A & I \end{bmatrix} \begin{bmatrix} \Sigma_{k-1} & 0 \\ 0 & Q \end{bmatrix} \begin{bmatrix} A^T \\ I \end{bmatrix} \\ &= \begin{bmatrix} A & I \end{bmatrix} \begin{bmatrix} \Sigma_{k-1} A^T \\ Q \end{bmatrix} \\ &= A \Sigma_{k-1} A^T + Q \end{align}
    • Here we used the fact that \(x_{k-1}\) and \(w_{k-1}\) are independent random variables, which is why the covariance of the stacked vector \(\begin{bmatrix}x_{k-1} \\ w_{k-1}\end{bmatrix}\) is block diagonal: the cross-covariance \(Cov(x_{k-1}, w_{k-1})\) is zero.
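
In code, the prediction step is a direct transcription of these two equations. A sketch, assuming numpy arrays shaped as in the simulation above:

  def predict(xhat, Sigma, u, A, B, Q):
      """Propagate the state estimate and its covariance one step forward."""
      xhat_pred = A @ xhat + B @ u      # mean: A xhat_{k-1} + B u_{k-1}
      Sigma_pred = A @ Sigma @ A.T + Q  # covariance: A Sigma_{k-1} A^T + Q
      return xhat_pred, Sigma_pred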

Correct

Update the estimate based on the sensor measurement. The Kalman gain \(K_k\) controls the relative importance of the model's prediction and the measurement (a code sketch follows the derivation).

\begin{align} K_k &= \Sigma_k^{-}H^T(H \Sigma_k^{-} H^T + R)^{-1}\\ \hat{x}_k &= \hat{x}_k^- + K_k(z(k) - H\hat{x}_k^-)\label{eq:mean_update}\\ \Sigma_k &= (I - K_kH)\Sigma_k^-\label{eq:var_update} \end{align}
  • This step provides the mean and covariance of the updated state estimate.
  • Since all the random variables involved are Gaussian and the system is linear, the estimate is also a Gaussian distribution.
Derivation
  1. Let \(Y \sim X_1 | X_2 = a\), where \(\begin{bmatrix}X_1 \\ X_2\end{bmatrix} \sim \mathcal{N}(\begin{bmatrix}\mu_1 \\ \mu_2\end{bmatrix}, \begin{bmatrix}\Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22}\end{bmatrix})\) is Jointly Gaussian. Then \begin{equation} Y \sim \mathcal{N}(\mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(a - \mu_2),\;\Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}). \end{equation} (See Lecture on Gaussians)
  2. The state and the measurement are written in terms of the predicted state and the measurement noise:

    \begin{equation} \begin{bmatrix} x_k \\ z_k \\ \end{bmatrix} = \begin{bmatrix} I & 0 \\ H & I \end{bmatrix} \begin{bmatrix} x_k \\ v_k \end{bmatrix} \end{equation}
  3. The Jointly Gaussian random vector \(\begin{bmatrix} x_k \\ v_k \end{bmatrix} \sim \mathcal{N}(\begin{bmatrix}\hat{x}_k^- \\ 0 \end{bmatrix}, \begin{bmatrix} \Sigma_k^- & 0 \\ 0 & R \end{bmatrix})\)
  4. Next, apply the formula for a linear transformation of a Jointly Gaussian random vector to reveal the joint distribution:

    \begin{align} \begin{bmatrix} x_k \\ z_k \\ \end{bmatrix} &\sim \mathcal{N}( \begin{bmatrix} I & 0 \\ H & I \end{bmatrix} \begin{bmatrix} \hat{x}_k^- \\ 0 \end{bmatrix}, \begin{bmatrix} I & 0 \\ H & I \end{bmatrix} \begin{bmatrix} \Sigma_k^- & 0 \\ 0 & R \end{bmatrix} \begin{bmatrix} I & H^T \\ 0 & I \end{bmatrix} )\\ &\sim \mathcal{N}( \begin{bmatrix} \hat{x}_k^- \\ H\hat{x}_k^- \end{bmatrix}, \begin{bmatrix} \Sigma_k^- & \Sigma_k^- H^T \\ H \Sigma_k^- & H \Sigma_k^- H^T + R \end{bmatrix}) \end{align}
  5. Applying the formula for the conditional Jointly Gaussian random vector to \(\begin{bmatrix}x_k\\z_k\end{bmatrix}\), when \(z_k = z(k)\) reveals that it is distributed with a mean given by equation \eqref{eq:mean_update} and variance given by equation \eqref{eq:var_update}.
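
A corresponding sketch of the correction step. Solving a linear system rather than forming the inverse explicitly is an implementation choice, not part of the derivation:

  def correct(xhat_pred, Sigma_pred, z, H, R):
      """Update the predicted estimate with the measurement z(k)."""
      S = H @ Sigma_pred @ H.T + R                      # innovation covariance H Sigma^- H^T + R
      K = np.linalg.solve(S.T, (Sigma_pred @ H.T).T).T  # Kalman gain Sigma^- H^T S^{-1}
      xhat = xhat_pred + K @ (z - H @ xhat_pred)        # mean update
      Sigma = (np.eye(len(xhat)) - K @ H) @ Sigma_pred  # covariance update
      return xhat, Sigma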

Repeat

  • The results from the correction step (\(\hat{x}_{k-1}, \Sigma_{k-1}\)) are used for the prediction at time \(k\)
  • Thus, Kalman filtering consists of a repeated cycle of prediction and correction steps, as in the loop sketched below
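
Putting the pieces together, the filter is this cycle run over the measurement sequence, starting from the initial \(\hat{x}_0\) and \(\Sigma_0\) chosen above:

  estimates = []
  for z in measurements:
      xhat, Sigma = predict(xhat, Sigma, u, A, B, Q)  # propagate to time k
      xhat, Sigma = correct(xhat, Sigma, z, H, R)     # incorporate z(k)
      estimates.append(xhat)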

Extended Kalman Filter

  1. Start with a nonlinear system whose state \(x_k\) needs to be estimated from measurements \(z_k\):

    \begin{align} x_k &= g(x_{k-1}, u_{k-1}, w_{k-1}) \\ z_k &= h(x_k, v_k) \end{align}

    To approximate the state, set the noise to zero and propagate the previous state estimate forward to the next timestep:

    \begin{equation} \tilde{x}_k = g(\hat{x}_{k-1}, u_{k-1}, 0) \end{equation}

    To approximate the measurement expected from a given state, use the predicted state \(\tilde{x}_k\) and set the noise to zero:

    \begin{equation} \tilde{z}_k = h(\tilde{x}_k, 0) \end{equation}
  2. Linearize the system about the previous estimate \(\hat{x}_{k-1}\) (and about the prediction \(\tilde{x}_k\) for the measurement) with zero noise, using a Taylor expansion.

    \begin{align} x_k &\approx g(\hat{x}_{k-1}, u_{k-1}, 0) + A_k (x_{k-1} - \hat{x}_{k-1}) + W_k w_{k-1} \label{eq:taylorx}\\ z_k &\approx h(\tilde{x}_k, 0) + H_k (x_{k} - \tilde{x}_{k}) + V_k v_k \label{eq:taylorz} \end{align}

    with

    \begin{align} A_k &= \frac{\partial g}{\partial x}\Bigr\rvert_{\hat{x}_{k-1}, u_{k-1}, 0}\\ W_k &= \frac{\partial g}{\partial w}\Bigr\rvert_{\hat{x}_{k-1}, u_{k-1}, 0} \\ H_k &= \frac{\partial h}{\partial x}\Bigr\rvert_{\tilde{x}_{k}, 0} \\ V_k &= \frac{\partial h}{\partial v}\Bigr\rvert_{\tilde{x}_{k}, 0} \end{align}

    Let the transformed process noise \(\tilde{w}_{k-1} = W_k w_{k-1}\) be distributed as \(\mathcal{N}(0, W_kQW_k^T)\) and the transformed sensor noise \(\tilde{v}_k = V_k v_k \sim \mathcal{N}(0, V_kRV_k^T)\).

  3. The dynamics of the estimation error can be modeled as a linear system. Thus, a Kalman filter can be used to estimate the estimation error!

    The error between the state and the prediction is

    \begin{equation} \tilde{e}_{x_k} = x_k - \tilde{x}_k \end{equation}

    And the error between the actual measurement and estimated measurement (the residual) is

    \begin{equation} \tilde{e}_{z_k} = z_k - \tilde{z}_k \end{equation}

    Substituting the linear approximations in equations \eqref{eq:taylorx} and \eqref{eq:taylorz} into the error equations yields the first-order approximate error dynamics

    \begin{align} \tilde{e}_{x_k} &\approx A_k(x_{k-1} - \hat{x}_{k-1}) + \tilde{w}_{k-1} \\ \tilde{e}_{z_k} & \approx H_k(x_{k} - \tilde{x}_{k}) + \tilde{v}_k \end{align}

    (note that \(\tilde{x}_k = g(\hat{x}_{k-1}, u_{k-1}, 0)\) and \(\tilde{z}_k = h(\tilde{x}_k, 0)\)).

    If we use \(\tilde{x}_{k-1}\) as the previous estimate \(\hat{x}_{k-1}\) (so that \(x_{k-1} - \hat{x}_{k-1} = \tilde{e}_{x_{k-1}}\)), the equations above become state equations in the error \(\tilde{e}_{x_k}\)

    \begin{align} \tilde{e}_{x_k} &\approx A_k\tilde{e}_{x_{k-1}} + \tilde{w}_{k-1} \\ \tilde{e}_{z_k} & \approx H_k\tilde{e}_{x_k} + \tilde{v}_k. \end{align}

    Then \(\tilde{e}_{x_k}\) can be estimated by a Kalman filter, yielding an estimate \(\hat{e}_k\). This linear system's matrices \(A_k\) and \(H_k\) come from the linearization, and its noise covariances \(W_kQW_k^T\) and \(V_kRV_k^T\) are likewise transformed by the linearization.

  4. Estimating the error is estimating the state: The Kalman filter estimate of the error is

    \begin{equation} \hat{e}_k = \hat{e}_k^- + K(\tilde{e}_{z_k} - H_k\hat{e}_k^-). \end{equation}

    Let \(\hat{e}_k^-\) equal zero (because we predict that the mean of the error is zero)

    \begin{equation} \hat{e}_k = K\tilde{e}_{z_k} \end{equation}

    From the error equation, we substitute \(\hat{x}_k\) for \(x_k\) and \(\hat{e}_k\) for \(\tilde{e}_{x_k}\):

    \begin{align} \hat{x}_k &= \hat{e}_k + \tilde{x}_k \\ &= \tilde{x}_k + K\tilde{e}_{z_k} \\ &= \tilde{x}_k + K(z_k - \tilde{z}_k) \end{align}

    Therefore, finding the mean of the error for the linear error system is the same as finding the mean of the state for that same system shifted by the prediction \(\tilde{x}_k\). What about the covariance? (exercise)

    Thus, we use the linearized system for computing the Kalman gain and the measurement update. However, the prediction \(\tilde{x}_k\) is approximated with \(g(\hat{x}_{k-1},u_{k-1},0)\) and the predicted measurement \(\tilde{z}_k\) is approximated with \(h(\tilde{x}_k, 0)\). Uncertainty is propagated using the linearized model.
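
As a concrete instance of these quantities, here is a sketch of a hypothetical nonlinear system (a pendulum with a torque input and additive noise, invented for illustration) and its linearization:

  import numpy as np

  dt = 0.1

  def g(x, u, w):
      """x_k = g(x_{k-1}, u_{k-1}, w_{k-1}): Euler-integrated pendulum."""
      theta, omega = x
      return np.array([theta + dt * omega,
                       omega + dt * (-np.sin(theta) + u[0])]) + w

  def h(x, v):
      """z_k = h(x_k, v_k): measure the angle only."""
      return np.array([x[0]]) + v

  def jacobians(x, u, x_pred):
      """Derivatives of g and h evaluated as in the Taylor expansion above."""
      A = np.array([[1.0, dt],
                    [-dt * np.cos(x[0]), 1.0]])  # A_k = dg/dx at (xhat_{k-1}, u_{k-1}, 0)
      W = np.eye(2)                              # W_k = dg/dw (noise is additive)
      H = np.array([[1.0, 0.0]])                 # H_k = dh/dx at (x~_k, 0)
      V = np.eye(1)                              # V_k = dh/dv (noise is additive)
      return A, W, H, V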

EKF Filtering Equations

  • I am using \(A_k\), \(H_k\), \(W_k\), and \(V_k\) as the linearized matrices defined above. A code sketch of one full EKF cycle follows the equations.

Prediction:

\begin{align} \tilde{x}_k &= g(\hat{x}_{k-1}, u_{k-1}, 0)\\ \Sigma_k^- &= A_k\Sigma_{k-1}A_k^T+W_k Q W_k^T \end{align}

Correction:

\begin{align} K_k &=\Sigma_k^-H_k^T(H_k\Sigma_k^-H_k^T + V_k R V_k^T)^{-1}\\ \hat{x}_k &= \tilde{x}_k + K_k(z(k) - h(\tilde{x}_k,0))\\ \Sigma_k &= (I - K_k H_k)\Sigma_k^- \end{align}
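
A sketch of one full EKF cycle, tying the equations above together. The function signatures match the hypothetical pendulum sketch earlier and are assumptions of this example, not an interface prescribed by these notes:

  def ekf_step(xhat, Sigma, u, z, g, h, jacobians, Q, R):
      # Predict: propagate the nonlinear model with zero noise.
      x_pred = g(xhat, u, 0.0)                    # x~_k = g(xhat_{k-1}, u_{k-1}, 0)
      A, W, Hk, V = jacobians(xhat, u, x_pred)    # A_k, W_k, H_k, V_k
      Sigma_pred = A @ Sigma @ A.T + W @ Q @ W.T  # linearized covariance propagation

      # Correct: linearized measurement update.
      S = Hk @ Sigma_pred @ Hk.T + V @ R @ V.T      # innovation covariance
      K = Sigma_pred @ Hk.T @ np.linalg.inv(S)      # Kalman gain K_k
      xhat_new = x_pred + K @ (z - h(x_pred, 0.0))  # x~_k + K_k(z(k) - h(x~_k, 0))
      Sigma_new = (np.eye(len(xhat)) - K @ Hk) @ Sigma_pred
      return xhat_new, Sigma_new

Note that for the linear system of equations (1) and (2), \(g(x, u, w) = Ax + Bu + w\) and \(h(x, v) = Hx + v\), so \(A_k = A\), \(W_k = I\), \(H_k = H\), \(V_k = I\), and the EKF reduces to the linear Kalman filter above.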


Author: Matthew Elwin