
Jointly Gaussian Distributions

Overview

  • Also called Multivariate Normal Distributions (or Multi-Normal)
  • \(X(\xi) \sim \mathcal{N}(\mu, \Sigma)\),
    • \(X(\xi)\) is a jointly Gaussian random variable, (\(\xi\) is an element of the sample space)
    • \(\mu \in \mathbb{R}^k\) is the mean: \(\mu = E[X]\), where \(E[\cdot]\) denotes expected value (from here on I drop the dependence on the sample space).
    • \(\Sigma \in \mathbb{R}^{k \times k}\) is the covariance: \(\Sigma = E[(X - \mu)(X-\mu)^T]\), which is symmetric and positive definite.
  • The probability density function (pdf) for a jointly Gaussian random variable is

    \begin{equation}\label{eq:mvgauss} f_X(x) = (2\pi)^{-\frac{k}{2}}(\det{\Sigma})^{-\frac{1}{2}}e^{-\frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu)} \end{equation}
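
  • As a quick sanity check of \eqref{eq:mvgauss}, here is a minimal sketch that evaluates the density at a point (assuming Armadillo; the function name pdf_mv_gauss is my own, not a library call):

     #include <armadillo>
     #include <cmath>

     // Evaluate the jointly Gaussian pdf f_X(x) defined above.
     // Assumes Sigma is symmetric positive definite.
     double pdf_mv_gauss(const arma::vec & x, const arma::vec & mu, const arma::mat & Sigma)
     {
         const double k = x.n_elem;
         const arma::vec d = x - mu;
         // solve() applies Sigma^{-1} without forming the inverse explicitly
         const double quad = arma::dot(d, arma::solve(Sigma, d));
         return std::pow(2.0 * arma::datum::pi, -k / 2.0)
              * std::pow(arma::det(Sigma), -0.5)
              * std::exp(-0.5 * quad);
     }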

Linear Transform

  • Let \(Y = A X + b\), where \(X \sim \mathcal{N}(\mu, \Sigma)\) and \(A \in \mathbb{R}^{k \times k}\) is invertible
  • Then \(Y \sim \mathcal{N}(A \mu + b, A \Sigma A^T)\)
Sketch of a Proof
  1. Use the multi-variable change of variables formula to find the new density:

    \begin{equation} f_Y(y) = \frac{f_X(A^{-1}(y - b))}{\lvert\det A\rvert} \end{equation}
  2. Show that the new density has the same form as the multivariate Gaussian density \eqref{eq:mvgauss}.
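
  • Alternatively, the mean and covariance follow directly from linearity of expectation:

    \begin{align} E[Y] &= A E[X] + b = A\mu + b \\ E[(Y - E[Y])(Y - E[Y])^T] &= E[A(X-\mu)(X-\mu)^T A^T] = A \Sigma A^T \end{align}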

Conditional Distribution

  • Let \(X \sim \mathcal{N}(\mu, \Sigma)\).
  • Partition:

    \begin{align} X &= \begin{bmatrix}X_1 \\ X_2\end{bmatrix} \\ \mu &= \begin{bmatrix}\mu_1 \\ \mu_2\end{bmatrix}\\ \Sigma &= \begin{bmatrix}\Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22}\end{bmatrix} \end{align}
  • What is the conditional probability density \(f_{X_1 \mid X_2}(x_1 \mid X_2 = a)\)?
  • Let \(Y\) be a random variable whose pdf is the conditional density of \(X_1\) given \(X_2 = a\). Then:

    \begin{equation} Y \sim \mathcal{N}(\mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(a - \mu_2), \Sigma_{11}- \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}) \end{equation}
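
  • As a sketch (assuming Armadillo; the helper names cond_mean and cond_cov are my own), the conditional mean and covariance can be computed directly from the partitioned blocks:

     #include <armadillo>

     // Conditional mean of X1 given X2 = a, per the formula above.
     // solve() applies Sigma22^{-1} without forming the inverse explicitly.
     arma::vec cond_mean(const arma::vec & mu1, const arma::vec & mu2,
                         const arma::mat & Sigma12, const arma::mat & Sigma22,
                         const arma::vec & a)
     {
         return mu1 + Sigma12 * arma::solve(Sigma22, a - mu2);
     }

     // Conditional covariance of X1 given X2 = a (the Schur complement of Sigma22).
     arma::mat cond_cov(const arma::mat & Sigma11, const arma::mat & Sigma12,
                        const arma::mat & Sigma21, const arma::mat & Sigma22)
     {
         return Sigma11 - Sigma12 * arma::solve(Sigma22, Sigma21);
     }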

Random Number Generation in C++

  • If you intend to use the C++ standard library for generating random numbers, you can use the following singleton pattern to ensure that you are only seeding the random number generator once in your process.

    #include <random>

     std::mt19937 & get_random()
     {
         // static variables inside a function are created once and persist for the remainder of the program
         static std::random_device rd{};
         static std::mt19937 mt{rd()};
         // we return a reference to the pseudo-random number generator object. This is always the
         // same object every time get_random is called
         return mt;
     }

     // To generate a Gaussian random variable. Note that std::normal_distribution takes the
     // standard deviation, not the variance, as its second parameter:
     std::normal_distribution<> d(mean, std_dev);
     d(get_random());
    
    
  • Armadillo also has random number generation functions
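
  • For example (a minimal sketch; check your Armadillo version's documentation for details):

     #include <armadillo>

     int main()
     {
         arma::arma_rng::set_seed_random();             // seed Armadillo's generator once
         const arma::vec u = arma::randn<arma::vec>(3); // three independent N(0, 1) samples
         u.print("u:");
     }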

Multivariate Gaussian

  • An algorithm for generating multivariate Gaussian noise, given mean \(m \in \mathbb{R}^n\) and covariance \(Q \in \mathbb{R}^{n \times n}\), is as follows (a code sketch follows the list):
    1. Let \(L\) be the (lower) Cholesky decomposition of \(Q\) (that is, \(L\) is lower triangular such that \(L L^T = Q\)).
      • Use the chol() function from Armadillo (pass "lower" as the layout argument, since it returns the upper-triangular factor by default) or the llt() function from Eigen
    2. Let \(u \in \mathbb{R}^n\) be generated such that each entry is independently drawn from a Gaussian distribution with mean 0 and variance 1.
    3. Then, \(v = m + Lu\) will have the desired distribution
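
  • A minimal sketch of this algorithm (assuming Armadillo; the function name sample_mv_gaussian is my own):

     #include <armadillo>

     // Draw one sample v ~ N(m, Q) using the Cholesky method described above.
     arma::vec sample_mv_gaussian(const arma::vec & m, const arma::mat & Q)
     {
         // lower-triangular L with L * L.t() == Q
         const arma::mat L = arma::chol(Q, "lower");
         // u has independent standard-normal entries
         const arma::vec u = arma::randn<arma::vec>(m.n_elem);
         return m + L * u;
     }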

Plotting Contours

  • The level-sets of a Gaussian distribution (\(f_X(x) = c\)) form ellipsoids.
  • Plotting the ellipsoid of a Gaussian is an excellent method of visualizing it (at least in 2 or 3 dimensions).
  • Diagonalize \(\Sigma\) as \(\Sigma = P \Lambda P^T\), where the columns of \(P\) are eigenvectors of \(\Sigma\), \(\Lambda\) is a diagonal matrix of eigenvalues, and \(P\) is orthogonal (\(PP^T = P^TP = I\)). We can do this because the covariance is symmetric positive definite.
  • The semi-axis lengths of the ellipse are proportional to the square roots of the eigenvalues of \(\Sigma\): from the derivation below, the level set is \(y^T\Lambda^{-1}y = c''\), so the semi-axes have length \(\sqrt{c''\lambda_i}\).
  • The axes point along the eigenvectors of \(\Sigma\): in the coordinates \(y = P^T(x - \mu)\) the ellipse is axis-aligned, so in the original coordinates it is rotated by \(P\) and centered at \(\mu\).

Derivation

\begin{align} c &= (2\pi)^{-\frac{k}{2}}(\det{\Sigma})^{-\frac{1}{2}}e^{-\frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu)} \\ c' &= e^{-\frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu)} \\ c'' &= (x - \mu)^T\Sigma^{-1}(x-\mu) \\ &= (x - \mu)^T(P\Lambda P^T)^{-1}(x-\mu) \\ &= (x-\mu)^TP\Lambda^{-1}P^T(x-\mu) \\ &= y^T \Lambda^{-1} y, \end{align}

where

\begin{align} c' &= c\,(2\pi)^{\frac{k}{2}}(\det{\Sigma})^{\frac{1}{2}} \\ c'' &= 2\log{\frac{1}{c'}}\\ y &= P^T(x-\mu) \end{align}
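
Putting the derivation to work: a sketch (assuming Armadillo, 2D case; the function name gauss_ellipse is my own) that generates points on the level-set ellipse \(y^T\Lambda^{-1}y = s\) for plotting:

    #include <armadillo>
    #include <cmath>

    // Points on the 2D ellipse (x - mu)^T Sigma^{-1} (x - mu) = s,
    // returned as a 2 x n matrix of column vectors suitable for plotting.
    arma::mat gauss_ellipse(const arma::vec & mu, const arma::mat & Sigma,
                            double s, arma::uword n = 100)
    {
        arma::vec lambda; // eigenvalues of Sigma
        arma::mat P;      // corresponding eigenvectors, stored as columns
        arma::eig_sym(lambda, P, Sigma);

        arma::mat pts(2, n);
        for (arma::uword i = 0; i < n; ++i)
        {
            const double t = 2.0 * arma::datum::pi * i / n;
            // axis-aligned ellipse in y coordinates, semi-axes sqrt(s * lambda_i)
            const arma::vec y{std::sqrt(s * lambda(0)) * std::cos(t),
                              std::sqrt(s * lambda(1)) * std::sin(t)};
            pts.col(i) = P * y + mu; // rotate by P and shift to the mean
        }
        return pts;
    }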

Author: Matthew Elwin