Jointly Gaussian Distributions
Overview
- Also called Multivariate Normal Distributions (or Multi-Normal)
- \(X(\xi) \sim \mathcal{N}(\mu, \Sigma)\),
- \(X(\xi)\) is a jointly Gaussian random variable (\(\xi\) is an element of the sample space)
- \(\mu \in \mathbb{R}^k\) is the mean: \(\mu = E[X]\) (\(E[\cdot]\) is the expected value; I will also drop the dependence on the sample space).
- \(\Sigma \in \mathbb{R}^{k \times k}\) is the covariance. \(\Sigma = E[(X - \mu)(X-\mu)^T]\), and is positive definite.
The probability density function (pdf) for a jointly Gaussian random variable is
\begin{equation}\label{eq:mvgauss} f_X(x) = (2\pi)^{-\frac{k}{2}}(\det{\Sigma})^{-\frac{1}{2}}e^{-\frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu)} \end{equation}
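As a quick sanity check, for \(k = 1\) with \(\Sigma = \sigma^2\) this reduces to the familiar univariate density:
\begin{equation} f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x - \mu)^2}{2\sigma^2}} \end{equation}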
Linear Transform
- Let \(Y = A X + b\), where \(X \sim \mathcal{N}(\mu, \Sigma)\) and \(A \in \mathbb{R}^{k \times k}\) is invertible
- Then \(Y \sim \mathcal{N}(A \mu + b, A \Sigma A^T)\)
Sketch of a Proof
Use the multivariable change of variables formula to find the new density:
\begin{equation} f_Y(y) = \frac{f_X(A^{-1}(y - b))}{\lvert \det A \rvert} \end{equation}
- Show that the new density has the same form as the multivariate Gaussian equation~\eqref{eq:mvgauss}.
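The mean and covariance of \(Y\) also follow directly from linearity of expectation:
\begin{align} E[Y] &= E[AX + b] = A\mu + b \\ E[(Y - E[Y])(Y - E[Y])^T] &= E[A(X - \mu)(X - \mu)^T A^T] = A \Sigma A^T \end{align}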
Conditional Distribution
- Let \(X \sim \mathcal{N}(\mu, \Sigma)\).
Partition:
\begin{align} X &= \begin{bmatrix}X_1 \\ X_2\end{bmatrix} \\ \mu &= \begin{bmatrix}\mu_1 \\ \mu_2\end{bmatrix}\\ \Sigma &= \begin{bmatrix}\Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22}\end{bmatrix} \end{align}
- What is the conditional probability distribution \(f_{X_1}(x_1 \mid X_2 = a)\)?
Let \(Y = (X_1 \mid X_2 = a)\) (\(Y\) is a random variable whose pdf is the conditional density of \(X_1\) given \(X_2 = a\)):
\begin{equation} Y \sim \mathcal{N}(\mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(a - \mu_2), \Sigma_{11}- \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}) \end{equation}
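As a concrete example (numbers chosen purely for illustration), take scalar blocks with \(\mu_1 = \mu_2 = 0\), \(\Sigma_{11} = \Sigma_{22} = 1\), and \(\Sigma_{12} = \Sigma_{21} = 0.5\). Then
\begin{equation} Y \sim \mathcal{N}(0.5\,a,\; 1 - 0.5 \cdot 0.5) = \mathcal{N}(0.5\,a,\; 0.75), \end{equation}
so observing \(X_2 = a\) shifts the mean of \(X_1\) toward the observation and shrinks its variance.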
Random Number Generation in C++
If you intend to use the C++ standard library for generating random numbers, you can use the following singleton pattern to ensure that you are only seeding the random number generator once in your process.

    #include <random>

    std::mt19937 & get_random()
    {
        // static variables inside a function are created once and persist for the remainder of the program
        static std::random_device rd{};
        static std::mt19937 mt{rd()};
        // we return a reference to the pseudo-random number generator object. This is always the
        // same object every time get_random is called
        return mt;
    }

    // To generate a Gaussian random variable
    // (note: std::normal_distribution takes the standard deviation, not the variance):
    std::normal_distribution<> d(mean, std_dev);
    d(get_random());
- Armadillo also has random number generation functions
- See, for example, http://arma.sourceforge.net/docs.html#randu_randn_standalone
- These random number generators are different from the C++ standard library generators shown above
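For example, Armadillo's generator can be seeded once at startup and then used for standard-normal draws. A minimal sketch using Armadillo's documented set_seed_random() and randn():

    #include <armadillo>

    int main()
    {
        // seed Armadillo's internal generator once (this is separate from std::mt19937 above)
        arma::arma_rng::set_seed_random();

        // three independent draws from a Gaussian with mean 0 and variance 1
        arma::vec u = arma::randn(3);
        u.print("u:");
        return 0;
    }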
Multivariate Gaussian
- An algorithm for generating multivariate Gaussian noise, given mean \(m \in \mathbb{R}^n\) and covariance \(Q \in \mathbb{R}^{n \times n}\), is as follows (see the sketch after this list):
  - Let \(L\) be the (lower) Cholesky decomposition of \(Q\) (that is, \(L\) is lower triangular such that \(L L^T = Q\)).
    - Use the chol() function from Armadillo or the llt() function from Eigen.
  - Let \(u \in \mathbb{R}^n\) be generated such that each entry is independently drawn from a Gaussian distribution with mean 0 and variance 1.
  - Then \(v = m + Lu\) will have the desired distribution.
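A minimal sketch of this algorithm using Armadillo (the helper name sample_mvn and its signature are illustrative, not a library function):

    #include <armadillo>

    // Draw one sample v = m + L u from N(m, Q).
    arma::vec sample_mvn(const arma::vec & m, const arma::mat & Q)
    {
        // lower-triangular L such that L * L.t() == Q
        const arma::mat L = arma::chol(Q, "lower");
        // u: each entry independently drawn from N(0, 1)
        const arma::vec u = arma::randn(m.n_elem);
        // v = m + L u has mean m and covariance L L^T = Q
        return m + L * u;
    }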
Plotting Contours
- The level-sets of a Gaussian distribution (\(f_X(x) = c\)) form ellipsoids.
- Plotting the ellipsoid of a Gaussian is an excellent method of visualizing it (at least in 2 or 3 dimensions).
- Diagonalize \(\Sigma\) so that \(\Sigma = P \Lambda P^T\), where the columns of \(P\) are eigenvectors of \(\Sigma\), \(\Lambda\) is a diagonal matrix of eigenvalues, and \(P\) is orthogonal (\(PP^T = P^TP = I\)). We can do this since the covariance is symmetric positive definite.
- The semi-axis lengths of the ellipse are proportional to the square roots of the eigenvalues of \(\Sigma\): semi-axis \(i\) has length \(\sqrt{c'' \lambda_i}\), with \(c''\) defined in the derivation below.
- The axes are aligned with the eigenvectors of \(\Sigma\) (the columns of \(P\)): the ellipse is axis-aligned in the \(y\) coordinates and maps back to the \(x\) coordinates through the rotation \(P\).
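As a sketch of how these facts translate into plotting code (assuming Armadillo; the helper name ellipse_points and its signature are illustrative):

    #include <armadillo>
    #include <cmath>

    // Points on the 2-D level-set ellipse (x - mu)^T Sigma^{-1} (x - mu) = c.
    arma::mat ellipse_points(const arma::vec & mu, const arma::mat & Sigma,
                             double c = 1.0, int n = 100)
    {
        arma::vec lambda; // eigenvalues of Sigma (ascending)
        arma::mat P;      // corresponding eigenvectors as columns
        arma::eig_sym(lambda, P, Sigma);

        arma::mat pts(2, n);
        for (int i = 0; i < n; ++i)
        {
            const double t = 2.0 * arma::datum::pi * i / (n - 1);
            // axis-aligned ellipse in the y coordinates: semi-axis i is sqrt(c * lambda_i)
            const arma::vec y = { std::sqrt(c * lambda(0)) * std::cos(t),
                                  std::sqrt(c * lambda(1)) * std::sin(t) };
            // rotate back into the x coordinates: x = mu + P y
            pts.col(i) = mu + P * y;
        }
        return pts;
    }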
Derivation
\begin{align}
c &= (2\pi)^{-\frac{k}{2}}(\det{\Sigma})^{-\frac{1}{2}}e^{-\frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu)} \\
c' &= e^{-\frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu)} \\
c'' &= (x - \mu)^T\Sigma^{-1}(x-\mu) \\
c'' &= (x - \mu)^T(P\Lambda P^T)^{-1}(x-\mu) \\
c'' &= (x-\mu)^TP\Lambda^{-1}P^T(x-\mu) \\
c'' &= y^T \Lambda^{-1} y,
\end{align}
where
\begin{align} c' &= c\,(2\pi)^{\frac{k}{2}}(\det{\Sigma})^{\frac{1}{2}} \\ c'' &= 2\log{\frac{1}{c'}}\\ y &= P^T(x-\mu) \end{align}
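Writing the last line of the derivation out component-wise makes the ellipsoid explicit:
\begin{equation} \sum_{i=1}^{k} \frac{y_i^2}{\lambda_i} = c'', \end{equation}
an axis-aligned ellipsoid in the \(y\) coordinates with semi-axis lengths \(\sqrt{c'' \lambda_i}\); mapping back through \(x = \mu + P y\) rotates it by \(P\) and centers it at \(\mu\).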