Jointly Gaussian Distributions
Overview
- Also called Multivariate Normal Distributions (or Multi-Normal)
- \(X(\xi) \sim \mathcal{N}(\mu, \Sigma)\),
- \(X(\xi)\) is a jointly Gaussian random variable (\(\xi\) is an element of the sample space)
- \(\mu \in \mathbb{R}^k\) is the mean: \(\mu = E[X]\) (\(E[\cdot]\) is the expected value; I will also drop the dependence on the sample space).
- \(\Sigma \in \mathbb{R}^{k \times k}\) is the covariance. \(\Sigma = E[(X - \mu)(X-\mu)^T]\), and is positive definite.
The probability density function (pdf) for a jointly Gaussian random variable is
\begin{equation}\label{eq:mvgauss} f_X(x) = (2\pi)^{-\frac{k}{2}}(\det{\Sigma})^{-\frac{1}{2}}e^{-\frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu)} \end{equation}
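As a quick sanity check, for \(k = 1\) with \(\Sigma = \sigma^2\) this reduces to the familiar univariate density:
\begin{equation} f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x - \mu)^2}{2\sigma^2}} \end{equation}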
Linear Transform
- Let \(Y = A X + b\), where \(X \sim \mathcal{N}(\mu, \Sigma)\) and \(A \in \mathbb{R}^{k \times k}\) is invertible
- Then \(Y \sim \mathcal{N}(A \mu + b, A \Sigma A^T)\)
Sketch of a Proof
Use the multivariable change of variables formula to find the new density:
\begin{equation} f_Y(y) = \frac{f_X(A^{-1}(y - b))}{\lvert \det A \rvert} \end{equation}
- Show that the new density has the same form as the multivariate Gaussian equation~\eqref{eq:mvgauss}.
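The mean and covariance of \(Y\) also follow directly from linearity of expectation:
\begin{align} E[Y] &= E[AX + b] = A\mu + b \\ E[(Y - E[Y])(Y - E[Y])^T] &= E[A(X - \mu)(X - \mu)^T A^T] = A \Sigma A^T \end{align}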
Conditional Distribution
- Let \(X \sim \mathcal{N}(\mu, \Sigma)\).
Partition:
\begin{align} X &= \begin{bmatrix}X_1 \\ X_2\end{bmatrix} \\ \mu &= \begin{bmatrix}\mu_1 \\ \mu_2\end{bmatrix}\\ \Sigma &= \begin{bmatrix}\Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22}\end{bmatrix} \end{align}
- What is the conditional probability distribution \(f_{X_1}(x_1 \mid X_2 = a)\)?
Let \(Y = (X_1 \mid X_2 = a)\) (\(Y\) is a random variable whose pdf is the conditional density of \(X_1\) given \(X_2 = a\)):
\begin{equation} Y \sim \mathcal{N}(\mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(a - \mu_2), \Sigma_{11}- \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}) \end{equation}
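As a concrete example (numbers chosen purely for illustration), take scalar blocks with \(\mu_1 = \mu_2 = 0\), \(\Sigma_{11} = \Sigma_{22} = 1\), and \(\Sigma_{12} = \Sigma_{21} = 0.5\). Then
\begin{equation} Y \sim \mathcal{N}(0.5\,a,\; 1 - 0.5 \cdot 0.5) = \mathcal{N}(0.5\,a,\; 0.75), \end{equation}
so observing \(X_2 = a\) shifts the mean of \(X_1\) toward the observation and shrinks its variance.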
Random Number Generation in C++
If you intend to use the C++ standard library for generating random numbers, you can use the following singleton pattern to ensure that you are only seeding the random number generator once in your process.

    #include <random>

    std::mt19937 & get_random()
    {
        // static variables inside a function are created once and persist for the remainder of the program
        static std::random_device rd{};
        static std::mt19937 mt{rd()};
        // we return a reference to the pseudo-random number generator object. This is always the
        // same object every time get_random is called
        return mt;
    }

    // To generate a Gaussian random variable
    // (note: std::normal_distribution takes the standard deviation, not the variance):
    std::normal_distribution<> d(mean, std_dev);
    d(get_random());
- Armadillo also has random number generation functions
- See, for example, http://arma.sourceforge.net/docs.html#randu_randn_standalone
- These random number generators are different from the C++ standard library generators shown above
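For example, Armadillo's generator can be seeded once at startup and then used for standard-normal draws. A minimal sketch using Armadillo's documented set_seed_random() and randn():

    #include <armadillo>

    int main()
    {
        // seed Armadillo's internal generator once (this is separate from std::mt19937 above)
        arma::arma_rng::set_seed_random();

        // three independent draws from a Gaussian with mean 0 and variance 1
        arma::vec u = arma::randn(3);
        u.print("u:");
        return 0;
    }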
Multivariate Gaussian
- An algorithm for generating multivariate Gaussian noise, given mean \(m \in \mathbb{R}^n\) and covariance \(Q \in \mathbb{R}^{n \times n}\), is as follows (see the sketch after this list):
  - Let \(L\) be the (lower) Cholesky decomposition of \(Q\) (that is, \(L\) is lower triangular such that \(L L^T = Q\)).
    - Use the chol() function from Armadillo or the llt() function from Eigen.
  - Let \(u \in \mathbb{R}^n\) be generated such that each entry is independently drawn from a Gaussian distribution with mean 0 and variance 1.
  - Then \(v = m + Lu\) will have the desired distribution.
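A minimal sketch of this algorithm using Armadillo (the helper name sample_mvn and its signature are illustrative, not a library function):

    #include <armadillo>

    // Draw one sample v = m + L u from N(m, Q).
    arma::vec sample_mvn(const arma::vec & m, const arma::mat & Q)
    {
        // lower-triangular L such that L * L.t() == Q
        const arma::mat L = arma::chol(Q, "lower");
        // u: each entry independently drawn from N(0, 1)
        const arma::vec u = arma::randn(m.n_elem);
        // v = m + L u has mean m and covariance L L^T = Q
        return m + L * u;
    }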
Plotting Contours
- The level-sets of a Gaussian distribution (\(f_X(x) = c\)) form ellipsoids.
- Plotting the ellipsoid of a Gaussian is an excellent method of visualizing it (at least in 2 or 3 dimensions).
- Diagonalize \(\Sigma\) so that \(\Sigma = P \Lambda P^T\), where the columns of \(P\) are eigenvectors of \(\Sigma\), \(\Lambda\) is a diagonal matrix of eigenvalues, and \(P\) is orthogonal (\(PP^T = P^TP = I\)). We can do this since the covariance is symmetric positive definite.
- The semi-axis lengths of the ellipse are proportional to the square roots of the eigenvalues of \(\Sigma\): semi-axis \(i\) has length \(\sqrt{c'' \lambda_i}\), with \(c''\) defined in the derivation below.
- The axes are aligned with the eigenvectors of \(\Sigma\) (the columns of \(P\)): the ellipse is axis-aligned in the \(y\) coordinates and maps back to the \(x\) coordinates through the rotation \(P\).
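As a sketch of how these facts translate into plotting code (assuming Armadillo; the helper name ellipse_points and its signature are illustrative):

    #include <armadillo>
    #include <cmath>

    // Points on the 2-D level-set ellipse (x - mu)^T Sigma^{-1} (x - mu) = c.
    arma::mat ellipse_points(const arma::vec & mu, const arma::mat & Sigma,
                             double c = 1.0, int n = 100)
    {
        arma::vec lambda; // eigenvalues of Sigma (ascending)
        arma::mat P;      // corresponding eigenvectors as columns
        arma::eig_sym(lambda, P, Sigma);

        arma::mat pts(2, n);
        for (int i = 0; i < n; ++i)
        {
            const double t = 2.0 * arma::datum::pi * i / (n - 1);
            // axis-aligned ellipse in the y coordinates: semi-axis i is sqrt(c * lambda_i)
            const arma::vec y = { std::sqrt(c * lambda(0)) * std::cos(t),
                                  std::sqrt(c * lambda(1)) * std::sin(t) };
            // rotate back into the x coordinates: x = mu + P y
            pts.col(i) = mu + P * y;
        }
        return pts;
    }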
Derivation
\begin{align}
c &= (2\pi)^{-\frac{k}{2}}(\det{\Sigma})^{-\frac{1}{2}}e^{-\frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu)} \\
c' &= e^{-\frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu)} \\
c'' &= (x - \mu)^T\Sigma^{-1}(x-\mu) \\
c'' &= (x - \mu)^T(P\Lambda P^T)^{-1}(x-\mu) \\
c'' &= (x-\mu)^TP\Lambda^{-1}P^T(x-\mu) \\
c'' &= y^T \Lambda^{-1} y,
\end{align}
where
\begin{align} c' &= c\,(2\pi)^{\frac{k}{2}}(\det{\Sigma})^{\frac{1}{2}} \\ c'' &= 2\log{\frac{1}{c'}}\\ y &= P^T(x-\mu) \end{align}
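Writing the last line of the derivation out component-wise makes the ellipsoid explicit:
\begin{equation} \sum_{i=1}^{k} \frac{y_i^2}{\lambda_i} = c'', \end{equation}
an axis-aligned ellipsoid in the \(y\) coordinates with semi-axis lengths \(\sqrt{c'' \lambda_i}\); mapping back through \(x = \mu + P y\) rotates it by \(P\) and centers it at \(\mu\).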