pymc.MatrixNormal

class pymc.MatrixNormal(name, *args, rng=None, dims=None, initval=None, observed=None, total_size=None, transform=UNSET, **kwargs)

Matrix-valued normal log-likelihood.

\[f(x \mid \mu, U, V) = \frac{1}{\left((2\pi)^{m n} |U|^n |V|^m\right)^{1/2}} \exp\left\{ -\frac{1}{2} \mathrm{Tr}\left[ V^{-1} (x-\mu)^{\prime} U^{-1} (x-\mu)\right] \right\}\]

Support

\(x \in \mathbb{R}^{m \times n}\)

Mean

\(\mu\)

Row Variance

\(U\)

Column Variance

\(V\)
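Equivalently, a matrix normal is a multivariate normal on the flattened matrix. This standard identity is what the Examples section exploits (note that row-major flattening, numpy's default, corresponds to \(\mathrm{vec}(X^{\prime})\)):

\[X \sim \mathcal{MN}(\mu, U, V) \iff \mathrm{vec}(X^{\prime}) \sim \mathcal{N}\left(\mathrm{vec}(\mu^{\prime}), U \otimes V\right)\]

where \(\mathrm{vec}\) stacks columns and \(\otimes\) denotes the Kronecker product.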

Parameters:
mu : tensor_like of float

Array of means. Must be broadcastable with the random variable X such that the shape of mu + X is (M, N).

rowcov : (M, M) tensor_like of float, optional

Among-row covariance matrix. Defines variance within columns. Exactly one of rowcov or rowchol is needed.

rowchol : (M, M) tensor_like of float, optional

Cholesky decomposition of among-row covariance matrix. Exactly one of rowcov or rowchol is needed.

colcov : (N, N) tensor_like of float, optional

Among-column covariance matrix. If rowcov is the identity matrix, this functions as cov in MvNormal. Exactly one of colcov or colchol is needed.

colchol : (N, N) tensor_like of float, optional

Cholesky decomposition of among-column covariance matrix. Exactly one of colcov or colchol is needed.

Examples

Define a matrix-variate normal variable for given row and column covariance matrices:

import numpy as np
import pymc as pm

colcov = np.array([[1., 0.5], [0.5, 2]])
rowcov = np.array([[1, 0, 0], [0, 4, 0], [0, 0, 16]])
m = rowcov.shape[0]
n = colcov.shape[0]
mu = np.zeros((m, n))

with pm.Model() as model:
    vals = pm.MatrixNormal('vals', mu=mu, colcov=colcov,
                           rowcov=rowcov)
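The Cholesky-factor parameterization is interchangeable with the covariance one: exactly one of rowcov/rowchol and one of colcov/colchol must be given. A minimal sketch, reusing the matrices above and computing the factors with np.linalg.cholesky (the name vals_chol is illustrative):

rowchol = np.linalg.cholesky(rowcov)  # lower-triangular factor of rowcov
colchol = np.linalg.cholesky(colcov)  # lower-triangular factor of colcov

with pm.Model():
    vals_chol = pm.MatrixNormal('vals_chol', mu=mu,
                                rowchol=rowchol, colchol=colchol)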

In the first example above, the ith row of vals (i = 0, 1, 2) has a variance scaled by 4^i. Alternatively, row or column Cholesky matrices can be substituted for either covariance matrix, as in the sketch above. MatrixNormal is a quicker way to compute MvNormal(mu.flatten(), np.kron(rowcov, colcov)) that takes advantage of Kronecker product properties for inversion (a numerical check of this equivalence follows the example below). For example, if draws from MvNormal had the same covariance structure but were scaled by different powers of an unknown constant, both the covariance and the scaling could be learned as follows (see the docstring of LKJCholeskyCov for more information about this):

import pytensor.tensor as pt

# Setup data
true_colcov = np.array([[1.0, 0.5, 0.1],
                        [0.5, 1.0, 0.2],
                        [0.1, 0.2, 1.0]])
m = 3
n = true_colcov.shape[0]
true_scale = 3
true_rowcov = np.diag([true_scale**(2*i) for i in range(m)])
mu = np.zeros((m, n))
true_kron = np.kron(true_rowcov, true_colcov)
data = np.random.multivariate_normal(mu.flatten(), true_kron)
data = data.reshape(m, n)

with pm.Model() as model:
    # Setup right Cholesky matrix (compute_corr=False returns the
    # packed lower triangle, which expand_packed_triangular unpacks)
    sd_dist = pm.HalfCauchy.dist(beta=2.5, shape=3)
    colchol_packed = pm.LKJCholeskyCov('colcholpacked', n=3, eta=2,
                                       sd_dist=sd_dist, compute_corr=False)
    colchol = pm.expand_packed_triangular(3, colchol_packed)

    # Setup left covariance matrix
    scale = pm.LogNormal('scale', mu=np.log(true_scale), sigma=0.5)
    rowcov = pt.diag([scale**(2*i) for i in range(m)])

    vals = pm.MatrixNormal('vals', mu=mu, colchol=colchol, rowcov=rowcov,
                           observed=data)
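As noted above, the matrix normal log-density agrees with that of the Kronecker-structured MvNormal. A quick numerical check of this (a sketch reusing the variables from the data setup; pm.logp builds the log-density graph and .eval() evaluates it):

point = np.ones((m, n))
matnorm = pm.MatrixNormal.dist(mu=mu, rowcov=true_rowcov, colcov=true_colcov)
mvnorm = pm.MvNormal.dist(mu=mu.flatten(), cov=true_kron)
print(pm.logp(matnorm, point).eval())           # matrix-variate form
print(pm.logp(mvnorm, point.flatten()).eval())  # same value, vectorized form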

Methods

MatrixNormal.dist(mu[, rowcov, rowchol, ...])

Creates a tensor variable corresponding to the MatrixNormal distribution.
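dist creates the variable outside of any model, which is handy for prior sampling. A minimal sketch using pm.draw (the name matnorm and the shapes are illustrative):

matnorm = pm.MatrixNormal.dist(mu=np.zeros((3, 2)),
                               rowcov=np.eye(3), colcov=np.eye(2))
samples = pm.draw(matnorm, draws=100)  # array of shape (100, 3, 2)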