PytensorRepresentation#

class pymc_experimental.statespace.core.PytensorRepresentation(k_endog: int, k_states: int, k_posdef: int, design: ndarray | None = None, obs_intercept: ndarray | None = None, obs_cov=None, transition=None, state_intercept=None, selection=None, state_cov=None, initial_state=None, initial_state_cov=None)[source]#

Core class to hold all objects required by linear Gaussian statespace models.

Notation for the linear statespace model is taken from [1], while the specific implementation is adapted from the statsmodels implementation (statsmodels/statsmodels) described in [2].

Parameters:
  • k_endog (int) – Number of observed states (called “endogenous states” in statsmodels)

  • k_states (int) – Number of hidden states

  • k_posdef (int) – Number of states that have exogenous shocks; also the rank of the selection matrix R.

  • design (ArrayLike, optional) – Design matrix, denoted ‘Z’ in [1].

  • obs_intercept (ArrayLike, optional) – Constant vector in the observation equation, denoted ‘d’ in [1]. Currently not used.

  • obs_cov (ArrayLike, optional) – Covariance matrix for multivariate-normal errors in the observation equation. Denoted ‘H’ in [1].

  • transition (ArrayLike, optional) – Transition equation that updates the hidden state between time-steps. Denoted ‘T’ in [1].

  • state_intercept (ArrayLike, optional) – Constant vector in the state transition equation, denoted ‘c’ in [1]. Currently not used.

  • selection (ArrayLike, optional) – Selection matrix that matches shocks to hidden states, denoted ‘R’ in [1]. This is the identity matrix when k_posdef = k_states.

  • state_cov (ArrayLike, optional) – Covariance matrix for the state innovations, denoted ‘Q’ in [1]. Null matrix when the hidden states have no shocks.

  • initial_state (ArrayLike, optional) – Experimental setting to allow for Bayesian estimation of the initial state, denoted alpha_0 in [1]. It should potentially be removed in favor of the closed-form diffuse initialization.

  • initial_state_cov (ArrayLike, optional) – Experimental setting to allow for Bayesian estimation of the initial state covariance, denoted P_0 in [1]. It should potentially be removed in favor of the closed-form diffuse initialization.

Notes

A linear statespace system is defined by two equations:

\[\begin{split}\begin{align} x_t &= T_t x_{t-1} + c_t + R_t \varepsilon_t \tag{1} \\ y_t &= Z_t x_t + d_t + \eta_t \tag{2} \\ \end{align}\end{split}\]

Where \(\{x_t\}_{t=0}^T\) is a trajectory of hidden states, and \(\{y_t\}_{t=0}^T\) is a trajectory of observable states. Equation 1 is known as the “state transition equation”, which describes how the system evolves over time. Equation 2 is the “observation equation”, which maps the latent state process to observed data. The system is Gaussian when the innovations, \(\varepsilon_t\), and the measurement errors, \(\eta_t\), are normally distributed. The definition is completed by specifying these distributions, as well as an initial state distribution:

\[\begin{split}\begin{align} \varepsilon_t &\sim N(0, Q_t) \tag{3} \\ \eta_t &\sim N(0, H_t) \tag{4} \\ x_0 &\sim N(\bar{x}_0, P_0) \tag{5} \end{align}\end{split}\]
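As a concrete illustration of equations (1) to (5) (not part of the class API), the following NumPy sketch simulates a minimal system with one hidden state, one observed state, and one shock; all matrix values below are made up for the example.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative system matrices (m = p = r = 1); values are arbitrary
T = np.array([[1.0]])    # transition matrix
Z = np.array([[1.0]])    # design matrix
R = np.array([[1.0]])    # selection matrix
Q = np.array([[0.1]])    # state innovation covariance
H = np.array([[0.5]])    # observation noise covariance
c = np.zeros(1)          # state intercept
d = np.zeros(1)          # observation intercept
x0_mean, P0 = np.zeros(1), np.eye(1)

x = rng.multivariate_normal(x0_mean, P0)              # eq. (5): x_0 ~ N(x_bar_0, P_0)
hidden, observed = [], []
for _ in range(100):
    eps = rng.multivariate_normal(np.zeros(1), Q)     # eq. (3): state innovation
    eta = rng.multivariate_normal(np.zeros(1), H)     # eq. (4): measurement error
    x = T @ x + c + R @ eps                           # eq. (1): state transition
    y = Z @ x + d + eta                               # eq. (2): observation
    hidden.append(x)
    observed.append(y)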

The 9 matrices that form equations 1 to 5 are summarized in the table below. We call \(N\) the number of observations, \(m\) the number of hidden states, \(p\) the number of observed states, and \(r\) the number of innovations.

Name                                  Symbol        Shape
--------------------------------------------------------------
Initial hidden state mean             \(x_0\)       \(m \times 1\)
Initial hidden state covariance       \(P_0\)       \(m \times m\)
Hidden state vector intercept         \(c_t\)       \(m \times 1\)
Observed state vector intercept       \(d_t\)       \(p \times 1\)
Transition matrix                     \(T_t\)       \(m \times m\)
Design matrix                         \(Z_t\)       \(p \times m\)
Selection matrix                      \(R_t\)       \(m \times r\)
Observation noise covariance          \(H_t\)       \(p \times p\)
Hidden state innovation covariance    \(Q_t\)       \(r \times r\)

The shapes listed above are the core shapes, but in the general case all of these matrices (except for \(x_0\) and \(P_0\)) can be time-varying. In this case, a time dimension of length \(N\), equal to the number of observations, can be added.

Warning

The time dimension is used as a batch dimension during Kalman filtering, and must therefore always be the leftmost dimension.
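As a plain NumPy sketch of this convention (purely illustrative, not part of the class API), a time-varying design matrix for five observations with \(p = 1\) and \(m = 3\) would be stored with the time axis first:

import numpy as np

# Stack five (1 x 3) design matrices along a leading time axis -> shape (5, 1, 3)
Z_t = np.stack([np.array([[1.0, float(t), 0.0]]) for t in range(5)], axis=0)
print(Z_t.shape)
>>> (5, 1, 3)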

The purpose of this class is to store these matrices, as well as to allow users to easily index into them. Matrices are stored as pytensor TensorVariables of known shape. Shapes are always accessible via the .type.shape attribute, which should never return None. Matrices can be accessed via normal numpy array slicing after first indexing by the name of the desired array. The time dimension is stored on the far left, and is automatically sliced away unless specifically requested by the user. See the examples for details.

Examples

import numpy as np
from pymc_experimental.statespace.core.representation import PytensorRepresentation

ssm = PytensorRepresentation(k_endog=1, k_states=3, k_posdef=1)

# Access matrices by their names
print(ssm['transition'].type.shape)
>>> (3, 3)

# Slice a matrix
print(ssm['obs_cov', 0, 0].eval())
>>> 0.0

# Set elements in a slice of a matrix
ssm['design', 0, 0] = 1
print(ssm['design'].eval())
>>> np.array([[1, 0, 0]])

# Setting an entire matrix is also permitted. If you set a time dimension, it must be the first dimension, and
# the "core" dimensions must agree with those set when the ssm object was instantiated.
ssm['obs_intercept'] = np.arange(10).reshape(10, 1)  # 10 timesteps
print(ssm['obs_intercept'].eval())
>>> np.array([[0.], [1.], [2.], [3.], [4.], [5.], [6.], [7.], [8.], [9.]])
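A further sketch, using the same ssm object: an entire core matrix can also be replaced directly, as long as it matches the dimensions fixed at instantiation (the transition values below are arbitrary and chosen only for illustration).

# Replace the full (3 x 3) transition matrix
ssm['transition'] = np.array([[0.9, 1.0, 0.0],
                              [0.0, 1.0, 0.0],
                              [0.0, 0.0, 0.5]])
print(ssm['transition'].eval())
>>> np.array([[0.9, 1., 0.], [0., 1., 0.], [0., 0., 0.5]])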

References

[1] Durbin, J., and S. J. Koopman. Time Series Analysis by State Space Methods. 2nd ed. Oxford University Press, 2012.

[2] Fulton, Chad. “Estimating time series models by state space methods in Python: Statsmodels.” 2015.

__init__(k_endog: int, k_states: int, k_posdef: int, design: ndarray | None = None, obs_intercept: ndarray | None = None, obs_cov=None, transition=None, state_intercept=None, selection=None, state_cov=None, initial_state=None, initial_state_cov=None) → None[source]#

Methods

__init__(k_endog, k_states, k_posdef[, ...])