BayesianVARMAX#

class pymc_extras.statespace.models.BayesianVARMAX(order: tuple[int, int], endog_names: list[str] | None = None, exog_state_names: list[str] | dict[str, list[str]] | None = None, stationary_initialization: bool = False, filter_type: str = 'standard', measurement_error: bool = False, verbose: bool = True, mode: str | Mode | None = None)[source]#

Vector AutoRegressive Moving Average with eXogenous Regressors

The VARMA model is a multivariate extension of the SARIMAX model. Given a set of timeseries \(\{x_t\}_{t=0}^T\), with \(x_t = \begin{bmatrix} x_{1,t} & x_{2,t} & \cdots & x_{k,t} \end{bmatrix}^T\), a VARMA models each series as a function of the histories of all series. Specifically, denoting the AR-MA order as (p, q), a VARMA can be written:

\[x_t = A_1 x_{t-1} + A_2 x_{t-2} + \cdots + A_p x_{t-p} + B_1 \varepsilon_{t-1} + \cdots + B_q \varepsilon_{t-q} + \varepsilon_t\]

Where \(\varepsilon_t = \begin{bmatrix} \varepsilon_{1,t} & \varepsilon_{2,t} & \cdots & \varepsilon_{k,t}\end{bmatrix}^T \sim N(0, \Sigma)\) is a vector of i.i.d stochastic innovations or shocks that drive intertemporal variation in the data. Matrices \(A_i, B_i\) are \(k \times k\) coefficient matrices:

\[\begin{split}A_i = \begin{bmatrix} \rho_{1,i,1} & \rho_{1,i,2} & \cdots & \rho_{1,i,k} \\ \rho_{2,i,1} & \rho_{2,i,2} & \cdots & \rho_{2,i,k} \\ \vdots & \vdots & \cdots & \vdots \\ \rho{k,i,1} & \rho_{k,i,2} & \cdots & rho_{k,i,k} \end{bmatrix}\end{split}\]

Internally, this representation is not used. Instead, the vectors \(x_t, x_{t-1}, \cdots, x_{t-p}, \varepsilon_{t-1}, \cdots, \varepsilon_{t-q}\) are concatenated into a single column vector of length k * (p+q), while the coefficients matrices are likewise concatenated into a single coefficient matrix, \(T\).

As the dimensionality of the VARMA system increases – either because there are a large number of timeseries included in the analysis, or because the order is large – the probability of sampling a stationary matrix \(T\) goes to zero. This has two implications for applied work. First, a non-stationary system will exhibit explosive behavior, potentially rending impulse response functions and long-term forecasts useless. Secondly, it is not possible to do stationary initialization. Stationary initialization significantly speeds up sampling, and should be preferred when possible.

Examples

The following code snippet estimates a VARMA(1, 1):

import pymc_extras.statespace as pmss
import pymc as pm

# Create VAR Statespace Model
bvar_mod = pmss.BayesianVARMAX(endog_names=data.columns, order=(2, 0),
                               stationary_initialization=False, measurement_error=False,
                               filter_type="standard", verbose=True)

# Unpack dims and coords
x0_dims, P0_dims, state_cov_dims, ar_dims = bvar_mod.param_dims.values()
coords = bvar_mod.coords

# Estimate PyMC model
with pm.Model(coords=coords) as var_mod:
    x0 = pm.Normal("x0", dims=x0_dims)
    P0_diag = pm.Gamma("P0_diag", alpha=2, beta=1, size=data.shape[1] * 2, dims=P0_dims[0])
    P0 = pm.Deterministic("P0", pt.diag(P0_diag), dims=P0_dims)

    state_chol, _, _ = pm.LKJCholeskyCov(
        "state_chol", eta=1, n=bvar_mod.k_posdef, sd_dist=pm.Exponential.dist(lam=1)
    )

    ar_params = pm.Normal("ar_params", mu=0, sigma=1, dims=ar_dims)
    state_cov = pm.Deterministic("state_cov", state_chol @ state_chol.T, dims=state_cov_dims)

    bvar_mod.build_statespace_graph(data)
    idata = pm.sample(nuts_sampler="numpyro")
__init__(order: tuple[int, int], endog_names: list[str] | None = None, exog_state_names: list[str] | dict[str, list[str]] | None = None, stationary_initialization: bool = False, filter_type: str = 'standard', measurement_error: bool = False, verbose: bool = True, mode: str | Mode | None = None)[source]#

Create a Bayesian VARMAX model.

Parameters:
  • order (tuple of (int, int)) – Number of autoregressive (AR) and moving average (MA) terms to include in the model. All terms up to the specified order are included. For restricted models, set zeros directly on the priors.

  • endog_names (list of str, optional) – Names of the endogenous variables being modeled. Used to generate names for the state and shock coords.

  • exog_state_names (list[str] or dict[str, list[str]], optional) – Names of the exogenous state variables. If a list, all endogenous variables will share the same exogenous variables. If a dict, keys should be the names of the endogenous variables, and values should be lists of the exogenous variable names for that endogenous variable. Endogenous variables not included in the dict will be assumed to have no exogenous variables. If None, no exogenous variables will be included.

  • stationary_initialization (bool, default False) –

    If true, the initial state and initial state covariance will not be assigned priors. Instead, their steady state values will be used. If False, the user is responsible for setting priors on the initial state and initial covariance.

    ..warning :: This option is very sensitive to the priors placed on the AR and MA parameters. If the model dynamics

    for a given sample are not stationary, sampling will fail with a “covariance is not positive semi-definite” error.

  • filter_type (str, default "standard") – The type of Kalman Filter to use. Options are “standard”, “single”, “univariate”, “steady_state”, and “cholesky”. See the docs for kalman filters for more details.

  • measurement_error (bool, default True) – If true, a measurement error term is added to the model.

  • verbose (bool, default True) – If true, a message will be logged to the terminal explaining the variable names, dimensions, and supports.

  • mode (str or Mode, optional) –

    Pytensor compile mode, used in auxiliary sampling methods such as sample_conditional_posterior and forecast. The mode does not effect calls to pm.sample.

    Regardless of whether a mode is specified, it can always be overwritten via the compile_kwargs argument to all sampling methods.

Methods

__init__(order[, endog_names, ...])

Create a Bayesian VARMAX model.

add_default_priors()

Add default priors to the active PyMC model context

build_statespace_graph(data[, ...])

Given a parameter vector theta, constructs the full computational graph describing the state space model and the associated log probability of the data.

default_coords()

forecast(idata[, start, periods, end, ...])

Generate forecasts of state space model trajectories into the future.

impulse_response_function(idata[, n_steps, ...])

Generate impulse response functions (IRF) from state space model dynamics.

make_and_register_data(name, shape[, dtype])

Helper function to create a pytensor symbolic variable and register it in the _name_to_data dictionary

make_and_register_variable(name[, shape, dtype])

Helper function to create a pytensor symbolic variable and register it in the _name_to_variable dictionary

make_symbolic_graph()

The purpose of the make_symbolic_graph function is to hide tedious parameter allocations from the user.

sample_conditional_posterior(idata[, ...])

Sample from the conditional posterior; that is, given parameter draws from the posterior distribution, compute Kalman filtered trajectories.

sample_conditional_prior(idata[, ...])

Sample from the conditional prior; that is, given parameter draws from the prior distribution, compute Kalman filtered trajectories.

sample_filter_outputs(idata[, ...])

sample_statespace_matrices(idata, matrix_names)

Draw samples of requested statespace matrices from provided idata

sample_unconditional_posterior(idata[, ...])

Draw unconditional sample trajectories according to state space dynamics, using random samples from the posterior distribution over model parameters.

sample_unconditional_prior(idata[, steps, ...])

Draw unconditional sample trajectories according to state space dynamics, using random samples from the prior distribution over model parameters.

set_coords()

Provide coordinates to the model.

set_data_info()

Provide data_info metadata to the model.

set_parameters()

Provides parameter metadata to the model.

set_shocks()

Provide shock metadata to the model.

set_states()

Provide state metadata to the model.

unpack_statespace()

Helper function to quickly obtain all statespace matrices in the standard order.

Attributes

coords

PyMC model coordinates

data_info

Information about Data variables that need to be declared in the PyMC model block.

data_names

Names of data variables expected by the model.

default_priors

Dictionary of parameter names and callable functions to construct default priors for the model

n_timesteps

Symbolic placeholder for the number of time steps in the data.

observed_states

A k_endog length list of strings, associated with the model's observed states

param_dims

Dictionary of named dimensions for each model parameter

param_info

Information about parameters needed to declare priors

param_names

Names of model parameters

shock_names

A k_posdef length list of strings, associated with the model's shock processes

state_names

A k_states length list of strings, associated with the model's hidden states