R2D2M2CP

pymc_experimental.distributions.R2D2M2CP(name: str, output_sigma: Variable | Sequence[Variable] | ArrayLike, input_sigma: Variable | Sequence[Variable] | ArrayLike, *, dims: Sequence[str], r2: Variable | Sequence[Variable] | ArrayLike, variables_importance: Variable | Sequence[Variable] | ArrayLike | None = None, variance_explained: Variable | Sequence[Variable] | ArrayLike | None = None, importance_concentration: Variable | Sequence[Variable] | ArrayLike | None = None, r2_std: Variable | Sequence[Variable] | ArrayLike | None = None, positive_probs: Variable | Sequence[Variable] | ArrayLike | None = 0.5, positive_probs_std: Variable | Sequence[Variable] | ArrayLike | None = None, centered: bool = False) → R2D2M2CPOut

R2D2M2CP Prior.

Parameters:
  • name (str) – Name for the distribution

  • output_sigma (Tensor) – Output standard deviation

  • input_sigma (Tensor) – Input standard deviation

  • dims (Union[str, Sequence[str]]) – Named dims for the distribution (required)

  • r2 (Tensor) – \(R^2\) estimate

  • variables_importance (Tensor, optional) – Optional positive estimate of the variables' importance, by default None

  • variance_explained (Tensor, optional) – Alternative estimate of the variables' importance, given as a point estimate of the variance explained; should sum to one, by default None

  • importance_concentration (Tensor, optional) – Confidence around the variance explained or variable importance estimate, by default None

  • r2_std (Tensor, optional) – Optional uncertainty over \(R^2\), by default None

  • positive_probs (Tensor, optional) – Optional probability that each variable's contribution is positive, by default 0.5

  • positive_probs_std (Tensor, optional) – Optional uncertainty over effect direction probability, by default None

  • centered (bool, optional) – Centered or non-centered parametrization of the distribution, by default non-centered. It is advisable to try both.

Returns:

residual_sigma, coefficients – The output variance (sigma squared) is split into residual variance and explained variance.

Return type:

R2D2M2CPOut

Raises:

TypeError – If the parametrization is wrong.

Notes

The R2D2M2CP prior is a modification of the R2D2M2 prior.
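The core idea is the variance split named in the Returns section: the total output variance is divided by \(R^2\) into an explained part and a residual part. A numpy sketch of that arithmetic (illustrative only, with assumed values for output_sigma and r2; this is not the library's internal code):

```python
import numpy as np

output_sigma = 2.0   # assumed output standard deviation
r2 = 0.8             # assumed R^2 estimate

# split the total variance into explained and residual parts
explained_var = r2 * output_sigma**2
residual_var = (1 - r2) * output_sigma**2
residual_sigma = residual_var**0.5

# the two parts recover the total output variance
assert np.isclose(explained_var + residual_var, output_sigma**2)
```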

Examples

The arguments are explained below with a synthetic example.

Warning

To use the prior in a linear regression

  • make sure \(X\) is centered around zero

  • intercept represents prior predictive mean when \(X\) is centered

  • setting named dims is required
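The first two points can be illustrated with a minimal numpy sketch (synthetic data; the seed, coefficients, and intercept are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
b = np.array([1.0, -2.0, 0.5])

# center the regressors around zero
Xc = X - X.mean(0)

# with centered regressors, the intercept equals the mean prediction
intercept = 5.0
y = intercept + Xc @ b
assert np.isclose(y.mean(), intercept)
```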

import pymc_experimental as pmx
import pymc as pm
import numpy as np
X = np.random.randn(10, 3)
b = np.random.randn(3)
y = X @ b + np.random.randn(10) * 0.04 + 5
with pm.Model(coords=dict(variables=["a", "b", "c"])) as model:
    eps, beta = pmx.distributions.R2D2M2CP(
        "beta",
        y.std(),
        X.std(0),
        dims="variables",
        # NOTE: global shrinkage
        r2=0.8,
        # NOTE: if you are unsure about r2
        r2_std=0.2,
        # NOTE: if you know where a variable should go
        # if you do not know, leave as 0.5
        positive_probs=[0.8, 0.5, 0.1],
        # NOTE: if you have different opinions about
        # where a variable should go.
        # NOTE: if you put 0.5 previously,
        # just put 0.1 there, but other
        # sigmas should work fine too
        positive_probs_std=[0.3, 0.1, 0.2],
        # NOTE: variable importances are relative to each other,
        # but larger numbers put "more" weight in the relation
        # use
        # * 1-10 for small confidence
        # * 10-30 for moderate confidence
        # * 30+ for high confidence
        # EXAMPLE:
        # "a" - is likely to be useful
        # "b" - no idea if it is useful
        # "c" - a must have in the relation
        variables_importance=[10, 1, 34],
        # NOTE: try both
        centered=True
    )
    # intercept prior centering should be around prior predictive mean
    intercept = y.mean()
    # regressors should be centered around zero
    Xc = X - X.mean(0)
    obs = pm.Normal("obs", intercept + Xc @ beta, eps, observed=y)
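The variables_importance values act like concentration parameters of a Dirichlet distribution over the shares of explained variance, so only their relative sizes matter. A numpy sketch of the implied mean shares for the values used above (the standard Dirichlet mean alpha / alpha.sum(); an illustration, not the library's internals):

```python
import numpy as np

# concentrations taken from the example above: "a", "b", "c"
alpha = np.array([10.0, 1.0, 34.0])

# mean of a Dirichlet(alpha) is alpha / alpha.sum()
mean_share = alpha / alpha.sum()

# "c" (34) gets the largest expected share, "b" (1) the smallest
```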

Special cases can be obtained by choosing a specific set of arguments.

Here, the prior distribution of beta is Normal(0, y.std() * r2 ** 0.5).

with pm.Model(coords=dict(variables=["a", "b", "c"])) as model:
    eps, beta = pmx.distributions.R2D2M2CP(
        "beta",
        y.std(),
        X.std(0),
        dims="variables",
        # NOTE: global shrinkage
        r2=0.8,
        # NOTE: if you are unsure about r2
        r2_std=0.2,
        # NOTE: try both
        centered=False
    )
    # intercept prior centering should be around prior predictive mean
    intercept = y.mean()
    # regressors should be centered around zero
    Xc = X - X.mean(0)
    obs = pm.Normal("obs", intercept + Xc @ beta, eps, observed=y)

It is fine to leave some of the _std arguments unspecified. You can also specify only positive_probs; the variables are then assumed to explain the same amount of variance (equal importance).

with pm.Model(coords=dict(variables=["a", "b", "c"])) as model:
    eps, beta = pmx.distributions.R2D2M2CP(
        "beta",
        y.std(),
        X.std(0),
        dims="variables",
        # NOTE: global shrinkage
        r2=0.8,
        # NOTE: if you are unsure about r2
        r2_std=0.2,
        # NOTE: if you know where a variable should go
        # if you do not know, leave as 0.5
        positive_probs=[0.8, 0.5, 0.1],
        # NOTE: try both
        centered=True
    )
    intercept = y.mean()
    obs = pm.Normal("obs", intercept + X @ beta, eps, observed=y)
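Under equal importance, each of the k variables is expected to explain an equal share of the explained variance. A small numpy sketch of that arithmetic, using the example's r2 = 0.8 and an assumed output_sigma (illustrative only):

```python
import numpy as np

output_sigma = 1.0  # assumed for illustration
r2 = 0.8
k = 3  # number of variables: "a", "b", "c"

# equal importance: each variable gets a 1/k share of the explained variance
shares = np.full(k, 1.0 / k)
per_variable_var = r2 * output_sigma**2 / k

assert np.isclose(shares.sum(), 1.0)
```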

Notes

To cite the R2D2M2CP implementation, you can use the following BibTeX entry:

@misc{pymc-experimental-r2d2m2cp,
    title = {pymc-devs/pymc-experimental: {P}ull {R}equest 137, {R2D2M2CP}},
    url = {https://github.com/pymc-devs/pymc-experimental/pull/137},
    author = {Max Kochurov},
    howpublished = {GitHub},
    year = {2023}
}