Simulator

class pymc.Simulator(name, *args, **kwargs)

Simulator distribution, used for Approximate Bayesian Computation (ABC) with Sequential Monte Carlo (SMC) sampling via sample_smc().

Simulator distributions have a stochastic pseudo-loglikelihood defined by a distance metric between the observed and simulated data, and scaled by the hyperparameter epsilon.

Parameters:

fn : callable

Python random simulator function. Should expect the following signature (rng, arg1, arg2, ... argn, size), where rng is a numpy.random.Generator and size defines the size of the desired sample.
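As a rough illustration of that signature, the sketch below defines a simulator and calls it directly with a NumPy generator, outside of any model. The name normal_simulator, the seed, and the parameter values are hypothetical choices made for this example only.

```python
import numpy as np


def normal_simulator(rng, loc, scale, size):
    """Draw `size` samples from a normal distribution using the supplied generator."""
    return rng.normal(loc, scale, size=size)


# Call the simulator by hand: generator first, parameters next, size last.
rng = np.random.default_rng(123)
draws = normal_simulator(rng, 0.0, 1.0, size=5)
print(draws.shape)
```

Calling the function by hand like this is a cheap way to check that it accepts the expected arguments and returns an array of the requested size before handing it to the Simulator.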

*unnamed_params : list of TensorVariable

Parameters used by the Simulator random function. Each parameter can be passed positionally after fn, for example param1, param2, ..., paramN. The parameters can also be passed with the keyword argument "params".

params : list of TensorVariable

Keyword form of "unnamed_params". One of unnamed_params or params must be provided. If both unnamed_params and params are passed, an error is raised.

distance : PyTensor Op, callable or str, default "gaussian"

Distance function. Available options are "gaussian", "laplace", "kullback_leibler" or a user-defined function (or PyTensor Op) that takes epsilon, the summary statistics of observed_data and the summary statistics of simulated_data as input.

gaussian: \(-0.5 \left(\left(\frac{xo - xs}{\epsilon}\right)^2\right)\)

laplace: \(-{\left(\frac{|xo - xs|}{\epsilon}\right)}\)

kullback_leibler: \(\frac{d}{n} \frac{1}{\epsilon} \sum_i^n -\log \left( \frac{{\nu_d}_i}{{\rho_d}_i} \right) + \log_r\) [1]

distance="gaussian" + sum_stat="sort" is equivalent to the 1D 2-Wasserstein distance

distance="laplace" + sum_stat="sort" is equivalent to the 1D 1-Wasserstein distance
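The following NumPy sketch illustrates the gaussian and laplace formulas and the equivalence with the Wasserstein distances for two equal-size 1D samples. It is written from the formulas above and is not the library's implementation; the function names, seed, and sample values are assumptions made for the example. The brute-force search over pairings stands in for the optimal transport problem and is only practical for very small samples.

```python
from itertools import permutations

import numpy as np


def gaussian(epsilon, obs, sim):
    """Elementwise Gaussian pseudo-loglikelihood: -0.5 * ((obs - sim) / epsilon) ** 2."""
    return -0.5 * ((obs - sim) / epsilon) ** 2


def laplace(epsilon, obs, sim):
    """Elementwise Laplace pseudo-loglikelihood: -|obs - sim| / epsilon."""
    return -np.abs(obs - sim) / epsilon


rng = np.random.default_rng(0)
obs = rng.normal(0.0, 1.0, size=5)
sim = rng.normal(0.5, 1.0, size=5)

# Costs computed on sorted samples, i.e. with a "sort" summary statistic.
sorted_l1 = -laplace(1.0, np.sort(obs), np.sort(sim)).sum()
sorted_l2 = -2.0 * gaussian(1.0, np.sort(obs), np.sort(sim)).sum()

# Brute-force transport cost: the cheapest one-to-one pairing of the two samples.
best_l1 = min(np.abs(obs - sim[list(p)]).sum() for p in permutations(range(5)))
best_l2 = min(((obs - sim[list(p)]) ** 2).sum() for p in permutations(range(5)))

print(np.isclose(sorted_l1, best_l1), np.isclose(sorted_l2, best_l2))
```

With n points per sample, sorted_l1 equals n times the 1-Wasserstein distance and sorted_l2 equals n times the squared 2-Wasserstein distance, so the summed pseudo-loglikelihoods match the Wasserstein distances up to these scale factors and the epsilon scaling.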

sum_stat : PyTensor Op, callable or str, default "identity"

Summary statistic function. Available options are "identity", "sort", "mean", "median". If a callable (or PyTensor Op) is provided, it should return a 1D NumPy array (or PyTensor vector).

epsilon : tensor_like of float, default 1.0

Scaling parameter for the distance functions. It should be a float or an array of the same size as the output of sum_stat.
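To make the two accepted forms of epsilon concrete, the sketch below applies the gaussian formula to a pair of summary statistics, once with a single scale and once with one scale per statistic. The statistics and scale values are hypothetical, and the function is a plain NumPy restatement of the formula rather than the library's code.

```python
import numpy as np


def gaussian(epsilon, obs, sim):
    """Elementwise Gaussian pseudo-loglikelihood: -0.5 * ((obs - sim) / epsilon) ** 2."""
    return -0.5 * ((obs - sim) / epsilon) ** 2


# Hypothetical summary statistics on different scales.
obs_stats = np.array([0.0, 10.0])
sim_stats = np.array([1.0, 12.0])

# A float applies the same scale to every statistic.
scalar_eps = gaussian(1.0, obs_stats, sim_stats)

# An array with one entry per statistic scales each one separately.
vector_eps = gaussian(np.array([1.0, 2.0]), obs_stats, sim_stats)

print(scalar_eps, vector_eps)
```

A per-statistic epsilon can keep a statistic with a larger natural scale from dominating the pseudo-loglikelihood.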

ndim_supp : int, default 0

Number of dimensions of the SimulatorRV (0 for scalar, 1 for vector, etc.)

ndims_params : list of int, optional

Minimum number of dimensions of each parameter of the RV. For example, if the Simulator accepts two scalar inputs, it should be [0, 0]. Defaults to a list of zeros with length equal to the number of parameters.

class_name : str, optional

Suffix name for the RandomVariable class which will wrap the Simulator methods.

References

[1]

Pérez-Cruz, F. (2008, July). Kullback-Leibler divergence estimation of continuous distributions. In 2008 IEEE International Symposium on Information Theory (pp. 1666-1670). IEEE.

Examples

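import numpy as np
import pymc as pm

# Synthetic observations for illustration only; replace with real data.
data = np.random.default_rng(0).normal(loc=0.0, scale=1.0, size=100)

def simulator_fn(rng, loc, scale, size):
    # Draw `size` samples from a normal with the proposed loc and scale.
    return rng.normal(loc, scale, size=size)

with pm.Model() as m:
    loc = pm.Normal("loc", 0, 1)
    scale = pm.HalfNormal("scale", 1)
    simulator = pm.Simulator("simulator", simulator_fn, loc, scale, observed=data)
    idata = pm.sample_smc()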

Methods

Simulator.dist(fn, *unnamed_params[, ...])

Creates a tensor variable corresponding to the cls distribution.