pymc.smc.sample_smc#

pymc.smc.sample_smc(draws=2000, kernel=<class 'pymc.smc.smc.IMH'>, *, start=None, model=None, random_seed=None, chains=None, cores=None, compute_convergence_checks=True, return_inferencedata=True, idata_kwargs=None, progressbar=True, **kernel_kwargs)[source]#

Sequential Monte Carlo based sampling.

Parameters
draws: int

The number of samples to draw from the posterior (i.e. last stage). And also the number of independent chains. Defaults to 2000.

kernel: SMC Kernel used. Defaults to pm.smc.IMH (Independent Metropolis Hastings)
start: dict, or array of dict

Starting point in parameter space. It should be a list of dict with length chains. When None (default) the starting point is sampled from the prior distribution.

model: Model (optional if in ``with`` context)).
random_seed: int

random seed

chainsint

The number of chains to sample. Running independent chains is important for some convergence statistics. If None (default), then set to either cores or 2, whichever is larger.

coresint

The number of chains to run in parallel. If None, set to the number of CPUs in the system.

compute_convergence_checksbool

Whether to compute sampler statistics like Gelman-Rubin and effective_n. Defaults to True.

return_inferencedatabool, default=True

Whether to return the trace as an arviz.InferenceData (True) object or a MultiTrace (False) Defaults to True.

idata_kwargsdict, optional

Keyword arguments for pymc.to_inference_data()

progressbarbool, optional default=True

Whether or not to display a progress bar in the command line.

**kernel_kwargs: keyword arguments passed to the SMC kernel.
The default IMH kernel takes the following keywords:
threshold: float

Determines the change of beta from stage to stage, i.e. indirectly the number of stages, the higher the value of threshold the higher the number of stages. Defaults to 0.5. It should be between 0 and 1.

n_steps: int

The number of steps of each Markov Chain. If tune_steps == True n_steps will be used for the first stage and for the others it will be determined automatically based on the acceptance rate and p_acc_rate, the max number of steps is n_steps.

tune_steps: bool

Whether to compute the number of steps automatically or not. Defaults to True

p_acc_rate: float

Used to compute n_steps when tune_steps == True. The higher the value of p_acc_rate the higher the number of steps computed automatically. Defaults to 0.85. It should be between 0 and 1.

Keyword arguments for other kernels should be checked in the respective docstrings

Notes

SMC works by moving through successive stages. At each stage the inverse temperature \(\beta\) is increased a little bit (starting from 0 up to 1). When \(\beta\) = 0 we have the prior distribution and when \(\beta\) =1 we have the posterior distribution. So in more general terms we are always computing samples from a tempered posterior that we can write as:

\[p(\theta \mid y)_{\beta} = p(y \mid \theta)^{\beta} p(\theta)\]

A summary of the algorithm is:

  1. Initialize \(\beta\) at zero and stage at zero.

  2. Generate N samples \(S_{\beta}\) from the prior (because when :math beta = 0 the tempered posterior is the prior).

  3. Increase \(\beta\) in order to make the effective sample size equals some predefined value (we use \(Nt\), where \(t\) is 0.5 by default).

  4. Compute a set of N importance weights W. The weights are computed as the ratio of the likelihoods of a sample at stage i+1 and stage i.

  5. Obtain \(S_{w}\) by re-sampling according to W.

  6. Use W to compute the mean and covariance for the proposal distribution, a MVNormal.

  7. For stages other than 0 use the acceptance rate from the previous stage to estimate n_steps.

  8. Run N independent Metropolis-Hastings (IMH) chains (each one of length n_steps), starting each one from a different sample in \(S_{w}\). Samples are IMH as the proposal mean is the of the previous posterior stage and not the current point in parameter space.

  9. Repeat from step 3 until \(\beta \ge 1\).

  10. The final result is a collection of N samples from the posterior.

References

Minson2013

Minson, S. E. and Simons, M. and Beck, J. L., (2013), Bayesian inversion for finite fault earthquake source models I- Theory and algorithm. Geophysical Journal International, 2013, 194(3), pp.1701-1726, link

Ching2007

Ching, J. and Chen, Y. (2007). Transitional Markov Chain Monte Carlo Method for Bayesian Model Updating, Model Class Selection, and Model Averaging. J. Eng. Mech., 10.1061/(ASCE)0733-9399(2007)133:7(816), 816-832. link