pymc.smc.sample_smc#
- pymc.smc.sample_smc(draws=2000, kernel=<class 'pymc.smc.smc.IMH'>, *, start=None, model=None, random_seed=None, chains=None, cores=None, compute_convergence_checks=True, return_inferencedata=True, idata_kwargs=None, progressbar=True, **kernel_kwargs)[source]#
Sequential Monte Carlo based sampling.
- Parameters
- draws: int
The number of samples to draw from the posterior (i.e. last stage). And also the number of independent chains. Defaults to 2000.
- kernel: SMC Kernel used. Defaults to pm.smc.IMH (Independent Metropolis Hastings)
- start: dict, or array of dict
Starting point in parameter space. It should be a list of dict with length chains. When None (default) the starting point is sampled from the prior distribution.
- model: Model (optional if in ``with`` context)).
- random_seed: int
random seed
- chains
int
The number of chains to sample. Running independent chains is important for some convergence statistics. If
None
(default), then set to eithercores
or 2, whichever is larger.- cores
int
The number of chains to run in parallel. If
None
, set to the number of CPUs in the system.- compute_convergence_checksbool
Whether to compute sampler statistics like
R hat
andeffective_n
. Defaults toTrue
.- return_inferencedatabool, default=True
Whether to return the trace as an
arviz.InferenceData
(True) object or a MultiTrace (False) Defaults toTrue
.- idata_kwargs
dict
, optional Keyword arguments for
pymc.to_inference_data()
- progressbarbool, optional default=True
Whether or not to display a progress bar in the command line.
- **kernel_kwargs: keyword arguments passed to the SMC kernel.
- The default IMH kernel takes the following keywords:
- threshold: float
Determines the change of beta from stage to stage, i.e. indirectly the number of stages, the higher the value of threshold the higher the number of stages. Defaults to 0.5. It should be between 0 and 1.
- correlation_threshold: float
The lower the value the higher the number of MCMC steps computed automatically. Defaults to 0.01. It should be between 0 and 1.
Keyword arguments for other kernels should be checked in the respective docstrings
Notes
SMC works by moving through successive stages. At each stage the inverse temperature \(\beta\) is increased a little bit (starting from 0 up to 1). When \(\beta\) = 0 we have the prior distribution and when \(\beta\) =1 we have the posterior distribution. So in more general terms we are always computing samples from a tempered posterior that we can write as:
\[p(\theta \mid y)_{\beta} = p(y \mid \theta)^{\beta} p(\theta)\]A summary of the algorithm is:
Initialize \(\beta\) at zero and stage at zero.
Generate N samples \(S_{\beta}\) from the prior (because when :math beta = 0 the tempered posterior is the prior).
Increase \(\beta\) in order to make the effective sample size equals some predefined value (we use \(Nt\), where \(t\) is 0.5 by default).
Compute a set of N importance weights W. The weights are computed as the ratio of the likelihoods of a sample at stage i+1 and stage i.
Obtain \(S_{w}\) by re-sampling according to W.
Use W to compute the mean and covariance for the proposal distribution, a MVNormal.
Run N independent MCMC chains, starting each one from a different sample in \(S_{w}\). For the IMH kernel, the mean of the proposal distribution is the mean of the previous posterior stage and not the current point in parameter space.
The N chains are run until the autocorrelation with the samples from the previous stage stops decreasing given a certain threshold.
Repeat from step 3 until \(\beta \ge 1\).
The final result is a collection of N samples from the posterior.
References
- Minson2013
Minson, S. E. and Simons, M. and Beck, J. L., (2013), Bayesian inversion for finite fault earthquake source models I- Theory and algorithm. Geophysical Journal International, 2013, 194(3), pp.1701-1726, link
- Ching2007
Ching, J. and Chen, Y. (2007). Transitional Markov Chain Monte Carlo Method for Bayesian Model Updating, Model Class Selection, and Model Averaging. J. Eng. Mech., 10.1061/(ASCE)0733-9399(2007)133:7(816), 816-832. link