Pathfinder Variational Inference#

Pathfinder [Zhang et al., 2021] is a variational inference algorithm that produces samples from the posterior of a Bayesian model. It compares favorably to the widely used ADVI algorithm. On large problems, it should scale better than most MCMC algorithms, including dynamic HMC (i.e., NUTS), at the cost of a more biased estimate of the posterior. For details on the algorithm, see the arXiv preprint.
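The core idea can be illustrated with a toy 1D sketch (this is NOT the pymc-extras implementation, just an illustration under simplifying assumptions): run a quasi-Newton optimizer on the negative log posterior, form a Gaussian approximation at each iterate along the optimization path, and keep the approximation that maximizes the ELBO. Here the hypothetical target is Normal(3, 2), so the curvature is constant and the ELBO has a closed form:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical 1D "posterior": Normal(mu=3, sigma=2), up to a constant.
def neg_logp(x):
    return 0.5 * ((x - 3.0) / 2.0) ** 2

def grad(x):
    return (x - 3.0) / 4.0

# Collect the L-BFGS optimization trajectory via a callback.
path = []
minimize(
    neg_logp,
    x0=np.array([-5.0]),
    jac=grad,
    method="L-BFGS-B",
    callback=lambda xk: path.append(float(xk[0])),
)

# At each iterate mu_t, approximate the posterior by N(mu_t, s2), where s2 is
# the inverse curvature of the target (d^2/dx^2 neg_logp = 1/4, so s2 = 4).
s2 = 4.0

def elbo(mu):
    # E_q[log p] plus the entropy of q, both in closed form for Gaussians.
    e_logp = -0.5 * ((mu - 3.0) ** 2 + s2) / 4.0
    entropy = 0.5 * np.log(2 * np.pi * np.e * s2)
    return e_logp + entropy

# Select the point on the path whose Gaussian approximation maximizes the ELBO.
best = max(path, key=elbo)
print(best)  # an iterate near the posterior mean, 3.0
```

The real algorithm works in many dimensions, builds the covariance from the L-BFGS inverse-Hessian estimate, estimates the ELBO by Monte Carlo, and runs several paths in parallel; the selection-along-a-trajectory idea is the same.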

Pathfinder is now implemented natively in PyMC using PyTensor. The implementation lives in pymc-extras, which can be installed via:

pip install git+https://github.com/pymc-devs/pymc-extras

import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import pymc as pm
import pymc_extras as pmx

print(f"Running on PyMC v{pm.__version__}")
Running on PyMC v5.20.1

First, define your PyMC model. Here, we use the 8-schools model.

# Data of the Eight Schools Model
J = 8
y = np.array([28.0, 8.0, -3.0, 7.0, -1.0, 1.0, 18.0, 12.0])
sigma = np.array([15.0, 10.0, 16.0, 11.0, 9.0, 11.0, 10.0, 18.0])

with pm.Model() as model:
    mu = pm.Normal("mu", mu=0.0, sigma=10.0)
    tau = pm.HalfCauchy("tau", 5.0)

    z = pm.Normal("z", mu=0, sigma=1, shape=J)
    theta = pm.Deterministic("theta", mu + tau * z)
    obs = pm.Normal("obs", mu=theta, sigma=sigma, shape=J, observed=y)

Next, we draw a reference posterior with NUTS via pm.sample(), then call pmx.fit(), passing method="pathfinder" to select the algorithm.

rng = np.random.default_rng(123)
with model:
    idata_ref = pm.sample(target_accept=0.9, random_seed=rng)
    idata_path = pmx.fit(
        method="pathfinder",
        jitter=12,
        num_draws=1000,
        random_seed=123,
    )
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [mu, tau, z]

Sampling 4 chains for 1_000 tune and 1_000 draw iterations (4_000 + 4_000 draws total) took 1 seconds.

Pathfinder Results                          
                                            
  No. model parameters     10               
                                            
  Configuration:                            
  num_draws_per_path       1000             
  history size (maxcor)    7                
  max iterations           1000             
  ftol                     1.00e-05         
  gtol                     1.00e-08         
  max line search          1000             
  jitter                   12               
  epsilon                  1.00e-08         
  ELBO draws               10               
                                            
  LBFGS Status:                             
  CONVERGED                4                
  L-BFGS iterations        mean 22 ± std 6  
                                            
  Path Status:                              
  SUCCESS                  4                
  ELBO argmax              mean 8 ± std 9   
                                            
  Importance Sampling:                      
  Method                   psis             
  Pareto k                 0.75             
                                            
  Timing (seconds):                         
  Compile                  4.53             
  Compute                  0.09             
  Total                    4.62             

Just like pymc.sample(), this returns an idata with samples from the posterior. Note that because these samples do not come from MCMC chains, convergence cannot be assessed with the usual diagnostics such as R-hat. Instead, look at the Pareto k diagnostic reported in the results table: values above roughly 0.7 indicate that the importance-sampling step is unreliable and the approximation should be treated with caution.
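The final draws are produced by importance resampling: draws from the Gaussian approximation are reweighted by the ratio of the target density to the approximation density (pymc-extras additionally applies Pareto smoothing, i.e. PSIS, which is where the Pareto k value above comes from). A minimal, unsmoothed sketch in NumPy, using a hypothetical 1D target p = N(0, 1) and a deliberately mismatched approximation q = N(0.5, 1.5):

```python
import numpy as np

rng = np.random.default_rng(0)

def logp(x):
    # Target log density, N(0, 1), up to a constant.
    return -0.5 * x**2

def logq(x):
    # Approximation log density, N(0.5, 1.5), up to the same constant.
    return -0.5 * ((x - 0.5) / 1.5) ** 2 - np.log(1.5)

draws = rng.normal(0.5, 1.5, size=20_000)   # sample from the approximation q
logw = logp(draws) - logq(draws)            # log importance weights
w = np.exp(logw - logw.max())               # stabilize before exponentiating
w /= w.sum()

# Resample with probability proportional to the normalized weights.
resampled = rng.choice(draws, size=1_000, replace=True, p=w)
print(resampled.mean())  # close to the target mean, 0
```

Pareto smoothing replaces the largest raw weights with values from a fitted generalized Pareto tail, which tames the variance of this estimator; the tail's shape parameter k is the diagnostic reported above.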

az.plot_forest(
    [idata_ref, idata_path],
    var_names=["~z"],
    model_names=["ref", "path"],
    combined=True,
);
[Figure: forest plot of mu, tau, and theta, comparing the NUTS reference posterior ("ref") with the Pathfinder posterior ("path")]

References#

[1] Lu Zhang, Bob Carpenter, Andrew Gelman, and Aki Vehtari. Pathfinder: parallel quasi-Newton variational inference. arXiv preprint arXiv:2108.03782, 2021.

Authors#

  • Authored by Thomas Wiecki on Oct 11 2022 (pymc-examples#429)

  • Re-execute notebook by Reshama Shaikh on Feb 5, 2023

  • Bug fix by Chris Fonnesbeck on Jul 17, 2024

  • Updated to PyMC implementation by Michael Cao on Feb 13, 2025

  • Updated text by Chris Fonnesbeck on Feb 19, 2025

Watermark#

%load_ext watermark
%watermark -n -u -v -iv -w -p xarray
Last updated: Wed Feb 19 2025

Python implementation: CPython
Python version       : 3.12.9
IPython version      : 8.32.0

xarray: 2025.1.2

arviz      : 0.19.0
numpy      : 1.26.4
matplotlib : 3.10.0
pymc_extras: 0.2.3
pymc       : 5.20.1

Watermark: 2.5.0

License notice#

All the notebooks in this example gallery are provided under the MIT License which allows modification, and redistribution for any use provided the copyright and license notices are preserved.

Citing PyMC examples#

To cite this notebook, use the DOI provided by Zenodo for the pymc-examples repository.

Important

Many notebooks are adapted from other sources: blogs, books… In such cases you should cite the original source as well.

Also remember to cite the relevant libraries used by your code.

Here is a citation template in BibTeX:

@incollection{citekey,
  author    = "<notebook authors, see above>",
  title     = "<notebook title>",
  editor    = "PyMC Team",
  booktitle = "PyMC examples",
  doi       = "10.5281/zenodo.5654871"
}

which once rendered could look like:

Thomas Wiecki. "Pathfinder Variational Inference". In: PyMC Examples. Ed. by PyMC Team. DOI: 10.5281/zenodo.5654871