TODO: incorporate the useful bits of this page into the learning section

Usage Overview#

For a detailed overview of building models in PyMC, please read the appropriate sections in the rest of the documentation. For a flavor of what PyMC models look like, here is a quick example.

First, let’s import PyMC and ArviZ (which handles plotting and diagnostics):

import arviz as az
import numpy as np
import pymc as pm

Models are defined using a context manager (with statement). The model is specified declaratively inside the context manager, instantiating model variables and transforming them as necessary. Here is an example of a model for a bioassay experiment:

# Set style
az.style.use("arviz-darkgrid")

# Data
n = np.ones(4)*5
y = np.array([0, 1, 3, 5])
dose = np.array([-.86,-.3,-.05,.73])

with pm.Model() as bioassay_model:

    # Prior distributions for latent variables
    alpha = pm.Normal('alpha', 0, sigma=10)
    beta = pm.Normal('beta', 0, sigma=1)

    # Linear combination of parameters
    theta = pm.invlogit(alpha + beta * dose)

    # Model likelihood
    deaths = pm.Binomial('deaths', n=n, p=theta, observed=y)

Save this file, then from a python shell (or another file in the same directory), call:

with bioassay_model:

    # Draw samples
    idata = pm.sample(1000, tune=2000, cores=2)
    # Plot two parameters
    az.plot_forest(idata, var_names=['alpha', 'beta'], r_hat=True)

This example will generate 1000 posterior samples on each of two cores using the NUTS algorithm, preceded by 2000 tuning samples (these are good default numbers for most models).

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [beta, alpha]
|██████████████████████████████████████| 100.00% [6000/6000 00:04<00:00 Sampling 2 chains, 0 divergences]

The sample is returned as arrays inside a MultiTrace object, which is then passed to the plotting function. The resulting graph shows a forest plot of the random variables in the model, along with a convergence diagnostic (R-hat) that indicates our model has converged.

learn/images/forestplot.png

See also#