# Posts in intermediate

## Automatic marginalization of discrete variables

PyMC is very amendable to sampling models with discrete latent variables. But if you insist on using the NUTS sampler exclusively, you will need to get rid of your discrete variables somehow. The best way to do this is by marginalizing them out, as then you benefit from Rao-Blackwell’s theorem and get a lower variance estimate of your parameters.

## Bayesian copula estimation: Describing correlated joint distributions

When we deal with multiple variables (e.g. $$a$$ and $$b$$) we often want to describe the joint distribution $$P(a, b)$$ parametrically. If we are lucky, then this joint distribution might be ‘simple’ in some way. For example, it could be that $$a$$ and $$b$$ are statistically independent, in which case we can break down the joint distribution into $$P(a, b) = P(a) P(b)$$ and so we just need to find appropriate parametric descriptions for $$P(a)$$ and $$P(b)$$. Even if this is not appropriate, it may be that $$P(a, b)$$ could be described well by a simple multivariate distribution, such as a multivariate normal distribution for example.

## Frailty and Survival Regression Models

This notebook uses libraries that are not PyMC dependencies and therefore need to be installed specifically to run this notebook. Open the dropdown below for extra guidance.

## The Besag-York-Mollie Model for Spatial Data

This notebook uses libraries that are not PyMC dependencies and therefore need to be installed specifically to run this notebook. Open the dropdown below for extra guidance.

## Faster Sampling with JAX and Numba

PyMC can compile its models to various execution backends through PyTensor, including:

## Gaussian Processes: Latent Variable Implementation

The gp.Latent class is a direct implementation of a Gaussian process without approximation. Given a mean and covariance function, we can place a prior on the function $$f(x)$$,

## Marginal Likelihood Implementation

The gp.Marginal class implements the more common case of GP regression: the observed data are the sum of a GP and Gaussian noise. gp.Marginal has a marginal_likelihood method, a conditional method, and a predict method. Given a mean and covariance function, the function $$f(x)$$ is modeled as,

## Rolling Regression

Pairs trading is a famous technique in algorithmic trading that plays two stocks against each other.

## Hierarchical Partial Pooling

Suppose you are tasked with estimating baseball batting skills for several players. One such performance metric is batting average. Since players play a different number of games and bat in different positions in the order, each player has a different number of at-bats. However, you want to estimate the skill of all players, including those with a relatively small number of batting opportunities.

## Reliability Statistics and Predictive Calibration

Duplicate implicit target name: “reliability statistics and predictive calibration”.

## Quantile Regression with BART

Usually when doing regression we model the conditional mean of some distribution. Common cases are a Normal distribution for continuous unbounded responses, a Poisson distribution for count data, etc.

## DEMetropolis(Z) Sampler Tuning

For continuous variables, the default PyMC sampler (NUTS) requires that gradients are computed, which PyMC does through autodifferentiation. However, in some cases, a PyMC model may not be supplied with gradients (for example, by evaluating a numerical model outside of PyMC) and an alternative sampler is necessary. The DEMetropolisZ sampler is an efficient choice for gradient-free inference. The implementation of DEMetropolisZ in PyMC is based on ter Braak and Vrugt [2008] but with a modified tuning scheme. This notebook compares various tuning parameter settings for the sampler, including the drop_tune_fraction parameter which was introduced in PyMC.

## DEMetropolis and DEMetropolis(Z) Algorithm Comparisons

For continuous variables, the default PyMC sampler (NUTS) requires that gradients are computed, which PyMC does through autodifferentiation. However, in some cases, a PyMC model may not be supplied with gradients (for example, by evaluating a numerical model outside of PyMC) and an alternative sampler is necessary. Differential evolution (DE) Metropolis samplers are an efficient choice for gradient-free inference. This notebook compares the DEMetropolis and the DEMetropolisZ samplers in PyMC to help determine which is a better option for a given problem.

## Reparameterizing the Weibull Accelerated Failure Time Model

The previous example notebook on Bayesian parametric survival analysis introduced two different accelerated failure time (AFT) models: Weibull and log-linear. In this notebook, we present three different parameterizations of the Weibull AFT model.

## Bayesian Survival Analysis

Survival analysis studies the distribution of the time to an event. Its applications span many fields across medicine, biology, engineering, and social science. This tutorial shows how to fit and analyze a Bayesian survival model in Python using PyMC.

## ODE Lotka-Volterra With Bayesian Inference in Multiple Ways

The purpose of this notebook is to demonstrate how to perform Bayesian inference on a system of ordinary differential equations (ODEs), both with and without gradients. The accuracy and efficiency of different samplers are compared.

## Introduction to Variational Inference with PyMC

The most common strategy for computing posterior quantities of Bayesian models is via sampling, particularly Markov chain Monte Carlo (MCMC) algorithms. While sampling algorithms and associated computing have continually improved in performance and efficiency, MCMC methods still scale poorly with data size, and become prohibitive for more than a few thousand observations. A more scalable alternative to sampling is variational inference (VI), which re-frames the problem of computing the posterior distribution as an optimization problem.

## Hierarchical Binomial Model: Rat Tumor Example

This short tutorial demonstrates how to use PyMC to do inference for the rat tumour example found in chapter 5 of Bayesian Data Analysis 3rd Edition . Readers should already be familiar with the PyMC API.

## Analysis of An AR(1) Model in PyMC

Consider the following AR(2) process, initialized in the infinite past: $$$y_t = \rho_0 + \rho_1 y_{t-1} + \rho_2 y_{t-2} + \epsilon_t,$$$$where$$\epsilon_t \overset{iid}{\sim} {\cal N}(0,1)$$. Suppose you'd like to learn about$$\rho$$from a a sample of observations$$Y^T = { y_0, y_1,\ldots, y_T }$.

## GLM: Poisson Regression

This is a minimal reproducible example of Poisson regression to predict counts using dummy data.

## Bayesian Vector Autoregressive Models

Duplicate implicit target name: “bayesian vector autoregressive models”.

## Multi-output Gaussian Processes: Coregionalization models using Hamadard product

This notebook shows how to implement the Intrinsic Coregionalization Model (ICM) and the Linear Coregionalization Model (LCM) using a Hamadard product between the Coregion kernel and input kernels. Multi-output Gaussian Process is discussed in this paper by Bonilla et al. [2007]. For further information about ICM and LCM, please check out the talk on Multi-output Gaussian Processes by Mauricio Alvarez, and his slides with more references at the last page.

## Kronecker Structured Covariances

PyMC contains implementations for models that have Kronecker structured covariances. This patterned structure enables Gaussian process models to work on much larger datasets. Kronecker structure can be exploited when

## Interrupted time series analysis

This notebook focuses on how to conduct a simple Bayesian interrupted time series analysis. This is useful in quasi-experimental settings where an intervention was applied to all treatment units.

## A Primer on Bayesian Methods for Multilevel Modeling

Hierarchical or multilevel modeling is a generalization of regression modeling.

## Forecasting with Structural AR Timeseries

Bayesian structural timeseries models are an interesting way to learn about the structure inherent in any observed timeseries data. It also gives us the ability to project forward the implied predictive distribution granting us another view on forecasting problems. We can treat the learned characteristics of the timeseries data observed to-date as informative about the structure of the unrealised future state of the same measure.

## Difference in differences

This notebook provides a brief overview of the difference in differences approach to causal inference, and shows a working example of how to conduct this type of analysis under the Bayesian framework, using PyMC. While the notebooks provides a high level overview of the approach, I recommend consulting two excellent textbooks on causal inference. Both The Effect and Causal Inference: The Mixtape have chapters devoted to difference in differences.

## Model Averaging

When confronted with more than one model we have several options. One of them is to perform model selection, using for example a given Information Criterion as exemplified the PyMC examples Model comparison and the GLM: Model Selection. Model selection is appealing for its simplicity, but we are discarding information about the uncertainty in our models. This is somehow similar to computing the full posterior and then just keep a point-estimate like the posterior mean; we may become overconfident of what we really know. You can also browse the blog/tag/model-comparison tag to find related posts.

## Counterfactual inference: calculating excess deaths due to COVID-19

Causal reasoning and counterfactual thinking are really interesting but complex topics! Nevertheless, we can make headway into understanding the ideas through relatively simple examples. This notebook focuses on the concepts and the practical implementation of Bayesian causal reasoning using PyMC.

## Probabilistic Matrix Factorization for Making Personalized Recommendations

So you are browsing for something to watch on Netflix and just not liking the suggestions. You just know you can do better. All you need to do is collect some ratings data from yourself and friends and build a recommendation algorithm. This notebook will guide you in doing just that!

## Modeling spatial point patterns with a marked log-Gaussian Cox process

The log-Gaussian Cox process (LGCP) is a probabilistic model of point patterns typically observed in space or time. It has two main components. First, an underlying intensity field $$\lambda(s)$$ of positive real values is modeled over the entire domain $$X$$ using an exponentially-transformed Gaussian process which constrains $$\lambda$$ to be positive. Then, this intensity field is used to parameterize a Poisson point process which represents a stochastic mechanism for placing points in space. Some phenomena amenable to this representation include the incidence of cancer cases across a county, or the spatiotemporal locations of crime events in a city. Both spatial and temporal dimensions can be handled equivalently within this framework, though this tutorial only addresses data in two spatial dimensions.

## Variational Inference: Bayesian Neural Networks

Probabilistic Programming, Deep Learning and “Big Data” are among the biggest topics in machine learning. Inside of PP, a lot of innovation is focused on making things scale using Variational Inference. In this example, I will show how to use Variational Inference in PyMC to fit a simple Bayesian Neural Network. I will also discuss how bridging Probabilistic Programming and Deep Learning can open up very interesting avenues to explore in future research.

## Censored Data Models

This example notebook on Bayesian survival analysis touches on the point of censored data. Censoring is a form of missing-data problem, in which observations greater than a certain threshold are clipped down to that threshold, or observations less than a certain threshold are clipped up to that threshold, or both. These are called right, left and interval censoring, respectively. In this example notebook we consider interval censoring.

## Gaussian Process for CO2 at Mauna Loa

This Gaussian Process (GP) example shows how to:

## Air passengers - Prophet-like model

We’re going to look at the “air passengers” dataset, which tracks the monthly totals of a US airline passengers from 1949 to 1960. We could fit this using the Prophet model (indeed, this dataset is one of the examples they provide in their documentation), but instead we’ll make our own Prophet-like model in PyMC3. This will make it a lot easier to inspect the model’s components and to do prior predictive checks (an integral component of the Bayesian workflow ).

## NBA Foul Analysis with Item Response Theory

This tutorial shows an application of Bayesian Item Response Theory to NBA basketball foul calls data using PyMC. Based on Austin Rochford’s blogpost NBA Foul Calls and Bayesian Item Response Theory.

## Model building and expansion for golf putting

This uses and closely follows the case study from Andrew Gelman, written in Stan. There are some new visualizations and we steered away from using improper priors, but much credit to him and to the Stan group for the wonderful case study and software.

## Mean and Covariance Functions

A large set of mean and covariance functions are available in PyMC. It is relatively easy to define custom mean and covariance functions. Since PyMC uses PyTensor, their gradients do not need to be defined by the user.

## A Hierarchical model for Rugby prediction

In this example, we’re going to reproduce the first model described in Baio and Blangiardo [2010] using PyMC. Then show how to sample from the posterior predictive to simulate championship outcomes from the scored goals which are the modeled quantities.

## GLM: Model Selection

A fairly minimal reproducible example of Model Selection using WAIC, and LOO as currently implemented in PyMC3.

## Bayesian Additive Regression Trees: Introduction

Bayesian additive regression trees (BART) is a non-parametric regression approach. If we have some covariates $$X$$ and we want to use them to model $$Y$$, a BART model (omitting the priors) can be represented as:

## GLM: Robust Regression using Custom Likelihood for Outlier Classification

Using PyMC for Robust Regression with Outlier Detection using the Hogg 2010 Signal vs Noise method.

## Estimating parameters of a distribution from awkwardly binned data

Let us say that we are interested in inferring the properties of a population. This could be anything from the distribution of age, or income, or body mass index, or a whole range of different possible measures. In completing this task, we might often come across the situation where we have multiple datasets, each of which can inform our beliefs about the overall population.

## GLM: Mini-batch ADVI on hierarchical regression model

Unlike Gaussian mixture models, (hierarchical) regression models have independent variables. These variables affect the likelihood function, but are not random variables. When using mini-batch, we should take care of that.

## Marginalized Gaussian Mixture Model

• 18 September 2021

Gaussian mixtures are a flexible class of models for data that exhibits subpopulation heterogeneity. A toy example of such a data set is shown below.