# Posts in beginner

## Generalized Extreme Value Distribution

- 27 September 2022
- Category: beginner

The Generalized Extreme Value (GEV) distribution is a meta-distribution containing the Weibull, Gumbel, and Fréchet families of extreme value distributions. It is used for modelling the distribution of extremes (maxima or minima) of stationary processes, such as the annual maximum wind speed or the annual maximum truck weight on a bridge, without needing an *a priori* decision on the tail behaviour.
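
As a rough illustration of how a single shape parameter covers all three families, here is a minimal sketch of the GEV cumulative distribution function in plain Python. It assumes the common parametrization with location `mu`, scale `sigma`, and shape `xi` (names chosen here for illustration, not taken from the notebook), in which `xi = 0` gives the Gumbel family, `xi > 0` the Fréchet family, and `xi < 0` the Weibull family. Sign conventions for the shape parameter differ between libraries, so check the one you use.

```python
import math


def gev_cdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """CDF of the GEV distribution (assumed parametrization: xi is the shape)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    if abs(xi) < 1e-12:
        # Gumbel limit as xi -> 0
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0:
        # Outside the support: below the lower bound (xi > 0)
        # or above the upper bound (xi < 0)
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


# Same point, three tail behaviours
for xi in (-0.3, 0.0, 0.3):
    print(xi, round(gev_cdf(2.0, xi=xi), 4))
```

Fitting `xi` from the data, rather than fixing its sign in advance, is what removes the need to choose a family beforehand.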

## Bayesian regression with truncated or censored data

- 20 September 2022
- Category: beginner

The notebook provides an example of how to conduct linear regression when your outcome variable is either censored or truncated.
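
The difference between the two cases is easy to mix up, so here is a small sketch in plain Python. The bounds and the simulated data are hypothetical and only serve to show what each process does to a sample.

```python
import random

rng = random.Random(0)
lower, upper = -1.0, 1.0  # hypothetical measurement bounds

# Latent outcomes before any truncation or censoring
latent = [rng.gauss(0.0, 1.0) for _ in range(1000)]

# Truncation: observations outside the bounds are never recorded
truncated = [y for y in latent if lower <= y <= upper]

# Censoring: every observation is recorded, but out-of-bounds values
# are clipped to the nearest bound
censored = [min(max(y, lower), upper) for y in latent]

print(len(latent), len(truncated), len(censored))
```

Truncation shrinks the sample, while censoring keeps its size but piles observations up at the bounds. Either way, an ordinary regression on the resulting data would be biased.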

## How to debug a model

- 02 August 2022
- Category: beginner

There are various levels on which to debug a model. One of the simplest is to just print out the values that different variables are taking on.

## GLM: Negative Binomial Regression

- 20 June 2022
- Category: beginner

This notebook closely follows the GLM Poisson regression example by Jonathan Sedar (which is in turn inspired by a project by Ian Ozsvald), except that the data here is negative binomially distributed instead of Poisson distributed.

## Stochastic Volatility model

- 17 June 2022
- Category: beginner

Asset prices have time-varying volatility (variance of day-over-day `returns`). In some periods, returns are highly variable, while in others they are very stable. Stochastic volatility models capture this with a latent volatility variable that is itself modeled as a stochastic process. The following model is similar to the one described in the No-U-Turn Sampler paper [Hoffman and Gelman, 2014].

## Splines

- 04 June 2022
- Category: beginner

Often, the model we want to fit is not a perfect line between some \(x\) and \(y\).
Instead, the parameters of the model are expected to vary over \(x\).
There are multiple ways to handle this situation, one of which is to fit a *spline*.
A spline fit is effectively a sum of multiple individual curves (piecewise polynomials), each fit to a different section of \(x\), that are tied together at their boundaries, often called *knots*.
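
To make the "sum of individual curves" idea concrete, here is a minimal sketch of the simplest case, a piecewise-linear spline built from hat-shaped (degree-1 B-spline) basis functions. The knot positions and weights are illustrative values, not fitted ones, and the notebook itself may use a different basis and degree.

```python
def hat_basis(x, knots):
    """Evaluate degree-1 B-spline (hat) basis functions at x.

    Returns one value per knot; each basis function peaks at its own knot
    and falls linearly to zero at the neighbouring knots.
    """
    values = []
    for i, k in enumerate(knots):
        left = knots[i - 1] if i > 0 else None
        right = knots[i + 1] if i < len(knots) - 1 else None
        if x == k:
            v = 1.0
        elif left is not None and left < x < k:
            v = (x - left) / (k - left)
        elif right is not None and k < x < right:
            v = (right - x) / (right - k)
        else:
            v = 0.0
        values.append(v)
    return values


def spline(x, knots, weights):
    """Weighted sum of the basis functions: the curve evaluated at x."""
    return sum(w * b for w, b in zip(weights, hat_basis(x, knots)))


knots = [0.0, 1.0, 2.0, 3.0]
weights = [0.0, 2.0, 1.0, 3.0]  # illustrative values, not fitted
print(spline(1.5, knots, weights))
```

In a Bayesian model, the weights would be the unknown parameters, with priors placed on them and the basis matrix treated as fixed data.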

## Bayes Factors and Marginal Likelihood

- 01 June 2022
- Category: beginner, explanation

The “Bayesian way” to compare models is to compute the *marginal likelihood* of each model \(p(y \mid M_k)\), *i.e.* the probability of the observed data \(y\) given model \(M_k\). This quantity, the marginal likelihood, is just the normalizing constant of Bayes’ theorem. We can see this if we write Bayes’ theorem and make explicit the fact that all inferences are model-dependent.
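
Written with the model dependence made explicit, and using \(\theta\) for the parameters of model \(M_k\) (a symbol introduced here for illustration), Bayes’ theorem and its normalizing constant read:

\[
p(\theta \mid y, M_k) = \frac{p(y \mid \theta, M_k)\, p(\theta \mid M_k)}{p(y \mid M_k)},
\qquad
p(y \mid M_k) = \int p(y \mid \theta, M_k)\, p(\theta \mid M_k)\, d\theta .
\]

The denominator is the likelihood averaged over the prior, which is why the marginal likelihood can be sensitive to the choice of prior.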

## Sampler Statistics

- 31 May 2022
- Category: beginner

When checking for convergence or when debugging a badly behaving sampler, it is often helpful to take a closer look at what the sampler is doing. For this purpose, some samplers export statistics for each generated sample.

## General API quickstart

- 31 May 2022
- Category: beginner

Models in PyMC are centered around the `Model` class. It has references to all random variables (RVs) and computes the model logp and its gradients. Usually, you would instantiate it as part of a `with` context.

## Approximate Bayesian Computation

- 31 May 2022
- Category: beginner, explanation

Approximate Bayesian Computation methods (also called likelihood-free inference methods) are a group of techniques developed for inferring posterior distributions in cases where the likelihood function is intractable or costly to evaluate. This does not mean that the likelihood function is not part of the analysis; it just means that we are approximating the likelihood, hence the name of the ABC methods.
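
The core idea can be sketched with rejection ABC, the simplest member of this family and not necessarily the algorithm used in the notebook. The example below is a toy under stated assumptions: the "observed" data are simulated, the prior is uniform, the summary statistic is the sample mean, and the tolerance `epsilon` is an arbitrary choice.

```python
import random
import statistics

rng = random.Random(42)

# "Observed" data: simulated here for illustration, true mean is 2.0
observed = [rng.gauss(2.0, 1.0) for _ in range(50)]
obs_mean = statistics.fmean(observed)


def simulate(theta, n):
    """Simulator: draws data given theta without evaluating a likelihood."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


epsilon = 0.2  # tolerance on the distance between summary statistics
accepted = []
for _ in range(5000):
    theta = rng.uniform(-5.0, 5.0)  # draw a candidate from the prior
    sim_mean = statistics.fmean(simulate(theta, len(observed)))
    if abs(sim_mean - obs_mean) < epsilon:
        # Keep candidates whose simulated data resemble the observed data
        accepted.append(theta)

print(len(accepted), statistics.fmean(accepted))
```

The accepted values of `theta` approximate the posterior. A smaller `epsilon` gives a better approximation at the cost of fewer accepted draws.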

## Regression discontinuity design analysis

- 20 April 2022
- Category: beginner, explanation

Quasi-experiments involve experimental interventions and quantitative measures. However, quasi-experiments do *not* involve random assignment of units (e.g. cells, people, companies, schools, states) to test or control groups. This inability to conduct random assignment poses problems when making causal claims, as it makes it harder to argue that any difference between a control and a test group is due to the intervention and not to a confounding factor.

## Gaussian Mixture Model

- 20 April 2022
- Category: beginner

A mixture model allows us to make inferences about the component contributors to a distribution of data. More specifically, a Gaussian Mixture Model allows us to make inferences about the means and standard deviations of a specified number of underlying component Gaussian distributions.

## Bayesian moderation analysis

- 20 March 2022
- Category: beginner

This notebook covers Bayesian moderation analysis. This is appropriate when we believe that one predictor variable (the moderator) may influence the linear relationship between another predictor variable and an outcome. Here we look at an example of the relationship between hours of training and muscle mass, where age (the moderating variable) may affect this relationship.
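
One common way to express moderation is a linear model with an interaction term. The symbols below are introduced here for illustration: \(x\) is the predictor, \(m\) the moderator, \(y\) the outcome, and \(\varepsilon\) the noise term.

\[
y = \beta_0 + \beta_1 x + \beta_2 m + \beta_3 (x \cdot m) + \varepsilon
\]

Under this form the slope of \(y\) with respect to \(x\) is \(\beta_1 + \beta_3 m\), so a non-zero \(\beta_3\) means the moderator changes the strength of the relationship.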

## Binomial regression

- 20 February 2022
- Category: beginner

This notebook covers the logic behind Binomial regression, a specific instance of Generalized Linear Modelling. The example is kept very simple, with a single predictor variable.

## Bayesian mediation analysis

- 20 February 2022
- Category: beginner

This notebook covers Bayesian mediation analysis. This is useful when we want to explore possible mediating pathways between a predictor and an outcome variable.

## Lasso regression with block updating

- 10 February 2022
- Category: beginner

Sometimes, it is very useful to update a set of parameters together. For example, variables that are highly correlated are often good to update together. In PyMC block updating is simple. This will be demonstrated using the parameter `step` of `pymc.sample`.

## Bayesian Estimation Supersedes the T-Test

- 07 January 2022
- Category: beginner

## Using shared variables (Data container adaptation)

- 16 December 2021
- Category: beginner

The `pymc.Data` container class wraps the Theano shared variable class and lets the model be aware of its inputs and outputs. This allows one to change the value of an observed variable to predict or refit on new data. All variables of this class must be declared inside a model context and given a name.

## Using a “black box” likelihood function (numpy)

- 16 December 2021
- Category: beginner

This notebook is part of a set of two twin notebooks that perform the exact same task: this one uses NumPy, whereas the other one uses Cython.

## Sequential Monte Carlo

- 19 October 2021
- Category: beginner

Sampling from distributions with multiple peaks with standard MCMC methods can be difficult, if not impossible, as the Markov chain often gets stuck in one of the peaks. A Sequential Monte Carlo sampler (SMC) is a way to ameliorate this problem.

## Multivariate Gaussian Random Walk

- 25 September 2021
- Category: beginner

This notebook shows how to fit a correlated time series using multivariate Gaussian random walks (GRWs). In particular, we perform a Bayesian regression of the time series data against a model dependent on GRWs.

## Introduction to Bayesian A/B Testing

This notebook demonstrates how to implement a Bayesian analysis of an A/B test. We implement the models discussed in VWO’s Bayesian A/B Testing Whitepaper [Stucchio, 2015], and discuss the effect of different prior choices for these models. This notebook does *not* discuss other related topics like how to choose a prior, early stopping, and power analysis.