Posts in beginner
GLM: Negative Binomial Regression
- 22 June 2022
- Category: beginner
This notebook closely follows the GLM Poisson regression example by Jonathan Sedar (which is in turn inspired by a project by Ian Osvald) except the data here is negative binomially distributed instead of Poisson distributed.
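As a rough orientation, a negative binomial GLM with a log link can be sketched in PyMC as below (toy data and priors for illustration only, not the notebook's actual model):

```python
import numpy as np
import pymc as pm

# toy count data with one predictor (hypothetical, for illustration only)
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = rng.negative_binomial(n=5, p=0.5, size=100)

with pm.Model() as nb_model:
    intercept = pm.Normal("intercept", 0, 2)
    slope = pm.Normal("slope", 0, 2)
    alpha = pm.Exponential("alpha", 1.0)      # dispersion parameter
    mu = pm.math.exp(intercept + slope * x)   # log link, as in a Poisson GLM
    pm.NegativeBinomial("y", mu=mu, alpha=alpha, observed=y)
    idata = pm.sample()
```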
Splines
- 04 June 2022
- Category: beginner
Often, the model we want to fit is not a perfect line between some \(x\) and \(y\). Instead, the parameters of the model are expected to vary over \(x\). There are multiple ways to handle this situation, one of which is to fit a spline. The spline is effectively multiple individual lines, each fit to a different section of \(x\), that are tied together at their boundaries, often called knots.
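One common way to build such a fit is with a B-spline basis (here constructed via patsy) and one regression weight per basis function. The following is a minimal sketch with made-up data, not the notebook's example:

```python
import numpy as np
import pymc as pm
from patsy import dmatrix

# made-up smooth data
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 200)
y = np.sin(x) + rng.normal(0, 0.2, size=x.size)

# cubic B-spline basis; the interior knots are where the pieces join
knots = np.linspace(0, 10, 7)[1:-1]
B = np.asarray(
    dmatrix("bs(x, knots=knots, degree=3, include_intercept=True) - 1",
            {"x": x, "knots": knots})
)

with pm.Model() as spline_model:
    w = pm.Normal("w", 0, 2, shape=B.shape[1])      # one weight per basis function
    mu = pm.Deterministic("mu", pm.math.dot(B, w))  # the spline itself
    sigma = pm.HalfNormal("sigma", 1)
    pm.Normal("y", mu=mu, sigma=sigma, observed=y)
    idata = pm.sample()
```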
Bayes Factors and Marginal Likelihood
- 01 June 2022
- Category: beginner, explanation
The “Bayesian way” to compare models is to compute the marginal likelihood of each model \(p(y \mid M_k)\), i.e. the probability of the observed data \(y\) given the \(M_k\) model. This quantity, the marginal likelihood, is just the normalizing constant of Bayes’ theorem. We can see this if we write Bayes’ theorem and make explicit the fact that all inferences are model-dependent.
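With the model dependence written explicitly, Bayes’ theorem for the parameters \(\theta\) of model \(M_k\) reads

\[
p(\theta \mid y, M_k) = \frac{p(y \mid \theta, M_k)\, p(\theta \mid M_k)}{p(y \mid M_k)},
\qquad
p(y \mid M_k) = \int p(y \mid \theta, M_k)\, p(\theta \mid M_k)\, d\theta ,
\]

so the marginal likelihood is the likelihood averaged over the prior, i.e. the denominator that normalizes the posterior.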
Sampler Statistics
- 31 May 2022
- Category: beginner
When checking for convergence or when debugging a badly behaving sampler, it is often helpful to take a closer look at what the sampler is doing. For this purpose some samplers export statistics for each generated sample.
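For example (a minimal sketch, assuming the default NUTS sampler and an InferenceData return value), the per-draw statistics end up in the `sample_stats` group:

```python
import pymc as pm

# a throwaway model, just to show where sampler statistics are stored
with pm.Model():
    pm.Normal("x", 0, 1)
    idata = pm.sample()

# NUTS records per-draw statistics such as step size, tree depth and divergences
print(idata.sample_stats)
print(int(idata.sample_stats["diverging"].sum()))  # total divergent transitions
```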
General API quickstart
- 31 May 2022
- Category: beginner
Models in PyMC are centered around the Model class. It has references to all random variables (RVs) and computes the model logp and its gradients. Usually, you would instantiate it as part of a with context:
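A minimal sketch: RVs created inside the `with` block register themselves with the `Model` instance, which then knows how to compute the joint log-probability.

```python
import pymc as pm

with pm.Model() as model:
    mu = pm.Normal("mu", mu=0, sigma=1)
    obs = pm.Normal("obs", mu=mu, sigma=1, observed=[0.1, -0.3, 0.2])

print(model.basic_RVs)           # the random variables the model knows about

# compile and evaluate the model log-probability (PyMC v4+ API)
logp_fn = model.compile_logp()
print(logp_fn({"mu": 0.0}))
```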
Approximate Bayesian Computation
- 31 May 2022
- Category: beginner, explanation
Approximate Bayesian Computation methods (also called likelihood-free inference methods) are a group of techniques developed for inferring posterior distributions in cases where the likelihood function is intractable or costly to evaluate. This does not mean that the likelihood function is not part of the analysis; it is just that we are approximating the likelihood, hence the name of the ABC methods.
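A minimal sketch, close in spirit to the notebook: a toy Gaussian forward simulator stands in for an explicit likelihood, and SMC-ABC compares simulated and observed data through a distance and summary statistic.

```python
import numpy as np
import pymc as pm

data = np.random.normal(loc=0, scale=1, size=1000)

def normal_simulator(rng, mu, sigma, size=None):
    # forward simulator used in place of an explicit likelihood
    return rng.normal(mu, sigma, size=size)

with pm.Model() as abc_model:
    mu = pm.Normal("mu", 0, 5)
    sigma = pm.HalfNormal("sigma", 5)
    pm.Simulator("y", normal_simulator, params=(mu, sigma),
                 sum_stat="sort", epsilon=1, observed=data)
    idata = pm.sample_smc()
```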
Regression discontinuity design analysis
- 22 April 2022
- Category: beginner, explanation
Quasi-experiments involve experimental interventions and quantitative measures. However, quasi-experiments do not involve random assignment of units (e.g. cells, people, companies, schools, states) to test or control groups. This inability to conduct random assignment poses problems when making causal claims, as it makes it harder to argue that any difference between the control and test groups is due to the intervention rather than a confounding factor.
Gaussian Mixture Model
- 22 April 2022
- Category: beginner
A mixture model allows us to make inferences about the component contributors to a distribution of data. More specifically, a Gaussian Mixture Model allows us to make inferences about the means and standard deviations of a specified number of underlying component Gaussian distributions.
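A minimal sketch of a two-component Gaussian mixture in PyMC (made-up data; the notebook's priors and data differ):

```python
import numpy as np
import pymc as pm

# hypothetical data drawn from two overlapping Gaussians
rng = np.random.default_rng(2)
data = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 0.5, 200)])

with pm.Model() as gmm:
    w = pm.Dirichlet("w", a=np.ones(2))               # mixture weights
    mu = pm.Normal("mu", mu=0, sigma=5, shape=2)      # component means
    sigma = pm.HalfNormal("sigma", sigma=2, shape=2)  # component standard deviations
    pm.NormalMixture("obs", w=w, mu=mu, sigma=sigma, observed=data)
    idata = pm.sample()
```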
Bayesian moderation analysis
- 22 March 2022
- Category: beginner
This notebook covers Bayesian moderation analysis. This is appropriate when we believe that one predictor variable (the moderator) may influence the linear relationship between another predictor variable and an outcome. Here we look at an example involving the relationship between hours of training and muscle mass, where age (the moderating variable) may affect this relationship.
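In broad terms (the notebook's exact notation and priors may differ), moderation is expressed with an interaction term between the predictor \(x\) and the moderator \(m\):

\[
y \sim \mathrm{Normal}\big(\beta_0 + \beta_1 x + \beta_2 x\,m + \beta_3 m,\ \sigma\big),
\]

so the slope of \(y\) on \(x\) is \(\beta_1 + \beta_2 m\) and therefore changes with the moderator.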
Binomial regression
- 22 February 2022
- Category: beginner
This notebook covers the logic behind Binomial regression, a specific instance of Generalized Linear Modelling. The example is kept very simple, with a single predictor variable.
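A minimal sketch of such a model with a logit link (hypothetical data and priors, not the notebook's example):

```python
import numpy as np
import pymc as pm

# hypothetical data: n trials per observation, k successes, one predictor x
rng = np.random.default_rng(6)
x = np.linspace(-2, 2, 50)
n = np.full(50, 20)
k = rng.binomial(n, 1 / (1 + np.exp(-(0.5 + 1.5 * x))))

with pm.Model() as binomial_regression:
    intercept = pm.Normal("intercept", 0, 2)
    slope = pm.Normal("slope", 0, 2)
    p = pm.Deterministic("p", pm.math.invlogit(intercept + slope * x))  # logit link
    pm.Binomial("k", n=n, p=p, observed=k)
    idata = pm.sample()
```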
Bayesian mediation analysis
- 22 February 2022
- Category: beginner
This notebook covers Bayesian mediation analysis. This is useful when we want to explore possible mediating pathways between a predictor and an outcome variable.
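For orientation, the standard simple mediation model (the notebook's notation may differ) couples two regressions, one for the mediator \(m\) and one for the outcome \(y\):

\[
m \sim \mathrm{Normal}(i_M + a\,x,\ \sigma_M), \qquad
y \sim \mathrm{Normal}(i_Y + b\,m + c'\,x,\ \sigma_Y),
\]

where \(a \cdot b\) is the indirect (mediated) effect and \(c'\) is the direct effect.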
Lasso regression with block updating
- 10 February 2022
- Category: beginner
Sometimes, it is very useful to update a set of parameters together. For example, variables that are highly correlated are often good to update together. In PyMC, block updating is simple; this will be demonstrated using the step parameter of pymc.sample.
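A minimal sketch of the idea, with made-up correlated predictors and Laplace (lasso-style) priors: a single Metropolis step object covering both coefficients updates them as a block.

```python
import numpy as np
import pymc as pm

# hypothetical correlated-predictor regression, just to show block updating
rng = np.random.default_rng(3)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.1, size=100)  # highly correlated with x1
y = 1.0 * x1 - 0.5 * x2 + rng.normal(scale=0.5, size=100)

with pm.Model() as lasso_model:
    beta1 = pm.Laplace("beta1", mu=0, b=1)  # Laplace priors give the lasso penalty
    beta2 = pm.Laplace("beta2", mu=0, b=1)
    sigma = pm.HalfNormal("sigma", 1)
    pm.Normal("y", mu=beta1 * x1 + beta2 * x2, sigma=sigma, observed=y)

    # one step method assigned to both coefficients -> they are proposed jointly
    step = pm.Metropolis([beta1, beta2])
    idata = pm.sample(step=step)
```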
Bayesian regression with truncated or censored data
- 22 January 2022
- Category: beginner
The notebook provides an example of how to conduct linear regression when your outcome variable is either censored or truncated.
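One way to handle the censored case in recent PyMC versions is the pm.Censored wrapper; the following is a minimal sketch with made-up data censored at an upper bound of 1.5 (an assumed value for illustration):

```python
import numpy as np
import pymc as pm

# hypothetical censored data: values above 1.5 are recorded as 1.5
rng = np.random.default_rng(4)
raw = rng.normal(loc=1.0, scale=1.0, size=200)
censored = np.clip(raw, None, 1.5)

with pm.Model() as censored_model:
    mu = pm.Normal("mu", 0, 2)
    sigma = pm.HalfNormal("sigma", 2)
    # pm.Censored wraps a base distribution and accounts for the censoring bound
    pm.Censored("obs", pm.Normal.dist(mu=mu, sigma=sigma),
                lower=None, upper=1.5, observed=censored)
    idata = pm.sample()
```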
Bayesian Estimation Supersedes the T-Test
- 07 January 2022
- Category: beginner
Using shared variables (Data container adaptation)
- 16 December 2021
- Category: beginner
The pymc.Data container class wraps the theano shared variable class and lets the model be aware of its inputs and outputs. This allows one to change the value of an observed variable to predict or refit on new data. All variables of this class must be declared inside a model context and given a name.
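A minimal sketch of the pattern (the exact data-container API has shifted a bit across PyMC versions; toy data here is hypothetical):

```python
import pymc as pm

with pm.Model() as model:
    x = pm.Data("x", [1.0, 2.0, 3.0])        # wrapped in a shared variable
    y_obs = pm.Data("y_obs", [1.1, 1.9, 3.2])
    slope = pm.Normal("slope", 0, 2)
    sigma = pm.HalfNormal("sigma", 1)
    pm.Normal("y", mu=slope * x, sigma=sigma, observed=y_obs)
    idata = pm.sample()

with model:
    # swap in new inputs and generate predictions without rebuilding the model
    pm.set_data({"x": [4.0, 5.0], "y_obs": [0.0, 0.0]})
    post_pred = pm.sample_posterior_predictive(idata)
```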
Using a “black box” likelihood function (numpy)
- 16 December 2021
- Category: beginner
This notebook is part of a set of two twin notebooks that perform the exact same task; this one uses numpy, whereas the other one uses Cython.
Sequential Monte Carlo
- 19 October 2021
- Category: beginner
Sampling from distributions with multiple peaks with standard MCMC methods can be difficult, if not impossible, as the Markov chain often gets stuck in one of the modes. A Sequential Monte Carlo sampler (SMC) is a way to ameliorate this problem.
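A minimal sketch of the idea (made-up data; by symmetry the posterior for `mu` is bimodal, which a population of SMC particles can cover):

```python
import numpy as np
import pymc as pm

# hypothetical data: a symmetric two-component mixture makes +mu and -mu equally plausible
data = np.concatenate([np.random.normal(-5, 1, 100), np.random.normal(5, 1, 100)])

with pm.Model() as bimodal:
    mu = pm.Normal("mu", 0, 10)
    pm.NormalMixture("obs", w=[0.5, 0.5],
                     mu=pm.math.stack([mu, -mu]), sigma=[1.0, 1.0], observed=data)
    # SMC moves weighted particles through a sequence of tempered distributions,
    # so both posterior modes tend to be represented in the final sample
    idata = pm.sample_smc()
```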
Multivariate Gaussian Random Walk
- 25 September 2021
- Category: beginner
This notebook shows how to fit a correlated time series using multivariate Gaussian random walks (GRWs). In particular, we perform a Bayesian regression of the time series data against a model dependent on GRWs.