# Posts tagged regression

## Multivariate Gaussian Random Walk

- 02 February 2023

This notebook shows how to fit a correlated time series using multivariate Gaussian random walks (GRWs). In particular, we perform a Bayesian regression of the time series data against a model dependent on GRWs.

## Rolling Regression

- 28 January 2023

Pairs trading is a famous technique in algorithmic trading that plays two stocks against each other.

## Modeling Heteroscedasticity with BART

- 26 January 2023

In this notebook we show how to use BART to model heteroscedasticity as described in Section 4.1 of `pymc-bart`'s paper [Quiroga *et al.*, 2022]. We use the `marketing` data set provided by the R package `datarium` [Kassambara, 2019]. The idea is to model a marketing channel's contribution to sales as a function of budget.

## Quantile Regression with BART

- 25 January 2023

Usually when doing regression we model the conditional mean of some distribution. Common cases are a Normal distribution for continuous unbounded responses, a Poisson distribution for count data, etc.
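Quantile regression instead targets a conditional quantile by minimizing the pinball (quantile) loss. As a minimal, BART-free sketch (plain NumPy/SciPy on made-up data, not the notebook's model), the minimizer of the pinball loss for a constant prediction is simply the empirical quantile:

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(42)
y = rng.exponential(scale=2.0, size=2000)  # skewed "response" data

def pinball_loss(q_hat, y, tau):
    """Average pinball loss for a constant prediction q_hat at quantile tau."""
    resid = y - q_hat
    return np.mean(np.where(resid >= 0, tau * resid, (tau - 1) * resid))

tau = 0.9
result = minimize_scalar(pinball_loss, args=(y, tau), bounds=(0, 20), method="bounded")
# The minimizer of the pinball loss coincides with the empirical tau-quantile.
print(result.x, np.quantile(y, tau))
```

Replacing the constant with a flexible function of the covariates (BART, in the notebook) turns this into full quantile regression.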

## GLM: Robust Linear Regression

- 10 January 2023


## GLM: Poisson Regression

- 30 November 2022

This is a minimal reproducible example of Poisson regression to predict counts using dummy data.
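As a rough sketch of what such an example involves (made-up dummy data, and a maximum-likelihood fit via SciPy rather than the notebook's Bayesian model): counts are modeled as Poisson with a log-linear mean.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
true_beta = np.array([0.5, 1.2])  # intercept and slope on the log scale
y = rng.poisson(np.exp(true_beta[0] + true_beta[1] * x))  # dummy count data

X = np.column_stack([np.ones(n), x])

def neg_log_lik(beta):
    eta = X @ beta  # linear predictor = log of the Poisson mean
    # Poisson log-likelihood, dropping the constant log(y!) term
    return -np.sum(y * eta - np.exp(eta))

fit = minimize(neg_log_lik, x0=np.zeros(2), method="Nelder-Mead")
print(fit.x)  # estimates of intercept and slope, close to true_beta
```

The Bayesian version in the notebook places priors on the coefficients and samples the posterior instead of maximizing this likelihood.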

## Difference in differences

- 26 September 2022

This notebook provides a brief overview of the difference in differences approach to causal inference, and shows a working example of how to conduct this type of analysis under the Bayesian framework, using PyMC. While the notebook provides a high-level overview of the approach, I recommend consulting two excellent textbooks on causal inference: both The Effect [Huntington-Klein, 2021] and Causal Inference: The Mixtape [Cunningham, 2021] have chapters devoted to difference in differences.

## Bayesian regression with truncated or censored data

- 26 September 2022

The notebook provides an example of how to conduct linear regression when your outcome variable is either censored or truncated.
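The key idea is that censored observations should contribute the probability mass beyond the bound, not a density value. A minimal maximum-likelihood sketch of that censored (Tobit-style) likelihood using SciPy, on made-up data (the notebook itself builds the equivalent Bayesian model in PyMC):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(1)
n = 1000
x = rng.uniform(-3, 3, size=n)
y_latent = 1.0 + 0.8 * x + rng.normal(scale=1.0, size=n)
c = 2.0                           # upper censoring bound
y = np.minimum(y_latent, c)       # observed, right-censored outcome
censored = y_latent >= c          # in real data: known from y hitting the bound

def neg_log_lik(params):
    b0, b1, log_sigma = params
    sigma = np.exp(log_sigma)
    mu = b0 + b1 * x
    # Uncensored points contribute the Normal density; censored points
    # contribute the probability of exceeding the bound.
    ll_obs = norm.logpdf(y[~censored], mu[~censored], sigma)
    ll_cens = norm.logsf(c, mu[censored], sigma)
    return -(ll_obs.sum() + ll_cens.sum())

fit = minimize(neg_log_lik, x0=np.zeros(3), method="Nelder-Mead")
print(fit.x[:2])  # censoring-aware estimates of intercept and slope
```

Naive least squares on `y` would bias the slope toward zero; accounting for the censoring recovers the underlying line.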

## Counterfactual inference: calculating excess deaths due to COVID-19

- 26 July 2022

Causal reasoning and counterfactual thinking are really interesting but complex topics! Nevertheless, we can make headway into understanding the ideas through relatively simple examples. This notebook focuses on the concepts and the practical implementation of Bayesian causal reasoning using PyMC.

## Splines

- 04 June 2022

Often, the model we want to fit is not a perfect line between some \(x\) and \(y\).
Instead, the parameters of the model are expected to vary over \(x\).
There are multiple ways to handle this situation, one of which is to fit a *spline*.
A spline fit is effectively a sum of multiple individual curves (piecewise polynomials), each fit to a different section of \(x\), that are tied together at their boundaries, often called *knots*.
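The sum-of-local-curves idea can be seen in a toy sketch with piecewise-linear ("tent") basis functions, one per knot (real applications, including the notebook, typically use cubic B-splines instead):

```python
import numpy as np

rng = np.random.default_rng(7)
x = np.linspace(0, 10, 200)
y = np.sin(x) + rng.normal(scale=0.2, size=x.size)  # wiggly data a single line cannot fit

# One "tent" (hat) basis function per knot: linearly interpolating a unit
# vector over the knots yields exactly that hat function.
knots = np.linspace(0, 10, 9)
B = np.column_stack(
    [np.interp(x, knots, np.eye(knots.size)[j]) for j in range(knots.size)]
)

# Ordinary least squares on the basis: the fitted curve is a sum of the
# individual tent functions, tied together at the knots.
coef, *_ = np.linalg.lstsq(B, y, rcond=None)
y_hat = B @ coef
```

Each coefficient controls the curve only near its knot, which is what makes splines locally flexible.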

## Regression discontinuity design analysis

- 26 April 2022

Quasi-experiments involve experimental interventions and quantitative measures. However, quasi-experiments do *not* involve random assignment of units (e.g. cells, people, companies, schools, states) to test or control groups. This inability to conduct random assignment poses problems when making causal claims, as it makes it harder to argue that any difference between a control and test group is because of an intervention rather than a confounding factor.

## Bayesian mediation analysis

- 26 February 2022

This notebook covers Bayesian mediation analysis. This is useful when we want to explore possible mediating pathways between a predictor and an outcome variable.

## Lasso regression with block updating

- 10 February 2022

Sometimes, it is very useful to update a set of parameters together. For example, variables that are highly correlated are often good to update together. In PyMC, block updating is simple. This will be demonstrated using the `step` parameter of `pymc.sample`.
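As a library-free toy illustration of what a block update means (standalone NumPy, not PyMC's actual step machinery), here is a Metropolis sampler that proposes both components of a strongly correlated posterior in a single joint move:

```python
import numpy as np

rng = np.random.default_rng(3)

# Target: a strongly correlated bivariate Normal, the kind of posterior
# where updating both parameters as one block helps.
rho = 0.95
inv_cov = np.linalg.inv(np.array([[1.0, rho], [rho, 1.0]]))

def log_post(theta):
    return -0.5 * theta @ inv_cov @ theta

theta = np.zeros(2)
draws = np.empty((20000, 2))
for i in range(draws.shape[0]):
    # Block update: one joint proposal for both parameters at once.
    proposal = theta + rng.normal(scale=0.5, size=2)
    if np.log(rng.uniform()) < log_post(proposal) - log_post(theta):
        theta = proposal
    draws[i] = theta
```

In PyMC, the analogous choice is made by handing a step method covering several variables to `pymc.sample` via its `step` parameter.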

## Bayesian Additive Regression Trees: Introduction

- 21 December 2021

Bayesian additive regression trees (BART) is a non-parametric regression approach. If we have some covariates \(X\) and we want to use them to model \(Y\), a BART model (omitting the priors) can be represented as:
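The excerpt ends before the formula; in the standard sum-of-trees notation of Chipman et al. (2010) (the usual form, not necessarily the notebook's exact notation), it reads:

\[
Y = \sum_{j=1}^{m} g_j(X; T_j, M_j) + \epsilon, \qquad \epsilon \sim \mathcal{N}(0, \sigma^2),
\]

where each \(g_j\) is a regression tree with structure \(T_j\) and leaf values \(M_j\).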

## GLM: Robust Regression using Custom Likelihood for Outlier Classification

- 17 November 2021

Using PyMC for robust regression with outlier detection, based on the Hogg 2010 Signal vs. Noise method.