Posted in 2024

Simpson’s paradox

Simpson’s paradox describes a situation where a relationship between two variables, for example a negative one, holds within each of several groups, but disappears or even reverses sign when the groups’ data are combined. The gif below (from the Simpson’s paradox Wikipedia page) demonstrates this very nicely.
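
As a minimal simulated sketch of the effect (hypothetical numbers, not taken from the notebook), the snippet below builds two groups whose within-group slopes are negative while the pooled slope is positive:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Two groups: within each, y falls with x (slope about -1), but group B
# sits higher in both x and y, so pooling the data flips the trend.
x_a = rng.uniform(0, 1, 100)
x_b = rng.uniform(2, 3, 100)
df = pd.DataFrame({
    "x": np.concatenate([x_a, x_b]),
    "y": np.concatenate([1 - x_a, 5 - x_b]) + rng.normal(0, 0.1, 200),
    "group": ["A"] * 100 + ["B"] * 100,
})

# Within-group slopes are negative; the pooled slope is positive.
for name, sub in df.groupby("group"):
    print(name, np.polyfit(sub["x"], sub["y"], 1)[0])
print("pooled", np.polyfit(df["x"], df["y"], 1)[0])
```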

Read more ...


Confirmatory Factor Analysis and Structural Equation Models in Psychometrics

“Evidently, the notions of relevance and dependence are far more basic to human reasoning than the numerical values attached to probability judgments…the language used for representing probabilistic information should allow assertions about dependency relationships to be expressed qualitatively, directly, and explicitly” - Pearl, Probabilistic Reasoning in Intelligent Systems [1985]

Read more ...


The prevalence of malaria in the Gambia

Read more ...


Model Averaging

When confronted with more than one model we have several options. One of them is to perform model selection, as exemplified by the PyMC examples Model comparison and GLM: Model Selection; it is usually a good idea to also include posterior predictive checks when deciding which model to keep. Discarding all models except one is equivalent to affirming that, among the evaluated models, one is correct (under some criterion) with probability 1 and the rest are incorrect. In most cases this is an overstatement that ignores the uncertainty we have in our models. It is somewhat like computing the full posterior and then keeping only a point estimate such as the posterior mean: we may become overconfident about what we really know. You can also browse the blog/tag/model-comparison tag to find related posts.
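
As a rough sketch of the averaging alternative, one can compute stacking weights over the candidate models instead of keeping a single winner. The toy comparison below (made-up data and model names, not the notebook's) uses ArviZ:

```python
import arviz as az
import numpy as np
import pymc as pm

# Hypothetical data; the point is the workflow, not the models.
x = np.linspace(0, 1, 50)
y = 2 * x + np.random.default_rng(0).normal(0, 0.2, 50)

def fit(order):
    """Fit a polynomial regression of the given order."""
    with pm.Model():
        beta = pm.Normal("beta", 0, 10, shape=order + 1)
        mu = sum(beta[i] * x**i for i in range(order + 1))
        sigma = pm.HalfNormal("sigma", 1)
        pm.Normal("obs", mu, sigma, observed=y)
        return pm.sample(idata_kwargs={"log_likelihood": True})

# Stacking weights express uncertainty across models instead of
# assigning probability 1 to a single "best" one.
comparison = az.compare({"linear": fit(1), "quadratic": fit(2)},
                        method="stacking")
print(comparison[["rank", "weight"]])
```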

Read more ...


Time Series Models Derived From a Generative Graph

In this notebook, we show how to build and fit a time series model starting from a generative graph. In particular, we explain how to use scan to loop efficiently inside a PyMC model.
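
As a taste of the mechanics, here is a minimal sketch of an AR(1) recursion written with scan at the PyTensor level, outside of a full PyMC model (the parameter values are made up):

```python
import numpy as np
import pytensor
import pytensor.tensor as pt

rho = pt.dscalar("rho")  # autoregressive coefficient
x0 = pt.dscalar("x0")    # initial state
eps = pt.dvector("eps")  # one innovation per time step

def step(eps_t, x_prev, rho):
    # x_t = rho * x_{t-1} + eps_t
    return rho * x_prev + eps_t

trajectory, _ = pytensor.scan(
    fn=step,
    sequences=[eps],
    outputs_info=[x0],
    non_sequences=[rho],
)

f = pytensor.function([x0, eps, rho], trajectory)
print(f(0.0, np.ones(5), 0.5))  # [1.0, 1.5, 1.75, 1.875, 1.9375]
```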

Read more ...


Gaussian Processes: HSGP Advanced Usage

The Hilbert Space Gaussian processes approximation is a low-rank GP approximation that is particularly well-suited to usage in probabilistic programming languages like PyMC. It approximates the GP using a pre-computed and fixed set of basis functions that don’t depend on the form of the covariance kernel or its hyperparameters. It’s a parametric approximation, so prediction in PyMC can be done as one would with a linear model via pm.Data or pm.set_data. You don’t need to define the .conditional distribution that non-parametric GPs rely on. This makes it much easier to integrate an HSGP, instead of a GP, into your existing PyMC model. Additionally, unlike many other GP approximations, HSGPs can be used anywhere within a model and with any likelihood function.
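
For a flavor of that flexibility, here is a hedged sketch (made-up data and hyperpriors) in which an HSGP prior feeds the rate of a Poisson likelihood:

```python
import numpy as np
import pymc as pm

# Hypothetical count data observed at one-dimensional inputs.
X = np.linspace(0, 10, 100)[:, None]
y = np.random.default_rng(1).poisson(3, size=100)

with pm.Model():
    ell = pm.InverseGamma("ell", mu=2.0, sigma=1.0)
    eta = pm.Exponential("eta", 1.0)
    cov = eta**2 * pm.gp.cov.Matern52(1, ls=ell)

    # m basis functions and boundary factor c define the approximation.
    # Because it is parametric, f can sit anywhere in the model, here
    # inside a non-Gaussian likelihood.
    gp = pm.gp.HSGP(m=[25], c=1.5, cov_func=cov)
    f = gp.prior("f", X=X)
    pm.Poisson("obs", mu=pm.math.exp(f), observed=y)
```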

Read more ...


Gaussian Processes: HSGP Reference & First Steps

The Hilbert Space Gaussian processes approximation is a low-rank GP approximation that is particularly well-suited to usage in probabilistic programming languages like PyMC. It approximates the GP using a pre-computed and fixed set of basis functions that don’t depend on the form of the covariance kernel or its hyperparameters. It’s a parametric approximation, so prediction in PyMC can be done as one would with a linear model via pm.Data or pm.set_data. You don’t need to define the .conditional distribution that non-parametric GPs rely on. This makes it much easier to integrate an HSGP, instead of a GP, into your existing PyMC model. Additionally, unlike many other GP approximations, HSGPs can be used anywhere within a model and with any likelihood function.
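
A minimal sketch of that linear-model-style prediction workflow, assuming made-up training data and fixed hyperparameters:

```python
import numpy as np
import pymc as pm

X_train = np.linspace(0, 10, 80)[:, None]
y_train = np.sin(X_train).ravel() + np.random.default_rng(2).normal(0, 0.3, 80)

with pm.Model() as model:
    X = pm.Data("X", X_train)
    gp = pm.gp.HSGP(m=[30], c=1.5, cov_func=pm.gp.cov.ExpQuad(1, ls=1.0))
    f = gp.prior("f", X=X)
    sigma = pm.HalfNormal("sigma", 1.0)
    # shape follows X so the likelihood resizes with new inputs.
    pm.Normal("obs", mu=f, sigma=sigma, observed=y_train, shape=X.shape[0])
    idata = pm.sample()

# Prediction works like a linear model: swap in new inputs with
# pm.set_data; no .conditional distribution is needed.
X_new = np.linspace(10, 12, 20)[:, None]
with model:
    pm.set_data({"X": X_new})
    preds = pm.sample_posterior_predictive(idata, var_names=["obs"])
```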

Read more ...


Categorical regression

In this example, we will model outcomes with more than two categories.
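
A minimal sketch of such a model, with hypothetical data and a softmax link over three categories (in practice one category's coefficients are often pinned to zero for identifiability):

```python
import numpy as np
import pymc as pm

# Hypothetical data: one predictor, three outcome categories.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = rng.integers(0, 3, size=200)

with pm.Model():
    # One intercept and slope per category; softmax turns the linear
    # predictors into probabilities that sum to one per observation.
    alpha = pm.Normal("alpha", 0, 1, shape=3)
    beta = pm.Normal("beta", 0, 1, shape=3)
    eta = alpha + beta * x[:, None]        # shape (200, 3)
    p = pm.math.softmax(eta, axis=-1)
    pm.Categorical("obs", p=p, observed=y)
    idata = pm.sample()
```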

Read more ...


Automatic marginalization of discrete variables

PyMC is very amenable to sampling models with discrete latent variables. But if you insist on using the NUTS sampler exclusively, you will need to get rid of your discrete variables somehow. The best way to do this is by marginalizing them out: you then benefit from the Rao-Blackwell theorem and get lower-variance estimates of your parameters.
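
The notebook covers tooling that automates this; as a hand-rolled flavor of the underlying idea, the built-in pm.Mixture below sums a two-component assignment out of the likelihood analytically (hypothetical data):

```python
import numpy as np
import pymc as pm

# Two overlapping Gaussian clusters; the component label of each
# observation is a discrete latent variable we never sample.
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(-2, 1, 100), rng.normal(2, 1, 100)])

with pm.Model():
    w = pm.Dirichlet("w", a=np.ones(2))
    mu = pm.Normal("mu", 0, 5, shape=2)
    sigma = pm.HalfNormal("sigma", 1, shape=2)
    # The component index is marginalized:
    # log p(y) = logsumexp_k [log w_k + log N(y | mu_k, sigma_k)],
    # leaving only continuous parameters for NUTS.
    pm.Mixture("obs", w=w,
               comp_dists=pm.Normal.dist(mu=mu, sigma=sigma),
               observed=y)
    idata = pm.sample()
```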

Read more ...


Bayesian Non-parametric Causal Inference

There are few claims stronger than the assertion of a causal relationship, and few claims more contestable. A naive world model, rich with tenuous connections and non sequitur implications, is characteristic of conspiracy theory and idiocy. On the other hand, a refined and detailed knowledge of cause and effect, characterised by clear expectations, plausible connections and compelling counterfactuals, will steer you well through the blooming, buzzing confusion of the world.

Read more ...


Baby Births Modelling with HSGPs

This notebook provides an example of using the Hilbert Space Gaussian Process (HSGP) technique, introduced in [Solin and Särkkä, 2020], in the context of time series modeling. This technique has proven successful in speeding up models with Gaussian process components.

Read more ...