{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Running on PyMC v4.4.0+213.g85ca9123f.dirty\n" ] } ], "source": [ "import arviz as az\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import pymc as pm\n", "\n", "print(f\"Running on PyMC v{pm.__version__}\")" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "az.style.use(\"arviz-darkgrid\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(model_comparison)=\n", "# Model comparison\n", "\n", "To demonstrate the use of model comparison criteria in PyMC, we implement the **8 schools** example from Section 5.5 of Gelman et al. (2003), which attempts to infer the effects of coaching on SAT scores of students from 8 schools. Below, we fit a **pooled model**, which assumes a single fixed effect across all schools, and a **hierarchical model** that allows for a random effect that partially pools the data." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The data include the observed treatment effects (`y`) and associated standard deviations (`sigma`) in the 8 schools." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "y = np.array([28, 8, -3, 7, -1, 1, 18, 12])\n", "sigma = np.array([15, 10, 16, 11, 9, 11, 10, 18])\n", "J = len(y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Pooled model" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Auto-assigning NUTS sampler...\n", "Initializing NUTS using jitter+adapt_diag...\n", "Multiprocess sampling (4 chains in 4 jobs)\n", "NUTS: [mu]\n" ] }, { "data": { "text/html": [ "\n", "
\n", " \n", " 100.00% [12000/12000 00:02<00:00 Sampling 4 chains, 0 divergences]\n", "