{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "(howto_debugging)=\n", "# How to debug a model\n", "\n", ":::{post} August 2, 2022\n", ":tags: debugging, PyTensor\n", ":category: beginner\n", ":author: Thomas Wiecki, Igor Kuvychko\n", ":::" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Introduction\n", "There are various levels on which to debug a model. One of the simplest is to just print out the values that different variables are taking on.\n", "\n", "Because `PyMC` uses `PyTensor` expressions to build the model, and not functions, there is no way to place a `print` statement into a likelihood function. Instead, you can use the {class}`pytensor.printing.Print` class to print intermediate values." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import arviz as az\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import pandas as pd\n", "import pymc as pm" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "%config InlineBackend.figure_format = \"retina\"\n", "\n", "RANDOM_SEED = 8927" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### How to print intermediate values of `PyTensor` functions\n", "Since `PyTensor` functions are compiled to C, you have to use `pytensor.printing.Print` class to print intermediate values (imported below as `Print`). Python `print` function will not work. Below is a simple example of using `Print`. For more information, see {ref}`Debugging PyTensor `." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "import pytensor.tensor as pt\n", "\n", "from pytensor import function\n", "from pytensor.printing import Print" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ inf, 0.5 , 0.25])" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = pt.dvector(\"x\")\n", "y = pt.dvector(\"y\")\n", "func = function([x, y], 1 / (x - y))\n", "func([1, 2, 3], [1, 0, -1])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To see what causes the `inf` value in the output, we can print intermediate values of $(x-y)$ using `Print`. `Print` class simply passes along its caller but prints out its value along a user-define message:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "x - y = __str__ = [0. 2. 4.]\n" ] }, { "data": { "text/plain": [ "array([ inf, 0.5 , 0.25])" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "z_with_print = Print(\"x - y = \")(x - y)\n", "func_with_print = function([x, y], 1 / z_with_print)\n", "func_with_print([1, 2, 3], [1, 0, -1])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`Print` reveals the root cause: $(x-y)$ takes a zero value when $x=1, y=1$, causing the `inf` output." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### How to capture `Print` output for further analysis\n", "\n", "When we expect many rows of output from `Print`, it can be desirable to redirect the output to a string buffer and access the values later on (thanks to **Lindley Lentati** for inspiring this example). Here is a toy example using Python `print` function:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['Test values: 0',\n", " 'Test values: 1',\n", " 'Test values: 2',\n", " 'Test values: 3',\n", " 'Test values: 4',\n", " '']" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import sys\n", "\n", "from io import StringIO\n", "\n", "old_stdout = sys.stdout\n", "mystdout = sys.stdout = StringIO()\n", "\n", "for i in range(5):\n", " print(f\"Test values: {i}\")\n", "\n", "output = mystdout.getvalue().split(\"\\n\")\n", "sys.stdout = old_stdout # setting sys.stdout back\n", "output" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Troubleshooting a toy PyMC model" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "rng = np.random.default_rng(RANDOM_SEED)\n", "x = rng.normal(size=100)\n", "\n", "with pm.Model() as model:\n", " # priors\n", " mu = pm.Normal(\"mu\", mu=0, sigma=1)\n", " sd = pm.Normal(\"sd\", mu=0, sigma=1)\n", "\n", " # setting out printing for mu and sd\n", " mu_print = Print(\"mu\")(mu)\n", " sd_print = Print(\"sd\")(sd)\n", "\n", " # likelihood\n", " obs = pm.Normal(\"obs\", mu=mu_print, sigma=sd_print, observed=x)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "\r\n", "\r\n", "\r\n", "\r\n", "\r\n", "\r\n", "%3\r\n", "\r\n", "cluster100\r\n", "\r\n", "100\r\n", "\r\n", "\r\n", "sd\r\n", "\r\n", "sd\r\n", "~\r\n", "Normal\r\n", "\r\n", "\r\n", "obs\r\n", "\r\n", "obs\r\n", "~\r\n", "Normal\r\n", "\r\n", "\r\n", "sd->obs\r\n", "\r\n", "\r\n", "\r\n", "\r\n", "mu\r\n", "\r\n", "mu\r\n", "~\r\n", "Normal\r\n", "\r\n", "\r\n", "mu->obs\r\n", "\r\n", "\r\n", "\r\n", "\r\n", "\r\n" ], "text/plain": [ "" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pm.model_to_graphviz(model)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Only 5 samples in chain.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "sd __str__ = 0.0\n", "mu __str__ = 0.0\n" ] }, { "ename": "SamplingError", "evalue": "Initial evaluation of model at starting point failed!\nStarting values:\n{'mu': array(0.), 'sd': array(0.)}\n\nInitial evaluation results:\n{'mu': -0.92, 'sd': -0.92, 'obs': -inf}", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mSamplingError\u001b[0m Traceback (most recent call last)", "Input \u001b[1;32mIn [9]\u001b[0m, in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[0;32m 1\u001b[0m \u001b[38;5;28;01mwith\u001b[39;00m model:\n\u001b[0;32m 2\u001b[0m step \u001b[38;5;241m=\u001b[39m pm\u001b[38;5;241m.\u001b[39mMetropolis()\n\u001b[1;32m----> 3\u001b[0m trace \u001b[38;5;241m=\u001b[39m \u001b[43mpm\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43msample\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m5\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mstep\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mtune\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;241;43m0\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mchains\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;241;43m1\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mprogressbar\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mFalse\u001b[39;49;00m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mrandom_seed\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mRANDOM_SEED\u001b[49m\u001b[43m)\u001b[49m\n", "File \u001b[1;32mc:\\users\\igork\\pycharmprojects\\pymc\\pymc\\sampling.py:558\u001b[0m, in \u001b[0;36msample\u001b[1;34m(draws, step, init, n_init, initvals, trace, chain_idx, chains, cores, tune, progressbar, model, random_seed, discard_tuned_samples, compute_convergence_checks, callback, jitter_max_retries, return_inferencedata, idata_kwargs, mp_ctx, **kwargs)\u001b[0m\n\u001b[0;32m 556\u001b[0m \u001b[38;5;66;03m# One final check that shapes and logps at the starting points are okay.\u001b[39;00m\n\u001b[0;32m 557\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m ip \u001b[38;5;129;01min\u001b[39;00m initial_points:\n\u001b[1;32m--> 558\u001b[0m \u001b[43mmodel\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mcheck_start_vals\u001b[49m\u001b[43m(\u001b[49m\u001b[43mip\u001b[49m\u001b[43m)\u001b[49m\n\u001b[0;32m 559\u001b[0m _check_start_shape(model, ip)\n\u001b[0;32m 561\u001b[0m sample_args \u001b[38;5;241m=\u001b[39m {\n\u001b[0;32m 562\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mdraws\u001b[39m\u001b[38;5;124m\"\u001b[39m: draws,\n\u001b[0;32m 563\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mstep\u001b[39m\u001b[38;5;124m\"\u001b[39m: step,\n\u001b[1;32m (...)\u001b[0m\n\u001b[0;32m 573\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mdiscard_tuned_samples\u001b[39m\u001b[38;5;124m\"\u001b[39m: discard_tuned_samples,\n\u001b[0;32m 574\u001b[0m }\n", "File \u001b[1;32mc:\\users\\igork\\pycharmprojects\\pymc\\pymc\\model.py:1794\u001b[0m, in \u001b[0;36mModel.check_start_vals\u001b[1;34m(self, start)\u001b[0m\n\u001b[0;32m 1791\u001b[0m initial_eval \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mpoint_logps(point\u001b[38;5;241m=\u001b[39melem)\n\u001b[0;32m 1793\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28mall\u001b[39m(np\u001b[38;5;241m.\u001b[39misfinite(v) \u001b[38;5;28;01mfor\u001b[39;00m v \u001b[38;5;129;01min\u001b[39;00m initial_eval\u001b[38;5;241m.\u001b[39mvalues()):\n\u001b[1;32m-> 1794\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m SamplingError(\n\u001b[0;32m 1795\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mInitial evaluation of model at starting point failed!\u001b[39m\u001b[38;5;130;01m\\n\u001b[39;00m\u001b[38;5;124m\"\u001b[39m\n\u001b[0;32m 1796\u001b[0m \u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mStarting values:\u001b[39m\u001b[38;5;130;01m\\n\u001b[39;00m\u001b[38;5;132;01m{\u001b[39;00melem\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;130;01m\\n\u001b[39;00m\u001b[38;5;130;01m\\n\u001b[39;00m\u001b[38;5;124m\"\u001b[39m\n\u001b[0;32m 1797\u001b[0m \u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mInitial evaluation results:\u001b[39m\u001b[38;5;130;01m\\n\u001b[39;00m\u001b[38;5;132;01m{\u001b[39;00minitial_eval\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m\n\u001b[0;32m 1798\u001b[0m )\n", "\u001b[1;31mSamplingError\u001b[0m: Initial evaluation of model at starting point failed!\nStarting values:\n{'mu': array(0.), 'sd': array(0.)}\n\nInitial evaluation results:\n{'mu': -0.92, 'sd': -0.92, 'obs': -inf}" ] } ], "source": [ "with model:\n", " step = pm.Metropolis()\n", " trace = pm.sample(5, step, tune=0, chains=1, progressbar=False, random_seed=RANDOM_SEED)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Exception handling of PyMC v4 has improved, so now SamplingError exception prints out the intermediate values of `mu` and `sd` which led to likelihood of `-inf`. However, this technique of printing intermediate values with `aeasara.printing.Print` can be valuable in more complicated cases." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Bringing it all together" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "scrolled": true }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Only 10 samples in chain.\n", "Auto-assigning NUTS sampler...\n", "Initializing NUTS using jitter+adapt_diag...\n", "Sequential sampling (1 chains in 1 job)\n", "NUTS: [mu, a, b]\n", "Sampling 1 chain for 0 tune and 10 draw iterations (0 + 10 draws total) took 1 seconds.\n", "The chain contains only diverging samples. The model is probably misspecified.\n", "The acceptance probability does not match the target. It is 0, but should be close to 0.8. Try to increase the number of tuning steps.\n", "C:\\Users\\igork\\AppData\\Local\\Temp\\ipykernel_14804\\1992602661.py:15: UserWarning: The number of samples is too small to check convergence reliably.\n", " trace = pm.sample(draws=10, tune=0, chains=1, progressbar=False, random_seed=RANDOM_SEED)\n" ] } ], "source": [ "rng = np.random.default_rng(RANDOM_SEED)\n", "y = rng.normal(loc=5, size=20)\n", "\n", "old_stdout = sys.stdout\n", "mystdout = sys.stdout = StringIO()\n", "\n", "with pm.Model() as model:\n", " mu = pm.Normal(\"mu\", mu=0, sigma=10)\n", " a = pm.Normal(\"a\", mu=0, sigma=10, initval=0.1)\n", " b = pm.Normal(\"b\", mu=0, sigma=10, initval=0.1)\n", " sd_print = Print(\"Delta\")(a / b)\n", " obs = pm.Normal(\"obs\", mu=mu, sigma=sd_print, observed=y)\n", "\n", " # limiting number of samples and chains to simplify output\n", " trace = pm.sample(draws=10, tune=0, chains=1, progressbar=False, random_seed=RANDOM_SEED)\n", "\n", "output = mystdout.getvalue()\n", "sys.stdout = old_stdout # setting sys.stdout back" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "'Delta __str__ = -85.74093608165128\\nDelta __str__ = -9.182002291671038\\nDelta __str__ = 0.10737295473067673\\nDelta __str__ = 0.10737295473067673\\nDelta __str__ = 0.10737295473067673\\nDelta __str__ = -9.315734173890055\\nDelta __str__ = 0.10737295473067673\\nDelta __str__ = -9.312485782438435\\nDelta __str__ = 0.10737295473067673\\nDelta __str__ = -9.314669656412736\\nDelta __str__ = 0.10737295473067673\\nDelta __str__ = -9.31581619157038\\nDelta __str__ = 0.10737295473067673\\nDelta __str__ = -9.315114719133609\\nDelta __str__ = 0.10737295473067673\\nDelta __str__ = -9.31511040479387\\nDelta __str__ = 0.10737295473067673\\nDelta __str__ = -9.314077394936474\\nDelta __str__ = 0.10737295473067673\\nDelta __str__ = -9.313673830463395\\nDelta __str__ = 0.10737295473067673\\nDelta __str__ = -9.31561025339713\\nDelta __str__ = 0.10737295473067673\\nDelta __str__ = -9.31526569370057\\nDelta __str__ = 0.10737295473067673\\nDelta __str__ = 0.10737295473067673\\nDelta __str__ = 0.10737295473067673\\nDelta __str__ = 0.10737295473067673\\nDelta __str__ = 0.10737295473067673\\nDelta __str__ = 0.10737295473067673\\nDelta __str__ = 0.10737295473067673\\nDelta __str__ = 0.10737295473067673\\nDelta __str__ = 0.10737295473067673\\nDelta __str__ = 0.10737295473067673\\n'" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "output" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Raw output is a bit messy and requires some cleanup and formatting to convert to {ref}`numpy.ndarray`. In the example below regex is used to clean up the output, and then it is evaluated with `eval` to give a list of floats. Code below also works with higher-dimensional outputs (in case you want to experiment with different models)." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "import re\n", "\n", "# output cleanup and conversion to numpy array\n", "# this is code accepts more complicated inputs\n", "pattern = re.compile(\"Delta __str__ = \")\n", "output = re.sub(pattern, \" \", output)\n", "pattern = re.compile(\"\\\\s+\")\n", "output = re.sub(pattern, \",\", output)\n", "pattern = re.compile(r\"\\[,\")\n", "output = re.sub(pattern, \"[\", output)\n", "output += \"]\"\n", "output = \"[\" + output[1:]\n", "output = eval(output)\n", "output = np.array(output)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([-85.74093608, -9.18200229, 0.10737295, 0.10737295,\n", " 0.10737295, -9.31573417, 0.10737295, -9.31248578,\n", " 0.10737295, -9.31466966, 0.10737295, -9.31581619,\n", " 0.10737295, -9.31511472, 0.10737295, -9.3151104 ,\n", " 0.10737295, -9.31407739, 0.10737295, -9.31367383,\n", " 0.10737295, -9.31561025, 0.10737295, -9.31526569,\n", " 0.10737295, 0.10737295, 0.10737295, 0.10737295,\n", " 0.10737295, 0.10737295, 0.10737295, 0.10737295,\n", " 0.10737295, 0.10737295])" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "output" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that we requested 5 draws, but got 34 sets of $a/b$ values. The reason is that for each iteration, all proposed values are printed (not just the accepted values). Negative values are clearly problematic." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(34,)" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "output.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Authors\n", "\n", "* Authored by Thomas Wiecki in July, 2016\n", "* Updated by Igor Kuvychko in August, 2022 ([pymc#406] (https://github.com/pymc-devs/pymc-examples/pull/406))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Watermark" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Last updated: Tue Aug 02 2022\n", "\n", "Python implementation: CPython\n", "Python version : 3.10.5\n", "IPython version : 8.4.0\n", "\n", "pytensor: 2.7.5\n", "xarray: 2022.3.0\n", "\n", "matplotlib: 3.5.2\n", "pytensor : 2.7.5\n", "numpy : 1.23.0\n", "arviz : 0.12.1\n", "sys : 3.10.5 | packaged by conda-forge | (main, Jun 14 2022, 06:57:19) [MSC v.1929 64 bit (AMD64)]\n", "re : 2.2.1\n", "pandas : 1.4.3\n", "pymc : 4.1.2\n", "\n", "Watermark: 2.3.1\n", "\n" ] } ], "source": [ "%load_ext watermark\n", "%watermark -n -u -v -iv -w -p pytensor,xarray" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ ":::{include} ../page_footer.md\n", ":::" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.6" } }, "nbformat": 4, "nbformat_minor": 4 }