# Reliability Statistics and Predictive Calibration

Attention

This notebook uses libraries that are not PyMC dependencies and therefore need to be installed specifically to run this notebook. Open the dropdown below for extra guidance.

Extra dependencies install instructions

To run this notebook (either locally or on Binder) you won't only need a working PyMC installation with all optional dependencies, but also some extra dependencies. For advice on installing PyMC itself, please refer to Installation.

You can install these dependencies with your preferred package manager; as an example, the pip and conda commands are given below.

$ pip install lifelines

Note that if you want (or need) to install the packages from inside the notebook instead of the command line, you can install the packages by running a variation of the pip command:

import sys
!{sys.executable} -m pip install lifelines

You should not run !pip install as it might install the package in a different environment and not be available from the Jupyter notebook even if installed. Another alternative is to use conda instead:

$ conda install lifelines

When installing scientific Python packages with conda, we recommend using conda-forge.

import os
import random

from io import StringIO

import arviz as az
import matplotlib.cm as cm
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pymc as pm

from lifelines import KaplanMeierFitter, LogNormalFitter, WeibullFitter
from lifelines.utils import survival_table_from_events
from scipy.stats import binom, lognorm, norm, weibull_min

RANDOM_SEED = 8927
rng = np.random.default_rng(RANDOM_SEED)
az.style.use("arviz-darkgrid")
%config InlineBackend.figure_format = 'retina'


## Reliability Statistics

When we want to make inferences about likely failures on a production line, we may have a large or small sample data set depending on the industry, the nature of the goods, or the specificity of the question we're seeking to answer. But in all cases there is a question of cost and a quantity of tolerable failures.

A reliability study therefore has to account for the period in which a failure is important to observe, the cost of the failure, and the cost of running a mis-specified study. The requirements for precision in the definition of the question and in the nature of the modelling exercise are paramount.

The key feature of time-to-failure data is the manner in which it is censored and how this biases traditional statistical summaries and estimation techniques. In this notebook we're going to focus on the prediction of failure times and compare the Bayesian notion of a calibrated prediction interval to some frequentist alternatives. We draw on the work in the book Statistical Methods for Reliability Data; for details, see Escobar et al. [2021].

### Types of Prediction

We might want to know about different views of the failure time distribution such as:

• Time to failure of a new item

• Time until k failures in a future sample of m units

While there are non-parametric and descriptive methods that can be used to assess these kinds of questions, we're going to focus on the case where we have a probability model, e.g. a lognormal distribution of failure times $F(t; \mathbf{\theta})$, parameterised by an unknown $\mathbf{\theta}$.
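For instance, once we posit a known CDF $F$, the second type of prediction reduces to a binomial calculation: the number of failures by time $t$ in a future sample of $m$ units is $\text{Binomial}(m, F(t))$. A minimal sketch with an assumed lognormal (the parameter values and the choice of $t$, $m$, $k$ below are illustrative, not drawn from data):

```python
import numpy as np
from scipy.stats import binom, lognorm

# Assumed (known) failure-time distribution: LN(mu=6, sigma=0.3)
failure_dist = lognorm(s=0.3, scale=np.exp(6))

t, m, k = 400, 10, 2  # horizon, future sample size, failure count of interest
p_t = failure_dist.cdf(t)  # probability any one unit fails by time t

# P(at least k failures among m units by time t)
prob_at_least_k = 1 - binom.cdf(k - 1, m, p_t)
print(round(prob_at_least_k, 3))  # ≈ 0.987
```

The same device (a binomial count with success probability read off the CDF) underlies the coverage calculations later in the notebook.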

### Structure of the Presentation

We will

• Discuss non-parametric estimates of the cumulative density function (CDF) for reliability data

• Show how a frequentist or maximum likelihood estimate (MLE) of the same function can be derived to inform our prediction interval

• Show how Bayesian methods can be used to augment the analysis of the same CDF in cases with sparse information.

Throughout, the focus will be on how an understanding of the CDF can help us understand the risk of failure in a reliability setting, and in particular how the CDF can be used to derive a well-calibrated prediction interval.

### Example Failure Distribution

In the study of reliability statistics there is a focus on location-scale based distributions with long tails. In an ideal world we’d know exactly which distribution described our failure process and the prediction interval for the next failure could be defined exactly.

mu, sigma = 6, 0.3


def plot_ln_pi(mu, sigma, xy=(700, 75), title="Exact Prediction Interval for Known Lognormal"):
    failure_dist = lognorm(s=sigma, scale=np.exp(mu))
    samples = failure_dist.rvs(size=1000, random_state=100)
    fig, axs = plt.subplots(1, 3, figsize=(20, 8))
    axs = axs.flatten()
    axs[0].hist(samples, ec="black", color="slateblue", bins=30)
    axs[0].set_title(f"Failure Time Distribution: LN({mu}, {sigma})")
    count, bins_count = np.histogram(samples, bins=30)
    pdf = count / sum(count)
    cdf = np.cumsum(pdf)
    axs[1].plot(bins_count[1:], cdf, label="CDF", color="slateblue")
    axs[2].plot(bins_count[1:], 1 - cdf, label="Survival", color="slateblue")
    axs[2].legend()
    axs[1].legend()
    axs[1].set_title("Cumulative Density Function")
    axs[2].set_title("Survival Curve")

    lb = failure_dist.ppf(0.01)
    ub = failure_dist.ppf(0.99)
    axs[0].annotate(
        f"99% Prediction \nInterval: [{np.round(lb, 3)}, {np.round(ub, 3)}]",
        xy=(xy[0], xy[1] - 25),
        fontweight="bold",
    )
    axs[0].fill_betweenx(y=range(200), x1=lb, x2=ub, alpha=0.2, label="p99 PI", color="cyan")
    axs[1].fill_betweenx(y=range(2), x1=lb, x2=ub, alpha=0.2, label="p99 PI", color="cyan")
    axs[2].fill_betweenx(y=range(2), x1=lb, x2=ub, alpha=0.2, label="p99 PI", color="cyan")
    lb = failure_dist.ppf(0.025)
    ub = failure_dist.ppf(0.975)
    axs[0].annotate(
        f"95% Prediction \nInterval: [{np.round(lb, 3)}, {np.round(ub, 3)}]",
        xy=(xy[0], xy[1]),
        fontweight="bold",
    )
    axs[0].fill_betweenx(y=range(200), x1=lb, x2=ub, alpha=0.2, label="p95 PI", color="magenta")
    axs[1].fill_betweenx(y=range(2), x1=lb, x2=ub, alpha=0.2, label="p95 PI", color="magenta")
    axs[2].fill_betweenx(y=range(2), x1=lb, x2=ub, alpha=0.2, label="p95 PI", color="magenta")
    axs[0].legend()
    axs[1].legend()
    axs[2].legend()
    plt.suptitle(title, fontsize=20)


plot_ln_pi(mu, sigma)


## Estimation of the Failure Distribution from Data

In the real world we rarely have such exact knowledge. Instead we start with altogether less clear data. We will first examine failure data on heat exchangers across three plants and pool the information to quantify the lifetime of the heat exchangers over the three factories.

The data set is deliberately small so we can focus on the descriptive statistics involved in assessing time-to-failure data. In particular we'll estimate the empirical CDF and survival functions. We will then generalise this style of analysis to a larger data set afterwards.

### Heat Exchange Data

Note on Censored Data: see below how the failure data flags whether or not an observation has been censored, i.e. whether or not we have observed the full course of the heat-exchanger's life-time. This is a crucial feature of failure time data. Too simple a statistical summary will be biased in its estimate of the prevalence of failure by the fact that our study has not seen out the full course of every item's life-cycle. The most prevalent form of censoring is so-called "right censored" data, where we have not seen the "failure" event for a subset of the observations. Our histories are incomplete due to prematurely ending the data collection.

Left censoring (where we don't observe an item from the beginning of its history) and interval censoring (both left and right censoring) can also occur, but are less common.
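To see why this matters, here is a minimal synthetic sketch (hypothetical Weibull lifetimes, not the heat-exchanger data): a summary that simply averages the observed failure times, discarding the right-censored units, systematically understates the true mean lifetime.

```python
import numpy as np

rng = np.random.default_rng(0)
true_lifetimes = 10 * rng.weibull(1.5, size=5000)  # complete (unobservable) lifetimes
study_end = 8.0  # observation stops here; longer lifetimes are right-censored
failed = true_lifetimes <= study_end

# Naive summary: average only the observed failures, silently dropping
# the roughly half of units that survive past the end of the study.
naive_mean = true_lifetimes[failed].mean()
print(naive_mean < true_lifetimes.mean())  # True: the naive summary is biased low
```

Proper survival-analysis estimators (like the actuarial tables and Kaplan-Meier style estimates below) keep the censored units in the risk set for as long as they are observed, which removes this bias.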

heat_exchange_df = pd.read_csv(
    StringIO(
        """Years Lower,Years Upper,Censoring Indicator,Count,Plant
0,1,Left,1,1
1,2,Interval,2,1
2,3,Interval,2,1
3, ,Right,95,1
0,1,Left,2,2
1,2,Interval,3,2
2, ,Right,95,2
0,1,Left,1,3
1, ,Right,99,3

"""
    )
)

heat_exchange_df["year_interval"] = (
    heat_exchange_df["Years Lower"].astype(str) + "," + heat_exchange_df["Years Upper"].astype(str)
)
heat_exchange_df["failed"] = np.where(
    heat_exchange_df["Censoring Indicator"] != "Right", heat_exchange_df["Count"], 0
)
heat_exchange_df["censored"] = np.where(
    heat_exchange_df["Censoring Indicator"] == "Right", heat_exchange_df["Count"], 0
)
heat_exchange_df["risk_set"] = [100, 99, 97, 0, 100, 98, 0, 100, 0]

heat_exchange_df

Years Lower Years Upper Censoring Indicator Count Plant year_interval failed censored risk_set
0 0 1 Left 1 1 0,1 1 0 100
1 1 2 Interval 2 1 1,2 2 0 99
2 2 3 Interval 2 1 2,3 2 0 97
3 3 Right 95 1 3, 0 95 0
4 0 1 Left 2 2 0,1 2 0 100
5 1 2 Interval 3 2 1,2 3 0 98
6 2 Right 95 2 2, 0 95 0
7 0 1 Left 1 3 0,1 1 0 100
8 1 Right 99 3 1, 0 99 0
actuarial_table = heat_exchange_df.groupby(["Years Upper"])[["failed", "risk_set"]].sum()
actuarial_table = actuarial_table.tail(3)


def greenwood_variance(df):
    ### Used to estimate the variance in the CDF
    n = len(df)
    ps = [df.iloc[i]["p_hat"] / (df.iloc[i]["risk_set"] * df.iloc[i]["1-p_hat"]) for i in range(n)]
    s = [(df.iloc[i]["S_hat"] ** 2) * np.sum(ps[0 : i + 1]) for i in range(n)]
    return s


def logit_transform_interval(df):
    ### Used for robustness in the estimation of the confidence intervals in the CDF
    df["logit_CI_95_lb"] = df["F_hat"] / (
        df["F_hat"]
        + df["S_hat"] * np.exp((1.960 * df["Standard_Error"]) / (df["F_hat"] * df["S_hat"]))
    )
    df["logit_CI_95_ub"] = df["F_hat"] / (
        df["F_hat"]
        + df["S_hat"] / np.exp((1.960 * df["Standard_Error"]) / (df["F_hat"] * df["S_hat"]))
    )
    df["logit_CI_95_lb"] = np.where(df["logit_CI_95_lb"] < 0, 0, df["logit_CI_95_lb"])
    df["logit_CI_95_ub"] = np.where(df["logit_CI_95_ub"] > 1, 1, df["logit_CI_95_ub"])
    return df


def make_actuarial_table(actuarial_table):
    ### Actuarial life-tables are used to describe the nature of the risk over time
    actuarial_table["p_hat"] = actuarial_table["failed"] / actuarial_table["risk_set"]
    actuarial_table["1-p_hat"] = 1 - actuarial_table["p_hat"]
    actuarial_table["S_hat"] = actuarial_table["1-p_hat"].cumprod()
    actuarial_table["CH_hat"] = -np.log(actuarial_table["S_hat"])
    ### The estimate of the CDF
    actuarial_table["F_hat"] = 1 - actuarial_table["S_hat"]
    actuarial_table["V_hat"] = greenwood_variance(actuarial_table)
    actuarial_table["Standard_Error"] = np.sqrt(actuarial_table["V_hat"])
    actuarial_table["CI_95_lb"] = (
        actuarial_table["F_hat"] - actuarial_table["Standard_Error"] * 1.960
    )
    actuarial_table["CI_95_lb"] = np.where(
        actuarial_table["CI_95_lb"] < 0, 0, actuarial_table["CI_95_lb"]
    )
    actuarial_table["CI_95_ub"] = (
        actuarial_table["F_hat"] + actuarial_table["Standard_Error"] * 1.960
    )
    actuarial_table["CI_95_ub"] = np.where(
        actuarial_table["CI_95_ub"] > 1, 1, actuarial_table["CI_95_ub"]
    )
    actuarial_table["ploting_position"] = actuarial_table["F_hat"].rolling(1).median()
    actuarial_table = logit_transform_interval(actuarial_table)
    return actuarial_table


actuarial_table_heat = make_actuarial_table(actuarial_table)
actuarial_table_heat = actuarial_table_heat.reset_index()
actuarial_table_heat.rename({"Years Upper": "t"}, axis=1, inplace=True)
actuarial_table_heat["t"] = actuarial_table_heat["t"].astype(int)
actuarial_table_heat

t failed risk_set p_hat 1-p_hat S_hat CH_hat F_hat V_hat Standard_Error CI_95_lb CI_95_ub ploting_position logit_CI_95_lb logit_CI_95_ub
0 1 4 300 0.013333 0.986667 0.986667 0.013423 0.013333 0.000044 0.006622 0.000354 0.026313 0.013333 0.005013 0.034977
1 2 5 197 0.025381 0.974619 0.961624 0.039131 0.038376 0.000164 0.012802 0.013283 0.063468 0.038376 0.019818 0.073016
2 3 2 97 0.020619 0.979381 0.941797 0.059965 0.058203 0.000350 0.018701 0.021550 0.094856 0.058203 0.030694 0.107629

It’s worth taking some time to walk through this example because it establishes estimates of some key quantities in time-to-failure modelling.

First note how we're treating time as a series of discrete intervals. The data is in discrete period format, since it records aggregate failures over time. We'll see below another format of failure data - the item-period format - which records each individual item over all periods and its corresponding status. In the period format the key quantities are the set of failed items and the risk_set in each period. Everything else is derived from these facts.

First we've established, across all three plants in three consecutive years, the number of heat exchangers that were produced and subsequently failed. This provides an estimate of the probability of failure in each year, p_hat, and its complement, 1-p_hat. These are combined cumulatively over the years to estimate the survival curve S_hat, which can be further transformed to recover estimates of the cumulative hazard CH_hat and the cumulative density function F_hat.

Next we want a quick and dirty way to quantify the extent of the uncertainty in our estimate of the CDF. For this purpose we use Greenwood's formula to estimate the variance V_hat of our estimate F_hat. This gives us the standard error and the two varieties of confidence interval recommended in the literature.
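The life-table arithmetic can be reproduced by hand from the pooled counts above (4, 5 and 2 failures against risk sets of 300, 197 and 97 in years 1-3), which makes the roles of p_hat, S_hat, F_hat and Greenwood's variance explicit:

```python
import numpy as np

failed = np.array([4, 5, 2])        # pooled failures in years 1, 2, 3
risk_set = np.array([300, 197, 97])  # units still at risk entering each year

p_hat = failed / risk_set                  # conditional probability of failure per year
S_hat = np.cumprod(1 - p_hat)              # survival curve: product of yearly survival
F_hat = 1 - S_hat                          # CDF estimate
# Greenwood's formula for the variance of the CDF estimate
V_hat = S_hat**2 * np.cumsum(p_hat / (risk_set * (1 - p_hat)))
std_err = np.sqrt(V_hat)

print(np.round(F_hat, 6))  # [0.013333 0.038376 0.058203] - matches the table
```

These are exactly the F_hat and V_hat columns in the actuarial table printed above.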

We'll apply the same techniques to a larger data set and plot some of these quantities below.

### The Shock Absorbers Data: A Study in Frequentist Reliability Analysis

The shock absorbers data is in period format, but it records a constantly decreasing risk set over time, with exactly one item at each time point either being censored (removed from testing successfully, i.e. approved) or removed due to failure. This is a special case of the period format data.

shockabsorbers_df = pd.read_csv(
    StringIO(
        """Kilometers,Failure Mode,Censoring Indicator
6700,Mode1,Failed
6950,Censored,Censored
7820,Censored,Censored
8790,Censored,Censored
9120,Mode2,Failed
9660,Censored,Censored
9820,Censored,Censored
11310,Censored,Censored
11690,Censored,Censored
11850,Censored,Censored
11880,Censored,Censored
12140,Censored,Censored
12200,Mode1,Failed
12870,Censored,Censored
13150,Mode2,Failed
13330,Censored,Censored
13470,Censored,Censored
14040,Censored,Censored
14300,Mode1,Failed
17520,Mode1,Failed
17540,Censored,Censored
17890,Censored,Censored
18450,Censored,Censored
18960,Censored,Censored
18980,Censored,Censored
19410,Censored,Censored
20100,Mode2,Failed
20100,Censored,Censored
20150,Censored,Censored
20320,Censored,Censored
20900,Mode2,Failed
22700,Mode1,Failed
23490,Censored,Censored
26510,Mode1,Failed
27410,Censored,Censored
27490,Mode1,Failed
27890,Censored,Censored
28100,Censored,Censored
"""
    )
)

shockabsorbers_df["failed"] = np.where(shockabsorbers_df["Censoring Indicator"] == "Failed", 1, 0)
shockabsorbers_df["t"] = shockabsorbers_df["Kilometers"]
shockabsorbers_events = survival_table_from_events(
    shockabsorbers_df["t"], shockabsorbers_df["failed"]
).reset_index()
shockabsorbers_events.rename(
    {"event_at": "t", "observed": "failed", "at_risk": "risk_set"}, axis=1, inplace=True
)

actuarial_table_shock = make_actuarial_table(shockabsorbers_events)
actuarial_table_shock

t removed failed censored entrance risk_set p_hat 1-p_hat S_hat CH_hat F_hat V_hat Standard_Error CI_95_lb CI_95_ub ploting_position logit_CI_95_lb logit_CI_95_ub
0 0.0 0 0 0 38 38 0.000000 1.000000 1.000000 -0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 NaN NaN
1 6700.0 1 1 0 0 38 0.026316 0.973684 0.973684 0.026668 0.026316 0.000674 0.025967 0.000000 0.077212 0.026316 0.003694 0.164570
2 6950.0 1 0 1 0 37 0.000000 1.000000 0.973684 0.026668 0.026316 0.000674 0.025967 0.000000 0.077212 0.026316 0.003694 0.164570
3 7820.0 1 0 1 0 36 0.000000 1.000000 0.973684 0.026668 0.026316 0.000674 0.025967 0.000000 0.077212 0.026316 0.003694 0.164570
4 8790.0 1 0 1 0 35 0.000000 1.000000 0.973684 0.026668 0.026316 0.000674 0.025967 0.000000 0.077212 0.026316 0.003694 0.164570
5 9120.0 1 1 0 0 34 0.029412 0.970588 0.945046 0.056521 0.054954 0.001431 0.037831 0.000000 0.129103 0.054954 0.013755 0.195137
6 9660.0 1 0 1 0 33 0.000000 1.000000 0.945046 0.056521 0.054954 0.001431 0.037831 0.000000 0.129103 0.054954 0.013755 0.195137
7 9820.0 1 0 1 0 32 0.000000 1.000000 0.945046 0.056521 0.054954 0.001431 0.037831 0.000000 0.129103 0.054954 0.013755 0.195137
8 11310.0 1 0 1 0 31 0.000000 1.000000 0.945046 0.056521 0.054954 0.001431 0.037831 0.000000 0.129103 0.054954 0.013755 0.195137
9 11690.0 1 0 1 0 30 0.000000 1.000000 0.945046 0.056521 0.054954 0.001431 0.037831 0.000000 0.129103 0.054954 0.013755 0.195137
10 11850.0 1 0 1 0 29 0.000000 1.000000 0.945046 0.056521 0.054954 0.001431 0.037831 0.000000 0.129103 0.054954 0.013755 0.195137
11 11880.0 1 0 1 0 28 0.000000 1.000000 0.945046 0.056521 0.054954 0.001431 0.037831 0.000000 0.129103 0.054954 0.013755 0.195137
12 12140.0 1 0 1 0 27 0.000000 1.000000 0.945046 0.056521 0.054954 0.001431 0.037831 0.000000 0.129103 0.054954 0.013755 0.195137
13 12200.0 1 1 0 0 26 0.038462 0.961538 0.908698 0.095742 0.091302 0.002594 0.050927 0.000000 0.191119 0.091302 0.029285 0.250730
14 12870.0 1 0 1 0 25 0.000000 1.000000 0.908698 0.095742 0.091302 0.002594 0.050927 0.000000 0.191119 0.091302 0.029285 0.250730
15 13150.0 1 1 0 0 24 0.041667 0.958333 0.870836 0.138302 0.129164 0.003756 0.061285 0.009046 0.249282 0.129164 0.048510 0.301435
16 13330.0 1 0 1 0 23 0.000000 1.000000 0.870836 0.138302 0.129164 0.003756 0.061285 0.009046 0.249282 0.129164 0.048510 0.301435
17 13470.0 1 0 1 0 22 0.000000 1.000000 0.870836 0.138302 0.129164 0.003756 0.061285 0.009046 0.249282 0.129164 0.048510 0.301435
18 14040.0 1 0 1 0 21 0.000000 1.000000 0.870836 0.138302 0.129164 0.003756 0.061285 0.009046 0.249282 0.129164 0.048510 0.301435
19 14300.0 1 1 0 0 20 0.050000 0.950000 0.827294 0.189595 0.172706 0.005191 0.072047 0.031495 0.313917 0.172706 0.072098 0.359338
20 17520.0 1 1 0 0 19 0.052632 0.947368 0.783752 0.243662 0.216248 0.006455 0.080342 0.058778 0.373717 0.216248 0.098254 0.411308
21 17540.0 1 0 1 0 18 0.000000 1.000000 0.783752 0.243662 0.216248 0.006455 0.080342 0.058778 0.373717 0.216248 0.098254 0.411308
22 17890.0 1 0 1 0 17 0.000000 1.000000 0.783752 0.243662 0.216248 0.006455 0.080342 0.058778 0.373717 0.216248 0.098254 0.411308
23 18450.0 1 0 1 0 16 0.000000 1.000000 0.783752 0.243662 0.216248 0.006455 0.080342 0.058778 0.373717 0.216248 0.098254 0.411308
24 18960.0 1 0 1 0 15 0.000000 1.000000 0.783752 0.243662 0.216248 0.006455 0.080342 0.058778 0.373717 0.216248 0.098254 0.411308
25 18980.0 1 0 1 0 14 0.000000 1.000000 0.783752 0.243662 0.216248 0.006455 0.080342 0.058778 0.373717 0.216248 0.098254 0.411308
26 19410.0 1 0 1 0 13 0.000000 1.000000 0.783752 0.243662 0.216248 0.006455 0.080342 0.058778 0.373717 0.216248 0.098254 0.411308
27 20100.0 2 1 1 0 12 0.083333 0.916667 0.718440 0.330673 0.281560 0.009334 0.096613 0.092199 0.470922 0.281560 0.133212 0.499846
28 20150.0 1 0 1 0 10 0.000000 1.000000 0.718440 0.330673 0.281560 0.009334 0.096613 0.092199 0.470922 0.281560 0.133212 0.499846
29 20320.0 1 0 1 0 9 0.000000 1.000000 0.718440 0.330673 0.281560 0.009334 0.096613 0.092199 0.470922 0.281560 0.133212 0.499846
30 20900.0 1 1 0 0 8 0.125000 0.875000 0.628635 0.464205 0.371365 0.014203 0.119177 0.137778 0.604953 0.371365 0.178442 0.616380
31 22700.0 1 1 0 0 7 0.142857 0.857143 0.538830 0.618356 0.461170 0.017348 0.131711 0.203016 0.719324 0.461170 0.232453 0.707495
32 23490.0 1 0 1 0 6 0.000000 1.000000 0.538830 0.618356 0.461170 0.017348 0.131711 0.203016 0.719324 0.461170 0.232453 0.707495
33 26510.0 1 1 0 0 5 0.200000 0.800000 0.431064 0.841499 0.568936 0.020393 0.142806 0.289037 0.848835 0.568936 0.296551 0.805150
34 27410.0 1 0 1 0 4 0.000000 1.000000 0.431064 0.841499 0.568936 0.020393 0.142806 0.289037 0.848835 0.568936 0.296551 0.805150
35 27490.0 1 1 0 0 3 0.333333 0.666667 0.287376 1.246964 0.712624 0.022828 0.151089 0.416490 1.000000 0.712624 0.368683 0.913267
36 27890.0 1 0 1 0 2 0.000000 1.000000 0.287376 1.246964 0.712624 0.022828 0.151089 0.416490 1.000000 0.712624 0.368683 0.913267
37 28100.0 1 0 1 0 1 0.000000 1.000000 0.287376 1.246964 0.712624 0.022828 0.151089 0.416490 1.000000 0.712624 0.368683 0.913267

### Maximum Likelihood Fits for Failure Data

In addition to taking descriptive summaries of our data, we can use the life-table data to estimate a univariate model fit to our distribution of failure times. Such a fit, if good, would enable us to have a clearer view of the predictive distribution and of particular prediction intervals. Here we'll use the functions from the lifelines package to estimate the MLE fit on right-censored data.

lnf = LogNormalFitter().fit(actuarial_table_shock["t"] + 1e-25, actuarial_table_shock["failed"])
lnf.print_summary()

model                      lifelines.LogNormalFitter
number of observations                            38
number of events observed                         11
log-likelihood                               -124.20
hypothesis                     mu_ != 0, sigma_ != 1

         coef  se(coef)  coef lower 95%  coef upper 95%  cmp to      z       p  -log2(p)
mu_     10.13      0.14            9.85           10.41    0.00  71.08  <0.005       inf
sigma_   0.53      0.11            0.31            0.74    1.00  -4.26  <0.005     15.58

AIC  252.41

Although it's tempting to take this model and run with it, we need to be cautious in the case of limited data. For instance, in the heat-exchange data we have three years of data with a total of 11 failures. A model that is too simple can get this quite wrong. For the moment we'll focus on the shock-absorber data: its non-parametric description and a simple univariate fit to this data.

def plot_cdfs(actuarial_table, dist_fits=True, ax=None, title="", xy=(3000, 0.5), item_period=None):
    if item_period is None:
        lnf = LogNormalFitter().fit(actuarial_table["t"] + 1e-25, actuarial_table["failed"])
        wbf = WeibullFitter().fit(actuarial_table["t"] + 1e-25, actuarial_table["failed"])
    else:
        lnf = LogNormalFitter().fit(item_period["t"] + 1e-25, item_period["failed"])
        wbf = WeibullFitter().fit(item_period["t"] + 1e-25, item_period["failed"])
    if ax is None:
        fig, ax = plt.subplots(figsize=(20, 10))
    ax.plot(
        actuarial_table["t"],
        actuarial_table["F_hat"],
        "-o",
        color="black",
        label="Non-Parametric Estimate of CDF",
    )
    ax.plot(
        actuarial_table["t"],
        actuarial_table["CI_95_lb"],
        color="darkorchid",
        linestyle="--",
        label="Non-Parametric 95% CI based on Normal Approx",
    )
    ax.plot(actuarial_table["t"], actuarial_table["CI_95_ub"], color="darkorchid", linestyle="--")
    ax.fill_between(
        actuarial_table["t"].values,
        actuarial_table["CI_95_lb"].values,
        actuarial_table["CI_95_ub"].values,
        color="darkorchid",
        alpha=0.2,
    )
    ax.plot(
        actuarial_table["t"],
        actuarial_table["logit_CI_95_lb"],
        color="royalblue",
        linestyle="--",
        label="Non-Parametric 95% CI based on Logit Approx",
    )
    ax.plot(
        actuarial_table["t"], actuarial_table["logit_CI_95_ub"], color="royalblue", linestyle="--"
    )
    ax.fill_between(
        actuarial_table["t"].values,
        actuarial_table["logit_CI_95_lb"].values,
        actuarial_table["logit_CI_95_ub"].values,
        color="royalblue",
        alpha=0.2,
    )
    if dist_fits:
        lnf.plot_cumulative_density(ax=ax, color="crimson", alpha=0.8)
        wbf.plot_cumulative_density(ax=ax, color="cyan", alpha=0.8)
        ax.annotate(
            f"Lognormal Fit: mu = {np.round(lnf.mu_, 3)}, sigma = {np.round(lnf.sigma_, 3)} \nWeibull Fit: lambda = {np.round(wbf.lambda_, 3)}, rho = {np.round(wbf.rho_, 3)}",
            xy=(xy[0], xy[1]),
            fontsize=12,
            weight="bold",
        )
    ax.set_title(
        f"Estimates of the Cumulative Density Function \n derived from our {title} Failure Data",
        fontsize=20,
    )
    ax.set_ylabel("Fraction Failing")
    ax.set_xlabel("Time Scale")
    ax.legend()
    return ax


plot_cdfs(actuarial_table_shock, title="Shock Absorber");


This shows a good fit to the data and implies, as you might expect, that the failing fraction of shock absorbers increases with age as they wear out. But how do we quantify the prediction given an estimated model?

## The Plug-in Procedure for Calculating Approximate Statistical Prediction Intervals

Since we've estimated a lognormal fit for the CDF of the shock absorbers data, we can plot its approximate prediction interval. The interest here is likely to be in the lower bound of the prediction interval, since as manufacturers we might want to be aware of warranty claims and the risk of exposure to refunds if the lower bound is too low.
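Concretely, the plug-in procedure substitutes the MLE values into the assumed lognormal as if they were the true parameters, and reads the prediction bounds off the fitted quantile function. A quick numeric sketch showing where the bounds of roughly 8928 and 70188 used later come from:

```python
import numpy as np
from scipy.stats import lognorm

mu_hat, sigma_hat = 10.128, 0.526  # MLE fit for the shock absorbers data above
mle_rv = lognorm(s=sigma_hat, scale=np.exp(mu_hat))

# Central 95% plug-in prediction interval for the next failure time
lb, ub = mle_rv.ppf([0.025, 0.975])
print(np.round(lb), np.round(ub))  # ≈ [8929, 70188] kilometers
```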

plot_ln_pi(
10.128,
0.526,
xy=(40000, 120),
title="Plug-in Estimate of Shock Absorber Failure Prediction Interval",
)


### Bootstrap Calibration and Coverage Estimation

We now want to estimate the coverage implied by this prediction interval. To do so we will bootstrap estimates for the lower and upper bounds of the 95% confidence interval, and ultimately assess their coverage conditional on the MLE fit. We will use the fractional weighted (Bayesian) bootstrap. We'll report two methods of estimating the coverage statistic. The first is the empirical coverage, based on sampling a random value from within the known range and assessing whether it falls between the 95% MLE lower and upper bounds.

The second method is to bootstrap estimates of a 95% lower bound and upper bound, and then assess how much those bootstrapped values would cover conditional on the MLE fit.

def bayes_boot(df, lb, ub, seed=100):
    rng = np.random.default_rng(seed)
    ## Fractional weighted bootstrap: Dirichlet weights on the observations
    w = rng.dirichlet(np.ones(len(df)), 1)[0]
    lnf = LogNormalFitter().fit(df["t"] + 1e-25, df["failed"], weights=w)
    rv = lognorm(s=lnf.sigma_, scale=np.exp(lnf.mu_))
    ## Sample a random draw from the implied bootstrapped distribution
    choices = rv.rvs(1000, random_state=rng)
    future = rng.choice(choices)
    ## Check whether the draw is contained within the MLE 95% PI
    contained = (future >= lb) & (future <= ub)
    ## Record the 95% interval of the bootstrapped distribution
    lb = rv.ppf(0.025)
    ub = rv.ppf(0.975)
    return lb, ub, contained, future, lnf.sigma_, lnf.mu_


CIs = [bayes_boot(actuarial_table_shock, 8928, 70188, seed=i) for i in range(1000)]
draws = pd.DataFrame(
    CIs, columns=["Lower Bound PI", "Upper Bound PI", "Contained", "future", "Sigma", "Mu"]
)
draws

Lower Bound PI Upper Bound PI Contained future Sigma Mu
0 9539.612509 52724.144855 True 12721.873679 0.436136 10.018018
1 11190.287884 70193.514908 True 37002.708820 0.468429 10.240906
2 12262.816075 49883.853578 True 47419.786326 0.357947 10.115890
3 10542.933491 119175.518677 False 82191.111953 0.618670 10.475782
4 9208.478350 62352.752208 True 25393.072528 0.487938 10.084221
... ... ... ... ... ... ...
995 9591.084003 85887.280659 True 53836.013839 0.559245 10.264690
996 9724.522689 45044.965713 True 19481.035376 0.391081 9.948911
997 8385.673724 104222.628035 True 16417.930907 0.642870 10.294282
998 11034.708453 56492.385175 True 42085.798578 0.416605 10.125331
999 9408.619310 51697.495907 True 19253.676200 0.434647 10.001273

1000 rows × 6 columns

We can use these bootstrapped statistics to further calculate quantities of the predictive distribution. In our case we could use the parametric CDF of our simple parametric model, but we'll adopt the empirical CDF here to show how this technique can be used when we have more complicated models too.

def ecdf(sample):
    # convert sample to a numpy array, if it isn't already
    sample = np.atleast_1d(sample)

    # find the unique values and their corresponding counts
    quantiles, counts = np.unique(sample, return_counts=True)

    # take the cumulative sum of the counts and divide by the sample size to
    # get the cumulative probabilities between 0 and 1
    cumprob = np.cumsum(counts).astype(np.double) / sample.size

    return quantiles, cumprob

fig, axs = plt.subplots(1, 2, figsize=(20, 6))
axs = axs.flatten()
ax = axs[0]
ax1 = axs[1]
hist_data = []
for i in range(1000):
    samples = lognorm(s=draws.iloc[i]["Sigma"], scale=np.exp(draws.iloc[i]["Mu"])).rvs(1000)
    qe, pe = ecdf(samples)
    ax.plot(qe, pe, color="skyblue", alpha=0.2)
    lkup = dict(zip(pe, qe))
    hist_data.append([lkup[0.05]])
hist_data = pd.DataFrame(hist_data, columns=["p05"])
samples = lognorm(s=draws["Sigma"].mean(), scale=np.exp(draws["Mu"].mean())).rvs(1000)
qe, pe = ecdf(samples)
ax.plot(qe, pe, color="red", label="Expected CDF for Shock Absorbers Data")
ax.set_xlim(0, 30_000)
ax.set_title("Bootstrapped CDF functions for the Shock Absorbers Data", fontsize=20)
ax1.hist(hist_data["p05"], color="slateblue", ec="black", alpha=0.4, bins=30)
ax1.set_title("Estimate of Uncertainty in the 5% Failure Time", fontsize=20)
ax1.axvline(
    hist_data["p05"].quantile(0.025), color="cyan", label="Lower Bound CI for 5% failure time"
)
ax1.axvline(
    hist_data["p05"].quantile(0.975), color="cyan", label="Upper Bound CI for 5% failure time"
)
ax1.legend()
ax.legend();


Next we'll plot the bootstrapped data and the two estimates of coverage we achieve conditional on the MLE fit. In other words, when we want to assess the coverage of a prediction interval based on our MLE fit, we can also bootstrap an estimate of this quantity.

mosaic = """AABB
CCCC"""
fig, axs = plt.subplot_mosaic(mosaic=mosaic, figsize=(20, 12))
mle_rv = lognorm(s=0.53, scale=np.exp(10.128))
axs = [axs[k] for k in axs.keys()]
axs[0].scatter(
    draws["Mu"],
    draws["Lower Bound PI"],
    c=draws["Contained"],
    cmap=cm.cool,
    alpha=0.3,
    label="Fits in MLE 95% CI",
)
axs[1].scatter(
    draws["Sigma"],
    draws["Lower Bound PI"],
    c=draws["Contained"],
    cmap=cm.cool,
    alpha=0.3,
    label="Fits in MLE 95% CI",
)
axs[0].set_title("Bootstrapped Mu against Bootstrapped 95% Lower Bound")
prop = draws["Contained"].sum() / len(draws)
axs[0].annotate(
    f"Estimated Prediction \nEmpirical Coverage \nBased on Sampling : {np.round(prop, 3)}",
    xy=(10.4, 12000),
    fontweight="bold",
)
axs[1].set_title("Bootstrapped Sigma against Bootstrapped 95% Lower Bound")
axs[0].legend()
axs[0].set_xlabel("Mu")
axs[1].set_xlabel("Sigma")
axs[0].set_ylabel("Estimated Lower 95% PI")
axs[1].legend()
axs[2].hist(
    mle_rv.cdf(draws["Lower Bound PI"]),
    bins=50,
    label="Bootstrap 95% LB",
    ec="k",
    color="royalblue",
    alpha=0.2,
)
axs[2].hist(
    mle_rv.cdf(draws["Upper Bound PI"]),
    bins=50,
    label="Bootstrap 95% UB",
    ec="k",
    color="darkorchid",
    alpha=0.2,
)
axs[2].hist(
    np.abs(mle_rv.cdf(draws["Lower Bound PI"]) - mle_rv.cdf(draws["Upper Bound PI"])),
    alpha=0.2,
    bins=50,
    color="slateblue",
    ec="black",
    label="Bootstrap Abs Diff",
)
axs[2].axvline(
    np.abs(mle_rv.cdf(draws["Lower Bound PI"]) - mle_rv.cdf(draws["Upper Bound PI"])).mean(),
    label="Expected Coverage",
)
axs[2].set_title("Difference in LB and UB | MLE(mu, sigma)")
axs[2].legend()

plug_in = np.abs(
    np.mean(mle_rv.cdf(draws["Lower Bound PI"])) - np.mean(mle_rv.cdf(draws["Upper Bound PI"]))
)
lb = np.round(draws["Lower Bound PI"].mean(), 3)
ub = np.round(draws["Upper Bound PI"].mean(), 3)
axs[2].annotate(
    f"Estimated Prediction Interval \n Coverage Based on Plug in Method : {np.round(plug_in, 3)} \n with [{lb, ub}]",
    xy=(0.6, 80),
    fontweight="bold",
);


These simulations should be repeated a far larger number of times than we do here. It should be clear to see how we can also vary the MLE interval size to achieve the desired coverage level.
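As a sketch of that idea, one can sweep the nominal interval size and pick the one whose expected coverage under the bootstrapped parameter fits is closest to a target level. The bootstrap draws here are synthetic stand-ins (assumed values, not the actual `draws["Mu"]` and `draws["Sigma"]` columns) so the example is self-contained:

```python
import numpy as np
from scipy.stats import lognorm

rng = np.random.default_rng(0)
boot_mu = rng.normal(10.128, 0.1, size=200)  # stand-ins for the bootstrapped Mu
boot_sigma = np.abs(rng.normal(0.526, 0.05, size=200))  # ... and Sigma columns

mle_rv = lognorm(s=0.526, scale=np.exp(10.128))


def expected_coverage(q):
    """Mass of each bootstrapped fit captured by the central q-interval of the MLE fit."""
    lb, ub = mle_rv.ppf([(1 - q) / 2, 1 - (1 - q) / 2])
    boot_rv = lognorm(s=boot_sigma, scale=np.exp(boot_mu))
    return np.mean(boot_rv.cdf(ub) - boot_rv.cdf(lb))


qs = np.arange(0.90, 0.999, 0.005)
coverages = np.array([expected_coverage(q) for q in qs])
best_q = qs[np.argmin(np.abs(coverages - 0.95))]  # nominal size achieving ~95% coverage
```

Widening the nominal interval raises the expected coverage, so a simple sweep like this is enough to calibrate the interval size against the bootstrap.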

### Bearing Cage Data: A Study in Bayesian Reliability Analysis

Next we'll look at a data set which has a slightly less clean parametric fit. The most obvious feature of this data is the small number of failure records. The data is recorded in period format, with counts showing the extent of the risk set in each period.

We want to spend some time with this example to show how the frequentist techniques, which worked well on the shock-absorbers data, can be augmented in the case of the bearing cage data. In particular we'll show how the issues arising can be resolved with a Bayesian approach.
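Both approaches rest on the same censored likelihood: observed failures contribute density terms, while right-censored units contribute survival terms. A minimal sketch with hypothetical times (not the bearing cage data itself), maximising a Weibull likelihood via scipy:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import weibull_min

# Hypothetical item-level data: 6 observed failures, 2 units still running at 1600h
t = np.array([230.0, 334.0, 423.0, 990.0, 1009.0, 1510.0, 1600.0, 1600.0])
failed = np.array([1, 1, 1, 1, 1, 1, 0, 0])  # 0 = right-censored at time t


def neg_log_likelihood(log_params):
    shape, scale = np.exp(log_params)  # optimise on the log scale to keep both positive
    ll = weibull_min.logpdf(t[failed == 1], shape, scale=scale).sum()  # density terms
    ll += weibull_min.logsf(t[failed == 0], shape, scale=scale).sum()  # survival terms
    return -ll


res = minimize(neg_log_likelihood, x0=np.log([1.5, 1000.0]))
shape_mle, scale_mle = np.exp(res.x)
```

The Bayesian model below uses this same likelihood structure, but places priors on the parameters instead of maximising, which is what stabilises inference when failures are this sparse.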

bearing_cage_df = pd.read_csv(
    StringIO(
        """Hours,Censoring Indicator,Count
50,Censored,288
150,Censored,148
230,Failed,1
250,Censored,124
334,Failed,1
350,Censored,111
423,Failed,1
450,Censored,106
550,Censored,99
650,Censored,110
750,Censored,114
850,Censored,119
950,Censored,127
990,Failed,1
1009,Failed,1
1050,Censored,123
1150,Censored,93
1250,Censored,47
1350,Censored,41
1450,Censored,27
1510,Failed,1
1550,Censored,11
1650,Censored,6
1850,Censored,1
2050,Censored,2"""
    )
)

bearing_cage_df["t"] = bearing_cage_df["Hours"]
bearing_cage_df["failed"] = np.where(bearing_cage_df["Censoring Indicator"] == "Failed", 1, 0)
bearing_cage_df["censored"] = np.where(
    bearing_cage_df["Censoring Indicator"] == "Censored", bearing_cage_df["Count"], 0
)
bearing_cage_events = survival_table_from_events(
    bearing_cage_df["t"], bearing_cage_df["failed"], weights=bearing_cage_df["Count"]
).reset_index()
bearing_cage_events.rename(
    {"event_at": "t", "observed": "failed", "at_risk": "risk_set"}, axis=1, inplace=True
)

actuarial_table_bearings = make_actuarial_table(bearing_cage_events)

actuarial_table_bearings

t removed failed censored entrance risk_set p_hat 1-p_hat S_hat CH_hat F_hat V_hat Standard_Error CI_95_lb CI_95_ub ploting_position logit_CI_95_lb logit_CI_95_ub
0 0.0 0 0 0 1703 1703 0.000000 1.000000 1.000000 -0.000000 0.000000 0.000000e+00 0.000000 0.0 0.000000 0.000000 NaN NaN
1 50.0 288 0 288 0 1703 0.000000 1.000000 1.000000 -0.000000 0.000000 0.000000e+00 0.000000 0.0 0.000000 0.000000 NaN NaN
2 150.0 148 0 148 0 1415 0.000000 1.000000 1.000000 -0.000000 0.000000 0.000000e+00 0.000000 0.0 0.000000 0.000000 NaN NaN
3 230.0 1 1 0 0 1267 0.000789 0.999211 0.999211 0.000790 0.000789 6.224491e-07 0.000789 0.0 0.002336 0.000789 0.000111 0.005581
4 250.0 124 0 124 0 1266 0.000000 1.000000 0.999211 0.000790 0.000789 6.224491e-07 0.000789 0.0 0.002336 0.000789 0.000111 0.005581
5 334.0 1 1 0 0 1142 0.000876 0.999124 0.998336 0.001666 0.001664 1.386254e-06 0.001177 0.0 0.003972 0.001664 0.000415 0.006641
6 350.0 111 0 111 0 1141 0.000000 1.000000 0.998336 0.001666 0.001664 1.386254e-06 0.001177 0.0 0.003972 0.001664 0.000415 0.006641
7 423.0 1 1 0 0 1030 0.000971 0.999029 0.997367 0.002637 0.002633 2.322113e-06 0.001524 0.0 0.005620 0.002633 0.000846 0.008165
8 450.0 106 0 106 0 1029 0.000000 1.000000 0.997367 0.002637 0.002633 2.322113e-06 0.001524 0.0 0.005620 0.002633 0.000846 0.008165
9 550.0 99 0 99 0 923 0.000000 1.000000 0.997367 0.002637 0.002633 2.322113e-06 0.001524 0.0 0.005620 0.002633 0.000846 0.008165
10 650.0 110 0 110 0 824 0.000000 1.000000 0.997367 0.002637 0.002633 2.322113e-06 0.001524 0.0 0.005620 0.002633 0.000846 0.008165
11 750.0 114 0 114 0 714 0.000000 1.000000 0.997367 0.002637 0.002633 2.322113e-06 0.001524 0.0 0.005620 0.002633 0.000846 0.008165
12 850.0 119 0 119 0 600 0.000000 1.000000 0.997367 0.002637 0.002633 2.322113e-06 0.001524 0.0 0.005620 0.002633 0.000846 0.008165
13 950.0 127 0 127 0 481 0.000000 1.000000 0.997367 0.002637 0.002633 2.322113e-06 0.001524 0.0 0.005620 0.002633 0.000846 0.008165
14 990.0 1 1 0 0 354 0.002825 0.997175 0.994549 0.005466 0.005451 1.022444e-05 0.003198 0.0 0.011718 0.005451 0.001722 0.017117
15 1009.0 1 1 0 0 353 0.002833 0.997167 0.991732 0.008303 0.008268 1.808196e-05 0.004252 0.0 0.016603 0.008268 0.003008 0.022519
16 1050.0 123 0 123 0 352 0.000000 1.000000 0.991732 0.008303 0.008268 1.808196e-05 0.004252 0.0 0.016603 0.008268 0.003008 0.022519
17 1150.0 93 0 93 0 229 0.000000 1.000000 0.991732 0.008303 0.008268 1.808196e-05 0.004252 0.0 0.016603 0.008268 0.003008 0.022519
18 1250.0 47 0 47 0 136 0.000000 1.000000 0.991732 0.008303 0.008268 1.808196e-05 0.004252 0.0 0.016603 0.008268 0.003008 0.022519
19 1350.0 41 0 41 0 89 0.000000 1.000000 0.991732 0.008303 0.008268 1.808196e-05 0.004252 0.0 0.016603 0.008268 0.003008 0.022519
20 1450.0 27 0 27 0 48 0.000000 1.000000 0.991732 0.008303 0.008268 1.808196e-05 0.004252 0.0 0.016603 0.008268 0.003008 0.022519
21 1510.0 1 1 0 0 21 0.047619 0.952381 0.944506 0.057093 0.055494 2.140430e-03 0.046265 0.0 0.146173 0.055494 0.010308 0.248927
22 1550.0 11 0 11 0 20 0.000000 1.000000 0.944506 0.057093 0.055494 2.140430e-03 0.046265 0.0 0.146173 0.055494 0.010308 0.248927
23 1650.0 6 0 6 0 9 0.000000 1.000000 0.944506 0.057093 0.055494 2.140430e-03 0.046265 0.0 0.146173 0.055494 0.010308 0.248927
24 1850.0 1 0 1 0 3 0.000000 1.000000 0.944506 0.057093 0.055494 2.140430e-03 0.046265 0.0 0.146173 0.055494 0.010308 0.248927
25 2050.0 2 0 2 0 2 0.000000 1.000000 0.944506 0.057093 0.055494 2.140430e-03 0.046265 0.0 0.146173 0.055494 0.010308 0.248927

To estimate a univariate or non-parametric CDF we need to disaggregate the period format data into an item-period format.
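As a toy illustration of the disaggregation (using a hypothetical two-row period table, not the bearing cage data):

```python
import pandas as pd

# Period format: one row per (time, status) combination with a count of units
period = pd.DataFrame({"t": [50, 230], "status": ["Censored", "Failed"], "count": [3, 1]})

# Item-period format: one row per individual unit, each with a unique id
item = period.loc[period.index.repeat(period["count"])].reset_index(drop=True)
item["id"] = range(len(item))
item["failed"] = (item["status"] == "Failed").astype(int)
print(item[["id", "t", "failed"]])
```

Each of the three censored units gets its own row with `failed = 0`, while the failure keeps `failed = 1`.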

### Item Period Data Format#

item_period = bearing_cage_df["Hours"].to_list() * bearing_cage_df["Count"].sum()
ids = [[i] * 25 for i in range(bearing_cage_df["Count"].sum())]
ids = [int(i) for l in ids for i in l]
item_period_bearing_cage = pd.DataFrame(item_period, columns=["t"])
item_period_bearing_cage["id"] = ids
item_period_bearing_cage["failed"] = np.zeros(len(item_period_bearing_cage))

## Censor appropriate number of ids
unique_ids = item_period_bearing_cage["id"].unique()
censored = bearing_cage_df[bearing_cage_df["Censoring Indicator"] == "Censored"]
i = 0
stack = []
for hour, count, idx in zip(censored["Hours"], censored["Count"], censored["Count"].cumsum()):
    temp = item_period_bearing_cage[
        item_period_bearing_cage["id"].isin(unique_ids[i:idx])
        & (item_period_bearing_cage["t"] == hour)
    ]
    stack.append(temp)
    i = idx

censored_clean = pd.concat(stack)

### Add  appropriate number of failings
stack = []
unique_times = censored_clean["t"].unique()
for id, fail_time in zip(
    [9999, 9998, 9997, 9996, 9995, 9994],
    bearing_cage_df[bearing_cage_df["failed"] == 1]["t"].values,
):
    temp = pd.DataFrame(unique_times[unique_times < fail_time], columns=["t"])
    temp["id"] = id
    temp["failed"] = 0
    temp = pd.concat([temp, pd.DataFrame({"t": [fail_time], "id": [id], "failed": [1]}, index=[0])])
    stack.append(temp)

failed_clean = pd.concat(stack).sort_values(["id", "t"])
censored_clean
item_period_bearing_cage = pd.concat([failed_clean, censored_clean])
## Transpose for more concise visual
item_period_bearing_cage.head(30).T

0 1 2 3 4 5 6 7 8 9 ... 4 5 6 7 8 9 0 0 1 2
t 50.0 150.0 250.0 350.0 450.0 550.0 650.0 750.0 850.0 950.0 ... 450.0 550.0 650.0 750.0 850.0 950.0 1009.0 50.0 150.0 250.0
id 9994.0 9994.0 9994.0 9994.0 9994.0 9994.0 9994.0 9994.0 9994.0 9994.0 ... 9995.0 9995.0 9995.0 9995.0 9995.0 9995.0 9995.0 9996.0 9996.0 9996.0
failed 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0

3 rows × 30 columns

assert item_period_bearing_cage["id"].nunique() == 1703
assert item_period_bearing_cage["failed"].sum() == 6
assert item_period_bearing_cage[item_period_bearing_cage["t"] >= 1850]["id"].nunique() == 3


As we plot the empirical CDF we see that the y-axis only ever reaches a maximum height of 0.05. A naive MLE fit will go dramatically wrong when extrapolating outside the observed range of the data.

ax = plot_cdfs(
actuarial_table_bearings,
title="Bearings",
dist_fits=False,
xy=(20, 0.7),
item_period=item_period_bearing_cage,
)


## Probability Plots: Comparing CDFs in a Restricted Linear Range#

In this section we’ll use the technique of linearising the MLE fits so that we can perform a visual “goodness of fit” check. These types of plots rely on a transformation that can be applied to location-scale distributions to turn their CDF into a linear space.

For both the Lognormal and Weibull fits we can represent their CDF in a linear space as a relationship between the logged value t and an appropriate $$CDF^{-1}$$.
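Concretely, for the Lognormal we have $$\Phi^{-1}(F(t)) = (\log(t) - \mu)/\sigma$$, and for the Weibull $$\log(-\log(1 - F(t))) = \alpha(\log(t) - \log(\beta))$$, both linear in $$\log(t)$$. A quick numerical check of these identities (with arbitrary parameter values):

```python
import numpy as np
from scipy.stats import lognorm, norm, weibull_min

t = np.array([100.0, 500.0, 2000.0])

# Lognormal: norm.ppf(F(t)) should equal (log(t) - mu) / sigma
mu, sigma = 10.0, 1.5
F_ln = lognorm(s=sigma, scale=np.exp(mu)).cdf(t)
assert np.allclose(norm.ppf(F_ln), (np.log(t) - mu) / sigma)

# Weibull: log(-log(1 - F(t))) should equal alpha * (log(t) - log(beta))
alpha, beta = 2.0, 10_000.0
F_wb = weibull_min(c=alpha, scale=beta).cdf(t)
assert np.allclose(np.log(-np.log(1 - F_wb)), alpha * (np.log(t) - np.log(beta)))
```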

Hide code cell source
def sev_ppf(p):
    return np.log(-np.log(1 - p))

mosaic = """AABB
CCCC"""
fig, axs = plt.subplot_mosaic(mosaic=mosaic, figsize=(20, 15))
axs = [axs[k] for k in axs.keys()]
ax = axs[0]
ax2 = axs[1]
ax3 = axs[2]
ax.plot(
np.log(actuarial_table_bearings["t"]),
norm.ppf(actuarial_table_bearings["logit_CI_95_ub"]),
"-o",
label="Non-Parametric CI UB",
color="slateblue",
)
ax.scatter(
np.log(actuarial_table_bearings["t"]),
norm.ppf(actuarial_table_bearings["ploting_position"]),
label="Non-Parametric CDF",
color="black",
)
ax.plot(
np.log(actuarial_table_bearings["t"]),
norm.ppf(actuarial_table_bearings["logit_CI_95_lb"]),
"-o",
label="Non-Parametric CI LB",
color="slateblue",
)
for mu in np.linspace(10, 12, 3):
    for sigma in np.linspace(1.6, 1.9, 3):
        rv = lognorm(s=sigma, scale=np.exp(mu))
        ax.plot(
            np.log(actuarial_table_bearings["t"]),
            norm.ppf(rv.cdf(actuarial_table_bearings["t"])),
            "--",
            label=f"LN({np.round(mu, 3)}, {np.round(sigma, 3)})",
            color="grey",
        )

lnf = LogNormalFitter().fit(item_period_bearing_cage["t"], item_period_bearing_cage["failed"])
rv = lognorm(s=lnf.sigma_, scale=np.exp(lnf.mu_))
ax.plot(
np.log(actuarial_table_bearings["t"]),
norm.ppf(rv.cdf(actuarial_table_bearings["t"])),
"--",
label=f"MLE LN({np.round(lnf.mu_, 3)}, {np.round(lnf.sigma_, 3)})",
color="RED",
)

for r in np.linspace(1, 2, 3):
    for s in np.linspace(12000, 25000, 2):
        rv = weibull_min(c=r, scale=s)
        ax2.plot(
            np.log(actuarial_table_bearings["t"]),
            sev_ppf(rv.cdf(actuarial_table_bearings["t"])),
            "--",
            label=f"Wb({np.round(s, 3)}, {np.round(r, 3)})",
            color="lightblue",
        )

wbf = WeibullFitter().fit(item_period_bearing_cage["t"], item_period_bearing_cage["failed"])
rv = weibull_min(c=wbf.rho_, scale=wbf.lambda_)
ax2.plot(
np.log(actuarial_table_bearings["t"]),
sev_ppf(rv.cdf(actuarial_table_bearings["t"])),
"--",
label=f"MLE Wb({np.round(wbf.lambda_, 3)}, {np.round(wbf.rho_, 3)})",
color="red",
)

ax2.plot(
np.log(actuarial_table_bearings["t"]),
sev_ppf(actuarial_table_bearings["logit_CI_95_ub"]),
"-o",
label="Non-Parametric CI UB",
color="slateblue",
)
ax2.scatter(
np.log(actuarial_table_bearings["t"]),
sev_ppf(actuarial_table_bearings["ploting_position"]),
label="Non-Parametric CDF",
color="black",
)
ax2.plot(
np.log(actuarial_table_bearings["t"]),
sev_ppf(actuarial_table_bearings["logit_CI_95_lb"]),
"-o",
label="Non-Parametric CI LB",
color="slateblue",
)

ax3.plot(
actuarial_table_bearings["t"],
actuarial_table_bearings["logit_CI_95_ub"],
"-o",
label="Non-Parametric CI UB",
color="slateblue",
)
ax3.scatter(
actuarial_table_bearings["t"],
actuarial_table_bearings["F_hat"],
label="Non-Parametric CDF",
color="black",
)
ax3.plot(
actuarial_table_bearings["t"],
actuarial_table_bearings["logit_CI_95_lb"],
"-o",
label="Non-Parametric CI LB",
color="slateblue",
)
lnf.plot_cumulative_density(ax=ax3, color="cyan")
wbf.plot_cumulative_density(ax=ax3, color="darkorchid")

ax2.set_title("Linearizing the Weibull CDF", fontsize=20)
ax.set_title("Linearizing the Lognormal CDF", fontsize=20)
ax3.set_title("MLE CDF Fits", fontsize=20)
ax.legend()
ax.set_xlabel("Time")
ax2.set_xlabel("Time")
xticks = np.round(np.linspace(0, actuarial_table_bearings["t"].max(), 10), 1)
yticks = np.round(np.linspace(0, actuarial_table_bearings["F_hat"].max(), 10), 4)
ax.set_xticklabels(xticks)
ax.set_yticklabels(yticks)
ax2.set_xticklabels(xticks)
ax2.set_yticklabels([])
ax2.legend()
ax.set_ylabel("Fraction Failing");

/home/oriol/bin/miniconda3/envs/releases/lib/python3.10/site-packages/pandas/core/arraylike.py:402: RuntimeWarning: divide by zero encountered in log
result = getattr(ufunc, method)(*inputs, **kwargs)
/home/oriol/bin/miniconda3/envs/releases/lib/python3.10/site-packages/pandas/core/arraylike.py:402: RuntimeWarning: divide by zero encountered in log
result = getattr(ufunc, method)(*inputs, **kwargs)
/tmp/ipykernel_29435/2628457585.py:2: RuntimeWarning: divide by zero encountered in log
return np.log(-np.log(1 - p))
/home/oriol/bin/miniconda3/envs/releases/lib/python3.10/site-packages/pandas/core/arraylike.py:402: RuntimeWarning: divide by zero encountered in log
result = getattr(ufunc, method)(*inputs, **kwargs)
/tmp/ipykernel_29435/2628457585.py:2: RuntimeWarning: divide by zero encountered in log
return np.log(-np.log(1 - p))
/tmp/ipykernel_29435/2628457585.py:128: UserWarning: FixedFormatter should only be used together with FixedLocator
ax.set_xticklabels(xticks)
/tmp/ipykernel_29435/2628457585.py:129: UserWarning: FixedFormatter should only be used together with FixedLocator
ax.set_yticklabels(yticks)
/tmp/ipykernel_29435/2628457585.py:130: UserWarning: FixedFormatter should only be used together with FixedLocator
ax2.set_xticklabels(xticks)


We can see here how neither MLE fit covers the range of observed data.

## Bayesian Modelling of Reliability Data#

We’ve now seen how to model and visualise parametric model fits to sparse reliability data using a frequentist or MLE framework. We now want to show how the same style of inference can be achieved in the Bayesian paradigm.

As in the MLE paradigm, we need to model the censored likelihood. For most of the log-location-scale distributions we’ve seen above, the likelihood is expressed as a function of a combination of the distribution pdf $$\phi$$ and cdf $$\Phi$$, applied appropriately depending on whether the data point was fully observed in the time window or censored.

$L(\mu, \sigma) = \prod_{i = 1}^{n} \Bigg(\dfrac{1}{\sigma t_{i}} \phi\Bigg[ \dfrac{\log(t_{i}) - \mu}{\sigma} \Bigg] \Bigg)^{\delta_{i}} \cdot \Bigg(1 - \Phi \Bigg[ \dfrac{\log(t_{i}) - \mu}{\sigma} \Bigg] \Bigg)^{1-\delta_{i}}$

where $$\delta_{i}$$ is an indicator for whether the observation is a failure or a right-censored observation. More complicated types of censoring can be included with similar modifications of the CDF term depending on the nature of the censored observations.
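As a concrete sketch, the lognormal version of this likelihood can be written directly with scipy (the data and parameter values below are made up purely for illustration):

```python
import numpy as np
from scipy.stats import norm


def lognormal_censored_loglik(t, delta, mu, sigma):
    """Log-likelihood for right-censored lognormal data.
    delta = 1 for an observed failure, 0 for a right-censored unit."""
    z = (np.log(t) - mu) / sigma
    # Failure contribution: log of the lognormal pdf, (1 / (sigma * t)) * phi(z)
    log_pdf = -np.log(sigma * t) + norm.logpdf(z)
    # Censored contribution: log survival function, log(1 - Phi(z))
    log_sf = norm.logsf(z)
    return np.sum(delta * log_pdf + (1 - delta) * log_sf)


# Toy data: two failures and two censored units
t = np.array([230.0, 990.0, 1500.0, 2000.0])
delta = np.array([1, 1, 0, 0])
print(lognormal_censored_loglik(t, delta, mu=8.0, sigma=1.5))
```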

### Direct PyMC Implementation of Weibull Survival#

We fit two versions of this model with different specifications for the priors: one takes a vague uniform prior over the data, and the other specifies priors closer to the MLE fits. We will show below the implications the prior specification has for the calibration of the model with the observed data.

def weibull_lccdf(y, alpha, beta):
    """Log complementary cdf of Weibull distribution."""
    return -((y / beta) ** alpha)

item_period_max = item_period_bearing_cage.groupby("id")[["t", "failed"]].max()
y = item_period_max["t"].values
censored = ~item_period_max["failed"].values.astype(bool)

priors = {"beta": [100, 15_000], "alpha": [4, 1, 0.02, 8]}
priors_informative = {"beta": [10_000, 500], "alpha": [2, 0.5, 0.02, 3]}

def make_model(p, info=False):
    with pm.Model() as model:

        if info:
            beta = pm.Normal("beta", p["beta"][0], p["beta"][1])
        else:
            beta = pm.Uniform("beta", p["beta"][0], p["beta"][1])
        alpha = pm.TruncatedNormal(
            "alpha", p["alpha"][0], p["alpha"][1], lower=p["alpha"][2], upper=p["alpha"][3]
        )

        y_obs = pm.Weibull("y_obs", alpha=alpha, beta=beta, observed=y[~censored])
        y_cens = pm.Potential("y_cens", weibull_lccdf(y[censored], alpha, beta))
        idata = pm.sample_prior_predictive()
        idata.extend(pm.sample(random_seed=100, target_accept=0.95))
        idata.extend(pm.sample_posterior_predictive(idata))

    return idata, model

idata, model = make_model(priors)
idata_informative, model = make_model(priors_informative, info=True)
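Note that `weibull_lccdf` is just the Weibull log-survival function, $$\log S(y) = -(y/\beta)^{\alpha}$$, so the `pm.Potential` term adds exactly the censored contribution to the likelihood. A quick sanity check against scipy (with arbitrary parameter values):

```python
import numpy as np
from scipy.stats import weibull_min


def weibull_lccdf(y, alpha, beta):
    """Log complementary cdf of Weibull distribution (repeated for a self-contained check)."""
    return -((y / beta) ** alpha)


y_check = np.array([50.0, 500.0, 5000.0])
a, b = 2.5, 10_000.0
assert np.allclose(weibull_lccdf(y_check, a, b), weibull_min.logsf(y_check, c=a, scale=b))
```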

/tmp/ipykernel_29435/305144286.py:28: UserWarning: The effect of Potentials on other parameters is ignored during prior predictive sampling. This is likely to lead to invalid or biased predictive samples.
idata = pm.sample_prior_predictive()
Sampling: [alpha, beta, y_obs]
Auto-assigning NUTS sampler...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [beta, alpha]

100.00% [8000/8000 00:06<00:00 Sampling 4 chains, 1 divergences]
Sampling 4 chains for 1_000 tune and 1_000 draw iterations (4_000 + 4_000 draws total) took 6 seconds.
/tmp/ipykernel_29435/305144286.py:30: UserWarning: The effect of Potentials on other parameters is ignored during posterior predictive sampling. This is likely to lead to invalid or biased predictive samples.
idata.extend(pm.sample_posterior_predictive(idata))
Sampling: [y_obs]

100.00% [4000/4000 00:00<00:00]
/tmp/ipykernel_29435/305144286.py:28: UserWarning: The effect of Potentials on other parameters is ignored during prior predictive sampling. This is likely to lead to invalid or biased predictive samples.
idata = pm.sample_prior_predictive()
Sampling: [alpha, beta, y_obs]
Auto-assigning NUTS sampler...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [beta, alpha]

100.00% [8000/8000 00:06<00:00 Sampling 4 chains, 0 divergences]
Sampling 4 chains for 1_000 tune and 1_000 draw iterations (4_000 + 4_000 draws total) took 6 seconds.
/tmp/ipykernel_29435/305144286.py:30: UserWarning: The effect of Potentials on other parameters is ignored during posterior predictive sampling. This is likely to lead to invalid or biased predictive samples.
idata.extend(pm.sample_posterior_predictive(idata))
Sampling: [y_obs]

100.00% [4000/4000 00:00<00:00]
idata

arviz.InferenceData
• <xarray.Dataset>
Dimensions:  (chain: 4, draw: 1000)
Coordinates:
* chain    (chain) int64 0 1 2 3
* draw     (draw) int64 0 1 2 3 4 5 6 7 8 ... 992 993 994 995 996 997 998 999
Data variables:
beta     (chain, draw) float64 7.427e+03 8.514e+03 ... 1.013e+04 5.044e+03
alpha    (chain, draw) float64 2.232 2.385 2.49 2.55 ... 2.437 2.503 2.898
Attributes:
created_at:                 2023-01-15T20:06:41.169437
arviz_version:              0.13.0
inference_library:          pymc
inference_library_version:  5.0.0
sampling_time:              6.423727989196777
tuning_steps:               1000

• <xarray.Dataset>
Dimensions:      (chain: 4, draw: 1000, y_obs_dim_2: 6)
Coordinates:
* chain        (chain) int64 0 1 2 3
* draw         (draw) int64 0 1 2 3 4 5 6 7 ... 993 994 995 996 997 998 999
* y_obs_dim_2  (y_obs_dim_2) int64 0 1 2 3 4 5
Data variables:
y_obs        (chain, draw, y_obs_dim_2) float64 6.406e+03 ... 5.59e+03
Attributes:
created_at:                 2023-01-15T20:06:42.789072
arviz_version:              0.13.0
inference_library:          pymc
inference_library_version:  5.0.0

• <xarray.Dataset>
Dimensions:                (chain: 4, draw: 1000)
Coordinates:
* chain                  (chain) int64 0 1 2 3
* draw                   (draw) int64 0 1 2 3 4 5 ... 994 995 996 997 998 999
Data variables: (12/17)
perf_counter_diff      (chain, draw) float64 0.0008238 ... 0.002601
n_steps                (chain, draw) float64 3.0 3.0 7.0 ... 11.0 7.0 9.0
acceptance_rate        (chain, draw) float64 0.9404 1.0 ... 0.9876 0.9424
smallest_eigval        (chain, draw) float64 nan nan nan nan ... nan nan nan
largest_eigval         (chain, draw) float64 nan nan nan nan ... nan nan nan
lp                     (chain, draw) float64 -80.96 -79.7 ... -81.05 -79.99
...                     ...
step_size_bar          (chain, draw) float64 0.1791 0.1791 ... 0.1591 0.1591
reached_max_treedepth  (chain, draw) bool False False False ... False False
diverging              (chain, draw) bool False False False ... False False
process_time_diff      (chain, draw) float64 0.0008239 ... 0.002601
index_in_trajectory    (chain, draw) int64 -2 2 -3 -2 2 -2 ... -2 -1 -3 1 -6
perf_counter_start     (chain, draw) float64 1.153e+04 ... 1.153e+04
Attributes:
created_at:                 2023-01-15T20:06:41.180463
arviz_version:              0.13.0
inference_library:          pymc
inference_library_version:  5.0.0
sampling_time:              6.423727989196777
tuning_steps:               1000

• <xarray.Dataset>
Dimensions:  (chain: 1, draw: 500)
Coordinates:
* chain    (chain) int64 0
* draw     (draw) int64 0 1 2 3 4 5 6 7 8 ... 492 493 494 495 496 497 498 499
Data variables:
alpha    (chain, draw) float64 3.896 5.607 4.329 5.377 ... 3.828 4.875 3.723
beta     (chain, draw) float64 6.915e+03 1.293e+04 ... 6.599e+03 1.244e+04
Attributes:
created_at:                 2023-01-15T20:06:29.497052
arviz_version:              0.13.0
inference_library:          pymc
inference_library_version:  5.0.0

• <xarray.Dataset>
Dimensions:      (chain: 1, draw: 500, y_obs_dim_0: 6)
Coordinates:
* chain        (chain) int64 0
* draw         (draw) int64 0 1 2 3 4 5 6 7 ... 493 494 495 496 497 498 499
* y_obs_dim_0  (y_obs_dim_0) int64 0 1 2 3 4 5
Data variables:
y_obs        (chain, draw, y_obs_dim_0) float64 6.422e+03 ... 1.476e+04
Attributes:
created_at:                 2023-01-15T20:06:29.500007
arviz_version:              0.13.0
inference_library:          pymc
inference_library_version:  5.0.0

• <xarray.Dataset>
Dimensions:      (y_obs_dim_0: 6)
Coordinates:
* y_obs_dim_0  (y_obs_dim_0) int64 0 1 2 3 4 5
Data variables:
y_obs        (y_obs_dim_0) float64 1.51e+03 1.009e+03 990.0 ... 334.0 230.0
Attributes:
created_at:                 2023-01-15T20:06:29.501343
arviz_version:              0.13.0
inference_library:          pymc
inference_library_version:  5.0.0

az.plot_trace(idata, kind="rank_vlines");

az.plot_trace(idata_informative, kind="rank_vlines");

az.summary(idata)

mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat
beta 8149.835 2916.378 3479.929 13787.447 111.293 78.729 612.0 371.0 1.01
alpha 2.626 0.558 1.779 3.631 0.030 0.025 616.0 353.0 1.01
az.summary(idata_informative)

mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat
beta 10031.200 491.768 9110.601 10975.705 10.891 7.742 2045.0 1940.0 1.0
alpha 2.181 0.166 1.875 2.487 0.004 0.003 2240.0 1758.0 1.0
joint_draws = az.extract(idata, num_samples=1000)[["alpha", "beta"]]
alphas = joint_draws["alpha"].values
betas = joint_draws["beta"].values

joint_draws_informative = az.extract(idata_informative, num_samples=1000)[["alpha", "beta"]]
alphas_informative = joint_draws_informative["alpha"].values
betas_informative = joint_draws_informative["beta"].values

mosaic = """AAAA
BBCC"""
fig, axs = plt.subplot_mosaic(mosaic=mosaic, figsize=(20, 13))
axs = [axs[k] for k in axs.keys()]
ax = axs[0]
ax1 = axs[2]
ax2 = axs[1]
hist_data = []
for i in range(1000):
    draws = pm.draw(pm.Weibull.dist(alpha=alphas[i], beta=betas[i]), 1000)
    qe, pe = ecdf(draws)
    lkup = dict(zip(pe, qe))
    hist_data.append([lkup[0.1], lkup[0.05]])
    ax.plot(qe, pe, color="slateblue", alpha=0.1)
hist_data_info = []
for i in range(1000):
    draws = pm.draw(pm.Weibull.dist(alpha=alphas_informative[i], beta=betas_informative[i]), 1000)
    qe, pe = ecdf(draws)
    lkup = dict(zip(pe, qe))
    hist_data_info.append([lkup[0.1], lkup[0.05]])
    ax.plot(qe, pe, color="pink", alpha=0.1)
hist_data = pd.DataFrame(hist_data, columns=["p10", "p05"])
hist_data_info = pd.DataFrame(hist_data_info, columns=["p10", "p05"])
draws = pm.draw(pm.Weibull.dist(alpha=np.mean(alphas), beta=np.mean(betas)), 1000)
qe, pe = ecdf(draws)
ax.plot(qe, pe, color="purple", label="Expected CDF Uninformative Prior")
draws = pm.draw(
pm.Weibull.dist(alpha=np.mean(alphas_informative), beta=np.mean(betas_informative)), 1000
)
qe, pe = ecdf(draws)
ax.plot(qe, pe, color="magenta", label="Expected CDF Informative Prior")
ax.plot(
actuarial_table_bearings["t"],
actuarial_table_bearings["logit_CI_95_ub"],
"--",
label="Non-Parametric CI UB",
color="black",
)
ax.plot(
actuarial_table_bearings["t"],
actuarial_table_bearings["F_hat"],
"-o",
label="Non-Parametric CDF",
color="black",
alpha=1,
)
ax.plot(
actuarial_table_bearings["t"],
actuarial_table_bearings["logit_CI_95_lb"],
"--",
label="Non-Parametric CI LB",
color="black",
)
ax.set_xlim(0, 2500)
ax.set_title(
"Bayesian Estimation of Uncertainty in the Posterior Predictive CDF \n Informative and Non-Informative Priors",
fontsize=20,
)
ax.set_ylabel("Fraction Failing")
ax.set_xlabel("Time")
ax1.hist(
hist_data["p10"], bins=30, ec="black", color="skyblue", alpha=0.4, label="Uninformative Prior"
)
ax1.hist(
hist_data_info["p10"],
bins=30,
ec="black",
color="slateblue",
alpha=0.4,
label="Informative Prior",
)
ax1.set_title("Distribution of 10% failure Time", fontsize=20)
ax1.legend()
ax2.hist(
hist_data["p05"], bins=30, ec="black", color="cyan", alpha=0.4, label="Uninformative Prior"
)
ax2.hist(
hist_data_info["p05"], bins=30, ec="black", color="pink", alpha=0.4, label="Informative Prior"
)
ax2.legend()
ax2.set_title("Distribution of 5% failure Time", fontsize=20)
wbf = WeibullFitter().fit(item_period_bearing_cage["t"] + 1e-25, item_period_bearing_cage["failed"])
wbf.plot_cumulative_density(ax=ax, color="cyan", label="Weibull MLE Fit")
ax.legend()
ax.set_ylim(0, 0.25);


We can see here how the Bayesian uncertainty estimates driven by our deliberately vague priors encompass more uncertainty than our MLE fit, and how the uninformative prior implies a wider predictive distribution for the 5% and 10% failure times. The Bayesian model with uninformative priors seems to do a better job of capturing the uncertainty in the non-parametric estimates of our CDF, but without more information it’s hard to tell which is the more appropriate model.

The concrete estimates of failure percentage over time from each model fit are especially crucial in a situation where we have sparse data. It is a meaningful sanity check to consult with subject matter experts about how plausible the expectation and range for the 10% failure time are for their product.

## Predicting the Number of Failures in an Interval#

Because our data on observed failures is extremely sparse, we have to be very careful about extrapolating beyond the observed range of time, but we can ask about the predictable number of failures in the lower tail of our CDF. This provides another view on the data which can be helpful when conferring with subject matter experts.

### The Plugin Estimate#

Imagine we want to know how many bearings will fail between 150 and 600 hours of service. We can calculate this based on the estimated CDF and the number of new future bearings. We first calculate:

$\hat{\rho} = \dfrac{\hat{F}(t_1) - \hat{F}(t_0)}{1 - \hat{F}(t_0)}$

to establish the probability of a failure occurring in the interval; a point prediction for the number of failures in the interval is then given by risk_set $$\cdot \hat{\rho}$$.

mle_fit = weibull_min(c=2, scale=10_000)
rho = (mle_fit.cdf(600) - mle_fit.cdf(150)) / (1 - mle_fit.cdf(150))
print("Rho:", rho)
print("N at Risk:", 1700)
print("Expected Number Failing between 150 and 600 hours:", 1700 * rho)
print("Lower Bound 95% PI :", binom(1700, rho).ppf(0.05))
print("Upper Bound 95% PI:", binom(1700, rho).ppf(0.95))

Rho: 0.003369307
N at Risk: 1700
Expected Number Failing between 150 and 600 hours: 5.727822
Lower Bound 95% PI : 2.0
Upper Bound 95% PI: 10.0


### Applying the Same Procedure on the Bayesian Posterior#

We’ll use the posterior predictive distribution of the uninformative model. We show here how to derive the uncertainty in the estimates of the 95% prediction interval for the number of failures in a time interval. As we saw above, the MLE alternative to this procedure is to generate a predictive distribution from bootstrap sampling. The bootstrap procedure tends to agree with the plug-in procedure using the MLE estimates, and it lacks the flexibility of specifying prior information.
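For reference, that bootstrap alternative can be sketched as follows (a toy parametric bootstrap around assumed MLE point estimates, ignoring censoring for simplicity):

```python
import numpy as np
from scipy.stats import binom, weibull_min

boot_rng = np.random.default_rng(100)

# Assumed MLE point estimates (made-up values for illustration)
alpha_mle, beta_mle = 2.0, 10_000.0
n_at_risk = 1700

boot_rhos = []
for _ in range(200):
    # Resample from the fitted model and refit by MLE
    sample = weibull_min.rvs(alpha_mle, scale=beta_mle, size=500, random_state=boot_rng)
    c, _, scale = weibull_min.fit(sample, floc=0)
    fit = weibull_min(c=c, scale=scale)
    boot_rhos.append((fit.cdf(600) - fit.cdf(150)) / (1 - fit.cdf(150)))
boot_rhos = np.array(boot_rhos)

# Bootstrap distribution of the plug-in 95% prediction interval bounds
lbs = binom(n_at_risk, boot_rhos).ppf(0.05)
ubs = binom(n_at_risk, boot_rhos).ppf(0.95)
print(np.mean(lbs), np.mean(ubs))
```

Unlike the posterior predictive version, the spread here comes only from sampling variability around the MLE point estimates; there is no way to inject prior information.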

import xarray as xr

from xarray_einstats.stats import XrContinuousRV, XrDiscreteRV

def PI_failures(joint_draws, lp, up, n_at_risk):
    fit = XrContinuousRV(weibull_min, joint_draws["alpha"], scale=joint_draws["beta"])
    rho = (fit.cdf(up) - fit.cdf(lp)) / (1 - fit.cdf(lp))
    lub = XrDiscreteRV(binom, n_at_risk, rho).ppf([0.05, 0.95])
    lb, ub = lub.sel(quantile=0.05, drop=True), lub.sel(quantile=0.95, drop=True)
    point_prediction = n_at_risk * rho
    return xr.Dataset(
        {"rho": rho, "n_at_risk": n_at_risk, "lb": lb, "ub": ub, "expected": point_prediction}
    )

output_ds = PI_failures(joint_draws, 150, 600, 1700)
output_ds

<xarray.Dataset>
Dimensions:    (sample: 1000)
Coordinates:
* sample     (sample) object MultiIndex
* chain      (sample) int64 1 0 0 0 2 1 1 2 1 3 0 2 ... 1 3 2 3 0 1 3 3 0 1 2
* draw       (sample) int64 776 334 531 469 19 675 ... 437 470 242 662 316 63
Data variables:
rho        (sample) float64 0.0007735 0.004392 ... 0.001465 0.001462
n_at_risk  int64 1700
lb         (sample) float64 0.0 3.0 0.0 0.0 0.0 1.0 ... 0.0 1.0 0.0 0.0 0.0
ub         (sample) float64 3.0 12.0 5.0 5.0 4.0 7.0 ... 5.0 7.0 6.0 5.0 5.0
expected   (sample) float64 1.315 7.466 2.24 2.518 ... 2.835 2.491 2.486
def cost_func(failures, power):
    ### Imagined cost function for failing item e.g. refunds required
    return np.power(failures, power)

mosaic = """AAAA
BBCC"""
fig, axs = plt.subplot_mosaic(mosaic=mosaic, figsize=(20, 12))
axs = [axs[k] for k in axs.keys()]
ax = axs[0]
ax1 = axs[1]
ax2 = axs[2]

ax.scatter(
joint_draws["alpha"],
output_ds["expected"],
c=joint_draws["beta"],
cmap=cm.cool,
alpha=0.3,
label="Coloured by function of Beta values",
)
ax.legend()
ax.set_ylabel("Expected Failures")
ax.set_xlabel("Alpha")
ax.set_title(
"Posterior Predictive Expected Failure Count between 150-600 hours \nas a function of Weibull(alpha, beta)",
fontsize=20,
)

ax1.hist(
output_ds["lb"],
ec="black",
color="slateblue",
label="95% PI Lower Bound on Failure Count",
alpha=0.3,
)
ax1.axvline(output_ds["lb"].mean(), label="Expected 95% PI Lower Bound on Failure Count")
ax1.axvline(output_ds["ub"].mean(), label="Expected 95% PI Upper Bound on Failure Count")
ax1.hist(
output_ds["ub"],
ec="black",
color="cyan",
label="95% PI Upper Bound on Failure Count",
bins=20,
alpha=0.3,
)
ax1.hist(
output_ds["expected"], ec="black", color="pink", label="Expected Count of Failures", bins=20
)
ax1.set_title("Uncertainty in the Posterior Prediction Interval of Failure Counts", fontsize=20)
ax1.legend()

ax2.set_title("Expected Costs Distribution(s)  \nbased on implied Failure counts", fontsize=20)
ax2.hist(
    cost_func(output_ds["expected"], 2.3),
    label="Cost(failures, 2.3)",
    color="royalblue",
    alpha=0.3,
    ec="black",
    bins=20,
)
ax2.hist(
    cost_func(output_ds["expected"], 2),
    label="Cost(failures, 2)",
    color="red",
    alpha=0.5,
    ec="black",
    bins=20,
)
ax2.set_xlabel("\$ cost")
# ax2.set_xlim(-60, 0)
ax2.legend();


The choice of model in such cases is crucial. The decision about which failure profile is apt has to be informed by a subject matter expert, because extrapolation from such sparse data is always risky. An understanding of the uncertainty is crucial if real costs attach to the failures, and a subject matter expert is usually better placed to tell you whether 2 or 7 failures can plausibly be expected within 600 hours of service.

## Conclusion#

We’ve seen how to analyse and model reliability from both frequentist and Bayesian perspectives, and how to compare the results against non-parametric estimates. We’ve shown how prediction intervals can be derived for a number of key statistics by both a bootstrapping and a Bayesian approach. We’ve seen approaches to calibrating these prediction intervals through re-sampling methods and informative prior specification. These views on the problem are complementary, and the choice of appropriate technique should be driven by the questions of interest, not by ideological commitment.

In particular, we’ve seen how the MLE fits to our bearings data provide a decent first-guess approach to establishing priors in the Bayesian analysis. We’ve also seen how subject matter expertise can be elicited by deriving key quantities from the implied models and subjecting these implications to scrutiny. The Bayesian prediction intervals are calibrated to our prior expectations, and where we have none we can supply vague or non-informative priors. The implications of these priors can again be checked and analysed against an appropriate cost function.

## References#

1. L.A. Escobar, W.Q. Meeker, and Francis Pascual. Statistical Methods for Reliability Data. Wiley, 2021.

## Watermark#

%load_ext watermark
%watermark -n -u -v -iv -w -p pytensor,xarray_einstats

Last updated: Sun Jan 15 2023

Python implementation: CPython
Python version       : 3.10.8
IPython version      : 8.7.0

pytensor       : 2.8.10
xarray_einstats: 0.5.0.dev0

pandas    : 1.5.2
pymc      : 5.0.0
matplotlib: 3.6.2
xarray    : 2022.12.0
numpy     : 1.24.0
arviz     : 0.13.0

Watermark: 2.3.1


All the notebooks in this example gallery are provided under the MIT License which allows modification, and redistribution for any use provided the copyright and license notices are preserved.

## Citing PyMC examples#

To cite this notebook, use the DOI provided by Zenodo for the pymc-examples repository.

Important

Many notebooks are adapted from other sources: blogs, books… In such cases you should cite the original source as well.

Also remember to cite the relevant libraries used by your code.

Here is a citation template in bibtex:

@incollection{citekey,
author    = "<notebook authors, see above>",
title     = "<notebook title>",
editor    = "PyMC Team",
booktitle = "PyMC examples",
doi       = "10.5281/zenodo.5654871"
}


which once rendered could look like:

• Nathaniel Forde. "Reliability Statistics and Predictive Calibration". In: PyMC Examples. Ed. by PyMC Team. DOI: 10.5281/zenodo.5654871