Posts tagged experimentation

Sensitivity Analysis for Unmeasured Confounding

All applied inference is argument. Against every experiment you can set the contention that the working conditions were imperfect. Some aspect of the evaluation was flawed. Maybe treatment assignment introduced a subtle kind of bias, or the subjects didn’t comply fully with the design. Against every experiment you can contrast the scientific ideal of perfect randomisation and clear adherence. Holding an experiment against that ideal is due diligence. Sensitivity analysis does it systematically, by varying how far the working conditions fall short of perfect randomisation.
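One way to make "varying how far the working conditions fall short" concrete is a Rosenbaum-style bound for a matched-pairs sign test. This is a minimal sketch, not necessarily the method the full post uses. The parameter `gamma` caps how much the odds of treatment may differ within a pair (`gamma = 1` is perfect randomisation). The pair counts and function names are hypothetical, chosen only for illustration.

```python
from math import comb


def binom_upper_tail(n, k, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def sign_test_worst_case_p(n_pairs, n_treated_better, gamma):
    """Upper bound on the one-sided sign-test p-value under hidden bias.

    With bias at most gamma, the chance that the treated unit wins a pair
    under the null is at most gamma / (1 + gamma), so the worst-case
    p-value is a binomial tail evaluated at that probability.
    """
    p_plus = gamma / (1.0 + gamma)
    return binom_upper_tail(n_pairs, n_treated_better, p_plus)


# Hypothetical data: 50 discordant pairs, treated unit did better in 36.
n_pairs, n_treated_better = 50, 36

for gamma in (1.0, 1.5, 2.0, 3.0):
    p = sign_test_worst_case_p(n_pairs, n_treated_better, gamma)
    print(f"gamma={gamma:.1f}  worst-case p={p:.4f}")
```

The value of `gamma` at which the worst-case p-value crosses the chosen threshold gives a rough measure of how much departure from randomisation the conclusion can tolerate.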



Multiple Experiments and Bayesian Meta-analysis

Eight quarterly A/B tests of the same checkout-flow redesign, run across eight markets, return eight different point estimates. Two cross the conventional significance threshold; the other six do not. The product manager asks the natural question, “did it work?”, and gets two incompatible defaults depending on which colleague answers: vote-counting (“two out of eight worked, so it’s a wash”), or pool-everything (“the combined estimate is positive, so it works”). Both are mistakes. Vote-counting discards the magnitude information in each estimate; pooling everything pretends the markets are exchangeable in a way the evidence does not support. The honest answer requires a model that estimates between-market differences rather than assuming them away.
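The three answers can be compared side by side in a minimal sketch. The eight estimates and standard errors below are hypothetical. The between-market variance `tau2` is estimated with the DerSimonian-Laird moment estimator, used here as a simple frequentist stand-in for the Bayesian hierarchical model the title refers to.

```python
import math

# Hypothetical per-market lift estimates and their standard errors.
estimates = [0.8, 2.9, -0.4, 1.1, 3.4, 0.2, 1.5, -0.9]
std_errs = [0.9, 1.1, 1.0, 0.8, 1.2, 1.0, 0.9, 1.1]


def vote_count(est, se, z=1.96):
    """Count markets whose estimate is significantly positive."""
    return sum(1 for y, s in zip(est, se) if y / s > z)


def fixed_effect(est, se):
    """Inverse-variance pooled estimate, assuming one common effect."""
    w = [1.0 / s**2 for s in se]
    mu = sum(wi * yi for wi, yi in zip(w, est)) / sum(w)
    return mu, math.sqrt(1.0 / sum(w))


def random_effects_dl(est, se):
    """Random-effects estimate with DerSimonian-Laird between-market variance."""
    w = [1.0 / s**2 for s in se]
    mu_fe, _ = fixed_effect(est, se)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(wi * (yi - mu_fe) ** 2 for wi, yi in zip(w, est))
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(est) - 1)) / c)
    # Re-weight each market by its sampling variance plus tau2.
    w_re = [1.0 / (s**2 + tau2) for s in se]
    mu = sum(wi * yi for wi, yi in zip(w_re, est)) / sum(w_re)
    return mu, math.sqrt(1.0 / sum(w_re)), tau2


print("significant markets:", vote_count(estimates, std_errs))
print("fixed effect (mean, se):", fixed_effect(estimates, std_errs))
print("random effects (mean, se, tau2):", random_effects_dl(estimates, std_errs))
```

When `tau2` is above zero, the random-effects standard error is wider than the fixed-effect one, which is the price of admitting that markets differ.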



Assurance Planning via Simulation

Experimental questions are seeded in the science that preceded them. Answers are stress-tested and refined. New experiments spawn further questions in turn. This is the cycle.
