Posts by Juan Martin Loyola
Using Data Containers
- 16 December 2021
After building the statistical model of your dreams, you’re going to need to feed it some data. Data is typically introduced to a PyMC model in one of two ways. Some data is used as an exogenous input, called X in linear regression models, where mu = X @ beta. Other data are “observed” examples of the endogenous outputs of your model, called y in regression models, and are used as input to the likelihood function implied by your model. These data, whether exogenous or endogenous, can be included in your model as a wide variety of datatypes, including NumPy ndarrays, pandas Series and DataFrame, and even PyTensor TensorVariables.
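To make the two roles concrete, here is a minimal sketch in plain NumPy rather than PyMC. It only illustrates how X enters through mu = X @ beta and how y enters through the likelihood. The data are simulated, and the coefficient values, the noise scale sigma, and the helper gaussian_loglike are assumptions made up for this illustration, not part of the original post.

```python
import numpy as np

rng = np.random.default_rng(0)

# Exogenous input: 100 rows, 3 predictors (simulated for illustration).
X = rng.normal(size=(100, 3))

# Hypothetical coefficients and noise scale, chosen only for this example.
beta = np.array([1.0, -2.0, 0.5])
sigma = 0.3

# Mean of the outcome implied by the linear model.
mu = X @ beta

# Endogenous output: the "observed" data the likelihood is evaluated on.
y = mu + rng.normal(scale=sigma, size=100)


def gaussian_loglike(y, mu, sigma):
    """Log-likelihood of y under independent Normal(mu, sigma) errors."""
    resid = y - mu
    terms = -0.5 * np.log(2 * np.pi * sigma**2) - resid**2 / (2 * sigma**2)
    return float(np.sum(terms))


# The observed y scores the generating coefficients far better than all zeros.
loglike_true = gaussian_loglike(y, mu, sigma)
loglike_zero = gaussian_loglike(y, X @ np.zeros(3), sigma)
```

In a PyMC model the same split applies: X feeds the expression for mu, while y is handed to the likelihood term as observed data.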