pymc.Data#
- pymc.Data(name, value, *, dims=None, coords=None, infer_dims_and_coords=False, mutable=None, **kwargs)[source]#
Data container that registers a data variable with the model.
Depending on the mutable setting (default: True), the variable is registered as a SharedVariable, enabling it to be altered in value and shape, but NOT in dimensionality, using pymc.set_data().
To set the value of the data container variable, check out pymc.Model.set_data().
When making predictions or doing posterior predictive sampling, the shape of the registered data variable will most likely need to be changed. If you encounter a PyTensor shape mismatch error, refer to the documentation for pymc.Model.set_data().
For more information, read the notebook Using Data Containers.
- Parameters:
  - name : str
    The name for this variable.
  - value : array_like or pandas.Series, pandas.DataFrame
    A value to associate with this variable.
  - dims : str or tuple of str, optional
    Dimension names of the random variables (as opposed to the shapes of these random variables). Use this when value is a pandas Series or DataFrame. The dims will then be the name of the Series / DataFrame's columns. See the ArviZ documentation for more information about dimensions and coordinates: ArviZ Quickstart. If this parameter is not specified, the random variables will not have dimension names.
  - coords : dict, optional
    Coordinate values to set for new dimensions introduced by this Data variable.
  - export_index_as_coords : bool
    Deprecated; use infer_dims_and_coords instead.
  - infer_dims_and_coords : bool, default=False
    If True, the Data container will try to infer what the coordinates and dimension names should be if there is an index in value.
  - **kwargs : dict, optional
    Extra arguments passed to pytensor.shared().
Examples
>>> import pymc as pm
>>> import numpy as np
>>> # We generate 10 datasets
>>> true_mu = [np.random.randn() for _ in range(10)]
>>> observed_data = [mu + np.random.randn(20) for mu in true_mu]

>>> with pm.Model() as model:
...     data = pm.Data("data", observed_data[0])
...     mu = pm.Normal("mu", 0, 10)
...     pm.Normal("y", mu=mu, sigma=1, observed=data)

>>> # Generate one trace for each dataset
>>> idatas = []
>>> for data_vals in observed_data:
...     with model:
...         # Switch out the observed dataset
...         model.set_data("data", data_vals)
...         idatas.append(pm.sample())