pymc.Data
pymc.Data(name, value, *, dims=None, coords=None, export_index_as_coords=False, mutable=None, **kwargs)
Data container that registers a data variable with the model.

Depending on the mutable setting (see the mutable parameter below for its version-dependent default), the variable is registered as a SharedVariable, enabling it to be altered in value and shape, but NOT in dimensionality, using pymc.set_data().

To set the value of the data container variable, check out pymc.Model.set_data().

For more information, read the notebook Using shared variables (Data container adaptation).
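A minimal sketch of this pattern, assuming a mutable container whose observed values are later swapped out with pymc.Model.set_data() (the variable names and values here are purely illustrative):

>>> import numpy as np
>>> import pymc as pm
>>> with pm.Model() as model:
...     # mutable=True backs the container with a SharedVariable
...     x = pm.Data('x', np.array([1.0, 2.0, 3.0]), mutable=True)
...     y = pm.Normal('y', mu=0.0, sigma=1.0, observed=x)
>>> # Replace the stored values; the shape may change, the dimensionality may not
>>> model.set_data('x', np.array([4.0, 5.0, 6.0, 7.0]))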
- Parameters
  - name : str
    The name for this variable.
  - value : array_like, pandas.Series, or pandas.DataFrame
    A value to associate with this variable.
  - dims : str or tuple of str, optional
    Dimension names of the random variables (as opposed to the shapes of these random variables). Use this when value is a pandas Series or DataFrame. The dims will then be the name of the Series / DataFrame's columns. See the ArviZ documentation for more information about dimensions and coordinates: ArviZ Quickstart. If this parameter is not specified, the random variables will not have dimension names.
  - coords : dict, optional
    Coordinate values to set for new dimensions introduced by this Data variable.
  - export_index_as_coords : bool, default=False
    If True, the Data container will try to infer what the coordinates and dimension names should be if there is an index in value (see the sketch after this parameter list).
  - mutable : bool, optional
    Switches between creating a SharedVariable (mutable=True) and a TensorConstant (mutable=False). Consider using pymc.ConstantData or pymc.MutableData as less verbose alternatives to pm.Data(..., mutable=...). If this parameter is not specified, its default depends on the package version: since v4.1.0 the default is mutable=False, while earlier versions default to mutable=True.
  - **kwargs : dict, optional
    Extra arguments passed to aesara.shared().
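A brief sketch of the dims, coords, and export_index_as_coords behaviour referenced above; the DataFrame contents and dimension names are made up for illustration, and the coordinate inference assumes the index and columns are named:

>>> import numpy as np
>>> import pandas as pd
>>> import pymc as pm
>>> df = pd.DataFrame(
...     np.random.randn(4, 3),
...     index=pd.Index([2019, 2020, 2021, 2022], name='year'),
...     columns=pd.Index(['a', 'b', 'c'], name='group'),
... )
>>> with pm.Model() as m:
...     # export_index_as_coords=True asks the container to try to infer the
...     # dimension names and coordinate values from the DataFrame's index/columns
...     table = pm.Data('table', df, export_index_as_coords=True, mutable=False)
>>> with pm.Model() as m2:
...     # dims and coords can also be passed explicitly for a plain array;
...     # pm.MutableData / pm.ConstantData are shorthands for mutable=True / False
...     x = pm.Data('x', np.random.randn(10), dims='obs_id',
...                 coords={'obs_id': np.arange(10)}, mutable=True)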
Examples
>>> import pymc as pm
>>> import numpy as np
>>> # We generate 10 datasets
>>> true_mu = [np.random.randn() for _ in range(10)]
>>> observed_data = [mu + np.random.randn(20) for mu in true_mu]

>>> with pm.Model() as model:
...     data = pm.MutableData('data', observed_data[0])
...     mu = pm.Normal('mu', 0, 10)
...     pm.Normal('y', mu=mu, sigma=1, observed=data)

>>> # Generate one trace for each dataset
>>> idatas = []
>>> for data_vals in observed_data:
...     with model:
...         # Switch out the observed dataset
...         model.set_data('data', data_vals)
...         idatas.append(pm.sample())
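Because 'data' was created as a MutableData container, it is backed by a SharedVariable: model.set_data only swaps the stored values in place, so the same model can be re-sampled against each dataset without being rebuilt. As noted above, the value and shape may change between runs, but the dimensionality may not.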