pymc.Group
- class pymc.Group(group=None, vfam=None, params=None, *args, **kwargs)
Base class for grouping variables in VI.
A grouped approximation is used for modelling mutual dependencies within a specified group of variables. Base for local and global groups.
- Parameters:
- group: list
List of PyMC variables, or None to indicate that the group takes all remaining variables
- vfam: str
String that marks the corresponding variational family for the group. Cannot be passed together with params
- params: dict
Dict with variational family parameters; a full description can be found below. Cannot be passed together with vfam
- random_seed: int
Random seed for underlying random generator
- model
PyMC Model
- options: dict
Special options for the group
- kwargs:
Other keyword arguments for the group
Notes
A Group instance/class has some important constants:
has_logq
Tells whether the distribution's log-density is defined explicitly.
These constants help select the correct inference method for a given parametrization.
References
Kingma, D. P., & Welling, M. (2014). Auto-Encoding Variational Bayes. stat, 1050, 1.
Examples
Basic Initialization
Group is a factory class. You do not need to instantiate each approximation group explicitly: passing the correct vfam (Variational FAMily) argument tells it which parametrization is desired for the group. This keeps the code from being overloaded with lots of classes.
>>> group = Group([latent1, latent2], vfam="mean_field")
The other way to select an approximation is to provide a params dictionary containing predefined, well-shaped parameters. The keys of the dict identify the variational family and are used to autoselect the correct group class; for this to work, the params dict must contain the full set of needed parameters. Since there are two ways to instantiate a Group, passing both vfam and params is prohibited. Partial parametrization is also prohibited by design, to avoid corner cases and possible problems.
>>> group = Group([latent3], params=dict(mu=my_mu, rho=my_rho))
Note that if you pass custom params, they will not be autocollected by the optimizer; you will have to provide them via the more_obj_params keyword.
Supported dict keys:
{‘mu’, ‘rho’}: MeanFieldGroup
{‘mu’, ‘L_tril’}: FullRankGroup
{‘histogram’}: EmpiricalGroup
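The key-based autoselection described above can be sketched as a small dispatch table. This is a hypothetical, simplified illustration of the idea, not PyMC's actual implementation (the real lookup happens via Group.group_for_params over registered subclasses):

```python
# Hypothetical sketch: map a full set of params-dict keys to a group family.
# Partial key sets do not match anything, mirroring the "no partial
# parametrization" rule described above.
_FAMILY_BY_KEYS = {
    frozenset({"mu", "rho"}): "MeanFieldGroup",
    frozenset({"mu", "L_tril"}): "FullRankGroup",
    frozenset({"histogram"}): "EmpiricalGroup",
}

def group_for_params(params):
    """Return the family name whose key set matches params exactly."""
    keys = frozenset(params)
    try:
        return _FAMILY_BY_KEYS[keys]
    except KeyError:
        raise KeyError(f"No group family matches parameter keys {set(keys)!r}") from None
```

For example, `group_for_params(dict(mu=my_mu, rho=my_rho))` would resolve to the mean-field family, while a dict containing only `mu` raises, because only the full key set identifies a family.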
Delayed Initialization
When you have many latent variables, it is impractical to enumerate them all manually. To simplify this, you can pass None instead of a list of variables. In that case, shared parameters are not created until you pass all collected groups to an Approximation object, which gathers the groups together and checks that every group is correctly initialized. For a group whose group argument is None, it collects all remaining variables not covered by other groups and performs delayed initialization.
>>> group_1 = Group([latent1], vfam="fr")  # latent1 has a full-rank approximation
>>> group_other = Group(None, vfam="mf")  # other variables have mean-field Q
>>> approx = Approximation([group_1, group_other])
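The "collect all the rest" step above can be sketched in plain Python. This is an assumed, simplified model of the behavior (variable names stand in for PyMC variables; the real logic lives inside Approximation):

```python
# Hypothetical sketch of delayed initialization: one group may be declared
# with vars=None, and at collection time it receives every variable not
# claimed by any other group.
def resolve_groups(all_vars, groups):
    """groups: list of (vars_or_None, vfam) pairs; returns resolved pairs."""
    claimed = set()
    rest_index = None
    resolved = []
    for i, (vs, vfam) in enumerate(groups):
        if vs is None:
            if rest_index is not None:
                raise ValueError("Only one delayed-init (None) group is allowed")
            rest_index = i
            resolved.append((None, vfam))
        else:
            claimed.update(vs)
            resolved.append((list(vs), vfam))
    if rest_index is not None:
        rest = [v for v in all_vars if v not in claimed]
        resolved[rest_index] = (rest, resolved[rest_index][1])
    return resolved
```

With variables a, b, c and groups `[(["a"], "fr"), (None, "mf")]`, the None group resolves to `["b", "c"]`, matching the full-rank/mean-field split shown above.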
Summing Up
When you have created all the groups, pass them together to Approximation. It does not accept any parameter other than the groups.
>>> approx = Approximation(my_groups)
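The check that "every group is correctly initialized" can be illustrated with a small validation sketch. This is a hypothetical simplification (groups are modeled as sets of variable names, or None for the delayed group), not PyMC's actual code:

```python
# Hypothetical sketch of the validation an Approximation-like collector
# might perform: no variable may be claimed by two groups, and at most
# one group may use delayed initialization (None).
def validate_groups(groups):
    seen = set()
    delayed = 0
    for g in groups:
        if g is None:
            delayed += 1
            continue
        overlap = seen & set(g)
        if overlap:
            raise ValueError(f"Variables claimed by more than one group: {overlap}")
        seen |= set(g)
    if delayed > 1:
        raise ValueError("At most one delayed-init (None) group is allowed")
    return True
```

A disjoint grouping like `[{"a"}, {"b", "c"}, None]` passes, while overlapping groups are rejected before any shared parameters would be built.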
Methods
Group.__init__(group[, vfam, params, ...])
Group.get_param_spec_for(**kwargs)
Group.group_for_params(params)
Group.make_size_and_deterministic_replacements(s, d)
Dev - creates correct replacements for initial depending on sample size and deterministic flag
Group.register(sbcls)
Group.set_size_and_deterministic(node, s, d)
Dev - after node is sampled via symbolic_sample_over_posterior() or symbolic_single_sample(), a new random generator can be allocated and applied to node
Group.symbolic_sample_over_posterior(node)
Dev - performs sampling of node applying independent samples from posterior each time.
Group.symbolic_single_sample(node)
Dev - performs sampling of node applying a single sample from posterior.
Group.to_flat_input(node)
Dev - replace vars with flattened view stored in self.inputs
Group.var_to_data(shared)
Takes a flat 1-dimensional tensor variable and maps it to an xarray data set based on the information in self.ordering.
Attributes
alias_names
cov
Covariance between the latent variables as an unstructured 2-dimensional tensor variable
ddim
has_logq
initial_dist_map
initial_dist_name
input
logq
Dev - Monte Carlo estimate for group logQ
logq_norm
Dev - Monte Carlo estimate for group logQ normalized
mean
Mean of the latent variables as an unstructured 1-dimensional tensor variable
mean_data
Mean of the latent variables as an xarray Dataset
ndim
params
params_dict
replacements
shared_params
short_name
std
Standard deviation of the latent variables as an unstructured 1-dimensional tensor variable
std_data
Standard deviation of the latent variables as an xarray Dataset
symbolic_initial
symbolic_logq
Dev - correctly scaled self.symbolic_logq_not_scaled
symbolic_logq_not_scaled
Dev - symbolically computed logq for self.symbolic_random; computations can be more efficient since everything is known beforehand, including self.symbolic_random
symbolic_normalizing_constant
Dev - normalizing constant for self.logq, scales it to minibatch_size instead of total_size
symbolic_random
Dev - abstract node that takes self.symbolic_initial and creates approximate posterior that is parametrized with self.params_dict.