pymc.Group
- class pymc.Group(group=None, vfam=None, params=None, *args, **kwargs)
Base class for grouping variables in VI.
A grouped approximation is used for modelling mutual dependencies within a specified group of variables. Base for local and global groups.
- Parameters:
- group: list
List of PyMC variables, or None to indicate that the group takes all remaining variables
- vfam: str
String that marks the corresponding variational family for the group. Cannot be passed together with params
- params: dict
Dict with variational family parameters; a full description can be found below. Cannot be passed together with vfam
- random_seed: int
Random seed for underlying random generator
- model
PyMC Model
- options: dict
Special options for the group
- kwargs:
Other keyword arguments for the group
Notes
A Group instance/class has some important constants:
has_logq
Tells whether the distribution's log-density is defined explicitly.
These constants help select the correct inference method for a given parametrization.
References
Kingma, D. P., & Welling, M. (2014). Auto-Encoding Variational Bayes. stat, 1050, 1.
Examples
Basic Initialization
Group is a factory class. You do not need to instantiate each approximation group explicitly: passing the correct vfam (Variational FAMily) argument tells it which parametrization is desired for the group. This keeps the code from being overloaded with lots of classes.
>>> group = Group([latent1, latent2], vfam="mean_field")
The other way to select an approximation is to provide a params dictionary containing predefined, well-shaped parameters. The keys of the dict identify the variational family and are used to autoselect the correct group class; for this to work, the params dict must contain the full set of needed parameters. Since there are two ways to instantiate a Group, passing both vfam and params is prohibited. Partial parametrization is also prohibited by design, to avoid corner cases and possible problems.
>>> group = Group([latent3], params=dict(mu=my_mu, rho=my_rho))
Note that if you pass custom params, they will not be autocollected by the optimizer; you will have to provide them via the more_obj_params keyword.
Supported dict keys:
{‘mu’, ‘rho’}: MeanFieldGroup
{‘mu’, ‘L_tril’}: FullRankGroup
{‘histogram’}: EmpiricalGroup
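The key-based autoselection described above can be sketched as a small dispatch table. This is a hypothetical, simplified illustration of the idea, not PyMC's actual implementation (the real lookup happens via Group.group_for_params over registered subclasses):

```python
# Hypothetical sketch: map a full set of params-dict keys to a group family.
# Partial key sets do not match anything, mirroring the "no partial
# parametrization" rule described above.
_FAMILY_BY_KEYS = {
    frozenset({"mu", "rho"}): "MeanFieldGroup",
    frozenset({"mu", "L_tril"}): "FullRankGroup",
    frozenset({"histogram"}): "EmpiricalGroup",
}

def group_for_params(params):
    """Return the family name whose key set matches params exactly."""
    keys = frozenset(params)
    try:
        return _FAMILY_BY_KEYS[keys]
    except KeyError:
        raise KeyError(f"No group family matches parameter keys {set(keys)!r}") from None
```

For example, `group_for_params(dict(mu=my_mu, rho=my_rho))` would resolve to the mean-field family, while a dict containing only `mu` raises, because only the full key set identifies a family.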
Delayed Initialization
When you have many latent variables, it is impractical to enumerate them all manually. To simplify this, you can pass None instead of a list of variables. In that case, shared parameters are not created until you pass all collected groups to an Approximation object, which gathers the groups together and checks that every group is correctly initialized. For a group whose group argument is None, it collects all remaining variables not covered by other groups and performs delayed initialization.
>>> group_1 = Group([latent1], vfam="fr")  # latent1 has a full-rank approximation
>>> group_other = Group(None, vfam="mf")  # other variables have mean-field Q
>>> approx = Approximation([group_1, group_other])
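The "collect all the rest" step above can be sketched in plain Python. This is an assumed, simplified model of the behavior (variable names stand in for PyMC variables; the real logic lives inside Approximation):

```python
# Hypothetical sketch of delayed initialization: one group may be declared
# with vars=None, and at collection time it receives every variable not
# claimed by any other group.
def resolve_groups(all_vars, groups):
    """groups: list of (vars_or_None, vfam) pairs; returns resolved pairs."""
    claimed = set()
    rest_index = None
    resolved = []
    for i, (vs, vfam) in enumerate(groups):
        if vs is None:
            if rest_index is not None:
                raise ValueError("Only one delayed-init (None) group is allowed")
            rest_index = i
            resolved.append((None, vfam))
        else:
            claimed.update(vs)
            resolved.append((list(vs), vfam))
    if rest_index is not None:
        rest = [v for v in all_vars if v not in claimed]
        resolved[rest_index] = (rest, resolved[rest_index][1])
    return resolved
```

With variables a, b, c and groups `[(["a"], "fr"), (None, "mf")]`, the None group resolves to `["b", "c"]`, matching the full-rank/mean-field split shown above.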
Summing Up
When you have created all the groups, pass them together to Approximation. It does not accept any parameter other than the groups.
>>> approx = Approximation(my_groups)
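The check that "every group is correctly initialized" can be illustrated with a small validation sketch. This is a hypothetical simplification (groups are modeled as sets of variable names, or None for the delayed group), not PyMC's actual code:

```python
# Hypothetical sketch of the validation an Approximation-like collector
# might perform: no variable may be claimed by two groups, and at most
# one group may use delayed initialization (None).
def validate_groups(groups):
    seen = set()
    delayed = 0
    for g in groups:
        if g is None:
            delayed += 1
            continue
        overlap = seen & set(g)
        if overlap:
            raise ValueError(f"Variables claimed by more than one group: {overlap}")
        seen |= set(g)
    if delayed > 1:
        raise ValueError("At most one delayed-init (None) group is allowed")
    return True
```

A disjoint grouping like `[{"a"}, {"b", "c"}, None]` passes, while overlapping groups are rejected before any shared parameters would be built.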
Methods
Group.__init__(group[, vfam, params, ...])
Group.get_param_spec_for(**kwargs)
Group.group_for_params(params)
Group.make_size_and_deterministic_replacements(s, d)
Dev - creates correct replacements for initial depending on sample size and deterministic flag
Group.register(sbcls)
Group.set_size_and_deterministic(node, s, d)
Dev - after node is sampled via symbolic_sample_over_posterior() or symbolic_single_sample(), a new random generator can be allocated and applied to node
Group.symbolic_sample_over_posterior(node)
Dev - performs sampling of node applying independent samples from posterior each time.
Group.symbolic_single_sample(node)
Dev - performs sampling of node applying a single sample from posterior.
Group.to_flat_input(node)
Dev - replace vars with flattened view stored in self.inputs
Group.var_to_data(shared)
Takes a flat 1-dimensional tensor variable and maps it to an xarray data set based on the information in self.ordering.
Attributes
alias_names
cov
Covariance between the latent variables as an unstructured 2-dimensional tensor variable
ddim
has_logq
initial_dist_map
initial_dist_name
input
logq
Dev - Monte Carlo estimate for group logQ
logq_norm
Dev - Monte Carlo estimate for group logQ normalized
mean
Mean of the latent variables as an unstructured 1-dimensional tensor variable
mean_data
Mean of the latent variables as an xarray Dataset
ndim
params
params_dict
replacements
shared_params
short_name
std
Standard deviation of the latent variables as an unstructured 1-dimensional tensor variable
std_data
Standard deviation of the latent variables as an xarray Dataset
symbolic_initial
symbolic_logq
Dev - correctly scaled self.symbolic_logq_not_scaled
symbolic_logq_not_scaled
Dev - symbolically computed logq for self.symbolic_random; computations can be more efficient since everything is known beforehand, including self.symbolic_random
symbolic_normalizing_constant
Dev - normalizing constant for self.logq, scales it to minibatch_size instead of total_size
symbolic_random
Dev - abstract node that takes self.symbolic_initial and creates approximate posterior that is parametrized with self.params_dict.