# pymc.Group

class pymc.Group(group=None, vfam=None, params=None, *args, **kwargs)

Base class for grouping variables in variational inference (VI).

A grouped approximation models mutual dependencies within a specified group of variables. This class is the base for local and global groups.

Parameters
group: list

List of PyMC variables, or None to indicate that the group takes all remaining variables.

vfam: str

String that identifies the variational family for the group. Cannot be passed together with params.

params: dict

Dict with variational family parameters; a full description is given below. Cannot be passed together with vfam.

random_seed: int

Random seed for underlying random generator

model

PyMC Model

options: dict

Special options for the group

kwargs: Other kwargs for the group

Notes

A Group instance or class has some important constants:

• supports_batched: determines whether the variational family can be used for AEVB or rowwise approximation.

An AEVB approximation is one that depends on the input data, so it can be treated as a conditional distribution. See the corresponding paper mentioned in the references for details.

Rowwise mode is a special-case approximation that treats every ‘row’ of a tensor as independent of the others. Some distributions cannot do that by definition, e.g. Empirical, which consists of particles only.

• has_logq: indicates that the distribution is defined explicitly.

These constants help select the correct inference method for a given parametrization.

References

Examples

Basic Initialization

Group is a factory class. You do not need to instantiate each approximation group class explicitly. By passing the appropriate vfam (Variational FAMily) argument, you specify which parametrization is desired for the group. This keeps code from being overloaded with many classes.

>>> group = Group([latent1, latent2], vfam='mean_field')
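
The factory behaviour described above can be illustrated with a minimal, standard-library-only sketch. It is not PyMC's implementation: the class names `GroupSketch`, `MeanFieldSketch` and `FullRankSketch`, and the registry layout, are hypothetical and chosen only to show how a base class can hand back the right subclass for a given vfam string.

```python
class GroupSketch:
    """Hypothetical stand-in for a factory base class; not PyMC code."""

    # Maps every accepted vfam string (name or alias) to a subclass.
    _vfam_registry = {}

    # Subclasses override these two attributes.
    short_name = None
    alias_names = frozenset()

    @classmethod
    def register(cls, sbcls):
        """Class decorator that records a subclass under all of its names."""
        for name in {sbcls.short_name, *sbcls.alias_names}:
            cls._vfam_registry[name.lower()] = sbcls
        return sbcls

    def __new__(cls, group=None, vfam=None, params=None):
        # Only the base class dispatches; subclasses are constructed normally.
        if cls is GroupSketch:
            if vfam is None:
                raise TypeError("this sketch only dispatches on vfam")
            try:
                target = cls._vfam_registry[vfam.lower()]
            except KeyError:
                raise KeyError(f"unknown variational family: {vfam!r}") from None
            return super().__new__(target)
        return super().__new__(cls)

    def __init__(self, group=None, vfam=None, params=None):
        self.group = group


@GroupSketch.register
class MeanFieldSketch(GroupSketch):
    short_name = "mean_field"
    alias_names = frozenset({"mf"})


@GroupSketch.register
class FullRankSketch(GroupSketch):
    short_name = "full_rank"
    alias_names = frozenset({"fr"})


# The caller names only the base class; the vfam string picks the subclass.
group = GroupSketch(["latent1", "latent2"], vfam="mean_field")
print(type(group).__name__)
```

Because the object returned by `__new__` is still an instance of the base class, Python runs `__init__` on it as usual, so callers never need to import the concrete subclasses.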


The other way to select an approximation is to provide a params dictionary with predefined, well-shaped parameters. The keys of the dict identify the variational family and are used to autoselect the correct group class. To identify which approximation to use, the params dict must contain the full set of required parameters. Because there are two ways to instantiate a Group, passing both vfam and params is prohibited. Partial parametrization is prohibited by design to avoid corner cases and possible problems.

>>> group = Group([latent3], params=dict(mu=my_mu, rho=my_rho))
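
To give a feel for what a `{'mu', 'rho'}` pair encodes, here is a small sketch in plain Python. It assumes that rho is an unconstrained parameter mapped to a standard deviation through softplus, which is believed to match PyMC's mean-field family but should be treated as an assumption here. The helper names are hypothetical, and plain lists stand in for the shared tensors a real `my_mu` and `my_rho` would be.

```python
import math
import random


def softplus(x):
    """Numerically stable log(1 + exp(x)); maps any real rho to a non-negative scale."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sample_mean_field(mu, rho, rng):
    """Draw one sample from independent normals with sigma = softplus(rho)."""
    return [m + softplus(r) * rng.gauss(0.0, 1.0) for m, r in zip(mu, rho)]


def logq_mean_field(z, mu, rho):
    """Explicit log-density of z under the same factorized normal."""
    total = 0.0
    for zi, m, r in zip(z, mu, rho):
        sigma = softplus(r)
        total += (
            -0.5 * math.log(2 * math.pi)
            - math.log(sigma)
            - 0.5 * ((zi - m) / sigma) ** 2
        )
    return total


rng = random.Random(0)
my_mu = [0.0, 1.0]    # one mean per flattened latent dimension
my_rho = [0.0, -1.0]  # unconstrained; softplus turns it into a scale
z = sample_mean_field(my_mu, my_rho, rng)
print(z, logq_mean_field(z, my_mu, my_rho))
```

Both parameters must have one entry per latent dimension, which is one reason a partially specified dictionary cannot be completed automatically.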


Note that if you pass custom params, they will not be collected automatically by the optimizer; you have to provide them with the more_obj_params keyword.

Supported dict keys:

• {‘mu’, ‘rho’}: MeanFieldGroup

• {‘mu’, ‘L_tril’}: FullRankGroup

• {‘histogram’}: EmpiricalGroup

• {0, 1, 2, 3, …, k-1}: NormalizingFlowGroup of depth k

Normalizing flows have different parameters from ordinary groups; they should be passed as nested dicts with the following keys:

• {‘u’, ‘w’, ‘b’}: PlanarFlow

• {‘a’, ‘b’, ‘z_ref’}: RadialFlow

• {‘loc’}: LocFlow

• {‘rho’}: ScaleFlow

• {‘v’}: HouseholderFlow

Note that all integer keys should be present in the dictionary. An example of NormalizingFlow initialization can be found below.
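
The key-based selection rules listed above can be sketched with the standard library alone. This is an illustrative lookup, not PyMC's code: `identify_group` is a hypothetical helper, and it returns class names as strings rather than constructing anything.

```python
# Key sets copied from the "Supported dict keys" lists above.
GROUP_BY_KEYS = {
    frozenset({"mu", "rho"}): "MeanFieldGroup",
    frozenset({"mu", "L_tril"}): "FullRankGroup",
    frozenset({"histogram"}): "EmpiricalGroup",
}
FLOW_BY_KEYS = {
    frozenset({"u", "w", "b"}): "PlanarFlow",
    frozenset({"a", "b", "z_ref"}): "RadialFlow",
    frozenset({"loc"}): "LocFlow",
    frozenset({"rho"}): "ScaleFlow",
    frozenset({"v"}): "HouseholderFlow",
}


def identify_group(vfam=None, params=None):
    """Return a label for the family implied by the arguments (hypothetical helper)."""
    if vfam is not None and params is not None:
        raise TypeError("pass either vfam or params, not both")
    if params is None:
        return vfam
    keys = frozenset(params)
    if keys in GROUP_BY_KEYS:
        return GROUP_BY_KEYS[keys]
    # Normalizing flow of depth k: keys must be exactly 0, 1, ..., k-1.
    if keys and keys == frozenset(range(len(keys))):
        flows = []
        for i in range(len(keys)):
            flow_keys = frozenset(params[i])
            if flow_keys not in FLOW_BY_KEYS:
                raise KeyError(f"layer {i}: unrecognized flow keys {sorted(flow_keys)}")
            flows.append(FLOW_BY_KEYS[flow_keys])
        return "NormalizingFlowGroup(" + ", ".join(flows) + ")"
    # Partial or unknown key sets are rejected rather than guessed at.
    raise KeyError(f"cannot identify a variational family from keys {sorted(map(str, keys))}")


print(identify_group(params={"mu": None, "rho": None}))
print(identify_group(params={0: {"u": None, "w": None, "b": None}, 1: {"loc": None}}))
```

Matching on the exact key set is what makes partial parametrization fail early: `{'mu'}` alone matches no entry, and a flow dict with a missing integer key is not a contiguous range.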

Using AEVB

Autoencoding variational Bayes (AEVB) is a powerful tool for obtaining a conditional distribution $$q(\lambda|X)$$ over latent variables. It is well supported by PyMC: all you need is to provide a dictionary with well-shaped variational parameters, and the correct approximation will be autoselected as described in the section above. However, AEVB has some implementation restrictions: the autoencoded variable must have the batch dimension as its first dimension, and the other dimensions must stay fixed. Under these assumptions, all variational approximation families can be generalized as batched approximations that have flexible parameters and a leading axis.
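
One way to read the batch-dimension restriction, assuming a Gaussian mean-field family and an encoder with outputs $$\mu_\phi$$ and $$\sigma_\phi$$ (hypothetical symbols, not taken from PyMC): each row $$x_i$$ of $$X$$ along the leading axis gets its own variational parameters, so the conditional distribution factorizes over rows:

$$
q(\lambda \mid X) = \prod_{i=1}^{N} \mathcal{N}\bigl(\lambda_i \mid \mu_\phi(x_i),\ \operatorname{diag}(\sigma_\phi(x_i)^2)\bigr)
$$

Here $$N$$ is the batch size, which may change between calls, while the trailing shape of each $$\lambda_i$$ stays fixed.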

Delayed Initialization

When you have many latent variables, it is impractical to assign them all manually. To make life simpler, you can pass None instead of a list of variables. In that case, shared parameters are not created until you pass all collected groups to an Approximation object, which gathers the groups together and checks that every group is correctly initialized. For groups whose group argument is None, it collects all remaining variables not covered by other groups and performs delayed initialization.

>>> group_1 = Group([latent1], vfam='fr')  # latent1 has full rank approximation
>>> group_other = Group(None, vfam='mf')  # other variables have mean field Q
>>> approx = Approximation([group_1, group_other])
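
The bookkeeping behind delayed initialization can be sketched in plain Python. This is an illustration under stated assumptions, not PyMC's implementation: `resolve_groups` is a hypothetical helper, variables are represented by their names, and the specific checks (one None group at most, no duplicates, no uncovered variables) are assumptions about what a consistency check would need.

```python
def resolve_groups(model_vars, groups):
    """Assign leftover variables to the single group declared with None.

    model_vars: ordered list of variable names in the model.
    groups: list whose items are lists of names, or None for "all the rest".
    Returns one list of names per group, in the same order as `groups`.
    """
    if sum(g is None for g in groups) > 1:
        raise ValueError("at most one group may be None")
    seen = set()
    for g in groups:
        if g is None:
            continue
        for name in g:
            if name not in model_vars:
                raise ValueError(f"{name!r} is not a model variable")
            if name in seen:
                raise ValueError(f"{name!r} appears in more than one group")
            seen.add(name)
    # Everything not claimed explicitly goes to the None group.
    rest = [v for v in model_vars if v not in seen]
    if rest and not any(g is None for g in groups):
        raise ValueError(f"no approximation specified for {rest}")
    return [rest if g is None else list(g) for g in groups]


print(resolve_groups(["latent1", "latent2", "latent3"], [["latent1"], None]))
```

Deferring the assignment until all groups are known is what lets the None group be declared before, after, or between the explicit ones.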


Summing Up

Once you have created all the groups, pass them to Approximation. It does not accept any parameter other than groups.

>>> approx = Approximation(my_groups)


Methods

• Group.__init__(group[, vfam, params, ...])

• Group.get_param_spec_for(**kwargs)

• Group.group_for_params(params)

• Dev - creates correct replacements for initial depending on sample size and deterministic flag

• Group.register(sbcls)

• Group.set_size_and_deterministic(node, s, d): Dev - after node is sampled via symbolic_sample_over_posterior() or symbolic_single_sample() new random generator can be allocated and applied to node

• Dev - performs sampling of node applying independent samples from posterior each time.

• Dev - performs sampling of node applying single sample from posterior.

• Dev - replace vars with flattened view stored in self.inputs

Attributes

• alias_names

• cov

• ddim

• has_logq

• initial_dist_map

• initial_dist_name

• input

• logq: Dev - Monte Carlo estimate for group logQ

• logq_norm: Dev - Monte Carlo estimate for group logQ normalized

• mean

• ndim

• params

• params_dict

• replacements

• shared_params

• short_name

• std

• symbolic_initial

• symbolic_logq: Dev - correctly scaled self.symbolic_logq_not_scaled

• symbolic_logq_not_scaled: Dev - symbolically computed logq for self.symbolic_random; computations can be more efficient since all is known beforehand, including self.symbolic_random

• symbolic_normalizing_constant: Dev - normalizing constant for self.logq, scales it to minibatch_size instead of total_size

• symbolic_random: Dev - abstract node that takes self.symbolic_initial and creates approximate posterior that is parametrized with self.params_dict.