pymc.Group

class pymc.Group(group=None, vfam=None, params=None, *args, **kwargs)

Base class for grouping variables in variational inference (VI).

A grouped approximation models the mutual dependencies within a specified group of variables. This class is the base for both local and global groups.

Parameters:
group: list

List of PyMC variables, or None to indicate that the group takes all remaining variables

vfam: str

String that identifies the variational family for the group. Cannot be passed together with params

params: dict

Dict with variational family parameters; a full description can be found below. Cannot be passed together with vfam

random_seed: int

Random seed for the underlying random generator

model

PyMC Model

options: dict

Special options for the group

kwargs: Other kwargs for the group

See also

Approximation

Notes

A Group instance or class has some important constants:

  • has_logq: indicates that the distribution is defined explicitly

These constants help select the correct inference method for a given parametrization.

Examples

Basic Initialization

Group is a factory class. You do not need to call each ApproximationGroup class explicitly: passing the appropriate vfam (Variational FAMily) argument tells Group which parametrization is desired for the group. This avoids cluttering code with many classes.

>>> group = Group([latent1, latent2], vfam='mean_field')

The other way to select an approximation is to provide a params dictionary containing predefined, correctly shaped parameters. The keys of the dict identify the variational family and are used to select the correct group class automatically. To identify which approximation to use, the params dict must contain the full set of required parameters. Because there are two ways to instantiate Group, passing both vfam and params is prohibited. Partial parametrization is prohibited by design to avoid corner cases and possible problems.

>>> group = Group([latent3], params=dict(mu=my_mu, rho=my_rho))

Note that custom params are not collected automatically by the optimizer; you have to provide them with the more_obj_params keyword.

Supported dict keys:

  • {‘mu’, ‘rho’}: MeanFieldGroup

  • {‘mu’, ‘L_tril’}: FullRankGroup

  • {‘histogram’}: EmpiricalGroup
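The key-based selection described above can be pictured as a lookup from a set of keys to a group class. The following is a toy sketch in plain Python, not PyMC's implementation: PARAM_KEYS_TO_GROUP, VFAM_TO_GROUP, and select_group_name are hypothetical names, the vfam table lists only the short names that appear in the examples on this page, and the exception types are assumptions.

```python
# Toy sketch only: hypothetical names, not PyMC's internal registry.
PARAM_KEYS_TO_GROUP = {
    frozenset({"mu", "rho"}): "MeanFieldGroup",
    frozenset({"mu", "L_tril"}): "FullRankGroup",
    frozenset({"histogram"}): "EmpiricalGroup",
}

# Only the vfam short names used in the examples on this page.
VFAM_TO_GROUP = {
    "mean_field": "MeanFieldGroup",
    "mf": "MeanFieldGroup",
    "fr": "FullRankGroup",
}


def select_group_name(vfam=None, params=None):
    """Return the group class name implied by vfam or by the keys of params."""
    # Assumption: exactly one of the two selectors must be given.
    if (vfam is None) == (params is None):
        raise ValueError("pass exactly one of vfam or params")
    if vfam is not None:
        return VFAM_TO_GROUP[vfam]
    keys = frozenset(params)
    if keys not in PARAM_KEYS_TO_GROUP:
        # A partial or unknown key set cannot identify a variational family.
        raise KeyError(f"unsupported parameter set: {sorted(keys)}")
    return PARAM_KEYS_TO_GROUP[keys]


print(select_group_name(vfam="fr"))
print(select_group_name(params={"mu": 0.0, "rho": 0.0}))
```

Matching on the complete key set is what makes partial parametrization fail early in this sketch: a dict with only mu matches no entry, so no family is guessed.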

Delayed Initialization

When you have many latent variables, assigning them all to groups manually is impractical. To simplify this, you can pass None instead of a list of variables. In that case the shared parameters are not created until you pass all the groups to an Approximation object, which collects the groups together and checks that every group is correctly initialized. For groups whose group argument is None, it collects all remaining variables not covered by other groups and performs the delayed initialization.

>>> group_1 = Group([latent1], vfam='fr')  # latent1 has full rank approximation
>>> group_other = Group(None, vfam='mf')  # other variables have mean field Q
>>> approx = Approximation([group_1, group_other])
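The bookkeeping behind delayed initialization can be sketched in plain Python. This is an illustration under stated assumptions, not PyMC code: resolve_groups is a hypothetical helper, variables are represented by their names, and the checks (no variable in two groups, at most one catch-all group, no variable left uncovered) are assumptions about what a consistent grouping requires.

```python
def resolve_groups(all_vars, groups):
    """Fill the single catch-all (None) entry with the variables no other group claims."""
    claimed = [v for g in groups if g is not None for v in g]
    if len(set(claimed)) != len(claimed):
        raise ValueError("a variable is listed in more than one group")
    if sum(g is None for g in groups) > 1:
        raise ValueError("only one group can take the remaining variables")
    # Everything not claimed explicitly goes to the catch-all group.
    rest = [v for v in all_vars if v not in claimed]
    if rest and None not in groups:
        raise ValueError(f"no group covers: {rest}")
    return [rest if g is None else list(g) for g in groups]


model_vars = ["latent1", "latent2", "latent3"]
print(resolve_groups(model_vars, [["latent1"], None]))
# → [['latent1'], ['latent2', 'latent3']]
```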

Summing Up

Once you have created all the groups, pass them to Approximation. It does not accept any parameter other than the groups.

>>> approx = Approximation(my_groups)

Methods

Group.__init__(group[, vfam, params, ...])

Group.get_param_spec_for(**kwargs)

Group.group_for_params(params)

Group.group_for_short_name(name)

Group.make_size_and_deterministic_replacements(s, d)

Dev - creates correct replacements for initial depending on sample size and deterministic flag

Group.register(sbcls)

Group.set_size_and_deterministic(node, s, d)

Dev - after node is sampled via symbolic_sample_over_posterior() or symbolic_single_sample(), a new random generator can be allocated and applied to node

Group.symbolic_sample_over_posterior(node)

Dev - performs sampling of node applying independent samples from posterior each time.

Group.symbolic_single_sample(node)

Dev - performs sampling of node applying single sample from posterior.

Group.to_flat_input(node)

Dev - replace vars with flattened view stored in self.inputs

Group.var_to_data(shared)

Takes a flat 1-dimensional tensor variable and maps it to an xarray data set based on the information in self.ordering.
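The idea of mapping a flat vector back to named variables can be sketched as follows. This is a simplified illustration only: split_flat_vector is a hypothetical helper, the ordering layout used here (name mapped to a start and stop index) is an assumption rather than the actual structure of self.ordering, and the result is a plain dict instead of an xarray data set.

```python
def split_flat_vector(flat, ordering):
    """Map a flat sequence to {name: values} using (start, stop) index pairs."""
    return {name: list(flat[start:stop]) for name, (start, stop) in ordering.items()}


# Hypothetical layout: latent1 is a scalar, latent2 has three elements.
ordering = {"latent1": (0, 1), "latent2": (1, 4)}
flat_mean = [0.5, -1.0, 0.0, 2.0]
print(split_flat_vector(flat_mean, ordering))
# → {'latent1': [0.5], 'latent2': [-1.0, 0.0, 2.0]}
```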

Attributes

alias_names

cov

Covariance between the latent variables as an unstructured 2-dimensional tensor variable

ddim

has_logq

initial_dist_map

initial_dist_name

input

logq

Dev - Monte Carlo estimate for group logQ

logq_norm

Dev - Monte Carlo estimate for group logQ normalized

mean

Mean of the latent variables as an unstructured 1-dimensional tensor variable

mean_data

Mean of the latent variables as an xarray Dataset

ndim

params

params_dict

replacements

shared_params

short_name

std

Standard deviation of the latent variables as an unstructured 1-dimensional tensor variable

std_data

Standard deviation of the latent variables as an xarray Dataset

symbolic_initial

symbolic_logq

Dev - correctly scaled self.symbolic_logq_not_scaled

symbolic_logq_not_scaled

Dev - symbolically computed logq for self.symbolic_random. Computations can be more efficient since everything is known beforehand, including self.symbolic_random

symbolic_normalizing_constant

Dev - normalizing constant for self.logq, scales it to minibatch_size instead of total_size

symbolic_random

Dev - abstract node that takes self.symbolic_initial and creates approximate posterior that is parametrized with self.params_dict.