pymc.adamax

pymc.adamax(loss_or_grads=None, params=None, learning_rate=0.002, beta1=0.9, beta2=0.999, epsilon=1e-08)

Adamax updates

Adamax updates implemented as in [1]. This is a variant of the Adam algorithm based on the infinity norm.
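
For intuition, the update rule from [1] can be sketched in plain NumPy as follows. This is only an illustrative sketch of the algorithm, not the library's implementation; the function name adamax_step and the state variables m, u, and t are introduced here for the example.

import numpy as np

def adamax_step(param, grad, m, u, t, learning_rate=0.002,
                beta1=0.9, beta2=0.999, epsilon=1e-8):
    # Exponentially decaying estimate of the first moment of the gradient.
    m = beta1 * m + (1 - beta1) * grad
    # Exponentially weighted infinity norm of past gradients.
    u = np.maximum(beta2 * u, np.abs(grad))
    # Bias-corrected step size; epsilon guards against division by zero.
    step = learning_rate / (1 - beta1 ** t)
    return param - step * m / (u + epsilon), m, u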

Parameters:
loss_or_grads: symbolic expression or list of expressions

A scalar loss expression, or a list of gradient expressions

params: list of shared variables

The variables to generate update expressions for

learning_rate: float

Learning rate

beta1: float

Exponential decay rate for the first moment estimates.

beta2: float

Exponential decay rate for the weighted infinity norm estimates.

epsilon: float

Constant for numerical stability.

Returns:
OrderedDict

A dictionary mapping each parameter to its update expression

Notes

The optimizer can be called without both loss_or_grads and params; in that case a partial function is returned that can be applied to them later.

References

[1] Kingma, Diederik, and Jimmy Ba (2014): Adam: A Method for Stochastic Optimization. arXiv preprint arXiv:1412.6980.

Examples

>>> import pytensor
>>> from pymc import adamax
>>> a = pytensor.shared(1.)
>>> b = a*2
>>> updates = adamax(b, [a], learning_rate=.01)
>>> isinstance(updates, dict)
True
>>> optimizer = adamax(learning_rate=.01)
>>> callable(optimizer)
True
>>> updates = optimizer(b, [a])
>>> isinstance(updates, dict)
True
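
As a further sketch, the returned update dictionary is typically passed to pytensor.function so that each call of the compiled function applies one Adamax step. The toy quadratic loss and variable names below are assumptions made for illustration, not part of this API.

>>> import pytensor
>>> from pymc import adamax
>>> w = pytensor.shared(0.0, name="w")
>>> loss = (w - 3.0) ** 2                # toy quadratic objective
>>> updates = adamax(loss, [w], learning_rate=0.1)
>>> train = pytensor.function([], loss, updates=updates)
>>> for _ in range(100):
...     _ = train()                      # each call applies one Adamax update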