pymc.adamax

pymc.adamax(loss_or_grads=None, params=None, learning_rate=0.002, beta1=0.9, beta2=0.999, epsilon=1e-08)

Adamax updates

Adamax updates implemented as in [1]. This is a variant of the Adam algorithm based on the infinity norm.
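
For reference, [1] defines the Adamax step in terms of a first-moment estimate and an exponentially weighted infinity norm of the gradients. A sketch of the per-parameter rule, where g_t is the gradient, m_t the first-moment estimate, u_t the weighted infinity norm, and beta1, beta2, epsilon are the arguments below (placing epsilon in the denominator is a common implementation detail for numerical stability rather than part of the paper's pseudocode):

$$
\begin{aligned}
m_t &= \beta_1\, m_{t-1} + (1 - \beta_1)\, g_t \\
u_t &= \max(\beta_2\, u_{t-1},\ |g_t|) \\
\theta_t &= \theta_{t-1} - \frac{\text{learning\_rate}}{1 - \beta_1^{\,t}} \cdot \frac{m_t}{u_t + \epsilon}
\end{aligned}
$$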

Parameters:
loss_or_grads: symbolic expression or list of expressions

A scalar loss expression, or a list of gradient expressions

params: list of shared variables

The variables to generate update expressions for

learning_rate: float

Learning rate

beta1: float

Exponential decay rate for the first moment estimates.

beta2: float

Exponential decay rate for the weighted infinity norm estimates.

epsilon: float

Constant for numerical stability.

Returns:
OrderedDict

A dictionary mapping each parameter to its update expression

Notes

The optimizer can be called without both loss_or_grads and params; in that case, a partial function is returned that accepts them later (as the second example below shows).
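
This partial form is what allows a preconfigured optimizer to be handed to routines that supply the loss and parameters themselves. The following is a hedged sketch only: it assumes pymc.fit forwards an obj_optimizer keyword to the variational objective, and the single-variable model is purely illustrative.

>>> import pymc as pm
>>> optimizer = pm.adamax(learning_rate=0.01)  # partial: hyperparameters only
>>> with pm.Model():
...     mu = pm.Normal("mu", 0.0, 1.0)
...     approx = pm.fit(n=1000, method="advi", obj_optimizer=optimizer, progressbar=False)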

References

[1]

Kingma, Diederik, and Jimmy Ba (2014): Adam: A Method for Stochastic Optimization. arXiv preprint arXiv:1412.6980.

Examples

>>> import pytensor
>>> from pymc import adamax
>>> a = pytensor.shared(1.0)
>>> b = a * 2
>>> updates = adamax(b, [a], learning_rate=0.01)
>>> isinstance(updates, dict)
True
>>> optimizer = adamax(learning_rate=0.01)
>>> callable(optimizer)
True
>>> updates = optimizer(b, [a])
>>> isinstance(updates, dict)
True
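
As a further illustrative sketch (not part of the library's documented examples), the returned update dictionary can be compiled into a pytensor function and iterated to minimize a toy objective; the quadratic loss and variable names here are assumptions chosen for illustration.

>>> import pytensor
>>> from pymc import adamax
>>> w = pytensor.shared(0.0)
>>> loss = (w - 3.0) ** 2
>>> updates = adamax(loss, [w], learning_rate=0.1)
>>> train_step = pytensor.function([], loss, updates=updates)
>>> for _ in range(500):
...     _ = train_step()
>>> bool(abs(float(w.get_value()) - 3.0) < 0.5)
True

After roughly 500 Adamax steps the shared variable should sit near the minimizer at 3.0.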