pymc.adamax
- pymc.adamax(loss_or_grads=None, params=None, learning_rate=0.002, beta1=0.9, beta2=0.999, epsilon=1e-08)
Adamax updates
Adamax updates implemented as in [1]. This is a variant of the Adam algorithm based on the infinity norm.
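As a rough sketch of the scheme described in [1] (using the parameter names above; g_t denotes the gradient of the loss with respect to a parameter theta at step t, and m_t, u_t are the internally maintained first-moment and infinity-norm estimates; the exact form, including where epsilon enters, follows the reference and may differ slightly from the implementation):

$$
\begin{aligned}
m_t &= \beta_1\, m_{t-1} + (1 - \beta_1)\, g_t \\
u_t &= \max\bigl(\beta_2\, u_{t-1},\ |g_t|\bigr) \\
\theta_t &= \theta_{t-1} - \frac{\text{learning\_rate}}{1 - \beta_1^{\,t}} \,\frac{m_t}{u_t + \epsilon}
\end{aligned}
$$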
- Parameters:
- loss_or_grads: symbolic expression or list of expressions
A scalar loss expression, or a list of gradient expressions.
- params: list of shared variables
The variables to generate update expressions for.
- learning_rate: float
Learning rate.
- beta1: float
Exponential decay rate for the first moment estimates.
- beta2: float
Exponential decay rate for the weighted infinity norm estimates.
- epsilon: float
Constant for numerical stability.
- Returns:
OrderedDict
A dictionary mapping each parameter to its update expression.
Notes
The optimizer can be called without both loss_or_grads and params, in which case a partial function is returned instead of the update dictionary.
References
[1] Kingma, Diederik, and Jimmy Ba (2014): Adam: A Method for Stochastic Optimization. arXiv preprint arXiv:1412.6980.
Examples
>>> a = pytensor.shared(1.0)
>>> b = a * 2
>>> updates = adamax(b, [a], learning_rate=0.01)
>>> isinstance(updates, dict)
True
>>> optimizer = adamax(learning_rate=0.01)
>>> callable(optimizer)
True
>>> updates = optimizer(b, [a])
>>> isinstance(updates, dict)
True
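To actually apply the updates, they are typically compiled into a pytensor function. The following is a minimal sketch of that workflow; the toy quadratic loss, the learning rate, and the number of steps are illustrative choices, not part of the API above.

import pytensor
from pymc import adamax

# Toy objective: minimize (a - 3)^2 with respect to the shared variable `a`.
a = pytensor.shared(0.0, name="a")
loss = (a - 3.0) ** 2

# adamax returns an OrderedDict mapping each shared variable to its update expression.
updates = adamax(loss, [a], learning_rate=0.1)

# Compile a step function that evaluates the loss and applies the updates in place.
step = pytensor.function([], loss, updates=updates)

for _ in range(500):
    step()

print(a.get_value())  # approaches 3.0 after enough steps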