 ``mx.opt.adadelta``
 ======================================

 Description
 ----------------------

 Create an AdaDelta optimizer with the given parameters.

 AdaDelta optimizer as described in Zeiler, M. D. (2012).
 *ADADELTA: An adaptive learning rate method.*
 http://arxiv.org/abs/1212.5701
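
 For orientation, the update rule from the cited paper can be written as
 below. Here :math:`g_t` is the gradient at step :math:`t`, :math:`x_t` the
 parameter, :math:`\rho` corresponds to ``rho`` and :math:`\epsilon` to
 ``epsilon``. This is the paper's formulation only; it does not show how
 ``wd``, ``rescale.grad`` or ``clip_gradient`` are applied.

 .. math::

    E[g^2]_t = \rho \, E[g^2]_{t-1} + (1 - \rho) \, g_t^2

    \Delta x_t = - \frac{\sqrt{E[\Delta x^2]_{t-1} + \epsilon}}{\sqrt{E[g^2]_t + \epsilon}} \, g_t

    E[\Delta x^2]_t = \rho \, E[\Delta x^2]_{t-1} + (1 - \rho) \, \Delta x_t^2

    x_{t+1} = x_t + \Delta x_t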

 Usage
 ----------

 .. code:: r

     mx.opt.adadelta(
       rho = 0.9,
       epsilon = 1e-05,
       wd = 0,
       rescale.grad = 1,
       clip_gradient = -1
     )
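
 A call with non-default settings might look like the sketch below. The
 values are illustrative rather than recommended, and ``batch.size`` and
 ``adadelta.args`` are names made up for this example. Setting
 ``rescale.grad`` to ``1 / batch.size`` assumes the gradients are summed
 over a batch. The ``requireNamespace`` guard is there only so that the
 snippet also runs where the ``mxnet`` package is not installed.

 .. code:: r

     # Illustrative settings only; tune them for your own model.
     batch.size <- 128
     adadelta.args <- list(
       rho = 0.95,                     # slower decay than the 0.9 default
       epsilon = 1e-6,
       wd = 1e-4,
       rescale.grad = 1 / batch.size,  # assumes gradients are summed per batch
       clip_gradient = 5               # clip gradient values to [-5, 5]
     )

     # Build the optimizer only when the mxnet package is available.
     if (requireNamespace("mxnet", quietly = TRUE)) {
       optimizer <- do.call(mxnet::mx.opt.adadelta, adadelta.args)
     }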

 Arguments
 ------------------

 +----------------------------------------+------------------------------------------------------------+
 | Argument                               | Description                                                |
 +========================================+============================================================+
 | ``rho``                                | float, default=0.90.                                       |
 |                                        |                                                            |
 |                                        | Decay rate for both squared gradients and delta x.         |
 +----------------------------------------+------------------------------------------------------------+
 | ``epsilon``                            | float, default=1e-5.                                       |
 |                                        |                                                            |
 |                                        | Numerical-stability constant described in the paper.       |
 +----------------------------------------+------------------------------------------------------------+
 | ``wd``                                 | float, default=0.0.                                        |
 |                                        |                                                            |
 |                                        | L2 regularization coefficient added to all the weights.    |
 +----------------------------------------+------------------------------------------------------------+
 | ``rescale.grad``                       | float, default=1.                                          |
 |                                        |                                                            |
 |                                        | Rescaling factor applied to the gradient.                  |
 +----------------------------------------+------------------------------------------------------------+
 | ``clip_gradient``                      | float, default=-1 (no clipping if < 0).                    |
 |                                        |                                                            |
 |                                        | Clip the gradient to [-clip_gradient, clip_gradient].      |
 +----------------------------------------+------------------------------------------------------------+
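
 To make the role of each argument concrete, here is a minimal plain-R
 sketch of one AdaDelta step on a numeric vector. It is not the mxnet
 implementation: ``adadelta_step`` and the ``state`` list are hypothetical
 names, and the order of rescaling, clipping and weight decay (added here
 as an L2 term on the gradient) is an assumption.

 .. code:: r

     # Hypothetical helper, not part of mxnet: one AdaDelta step.
     adadelta_step <- function(weight, grad, state,
                               rho = 0.9, epsilon = 1e-5, wd = 0,
                               rescale.grad = 1, clip_gradient = -1) {
       # Rescale the raw gradient.
       grad <- grad * rescale.grad
       # Clip only when clip_gradient is non-negative (default -1 disables it).
       if (clip_gradient >= 0) {
         grad <- pmin(pmax(grad, -clip_gradient), clip_gradient)
       }
       # Assumption: weight decay enters as an L2 term added to the gradient.
       grad <- grad + wd * weight
       # Running average of squared gradients.
       state$E.g2 <- rho * state$E.g2 + (1 - rho) * grad^2
       # Step: RMS of past updates over RMS of gradients, times the gradient.
       delta <- -sqrt(state$E.dx2 + epsilon) / sqrt(state$E.g2 + epsilon) * grad
       # Running average of squared updates (the "delta x" term).
       state$E.dx2 <- rho * state$E.dx2 + (1 - rho) * delta^2
       list(weight = weight + delta, state = state)
     }

     # Minimise f(w) = sum(w^2), whose gradient is 2 * w.
     w <- c(1, -2)
     state <- list(E.g2 = numeric(2), E.dx2 = numeric(2))
     for (i in 1:200) {
       out <- adadelta_step(w, 2 * w, state)
       w <- out$weight
       state <- out$state
     }

 Both accumulators start at zero, so the first steps are small and their
 size is set mostly by ``epsilon``. This is one reason the value of
 ``epsilon`` can matter more here than in some other optimizers.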

