Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2019/07/23 22:26:00 UTC

[GitHub] [incubator-mxnet] ChaiBapchya opened a new issue #15643: NDArray API NN Optimizer (Multi-* update category) absent in Doc

URL: https://github.com/apache/incubator-mxnet/issues/15643
 
 
   Neural network optimizer update operators such as
   `multi_mp_sgd_mom_update`, `multi_mp_sgd_update`, `multi_sgd_mom_update`, and `multi_sgd_update`
   are present in the Symbol API documentation but missing from the NDArray API documentation.
   However, the operators do exist under `mx.nd`, as checking the definition of one of them shows:
   
   ```
   >>> help(mx.nd.multi_sgd_mom_update)
   ```
   ```
   Help on function multi_sgd_mom_update:
   
   multi_sgd_mom_update(*data, **kwargs)
       Momentum update function for Stochastic Gradient Descent (SGD) optimizer.
       
       Momentum update has better convergence rates on neural networks. Mathematically it looks
       like below:
       
       .. math::
       
      v_1 = -\alpha * \nabla J(W_0)\\
         v_t = \gamma v_{t-1} - \alpha * \nabla J(W_{t-1})\\
         W_t = W_{t-1} + v_t
       
       It updates the weights using::
       
         v = momentum * v - learning_rate * gradient
         weight += v
       
       Where the parameter ``momentum`` is the decay rate of momentum estimates at each epoch.
       
       Defined in src/operator/optimizer_op.cc:L372
       
       Parameters
       ----------
       data : NDArray[]
           Weights, gradients and momentum
       lrs : tuple of <float>, required
           Learning rates.
       wds : tuple of <float>, required
           Weight decay augments the objective function with a regularization term that penalizes large weights. The penalty scales with the square of the magnitude of each weight.
       momentum : float, optional, default=0
           The decay rate of momentum estimates at each epoch.
       rescale_grad : float, optional, default=1
           Rescale gradient to grad = rescale_grad*grad.
       clip_gradient : float, optional, default=-1
           Clip gradient to the range of [-clip_gradient, clip_gradient] If clip_gradient <= 0, gradient clipping is turned off. grad = max(min(grad, clip_gradient), -clip_gradient).
    num_weights : int, optional, default='1'
        Number of updated weight arrays.
   ```
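
   One quick way to confirm that both front ends actually expose all four operators (only the generated docs differ) is:

   ```
   >>> import mxnet as mx
   >>> for name in ("multi_mp_sgd_mom_update", "multi_mp_sgd_update",
   ...              "multi_sgd_mom_update", "multi_sgd_update"):
   ...     print(name, hasattr(mx.nd, name), hasattr(mx.sym, name))
   ```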
   
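   For completeness, a minimal usage sketch of `mx.nd.multi_sgd_mom_update`. It assumes, based on the parameter descriptions above, that `data` is passed as a flat list of (weight, gradient, momentum) triples and that `out` takes the list of weights to be updated in place:

   ```
   import mxnet as mx

   # Two toy weight arrays with matching gradient and momentum buffers
   # (hypothetical shapes and values, purely for illustration).
   w1, w2 = mx.nd.ones((2, 2)), mx.nd.ones((3,))
   g1, g2 = w1 * 0.1, w2 * 0.1
   m1, m2 = mx.nd.zeros_like(w1), mx.nd.zeros_like(w2)

   # One (weight, gradient, momentum) triple per weight array, passed as
   # a single flat argument list; `out` names the weights to update in place.
   mx.nd.multi_sgd_mom_update(w1, g1, m1, w2, g2, m2,
                              lrs=(0.1, 0.1), wds=(0.0, 0.0),
                              momentum=0.9, num_weights=2,
                              out=[w1, w2])
   print(w1)
   ```

   With zero-initialized momentum and `wds=(0.0, 0.0)`, a single call should change each weight by `-lr * grad` (here `1.0 -> 0.99`).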
