Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2018/10/15 16:22:27 UTC

[GitHub] leopd commented on issue #10002: General support of OPs for second-order gradient

URL: https://github.com/apache/incubator-mxnet/issues/10002#issuecomment-429920329
 
 
   +1 to prioritizing.  I think we should pick an order that allows real applications to be built as quickly as possible.  There are lots of applications: better optimization algorithms, GAN training, RL algorithms, and neural architecture search.  Each of these requires the second derivative of every op in a network, so we should pick useful network architectures where we can get the 2nd derivative for the entire network.
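   
   For concreteness, here is a minimal sketch of the nested-differentiation pattern all of these applications reduce to, using `mx.autograd.grad` with `create_graph=True`.  The `sin` op is just a stand-in for whatever ops a real network uses; whether the second backward pass actually succeeds for a given op is exactly what this issue tracks.
   
   ```python
   import mxnet as mx
   from mxnet import nd, autograd

   x = nd.array([0.5, 1.0, 2.0])
   x.attach_grad()

   with autograd.record():
       y = nd.sin(x)
       # First derivative dy/dx = cos(x), kept in the graph so it can be
       # differentiated a second time.
       dy_dx = autograd.grad(y, [x], create_graph=True, retain_graph=True)[0]
       z = dy_dx.sum()      # scalar head for the second backward pass

   z.backward()             # second-order backward
   print(x.grad)            # expect d2y/dx2 = -sin(x)
   ```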
   
   To get something working as quickly as possible, that suggests starting with the simplest useful network architectures and then moving to progressively more complex ones, ordered by how useful/important they are.  This makes me think the order should be approximately:
   * Fully-connected feedforward networks (multi-layer perceptron, MLP)
   * CNNs.  Start with AlexNet (simple), then add ResNet (common) and similar architectures
   * RNNs.  Start with a simple stacked RNN, then add LSTM, GRU, encoder-decoder, attention, and transformer models
   * Everything else
   
   Something like that for the order of architecture types.  But I do think it makes sense to start with MLP, since that's the easiest way to get an end-to-end example working and it covers some interesting real-world use cases.  MLP also requires a pretty short list of ops.  I think it's basically the following (a sketch covering these ops follows the list):
   
   * Fully-connected layer (vector-matrix product)
   * Softmax output (most common output)
   * ReLU activation (most common, also trivial 2nd derivative)
   * Dropout (not required for MLP, but very commonly used)
   * Batch-Norm (not required for MLP, but quite useful in my experience) 
   * Anything else?
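   
   To make that op surface concrete, here is an illustrative Gluon sketch of such an MLP (layer sizes and dropout rate are arbitrary).  With `gluon.loss.SoftmaxCrossEntropyLoss` on top of the final Dense layer, it touches every op in the list above, so end-to-end second-order support for this network means each of these ops needs a differentiable backward:
   
   ```python
   from mxnet.gluon import nn

   net = nn.HybridSequential()
   net.add(
       nn.Dense(256),           # fully-connected layer (vector-matrix product)
       nn.BatchNorm(),          # batch-norm (optional for a plain MLP)
       nn.Activation('relu'),   # ReLU activation
       nn.Dropout(0.5),         # dropout (optional for a plain MLP)
       nn.Dense(10),            # output logits, fed to a softmax cross-entropy loss
   )
   net.initialize()
   ```
   
   Running `net(x)` and then applying the nested `autograd.grad` pattern from the sketch above would exercise second-order support for the whole list at once.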
   
