Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/03/01 20:55:41 UTC

[GitHub] [incubator-mxnet] RuRo opened a new pull request #17734: [MXNET-889] Implement ONNX export for gluon LSTM.

RuRo opened a new pull request #17734: [MXNET-889] Implement ONNX export for gluon LSTM.
URL: https://github.com/apache/incubator-mxnet/pull/17734
 
 
   ## Description ##
   Implements 3 new node conversion functions to facilitate the conversion of `gluon.rnn.LSTM` blocks to ONNX.
   
   1) `_zeros` (as well as `_ones` and `_full` as a bonus)
       `gluon.rnn.LSTM` uses `sym.zeros` to [initialize the initial recurrent states](https://github.com/apache/incubator-mxnet/blob/10a12d59f67ad21032a94b1721aaf9b96fddac85/python/mxnet/gluon/rnn/rnn_layer.py#L234).
    2) `_rnn_param_concat`, which is used by mxnet to [concatenate all the flattened parameters](https://github.com/apache/incubator-mxnet/blob/10a12d59f67ad21032a94b1721aaf9b96fddac85/python/mxnet/gluon/rnn/rnn_layer.py#L278) before passing them to the `RNN` node.
    3) `RNN`, which is the backend op for the actual `LSTM` computation.
   
    I've also implemented/updated a couple of helper functions in the [`_op_translations.py`](https://github.com/RuRo/incubator-mxnet/blob/master/python/mxnet/contrib/onnx/mx2onnx/_op_translations.py) file in a separate commit to reduce the amount of boilerplate.
   
   ## Checklist ##
   ### Essentials ###
   Please feel free to remove inapplicable items for your PR.
   - [x] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) created (except PRs with tiny changes)
   - [x] Changes are complete (i.e. I finished coding on this PR)
   - [ ] All changes have test coverage:
   - Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
   - Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
   - Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
   - [x] Code is well-documented: 
   - For user-facing API changes, API doc string has been updated. 
   - For new C++ functions in header files, their functionalities and arguments are documented. 
    - For new examples, README.md is added to explain what the example does, the source of the dataset, the expected performance on the test set, and a reference to the original paper if applicable
   - Check the API doc at https://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
   - [x] To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
   
   ## Tests ##
   
    I didn't add any tests for this node yet, since I am not 100% sure how exactly to do that.
    Some other PRs with ONNX exports just add the new node names to a list in [`tests/python-pytest/onnx/test_cases.py`](https://github.com/apache/incubator-mxnet/blob/10a12d59f67ad21032a94b1721aaf9b96fddac85/tests/python-pytest/onnx/test_cases.py#L96), but I don't see how this would add any actual tests without a reference ONNX implementation available.
   
    I would gladly add some unittests if someone can point me in the right direction. For now, I've verified the correctness of my conversion by
   1) Running this script:
   
       <details>
       <summary>test_rnn_export_onnx.py</summary>
   
       ```python
        #!/usr/bin/env python3
       import numpy as np
       import mxnet as mx
       import onnxruntime as rt
   
       import logging
       import shutil
       import os
       logging.getLogger().setLevel(logging.INFO)
   
   
       # Configuration
       path = 'lstm-model'
       seq_len = 10
       batch_size = 32
       input_size = 16
       hidden_size = 8
       bidirectional = True
   
   
       # Create input data
       shape = (seq_len, batch_size, input_size)
       I = mx.nd.random.randn(*shape)
   
   
       # Create LSTM network
       seq = mx.gluon.nn.HybridSequential()
       with seq.name_scope():
           seq.add(mx.gluon.rnn.LSTM(hidden_size, bidirectional=bidirectional))
       seq.initialize()
       seq(I)
   
   
       # Init all parameters with random values
       for k, p in seq.collect_params().items():
           p.data()[...] = np.random.randn(*p.data().shape)
       seq.hybridize(static_alloc=True, static_shape=True)
       res_mx = seq(I)
       res_mx = res_mx.asnumpy()
   
   
       # Export to symbol and then to ONNX and forward with onnxruntime
       seq.export(path)
       mx.contrib.onnx.export_model(
           f'{path}-symbol.json',
           f'{path}-0000.params',
           [shape], np.float32,
           onnx_file_path=f'{path}.onnx',
           verbose=True
       )
   
   
       # Forward, using onnxruntime
       sess = rt.InferenceSession(f'{path}.onnx')
       res_onnx, = sess.run(None, {'data': I.asnumpy()})
   
   
       # Check results
       diff = np.abs(res_mx - res_onnx)
       print(f"{diff.mean()=}")
       print(f"{diff.max()=}")
       np.testing.assert_almost_equal(res_mx, res_onnx, decimal=4)
       # decimal=4 is the same as mxnet unittests for other ONNX ops
   
   
       # Clean up files
       os.remove(f"{path}-symbol.json")
       os.remove(f"{path}-0000.params")
       os.remove(f"{path}.onnx")
       ```
   
       </details>
   
        This script generates a model with an `LSTM` node, fills all the parameters with random data, exports it to a `symbol.json` and `0000.params`, and then converts the model to `ONNX`. It then loads the converted model with `onnxruntime`, forwards the same random input through both the `mxnet` and the `onnx` models, and asserts that the results are close.
   
    2) Using this conversion on a CRNN network I trained at my job and verifying that the prediction results are the same on a subset of our test set. (Unfortunately, I can't give too many details about this model, so you'll have to trust me here.)
   
   ## Comments ##
   There are a few details in my current implementation that should be noted:
   1) `_zeros`
        - It seems that `mx.sym` can broadcast a `_zeros/_ones/_full` node along an axis with dimension 0 as if it were a 1, while `ONNX` can't do that.
           ```python
           import mxnet as mx
           a = mx.sym.zeros((10, 0, 30))
           b = mx.sym.Variable('b')
           (a * b).eval(b=mx.nd.zeros((10, 20, 30)))[0].shape
           ```
           Currently, my implementation converts all the 0 dims to 1s. Is this correct?
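            For illustration, the dim fix itself is a one-liner (a sketch; `fix_broadcast_shape` is a hypothetical name, not the actual helper from this PR):

            ```python
            def fix_broadcast_shape(shape):
                """Replace 0 dims (broadcastable in mx.sym) with 1s, since ONNX has no 0-means-broadcast convention."""
                return tuple(1 if dim == 0 else dim for dim in shape)

            assert fix_broadcast_shape((10, 0, 30)) == (10, 1, 30)
            assert fix_broadcast_shape((10, 20, 30)) == (10, 20, 30)
            ```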
        - The version of `ONNX` that mxnet uses doesn't support some newer nodes like `ConstantOfShape`, and I was not able to find a working onnx implementation that actually supported the currently deprecated `ConstantFill` op, so I had to store the `_zeros` data in a tensor, which ends up being stored in the onnx model file.<br/>
        This also applies to the helper functions and the other nodes. I've had to settle for some "less than elegant" solutions in some cases because of the limited opset support. I might also have missed a better way to do some operations, so if you have better alternatives, I would gladly change my implementation.<br/>
   
   2) `_rnn_param_concat`
        You can read more about my implementation in the code comments, but the general outline is that since `ONNX` expects the `RNN` parameters prepared in a completely different way than mxnet, I've had to employ some "dark" magic, involving `convert_rnn_param_concat` generating multiple nodes and `convert_RNN` 'guessing' these node names using regular expressions.<br/>
        As a result, the current conversion mechanism only works if the parameters of the RNN block are concatenated using `_rnn_param_concat` (and not, for example, `Concat`). If the conversion of `mx.sym.RNN` (without `gluon`) is a widely requested feature, we would have to convert the `_rnn_param_concat` node to a regular `Concat` and then split the parameters back apart in the `RNN` conversion, which is not very nice IMO.<br/>
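        To illustrate the layout mismatch, here is a rough numpy sketch of the kind of gate reordering involved. The gate orders below (MXNet stacking the LSTM gates as `i, f, g, o`, ONNX `LSTM` expecting `i, o, f, c`) and the helper name are my assumptions for illustration, not the exact code from this PR:

        ```python
        import numpy as np

        MX_GATES = ('i', 'f', 'g', 'o')    # assumed MXNet/cuDNN stacking order
        ONNX_GATES = ('i', 'o', 'f', 'g')  # ONNX order (it calls the cell candidate gate 'c')

        def mx_to_onnx_gates(w, hidden_size):
            # w stacks the four gate blocks of size hidden_size along axis 0
            blocks = {g: w[k * hidden_size:(k + 1) * hidden_size]
                      for k, g in enumerate(MX_GATES)}
            return np.concatenate([blocks[g] for g in ONNX_GATES], axis=0)

        H, I = 8, 16
        w_i2h = np.arange(4 * H * I, dtype=np.float32).reshape(4 * H, I)
        w_onnx = mx_to_onnx_gates(w_i2h, H)
        assert w_onnx.shape == (4 * H, I)
        # the 'i' block stays put, the 'o' block moves from position 3 to position 1:
        assert np.array_equal(w_onnx[:H], w_i2h[:H])
        assert np.array_equal(w_onnx[H:2 * H], w_i2h[3 * H:4 * H])
        ```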
   
   3) `RNN`
        Even though this PR implements the conversion of the `RNN` node, currently only the `LSTM` mode is supported, and many of the "customization" options are also not implemented.
       **Implemented:**
       - `mode=LSTM`
       - `layout='TNC'`
       - `bidirectional=True` and `False`
       - `state_clip_min` and `state_clip_max` (but only if they are the same value)
       
       **Not Implemented**, but probably supported by ONNX (easy to implement):
       - `mode=GRU, RNN_relu, RNN_tanh`
       - `use_sequence_length`
       - custom `states` inputs and `states` outputs
       
       **Not Implemented**, and probably **not** supported by ONNX (hard to implement):
       - `state_clip_nan`
       - `state_clip_min != state_clip_max`
       - `dropout`
       - `projection_size`
       - `num_layers`
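    
    Given the option lists above, the converter has to reject unsupported configurations up front. A hypothetical sketch of such a guard (the attribute names, spellings, and defaults here are assumptions for illustration, not this PR's exact code):
    
    ```python
    def check_rnn_attrs(attrs):
        """Reject RNN configurations this exporter does not handle (illustrative only)."""
        if attrs.get('mode') != 'lstm':
            raise NotImplementedError("only mode='lstm' is supported")
        if attrs.get('state_clip_min') != attrs.get('state_clip_max'):
            raise NotImplementedError("state_clip_min != state_clip_max is not supported")
        if float(attrs.get('p', 0)) != 0:
            raise NotImplementedError("dropout is not supported")
        if int(attrs.get('num_layers', 1)) != 1:
            raise NotImplementedError("num_layers > 1 is not supported")

    check_rnn_attrs({'mode': 'lstm', 'num_layers': 1})  # passes silently
    ```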

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] RuRo commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.

Posted by GitBox <gi...@apache.org>.
RuRo commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.
URL: https://github.com/apache/incubator-mxnet/pull/17734#issuecomment-596992233
 
 
   @leezu @szha any feedback?


[GitHub] [incubator-mxnet] RuRo commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.

Posted by GitBox <gi...@apache.org>.
RuRo commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.
URL: https://github.com/apache/incubator-mxnet/pull/17734#issuecomment-599426988
 
 
   Nice. All green now.


[GitHub] [incubator-mxnet] RuRo commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.

Posted by GitBox <gi...@apache.org>.
RuRo commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.
URL: https://github.com/apache/incubator-mxnet/pull/17734#issuecomment-598750693
 
 
   @anirudhacharya Done


[GitHub] [incubator-mxnet] RuRo edited a comment on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.

Posted by GitBox <gi...@apache.org>.
RuRo edited a comment on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.
URL: https://github.com/apache/incubator-mxnet/pull/17734#issuecomment-597648893
 
 
   @anirudhacharya I haven't implemented any unittests yet, but the code **is** tested. I've provided an example test case in my original post, using ONNXRuntime as a reference implementation.
   
   I've looked at how the unittests are implemented for other operators. The `backend_test.py` export test won't work in this case.
   
    1) The current backend unittests don't actually work for nodes that can **only** be exported from MXNet to ONNX. That is because the current 'strategy' for checking exports is to [first import ONNX -> MXNet and then re-export it back to ONNX](https://github.com/apache/incubator-mxnet/blob/713d962359cfe5b014ca24c0f9b3676d86b28569/tests/python-pytest/onnx/backend.py#L104). Since we don't have an implementation for importing LSTMs from ONNX to MXNet, the *export*-only tests will fail because the *import* for the LSTM operator is not implemented. Implementing an ONNX->MXNet conversion for the LSTM node is a non-trivial task.
    2) This PR only implements a limited subset of the ONNX-supported LSTM configurations and only a limited subset of the MXNet-supported RNN configurations. And currently, this conversion only supports Gluon LSTM (not arbitrary MXNet RNNs), so even if we could import the LSTM node from ONNX, the current implementation would fail on unsupported configurations. See the "Comments: RNN" section in my original post.
   
    I could add a test case to [test_node.py export_test_cases](https://github.com/apache/incubator-mxnet/blob/713d962359cfe5b014ca24c0f9b3676d86b28569/tests/python-pytest/onnx/test_node.py#L290), but this test only verifies that a valid onnx graph is exported, not that the conversion is correct. Any kind of test that involves actually forwarding the LSTM node will **require** implementing the import from ONNX (to use the official test) or using some reference implementation (onnxruntime or some other alternative).
   
   Btw, the "reference" ONNX implementation used in the tests doesn't even support [bidirectional](https://github.com/onnx/onnx/blob/8d15705ec29c2867847d1a8affad9fe19dedb2e3/onnx/backend/test/case/node/lstm.py#L54) LSTMs.
   
    Alternatively, if you are willing to pull in onnxruntime or some other ONNX implementation as a test-time dependency, then we could just use the test case I provided in my original post.
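    
    For reference, the core of a numpy-based reference implementation would be a single LSTM step along these lines (a minimal sketch with the default sigmoid/tanh activations; the `(i, f, g, o)` gate stacking order here is an assumption):
    
    ```python
    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def lstm_step(x, h, c, w_x, w_h, b):
        """One LSTM time step; gate blocks stacked in (i, f, g, o) order along axis 0."""
        n = h.shape[-1]
        z = x @ w_x.T + h @ w_h.T + b        # (batch, 4 * n)
        i = sigmoid(z[:, 0 * n:1 * n])       # input gate
        f = sigmoid(z[:, 1 * n:2 * n])       # forget gate
        g = np.tanh(z[:, 2 * n:3 * n])       # cell candidate
        o = sigmoid(z[:, 3 * n:4 * n])       # output gate
        c_new = f * c + i * g
        h_new = o * np.tanh(c_new)
        return h_new, c_new

    batch, n_in, n_hid = 2, 3, 4
    rng = np.random.default_rng(0)
    x = rng.standard_normal((batch, n_in))
    h = np.zeros((batch, n_hid))
    c = np.zeros((batch, n_hid))
    w_x = rng.standard_normal((4 * n_hid, n_in))
    w_h = rng.standard_normal((4 * n_hid, n_hid))
    b = np.zeros(4 * n_hid)
    h, c = lstm_step(x, h, c, w_x, w_h, b)
    assert h.shape == (batch, n_hid) and c.shape == (batch, n_hid)
    assert np.all(np.abs(h) < 1.0)  # h = o * tanh(c), so it is strictly bounded by 1
    ```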


[GitHub] [incubator-mxnet] mxnet-bot commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.

Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.
URL: https://github.com/apache/incubator-mxnet/pull/17734#issuecomment-609007544
 
 
   Jenkins CI successfully triggered : [unix-gpu, unix-cpu]




[GitHub] [incubator-mxnet] RuRo removed a comment on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.

Posted by GitBox <gi...@apache.org>.
RuRo removed a comment on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.
URL: https://github.com/apache/incubator-mxnet/pull/17734#issuecomment-609469361
 
 
   @mxnet-bot run ci [unix-gpu]


[GitHub] [incubator-mxnet] RuRo commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.

Posted by GitBox <gi...@apache.org>.
RuRo commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.
URL: https://github.com/apache/incubator-mxnet/pull/17734#issuecomment-609469361
 
 
   @mxnet-bot run ci [unix-gpu]



[GitHub] [incubator-mxnet] RuRo commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.

Posted by GitBox <gi...@apache.org>.
RuRo commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.
URL: https://github.com/apache/incubator-mxnet/pull/17734#issuecomment-609038723
 
 
   @mxnet-bot run ci [unix-gpu]


[GitHub] [incubator-mxnet] RuRo edited a comment on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.

Posted by GitBox <gi...@apache.org>.
RuRo edited a comment on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.
URL: https://github.com/apache/incubator-mxnet/pull/17734#issuecomment-601585057
 
 
   @szha @Roshrini can you please review this PR, or maybe nominate somebody else if you're currently busy? Thanks.
   
   Edit: my bad, misclicked on anirudhacharya's name.


[GitHub] [incubator-mxnet] mxnet-bot commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.

Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.
URL: https://github.com/apache/incubator-mxnet/pull/17734#issuecomment-609469374
 
 
   Jenkins CI successfully triggered : [unix-gpu]


[GitHub] [incubator-mxnet] RuRo commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.

Posted by GitBox <gi...@apache.org>.
RuRo commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.
URL: https://github.com/apache/incubator-mxnet/pull/17734#issuecomment-599187828
 
 
   @leezu did the restart not work or did it just fail again?


[GitHub] [incubator-mxnet] leezu edited a comment on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.

Posted by GitBox <gi...@apache.org>.
leezu edited a comment on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.
URL: https://github.com/apache/incubator-mxnet/pull/17734#issuecomment-593521923
 
 
   Unfortunately there are issues with CI. I restarted the two failing pipelines.


[GitHub] [incubator-mxnet] leezu commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.

Posted by GitBox <gi...@apache.org>.
leezu commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.
URL: https://github.com/apache/incubator-mxnet/pull/17734#issuecomment-609910148
 
 
   Jenkins CI successfully triggered : [centos-gpu]



[GitHub] [incubator-mxnet] RuRo edited a comment on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.

Posted by GitBox <gi...@apache.org>.
RuRo edited a comment on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.
URL: https://github.com/apache/incubator-mxnet/pull/17734#issuecomment-593346550
 
 
   Some of the builds seem to be spuriously failing with weird errors like `Could not connect to mxnetlinux-cpu_.... to send interrupt signal to process` or `requests.exceptions.ReadTimeout: UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)`. Looks like networking issues? Is there something wrong with the current build pipeline or is this somehow my fault?


[GitHub] [incubator-mxnet] szha commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.

Posted by GitBox <gi...@apache.org>.
szha commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.
URL: https://github.com/apache/incubator-mxnet/pull/17734#issuecomment-599238486
 
 
   @RuRo It failed again and I restarted the failed pipeline just now. Thanks for the contribution.
   
   For the CI, we will coordinate efforts to improve its status soon.


[GitHub] [incubator-mxnet] RuRo commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.

Posted by GitBox <gi...@apache.org>.
RuRo commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.
URL: https://github.com/apache/incubator-mxnet/pull/17734#issuecomment-598346865
 
 
   The CI seems to be borked again btw 😩


[GitHub] [incubator-mxnet] leezu commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.

Posted by GitBox <gi...@apache.org>.
leezu commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.
URL: https://github.com/apache/incubator-mxnet/pull/17734#issuecomment-598870617
 
 
   thank you. restarted unix-gpu


RuRo commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.
URL: https://github.com/apache/incubator-mxnet/pull/17734#issuecomment-601585057
 
 
   @szha @Roshrini Could you please review this PR, or nominate somebody else if you're currently busy? Thanks


RuRo commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.
URL: https://github.com/apache/incubator-mxnet/pull/17734#issuecomment-609007533
 
 
   @mxnet-bot run ci [unix-cpu, unix-gpu]


mxnet-bot commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.
URL: https://github.com/apache/incubator-mxnet/pull/17734#issuecomment-609038733
 
 
   Jenkins CI successfully triggered : [unix-gpu]


RuRo removed a comment on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.
URL: https://github.com/apache/incubator-mxnet/pull/17734#issuecomment-609038723
 
 
   @mxnet-bot run ci [unix-gpu]


RuRo commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.
URL: https://github.com/apache/incubator-mxnet/pull/17734#issuecomment-612088280
 
 
   @szha @Roshrini PTAL


RuRo removed a comment on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.
URL: https://github.com/apache/incubator-mxnet/pull/17734#issuecomment-609007533
 
 
   @mxnet-bot run ci [unix-cpu, unix-gpu]


anirudhacharya commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.
URL: https://github.com/apache/incubator-mxnet/pull/17734#issuecomment-598573803
 
 
   > I could add a test case to [test_node.py export_test_cases](https://github.com/apache/incubator-mxnet/blob/713d962359cfe5b014ca24c0f9b3676d86b28569/tests/python-pytest/onnx/test_node.py#L290), but this test only verifies that a valid ONNX graph is exported, not that the conversion is correct
   
   Let's do this for now. We can use the tests from the ONNX repo once the LSTM import is also implemented.


RuRo commented on issue #17734: [MXNET-889] Implement ONNX export for gluon LSTM.
URL: https://github.com/apache/incubator-mxnet/pull/17734#issuecomment-597648893
 
 
   @anirudhacharya I haven't implemented any unit tests yet, but the code **is** tested. I've provided an example test case in my original post, using ONNXRuntime as a reference implementation.
   
   I've looked at how the unit tests are implemented for other operators. The `backend_test.py` export test won't work in this case.
   
   1) The current backend unit tests don't actually work for nodes that can **only** be exported from MXNet to ONNX. That is because the current 'strategy' for checking exports is to [first import ONNX -> MXNet and then re-export it back to ONNX](https://github.com/apache/incubator-mxnet/blob/713d962359cfe5b014ca24c0f9b3676d86b28569/tests/python-pytest/onnx/backend.py#L104). Since we don't have an implementation for importing LSTMs from ONNX to MXNet, the *export*-only tests will fail because the *import* for the LSTM operator is not implemented. Implementing an ONNX->MXNet conversion for the LSTM node is a non-trivial task.
   2) This PR only implements a limited subset of the ONNX-supported LSTM configurations and only a limited subset of MXNet-supported RNN configurations. Currently, this conversion only supports Gluon LSTM (not arbitrary MXNet RNNs), so even if we could import the LSTM node from ONNX, the current implementation would fail on unsupported configurations. See the "Comments:RNN" section in my original post.
   
   I could add a test case to [test_node.py export_test_cases](https://github.com/apache/incubator-mxnet/blob/713d962359cfe5b014ca24c0f9b3676d86b28569/tests/python-pytest/onnx/test_node.py#L290), but this test only verifies that a valid ONNX graph is exported, not that the conversion is correct. Any test that actually forwards the LSTM node will **require** either implementing the import from ONNX (to use the official tests) or using some reference implementation (an LSTM written in numpy, onnxruntime, etc.).
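   For reference, the "LSTM implementation written in numpy" option mentioned above can be quite small. The sketch below is illustrative only (the `lstm_step` helper and its shapes are hypothetical, not part of this PR) and follows the ONNX `LSTM` gate ordering (i, o, f, c) for a single time step:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, R, b):
    """One LSTM time step in the ONNX gate ordering (i, o, f, c).

    x:      (input_size,)            input at this time step
    h_prev: (hidden_size,)           previous hidden state
    c_prev: (hidden_size,)           previous cell state
    W:      (4*hidden_size, input_size)   input weights, stacked [Wi; Wo; Wf; Wc]
    R:      (4*hidden_size, hidden_size)  recurrent weights, same stacking
    b:      (4*hidden_size,)              combined bias Wb + Rb
    """
    hidden = h_prev.shape[0]
    # All four gate pre-activations in one pass, then split by gate.
    gates = W @ x + R @ h_prev + b
    i, o, f, c = (gates[k * hidden:(k + 1) * hidden] for k in range(4))
    i, o, f = sigmoid(i), sigmoid(o), sigmoid(f)
    c_new = f * c_prev + i * np.tanh(c)   # cell state update
    h_new = o * np.tanh(c_new)            # hidden state output
    return h_new, c_new
```

   Comparing the exported graph's outputs (e.g. run through onnxruntime) against such a reference on random inputs would check numerical correctness without requiring an ONNX->MXNet import path.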


leezu commented on pull request #17734:
URL: https://github.com/apache/incubator-mxnet/pull/17734#issuecomment-646904365


   There was an undetected conflict, which broke the tests on at least macOS: https://github.com/apache/incubator-mxnet/issues/18596


RuRo commented on pull request #17734:
URL: https://github.com/apache/incubator-mxnet/pull/17734#issuecomment-646593314


   > Feel free to ping me or other committers if a PR that's passing CI check needs attention.
   
   @szha ping )
   
   The CI is passing except for `codecov/project`, which I think claims that the coverage decreased in the following files (?)
   ```
   Changes in src/operator/tensor/dot-inl.h    -2 → +2
   Changes in src/io/iter_image_recordio_2.cc  -1 → +1
   ```
   This PR doesn't change anything in `src`, so it's not obvious to me what is wrong here.
   
   I am not too familiar with how `codecov` works, so I may be wrong, but I think this is a false positive.


leezu commented on pull request #17734:
URL: https://github.com/apache/incubator-mxnet/pull/17734#issuecomment-646704145


   @RuRo codecov is mistakenly displayed on a couple of PRs. Due to CI instability, the codecov report is unreliable and you can ignore it. It won't show up on new PRs.
   


szha merged pull request #17734:
URL: https://github.com/apache/incubator-mxnet/pull/17734


   

