Posted to issues@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/11/12 12:07:18 UTC

[GitHub] [incubator-mxnet] AlexanderSerov opened a new issue #19521: [RFC] Integration with AndroidNN

AlexanderSerov opened a new issue #19521:
URL: https://github.com/apache/incubator-mxnet/issues/19521


   ## Problem statement
   Our team uses MXNet for both training and inference. Recently we needed to run inference on Android devices, so we compiled MXNet with the Android NDK and it works fine. Now we want to accelerate inference on mobile devices using the [Android NN API](https://developer.android.com/ndk/guides/neuralnetworks), which Android supports since version 8.1. This API serves as a common interface to hardware GPU/accelerator drivers and exposes functionality in the form of operators (ANEURALNETWORKS_CONV_2D, ANEURALNETWORKS_AVERAGE_POOL_2D, ...).
   
   ## Proposed solutions
   My task is to implement a proxy between MXNet and Android NN using the subgraph API, and I am already about halfway there: I have implemented the selector, the subgraph property, operator registration, and the addition of the major operators to an Android NN model based on the partitioned graph. The design is similar to the TensorRT subgraph integration, but we don't use ONNX as an intermediate format. So the question is: is it wise to implement a subgraph backend for running inference on mobile devices using a framework that was not originally intended for mobile inference? The MXNet binary in the APK is about 150 MB, which is pretty heavy (I use MXNet 1.7). Will there be a lightweight version of MXNet in the future, like TFLite for TensorFlow? Also, any suggestions and thoughts about a more appropriate solution to our problem are welcome!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@mxnet.apache.org
For additional commands, e-mail: issues-help@mxnet.apache.org


[GitHub] [incubator-mxnet] samskalicky commented on issue #19521: [RFC] Integration with AndroidNN

Posted by GitBox <gi...@apache.org>.
samskalicky commented on issue #19521:
URL: https://github.com/apache/incubator-mxnet/issues/19521#issuecomment-726285689


   I did a quick set of builds from v1.x using make with different flags on x86 and looked at the size of libmxnet.so. Each subsequent row adds new build flags to the previous. 
   
   Build Flags | libmxnet.so size [bytes]
   ------------ | -------------
   None | 168573168
   +USE_MKLDNN=0 | 131479368
   +USE_INTGEMM=0 | 131251704
   +USE_INT64_TENSOR_SIZE=0 | 131251704
   +USE_CPP_PACKAGE=0 USE_DIST_KVSTORE=0 | 131251704
   +USE_OPENCV=0 | 130949000
   +USE_TVM_OP=0 | 130949000
   +USE_NNPACK=0 | 130949000




[GitHub] [incubator-mxnet] samskalicky commented on issue #19521: [RFC] Integration with AndroidNN

Posted by GitBox <gi...@apache.org>.
samskalicky commented on issue #19521:
URL: https://github.com/apache/incubator-mxnet/issues/19521#issuecomment-751914619


   Hi @AlexanderSerov great question, let me try and break it down.
   
   > In our design of the AndroidNN backend we reached a situation where we need to pass devices to the backend. Usually other backends (MKL, TensorRT) get a device through the Context. The problem is that Context supports a limited list of devices (CPU, GPU). AndroidNN, on the other hand, supports a different set of devices (CPU, GPU, NPU, ...) with Android-specific indexes acquired via the ANeuralNetworks_getDevice API. 
   
   Contexts in MXNet are chosen by users with the expectation that they can run their whole model with any particular context. The MXNet community has worked hard to maintain parity between CPU and GPU contexts so that the majority of models can be executed on either context successfully. A Context in MXNet needs to be able to support executing all currently supported operators. 
   
   How operators are implemented (custom C++, BLAS libraries, or custom NN libraries like MKL) is not visible at the context level; these are build-time configurations. For example, an MXNet user will use the same CPU context whether they use MKL or OpenBLAS as the BLAS library, and whether or not they choose MKLDNN/oneDNN. We consider this a huge usability feature in MXNet, rather than having many contexts to enable each feature. Most users will find the build config that works best for them and stick to it. A single build with all features enabled is not what most users want; inevitably they end up trying to minimize disk/device memory usage and reduce the size of the MXNet binary (as in our discussion above). 
   
   > So we need custom context and we have choices:
   > 1. Modify the existing Context by adding additional fields and defining a preprocessor flag MXNET_USE_ANDROIDNN in CMake, so that a user who passes the USE_ANDROIDNN option to CMake gets the custom context. This option is motivated by the fact that if there is already a structure for passing devices, we should use it. Previous backends were comfortable with the provided set of devices; now it's time to add support for new devices.
   
   In general, adding new build flags to enable custom backends in MXNet is acceptable, especially when the flag would only be enabled for a particular platform (i.e. ARM or Android). Adding support for new backends that would be generally applicable to all CPU types requires much more careful consideration and testing.
   
   > 2. The second option is to pass all custom options, including the device name and id, through the MXOptimizeForBackend API, whose options_map was designed for passing custom options to a backend; we can use it to pass all the custom info required. Then, when partitioning the graph, add the custom device to each subgraph as a node attribute. The backend later creates a model for the device based on this attribute.
   
   Amazon is using this second option for its implementations of custom backends for the Inferentia and Elastic Inference Accelerator (EIA) devices, and also for its integration of TVM compilation for MXNet models in the SageMaker Neo service. This is the preferable way to get started. It will let you build your custom backend, easily stay current with MXNet versions (upgrading between versions with MXNet extensions is far easier than upgrading a custom fork of the whole MXNet codebase), and simplify distributing your custom backend with MXNet. 
   
   The MXNet community is currently reconsidering how MXNet is architected, for a variety of reasons (licensing issues, maintainability, etc.), and looking to make the codebase more modular. So having a modular backend future-proofs your efforts as well. And it doesn't limit your contribution either: if your custom backend becomes popular, you can always start a discussion about making it a default part of the MXNet codebase in the future. 
   




[GitHub] [incubator-mxnet] ptrendx commented on issue #19521: [RFC] Integration with AndroidNN

Posted by GitBox <gi...@apache.org>.
ptrendx commented on issue #19521:
URL: https://github.com/apache/incubator-mxnet/issues/19521#issuecomment-728284228


   I did not do any experiments, but I would be very surprised if it were not at least 50%: you only have a few very small functions for all the infer passes, while you need at least 8 variants for `MSHADOW_TYPE_SWITCH`, and on the NumPy side in particular there are functions that nest those switches (so 64+ variants), not to mention that those functions tend to be larger.




[GitHub] [incubator-mxnet] marcoabreu commented on issue #19521: [RFC] Integration with AndroidNN

Posted by GitBox <gi...@apache.org>.
marcoabreu commented on issue #19521:
URL: https://github.com/apache/incubator-mxnet/issues/19521#issuecomment-727686426


   How about we just build and load the ops externally, so people can delete the ones they don't want, based on your external operator system? Then we no longer embed them into the main .so file but ship them alongside it.






[GitHub] [incubator-mxnet] samskalicky commented on issue #19521: [RFC] Integration with AndroidNN

Posted by GitBox <gi...@apache.org>.
samskalicky commented on issue #19521:
URL: https://github.com/apache/incubator-mxnet/issues/19521#issuecomment-727653641


   Long term, what would we want to do to exclude ops from the build? Would we want to do something like this:
   https://github.com/samskalicky/incubator-mxnet/commit/f2184ceab711bf1081165d6e0c5dbca958111dae
   where we set a flag like `__EXCLUDE_ALL_OPS__` and then set flags specifically for the ops we want to include, like `__INCLUDE_OP_NORM__`?




[GitHub] [incubator-mxnet] AlexanderSerov commented on issue #19521: [RFC] Integration with AndroidNN

Posted by GitBox <gi...@apache.org>.
AlexanderSerov commented on issue #19521:
URL: https://github.com/apache/incubator-mxnet/issues/19521#issuecomment-751628566


   Hi folks. I have a question for the MXNet team related to the previous discussion. In our design of the AndroidNN backend we reached a situation where we need to pass devices to the backend. Usually other backends (MKL, TensorRT) get a device through the Context. The problem is that Context supports a limited list of devices (CPU, GPU). AndroidNN, on the other hand, supports a different set of devices (CPU, GPU, NPU, ...) with Android-specific indexes acquired via the ANeuralNetworks_getDevice API. So we need a custom context, and we have two choices:
   
   1. Modify the existing Context by adding additional fields and defining a preprocessor flag MXNET_USE_ANDROIDNN in CMake, so that a user who passes the USE_ANDROIDNN option to CMake gets the custom context. This option is motivated by the fact that if there is already a structure for passing devices, we should use it. Previous backends were comfortable with the provided set of devices; now it's time to add support for new devices.
   
   2. Pass all custom options, including the device name and id, through the MXOptimizeForBackend API, whose options_map was designed for passing custom options to a backend; we can use it to pass all the custom info required. Then, when partitioning the graph, add the custom device to each subgraph as a node attribute. The backend later creates a model for the device based on this attribute.
   
   Thank you for your response!




[GitHub] [incubator-mxnet] samskalicky commented on issue #19521: [RFC] Integration with AndroidNN

Posted by GitBox <gi...@apache.org>.
samskalicky commented on issue #19521:
URL: https://github.com/apache/incubator-mxnet/issues/19521#issuecomment-726322649


   Using this set of build flags I removed different sets of ops in v1.x on x86 and measured the libmxnet.so size. Each subsequent row removes additional ops from the previous.
   
   `make USE_MKLDNN=0 USE_INTGEMM=0 USE_INT64_TENSOR_SIZE=0 USE_DIST_KVSTORE=0 USE_CPP_PACKAGE=0 USE_OPENCV=0 USE_TVM_OP=0 USE_NNPACK=0 -j`
   
   Ops removed | libmxnet.so size [bytes]
   ------------ | -------------
   top-level src/operator | 122415272
   quantization | 121641000
   image | 121023728
   numpy | 77072424
   fusion | 77031648
   tvmop | 77031608
   nnpack | 77031568
   custom | 76693704
   
   Trying to remove any more from `nn` or `tensor` is a fluster cluck; all those ops are used all over the place in other MXNet sources. 




[GitHub] [incubator-mxnet] samskalicky commented on issue #19521: [RFC] Integration with AndroidNN

Posted by GitBox <gi...@apache.org>.
samskalicky commented on issue #19521:
URL: https://github.com/apache/incubator-mxnet/issues/19521#issuecomment-728235060


   How much of the binary size comes from FCompute and associated functions, versus the other operator functions (attribute parsing, inputs/outputs, shape/type/storageType inference, etc.)? Any guesses?






[GitHub] [incubator-mxnet] github-actions[bot] commented on issue #19521: [RFC] Integration with AndroidNN

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #19521:
URL: https://github.com/apache/incubator-mxnet/issues/19521#issuecomment-726038617


   Welcome to Apache MXNet (incubating)! We are on a mission to democratize AI, and we are glad that you are contributing to it by opening this issue.
   Please make sure to include all the relevant context, and one of the @apache/mxnet-committers will be here shortly.
   If you are interested in contributing to our project, let us know! Also, be sure to check out our guide on [contributing to MXNet](https://mxnet.apache.org/community/contribute) and our [development guides wiki](https://cwiki.apache.org/confluence/display/MXNET/Developments).




[GitHub] [incubator-mxnet] dmitry-markeshov commented on issue #19521: [RFC] Integration with AndroidNN

Posted by GitBox <gi...@apache.org>.
dmitry-markeshov commented on issue #19521:
URL: https://github.com/apache/incubator-mxnet/issues/19521#issuecomment-726290390


   Hi, we've built MXNet excluding the unused operators and got it down to ~20 MB. But the main objective is performance: we believe AndroidNN will let us run MXNet models on the GPU. 
   The size should be minimal as well, since a lot of operators are just translated into AndroidNN calls.




[GitHub] [incubator-mxnet] samskalicky commented on issue #19521: [RFC] Integration with AndroidNN

Posted by GitBox <gi...@apache.org>.
samskalicky commented on issue #19521:
URL: https://github.com/apache/incubator-mxnet/issues/19521#issuecomment-726325750


   Disentangling MXNet ops would be a good refactoring effort, but it would be a lot of work. We may have to do it anyway to satisfy the licensing issue with Apache/NVIDIA, so it might be worth doing. But, as @leezu pointed out, no-one is currently working on this.  




[GitHub] [incubator-mxnet] samskalicky commented on issue #19521: [RFC] Integration with AndroidNN

Posted by GitBox <gi...@apache.org>.
samskalicky commented on issue #19521:
URL: https://github.com/apache/incubator-mxnet/issues/19521#issuecomment-727629237


   @dmitry-markeshov @AlexanderSerov the other thing you can do is run your subgraphing pass on x86 and remove the operators that will be executed by your custom backend. Then, when you load the optimized model on Android, you don't need those operators compiled into the MXNet build.




[GitHub] [incubator-mxnet] ptrendx commented on issue #19521: [RFC] Integration with AndroidNN

Posted by GitBox <gi...@apache.org>.
ptrendx commented on issue #19521:
URL: https://github.com/apache/incubator-mxnet/issues/19521#issuecomment-728182152


   Yes, currently the structure we have is
    - `operator_name.cc` which contains operator definition (+ all the infershape/type etc.) and `FCompute<cpu>`
    - `operator_name.cu` which contains just `FCompute<gpu>`
   
   We should change that to something like:
    - `src/operator/operator_name.cc` which contains all the device independent operator definition
    - `src/operator_impl/cpu/operator_name.cc` which contains just `FCompute<cpu>`
    - `src/operator_impl/cuda/operator_name.cu` which contains just `FCompute<gpu>`
   
   This would make it possible for a subgraph backend to replace whatever it needs, since all the operator definitions would still exist. And I agree: together with the external ops functionality, we could make it so `libmxnet.so` contains just the operator definitions, while separate `.so` files contain the actual implementations for different platforms. 




[GitHub] [incubator-mxnet] leezu commented on issue #19521: [RFC] Integration with AndroidNN

Posted by GitBox <gi...@apache.org>.
leezu commented on issue #19521:
URL: https://github.com/apache/incubator-mxnet/issues/19521#issuecomment-726249168


   > The MXNet binary in the APK is about 150 MB, which is pretty heavy (I use MXNet 1.7). Will there be a lightweight version of MXNet in the future, like TFLite for TensorFlow?
   
   It would be nice to have a lightweight version, but no-one is working on it currently. One workaround is to manually delete the operator implementation files that you don't need.

