Posted to commits@mxnet.apache.org by ns...@apache.org on 2017/08/12 01:18:03 UTC

[incubator-mxnet] branch master updated: MXNet -> Apple CoreML converter (#7438)

This is an automated email from the ASF dual-hosted git repository.

nswamy pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
     new 16c8f96  MXNet -> Apple CoreML converter (#7438)
16c8f96 is described below

commit 16c8f96e7cb961b6a7b0a25e780906098e8d9bff
Author: Pracheer Gupta <pr...@hotmail.com>
AuthorDate: Fri Aug 11 18:18:00 2017 -0700

    MXNet -> Apple CoreML converter (#7438)
    
    * added coreml test converter for mxnet
    
    * added prioritized todo items
    
    * Updating parameters for 0.4.0 coreml.
    
    This mainly required re-arranging the existing parameters.
    The current test output looks like this:
    ====START=====
    (coremltools) Pracheers-MacBook-Pro:core_ml pracheer$ python test_mxnet_converer.py
    test_conv_random (__main__.MXNetSingleLayerTest) ... test_mxnet_converer.py:21: DeprecationWarning: mxnet.model.FeedForward has been deprecated. Please use mxnet.mod.Module instead.
      model = mx.model.FeedForward(net, engine, arg_params = engine.arg_dict)
    3 : conv_1_output, Convolution
    ok
    test_flatten (__main__.MXNetSingleLayerTest) ... 3 : conv_1, Convolution
    4 : flatten1, Flatten
    7 : fc1, FullyConnected
    9 : softmax_output, SoftmaxOutput
    ok
    test_really_tiny_2_inner_product_ones_input (__main__.MXNetSingleLayerTest) ... 3 : fc1_output, FullyConnected
    ok
    test_really_tiny_conv_random_3d_input (__main__.MXNetSingleLayerTest) ... 3 : conv_1_output, Convolution
    ok
    test_really_tiny_conv_random_input (__main__.MXNetSingleLayerTest) ... 3 : conv_1_output, Convolution
    ok
    test_really_tiny_conv_random_input_multi_filter (__main__.MXNetSingleLayerTest) ... 3 : conv_1_output, Convolution
    ok
    test_really_tiny_inner_product_ones_input (__main__.MXNetSingleLayerTest) ... 3 : fc1_output, FullyConnected
    ok
    test_tiny_asym_conv_random_asym_input (__main__.MXNetSingleLayerTest) ... 3 : conv_1, Convolution
    4 : tanh_output, Activation
    ok
    test_tiny_asym_conv_random_input (__main__.MXNetSingleLayerTest) ... 3 : conv_1_output, Convolution
    ok
    test_tiny_conv_ones_input (__main__.MXNetSingleLayerTest) ... 3 : conv_1_output, Convolution
    ok
    test_tiny_conv_pooling_random_input (__main__.MXNetSingleLayerTest) ... 3 : conv_1, Convolution
    4 : pool_1_output, Pooling
    ok
    test_tiny_conv_random_3d_input (__main__.MXNetSingleLayerTest) ... 3 : conv_1_output, Convolution
    ok
    test_tiny_conv_random_input (__main__.MXNetSingleLayerTest) ... 3 : conv_1_output, Convolution
    ok
    test_tiny_conv_random_input_multi_filter (__main__.MXNetSingleLayerTest) ... 3 : conv_1_output, Convolution
    ok
    test_tiny_inner_product_ones_input (__main__.MXNetSingleLayerTest) ... 3 : fc1_output, FullyConnected
    ok
    test_tiny_inner_product_random_input (__main__.MXNetSingleLayerTest) ... 3 : fc1_output, FullyConnected
    ok
    test_tiny_inner_product_zero_input (__main__.MXNetSingleLayerTest) ... 3 : fc1_output, FullyConnected
    ok
    test_tiny_relu_activation_random_input (__main__.MXNetSingleLayerTest) ... 3 : fc1, FullyConnected
    4 : relu1_output, Activation
    ok
    test_tiny_sigmoid_activation_random_input (__main__.MXNetSingleLayerTest) ... 3 : fc1, FullyConnected
    4 : sigmoid1_output, Activation
    ok
    test_tiny_softmax_random_input (__main__.MXNetSingleLayerTest) ... 3 : fc1, FullyConnected
    5 : softmax_output, SoftmaxOutput
    ok
    test_tiny_tanh_activation_random_input (__main__.MXNetSingleLayerTest) ... 3 : fc1, FullyConnected
    4 : tanh1_output, Activation
    ok
    test_transpose (__main__.MXNetSingleLayerTest) ... 1 : transpose, transpose
    4 : conv_1_output, Convolution
    ok
    
    ----------------------------------------------------------------------
    Ran 22 tests in 2.167s
    
    OK
    
    ====END=====
    
    * Convert reshape operator into coreml equivalent.
    
    * Set pre-processing parameters on coreml model.
    
    * Adding synsets w/ unit test.
    
    class_labels is only used when the coreml model is used in classifier mode, so a few parameters had to be re-juggled to make it work.
    
    * Minor documentation change for pre-processing args.
    
    * Adding Deconvolution layer.
    
    Currently target_shape is compulsory since it is required by coreml. Also, we are not currently able to evaluate the effect of padding. If we try to explicitly add padding to the coreml model at the end, we get an error. This is how we were adding the padding:
             pad = literal_eval(param['pad'])
             for i in range(len(pad)):
                 convLayer.valid.paddingAmounts.borderAmounts[i].startEdgeSize = pad[i]
                 convLayer.valid.paddingAmounts.borderAmounts[i].endEdgeSize = pad[i]
    
    Error:
      File "test_mxnet_converer.py", line 22, in _get_coreml_model
        spec = mxnet_converter.convert(model, class_labels=class_labels, mode=mode, **input_shape)
      File "_mxnet_converter.py", line 197, in convert
        converter_func(net, node, model, builder)
      File "_layers.py", line 515, in convert_deconvolution
        convLayer.valid.paddingAmounts.borderAmounts[i].startEdgeSize = pad[i]
      File "/Users/pracheer/miniconda3/envs/coremltools/lib/python2.7/site-packages/google/protobuf/internal/containers.py", line 204, in __getitem__
        return self._values[key]
    IndexError: list index out of range
    
    Will fix the above two issues after discussions with coreml people.
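
    For context, the IndexError arises because repeated protobuf fields start out empty and must be grown with add() before they can be indexed. A minimal, self-contained sketch of the failure and the usual remedy, using a hypothetical stand-in class rather than the real protobuf containers:

```python
class _RepeatedField(list):
    """Hypothetical stand-in for protobuf's repeated composite field container."""
    def add(self):
        # Appends a fresh entry and returns it, mirroring protobuf's add().
        class _BorderAmount(object):
            startEdgeSize = 0
            endEdgeSize = 0
        entry = _BorderAmount()
        self.append(entry)
        return entry

border_amounts = _RepeatedField()
pad = (1, 1)

# Indexing an empty repeated field raises IndexError, as in the trace above.
try:
    border_amounts[0].startEdgeSize = pad[0]
    indexing_failed = False
except IndexError:
    indexing_failed = True

# The usual remedy: grow the field with add() before assigning.
for p in pad:
    entry = border_amounts.add()
    entry.startEdgeSize = entry.endEdgeSize = p
```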
    
    * Skip dropout layer while converting.
    
    * added quick todo
    
    * Enable padding > 1 by adding an extra layer for padding in coreml.
    
    Added unit tests.
    Caveat: Currently a deconv layer with padding != (0, 0) is not working: the model converts successfully, but mxnet predictions differ from coreml ones.
    
    * Unit tests to convert entire model from zoo.
    
    * fix module / BN
    
    * refactor
    
    * minor fix
    
    * [BugFix] input-data as a dictionary.
    
    In our previous commit, input-data was assumed to be an array, which caused unit tests to fail. This change fixes that. Also, add a missing parameter to a couple of the tests in models-test.
    
    * Use delta of 1e-3 instead of 1e-7 which was accidentally pushed.
    
    * Test inception_v3 and remove tests that do only conversion.
    
    FWIW: On inception_v3, the predictions are off by more than delta.
    
    * update converter unittests with mxnet module
    
    * BatchNorm UnitTest+eps.
    
    * add image classification test
    
    * "Force" flag to force conversion of layers.
    
    This is needed for layers which don't have an exact one-to-one correspondence in CoreML. By default, the conversion fails if it detects that CoreML doesn't support the layer as-is, but this behavior can be overridden by passing the force flag when calling convert.
    
    Summary of changes:
    - Add "force" flag to all the layers in _layers.
    - For batchnorm conversion, don't throw the error if force flag is provided.
    - 2 unit tests: one tests that an exception is thrown when converting a batch-norm layer with local batch stats; the other tests that the "force" flag causes it to not throw the exception.
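
    The force-flag pattern described above can be sketched as follows (hypothetical names and structure, not the actual signatures in _layers.py):

```python
def convert_batchnorm(node, force=False):
    """Sketch of the force-flag pattern: refuse inexact conversions by default."""
    attrs = node.get("attr", {})
    # BatchNorm computed from local batch statistics has no exact CoreML
    # equivalent, so bail out unless the caller explicitly forces conversion.
    if attrs.get("use_global_stats") == "False" and not force:
        raise AttributeError(
            "BatchNorm with local batch stats cannot be converted exactly; "
            "pass force=True to convert anyway.")
    return "converted"

node = {"name": "bn1", "attr": {"use_global_stats": "False"}}

# Without force the conversion refuses; with force it proceeds.
try:
    convert_batchnorm(node)
    raised = False
except AttributeError:
    raised = True
forced_result = convert_batchnorm(node, force=True)
```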
    
    * Minor: documentation fixes, fixing imports, etc.
    
    * ModelsUnitTests: Improved documentation, using the force flag where required, ability to download the model files if they don't exist.
    
    * Minor: documentation update on KL divergence.
    
    * Minor: Removing unused variables.
    
    * Minor: documentation update for MXNetSingleLayerTest.
    
    * test_image: add force flag for resnet.
    
    * README; change name of classes; assert KLDivergence < 1e-4.
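
    The divergence check mentioned above can be sketched as follows (a hypothetical helper illustrating the idea, not the repository's actual implementation):

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions, e.g. softmax outputs
    from the original MXNet model (p) and the converted CoreML model (q).
    A small eps guards against log(0) for zero-probability classes."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

# Nearly identical predictions should give a divergence close to zero.
mxnet_probs = [0.70, 0.20, 0.10]
coreml_probs = [0.69, 0.21, 0.10]
divergence = kl_divergence(mxnet_probs, coreml_probs)
```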
    
    * Updated README.
    
    * Minor: cosmetic changes to readme.
    
    * convert return coreml instead of protobuf
    
    * Minor: cosmetic changes to README.
    
    * Enable SingleLayerTest.test_tiny_synset_random_input.
    
    * Minor: fixing some formatting of README
    
    * testing readme formatting
    
    * Minor: Heading for TODOs in README.
    
    * ImagenetTest: fix shapes, add more models.
    
    * move test location, add mxnet_coreml_converter as Command Line Tool
    
    * add mxnet random seed in converter unittest
    
    * Updated README,fix vgg16 test, refactor deconv code.
    
    * refactor directory and moving .mlmodel files
    
    * Fixing README to have dimensions as 224.
    
    * Adding periods at the end of sentences.
    
    * Instead of commenting out a test, skip it.
    
    * remove force flag, add preprocessing_args
    
    * Updated README for pre-processing arguments.
    
    * Deconv w/ padding; pooling w/ pooling_convention.
    
    Earlier, deconv with padding was giving incorrect predictions. We added a crop layer, which fixed the issue.
    As for pooling_convention, the current coremltools doesn't support it, so we added our own custom implementation (with help from Apple) to overcome the issue.
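
    A back-of-envelope sketch of why pooling_convention matters (our own arithmetic, not code from this commit): MXNet's 'valid' convention floors the output-size division, while 'full' ceils it, which can yield one extra output row/column.

```python
import math

def pool_output_dim(size, kernel, stride, pad, convention="valid"):
    """Output size along one axis of a pooling layer.

    'valid' floors the division (what CoreML computes natively), while
    'full' ceils it -- the case that needed a custom implementation.
    """
    frac = (size + 2 * pad - kernel) / float(stride)
    n = math.ceil(frac) if convention == "full" else math.floor(frac)
    return int(n) + 1

# A 224-wide input, 3x3 kernel, stride 2, no padding:
valid_dim = pool_output_dim(224, kernel=3, stride=2, pad=0, convention="valid")
full_dim = pool_output_dim(224, kernel=3, stride=2, pad=0, convention="full")
```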
    
    
    * Moved Apple files from core_ml to coreml directory since Apple pushed their code changes to coreml and we don't want to lose their history.
    
    This change also adds the Apache license to all the files.
    
    * Updated documentation for utils.py.
    
    * Fixing Batchnorm test with the right delta.
    
    * Updating README with information about mode/pre-processing-args/class-labels.
---
 tools/coreml/README.md                           |  95 +++
 tools/coreml/_layers.py                          | 397 ----------
 tools/coreml/{ => converter}/__init__.py         |   1 -
 tools/coreml/converter/_add_pooling.py           | 118 +++
 tools/coreml/converter/_layers.py                | 569 ++++++++++++++
 tools/coreml/{ => converter}/_mxnet_converter.py |  77 +-
 tools/coreml/mxnet_coreml_converter.py           | 114 +++
 tools/coreml/test/test_mxnet_converter.py        | 949 +++++++++++++++++++++++
 tools/coreml/test/test_mxnet_image.py            | 136 ++++
 tools/coreml/test/test_mxnet_models.py           | 155 ++++
 tools/coreml/test_mxnet_converer.py              | 477 ------------
 tools/coreml/utils.py                            |  77 ++
 12 files changed, 2262 insertions(+), 903 deletions(-)

diff --git a/tools/coreml/README.md b/tools/coreml/README.md
new file mode 100644
index 0000000..32cde33
--- /dev/null
+++ b/tools/coreml/README.md
@@ -0,0 +1,95 @@
+# Convert MXNet models into Apple CoreML format.
+
+This tool helps convert MXNet models into [Apple CoreML](https://developer.apple.com/documentation/coreml) format which can then be run on Apple devices.
+
+## Installation
+In order to use this tool you need to have these installed:
+* MacOS - High Sierra 10.13
+* Xcode 9
+* coremltools 0.5.0 or greater (pip install coremltools)
+* mxnet 0.10.0 or greater. [Installation instructions](http://mxnet.io/get_started/install.html).
+* yaml (pip install pyyaml)
+* python 2.7
+
+## How to use
+Let's say you want to use your MXNet model in an iPhone App. For the purpose of this example, let's say you want to use squeezenet-v1.1.
+
+1. Download the model into the directory where this converter resides. Squeezenet can be downloaded from [here](http://data.mxnet.io/models/imagenet/squeezenet/).
+2. Run this command:
+
+  ```bash
+python mxnet_coreml_converter.py --model-prefix='squeezenet_v1.1' --epoch=0 --input-shape='{"data":"3,227,227"}' --mode=classifier --pre-processing-arguments='{"image_input_names":"data"}' --class-labels classLabels.txt --output-file="squeezenetv11.mlmodel"
+```
+
+  The above command will save the converted model as squeezenetv11.mlmodel in CoreML format. Internally, MXNet first loads the model, and then we walk through the entire symbolic graph, converting each operator into its CoreML equivalent. Some of the parameters are used by MXNet to load and generate the symbolic graph in memory, while others are used by CoreML either to pre-process the input before it goes through the neural network or to process the output in a particular way.
+
+  In the command above:
+
+  * _model-prefix_: refers to the MXNet model prefix (may include the directory path).
+  * _epoch_: refers to the suffix of the MXNet model file.
  * _input-shape_: refers to the input shape information in a JSON string format, where the key is the name of the input variable (="data") and the value is the shape of that variable. If the model takes multiple inputs, input shapes for all of them need to be provided.
+  * _mode_: refers to the coreml model mode. Can either be 'classifier', 'regressor' or None. In this case, we use 'classifier' since we want the resulting CoreML model to classify images into various categories.
+  * _pre-processing-arguments_: In the Apple world images have to be of type Image. By providing image_input_names as "data", we are saying that the input variable "data" is of type Image.
+  * _class-labels_: refers to the name of the file which contains the classification labels (a.k.a. synset file).
+  * _output-file_: refers to the file where the CoreML model will be saved.
+
+3. The generated ".mlmodel" file can directly be integrated into your app. For more instructions on how to do this, please see [Apple CoreML's tutorial](https://developer.apple.com/documentation/coreml/integrating_a_core_ml_model_into_your_app).
+
+
+### Providing class labels
+You can provide a file containing class labels (as above) so that CoreML will return the predicted category of the image. The file should have one label per line, and labels may contain special characters. The line number of each label should correspond to the index of the softmax output. E.g.
+
+```bash
+python mxnet_coreml_converter.py --model-prefix='squeezenet_v1.1' --epoch=0 --input-shape='{"data":"3,227,227"}' --mode=classifier --class-labels classLabels.txt --output-file="squeezenetv11.mlmodel"
+```
+
+### Providing label names
+You may have to provide the label names of the MXNet model's outputs. For example, if you try to convert [vgg16](http://data.mxnet.io/models/imagenet/vgg/), you may have to provide label-name as "prob_label". By default "softmax_label" is assumed.
+
+```bash
+python mxnet_coreml_converter.py --model-prefix='vgg16' --epoch=0 --input-shape='{"data":"3,224,224"}' --mode=classifier --pre-processing-arguments='{"image_input_names":"data"}' --class-labels classLabels.txt --output-file="vgg16.mlmodel" --label-names="prob_label"
+```
+ 
+### Adding pre-processing to the CoreML model.
+You can ask CoreML to pre-process the images before passing them through the model.
+
+```bash
+python mxnet_coreml_converter.py --model-prefix='squeezenet_v1.1' --epoch=0 --input-shape='{"data":"3,224,224"}' --pre-processing-arguments='{"red_bias":127,"blue_bias":117,"green_bias":103}' --output-file="squeezenet_v11.mlmodel"
+```
+
+If you are building an app for a model that takes an image as input, you will have to provide image_input_names in the pre-processing arguments. This tells CoreML that a particular input variable is of type Image. E.g.:
+ 
+```bash
+python mxnet_coreml_converter.py --model-prefix='squeezenet_v1.1' --epoch=0 --input-shape='{"data":"3,224,224"}' --pre-processing-arguments='{"red_bias":127,"blue_bias":117,"green_bias":103,"image_input_names":"data"}' --output-file="squeezenet_v11.mlmodel"
+```
+
+## Currently supported
+### Models
+This is a (growing) list of standard MXNet models that can be successfully converted using the converter, which means that any other model using similar operators can also be converted.
+
+1. Inception: [Inception-BN](http://data.mxnet.io/models/imagenet/inception-bn/), [Inception-V3](http://data.mxnet.io/models/imagenet/inception-v3.tar.gz)
+2. [NiN](http://data.dmlc.ml/models/imagenet/nin/)
+3. [Resnet](http://data.mxnet.io/models/imagenet/resnet/)
+4. [Squeezenet](http://data.mxnet.io/models/imagenet/squeezenet/)
+5. [Vgg](http://data.mxnet.io/models/imagenet/vgg/)
+
+### Layers
+1. Activation
+2. Batchnorm
+3. Concat
+4. Convolution
+5. Deconvolution
+6. Dense
+7. Elementwise
+8. Flatten
+9. Pooling
+10. Reshape
+11. Softmax
+12. Transpose
+
+## Known issues
+Currently there are no known issues.
+
+## This tool has been tested in an environment with:
+* MacOS - High Sierra 10.13 Beta.
+* Xcode 9 beta 5.
diff --git a/tools/coreml/_layers.py b/tools/coreml/_layers.py
deleted file mode 100644
index 5148984..0000000
--- a/tools/coreml/_layers.py
+++ /dev/null
@@ -1,397 +0,0 @@
-# Licensed to the Apache Software Foundation (ASF) under one
-# or more contributor license agreements.  See the NOTICE file
-# distributed with this work for additional information
-# regarding copyright ownership.  The ASF licenses this file
-# to you under the Apache License, Version 2.0 (the
-# "License"); you may not use this file except in compliance
-# with the License.  You may obtain a copy of the License at
-# 
-#   http://www.apache.org/licenses/LICENSE-2.0
-# 
-# Unless required by applicable law or agreed to in writing,
-# software distributed under the License is distributed on an
-# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-# KIND, either express or implied.  See the License for the
-# specific language governing permissions and limitations
-# under the License.
-import numpy as _np
-
-def _get_input_output_name(net, node, index = 0):
-    name = node['name']
-    inputs = node['inputs']
-
-    if index == 'all':
-        input_name = [_get_node_name(net, inputs[id][0]) for id in range(len(inputs))]
-    elif type(index) == int:
-        input_name = _get_node_name(net, inputs[0][0])
-    else:
-        input_name = [_get_node_name(net, inputs[id][0]) for id in index]
-    return input_name, name
-
-def _get_node_name(net, node_id):
-    return net['nodes'][node_id]['name']
-
-def _get_node_shape(net, node_id):
-    return net['nodes'][node_id]['shape']
-
-def convert_transpose(net, node, model, builder):
-    """Convert a transpose layer from mxnet to coreml.
-
-    Parameters
-    ----------
-    network: net
-        A mxnet network object.
-
-    layer: node
-        Node to convert.
-
-    model: model
-        An model for MXNet
-
-    builder: NeuralNetworkBuilder
-        A neural network builder object.
-    """
-    input_name, output_name = _get_input_output_name(net, node)
-    name = node['name']
-    param = node['attr']
-    from ast import literal_eval
-    axes = literal_eval(param['axes'])
-    builder.add_permute(name, input_name, output_name, axes)
-
-def convert_flatten(net, node, model, builder):
-    """Convert a flatten layer from mxnet to coreml.
-
-    Parameters
-    ----------
-    network: net
-        A mxnet network object.
-
-    layer: node
-        Node to convert.
-
-    model: model
-        An model for MXNet
-
-    builder: NeuralNetworkBuilder
-        A neural network builder object.
-    """
-    input_name, output_name = _get_input_output_name(net, node)
-    name = node['name']
-    builder.add_flatten(0, name, input_name, output_name)
-
-def convert_softmax(net, node, model, builder):
-    """Convert a softmax layer from mxnet to coreml.
-
-    Parameters
-    ----------
-    network: net
-        A mxnet network object.
-
-    layer: node
-        Node to convert.
-
-    model: model
-        An model for MXNet
-
-    builder: NeuralNetworkBuilder
-        A neural network builder object.
-    """
-    input_name, output_name = _get_input_output_name(net, node)
-    name = node['name']
-    builder.add_softmax(name = name,
-                        input_name = input_name,
-                        output_name = output_name)
-
-def convert_activation(net, node, model, builder):
-    """Convert an activation layer from mxnet to coreml.
-
-    Parameters
-    ----------
-    network: net
-        A mxnet network object.
-
-    layer: node
-        Node to convert.
-
-    model: model
-        An model for MXNet
-
-    builder: NeuralNetworkBuilder
-        A neural network builder object.
-    """
-    input_name, output_name = _get_input_output_name(net, node)
-    name = node['name']
-    mx_non_linearity = node['attr']['act_type']
-    if mx_non_linearity == 'relu':
-        non_linearity = 'RELU'
-    elif mx_non_linearity == 'tanh':
-        non_linearity = 'TANH'
-    elif mx_non_linearity == 'sigmoid':
-        non_linearity = 'SIGMOID'
-    else:
-        raise TypeError('Unknown activation type %s' % mx_non_linearity)
-    builder.add_activation(name = name,
-                           non_linearity = non_linearity,
-                           input_name = input_name,
-                           output_name = output_name)
-
-def convert_elementwise_add(net, node, model, builder):
-    """Convert an elementwise add layer from mxnet to coreml.
-
-        Parameters
-        ----------
-        network: net
-        A mxnet network object.
-
-        layer: node
-        Node to convert.
-
-        model: model
-        An model for MXNet
-
-        builder: NeuralNetworkBuilder
-        A neural network builder object.
-        """
-
-    input_names, output_name = _get_input_output_name(net, node,[0,1])
-    name = node['name']
-
-    builder.add_elementwise(name, input_names, output_name, 'ADD')
-
-def convert_dense(net, node, model, builder):
-    """Convert a dense layer from mxnet to coreml.
-
-    Parameters
-    ----------
-    network: net
-        A mxnet network object.
-
-    layer: node
-        Node to convert.
-
-    model: model
-        An model for MXNet
-
-    builder: NeuralNetworkBuilder
-        A neural network builder object.
-    """
-    input_name, output_name = _get_input_output_name(net, node)
-    param = node['attr']
-    has_bias = True
-    name = node['name']
-
-    inputs = node['inputs']
-    outputs = node['outputs']
-    args = model.arg_params
-    W = args[_get_node_name(net, inputs[1][0])].asnumpy()
-    if has_bias:
-        Wb = args[_get_node_name(net, inputs[2][0])].asnumpy()
-    else:
-        Wb = None
-    nC, nB = W.shape
-
-    builder.add_inner_product(name = name,
-            W = W,
-            Wb = Wb,
-            nB = nB,
-            nC = nC,
-            has_bias = has_bias,
-            input_name = input_name,
-            output_name = output_name)
-
-def convert_convolution(net, node, model, builder):
-    """Convert a convolution layer from mxnet to coreml.
-
-    Parameters
-    ----------
-    network: net
-        A mxnet network object.
-
-    layer: node
-        Node to convert.
-
-    model: model
-        An model for MXNet
-
-    builder: NeuralNetworkBuilder
-        A neural network builder object.
-    """
-    input_name, output_name = _get_input_output_name(net, node)
-    name = node['name']
-    param = node['attr']
-    inputs = node['inputs']
-    outputs = node['outputs']
-    args = model.arg_params
-
-    from ast import literal_eval
-
-    if 'no_bias' in param.keys():
-        has_bias = not literal_eval(param['no_bias'])
-    else:
-        has_bias = True
-
-    border_mode = "same" if literal_eval(param['pad']) != (0, 0) else 'valid'
-    border_mode = "valid"
-    n_filters = int(param['num_filter'])
-    output_shape = None  # (needed for de-conv)
-
-    W = args[_get_node_name(net, inputs[1][0])].asnumpy()
-    if has_bias:
-        Wb = args[_get_node_name(net, inputs[2][0])].asnumpy()
-    else:
-        Wb = None
-
-    n_filters, channels = W.shape[0:2]
-    stride_height, stride_width = literal_eval(param['stride'])
-    kernel_height, kernel_width = literal_eval(param['kernel'])
-
-    W = W.transpose((2, 3, 1, 0))
-    builder.add_convolution(name = name,
-             kernelChannels = channels,
-             outputChannels = n_filters,
-             height = kernel_height,
-             width = kernel_width,
-             stride_height = stride_height,
-             stride_width = stride_width,
-             borderMode = border_mode,
-             groups = 1,
-             W = W,
-             b = Wb,
-             has_bias = has_bias,
-             is_deconv = False,
-             output_shape = output_shape,
-             input_name = input_name,
-             output_name = output_name)
-
-    # Add padding if there is any
-    convLayer = builder.nn_spec.layers[-1].convolution
-    pad = literal_eval(param['pad'])
-    for i in range(len(pad)):
-        convLayer.valid.paddingAmounts.borderAmounts[i].startEdgeSize = pad[i]
-        convLayer.valid.paddingAmounts.borderAmounts[i].endEdgeSize = pad[i]
-
-def convert_pooling(net, node, model, builder):
-    """Convert a pooling layer from mxnet to coreml.
-
-    Parameters
-    ----------
-    network: net
-        A mxnet network object.
-
-    layer: node
-        Node to convert.
-
-    model: model
-        An model for MXNet
-
-    builder: NeuralNetworkBuilder
-        A neural network builder object.
-    """
-    input_name, output_name = _get_input_output_name(net, node)
-    name = node['name']
-    inputs = node['inputs']
-    param = node['attr']
-    outputs = node['outputs']
-    args = model.arg_params
-
-    layer_type_mx = param['pool_type']
-    if layer_type_mx == 'max':
-        layer_type= 'MAX'
-    elif layer_type_mx == 'avg':
-        layer_type = 'AVERAGE'
-    else:
-        raise TypeError("Pooling type %s not supported" % layer_type_mx)
-
-    from ast import literal_eval
-    stride_height, stride_width = literal_eval(param['stride'])
-    kernel_width, kernel_height = literal_eval(param['kernel'])
-
-    padding_type = 'VALID'
-    if 'global_pool' in param.keys():
-        is_global = literal_eval(param['global_pool'])
-    else:
-        is_global = False
-    builder.add_pooling(name = name,
-        height = kernel_height,
-        width = kernel_width,
-        stride_height = stride_height,
-        stride_width = stride_width,
-        layer_type = layer_type,
-        padding_type = padding_type,
-        exclude_pad_area = False,
-        is_global = is_global,
-        input_name = input_name,
-        output_name = output_name)
-
-    # Add padding if there is any
-    poolingLayer = builder.nn_spec.layers[-1].pooling
-    pad = literal_eval(param['pad'])
-    for i in range(len(pad)):
-        poolingLayer.valid.paddingAmounts.borderAmounts[i].startEdgeSize = pad[i]
-        poolingLayer.valid.paddingAmounts.borderAmounts[i].endEdgeSize = pad[i]
-
-def convert_batchnorm(net, node, model, builder):
-    """Convert a transpose layer from mxnet to coreml.
-
-    Parameters
-    ----------
-    network: net
-        A mxnet network object.
-
-    layer: node
-        Node to convert.
-
-    model: model
-        An model for MXNet
-
-    builder: NeuralNetworkBuilder
-        A neural network builder object.
-    """
-    input_name, output_name = _get_input_output_name(net, node)
-    name = node['name']
-    param = node['attr']
-    inputs = node['inputs']
-    outputs = node['outputs']
-    args = model.arg_params
-    aux = model.aux_params
-
-    gamma = args[_get_node_name(net, inputs[1][0])].asnumpy()
-    beta = args[_get_node_name(net, inputs[2][0])].asnumpy()
-    mean = aux[_get_node_name(net, inputs[3][0])].asnumpy()
-    variance = aux[_get_node_name(net, inputs[4][0])].asnumpy()
-
-    nb_channels = gamma.shape[0]
-
-    builder.add_batchnorm(
-        name = name,
-        channels = nb_channels,
-        gamma = gamma,
-        beta = beta,
-        mean = mean,
-        variance = variance,
-        input_name = input_name,
-        output_name = output_name)
-
-def convert_concat(net, node, model, builder):
-    """Convert concat layer from mxnet to coreml.
-
-    Parameters
-    ----------
-    network: net
-    A mxnet network object.
-
-    layer: node
-    Node to convert.
-
-    model: model
-    An model for MXNet
-
-    builder: NeuralNetworkBuilder
-    A neural network builder object.
-    """
-    # Get input and output names
-    input_names, output_name = _get_input_output_name(net, node, 'all')
-    name = node['name']
-    mode = 'CONCAT'
-    builder.add_elementwise(name = name, input_names = input_names,
-            output_name = output_name, mode = mode)
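Throughout both the old converter removed above and the new one added below, layer hyper-parameters come out of the symbol's attribute dictionary as strings and are parsed with `ast.literal_eval`. A minimal sketch of that parsing; the `attrs` dict here is illustrative, not taken from a real model:

```python
# Hypothetical attribute dict, shaped like the 'attr' entry of a node in
# the JSON produced by mxnet's Symbol.tojson(). All values are strings.
from ast import literal_eval

attrs = {'kernel': '(3, 3)', 'stride': '(2, 2)', 'pad': '(1, 1)', 'no_bias': 'True'}

# literal_eval safely turns the string tuples/booleans back into Python values.
kernel_height, kernel_width = literal_eval(attrs['kernel'])
stride_height, stride_width = literal_eval(attrs['stride'])
has_bias = not literal_eval(attrs['no_bias'])
needs_padding = literal_eval(attrs['pad']) != (0, 0)
```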
diff --git a/tools/coreml/__init__.py b/tools/coreml/converter/__init__.py
similarity index 96%
rename from tools/coreml/__init__.py
rename to tools/coreml/converter/__init__.py
index e56490a..2456923 100644
--- a/tools/coreml/__init__.py
+++ b/tools/coreml/converter/__init__.py
@@ -15,4 +15,3 @@
 # specific language governing permissions and limitations
 # under the License.
 
-from _mxnet_converter import *
diff --git a/tools/coreml/converter/_add_pooling.py b/tools/coreml/converter/_add_pooling.py
new file mode 100644
index 0000000..51934f2
--- /dev/null
+++ b/tools/coreml/converter/_add_pooling.py
@@ -0,0 +1,118 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from coremltools.proto import NeuralNetwork_pb2 as _NeuralNetwork_pb2
+
+
+def add_pooling_with_padding_types(builder, name, height, width, stride_height, stride_width,
+        layer_type, padding_type, input_name, output_name,
+        padding_top = 0, padding_bottom = 0, padding_left = 0, padding_right = 0,
+        same_padding_asymmetry_mode = 'BOTTOM_RIGHT_HEAVY',
+        exclude_pad_area = True, is_global = False):
+    """
+    Add a pooling layer to the model.
+
+    This is our own implementation of add_pooling since the current version of coremltools (0.5.0)
+    only supports the 'VALID' padding type in its builder. That support will be added in the
+    next release of coremltools; when that happens, this function can be removed.
+
+    Parameters
+    ----------
+    builder: NeuralNetworkBuilder
+        A neural network builder object.
+    name: str
+        The name of this layer.
+    height: int
+        Height of pooling region.
+    width: int
+        Width of pooling region.
+    stride_height: int
+        Stride along the height direction.
+    stride_width: int
+        Stride along the width direction.
+    layer_type: str
+        Type of pooling performed. Can either be 'MAX', 'AVERAGE' or 'L2'.
+    padding_type: str
+        Option for the output blob shape. Can be either 'VALID', 'SAME' or 'INCLUDE_LAST_PIXEL'. See NeuralNetwork.proto for details.
+    input_name: str
+        The input blob name of this layer.
+    output_name: str
+        The output blob name of this layer.
+
+    padding_top, padding_bottom, padding_left, padding_right: int
+        Values of height (top, bottom) and width (left, right) padding to be used if padding_type is 'VALID' or 'INCLUDE_LAST_PIXEL'.
+
+    same_padding_asymmetry_mode: str
+        Type of asymmetric padding to be used when padding_type = 'SAME'. Can be either 'BOTTOM_RIGHT_HEAVY' or 'TOP_LEFT_HEAVY'. See NeuralNetwork.proto for details.
+
+    exclude_pad_area: boolean
+        Whether to exclude padded area in the pooling operation. Defaults to True.
+
+        - If True, the value of the padded area will be excluded.
+        - If False, the padded area will be included.
+        This flag is only used with average pooling.
+    is_global: boolean
+        Whether the pooling operation is global. Defaults to False.
+
+        - If True, the pooling operation is global -- the pooling region is of the same size of the input blob.
+        Parameters height, width, stride_height, stride_width will be ignored.
+
+        - If False, the pooling operation is not global.
+
+    See Also
+    --------
+    add_convolution, add_pooling, add_activation
+    """
+
+    spec = builder.spec
+    nn_spec = builder.nn_spec
+
+    # Add a new layer
+    spec_layer = nn_spec.layers.add()
+    spec_layer.name = name
+    spec_layer.input.append(input_name)
+    spec_layer.output.append(output_name)
+    spec_layer_params = spec_layer.pooling
+
+    # Set the parameters
+    spec_layer_params.type = \
+                _NeuralNetwork_pb2.PoolingLayerParams.PoolingType.Value(layer_type)
+
+    if padding_type == 'VALID':
+        height_border = spec_layer_params.valid.paddingAmounts.borderAmounts.add()
+        height_border.startEdgeSize = padding_top
+        height_border.endEdgeSize = padding_bottom
+        width_border = spec_layer_params.valid.paddingAmounts.borderAmounts.add()
+        width_border.startEdgeSize = padding_left
+        width_border.endEdgeSize = padding_right
+    elif padding_type == 'SAME':
+        if same_padding_asymmetry_mode not in ('BOTTOM_RIGHT_HEAVY', 'TOP_LEFT_HEAVY'):
+            raise ValueError("Invalid value %s of same_padding_asymmetry_mode parameter" % same_padding_asymmetry_mode)
+        spec_layer_params.same.asymmetryMode = _NeuralNetwork_pb2.SamePadding.SamePaddingMode.Value(same_padding_asymmetry_mode)
+    elif padding_type == 'INCLUDE_LAST_PIXEL':
+        if padding_top != padding_bottom or padding_left != padding_right:
+            raise ValueError("Only symmetric padding is supported with the INCLUDE_LAST_PIXEL padding type")
+        spec_layer_params.includeLastPixel.paddingAmounts.append(padding_top)
+        spec_layer_params.includeLastPixel.paddingAmounts.append(padding_left)
+
+    spec_layer_params.kernelSize.append(height)
+    spec_layer_params.kernelSize.append(width)
+    spec_layer_params.stride.append(stride_height)
+    spec_layer_params.stride.append(stride_width)
+    spec_layer_params.avgPoolExcludePadding = exclude_pad_area
+    spec_layer_params.globalPooling = is_global
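The pooling converter below maps MXNet's 'valid' pooling convention to the 'VALID' padding type above and 'full' to 'INCLUDE_LAST_PIXEL'. A sketch of the output-size arithmetic the two conventions imply along one spatial axis; the helper name is ours, assuming the usual floor/ceil definitions:

```python
import math

def pooled_size(x, kernel, stride, pad, convention='valid'):
    # 'valid' floors the window count; 'full' ceils it, so trailing
    # pixels that do not fill a whole window still yield an output element.
    span = x + 2 * pad - kernel
    if convention == 'valid':
        return span // stride + 1
    if convention == 'full':
        return int(math.ceil(span / float(stride))) + 1
    raise ValueError('unsupported convention: %s' % convention)

print(pooled_size(8, 3, 2, 0, 'valid'))  # -> 3
print(pooled_size(8, 3, 2, 0, 'full'))   # -> 4
```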
diff --git a/tools/coreml/converter/_layers.py b/tools/coreml/converter/_layers.py
new file mode 100644
index 0000000..0a08994
--- /dev/null
+++ b/tools/coreml/converter/_layers.py
@@ -0,0 +1,569 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import _add_pooling
+from ast import literal_eval
+
+def _get_input_output_name(net, node, index=0):
+    name = node['name']
+    inputs = node['inputs']
+
+    if index == 'all':
+        input_name = [_get_node_name(net, inputs[idx][0]) for idx in range(len(inputs))]
+    elif isinstance(index, int):
+        input_name = _get_node_name(net, inputs[index][0])
+    else:
+        input_name = [_get_node_name(net, inputs[idx][0]) for idx in index]
+    return input_name, name
+
+
+def _get_node_name(net, node_id):
+    return net['nodes'][node_id]['name']
+
+
+def _get_node_shape(net, node_id):
+    return net['nodes'][node_id]['shape']
+
+
+# TODO These operators still need to be converted (listing in order of priority):
+# High priority:
+# mxnet.symbol.repeat -> builder.add_repeat to flatten and repeat the NDArray sequence
+# mxnet.symbol.Crop -> builder.add_crop to crop image along spatial dimensions
+# mxnet.symbol.Pad -> builder.add_padding putting 0's on height and width for tensor
+# Low Priority:
+# depthwise separable convolution support through groups in builder.add_convolution
+# add_optional -> for all RNNs defining what goes in and out (to define beam search or if input is streaming)
+# mx.symbol.Embedding -> add_embedding takes indices, word ids from dict that is outside coreml or
+# in pipeline only if we have text mapping to indices
+# FusedRNNCell -> add_bidirlstm
+#  add_unilstm -> reverse_input param true as second and concat on outputs
+# Do vanilla (0.9 mxnet) lstm, gru, vanilla_rnn
+
+
+def convert_reshape(net, node, module, builder):
+    """Converts a reshape layer from mxnet to coreml.
+
+    This doesn't currently handle the deprecated parameters for the reshape layer.
+
+    Parameters
+    ----------
+    network: net
+        An mxnet network object.
+
+    layer: node
+        Node to convert.
+
+    module: module
+        A module for MXNet
+
+    builder: NeuralNetworkBuilder
+        A neural network builder object.
+    """
+    input_name, output_name = _get_input_output_name(net, node)
+    name = node['name']
+    target_shape = node['shape']
+
+    if any(item <= 0 for item in target_shape):
+        raise NotImplementedError('Special dimensional values less than or equal to 0 are not supported yet. '
+                                  'Feel free to file an issue here: https://github.com/dmlc/mxnet/issues.')
+
+    if 'reverse' in node and node['reverse'] == 'True':
+        raise NotImplementedError('"reverse" parameter is not supported yet. '
+                                  'Feel free to file an issue here: https://github.com/dmlc/mxnet/issues.')
+
+    mode = 0 # CHANNEL_FIRST
+    builder.add_reshape(name, input_name, output_name, target_shape, mode)
+
+
+def convert_transpose(net, node, module, builder):
+    """Convert a transpose layer from mxnet to coreml.
+
+    Parameters
+    ----------
+    network: net
+        An mxnet network object.
+
+    layer: node
+        Node to convert.
+
+    module: module
+        A module for MXNet
+
+    builder: NeuralNetworkBuilder
+        A neural network builder object.
+    """
+    input_name, output_name = _get_input_output_name(net, node)
+    name = node['name']
+    param = node['attr']
+
+    axes = literal_eval(param['axes'])
+    builder.add_permute(name, axes, input_name, output_name)
+
+
+def convert_flatten(net, node, module, builder):
+    """Convert a flatten layer from mxnet to coreml.
+
+    Parameters
+    ----------
+    network: net
+        An mxnet network object.
+
+    layer: node
+        Node to convert.
+
+    module: module
+        A module for MXNet
+
+    builder: NeuralNetworkBuilder
+        A neural network builder object.
+    """
+    input_name, output_name = _get_input_output_name(net, node)
+    name = node['name']
+    mode = 0 # CHANNEL_FIRST
+    builder.add_flatten(name, mode, input_name, output_name)
+
+
+def convert_softmax(net, node, module, builder):
+    """Convert a softmax layer from mxnet to coreml.
+
+    Parameters
+    ----------
+    network: net
+        An mxnet network object.
+
+    layer: node
+        Node to convert.
+
+    module: module
+        A module for MXNet
+
+    builder: NeuralNetworkBuilder
+        A neural network builder object.
+    """
+    input_name, output_name = _get_input_output_name(net, node)
+    name = node['name']
+    builder.add_softmax(name=name,
+                        input_name=input_name,
+                        output_name=output_name)
+
+
+def convert_activation(net, node, module, builder):
+    """Convert an activation layer from mxnet to coreml.
+
+    Parameters
+    ----------
+    network: net
+        An mxnet network object.
+
+    layer: node
+        Node to convert.
+
+    module: module
+        A module for MXNet
+
+    builder: NeuralNetworkBuilder
+        A neural network builder object.
+    """
+    input_name, output_name = _get_input_output_name(net, node)
+    name = node['name']
+    mx_non_linearity = node['attr']['act_type']
+    #TODO add SCALED_TANH, SOFTPLUS, SOFTSIGN, SIGMOID_HARD, LEAKYRELU, PRELU, ELU, PARAMETRICSOFTPLUS, THRESHOLDEDRELU, LINEAR
+    if mx_non_linearity == 'relu':
+        non_linearity = 'RELU'
+    elif mx_non_linearity == 'tanh':
+        non_linearity = 'TANH'
+    elif mx_non_linearity == 'sigmoid':
+        non_linearity = 'SIGMOID'
+    else:
+        raise TypeError('Unknown activation type %s' % mx_non_linearity)
+    builder.add_activation(name = name,
+                           non_linearity = non_linearity,
+                           input_name = input_name,
+                           output_name = output_name)
+
+
+def convert_elementwise_add(net, node, module, builder):
+    """Convert an elementwise add layer from mxnet to coreml.
+
+    Parameters
+    ----------
+    network: net
+        An mxnet network object.
+
+    layer: node
+        Node to convert.
+
+    module: module
+        A module for MXNet
+
+    builder: NeuralNetworkBuilder
+        A neural network builder object.
+    """
+
+    input_names, output_name = _get_input_output_name(net, node, [0, 1])
+    name = node['name']
+
+    builder.add_elementwise(name, input_names, output_name, 'ADD')
+
+
+def convert_dense(net, node, module, builder):
+    """Convert a dense layer from mxnet to coreml.
+
+    Parameters
+    ----------
+    network: net
+        An mxnet network object.
+
+    layer: node
+        Node to convert.
+
+    module: module
+        A module for MXNet
+
+    builder: NeuralNetworkBuilder
+        A neural network builder object.
+    """
+    input_name, output_name = _get_input_output_name(net, node)
+    has_bias = True
+    name = node['name']
+
+    inputs = node['inputs']
+    args, _ = module.get_params()
+    W = args[_get_node_name(net, inputs[1][0])].asnumpy()
+    if has_bias:
+        Wb = args[_get_node_name(net, inputs[2][0])].asnumpy()
+    else:
+        Wb = None
+    nC, nB = W.shape
+
+    builder.add_inner_product(
+        name=name,
+        W=W,
+        b=Wb,
+        input_channels=nB,
+        output_channels=nC,
+        has_bias=has_bias,
+        input_name=input_name,
+        output_name=output_name
+    )
+
+
+def convert_convolution(net, node, module, builder):
+    """Convert a convolution layer from mxnet to coreml.
+
+    Parameters
+    ----------
+    network: net
+        An mxnet network object.
+
+    layer: node
+        Node to convert.
+
+    module: module
+        A module for MXNet
+
+    builder: NeuralNetworkBuilder
+        A neural network builder object.
+    """
+    input_name, output_name = _get_input_output_name(net, node)
+    name = node['name']
+    param = node['attr']
+    inputs = node['inputs']
+    args, _ = module.get_params()
+
+    if 'no_bias' in param.keys():
+        has_bias = not literal_eval(param['no_bias'])
+    else:
+        has_bias = True
+
+    if literal_eval(param['pad']) != (0, 0):
+        pad = literal_eval(param['pad'])
+        builder.add_padding(
+            name=name+"_pad",
+            left=pad[1],
+            right=pad[1],
+            top=pad[0],
+            bottom=pad[0],
+            value=0,
+            input_name=input_name,
+            output_name=name+"_pad_output")
+        input_name = name+"_pad_output"
+
+    border_mode = "valid"
+
+    n_filters = int(param['num_filter'])
+
+    W = args[_get_node_name(net, inputs[1][0])].asnumpy()
+    if has_bias:
+        Wb = args[_get_node_name(net, inputs[2][0])].asnumpy()
+    else:
+        Wb = None
+
+    channels = W.shape[1]
+    stride_height, stride_width = literal_eval(param['stride'])
+    kernel_height, kernel_width = literal_eval(param['kernel'])
+
+    W = W.transpose((2, 3, 1, 0))
+    builder.add_convolution(
+        name=name,
+        kernel_channels=channels,
+        output_channels=n_filters,
+        height=kernel_height,
+        width=kernel_width,
+        stride_height=stride_height,
+        stride_width=stride_width,
+        border_mode=border_mode,
+        groups=1,
+        W=W,
+        b=Wb,
+        has_bias=has_bias,
+        is_deconv=False,
+        output_shape=None,
+        input_name=input_name,
+        output_name=output_name)
+
+
+def convert_pooling(net, node, module, builder):
+    """Convert a pooling layer from mxnet to coreml.
+
+    Parameters
+    ----------
+    network: net
+        An mxnet network object.
+
+    layer: node
+        Node to convert.
+
+    module: module
+        A module for MXNet
+
+    builder: NeuralNetworkBuilder
+        A neural network builder object.
+    """
+    input_name, output_name = _get_input_output_name(net, node)
+    name = node['name']
+    param = node['attr']
+
+    layer_type_mx = param['pool_type']
+    if layer_type_mx == 'max':
+        layer_type = 'MAX'
+    elif layer_type_mx == 'avg':
+        layer_type = 'AVERAGE'
+    else:
+        raise TypeError("Pooling type %s not supported" % layer_type_mx)
+
+    # Add padding if there is any
+    if literal_eval(param['pad']) != (0, 0):
+        pad = literal_eval(param['pad'])
+        builder.add_padding(
+            name=name+"_pad",
+            left=pad[1],
+            right=pad[1],
+            top=pad[0],
+            bottom=pad[0],
+            value=0,
+            input_name=input_name,
+            output_name=name+"_pad_output")
+        input_name = name+"_pad_output"
+
+    stride_height, stride_width = literal_eval(param['stride'])
+    kernel_width, kernel_height = literal_eval(param['kernel'])
+
+    type_map = {'valid': 'VALID', 'full': 'INCLUDE_LAST_PIXEL'}
+    padding_type = param['pooling_convention'] if 'pooling_convention' in param else 'valid'
+    if padding_type not in type_map:
+        raise KeyError("%s pooling convention is not supported in this converter. Please file a Github issue." % padding_type)
+    padding_type = type_map[padding_type]
+
+    if 'global_pool' in param.keys():
+        is_global = literal_eval(param['global_pool'])
+    else:
+        is_global = False
+
+    # For reasons why we are not using the standard builder but having our own implementation,
+    # see the function documentation.
+    _add_pooling.add_pooling_with_padding_types(
+        builder=builder,
+        name=name,
+        height=kernel_height,
+        width=kernel_width,
+        stride_height=stride_height,
+        stride_width=stride_width,
+        layer_type=layer_type,
+        padding_type=padding_type,
+        exclude_pad_area=False,
+        is_global=is_global,
+        input_name=input_name,
+        output_name=output_name
+    )
+
+
+def convert_batchnorm(net, node, module, builder):
+    """Convert a transpose layer from mxnet to coreml.
+
+    Parameters
+    ----------
+    network: net
+        An mxnet network object.
+
+    layer: node
+        Node to convert.
+
+    module: module
+        A module for MXNet
+
+    builder: NeuralNetworkBuilder
+        A neural network builder object.
+    """
+    input_name, output_name = _get_input_output_name(net, node)
+    name = node['name']
+    inputs = node['inputs']
+
+    eps = 1e-3 # Default value of eps for MXNet.
+    use_global_stats = False # Default value of use_global_stats for MXNet.
+    if 'attr' in node:
+        if 'eps' in node['attr']:
+            eps = literal_eval(node['attr']['eps'])
+
+    args, aux = module.get_params()
+    gamma = args[_get_node_name(net, inputs[1][0])].asnumpy()
+    beta = args[_get_node_name(net, inputs[2][0])].asnumpy()
+    mean = aux[_get_node_name(net, inputs[3][0])].asnumpy()
+    variance = aux[_get_node_name(net, inputs[4][0])].asnumpy()
+    nb_channels = gamma.shape[0]
+    builder.add_batchnorm(
+        name=name,
+        channels=nb_channels,
+        gamma=gamma,
+        beta=beta,
+        mean=mean,
+        variance=variance,
+        input_name=input_name,
+        output_name=output_name,
+        epsilon=eps)
+
+
+def convert_concat(net, node, module, builder):
+    """Convert concat layer from mxnet to coreml.
+
+    Parameters
+    ----------
+    network: net
+        An mxnet network object.
+
+    layer: node
+        Node to convert.
+
+    module: module
+        A module for MXNet
+
+    builder: NeuralNetworkBuilder
+        A neural network builder object.
+    """
+    # Get input and output names
+    input_names, output_name = _get_input_output_name(net, node, 'all')
+    name = node['name']
+    mode = 'CONCAT'
+    builder.add_elementwise(name = name, input_names = input_names,
+            output_name = output_name, mode = mode)
+
+
+def convert_deconvolution(net, node, module, builder):
+    """Convert a deconvolution layer from mxnet to coreml.
+
+    Parameters
+    ----------
+    network: net
+        An mxnet network object.
+
+    layer: node
+        Node to convert.
+
+    module: module
+        A module for MXNet
+
+    builder: NeuralNetworkBuilder
+        A neural network builder object.
+    """
+    input_name, output_name = _get_input_output_name(net, node)
+    name = node['name']
+    param = node['attr']
+    inputs = node['inputs']
+    args, _ = module.get_params()
+
+    if 'no_bias' in param.keys():
+        has_bias = not literal_eval(param['no_bias'])
+    else:
+        has_bias = False
+
+    border_mode = "valid"
+
+    n_filters = int(param['num_filter'])
+
+    output_shape = None
+    if 'target_shape' in param:
+        target_shape = literal_eval(param['target_shape'])
+        output_shape = (int(target_shape[0]), int(target_shape[1]))
+
+    W = args[_get_node_name(net, inputs[1][0])].asnumpy()
+
+    if has_bias:
+        Wb = args[_get_node_name(net, inputs[2][0])].asnumpy()
+    else:
+        Wb = None
+
+    channels = W.shape[0]
+    stride_height, stride_width = literal_eval(param['stride'])
+    kernel_height, kernel_width = literal_eval(param['kernel'])
+    W = W.transpose((2, 3, 0, 1))
+
+    use_crop = False
+    if literal_eval(param['pad']) != (0, 0) and output_shape is None:
+        use_crop = True
+
+    builder.add_convolution(
+        name=name,
+        kernel_channels=channels,
+        output_channels=n_filters,
+        height=kernel_height,
+        width=kernel_width,
+        stride_height=stride_height,
+        stride_width=stride_width,
+        border_mode=border_mode,
+        groups=1,
+        W=W,
+        b=Wb,
+        has_bias=has_bias,
+        is_deconv=True,
+        output_shape=output_shape,
+        input_name=input_name,
+        output_name=output_name+'before_pad' if use_crop else output_name
+    )
+
+    if use_crop:
+        pad = literal_eval(param['pad'])
+        builder.add_crop(
+            name=name+"_pad",
+            left=pad[1],
+            right=pad[1],
+            top=pad[0],
+            bottom=pad[0],
+            offset=0,
+            input_names=[output_name+'before_pad'],
+            output_name=output_name
+        )
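convert_convolution above re-lays-out the weights with W.transpose((2, 3, 1, 0)): MXNet stores them as (out_channels, in_channels, kH, kW) while the CoreML builder expects (kH, kW, in_channels, out_channels). A pure-Python sketch of that index mapping on a tiny nested-list tensor (no numpy; the helper name and the sample tensor are ours):

```python
def transpose_2310(w):
    # w has shape (n_filters, channels, k_h, k_w); return shape
    # (k_h, k_w, channels, n_filters), i.e. out[h][x][c][f] == w[f][c][h][x].
    n_f, n_c = len(w), len(w[0])
    k_h, k_w = len(w[0][0]), len(w[0][0][0])
    return [[[[w[f][c][h][x] for f in range(n_f)]
              for c in range(n_c)]
             for x in range(k_w)]
            for h in range(k_h)]

# 2 filters, 1 channel, 2x2 kernel, distinct entries:
w_mx = [[[[1, 2], [3, 4]]], [[[5, 6], [7, 8]]]]
w_ml = transpose_2310(w_mx)
assert w_ml[0][1][0][1] == w_mx[1][0][0][1]  # h=0, w=1, c=0, f=1
```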
diff --git a/tools/coreml/_mxnet_converter.py b/tools/coreml/converter/_mxnet_converter.py
similarity index 69%
rename from tools/coreml/_mxnet_converter.py
rename to tools/coreml/converter/_mxnet_converter.py
index 88a980c..a9ea0f4 100644
--- a/tools/coreml/_mxnet_converter.py
+++ b/tools/coreml/converter/_mxnet_converter.py
@@ -35,10 +35,13 @@ _MXNET_LAYER_REGISTRY  = {
     'Concat'         : _layers.convert_concat,
     'BatchNorm'      : _layers.convert_batchnorm,
     'elemwise_add'   : _layers.convert_elementwise_add,
+    'Reshape'        : _layers.convert_reshape,
+    'Deconvolution'  : _layers.convert_deconvolution,
 }
 
 _MXNET_SKIP_LAYERS = [
     '_MulScalar',
+    'Dropout',
 ]
 
 def _mxnet_remove_batch(input_data):
@@ -73,7 +76,6 @@ def check_error(model, path, shapes, output = 'softmax_output', verbose = True):
 def _set_input_output_layers(builder, input_names, output_names):
     input_layers_indices = []
     output_layers_indices = []
-    spec = builder.spec
     layers = builder.spec.neuralNetwork.layers
     for idx, l in enumerate(layers):
         if set(input_names).intersection(l.input):
@@ -83,8 +85,8 @@ def _set_input_output_layers(builder, input_names, output_names):
 
     builder.input_layers_indices = input_layers_indices
     builder.output_layers_indices = output_layers_indices
-    builder.input_layers_is1d = [False for i in input_names]
-    builder.output_layers_is1d = [False for i in output_names]
+    builder.input_layers_is1d = [False for _ in input_names]
+    builder.output_layers_is1d = [False for _ in output_names]
 
 def _get_layer_converter_fn(layer):
     """Get the right converter function for MXNet
@@ -94,8 +96,9 @@ def _get_layer_converter_fn(layer):
     else:
         raise TypeError("MXNet layer of type %s is not supported." % layer)
 
-def convert(model, order = None, **kwargs):
-    """Convert a keras model to the protobuf spec.
+
+def convert(model, input_shape, order = None, class_labels = None, mode = None, preprocessor_args = None):
+    """Convert an MXNet model to the protobuf spec.
 
     Parameters
     ----------
@@ -104,33 +107,46 @@ def convert(model, order = None, **kwargs):
 
     order: Order of inputs
 
+    class_labels: A string or list of strings.
+        As a string it represents the name of the file which contains the classification labels (one per line).
+        As a list of strings it represents a list of categories that map the index of the output of a neural network to labels in a classifier.
+
+    mode: str ('classifier', 'regressor' or None)
+        Mode of the converted coreml model.
+        When mode = 'classifier', a NeuralNetworkClassifier spec will be constructed.
+        When mode = 'regressor', a NeuralNetworkRegressor spec will be constructed.
+
     **kwargs :
-        Provide keyword arguments of known shapes.
+        Provide keyword arguments for:
+        - input shapes, supplied as a dictionary object via the "input_shape" parameter.
+        - pre-processing arguments, supplied as a dictionary object via the "preprocessor_args" parameter. The parameters in the dictionary
+            tell the converted coreml model how to pre-process any input before an inference is run on it.
+            For the list of pre-processing arguments see
+            http://pythonhosted.org/coremltools/generated/coremltools.models.neural_network.html#coremltools.models.neural_network.NeuralNetworkBuilder.set_pre_processing_parameters
 
     Returns
     -------
-    model_spec: An object of type ModelSpec_pb.
-        Protobuf representation of the model
+    model: A coreml model.
     """
-    if not kwargs:
-        raise TypeError("Must provide input shape to be able to perform conversion")
+    if not isinstance(input_shape, dict):
+        raise TypeError("Must provide a dictionary for input shape. E.g. input_shape={'data':(3,224,224)}")
 
     def remove_batch(dim):
         return dim[1:]
 
     if order is None:
-        input_names = kwargs.keys()
-        input_dims  = map(remove_batch, kwargs.values())
+        input_names = input_shape.keys()
+        input_dims  = map(remove_batch, input_shape.values())
     else:
-        names = kwargs.keys()
-        shapes = map(remove_batch, kwargs.values())
+        names = input_shape.keys()
+        shapes = map(remove_batch, input_shape.values())
         input_names = [names[i] for i in order]
         input_dims = [shapes[i] for i in order]
 
     net = model.symbol
 
     # Infer shapes and store in a dictionary
-    shapes = net.infer_shape(**kwargs)
+    shapes = net.infer_shape(**input_shape)
     arg_names = net.list_arguments()
     output_names = net.list_outputs()
     aux_names = net.list_auxiliary_states()
@@ -142,7 +158,6 @@ def convert(model, order = None, **kwargs):
     for idx, op in enumerate(aux_names):
         shape_dict[op] = shapes[2][idx]
 
-
     # Get the inputs and outputs
     output_dims = shapes[1]
     input_types = [_datatypes.Array(*dim) for dim in input_dims]
@@ -151,11 +166,11 @@ def convert(model, order = None, **kwargs):
     # Make the builder
     input_features = zip(input_names, input_types)
     output_features = zip(output_names, output_types)
-    builder = _neural_network.NeuralNetworkBuilder(input_features, output_features)
-
+    builder = _neural_network.NeuralNetworkBuilder(input_features, output_features, mode)
     # Get out the layers
     net = _json.loads(net.tojson())
     nodes = net['nodes']
+
     for i, node in enumerate(nodes):
         node['id'] = i
 
@@ -178,7 +193,7 @@ def convert(model, order = None, **kwargs):
         head_node['shape'] = shape_dict[head_node['name']]
 
     # For skipped layers, make sure nodes are modified
-    for iter, node in enumerate(nodes):
+    for node in nodes:
         op = node['op']
         inputs = node['inputs']
         outputs = node['outputs']
@@ -187,24 +202,30 @@ def convert(model, order = None, **kwargs):
             nodes[outputs[0][0]]['inputs'][0] = inputs[0]
 
     # Find the input and output names for this node
-    for iter, node in enumerate(nodes):
+    for idx, node in enumerate(nodes):
         op = node['op']
         if op == 'null' or op in _MXNET_SKIP_LAYERS:
             continue
         name = node['name']
-        print("%d : %s, %s" % (iter, name, op))
+        print("%d : %s, %s" % (idx, name, op))
         converter_func = _get_layer_converter_fn(op)
         converter_func(net, node, model, builder)
 
-    spec = builder.spec
-    layers = spec.neuralNetwork.layers
-
     # Set the right inputs and outputs
     _set_input_output_layers(builder, input_names, output_names)
     builder.set_input(input_names, input_dims)
     builder.set_output(output_names, output_dims)
+    if preprocessor_args is not None:
+        builder.set_pre_processing_parameters(**preprocessor_args)
+
+    if class_labels is not None:
+        if isinstance(class_labels, str):
+            labels = [l.strip() for l in open(class_labels).readlines()]
+        elif isinstance(class_labels, list):
+            labels = class_labels
+        else:
+            raise TypeError("class_labels must be a string (path to a synset file) or a list of strings. Type found: %s." % type(class_labels))
+        builder.set_class_labels(class_labels = labels)
 
-    # Return the spec
-    spec = builder.spec
-    layers = spec.neuralNetwork.layers
-    return spec
+    # Return the model
+    return _coremltools.models.MLModel(builder.spec)
\ No newline at end of file
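The class_labels handling in convert() accepts either a synset file path or an explicit list of labels. A minimal standalone sketch of that branch (the helper name resolve_class_labels is ours, not part of the commit):

```python
def resolve_class_labels(class_labels):
    """Return a list of labels from either a synset file path or a list."""
    if isinstance(class_labels, str):
        # One label per line, as in a synset file.
        with open(class_labels) as f:
            return [line.strip() for line in f]
    if isinstance(class_labels, list):
        return class_labels
    raise TypeError("class_labels must be a string or a list of strings, "
                    "got %s" % type(class_labels))

# Labels supplied directly as a list pass through unchanged.
print(resolve_class_labels(["cat", "dog"]))  # prints ['cat', 'dog']
```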
diff --git a/tools/coreml/mxnet_coreml_converter.py b/tools/coreml/mxnet_coreml_converter.py
new file mode 100644
index 0000000..502377e
--- /dev/null
+++ b/tools/coreml/mxnet_coreml_converter.py
@@ -0,0 +1,114 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import print_function
+import argparse
+from converter._mxnet_converter import convert
+from utils import load_model
+import yaml
+from ast import literal_eval
+
+
+if __name__ == '__main__':
+    parser = argparse.ArgumentParser(description='Converts an MXNet model to a CoreML model')
+
+    parser.add_argument(
+        '--model-prefix', required=True, type=str,
+        help="Prefix of the existing model. The model is expected to be stored in the same directory from which "
+             "this tool is being run. E.g. --model-prefix=squeezenet_v1.1. Note that this may also include a "
+             "directory path. E.g. --model-prefix=~/Downloads/squeezenet_v1.1."
+    )
+    parser.add_argument(
+        '--epoch', required=True, type=int,
+        help="The suffix of the MXNet model name, which usually indicates the number of epochs. E.g. --epoch=0"
+    )
+    parser.add_argument(
+        '--output-file', required=True, type=str,
+        help="File where the resulting CoreML model will be saved. E.g. --output-file=\"squeezenet-v11.mlmodel\""
+    )
+    parser.add_argument(
+        '--input-shape', required=True, type=str,
+        help="Input shape information in a JSON string format. E.g. --input-shape='{\"data\":\"3,224,224\"}' where"
+             " 'data' is the name of the input variable of the MXNet model and '3,224,224' is the shape "
+             "(channels, height and width) of the input image data."
+    )
+    parser.add_argument(
+        '--label-names', required=False, type=str, default='softmax_label',
+        help="Label names of the MXNet model's output variables. E.g. --label-names=softmax_label. "
+             "(Usually this is the name of the last layer followed by the suffix _label.)"
+    )
+    parser.add_argument(
+        '--mode', required=False, type=str, default=None,
+        help="When mode='classifier', a CoreML NeuralNetworkClassifier will be constructed. "
+             "When mode='regressor', a CoreML NeuralNetworkRegressor will be constructed. "
+             "When mode=None (default), a CoreML NeuralNetwork will be constructed."
+    )
+    parser.add_argument(
+        '--class-labels', required=False, type=str, default=None,
+        help="Name of the file which contains the classification labels (synset file), one label per line."
+    )
+    parser.add_argument(
+        '--pre-processing-arguments', required=False, type=str, default=None,
+        help="The parameters in the dictionary tell the converted coreml model how to pre-process any input "
+             "before an inference is run on it. For the list of pre-processing arguments see https://goo.gl/GzFe86. "
+             "E.g. --pre-processing-arguments='{\"red_bias\": 127, \"blue_bias\":117, \"green_bias\": 103}'"
+    )
+
+    # TODO
+    # We need to test how to use the order
+    # parser.add_argument(
+    #     '--order', required=True, type=str, default=None,
+    #     help=""
+    # )
+
+    args, unknown = parser.parse_known_args()
+
+    model_name = args.model_prefix
+    epoch_num = args.epoch
+    output_file = args.output_file
+    mode = args.mode
+    class_labels = args.class_labels
+
+    # parse the input data name/shape and label name/shape
+    input_shape = yaml.safe_load(args.input_shape)
+    data_shapes = []
+    for key in input_shape:
+        # We prepend 1 because the coreml model only accepts one input at a time.
+        shape = (1,)+literal_eval(input_shape[key])
+        input_shape[key] = shape
+        data_shapes.append((key, shape))
+
+    # if label name is not in input then do not use the label
+    label_names = [args.label_names,] if args.label_names in input_shape else None
+
+    pre_processing_arguments = args.pre_processing_arguments
+
+    mod = load_model(
+        model_name=model_name,
+        epoch_num=epoch_num,
+        data_shapes=data_shapes,
+        label_shapes=None,
+        label_names=label_names
+    )
+
+    kwargs = {'input_shape': input_shape}
+    if pre_processing_arguments is not None:
+        kwargs['preprocessor_args'] = yaml.safe_load(pre_processing_arguments)
+
+    coreml_model = convert(model=mod, mode=mode, class_labels=class_labels, **kwargs)
+    coreml_model.save(output_file)
+    print("\nSUCCESS\nModel %s has been converted and saved at %s\n" % (model_name, output_file))
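The script parses --input-shape with yaml.safe_load plus literal_eval and prepends a batch dimension of 1 to each shape. The same parsing can be sketched with only the standard library (json in place of yaml; the function name parse_input_shape is ours):

```python
import json
from ast import literal_eval

def parse_input_shape(arg):
    """Parse a string like '{"data":"3,224,224"}' into {"data": (1, 3, 224, 224)}.

    A leading 1 is prepended because the CoreML model accepts one input at a time.
    """
    raw = json.loads(arg)
    return {name: (1,) + literal_eval(shape) for name, shape in raw.items()}

print(parse_input_shape('{"data":"3,224,224"}'))  # prints {'data': (1, 3, 224, 224)}
```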
diff --git a/tools/coreml/test/test_mxnet_converter.py b/tools/coreml/test/test_mxnet_converter.py
new file mode 100644
index 0000000..6692b44
--- /dev/null
+++ b/tools/coreml/test/test_mxnet_converter.py
@@ -0,0 +1,949 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import unittest
+import mxnet as mx
+import numpy as np
+import sys
+import os
+current_working_directory = os.getcwd()
+sys.path.append(current_working_directory + "/..")
+sys.path.append(current_working_directory + "/../converter/")
+import _mxnet_converter as mxnet_converter
+from collections import namedtuple
+
+
+def _mxnet_remove_batch(input_data):
+    for blob in input_data:
+        input_data[blob] = np.reshape(input_data[blob], input_data[blob].shape[1:])
+    return input_data
+
+
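_mxnet_remove_batch reshapes each blob to drop the leading batch dimension before calling CoreML's predict(). For a batch size of exactly 1, the same effect can be shown with plain Python lists (a sketch of the idea, not the numpy-based helper above):

```python
def remove_batch(input_data):
    # Assuming every blob has a leading batch dimension of exactly 1,
    # dropping it is equivalent to reshaping to shape[1:].
    return {name: blob[0] for name, blob in input_data.items()}

print(remove_batch({'data': [[1.0, 2.0, 3.0]]}))  # prints {'data': [1.0, 2.0, 3.0]}
```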
+def _get_mxnet_module(net, input_shape, mode, label_names, input_names=None):
+    """ Given a symbolic graph, input shape and the initialization mode,
+        returns an MXNet module.
+    """
+    mx.random.seed(1993)
+
+    mod = mx.mod.Module(
+        symbol=net,
+        context=mx.cpu(),
+        label_names=label_names
+    )
+    mod.bind(
+        for_training=False,
+        data_shapes=[('data', input_shape)],
+        label_shapes=input_names
+    )
+    if mode == 'random':
+        mod.init_params(
+            initializer=mx.init.Uniform(scale=.1)
+        )
+    elif mode == 'zeros':
+        mod.init_params(
+            initializer=mx.init.Zero()
+        )
+    elif mode == 'ones':
+        mod.init_params(
+            initializer=mx.init.One()
+        )
+    else:
+        raise KeyError("%s is not a valid initialization mode" % mode)
+
+    return mod
+
+
+class SingleLayerTest(unittest.TestCase):
+    """
+    Unit test class for testing whether the converter is able to convert individual layers.
+    To do so, it converts a model, generates predictions with both CoreML and MXNet, and checks that they are the same.
+    """
+    def _test_mxnet_model(self, net, input_shape, mode, class_labels=None, coreml_mode=None, label_names=None, delta=1e-3,
+                          pre_processing_args=None):
+        """ Helper method that converts the MXNet model into CoreML and compares the predictions over random data.
+
+        Parameters
+        ----------
+        net: MXNet Symbol Graph
+            The graph that we'll be converting into CoreML.
+
+        input_shape: tuple of ints
+            The shape of input data. Generally of the format (batch-size, channels, height, width)
+
+        mode: (random|zeros|ones)
+            The mode to use in order to set the parameters (weights and biases).
+
+        class_labels: list of strings or None
+            The categories to attach when the converted model is a classifier. Default: None
+
+        coreml_mode: str ('classifier', 'regressor' or None)
+            The mode of the converted coreml model. Default: None
+
+        label_names: list of strings
+            The names of the output labels. Default: None
+
+        delta: float
+            The maximum difference b/w predictions of MXNet and CoreML that is tolerable.
+
+        pre_processing_args: dict or None
+            Pre-processing arguments passed on to the converter. Default: None
+        """
+        mod = _get_mxnet_module(net, input_shape, mode, label_names)
+
+        # Generate some dummy data
+        input_data = {'data': np.random.uniform(-10., 10., input_shape)}
+        Batch = namedtuple('Batch', ['data'])
+        mod.forward(Batch([mx.nd.array(input_data['data'])]))
+        mxnet_preds = mod.get_outputs()[0].asnumpy().flatten()
+
+        # Get predictions from coreml
+        coreml_model = mxnet_converter.convert(
+            model=mod,
+            class_labels=class_labels,
+            mode=coreml_mode,
+            input_shape={'data': input_shape},
+            preprocessor_args=pre_processing_args
+        )
+        coreml_preds = coreml_model.predict(_mxnet_remove_batch(input_data)).values()[0].flatten()
+
+        # Check prediction accuracy
+        self.assertEqual(len(mxnet_preds), len(coreml_preds))
+        for i in range(len(mxnet_preds)):
+            self.assertAlmostEqual(mxnet_preds[i], coreml_preds[i], delta=delta)
+
+    def test_tiny_inner_product_zero_input(self):
+        np.random.seed(1988)
+        input_shape = (1, 10)
+        net = mx.sym.Variable('data')
+        net = mx.sym.FullyConnected(data=net, name='fc1', num_hidden=5)
+        self._test_mxnet_model(net, input_shape=input_shape, mode='zeros')
+
+    def test_really_tiny_inner_product_ones_input(self):
+        np.random.seed(1988)
+        input_shape = (1, 1)
+        net = mx.sym.Variable('data')
+        net = mx.sym.FullyConnected(data=net, name='fc1', num_hidden=1)
+        self._test_mxnet_model(net, input_shape=input_shape, mode='ones')
+
+    def test_really_tiny_2_inner_product_ones_input(self):
+        np.random.seed(1988)
+        input_shape = (1, 1)
+        net = mx.sym.Variable('data')
+        net = mx.sym.FullyConnected(data=net, name='fc1', num_hidden=5)
+        self._test_mxnet_model(net, input_shape=input_shape, mode='ones')
+
+    def test_tiny_inner_product_ones_input(self):
+        np.random.seed(1988)
+        input_shape = (1, 10)
+        net = mx.sym.Variable('data')
+        net = mx.sym.FullyConnected(data=net, name='fc1', num_hidden=5)
+        self._test_mxnet_model(net, input_shape=input_shape, mode='ones')
+
+    def test_tiny_inner_product_random_input(self):
+        np.random.seed(1988)
+        input_shape = (1, 10)
+        net = mx.sym.Variable('data')
+        net = mx.sym.FullyConnected(data=net, name='fc1', num_hidden=5)
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_tiny_softmax_random_input(self):
+        np.random.seed(1988)
+        input_shape = (1, 10)
+        net = mx.sym.Variable('data')
+        net = mx.sym.FullyConnected(data=net, name='fc1', num_hidden=5)
+        net = mx.sym.SoftmaxOutput(net, name='softmax')
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random', label_names=['softmax_label'])
+
+    def test_tiny_relu_activation_random_input(self):
+        np.random.seed(1988)
+        input_shape = (1, 10)
+        net = mx.sym.Variable('data')
+        net = mx.sym.FullyConnected(data=net, name='fc1', num_hidden=5)
+        net = mx.sym.Activation(net, name='relu1', act_type="relu")
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_tiny_sigmoid_activation_random_input(self):
+        np.random.seed(1988)
+        input_shape = (1, 10)
+        net = mx.sym.Variable('data')
+        net = mx.sym.FullyConnected(data=net, name='fc1', num_hidden=5)
+        net = mx.sym.Activation(net, name='sigmoid1', act_type="sigmoid")
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_tiny_tanh_activation_random_input(self):
+        np.random.seed(1988)
+        input_shape = (1, 10)
+
+        # Define a model
+        net = mx.sym.Variable('data')
+        net = mx.sym.FullyConnected(data=net, name='fc1', num_hidden=5)
+        net = mx.sym.Activation(net, name='tanh1', act_type="tanh")
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_really_tiny_conv_random_input(self):
+        np.random.seed(1988)
+        input_shape = (1, 1, 10, 10)
+        num_filter = 1
+        kernel = (1, 1)
+        stride = (1, 1)
+        pad = (0, 0)
+
+        # Define a model
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Convolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            name='conv_1'
+        )
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_tiny_conv_ones_input(self):
+        np.random.seed(1988)
+        input_shape = (1, 1, 10, 10)
+        num_filter = 1
+        kernel = (5, 5)
+        stride = (1, 1)
+        pad = (0, 0)
+
+        # Define a model
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Convolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            name='conv_1'
+        )
+        self._test_mxnet_model(net, input_shape=input_shape, mode='ones')
+
+    def test_tiny_conv_random_input(self):
+        np.random.seed(1988)
+        input_shape = (1, 1, 10, 10)
+        num_filter = 1
+        kernel = (5, 5)
+        stride = (1, 1)
+        pad = (0, 0)
+
+        # define a model
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Convolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            name='conv_1'
+        )
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_tiny_asym_conv_random_input(self):
+        np.random.seed(1988)
+        input_shape = (1, 1, 10, 10)
+        num_filter = 1
+        kernel = (5, 3)
+        stride = (1, 1)
+        pad = (0, 0)
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Convolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            name='conv_1'
+        )
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_tiny_asym_conv_random_asym_input(self):
+        np.random.seed(1988)
+        input_shape = (1, 1, 28, 18)
+        num_filter = 16
+        kernel = (5, 3)
+        stride = (1, 1)
+        pad = (0, 0)
+        dilate = (1, 1)
+        # define a model
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Convolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            name='conv_1',
+            dilate=dilate)
+        net = mx.sym.Activation(net, name='tanh', act_type="tanh")
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_tiny_conv_valid_pooling_random_input(self):
+        np.random.seed(1988)
+        input_shape = (1, 1, 10, 10)
+        num_filter = 1
+        kernel = (2, 2)
+        stride = (2, 2)
+        pad = (0, 0)
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Convolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            name='conv_1'
+        )
+        net = mx.symbol.Pooling(
+            data=net,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            name='pool_1',
+            pool_type='avg',
+            pooling_convention='valid'
+        )
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_tiny_conv_pooling_full_random_input(self):
+        np.random.seed(1988)
+        input_shape = (1, 1, 10, 10)
+        num_filter = 1
+        kernel = (2, 2)
+        stride = (2, 2)
+        pad = (0, 0)
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Convolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            name='conv_1'
+        )
+        net = mx.symbol.Pooling(
+            data=net,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            name='pool_1',
+            pool_type='avg',
+            pooling_convention='full'
+        )
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_tiny_conv_pooling_full_random_input_with_padding(self):
+        np.random.seed(1988)
+        input_shape = (1, 3, 10, 10)
+        num_filter = 2
+        kernel = (2, 2)
+        stride = (2, 2)
+        pad = (1, 1)
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Convolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            name='conv_1'
+        )
+        net = mx.symbol.Pooling(
+            data=net,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            name='pool_1',
+            pool_type='avg',
+            pooling_convention='full'
+        )
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_really_tiny_conv_random_3d_input(self):
+        np.random.seed(1988)
+        input_shape = (1, 3, 10, 10)
+        num_filter = 1
+        kernel = (1, 1)
+        stride = (1, 1)
+        pad = (0, 0)
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Convolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            name='conv_1'
+        )
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_really_tiny_conv_random_input_multi_filter(self):
+        np.random.seed(1988)
+        input_shape = (1, 1, 10, 10)
+        num_filter = 64
+        kernel = (1, 1)
+        stride = (1, 1)
+        pad = (0, 0)
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Convolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            name='conv_1'
+        )
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_tiny_conv_random_3d_input(self):
+        np.random.seed(1988)
+        input_shape = (1, 3, 10, 10)
+        num_filter = 1
+        kernel = (5, 5)
+        stride = (1, 1)
+        pad = (0, 0)
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Convolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            name='conv_1'
+        )
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_tiny_conv_random_input_multi_filter(self):
+        np.random.seed(1988)
+        input_shape = (1, 1, 10, 10)
+        num_filter = 64
+        kernel = (5, 5)
+        stride = (1, 1)
+        pad = (0, 0)
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Convolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            name='conv_1'
+        )
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_conv_random(self):
+        np.random.seed(1988)
+        input_shape = (1, 3, 10, 10)
+        num_filter = 64
+        kernel = (5, 5)
+        stride = (1, 1)
+        pad = (0, 0)
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Convolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            name='conv_1'
+        )
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_flatten(self):
+        np.random.seed(1988)
+        input_shape = (1, 3, 10, 10)
+        num_filter = 64
+        kernel = (5, 5)
+        stride = (1, 1)
+        pad = (0, 0)
+
+        # define a model
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Convolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            name='conv_1'
+        )
+        net = mx.sym.Flatten(data=net, name='flatten1')
+        net = mx.sym.FullyConnected(data=net, name='fc1', num_hidden=5)
+        net = mx.sym.SoftmaxOutput(net, name='softmax')
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random', label_names=['softmax_label'])
+
+    def test_transpose(self):
+        np.random.seed(1988)
+        input_shape = (1, 3, 10, 10)
+        num_filter = 64
+        kernel = (5, 5)
+        stride = (1, 1)
+        pad = (0, 0)
+
+        net = mx.sym.Variable('data')
+        net = mx.sym.transpose(data=net, name='transpose', axes=(0, 1, 2, 3))
+        net = mx.symbol.Convolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            name='conv_1'
+        )
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_reshape(self):
+        np.random.seed(1988)
+        input_shape = (1, 8)
+        net = mx.sym.Variable('data')
+        net = mx.sym.reshape(data=net, shape=(1, 2, 2, 2))
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_tiny_synset_random_input(self):
+        np.random.seed(1989)
+        input_shape = (1, 10)
+        net = mx.sym.Variable('data')
+        net = mx.sym.FullyConnected(data=net, name='fc1', num_hidden=5)
+        net = mx.sym.SoftmaxOutput(net, name='softmax')
+        mod = _get_mxnet_module(net,
+                                input_shape=input_shape,
+                                mode='random',
+                                label_names=['softmax_label'])
+
+        # Generate some dummy data
+        input_data = np.random.uniform(-0.1, 0.1, input_shape)
+
+        Batch = namedtuple('Batch', ['data'])
+        mod.forward(Batch([mx.nd.array(input_data)]))
+
+        kwargs = {'input_shape': {'data': input_shape}}
+        # Get predictions from coreml
+        coreml_model = mxnet_converter.convert(
+            model=mod,
+            class_labels=['Category1', 'Category2', 'Category3', 'Category4', 'Category5'],
+            mode='classifier',
+            **kwargs
+        )
+
+        prediction = coreml_model.predict(_mxnet_remove_batch({'data': input_data}))
+        self.assertEqual(prediction['classLabel'], 'Category3')
+
+    def test_really_tiny_deconv_random_input(self):
+        np.random.seed(1988)
+        input_shape = (1, 1, 10, 10)
+        num_filter = 1
+        kernel = (1, 1)
+        stride = (1, 1)
+        pad = (0, 0)
+
+        # Define a model
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Deconvolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            name='deconv_1'
+        )
+        # Test the mxnet model
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_tiny_deconv_ones_input(self):
+        np.random.seed(1988)
+        input_shape = (1, 1, 10, 10)
+        num_filter = 1
+        kernel = (5, 5)
+        stride = (1, 1)
+        pad = (0, 0)
+
+        # Define a model
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Deconvolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            name='deconv_1'
+        )
+        # Test the mxnet model
+        self._test_mxnet_model(net, input_shape=input_shape, mode='ones')
+
+    def test_tiny_deconv_random_input(self):
+        np.random.seed(1988)
+        input_shape = (1, 1, 10, 10)
+        num_filter = 1
+        kernel = (5, 5)
+        stride = (1, 1)
+        pad = (0, 0)
+
+        # Define a model
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Deconvolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            name='deconv_1'
+        )
+        # Test the mxnet model
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_tiny_asym_deconv_random_input(self):
+        np.random.seed(1988)
+        input_shape = (1, 1, 10, 10)
+        num_filter = 1
+        kernel = (5, 3)
+        stride = (1, 1)
+        pad = (0, 0)
+
+        # Define a model
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Deconvolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            name='deconv_1'
+        )
+        # Test the mxnet model
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_tiny_asym_deconv_random_asym_input(self):
+        np.random.seed(1988)
+        input_shape = (1, 1, 28, 18)
+        num_filter = 16
+        kernel = (5, 3)
+        stride = (1, 1)
+        pad = (0, 0)
+        dilate = (1, 1)
+
+        # define a model
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Deconvolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            dilate=dilate,
+            name='deconv_1'
+        )
+        net = mx.sym.Activation(net, name='tanh', act_type="tanh")
+        # Test the mxnet model
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_tiny_deconv_pooling_random_input(self):
+        np.random.seed(1988)
+        input_shape = (1, 1, 10, 10)
+        num_filter = 1
+        kernel = (5, 5)
+        stride = (1, 1)
+        pad = (0, 0)
+
+        # define a model
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Deconvolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            name='deconv_1'
+        )
+        net = mx.symbol.Pooling(
+            data=net,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            name='pool_1',
+            pool_type='max'
+        )
+        # Test the mxnet model
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_really_tiny_deconv_random_3d_input(self):
+        np.random.seed(1988)
+        input_shape = (1, 3, 10, 10)
+        num_filter = 1
+        kernel = (1, 1)
+        stride = (1, 1)
+        pad = (0, 0)
+
+        # define a model
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Deconvolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            name='deconv_1'
+        )
+        # Test the mxnet model
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_really_tiny_deconv_random_input_multi_filter(self):
+        np.random.seed(1988)
+        input_shape = (1, 1, 10, 10)
+        num_filter = 64
+        kernel = (1, 1)
+        stride = (1, 1)
+        pad = (0, 0)
+
+        # define a model
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Deconvolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            name='deconv_1'
+        )
+        # Test the mxnet model
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_tiny_deconv_random_3d_input(self):
+        np.random.seed(1988)
+        input_shape = (1, 3, 10, 10)
+        num_filter = 1
+        kernel = (5, 5)
+        stride = (1, 1)
+        pad = (0, 0)
+
+        # define a model
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Deconvolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            name='deconv_1'
+        )
+        # Test the mxnet model
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_tiny_deconv_random_input_multi_filter(self):
+        np.random.seed(1988)
+        input_shape = (1, 1, 10, 10)
+        num_filter = 64
+        kernel = (5, 5)
+        stride = (1, 1)
+        pad = (0, 0)
+
+        # define a model
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Deconvolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            name='deconv_1'
+        )
+        # Test the mxnet model
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_deconv_random(self):
+        np.random.seed(1988)
+        input_shape = (1, 10, 4, 4)
+        num_filter = 3
+        kernel = (2, 2)
+        stride = (1, 1)
+        pad = (0, 0)
+
+        # define a model
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Deconvolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            no_bias=False,
+            name='deconv_1'
+        )
+        # test the mxnet model
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_deconv_random_output_shape(self):
+        np.random.seed(1988)
+        input_shape = (1, 10, 4, 4)
+        num_filter = 3
+        kernel = (2, 2)
+        stride = (1, 1)
+        pad = (0, 0)
+        target_shape = (5, 5)
+
+        # define a model
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Deconvolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            no_bias=False,
+            target_shape=target_shape,
+            name='deconv_1'
+        )
+        # test the mxnet model
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_deconv_random_padding(self):
+        np.random.seed(1988)
+        input_shape = (1, 10, 9, 9)
+        num_filter = 3
+        kernel = (3, 3)
+        stride = (3, 3)
+        pad = (2, 2)
+
+        # define a model
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Deconvolution(
+                data=net,
+                num_filter=num_filter,
+                kernel=kernel,
+                stride=stride,
+                pad=pad,
+                no_bias=False,
+                name='deconv_1')
+        # test the mxnet model
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_conv_random_padding_odd(self):
+        np.random.seed(1988)
+        input_shape = (1, 10, 6, 6)
+        num_filter = 3
+        kernel = (5, 5)
+        stride = (1, 1)
+        pad = (3, 3)
+
+        # define a model
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Convolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            no_bias=False,
+            name='conv_1'
+        )
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_conv_random_padding_even(self):
+        np.random.seed(1988)
+        input_shape = (1, 10, 6, 6)
+        num_filter = 3
+        kernel = (5, 5)
+        stride = (1, 1)
+        pad = (2, 2)
+
+        # define a model
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Convolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            no_bias=False,
+            name='conv_1'
+        )
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_deconv_random_all_inputs(self):
+        np.random.seed(1988)
+        input_shape = (1, 10, 5, 5)
+        num_filter = 3
+        kernel = (3, 3)
+        stride = (2, 2)
+        pad = (1, 1)
+        dilate = (1, 1)
+        target_shape = (11, 11)
+
+        # define a model
+        net = mx.sym.Variable('data')
+        net = mx.symbol.Deconvolution(
+            data=net,
+            num_filter=num_filter,
+            kernel=kernel,
+            stride=stride,
+            pad=pad,
+            no_bias=False,
+            target_shape=target_shape,
+            dilate=dilate,
+            name='deconv_1'
+        )
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random')
+
+    def test_batch_norm(self):
+        np.random.seed(1988)
+        input_shape = (1, 1, 2, 3)
+
+        net = mx.sym.Variable('data')
+        gamma = mx.sym.Variable('gamma')
+        beta = mx.sym.Variable('beta')
+        moving_mean = mx.sym.Variable('moving_mean')
+        moving_var = mx.sym.Variable('moving_var')
+        net = mx.symbol.BatchNorm(
+            data=net,
+            gamma=gamma,
+            beta=beta,
+            moving_mean=moving_mean,
+            moving_var=moving_var,
+            use_global_stats=True,
+            name='batch_norm_1')
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random', delta=1e-2)
+
+    def test_batch_norm_no_global_stats(self):
+        """ This test should throw an exception, since the converter doesn't
+            support conversion of MXNet models that use local batch stats
+            (i.e. use_global_stats=False). The reason is that CoreML doesn't
+            support local batch stats.
+        """
+        np.random.seed(1988)
+        input_shape = (1, 1, 2, 3)
+
+        net = mx.sym.Variable('data')
+        gamma = mx.sym.Variable('gamma')
+        beta = mx.sym.Variable('beta')
+        moving_mean = mx.sym.Variable('moving_mean')
+        moving_var = mx.sym.Variable('moving_var')
+        net = mx.symbol.BatchNorm(
+            data=net,
+            gamma=gamma,
+            beta=beta,
+            moving_mean=moving_mean,
+            moving_var=moving_var,
+            use_global_stats=False,
+            name='batch_norm_1')
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random', delta=1e-2)
+
+    def test_pre_processing_args(self):
+        np.random.seed(1988)
+        input_shape = (1, 10)
+        net = mx.sym.Variable('data')
+        net = mx.sym.FullyConnected(data=net, name='fc1', num_hidden=5)
+        net = mx.sym.SoftmaxOutput(net, name='softmax')
+        self._test_mxnet_model(net, input_shape=input_shape, mode='random', label_names=['softmax_label'],
+                               pre_processing_args={'red_bias':0, 'blue_bias':0, 'green_bias':0, 'image_scale':1})
+
+    # TODO test_concat
+
+
+if __name__ == '__main__':
+    suite = unittest.TestLoader().loadTestsFromTestCase(SingleLayerTest)
+    unittest.TextTestRunner(verbosity=2).run(suite)
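For readers following the single-layer tests outside the harness: the core check performed by `_test_mxnet_model` is elementwise agreement of two flattened prediction vectors within a delta. That comparison can be sketched standalone in plain NumPy (hypothetical helper name; no mxnet or coremltools required):

```python
import numpy as np

def assert_predictions_close(preds_a, preds_b, delta=1e-3):
    """Compare two flattened prediction vectors elementwise, mirroring the
    assertAlmostEqual(..., delta=delta) loop used by the single-layer tests."""
    preds_a = np.asarray(preds_a).flatten()
    preds_b = np.asarray(preds_b).flatten()
    assert len(preds_a) == len(preds_b), "prediction lengths differ"
    max_diff = float(np.max(np.abs(preds_a - preds_b)))
    assert max_diff <= delta, "max difference %g exceeds delta %g" % (max_diff, delta)
    return max_diff

# Two nearly identical prediction vectors pass the check
a = np.array([0.1, 0.2, 0.7])
b = a + 5e-4
assert_predictions_close(a, b, delta=1e-3)
```

A tighter `delta` (the tests above use 1e-3 by default and 1e-2 for batch norm) trades false failures from floating-point drift against sensitivity to real conversion bugs.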
diff --git a/tools/coreml/test/test_mxnet_image.py b/tools/coreml/test/test_mxnet_image.py
new file mode 100644
index 0000000..ac30ac7
--- /dev/null
+++ b/tools/coreml/test/test_mxnet_image.py
@@ -0,0 +1,136 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import mxnet as mx
+import numpy as np
+import unittest
+import sys
+import os
+current_working_directory = os.getcwd()
+sys.path.append(current_working_directory + "/..")
+sys.path.append(current_working_directory + "/../converter/")
+import _mxnet_converter as mxnet_converter
+from utils import load_model
+
+
+VAL_DATA = 'data/val-5k-256.rec'
+URL = 'http://data.mxnet.io/data/val-5k-256.rec'
+
+
+def download_data():
+    return mx.test_utils.download(URL, VAL_DATA)
+
+
+def read_image(data_val, label_name):
+    data = mx.io.ImageRecordIter(
+        path_imgrec=data_val,
+        label_width=1,
+        preprocess_threads=4,
+        batch_size=32,
+        data_shape=(3,224,224),
+        label_name=label_name,
+        rand_crop=False,
+        rand_mirror=False,
+        shuffle=True
+    )
+    return data
+
+
+def is_correct_top_one(predict, label):
+    assert isinstance(predict, np.ndarray)
+    assert isinstance(label, np.float32)
+    predicted_label = np.argmax(predict)
+    return predicted_label == label
+
+
+def is_correct_top_five(predict, label):
+    assert isinstance(predict, np.ndarray)
+    assert isinstance(label, np.float32)
+    top_five_preds = set(predict.argsort()[-5:])
+    return label in top_five_preds
+
+
+class ImageNetTest(unittest.TestCase):
+    def _test_image_prediction(self, model_name, epoch, label_name):
+        try:
+            data = read_image(VAL_DATA, label_name=label_name)
+        except:
+            download_data()
+            data = read_image(VAL_DATA, label_name=label_name)
+
+        mod = load_model(
+            model_name=model_name,
+            epoch_num=epoch,
+            data_shapes=data.provide_data,
+            label_shapes=data.provide_label,
+            label_names=[label_name,]
+        )
+
+        input_shape = (1, 3, 224, 224)
+        coreml_model = mxnet_converter.convert(mod, input_shape={'data': input_shape})
+
+        mxnet_acc = []
+        mxnet_top_5_acc = []
+        coreml_acc = []
+        coreml_top_5_acc = []
+
+        num_batch = 0
+
+        for batch in data:
+            mod.forward(batch, is_train=False)
+            mxnet_preds = mod.get_outputs()[0].asnumpy()
+            data_numpy = batch.data[0].asnumpy()
+            label_numpy = batch.label[0].asnumpy()
+            for i in xrange(32):
+                input_data = {'data': data_numpy[i]}
+                coreml_predict = coreml_model.predict(input_data).values()[0].flatten()
+                mxnet_predict = mxnet_preds[i]
+                label = label_numpy[i]
+                mxnet_acc.append(is_correct_top_one(mxnet_predict, label))
+                mxnet_top_5_acc.append(is_correct_top_five(mxnet_predict, label))
+                coreml_acc.append(is_correct_top_one(coreml_predict, label))
+                coreml_top_5_acc.append(is_correct_top_five(coreml_predict, label))
+            num_batch += 1
+            if num_batch == 5:
+                break  # we only use a subset of the batches
+
+        print "MXNet acc %s" % np.mean(mxnet_acc)
+        print "Coreml acc %s" % np.mean(coreml_acc)
+        print "MXNet top 5 acc %s" % np.mean(mxnet_top_5_acc)
+        print "Coreml top 5 acc %s" % np.mean(coreml_top_5_acc)
+        self.assertAlmostEqual(np.mean(mxnet_acc), np.mean(coreml_acc), delta=1e-4)
+        self.assertAlmostEqual(np.mean(mxnet_top_5_acc), np.mean(coreml_top_5_acc), delta=1e-4)
+
+    def test_squeezenet(self):
+        print "Testing Image Classification with Squeezenet"
+        self._test_image_prediction(model_name='squeezenet_v1.1', epoch=0, label_name='prob_label')
+
+    def test_inception_with_batch_normalization(self):
+        print "Testing Image Classification with Inception/BatchNorm"
+        self._test_image_prediction(model_name='Inception-BN', epoch=126, label_name='softmax_label')
+
+    def test_resnet18(self):
+        print "Testing Image Classification with ResNet18"
+        self._test_image_prediction(model_name='resnet-18', epoch=0, label_name='softmax_label')
+
+    def test_vgg16(self):
+        print "Testing Image Classification with vgg16"
+        self._test_image_prediction(model_name='vgg16', epoch=0, label_name='prob_label')
+
+
+if __name__ == '__main__':
+    suite = unittest.TestLoader().loadTestsFromTestCase(ImageNetTest)
+    unittest.TextTestRunner(verbosity=2).run(suite)
\ No newline at end of file
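The top-1/top-5 bookkeeping in the image tests above depends only on NumPy. Stripped of the MXNet/CoreML plumbing, the accuracy check can be sketched standalone (illustrative helper name, not part of the test file):

```python
import numpy as np

def top_k_correct(scores, label, k):
    """True if `label` is among the k highest-scoring classes, matching the
    argsort()[-k:] logic of is_correct_top_one / is_correct_top_five above."""
    top_k = np.asarray(scores).argsort()[-k:]  # indices of the k largest scores
    return int(label) in top_k

scores = np.array([0.05, 0.1, 0.6, 0.2, 0.03, 0.02])
print(top_k_correct(scores, label=2, k=1))  # class 2 has the highest score
print(top_k_correct(scores, label=3, k=5))  # class 3 is within the top five
```

The test then averages these booleans over many images and asserts the MXNet and CoreML means agree to within 1e-4.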
diff --git a/tools/coreml/test/test_mxnet_models.py b/tools/coreml/test/test_mxnet_models.py
new file mode 100644
index 0000000..1732fb8
--- /dev/null
+++ b/tools/coreml/test/test_mxnet_models.py
@@ -0,0 +1,155 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import unittest
+import mxnet as mx
+import numpy as np
+import sys
+import os
+current_working_directory = os.getcwd()
+sys.path.append(current_working_directory + "/..")
+sys.path.append(current_working_directory + "/../converter/")
+import _mxnet_converter as mxnet_converter
+from collections import namedtuple
+
+
+def _mxnet_remove_batch(input_data):
+    for blob in input_data:
+        input_data[blob] = np.reshape(input_data[blob], input_data[blob].shape[1:])
+    return input_data
+
+
+def _kl_divergence(distribution1, distribution2):
+    """ Calculates a length-normalized Kullback-Leibler divergence between two
+        distributions: the standard KL sum divided by the number of elements.
+
+    Parameters
+    ----------
+    distribution1: numpy array of floats
+    distribution2: numpy array of floats
+    """
+    assert len(distribution1) == len(distribution2)
+    n = len(distribution1)
+    result = 1./n * sum(distribution1 * (np.log(distribution1) - np.log(distribution2)))
+    return result
+
+
+class ModelsTest(unittest.TestCase):
+    """
+    Unit tests that exercise the converter on entire MXNet models.
+    Each test converts an MXNet model into a CoreML model using the converter,
+    generates predictions with both, and verifies that the predictions match
+    (exactly or within a small tolerance).
+    """
+    def _load_model(self, model_name, epoch_num, input_shape):
+        sym, arg_params, aux_params = mx.model.load_checkpoint(model_name, epoch_num)
+        mod = mx.mod.Module(
+            symbol=sym,
+            context=mx.cpu(),
+            label_names=None
+        )
+        mod.bind(
+            for_training=False,
+            data_shapes=[('data', input_shape)],
+            label_shapes=mod._label_shapes
+        )
+        mod.set_params(
+            arg_params=arg_params,
+            aux_params=aux_params,
+            allow_missing=True
+        )
+        return mod
+
+    def _test_model(self, model_name, epoch_num, input_shape=(1, 3, 224, 224), files=None):
+        """ Tests that the converted CoreML model's predictions match the MXNet predictions for a given model.
+
+        Parameters
+        ----------
+        model_name: str
+            Prefix of the MXNet model name as stored on the local directory.
+
+        epoch_num : int
+            Epoch number of model we would like to load.
+
+        input_shape: tuple
+            The shape of the input data in the form of (batch_size, channels, height, width)
+
+        files: list of strings
+            List of URLs pertaining to files that need to be downloaded in order to use the model.
+        """
+
+        if files is not None:
+            print("Downloading files from urls: %s" % (files))
+            for url in files:
+                mx.test_utils.download(url)
+                print("Downloaded %s" % (url))
+
+        module = self._load_model(
+            model_name=model_name,
+            epoch_num=epoch_num,
+            input_shape=input_shape
+        )
+
+        coreml_model = mxnet_converter.convert(module, input_shape={'data': input_shape})
+
+        # Get predictions from MXNet and coreml
+        div = []  # For storing KL divergence for each input.
+        for _ in xrange(1):
+            np.random.seed(1993)
+            input_data = {'data': np.random.uniform(0, 1, input_shape).astype(np.float32)}
+            Batch = namedtuple('Batch', ['data'])
+            module.forward(Batch([mx.nd.array(input_data['data'])]), is_train=False)
+            mxnet_pred = module.get_outputs()[0].asnumpy().flatten()
+            coreml_pred = coreml_model.predict(_mxnet_remove_batch(input_data)).values()[0].flatten()
+            self.assertEqual(len(mxnet_pred), len(coreml_pred))
+            div.append(_kl_divergence(mxnet_pred, coreml_pred))
+
+        print "Average KL divergence is %s" % np.mean(div)
+        self.assertTrue(np.mean(div) < 1e-4)
+
+    def test_pred_inception_bn(self):
+        self._test_model(model_name='Inception-BN', epoch_num=126,
+                         files=["http://data.mxnet.io/models/imagenet/inception-bn/Inception-BN-0126.params",
+                                "http://data.mxnet.io/models/imagenet/inception-bn/Inception-BN-symbol.json"])
+
+    def test_pred_squeezenet_v11(self):
+        self._test_model(model_name='squeezenet_v1.1', epoch_num=0,
+                         files=["http://data.mxnet.io/models/imagenet/squeezenet/squeezenet_v1.1-symbol.json",
+                                "http://data.mxnet.io/models/imagenet/squeezenet/squeezenet_v1.1-0000.params"])
+
+    def test_pred_resnet_50(self):
+        self._test_model(model_name='resnet-50', epoch_num=0,
+                         files=["http://data.mxnet.io/models/imagenet/resnet/50-layers/resnet-50-symbol.json",
+                                "http://data.mxnet.io/models/imagenet/resnet/50-layers/resnet-50-0000.params"])
+
+    def test_pred_vgg16(self):
+        self._test_model(model_name='vgg16', epoch_num=0,
+                         files=["http://data.mxnet.io/models/imagenet/vgg/vgg16-symbol.json",
+                                "http://data.mxnet.io/models/imagenet/vgg/vgg16-0000.params"])
+
+    def test_pred_nin(self):
+        self._test_model(model_name='nin', epoch_num=0,
+                         files=["http://data.dmlc.ml/models/imagenet/nin/nin-symbol.json",
+                                "http://data.dmlc.ml/models/imagenet/nin/nin-0000.params"])
+
+    @unittest.skip("You need to download and unzip file: "
+                   "http://data.mxnet.io/models/imagenet/inception-v3.tar.gz in order to run this test.")
+    def test_pred_inception_v3(self):
+        self._test_model(model_name='Inception-7', epoch_num=1, input_shape=(1, 3, 299, 299))
+
+
+if __name__ == '__main__':
+    suite = unittest.TestLoader().loadTestsFromTestCase(ModelsTest)
+    unittest.TextTestRunner(verbosity=2).run(suite)
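The `_kl_divergence` helper above is the entire comparison metric for these model-level tests. A standalone NumPy sketch of the same length-normalized quantity (identical distributions give 0; the value grows as the distributions diverge; strictly positive entries assumed):

```python
import numpy as np

def kl_divergence_normalized(p, q):
    """Length-normalized KL divergence, as used by the model tests:
    (1/n) * sum(p * (log(p) - log(q))). Assumes strictly positive entries."""
    p = np.asarray(p, dtype=np.float64)
    q = np.asarray(q, dtype=np.float64)
    assert p.shape == q.shape
    return float(np.sum(p * (np.log(p) - np.log(q))) / p.size)

p = np.array([0.25, 0.25, 0.5])
print(kl_divergence_normalized(p, p))                     # identical inputs: 0.0
print(kl_divergence_normalized(p, [0.3, 0.3, 0.4]) > 0)   # diverged inputs: positive
```

Dividing by n makes the 1e-4 acceptance threshold roughly independent of the number of output classes, which is why the same bound works for both small and large classifier heads.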
diff --git a/tools/coreml/test_mxnet_converer.py b/tools/coreml/test_mxnet_converer.py
deleted file mode 100644
index 179d04a..0000000
--- a/tools/coreml/test_mxnet_converer.py
+++ /dev/null
@@ -1,477 +0,0 @@
-# Licensed to the Apache Software Foundation (ASF) under one
-# or more contributor license agreements.  See the NOTICE file
-# distributed with this work for additional information
-# regarding copyright ownership.  The ASF licenses this file
-# to you under the Apache License, Version 2.0 (the
-# "License"); you may not use this file except in compliance
-# with the License.  You may obtain a copy of the License at
-#
-#   http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing,
-# software distributed under the License is distributed on an
-# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-# KIND, either express or implied.  See the License for the
-# specific language governing permissions and limitations
-# under the License.
-
-import unittest
-import mxnet as mx
-import numpy as np
-import tempfile
-import os
-import mxnet_converter
-import coremltools
-
-def _mxnet_remove_batch(input_data):
-    for blob in input_data:
-        input_data[blob] = np.reshape(input_data[blob], input_data[blob].shape[1:])
-    return input_data
-
-def _get_coreml_model(net, engine, model_path, input_shape,
-            input_names = ['data'], output_names = ['output']):
-    model = mx.model.FeedForward(net, engine, arg_params = engine.arg_dict)
-    spec = mxnet_converter.convert(model, **input_shape)
-    return coremltools.models.MLModel(spec)
-
-def set_weights(net, engine, mode = 'random'):
-    for arg in net.list_arguments():
-        if mode == 'random':
-            engine.arg_dict[arg][:] = np.random.uniform(-0.1, 0.1, engine.arg_dict[arg].shape)
-        elif mode == 'zeros':
-            engine.arg_dict[arg][:] = np.zeros(engine.arg_dict[arg].shape)
-        elif mode == 'ones':
-            engine.arg_dict[arg][:] = np.ones(engine.arg_dict[arg].shape)
-    return net
-
-class MXNetSingleLayerTest(unittest.TestCase):
-    """
-    Unit test class for testing mxnet converter.
-    """
-    def _test_mxnet_model(self, net, engine, delta = 1e-3, **input_shape):
-
-        # Generate some dummy data
-        input_data = {}
-        for ip in input_shape:
-            input_data[ip] = engine.arg_dict[ip].asnumpy()
-        output_blob = net.list_outputs()[0]
-
-        # Make predictions from mxnet (only works on single output for now)
-        mxnet_preds = engine.forward()[0].asnumpy().flatten()
-
-        # Get predictions from coreml
-        model_path = os.path.join(tempfile.mkdtemp(), 'mxnet.mlmodel')
-        model = _get_coreml_model(net, engine, model_path, input_shape, input_data.keys())
-        coreml_preds = model.predict(_mxnet_remove_batch(input_data)).values()[0].flatten()
-
-        # Check prediction accuracy
-        self.assertEquals(len(mxnet_preds), len(coreml_preds))
-        for i in range(len(mxnet_preds)):
-            self.assertAlmostEquals(mxnet_preds[i], coreml_preds[i], delta = delta)
-
-    def test_tiny_inner_product_zero_input(self):
-        np.random.seed(1988)
-        input_shape = (1, 10)
-
-        # Define a model
-        net = mx.sym.Variable('data')
-        net = mx.sym.FullyConnected(data = net, name = 'fc1', num_hidden = 5)
-        engine = net.simple_bind(ctx=mx.cpu(), data=input_shape)
-
-        # Set some random weights
-        set_weights(net, engine, mode = 'zeros')
-
-        # Test the mxnet model
-        self._test_mxnet_model(net, engine, data = input_shape)
-
-    def test_really_tiny_inner_product_ones_input(self):
-        np.random.seed(1988)
-        input_shape = (1, 1)
-
-        # Define a model
-        net = mx.sym.Variable('data')
-        net = mx.sym.FullyConnected(data = net, name = 'fc1', num_hidden = 1)
-        engine = net.simple_bind(ctx=mx.cpu(), data=input_shape)
-
-        # Set some random weights
-        set_weights(net, engine, mode = 'ones')
-
-        # Test the mxnet model
-        self._test_mxnet_model(net, engine, data = input_shape)
-
-    def test_really_tiny_2_inner_product_ones_input(self):
-        np.random.seed(1988)
-        input_shape = (1, 1)
-
-        # Define a model
-        net = mx.sym.Variable('data')
-        net = mx.sym.FullyConnected(data = net, name = 'fc1', num_hidden = 5)
-        engine = net.simple_bind(ctx=mx.cpu(), data=input_shape)
-
-        # Set some random weights
-        set_weights(net, engine, mode = 'ones')
-
-        # Test the mxnet model
-        self._test_mxnet_model(net, engine, data = input_shape)
-
-    def test_tiny_inner_product_ones_input(self):
-        np.random.seed(1988)
-        input_shape = (1, 10)
-
-        # Define a model
-        net = mx.sym.Variable('data')
-        net = mx.sym.FullyConnected(data = net, name = 'fc1', num_hidden = 5)
-        engine = net.simple_bind(ctx=mx.cpu(), data=input_shape)
-
-        # Set some random weights
-        set_weights(net, engine, mode = 'ones')
-
-        # Test the mxnet model
-        self._test_mxnet_model(net, engine, data = input_shape)
-
-    def test_tiny_inner_product_random_input(self):
-        np.random.seed(1988)
-        input_shape = (1, 10)
-
-        # Define a model
-        net = mx.sym.Variable('data')
-        net = mx.sym.FullyConnected(data = net, name = 'fc1', num_hidden = 5)
-        engine = net.simple_bind(ctx=mx.cpu(), data=input_shape)
-
-        # Set some random weights
-        set_weights(net, engine, mode = 'random')
-
-        # Test the mxnet model
-        self._test_mxnet_model(net, engine, data = input_shape)
-
-    def test_tiny_softmax_random_input(self):
-        np.random.seed(1988)
-        input_shape = (1, 10)
-
-        # Define a model
-        net = mx.sym.Variable('data')
-        net = mx.sym.FullyConnected(data = net, name = 'fc1', num_hidden = 5)
-        net = mx.sym.SoftmaxOutput(net, name = 'softmax')
-        engine = net.simple_bind(ctx = mx.cpu(), data = input_shape)
-
-        # Set some random weights
-        set_weights(net, engine, mode = 'random')
-
-        # Test the mxnet model
-        self._test_mxnet_model(net, engine, data = input_shape)
-
-    def test_tiny_relu_activation_random_input(self):
-        np.random.seed(1988)
-        input_shape = (1, 10)
-
-        # Define a model
-        net = mx.sym.Variable('data')
-        net = mx.sym.FullyConnected(data = net, name = 'fc1', num_hidden = 5)
-        net = mx.sym.Activation(net, name = 'relu1', act_type = "relu")
-        engine = net.simple_bind(ctx = mx.cpu(), data = input_shape)
-
-        # Set some random weights
-        set_weights(net, engine, mode = 'random')
-
-        # Test the mxnet model
-        self._test_mxnet_model(net, engine, data = input_shape)
-
-    def test_tiny_sigmoid_activation_random_input(self):
-        np.random.seed(1988)
-        input_shape = (1, 10)
-
-        # Define a model
-        net = mx.sym.Variable('data')
-        net = mx.sym.FullyConnected(data = net, name = 'fc1', num_hidden = 5)
-        net = mx.sym.Activation(net, name = 'sigmoid1', act_type = "sigmoid")
-        engine = net.simple_bind(ctx = mx.cpu(), data = input_shape)
-
-        # Set some random weights
-        set_weights(net, engine, mode = 'random')
-
-        # Test the mxnet model
-        self._test_mxnet_model(net, engine, data = input_shape)
-
-    def test_tiny_tanh_activation_random_input(self):
-        np.random.seed(1988)
-        input_shape = (1, 10)
-
-        # Define a model
-        net = mx.sym.Variable('data')
-        net = mx.sym.FullyConnected(data = net, name = 'fc1', num_hidden = 5)
-        net = mx.sym.Activation(net, name = 'tanh1', act_type = "tanh")
-        engine = net.simple_bind(ctx = mx.cpu(), data = input_shape)
-
-        # Set some random weights
-        set_weights(net, engine, mode = 'random')
-
-        # Test the mxnet model
-        self._test_mxnet_model(net, engine, data = input_shape)
-
-    def test_really_tiny_conv_random_input(self):
-        np.random.seed(1988)
-        input_shape = (1, 1, 10, 10)
-        num_filter = 1
-        kernel = (1 ,1)
-        stride = (1, 1)
-        pad = (0, 0)
-
-        # Define a model
-        net = mx.sym.Variable('data')
-        net = mx.symbol.Convolution(data = net, num_filter = num_filter, kernel=kernel,
-                stride = stride, pad = pad, name = 'conv_1')
-        engine = net.simple_bind(ctx = mx.cpu(), data = input_shape)
-
-        # Set some random weights
-        set_weights(net, engine, mode = 'random')
-
-        # Test the mxnet model
-        self._test_mxnet_model(net, engine, data = input_shape)
-
-    def test_tiny_conv_ones_input(self):
-        np.random.seed(1988)
-        input_shape = (1, 1, 10, 10)
-        num_filter = 1
-        kernel = (5 ,5)
-        stride = (1, 1)
-        pad = (0, 0)
-
-        # Define a model
-        net = mx.sym.Variable('data')
-        net = mx.symbol.Convolution(data = net, num_filter = num_filter, kernel=kernel,
-                stride = stride, pad = pad, name = 'conv_1')
-        engine = net.simple_bind(ctx = mx.cpu(), data = input_shape)
-
-        # Set some random weights
-        set_weights(net, engine, mode = 'ones')
-
-        # Test the mxnet model
-        self._test_mxnet_model(net, engine, data = input_shape)
-
-    def test_tiny_conv_random_input(self):
-        np.random.seed(1988)
-        input_shape = (1, 1, 10, 10)
-        num_filter = 1
-        kernel = (5 ,5)
-        stride = (1, 1)
-        pad = (0, 0)
-
-        # define a model
-        net = mx.sym.Variable('data')
-        net = mx.symbol.Convolution(data = net, num_filter = num_filter, kernel=kernel,
-                stride = stride, pad = pad, name = 'conv_1')
-        engine = net.simple_bind(ctx = mx.cpu(), data = input_shape)
-
-        # set some random weights
-        set_weights(net, engine, mode = 'random')
-
-        # test the mxnet model
-        self._test_mxnet_model(net, engine, data = input_shape)
-
-    def test_tiny_asym_conv_random_input(self):
-        np.random.seed(1988)
-        input_shape = (1, 1, 10, 10)
-        num_filter = 1
-        kernel = (5 ,3)
-        stride = (1, 1)
-        pad = (0, 0)
-
-        # define a model
-        net = mx.sym.Variable('data')
-        net = mx.symbol.Convolution(data = net, num_filter = num_filter, kernel=kernel,
-                stride = stride, pad = pad, name = 'conv_1')
-        engine = net.simple_bind(ctx = mx.cpu(), data = input_shape)
-
-        # set some random weights
-        set_weights(net, engine, mode = 'random')
-
-        # test the mxnet model
-        self._test_mxnet_model(net, engine, data = input_shape)
-
-    def test_tiny_asym_conv_random_asym_input(self):
-        np.random.seed(1988)
-        input_shape = (1, 1, 28, 18)
-        num_filter = 16
-        kernel = (5 ,3)
-        stride = (1, 1)
-        pad = (0, 0)
-        dilate = (1, 1)
-
-        # define a model
-        net = mx.sym.Variable('data')
-        net = mx.symbol.Convolution(data = net, num_filter = num_filter, kernel=kernel,
-                stride = stride, pad = pad, name = 'conv_1', dilate = dilate)
-        net = mx.sym.Activation(net, name = 'tanh', act_type = "tanh")
-        engine = net.simple_bind(ctx = mx.cpu(), data = input_shape)
-
-        # set some random weights
-        set_weights(net, engine, mode = 'random')
-
-        # test the mxnet model
-        self._test_mxnet_model(net, engine, data = input_shape)
-
-    def test_tiny_conv_pooling_random_input(self):
-        np.random.seed(1988)
-        input_shape = (1, 1, 10, 10)
-        num_filter = 1
-        kernel = (5 ,5)
-        stride = (1, 1)
-        pad = (0, 0)
-
-        # define a model
-        net = mx.sym.Variable('data')
-        net = mx.symbol.Convolution(data = net, num_filter = num_filter, kernel=kernel,
-                stride = stride, pad = pad, name = 'conv_1')
-        net = mx.symbol.Pooling(data = net, kernel=kernel,
-                stride = stride, pad = pad, name = 'pool_1', pool_type = 'max')
-        engine = net.simple_bind(ctx = mx.cpu(), data = input_shape)
-
-        # set some random weights
-        set_weights(net, engine, mode = 'random')
-
-        # test the mxnet model
-        self._test_mxnet_model(net, engine, data = input_shape)
-
-    def test_really_tiny_conv_random_3d_input(self):
-        np.random.seed(1988)
-        input_shape = (1, 3, 10, 10)
-        num_filter = 1
-        kernel = (1 ,1)
-        stride = (1, 1)
-        pad = (0, 0)
-
-        # define a model
-        net = mx.sym.Variable('data')
-        net = mx.symbol.Convolution(data = net, num_filter = num_filter, kernel=kernel,
-                stride = stride, pad = pad, name = 'conv_1')
-        engine = net.simple_bind(ctx = mx.cpu(), data = input_shape)
-
-        # set some random weights
-        set_weights(net, engine, mode = 'random')
-
-        # test the mxnet model
-        self._test_mxnet_model(net, engine, data = input_shape)
-
-    def test_really_tiny_conv_random_input_multi_filter(self):
-        np.random.seed(1988)
-        input_shape = (1, 1, 10, 10)
-        num_filter = 64
-        kernel = (1 ,1)
-        stride = (1, 1)
-        pad = (0, 0)
-
-        # define a model
-        net = mx.sym.Variable('data')
-        net = mx.symbol.Convolution(data = net, num_filter = num_filter, kernel=kernel,
-                stride = stride, pad = pad, name = 'conv_1')
-        engine = net.simple_bind(ctx = mx.cpu(), data = input_shape)
-
-        # set some random weights
-        set_weights(net, engine, mode = 'random')
-
-        # test the mxnet model
-        self._test_mxnet_model(net, engine, data = input_shape)
-
-    def test_tiny_conv_random_3d_input(self):
-        np.random.seed(1988)
-        input_shape = (1, 3, 10, 10)
-        num_filter = 1
-        kernel = (5 ,5)
-        stride = (1, 1)
-        pad = (0, 0)
-
-        # define a model
-        net = mx.sym.Variable('data')
-        net = mx.symbol.Convolution(data = net, num_filter = num_filter, kernel=kernel,
-                stride = stride, pad = pad, name = 'conv_1')
-        engine = net.simple_bind(ctx = mx.cpu(), data = input_shape)
-
-        # set some random weights
-        set_weights(net, engine, mode = 'random')
-
-        # test the mxnet model
-        self._test_mxnet_model(net, engine, data = input_shape)
-
-    def test_tiny_conv_random_input_multi_filter(self):
-        np.random.seed(1988)
-        input_shape = (1, 1, 10, 10)
-        num_filter = 64
-        kernel = (5 ,5)
-        stride = (1, 1)
-        pad = (0, 0)
-
-        # define a model
-        net = mx.sym.Variable('data')
-        net = mx.symbol.Convolution(data = net, num_filter = num_filter, kernel=kernel,
-                stride = stride, pad = pad, name = 'conv_1')
-        engine = net.simple_bind(ctx = mx.cpu(), data = input_shape)
-
-        # set some random weights
-        set_weights(net, engine, mode = 'random')
-
-        # test the mxnet model
-        self._test_mxnet_model(net, engine, data = input_shape)
-
-    def test_conv_random(self):
-        np.random.seed(1988)
-        input_shape = (1, 3, 10, 10)
-        num_filter = 64
-        kernel = (5 ,5)
-        stride = (1, 1)
-        pad = (0, 0)
-
-        # define a model
-        net = mx.sym.Variable('data')
-        net = mx.symbol.Convolution(data = net, num_filter = num_filter, kernel=kernel,
-                stride = stride, pad = pad, name = 'conv_1')
-        engine = net.simple_bind(ctx = mx.cpu(), data = input_shape)
-
-        # set some random weights
-        set_weights(net, engine, mode = 'random')
-
-        # test the mxnet model
-        self._test_mxnet_model(net, engine, data = input_shape)
-
-    def test_flatten(self):
-        np.random.seed(1988)
-        input_shape = (1, 3, 10, 10)
-        num_filter = 64
-        kernel = (5 ,5)
-        stride = (1, 1)
-        pad = (0, 0)
-
-        # define a model
-        net = mx.sym.Variable('data')
-        net = mx.symbol.Convolution(data = net, num_filter = num_filter, kernel=kernel,
-                stride = stride, pad = pad, name = 'conv_1')
-        net = mx.sym.Flatten(data = net, name = 'flatten1')
-        net = mx.sym.FullyConnected(data = net, name = 'fc1', num_hidden = 5)
-        net = mx.sym.SoftmaxOutput(net, name = 'softmax')
-        engine = net.simple_bind(ctx = mx.cpu(), data = input_shape)
-
-        # set some random weights
-        set_weights(net, engine, mode = 'random')
-
-        # test the mxnet model
-        self._test_mxnet_model(net, engine, data = input_shape)
-
-    def test_transpose(self):
-        np.random.seed(1988)
-        input_shape = (1, 3, 10, 10)
-        num_filter = 64
-        kernel = (5 ,5)
-        stride = (1, 1)
-        pad = (0, 0)
-
-        # define a model
-        net = mx.sym.Variable('data')
-        net = mx.sym.transpose(data = net, name = 'transpose', axes = (0, 1, 2, 3))
-        net = mx.symbol.Convolution(data = net, num_filter = num_filter, kernel=kernel,
-                stride = stride, pad = pad, name = 'conv_1')
-        engine = net.simple_bind(ctx = mx.cpu(), data = input_shape)
-
-        # set some random weights
-        set_weights(net, engine, mode = 'random')
-
-        # test the mxnet model
-        self._test_mxnet_model(net, engine, data = input_shape)
diff --git a/tools/coreml/utils.py b/tools/coreml/utils.py
new file mode 100644
index 0000000..1e4ff7a
--- /dev/null
+++ b/tools/coreml/utils.py
@@ -0,0 +1,77 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import mxnet as mx
+
+
+def load_model(model_name, epoch_num, data_shapes, label_shapes, label_names, gpus=''):
+    """Loads an MXNet model checkpoint and returns a Module bound for inference.
+
+    Parameters
+    ----------
+    model_name: str
+        Prefix of the MXNet model files as stored in the local directory.
+
+    epoch_num: int
+        Epoch number of the saved checkpoint to load.
+
+    data_shapes: list of (str, tuple)
+        List of pairs of input variable name and its shape. Typically is ``data_iter.provide_data``.
+
+    label_shapes: list of (str, tuple)
+        Typically is ``data_iter.provide_label``.
+
+    label_names: list of str
+        Name of the output labels in the MXNet symbolic graph.
+
+    gpus: str
+        Comma-separated string of GPU ids on which inference is executed, e.g. ``3,5,6`` refers to GPUs 3, 5 and 6.
+        If empty, the CPU is used.
+
+    Returns
+    -------
+    MXNet module
+    """
+    sym, arg_params, aux_params = mx.model.load_checkpoint(model_name, epoch_num)
+    if gpus == '':
+        devices = mx.cpu()
+    else:
+        devices = [mx.gpu(int(i)) for i in gpus.split(',')]
+    mod = mx.mod.Module(
+        symbol=sym,
+        context=devices,
+        label_names=label_names
+    )
+    mod.bind(
+        for_training=False,
+        data_shapes=data_shapes,
+        label_shapes=label_shapes
+    )
+    mod.set_params(
+        arg_params=arg_params,
+        aux_params=aux_params,
+        allow_missing=True
+    )
+    return mod
+
+
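The ``gpus`` argument of ``load_model`` above maps an empty string to a CPU context and a comma-separated id list to one GPU context per id. A minimal standalone sketch of that parsing logic (hypothetical helper name, no MXNet dependency; with MXNet the entries would be ``mx.cpu()`` / ``mx.gpu(i)`` objects rather than strings):

```python
def parse_gpu_spec(gpus):
    """Mirror the context selection in load_model: '' selects the CPU,
    otherwise each comma-separated id selects one GPU context."""
    if gpus == '':
        return ['cpu']
    # e.g. '3,5,6' -> ['gpu(3)', 'gpu(5)', 'gpu(6)']
    return ['gpu(%d)' % int(i) for i in gpus.split(',')]

print(parse_gpu_spec(''))       # ['cpu']
print(parse_gpu_spec('3,5,6'))  # ['gpu(3)', 'gpu(5)', 'gpu(6)']
```

Note that a malformed id (e.g. ``'3,,5'``) would raise ``ValueError`` from ``int()``, just as in the original code.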
