You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@mxnet.apache.org by lx...@apache.org on 2017/07/07 15:58:16 UTC
[03/50] [abbrv] incubator-mxnet-test git commit: Better ssd (#6827)

Better ssd (#6827)

* adjust according to # ctx

increase default cpu threads

add ms coco dataset

fix det_aug_default when label is invalid

fix bracelets

revert grad_scale and lr

change lr-steps and end-epoch

rescale by num_ctx only

fix mxnet path

remove hard-coded class names

add coco utils for ground-truth

fix os.path

add new inception v3 ssd symbol

rename

fix rename

add resnet 50 ssd symbol

* rewrite symbol composing methods

fix name

add symbol to path

modify lr-steps and end-epoch

better names

rename conv_act_layer names

update demo,deploy, evalute

remove handcrafted symbols

add doc

fix vgg16 fc->relu

modify inceptionv3-ssd

relax data_shape constraints

fix resnet101

parameter tuning

add back legacy models

change default lr

fix typo

* add model link, update readme, wrap up

fix path

* fix lost commit train_net.py


Project: http://git-wip-us.apache.org/repos/asf/incubator-mxnet-test/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-mxnet-test/commit/cc62aded
Tree: http://git-wip-us.apache.org/repos/asf/incubator-mxnet-test/tree/cc62aded
Diff: http://git-wip-us.apache.org/repos/asf/incubator-mxnet-test/diff/cc62aded

Branch: refs/heads/master
Commit: cc62aded9ce8bd078458553dc6bcb5ee16472f51
Parents: 984aaa1
Author: Joshua Z. Zhang <ch...@gmail.com>
Authored: Mon Jun 26 20:40:00 2017 -0700
Committer: Eric Junyuan Xie <pi...@users.noreply.github.com>
Committed: Mon Jun 26 20:40:00 2017 -0700

----------------------------------------------------------------------
 example/ssd/README.md                       |  35 +-
 example/ssd/config/config.py                |   3 +-
 example/ssd/dataset/imdb.py                 |  32 +-
 example/ssd/dataset/mscoco.py               | 115 ++++++
 example/ssd/dataset/names/mscoco.names      |  80 +++++
 example/ssd/dataset/names/pascal_voc.names  |  20 ++
 example/ssd/dataset/pascal_voc.py           |  10 +-
 example/ssd/dataset/pycocotools/README.md   |   2 +
 example/ssd/dataset/pycocotools/__init__.py |   1 +
 example/ssd/dataset/pycocotools/coco.py     | 435 +++++++++++++++++++++++
 example/ssd/demo.py                         |  66 ++--
 example/ssd/deploy.py                       |  25 +-
 example/ssd/evaluate.py                     |  23 +-
 example/ssd/evaluate/evaluate_net.py        |   6 +-
 example/ssd/symbol/README.md                |  49 +++
 example/ssd/symbol/common.py                | 100 +++++-
 example/ssd/symbol/inceptionv3.py           | 168 +++++++++
 example/ssd/symbol/legacy_vgg16_ssd_300.py  | 191 ++++++++++
 example/ssd/symbol/legacy_vgg16_ssd_512.py  | 194 ++++++++++
 example/ssd/symbol/resnet.py                | 169 +++++++++
 example/ssd/symbol/symbol_builder.py        | 166 +++++++++
 example/ssd/symbol/symbol_factory.py        | 122 +++++++
 example/ssd/symbol/symbol_vgg16_ssd_300.py  | 189 ----------
 example/ssd/symbol/symbol_vgg16_ssd_512.py  | 194 ----------
 example/ssd/symbol/vgg16_reduced.py         |  86 +++++
 example/ssd/tools/prepare_coco.sh           |   4 +
 example/ssd/tools/prepare_dataset.py        |  30 ++
 example/ssd/tools/visualize_net.py          |  17 +-
 example/ssd/train.py                        |  14 +-
 example/ssd/train/train_net.py              | Bin 9900 -> 9651 bytes
 src/io/image_det_aug_default.cc             |   6 +-
 31 files changed, 2084 insertions(+), 468 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-mxnet-test/blob/cc62aded/example/ssd/README.md
----------------------------------------------------------------------
diff --git a/example/ssd/README.md b/example/ssd/README.md
index 8703a7c..5759fca 100644
--- a/example/ssd/README.md
+++ b/example/ssd/README.md
@@ -17,6 +17,8 @@ remarkable traits of MXNet.
 Due to the permission issue, this example is maintained in this [repository](https://github.com/zhreshold/mxnet-ssd) separately. You can use the link regarding specific per example [issues](https://github.com/zhreshold/mxnet-ssd/issues).
 
 ### What's new
+* Added multiple trained models.
+* Added a much simpler way to compose network from mainstream classification networks (resnet, inception...) and [Guide](symbol/README.md).
 * Update to the latest version according to caffe version, with 5% mAP increase.
 * Use C++ record iterator based on back-end multi-thread engine to achieve huge speed up on multi-gpu environments.
 * Monitor validation mAP during training.
@@ -30,11 +32,12 @@ Due to the permission issue, this example is maintained in this [repository](htt
 ![demo3](https://cloud.githubusercontent.com/assets/3307514/19171086/a9346842-8be0-11e6-8011-c17716b22ad3.png)
 
 ### mAP
-|        Model          | Training data    | Test data |  mAP |
-|:-----------------:|:----------------:|:---------:|:----:|
-| [VGG16_reduced 300x300](https://github.com/zhreshold/mxnet-ssd/releases/download/v0.5-beta/vgg16_ssd_300_voc0712_trainval.zip) | VOC07+12 trainval| VOC07 test| 77.8|
-| [VGG16_reduced 512x512](https://github.com/zhreshold/mxnet-ssd/releases/download/v0.5-beta/vgg16_ssd_512_voc0712_trainval.zip) | VOC07+12 trainval | VOC07 test| 79.9|
-*More to be added*
+|        Model          | Training data    | Test data |  mAP | Note |
+|:-----------------:|:----------------:|:---------:|:----:|:-----|
+| [VGG16_reduced 300x300](https://github.com/zhreshold/mxnet-ssd/releases/download/v0.5-beta/vgg16_ssd_300_voc0712_trainval.zip) | VOC07+12 trainval| VOC07 test| 77.8| fast |
+| [VGG16_reduced 512x512](https://github.com/zhreshold/mxnet-ssd/releases/download/v0.5-beta/vgg16_ssd_512_voc0712_trainval.zip) | VOC07+12 trainval | VOC07 test| 79.9| slow |
+| [Inception-v3 512x512](https://github.com/zhreshold/mxnet-ssd/releases/download/v0.6/inceptionv3_ssd_512_voc0712_trainval.zip) | VOC07+12 trainval| VOC07 test| 78.9 | fastest |
+| [Resnet-50 512x512](https://github.com/zhreshold/mxnet-ssd/releases/download/v0.6/resnet50_ssd_512_voc0712_trainval.zip) | VOC07+12 trainval| VOC07 test| 78.9 | fast |
 
 ### Speed
 |         Model         |   GPU            | CUDNN | Batch-size | FPS* |
@@ -65,13 +68,14 @@ Remember to enable CUDA if you want to be able to train, since CPU training is
 insanely slow. Using CUDNN is optional, but highly recommended.
 
 ### Try the demo
-* Download the pretrained model: [`ssd_300_voc_0712.zip`](https://github.com/zhreshold/mxnet-ssd/releases/download/v0.5-beta/vgg16_ssd_300_voc0712_trainval.zip), and extract to `model/` directory.
+* Download the pretrained model: [`ssd_resnet50_0712.zip`](https://github.com/zhreshold/mxnet-ssd/releases/download/v0.6/resnet50_ssd_512_voc0712_trainval.zip), and extract to `model/` directory.
 * Run
 ```
-# cd /path/to/mxnet/example/ssd
-python demo.py
+# cd /path/to/mxnet-ssd
+python demo.py --gpu 0
 # play with examples:
 python demo.py --epoch 0 --images ./data/demo/dog.jpg --thresh 0.5
+python demo.py --cpu --network resnet50 --data-shape 512
 # wait for library to load for the first time
 ```
 * Check `python demo.py --help` for more options.
@@ -93,7 +97,7 @@ tar -xvf VOCtrainval_11-May-2012.tar
 tar -xvf VOCtrainval_06-Nov-2007.tar
 tar -xvf VOCtest_06-Nov-2007.tar
 ```
-* We are goint to use `trainval` set in VOC2007/2012 as a common strategy.
+* We are going to use `trainval` set in VOC2007/2012 as a common strategy.
 The suggested directory structure is to store `VOC2007` and `VOC2012` directories
 in the same `VOCdevkit` folder.
 * Then link `VOCdevkit` folder to `data/VOCdevkit` by default:
@@ -114,12 +118,12 @@ python tools/prepare_dataset.py --dataset pascal --year 2007 --set test --target
 # cd /path/to/mxnet/example/ssd
 python train.py
 ```
-* By default, this example will use `batch-size=32` and `learning_rate=0.004`.
+* By default, this example will use `batch-size=32` and `learning_rate=0.002`.
 You might need to change the parameters a bit if you have different configurations.
 Check `python train.py --help` for more training options. For example, if you have 4 GPUs, use:
 ```
 # note that a perfect training parameter set is yet to be discovered for multi-GPUs
-python train.py --gpus 0,1,2,3 --batch-size 128 --lr 0.001
+python train.py --gpus 0,1,2,3 --batch-size 32
 ```
 
 ### Evalute trained model
@@ -148,3 +152,12 @@ python convert_model.py deploy.prototxt name_of_pretrained_caffe_model.caffemode
 python demo.py --prefix ssd_converted --epoch 1 --deploy
 ```
 There is no guarantee that conversion will always work, but at least it's good for now.
+
+### Legacy models
+Since the new interface for composing network is introduced, the old models have inconsistent names for weights.
+You can still load the previous model by rename the symbol to `legacy_xxx.py`
+and call with `python train/demo.py --network legacy_xxx `
+For example:
+```
+python demo.py --network 'legacy_vgg16_ssd_300.py' --prefix model/ssd_300 --epoch 0
+```

http://git-wip-us.apache.org/repos/asf/incubator-mxnet-test/blob/cc62aded/example/ssd/config/config.py
----------------------------------------------------------------------
diff --git a/example/ssd/config/config.py b/example/ssd/config/config.py
index 931ad16..278b770 100644
--- a/example/ssd/config/config.py
+++ b/example/ssd/config/config.py
@@ -53,7 +53,7 @@ cfg.train.inter_method = 10  # random interpolation
 cfg.train.rand_mirror_prob = 0.5
 cfg.train.shuffle = True
 cfg.train.seed = 233
-cfg.train.preprocess_threads = 6
+cfg.train.preprocess_threads = 48
 cfg.train = config_as_dict(cfg.train)  # convert to normal dict
 
 # validation
@@ -64,4 +64,5 @@ cfg.valid.color_jitter = ColorJitter()
 cfg.valid.rand_mirror_prob = 0
 cfg.valid.shuffle = False
 cfg.valid.seed = 0
+cfg.valid.preprocess_threads = 32
 cfg.valid = config_as_dict(cfg.valid)  # convert to normal dict

http://git-wip-us.apache.org/repos/asf/incubator-mxnet-test/blob/cc62aded/example/ssd/dataset/imdb.py
----------------------------------------------------------------------
diff --git a/example/ssd/dataset/imdb.py b/example/ssd/dataset/imdb.py
index 95b082d..279fe9c 100644
--- a/example/ssd/dataset/imdb.py
+++ b/example/ssd/dataset/imdb.py
@@ -14,7 +14,7 @@ class Imdb(object):
         self.name = name
         self.classes = []
         self.num_classes = 0
-        self.image_set_index = []
+        self.image_set_index = None
         self.num_images = 0
         self.labels = None
         self.padding = 0
@@ -59,9 +59,22 @@ class Imdb(object):
         fname : str
             saved filename
         """
+        def progress_bar(count, total, suffix=''):
+            import sys
+            bar_len = 24
+            filled_len = int(round(bar_len * count / float(total)))
+
+            percents = round(100.0 * count / float(total), 1)
+            bar = '=' * filled_len + '-' * (bar_len - filled_len)
+            sys.stdout.write('[%s] %s%s ...%s\r' % (bar, percents, '%', suffix))
+            sys.stdout.flush()
+
         str_list = []
         for index in range(self.num_images):
+            progress_bar(index, self.num_images)
             label = self.label_from_index(index)
+            if label.size < 1:
+                continue
             path = self.image_path_from_index(index)
             if root:
                 path = osp.relpath(path, root)
@@ -78,3 +91,20 @@ class Imdb(object):
                     f.write(line)
         else:
             raise RuntimeError("No image in imdb")
+
+    def _load_class_names(self, filename, dirname):
+        """
+        load class names from text file
+
+        Parameters:
+        ----------
+        filename: str
+            file stores class names
+        dirname: str
+            file directory
+        """
+        full_path = osp.join(dirname, filename)
+        classes = []
+        with open(full_path, 'r') as f:
+            classes = [l.strip() for l in f.readlines()]
+        return classes

http://git-wip-us.apache.org/repos/asf/incubator-mxnet-test/blob/cc62aded/example/ssd/dataset/mscoco.py
----------------------------------------------------------------------
diff --git a/example/ssd/dataset/mscoco.py b/example/ssd/dataset/mscoco.py
new file mode 100644
index 0000000..b46b227
--- /dev/null
+++ b/example/ssd/dataset/mscoco.py
@@ -0,0 +1,115 @@
+import os
+import numpy as np
+from imdb import Imdb
+from pycocotools.coco import COCO
+
+
+class Coco(Imdb):
+    """
+    Implementation of Imdb for MSCOCO dataset: https://http://mscoco.org
+
+    Parameters:
+    ----------
+    anno_file : str
+        annotation file for coco, a json file
+    image_dir : str
+        image directory for coco images
+    shuffle : bool
+        whether initially shuffle image list
+
+    """
+    def __init__(self, anno_file, image_dir, shuffle=True, names='mscoco.names'):
+        assert os.path.isfile(anno_file), "Invalid annotation file: " + anno_file
+        basename = os.path.splitext(os.path.basename(anno_file))[0]
+        super(Coco, self).__init__('coco_' + basename)
+        self.image_dir = image_dir
+
+        self.classes = self._load_class_names(names,
+            os.path.join(os.path.dirname(__file__), 'names'))
+
+        self.num_classes = len(self.classes)
+        self._load_all(anno_file, shuffle)
+        self.num_images = len(self.image_set_index)
+
+
+    def image_path_from_index(self, index):
+        """
+        given image index, find out full path
+
+        Parameters:
+        ----------
+        index: int
+            index of a specific image
+        Returns:
+        ----------
+        full path of this image
+        """
+        assert self.image_set_index is not None, "Dataset not initialized"
+        name = self.image_set_index[index]
+        image_file = os.path.join(self.image_dir, 'images', name)
+        assert os.path.isfile(image_file), 'Path does not exist: {}'.format(image_file)
+        return image_file
+
+    def label_from_index(self, index):
+        """
+        given image index, return preprocessed ground-truth
+
+        Parameters:
+        ----------
+        index: int
+            index of a specific image
+        Returns:
+        ----------
+        ground-truths of this image
+        """
+        assert self.labels is not None, "Labels not processed"
+        return self.labels[index]
+
+    def _load_all(self, anno_file, shuffle):
+        """
+        initialize all entries given annotation json file
+
+        Parameters:
+        ----------
+        anno_file: str
+            annotation json file
+        shuffle: bool
+            whether to shuffle image list
+        """
+        image_set_index = []
+        labels = []
+        coco = COCO(anno_file)
+        img_ids = coco.getImgIds()
+        for img_id in img_ids:
+            # filename
+            image_info = coco.loadImgs(img_id)[0]
+            filename = image_info["file_name"]
+            subdir = filename.split('_')[1]
+            height = image_info["height"]
+            width = image_info["width"]
+            # label
+            anno_ids = coco.getAnnIds(imgIds=img_id)
+            annos = coco.loadAnns(anno_ids)
+            label = []
+            for anno in annos:
+                cat_id = int(anno["category_id"])
+                bbox = anno["bbox"]
+                assert len(bbox) == 4
+                xmin = float(bbox[0]) / width
+                ymin = float(bbox[1]) / height
+                xmax = xmin + float(bbox[2]) / width
+                ymax = ymin + float(bbox[3]) / height
+                label.append([cat_id, xmin, ymin, xmax, ymax, 0])
+            if label:
+                labels.append(np.array(label))
+                image_set_index.append(os.path.join(subdir, filename))
+
+        if shuffle:
+            import random
+            indices = range(len(image_set_index))
+            random.shuffle(indices)
+            image_set_index = [image_set_index[i] for i in indices]
+            labels = [labels[i] for i in indices]
+        # store the results
+        self.image_set_index = image_set_index
+        self.labels = labels

http://git-wip-us.apache.org/repos/asf/incubator-mxnet-test/blob/cc62aded/example/ssd/dataset/names/mscoco.names
----------------------------------------------------------------------
diff --git a/example/ssd/dataset/names/mscoco.names b/example/ssd/dataset/names/mscoco.names
new file mode 100644
index 0000000..ca76c80
--- /dev/null
+++ b/example/ssd/dataset/names/mscoco.names
@@ -0,0 +1,80 @@
+person
+bicycle
+car
+motorbike
+aeroplane
+bus
+train
+truck
+boat
+traffic light
+fire hydrant
+stop sign
+parking meter
+bench
+bird
+cat
+dog
+horse
+sheep
+cow
+elephant
+bear
+zebra
+giraffe
+backpack
+umbrella
+handbag
+tie
+suitcase
+frisbee
+skis
+snowboard
+sports ball
+kite
+baseball bat
+baseball glove
+skateboard
+surfboard
+tennis racket
+bottle
+wine glass
+cup
+fork
+knife
+spoon
+bowl
+banana
+apple
+sandwich
+orange
+broccoli
+carrot
+hot dog
+pizza
+donut
+cake
+chair
+sofa
+pottedplant
+bed
+diningtable
+toilet
+tvmonitor
+laptop
+mouse
+remote
+keyboard
+cell phone
+microwave
+oven
+toaster
+sink
+refrigerator
+book
+clock
+vase
+scissors
+teddy bear
+hair drier
+toothbrush

http://git-wip-us.apache.org/repos/asf/incubator-mxnet-test/blob/cc62aded/example/ssd/dataset/names/pascal_voc.names
----------------------------------------------------------------------
diff --git a/example/ssd/dataset/names/pascal_voc.names b/example/ssd/dataset/names/pascal_voc.names
new file mode 100644
index 0000000..8420ab3
--- /dev/null
+++ b/example/ssd/dataset/names/pascal_voc.names
@@ -0,0 +1,20 @@
+aeroplane
+bicycle
+bird
+boat
+bottle
+bus
+car
+cat
+chair
+cow
+diningtable
+dog
+horse
+motorbike
+person
+pottedplant
+sheep
+sofa
+train
+tvmonitor

http://git-wip-us.apache.org/repos/asf/incubator-mxnet-test/blob/cc62aded/example/ssd/dataset/pascal_voc.py
----------------------------------------------------------------------
diff --git a/example/ssd/dataset/pascal_voc.py b/example/ssd/dataset/pascal_voc.py
index 2c61be7..31e287e 100644
--- a/example/ssd/dataset/pascal_voc.py
+++ b/example/ssd/dataset/pascal_voc.py
@@ -24,7 +24,8 @@ class PascalVoc(Imdb):
     is_train : boolean
         if true, will load annotations
     """
-    def __init__(self, image_set, year, devkit_path, shuffle=False, is_train=False):
+    def __init__(self, image_set, year, devkit_path, shuffle=False, is_train=False,
+            names='pascal_voc.names'):
         super(PascalVoc, self).__init__('voc_' + year + '_' + image_set)
         self.image_set = image_set
         self.year = year
@@ -33,11 +34,8 @@ class PascalVoc(Imdb):
         self.extension = '.jpg'
         self.is_train = is_train
 
-        self.classes = ['aeroplane', 'bicycle', 'bird', 'boat',
-                        'bottle', 'bus', 'car', 'cat', 'chair',
-                        'cow', 'diningtable', 'dog', 'horse',
-                        'motorbike', 'person', 'pottedplant',
-                        'sheep', 'sofa', 'train', 'tvmonitor']
+        self.classes = self._load_class_names(names,
+            os.path.join(os.path.dirname(__file__), 'names'))
 
         self.config = {'use_difficult': True,
                        'comp_id': 'comp4',}

http://git-wip-us.apache.org/repos/asf/incubator-mxnet-test/blob/cc62aded/example/ssd/dataset/pycocotools/README.md
----------------------------------------------------------------------
diff --git a/example/ssd/dataset/pycocotools/README.md b/example/ssd/dataset/pycocotools/README.md
new file mode 100755
index 0000000..d358f53
--- /dev/null
+++ b/example/ssd/dataset/pycocotools/README.md
@@ -0,0 +1,2 @@
+This is a modified version of https://github.com/pdollar/coco python API.
+No `make` is required, but this will not support mask functions.

http://git-wip-us.apache.org/repos/asf/incubator-mxnet-test/blob/cc62aded/example/ssd/dataset/pycocotools/__init__.py
----------------------------------------------------------------------
diff --git a/example/ssd/dataset/pycocotools/__init__.py b/example/ssd/dataset/pycocotools/__init__.py
new file mode 100755
index 0000000..3f7d85b
--- /dev/null
+++ b/example/ssd/dataset/pycocotools/__init__.py
@@ -0,0 +1 @@
+__author__ = 'tylin'

http://git-wip-us.apache.org/repos/asf/incubator-mxnet-test/blob/cc62aded/example/ssd/dataset/pycocotools/coco.py
----------------------------------------------------------------------
diff --git a/example/ssd/dataset/pycocotools/coco.py b/example/ssd/dataset/pycocotools/coco.py
new file mode 100755
index 0000000..a8939f6
--- /dev/null
+++ b/example/ssd/dataset/pycocotools/coco.py
@@ -0,0 +1,435 @@
+__author__ = 'tylin'
+__version__ = '2.0'
+# Interface for accessing the Microsoft COCO dataset.
+
+# Microsoft COCO is a large image dataset designed for object detection,
+# segmentation, and caption generation. pycocotools is a Python API that
+# assists in loading, parsing and visualizing the annotations in COCO.
+# Please visit http://mscoco.org/ for more information on COCO, including
+# for the data, paper, and tutorials. The exact format of the annotations
+# is also described on the COCO website. For example usage of the pycocotools
+# please see pycocotools_demo.ipynb. In addition to this API, please download both
+# the COCO images and annotations in order to run the demo.
+
+# An alternative to using the API is to load the annotations directly
+# into Python dictionary
+# Using the API provides additional utility functions. Note that this API
+# supports both *instance* and *caption* annotations. In the case of
+# captions not all functions are defined (e.g. categories are undefined).
+
+# The following API functions are defined:
+#  COCO       - COCO api class that loads COCO annotation file and prepare data structures.
+#  decodeMask - Decode binary mask M encoded via run-length encoding.
+#  encodeMask - Encode binary mask M using run-length encoding.
+#  getAnnIds  - Get ann ids that satisfy given filter conditions.
+#  getCatIds  - Get cat ids that satisfy given filter conditions.
+#  getImgIds  - Get img ids that satisfy given filter conditions.
+#  loadAnns   - Load anns with the specified ids.
+#  loadCats   - Load cats with the specified ids.
+#  loadImgs   - Load imgs with the specified ids.
+#  annToMask  - Convert segmentation in an annotation to binary mask.
+#  showAnns   - Display the specified annotations.
+#  loadRes    - Load algorithm results and create API for accessing them.
+#  download   - Download COCO images from mscoco.org server.
+# Throughout the API "ann"=annotation, "cat"=category, and "img"=image.
+# Help on each functions can be accessed by: "help COCO>function".
+
+# See also COCO>decodeMask,
+# COCO>encodeMask, COCO>getAnnIds, COCO>getCatIds,
+# COCO>getImgIds, COCO>loadAnns, COCO>loadCats,
+# COCO>loadImgs, COCO>annToMask, COCO>showAnns
+
+# Microsoft COCO Toolbox.      version 2.0
+# Data, paper, and tutorials available at:  http://mscoco.org/
+# Code written by Piotr Dollar and Tsung-Yi Lin, 2014.
+# Licensed under the Simplified BSD License [see bsd.txt]
+
+import json
+import time
+import matplotlib.pyplot as plt
+from matplotlib.collections import PatchCollection
+from matplotlib.patches import Polygon
+import numpy as np
+import copy
+import itertools
+# from . import mask as maskUtils
+import os
+from collections import defaultdict
+import sys
+PYTHON_VERSION = sys.version_info[0]
+if PYTHON_VERSION == 2:
+    from urllib import urlretrieve
+elif PYTHON_VERSION == 3:
+    from urllib.request import urlretrieve
+
+class COCO:
+    def __init__(self, annotation_file=None):
+        """
+        Constructor of Microsoft COCO helper class for reading and visualizing annotations.
+        :param annotation_file (str): location of annotation file
+        :param image_folder (str): location to the folder that hosts images.
+        :return:
+        """
+        # load dataset
+        self.dataset,self.anns,self.cats,self.imgs = dict(),dict(),dict(),dict()
+        self.imgToAnns, self.catToImgs = defaultdict(list), defaultdict(list)
+        if not annotation_file == None:
+            print('loading annotations into memory...')
+            tic = time.time()
+            dataset = json.load(open(annotation_file, 'r'))
+            assert type(dataset)==dict, 'annotation file format {} not supported'.format(type(dataset))
+            print('Done (t={:0.2f}s)'.format(time.time()- tic))
+            self.dataset = dataset
+            self.createIndex()
+
+    def createIndex(self):
+        # create index
+        print('creating index...')
+        anns, cats, imgs = {}, {}, {}
+        imgToAnns,catToImgs = defaultdict(list),defaultdict(list)
+        if 'annotations' in self.dataset:
+            for ann in self.dataset['annotations']:
+                imgToAnns[ann['image_id']].append(ann)
+                anns[ann['id']] = ann
+
+        if 'images' in self.dataset:
+            for img in self.dataset['images']:
+                imgs[img['id']] = img
+
+        if 'categories' in self.dataset:
+            for cat in self.dataset['categories']:
+                cats[cat['id']] = cat
+
+        if 'annotations' in self.dataset and 'categories' in self.dataset:
+            for ann in self.dataset['annotations']:
+                catToImgs[ann['category_id']].append(ann['image_id'])
+
+        print('index created!')
+
+        # create class members
+        self.anns = anns
+        self.imgToAnns = imgToAnns
+        self.catToImgs = catToImgs
+        self.imgs = imgs
+        self.cats = cats
+
+    def info(self):
+        """
+        Print information about the annotation file.
+        :return:
+        """
+        for key, value in self.dataset['info'].items():
+            print('{}: {}'.format(key, value))
+
+    def getAnnIds(self, imgIds=[], catIds=[], areaRng=[], iscrowd=None):
+        """
+        Get ann ids that satisfy given filter conditions. default skips that filter
+        :param imgIds  (int array)     : get anns for given imgs
+               catIds  (int array)     : get anns for given cats
+               areaRng (float array)   : get anns for given area range (e.g. [0 inf])
+               iscrowd (boolean)       : get anns for given crowd label (False or True)
+        :return: ids (int array)       : integer array of ann ids
+        """
+        imgIds = imgIds if type(imgIds) == list else [imgIds]
+        catIds = catIds if type(catIds) == list else [catIds]
+
+        if len(imgIds) == len(catIds) == len(areaRng) == 0:
+            anns = self.dataset['annotations']
+        else:
+            if not len(imgIds) == 0:
+                lists = [self.imgToAnns[imgId] for imgId in imgIds if imgId in self.imgToAnns]
+                anns = list(itertools.chain.from_iterable(lists))
+            else:
+                anns = self.dataset['annotations']
+            anns = anns if len(catIds)  == 0 else [ann for ann in anns if ann['category_id'] in catIds]
+            anns = anns if len(areaRng) == 0 else [ann for ann in anns if ann['area'] > areaRng[0] and ann['area'] < areaRng[1]]
+        if not iscrowd == None:
+            ids = [ann['id'] for ann in anns if ann['iscrowd'] == iscrowd]
+        else:
+            ids = [ann['id'] for ann in anns]
+        return ids
+
+    def getCatIds(self, catNms=[], supNms=[], catIds=[]):
+        """
+        filtering parameters. default skips that filter.
+        :param catNms (str array)  : get cats for given cat names
+        :param supNms (str array)  : get cats for given supercategory names
+        :param catIds (int array)  : get cats for given cat ids
+        :return: ids (int array)   : integer array of cat ids
+        """
+        catNms = catNms if type(catNms) == list else [catNms]
+        supNms = supNms if type(supNms) == list else [supNms]
+        catIds = catIds if type(catIds) == list else [catIds]
+
+        if len(catNms) == len(supNms) == len(catIds) == 0:
+            cats = self.dataset['categories']
+        else:
+            cats = self.dataset['categories']
+            cats = cats if len(catNms) == 0 else [cat for cat in cats if cat['name']          in catNms]
+            cats = cats if len(supNms) == 0 else [cat for cat in cats if cat['supercategory'] in supNms]
+            cats = cats if len(catIds) == 0 else [cat for cat in cats if cat['id']            in catIds]
+        ids = [cat['id'] for cat in cats]
+        return ids
+
+    def getImgIds(self, imgIds=[], catIds=[]):
+        '''
+        Get img ids that satisfy given filter conditions.
+        :param imgIds (int array) : get imgs for given ids
+        :param catIds (int array) : get imgs with all given cats
+        :return: ids (int array)  : integer array of img ids
+        '''
+        imgIds = imgIds if type(imgIds) == list else [imgIds]
+        catIds = catIds if type(catIds) == list else [catIds]
+
+        if len(imgIds) == len(catIds) == 0:
+            ids = self.imgs.keys()
+        else:
+            ids = set(imgIds)
+            for i, catId in enumerate(catIds):
+                if i == 0 and len(ids) == 0:
+                    ids = set(self.catToImgs[catId])
+                else:
+                    ids &= set(self.catToImgs[catId])
+        return list(ids)
+
+    def loadAnns(self, ids=[]):
+        """
+        Load anns with the specified ids.
+        :param ids (int array)       : integer ids specifying anns
+        :return: anns (object array) : loaded ann objects
+        """
+        if type(ids) == list:
+            return [self.anns[id] for id in ids]
+        elif type(ids) == int:
+            return [self.anns[ids]]
+
+    def loadCats(self, ids=[]):
+        """
+        Load cats with the specified ids.
+        :param ids (int array)       : integer ids specifying cats
+        :return: cats (object array) : loaded cat objects
+        """
+        if type(ids) == list:
+            return [self.cats[id] for id in ids]
+        elif type(ids) == int:
+            return [self.cats[ids]]
+
+    def loadImgs(self, ids=[]):
+        """
+        Load anns with the specified ids.
+        :param ids (int array)       : integer ids specifying img
+        :return: imgs (object array) : loaded img objects
+        """
+        if type(ids) == list:
+            return [self.imgs[id] for id in ids]
+        elif type(ids) == int:
+            return [self.imgs[ids]]
+
+    def showAnns(self, anns):
+        """
+        Display the specified annotations.
+        :param anns (array of object): annotations to display
+        :return: None
+        """
+        if len(anns) == 0:
+            return 0
+        if 'segmentation' in anns[0] or 'keypoints' in anns[0]:
+            datasetType = 'instances'
+        elif 'caption' in anns[0]:
+            datasetType = 'captions'
+        else:
+            raise Exception('datasetType not supported')
+        if datasetType == 'instances':
+            ax = plt.gca()
+            ax.set_autoscale_on(False)
+            polygons = []
+            color = []
+            for ann in anns:
+                c = (np.random.random((1, 3))*0.6+0.4).tolist()[0]
+                if 'segmentation' in ann:
+                    if type(ann['segmentation']) == list:
+                        # polygon
+                        for seg in ann['segmentation']:
+                            poly = np.array(seg).reshape((int(len(seg)/2), 2))
+                            polygons.append(Polygon(poly))
+                            color.append(c)
+                    else:
+                        # mask
+                        t = self.imgs[ann['image_id']]
+                        if type(ann['segmentation']['counts']) == list:
+                            # rle = maskUtils.frPyObjects([ann['segmentation']], t['height'], t['width'])
+                            raise NotImplementedError("maskUtils disabled!")
+                        else:
+                            rle = [ann['segmentation']]
+                        # m = maskUtils.decode(rle)
+                        raise NotImplementedError("maskUtils disabled!")
+                        img = np.ones( (m.shape[0], m.shape[1], 3) )
+                        if ann['iscrowd'] == 1:
+                            color_mask = np.array([2.0,166.0,101.0])/255
+                        if ann['iscrowd'] == 0:
+                            color_mask = np.random.random((1, 3)).tolist()[0]
+                        for i in range(3):
+                            img[:,:,i] = color_mask[i]
+                        ax.imshow(np.dstack( (img, m*0.5) ))
+                if 'keypoints' in ann and type(ann['keypoints']) == list:
+                    # turn skeleton into zero-based index
+                    sks = np.array(self.loadCats(ann['category_id'])[0]['skeleton'])-1
+                    kp = np.array(ann['keypoints'])
+                    x = kp[0::3]
+                    y = kp[1::3]
+                    v = kp[2::3]
+                    for sk in sks:
+                        if np.all(v[sk]>0):
+                            plt.plot(x[sk],y[sk], linewidth=3, color=c)
+                    plt.plot(x[v>0], y[v>0],'o',markersize=8, markerfacecolor=c, markeredgecolor='k',markeredgewidth=2)
+                    plt.plot(x[v>1], y[v>1],'o',markersize=8, markerfacecolor=c, markeredgecolor=c, markeredgewidth=2)
+            p = PatchCollection(polygons, facecolor=color, linewidths=0, alpha=0.4)
+            ax.add_collection(p)
+            p = PatchCollection(polygons, facecolor='none', edgecolors=color, linewidths=2)
+            ax.add_collection(p)
+        elif datasetType == 'captions':
+            for ann in anns:
+                print(ann['caption'])
+
+    def loadRes(self, resFile):
+        """
+        Load result file and return a result api object.
+        :param   resFile (str)     : file name of result file
+        :return: res (obj)         : result api object
+        """
+        res = COCO()
+        res.dataset['images'] = [img for img in self.dataset['images']]
+
+        print('Loading and preparing results...')
+        tic = time.time()
+        if type(resFile) == str or type(resFile) == unicode:
+            anns = json.load(open(resFile))
+        elif type(resFile) == np.ndarray:
+            anns = self.loadNumpyAnnotations(resFile)
+        else:
+            anns = resFile
+        assert type(anns) == list, 'results in not an array of objects'
+        annsImgIds = [ann['image_id'] for ann in anns]
+        assert set(annsImgIds) == (set(annsImgIds) & set(self.getImgIds())), \
+               'Results do not correspond to current coco set'
+        if 'caption' in anns[0]:
+            imgIds = set([img['id'] for img in res.dataset['images']]) & set([ann['image_id'] for ann in anns])
+            res.dataset['images'] = [img for img in res.dataset['images'] if img['id'] in imgIds]
+            for id, ann in enumerate(anns):
+                ann['id'] = id+1
+        elif 'bbox' in anns[0] and not anns[0]['bbox'] == []:
+            res.dataset['categories'] = copy.deepcopy(self.dataset['categories'])
+            for id, ann in enumerate(anns):
+                bb = ann['bbox']
+                x1, x2, y1, y2 = [bb[0], bb[0]+bb[2], bb[1], bb[1]+bb[3]]
+                if not 'segmentation' in ann:
+                    ann['segmentation'] = [[x1, y1, x1, y2, x2, y2, x2, y1]]
+                ann['area'] = bb[2]*bb[3]
+                ann['id'] = id+1
+                ann['iscrowd'] = 0
+        elif 'segmentation' in anns[0]:
+            res.dataset['categories'] = copy.deepcopy(self.dataset['categories'])
+            for id, ann in enumerate(anns):
+                # now only support compressed RLE format as segmentation results
+                # ann['area'] = maskUtils.area(ann['segmentation'])
+                raise NotImplementedError("maskUtils disabled!")
+                if not 'bbox' in ann:
+                    # ann['bbox'] = maskUtils.toBbox(ann['segmentation'])
+                    raise NotImplementedError("maskUtils disabled!")
+                ann['id'] = id+1
+                ann['iscrowd'] = 0
+        elif 'keypoints' in anns[0]:
+            res.dataset['categories'] = copy.deepcopy(self.dataset['categories'])
+            for id, ann in enumerate(anns):
+                s = ann['keypoints']
+                x = s[0::3]
+                y = s[1::3]
+                x0,x1,y0,y1 = np.min(x), np.max(x), np.min(y), np.max(y)
+                ann['area'] = (x1-x0)*(y1-y0)
+                ann['id'] = id + 1
+                ann['bbox'] = [x0,y0,x1-x0,y1-y0]
+        print('DONE (t={:0.2f}s)'.format(time.time()- tic))
+
+        res.dataset['annotations'] = anns
+        res.createIndex()
+        return res
+
+    def download(self, tarDir = None, imgIds = [] ):
+        '''
+        Download COCO images from mscoco.org server.
+        :param tarDir (str): COCO results directory name
+               imgIds (list): images to be downloaded
+        :return:
+        '''
+        if tarDir is None:
+            print('Please specify target directory')
+            return -1
+        if len(imgIds) == 0:
+            imgs = self.imgs.values()
+        else:
+            imgs = self.loadImgs(imgIds)
+        N = len(imgs)
+        if not os.path.exists(tarDir):
+            os.makedirs(tarDir)
+        for i, img in enumerate(imgs):
+            tic = time.time()
+            fname = os.path.join(tarDir, img['file_name'])
+            if not os.path.exists(fname):
+                urlretrieve(img['coco_url'], fname)
+            print('downloaded {}/{} images (t={:0.1f}s)'.format(i, N, time.time()- tic))
+
+    def loadNumpyAnnotations(self, data):
+        """
+        Convert result data from a numpy array [Nx7] where each row contains {imageID,x1,y1,w,h,score,class}
+        :param  data (numpy.ndarray)
+        :return: annotations (python nested list)
+        """
+        print('Converting ndarray to lists...')
+        assert(type(data) == np.ndarray)
+        print(data.shape)
+        assert(data.shape[1] == 7)
+        N = data.shape[0]
+        ann = []
+        for i in range(N):
+            if i % 1000000 == 0:
+                print('{}/{}'.format(i,N))
+            ann += [{
+                'image_id'  : int(data[i, 0]),
+                'bbox'  : [ data[i, 1], data[i, 2], data[i, 3], data[i, 4] ],
+                'score' : data[i, 5],
+                'category_id': int(data[i, 6]),
+                }]
+        return ann
+
+    def annToRLE(self, ann):
+        """
+        Convert annotation which can be polygons, uncompressed RLE to RLE.
+        :return: binary mask (numpy 2D array)
+        """
+        t = self.imgs[ann['image_id']]
+        h, w = t['height'], t['width']
+        segm = ann['segmentation']
+        if type(segm) == list:
+            # polygon -- a single object might consist of multiple parts
+            # we merge all parts into one mask rle code
+            # rles = maskUtils.frPyObjects(segm, h, w)
+            # rle = maskUtils.merge(rles)
+            raise NotImplementedError("maskUtils disabled!")
+        elif type(segm['counts']) == list:
+            # uncompressed RLE
+            # rle = maskUtils.frPyObjects(segm, h, w)
+            raise NotImplementedError("maskUtils disabled!")
+        else:
+            # rle
+            rle = ann['segmentation']
+        return rle
+
+    def annToMask(self, ann):
+        """
+        Convert annotation which can be polygons, uncompressed RLE, or RLE to binary mask.
+        :return: binary mask (numpy 2D array)
+        """
+        rle = self.annToRLE(ann)
+        # m = maskUtils.decode(rle)
+        raise NotImplementedError("maskUtils disabled!")
+        return m

http://git-wip-us.apache.org/repos/asf/incubator-mxnet-test/blob/cc62aded/example/ssd/demo.py
----------------------------------------------------------------------
diff --git a/example/ssd/demo.py b/example/ssd/demo.py
index ededbdb..bda4606 100644
--- a/example/ssd/demo.py
+++ b/example/ssd/demo.py
@@ -2,18 +2,12 @@ import argparse
 import tools.find_mxnet
 import mxnet as mx
 import os
-import importlib
 import sys
 from detect.detector import Detector
+from symbol.symbol_factory import get_symbol
 
-CLASSES = ('aeroplane', 'bicycle', 'bird', 'boat',
-           'bottle', 'bus', 'car', 'cat', 'chair',
-           'cow', 'diningtable', 'dog', 'horse',
-           'motorbike', 'person', 'pottedplant',
-           'sheep', 'sofa', 'train', 'tvmonitor')
-
-def get_detector(net, prefix, epoch, data_shape, mean_pixels, ctx,
-                 nms_thresh=0.5, force_nms=True):
+def get_detector(net, prefix, epoch, data_shape, mean_pixels, ctx, num_class,
+                 nms_thresh=0.5, force_nms=True, nms_topk=400):
     """
     wrapper for initialize a detector
 
@@ -31,23 +25,25 @@ def get_detector(net, prefix, epoch, data_shape, mean_pixels, ctx,
         mean pixel values (R, G, B)
     ctx : mx.ctx
         running context, mx.cpu() or mx.gpu(?)
+    num_class : int
+        number of classes
+    nms_thresh : float
+        non-maximum suppression threshold
     force_nms : bool
         force suppress different categories
     """
-    sys.path.append(os.path.join(os.getcwd(), 'symbol'))
     if net is not None:
-        net = importlib.import_module("symbol_" + net) \
-            .get_symbol(len(CLASSES), nms_thresh, force_nms)
-    detector = Detector(net, prefix + "_" + str(data_shape), epoch, \
-        data_shape, mean_pixels, ctx=ctx)
+        net = get_symbol(net, data_shape, num_classes=num_class, nms_thresh=nms_thresh,
+            force_nms=force_nms, nms_topk=nms_topk)
+    detector = Detector(net, prefix, epoch, data_shape, mean_pixels, ctx=ctx)
     return detector
 
 def parse_args():
     parser = argparse.ArgumentParser(description='Single-shot detection network demo')
-    parser.add_argument('--network', dest='network', type=str, default='vgg16_ssd_300',
-                        choices=['vgg16_ssd_300', 'vgg16_ssd_512'], help='which network to use')
+    parser.add_argument('--network', dest='network', type=str, default='resnet50',
+                        help='which network to use')
     parser.add_argument('--images', dest='images', type=str, default='./data/demo/dog.jpg',
-                        help='run demo with images, use comma(without space) to seperate multiple images')
+                        help='run demo with images, use comma to seperate multiple images')
     parser.add_argument('--dir', dest='dir', nargs='?',
                         help='demo image directory, optional', type=str)
     parser.add_argument('--ext', dest='extension', help='image extension, optional',
@@ -55,12 +51,13 @@ def parse_args():
     parser.add_argument('--epoch', dest='epoch', help='epoch of trained model',
                         default=0, type=int)
     parser.add_argument('--prefix', dest='prefix', help='trained model prefix',
-                        default=os.path.join(os.getcwd(), 'model', 'ssd'), type=str)
+                        default=os.path.join(os.getcwd(), 'model', 'ssd_'),
+                        type=str)
     parser.add_argument('--cpu', dest='cpu', help='(override GPU) use CPU to detect',
                         action='store_true', default=False)
     parser.add_argument('--gpu', dest='gpu_id', type=int, default=0,
                         help='GPU device id to detect with')
-    parser.add_argument('--data-shape', dest='data_shape', type=int, default=300,
+    parser.add_argument('--data-shape', dest='data_shape', type=int, default=512,
                         help='set image shape')
     parser.add_argument('--mean-r', dest='mean_r', type=float, default=123,
                         help='red mean value')
@@ -78,9 +75,29 @@ def parse_args():
                         help='show detection time')
     parser.add_argument('--deploy', dest='deploy_net', action='store_true', default=False,
                         help='Load network from json file, rather than from symbol')
+    parser.add_argument('--class-names', dest='class_names', type=str,
+                        default='aeroplane, bicycle, bird, boat, bottle, bus, \
+                        car, cat, chair, cow, diningtable, dog, horse, motorbike, \
+                        person, pottedplant, sheep, sofa, train, tvmonitor',
+                        help='string of comma separated names, or text filename')
     args = parser.parse_args()
     return args
 
+def parse_class_names(class_names):
+    """ parse # classes and class_names if applicable """
+    if len(class_names) > 0:
+        if os.path.isfile(class_names):
+            # try to open it to read class names
+            with open(class_names, 'r') as f:
+                class_names = [l.strip() for l in f.readlines()]
+        else:
+            class_names = [c.strip() for c in class_names.split(',')]
+        for name in class_names:
+            assert len(name) > 0
+    else:
+        raise RuntimeError("No valid class_name provided...")
+    return class_names
+
 if __name__ == '__main__':
     args = parse_args()
     if args.cpu:
@@ -93,10 +110,15 @@ if __name__ == '__main__':
     assert len(image_list) > 0, "No valid image specified to detect"
 
     network = None if args.deploy_net else args.network
-    detector = get_detector(network, args.prefix, args.epoch,
+    class_names = parse_class_names(args.class_names)
+    if args.prefix.endswith('_'):
+        prefix = args.prefix + args.network + '_' + str(args.data_shape)
+    else:
+        prefix = args.prefix
+    detector = get_detector(network, prefix, args.epoch,
                             args.data_shape,
                             (args.mean_r, args.mean_g, args.mean_b),
-                            ctx, args.nms_thresh, args.force_nms)
+                            ctx, len(class_names), args.nms_thresh, args.force_nms)
     # run detection
     detector.detect_and_visualize(image_list, args.dir, args.extension,
-                                  CLASSES, args.thresh, args.show_timer)
+                                  class_names, args.thresh, args.show_timer)

http://git-wip-us.apache.org/repos/asf/incubator-mxnet-test/blob/cc62aded/example/ssd/deploy.py
----------------------------------------------------------------------
diff --git a/example/ssd/deploy.py b/example/ssd/deploy.py
index 264314a..aa70cac 100644
--- a/example/ssd/deploy.py
+++ b/example/ssd/deploy.py
@@ -5,32 +5,41 @@ import mxnet as mx
 import os
 import importlib
 import sys
+from symbol.symbol_factory import get_symbol
 
 def parse_args():
     parser = argparse.ArgumentParser(description='Convert a trained model to deploy model')
-    parser.add_argument('--network', dest='network', type=str, default='vgg16_ssd_300',
-                        choices=['vgg16_ssd_300', 'vgg16_ssd_512'], help='which network to use')
+    parser.add_argument('--network', dest='network', type=str, default='vgg16_reduced',
+                        help='which network to use')
     parser.add_argument('--epoch', dest='epoch', help='epoch of trained model',
                         default=0, type=int)
     parser.add_argument('--prefix', dest='prefix', help='trained model prefix',
-                        default=os.path.join(os.getcwd(), 'model', 'ssd_300'), type=str)
+                        default=os.path.join(os.getcwd(), 'model', 'ssd_'), type=str)
+    parser.add_argument('--data-shape', dest='data_shape', type=int, default=300,
+                        help='data shape')
     parser.add_argument('--num-class', dest='num_classes', help='number of classes',
                         default=20, type=int)
     parser.add_argument('--nms', dest='nms_thresh', type=float, default=0.5,
                         help='non-maximum suppression threshold, default 0.5')
     parser.add_argument('--force', dest='force_nms', type=bool, default=True,
                         help='force non-maximum suppression on different class')
+    parser.add_argument('--topk', dest='nms_topk', type=int, default=400,
+                        help='apply nms only to top k detections based on scores.')
     args = parser.parse_args()
     return args
 
 if __name__ == '__main__':
     args = parse_args()
-    sys.path.append(os.path.join(os.getcwd(), 'symbol'))
-    net = importlib.import_module("symbol_" + args.network) \
-        .get_symbol(args.num_classes, args.nms_thresh, args.force_nms)
-    _, arg_params, aux_params = mx.model.load_checkpoint(args.prefix, args.epoch)
+    net = get_symbol(args.network, args.data_shape,
+        num_classes=args.num_classes, nms_thresh=args.nms_thresh,
+        force_suppress=args.force_nms, nms_topk=args.nms_topk)
+    if args.prefix.endswith('_'):
+        prefix = args.prefix + args.network + '_' + str(args.data_shape)
+    else:
+        prefix = args.prefix
+    _, arg_params, aux_params = mx.model.load_checkpoint(prefix, args.epoch)
     # new name
-    tmp = args.prefix.rsplit('/', 1)
+    tmp = prefix.rsplit('/', 1)
     save_prefix = '/deploy_'.join(tmp)
     mx.model.save_checkpoint(save_prefix, args.epoch, net, arg_params, aux_params)
     print("Saved model: {}-{:04d}.param".format(save_prefix, args.epoch))

http://git-wip-us.apache.org/repos/asf/incubator-mxnet-test/blob/cc62aded/example/ssd/evaluate.py
----------------------------------------------------------------------
diff --git a/example/ssd/evaluate.py b/example/ssd/evaluate.py
index a38a7f6..65e0b30 100644
--- a/example/ssd/evaluate.py
+++ b/example/ssd/evaluate.py
@@ -5,30 +5,27 @@ import os
 import sys
 from evaluate.evaluate_net import evaluate_net
 
-CLASSES = ('aeroplane', 'bicycle', 'bird', 'boat',
-           'bottle', 'bus', 'car', 'cat', 'chair',
-           'cow', 'diningtable', 'dog', 'horse',
-           'motorbike', 'person', 'pottedplant',
-           'sheep', 'sofa', 'train', 'tvmonitor')
-
 def parse_args():
     parser = argparse.ArgumentParser(description='Evaluate a network')
     parser.add_argument('--rec-path', dest='rec_path', help='which record file to use',
                         default=os.path.join(os.getcwd(), 'data', 'val.rec'), type=str)
     parser.add_argument('--list-path', dest='list_path', help='which list file to use',
                         default="", type=str)
-    parser.add_argument('--network', dest='network', type=str, default='vgg16_ssd_300',
-                        choices=['vgg16_ssd_300', 'vgg16_ssd_512'], help='which network to use')
+    parser.add_argument('--network', dest='network', type=str, default='vgg16_reduced',
+                        help='which network to use')
     parser.add_argument('--batch-size', dest='batch_size', type=int, default=32,
                         help='evaluation batch size')
     parser.add_argument('--num-class', dest='num_class', type=int, default=20,
                         help='number of classes')
-    parser.add_argument('--class-names', dest='class_names', type=str, default=",".join(CLASSES),
+    parser.add_argument('--class-names', dest='class_names', type=str,
+                        default='aeroplane, bicycle, bird, boat, bottle, bus, \
+                        car, cat, chair, cow, diningtable, dog, horse, motorbike, \
+                        person, pottedplant, sheep, sofa, train, tvmonitor',
                         help='string of comma separated names, or text filename')
     parser.add_argument('--epoch', dest='epoch', help='epoch of pretrained model',
                         default=0, type=int)
     parser.add_argument('--prefix', dest='prefix', help='load model prefix',
-                        default=os.path.join(os.getcwd(), 'model', 'ssd'), type=str)
+                        default=os.path.join(os.getcwd(), 'model', 'ssd_'), type=str)
     parser.add_argument('--gpus', dest='gpu_id', help='GPU devices to evaluate with',
                         default='0', type=str)
     parser.add_argument('--cpu', dest='cpu', help='use cpu to evaluate, this can be slow',
@@ -79,9 +76,13 @@ if __name__ == '__main__':
         class_names = None
 
     network = None if args.deploy_net else args.network
+    if args.prefix.endswith('_'):
+        prefix = args.prefix + args.network
+    else:
+        prefix = args.prefix
     evaluate_net(network, args.rec_path, num_class,
                  (args.mean_r, args.mean_g, args.mean_b), args.data_shape,
-                 args.prefix, args.epoch, ctx, batch_size=args.batch_size,
+                 prefix, args.epoch, ctx, batch_size=args.batch_size,
                  path_imglist=args.list_path, nms_thresh=args.nms_thresh,
                  force_nms=args.force_nms, ovp_thresh=args.overlap_thresh,
                  use_difficult=args.use_difficult, class_names=class_names,

http://git-wip-us.apache.org/repos/asf/incubator-mxnet-test/blob/cc62aded/example/ssd/evaluate/evaluate_net.py
----------------------------------------------------------------------
diff --git a/example/ssd/evaluate/evaluate_net.py b/example/ssd/evaluate/evaluate_net.py
index 8d86f8e..4c629f8 100644
--- a/example/ssd/evaluate/evaluate_net.py
+++ b/example/ssd/evaluate/evaluate_net.py
@@ -7,6 +7,7 @@ from dataset.iterator import DetRecordIter
 from config.config import cfg
 from evaluate.eval_metric import MApMetric, VOC07MApMetric
 import logging
+from symbol.symbol_factory import get_symbol
 
 def evaluate_net(net, path_imgrec, num_classes, mean_pixels, data_shape,
                  model_prefix, epoch, ctx=mx.cpu(), batch_size=1,
@@ -71,9 +72,8 @@ def evaluate_net(net, path_imgrec, num_classes, mean_pixels, data_shape,
     if net is None:
         net = load_net
     else:
-        sys.path.append(os.path.join(cfg.ROOT_DIR, 'symbol'))
-        net = importlib.import_module("symbol_" + net) \
-            .get_symbol(num_classes, nms_thresh, force_nms)
+        net = get_symbol(net, data_shape[1], num_classes=num_classes,
+            nms_thresh=nms_thresh, force_suppress=force_nms)
     if not 'label' in net.list_arguments():
         label = mx.sym.Variable(name='label')
         net = mx.sym.Group([net, label])

http://git-wip-us.apache.org/repos/asf/incubator-mxnet-test/blob/cc62aded/example/ssd/symbol/README.md
----------------------------------------------------------------------
diff --git a/example/ssd/symbol/README.md b/example/ssd/symbol/README.md
new file mode 100644
index 0000000..8fee319
--- /dev/null
+++ b/example/ssd/symbol/README.md
@@ -0,0 +1,49 @@
+## How to compose SSD network on top of mainstream classification networks
+
+1. Have the base network ready in this directory as `name.py`, such as `inceptionv3.py`.
+2. Add configuration to `symbol_factory.py`, an example would be:
+```
+if network == 'vgg16_reduced':
+    if data_shape >= 448:
+        from_layers = ['relu4_3', 'relu7', '', '', '', '', '']
+        num_filters = [512, -1, 512, 256, 256, 256, 256]
+        strides = [-1, -1, 2, 2, 2, 2, 1]
+        pads = [-1, -1, 1, 1, 1, 1, 1]
+        sizes = [[.07, .1025], [.15,.2121], [.3, .3674], [.45, .5196], [.6, .6708], \
+            [.75, .8216], [.9, .9721]]
+        ratios = [[1,2,.5], [1,2,.5,3,1./3], [1,2,.5,3,1./3], [1,2,.5,3,1./3], \
+            [1,2,.5,3,1./3], [1,2,.5], [1,2,.5]]
+        normalizations = [20, -1, -1, -1, -1, -1, -1]
+        steps = [] if data_shape != 512 else [x / 512.0 for x in
+            [8, 16, 32, 64, 128, 256, 512]]
+    else:
+        from_layers = ['relu4_3', 'relu7', '', '', '', '']
+        num_filters = [512, -1, 512, 256, 256, 256]
+        strides = [-1, -1, 2, 2, 1, 1]
+        pads = [-1, -1, 1, 1, 0, 0]
+        sizes = [[.1, .141], [.2,.272], [.37, .447], [.54, .619], [.71, .79], [.88, .961]]
+        ratios = [[1,2,.5], [1,2,.5,3,1./3], [1,2,.5,3,1./3], [1,2,.5,3,1./3], \
+            [1,2,.5], [1,2,.5]]
+        normalizations = [20, -1, -1, -1, -1, -1]
+        steps = [] if data_shape != 300 else [x / 300.0 for x in [8, 16, 32, 64, 100, 300]]
+    return locals()
+elif network == 'inceptionv3':
+    from_layers = ['ch_concat_mixed_7_chconcat', 'ch_concat_mixed_10_chconcat', '', '', '', '']
+    num_filters = [-1, -1, 512, 256, 256, 128]
+    strides = [-1, -1, 2, 2, 2, 2]
+    pads = [-1, -1, 1, 1, 1, 1]
+    sizes = [[.1, .141], [.2,.272], [.37, .447], [.54, .619], [.71, .79], [.88, .961]]
+    ratios = [[1,2,.5], [1,2,.5,3,1./3], [1,2,.5,3,1./3], [1,2,.5,3,1./3], \
+        [1,2,.5], [1,2,.5]]
+    normalizations = -1
+    steps = []
+    return locals()
+```
+Here `from_layers` indicate the feature layer you would like to extract from the base network.
+`''` indicate that we want add extra new layers on top of the last feature layer,
+and the number of filters must be specified in `num_filters`. Similarly, `strides` and `pads`
+are required to compose these new layers. `sizes` and `ratios` are the parameters controlling
+the anchor generation algorithm. `normalizations` is used to normalize and rescale feature if
+not `-1`. `steps`: optional, used to calculate the anchor sliding steps.
+
+3. Train or test with arguments `--network name --data-shape xxx --pretrained pretrained_model`

http://git-wip-us.apache.org/repos/asf/incubator-mxnet-test/blob/cc62aded/example/ssd/symbol/common.py
----------------------------------------------------------------------
diff --git a/example/ssd/symbol/common.py b/example/ssd/symbol/common.py
index 12ea718..474d3ea 100644
--- a/example/ssd/symbol/common.py
+++ b/example/ssd/symbol/common.py
@@ -29,6 +29,42 @@ def conv_act_layer(from_layer, name, num_filter, kernel=(1,1), pad=(0,0), \
     ----------
     (conv, relu) mx.Symbols
     """
+    conv = mx.symbol.Convolution(data=from_layer, kernel=kernel, pad=pad, \
+        stride=stride, num_filter=num_filter, name="{}_conv".format(name))
+    if use_batchnorm:
+        conv = mx.symbol.BatchNorm(data=conv, name="{}_bn".format(name))
+    relu = mx.symbol.Activation(data=conv, act_type=act_type, \
+        name="{}_{}".format(name, act_type))
+    return relu
+
+def legacy_conv_act_layer(from_layer, name, num_filter, kernel=(1,1), pad=(0,0), \
+    stride=(1,1), act_type="relu", use_batchnorm=False):
+    """
+    wrapper for a small Convolution group
+
+    Parameters:
+    ----------
+    from_layer : mx.symbol
+        continue on which layer
+    name : str
+        base name of the new layers
+    num_filter : int
+        how many filters to use in Convolution layer
+    kernel : tuple (int, int)
+        kernel size (h, w)
+    pad : tuple (int, int)
+        padding size (h, w)
+    stride : tuple (int, int)
+        stride size (h, w)
+    act_type : str
+        activation type, can be relu...
+    use_batchnorm : bool
+        whether to use batch normalization
+
+    Returns:
+    ----------
+    (conv, relu) mx.Symbols
+    """
     assert not use_batchnorm, "batchnorm not yet supported"
     bias = mx.symbol.Variable(name="conv{}_bias".format(name),
         init=mx.init.Constant(0.0), attr={'__lr_mult__': '2.0'})
@@ -40,9 +76,66 @@ def conv_act_layer(from_layer, name, num_filter, kernel=(1,1), pad=(0,0), \
         relu = mx.symbol.BatchNorm(data=relu, name="bn{}".format(name))
     return conv, relu
 
+def multi_layer_feature(body, from_layers, num_filters, strides, pads, min_filter=128):
+    """Wrapper function to extract features from base network, attaching extra
+    layers and SSD specific layers
+
+    Parameters
+    ----------
+    from_layers : list of str
+        feature extraction layers, use '' for add extra layers
+        For example:
+        from_layers = ['relu4_3', 'fc7', '', '', '', '']
+        which means extract feature from relu4_3 and fc7, adding 4 extra layers
+        on top of fc7
+    num_filters : list of int
+        number of filters for extra layers, you can use -1 for extracted features,
+        however, if normalization and scale is applied, the number of filter for
+        that layer must be provided.
+        For example:
+        num_filters = [512, -1, 512, 256, 256, 256]
+    strides : list of int
+        strides for the 3x3 convolution appended, -1 can be used for extracted
+        feature layers
+    pads : list of int
+        paddings for the 3x3 convolution, -1 can be used for extracted layers
+    min_filter : int
+        minimum number of filters used in 1x1 convolution
+
+    Returns
+    -------
+    list of mx.Symbols
+
+    """
+    # arguments check
+    assert len(from_layers) > 0
+    assert isinstance(from_layers[0], str) and len(from_layers[0].strip()) > 0
+    assert len(from_layers) == len(num_filters) == len(strides) == len(pads)
+
+    internals = body.get_internals()
+    layers = []
+    for k, params in enumerate(zip(from_layers, num_filters, strides, pads)):
+        from_layer, num_filter, s, p = params
+        if from_layer.strip():
+            # extract from base network
+            layer = internals[from_layer.strip() + '_output']
+            layers.append(layer)
+        else:
+            # attach from last feature layer
+            assert len(layers) > 0
+            assert num_filter > 0
+            layer = layers[-1]
+            num_1x1 = max(min_filter, num_filter // 2)
+            conv_1x1 = conv_act_layer(layer, 'multi_feat_%d_conv_1x1' % (k),
+                num_1x1, kernel=(1, 1), pad=(0, 0), stride=(1, 1), act_type='relu')
+            conv_3x3 = conv_act_layer(conv_1x1, 'multi_feat_%d_conv_3x3' % (k),
+                num_filter, kernel=(3, 3), pad=(p, p), stride=(s, s), act_type='relu')
+            layers.append(conv_3x3)
+    return layers
+
 def multibox_layer(from_layers, num_classes, sizes=[.2, .95],
                     ratios=[1], normalization=-1, num_channels=[],
-                    clip=True, interm_layer=0, steps=[]):
+                    clip=False, interm_layer=0, steps=[]):
     """
     the basic aggregation module for SSD detection. Takes in multiple layers,
     generate multiple object detection targets by customized layers
@@ -106,7 +199,7 @@ def multibox_layer(from_layers, num_classes, sizes=[.2, .95],
         normalization = [normalization] * len(from_layers)
     assert len(normalization) == len(from_layers)
 
-    assert sum(x > 0 for x in normalization) == len(num_channels), \
+    assert sum(x > 0 for x in normalization) <= len(num_channels), \
         "must provide number of channels for each normalized layer"
 
     if steps:
@@ -125,7 +218,8 @@ def multibox_layer(from_layers, num_classes, sizes=[.2, .95],
                 mode="channel", name="{}_norm".format(from_name))
             scale = mx.symbol.Variable(name="{}_scale".format(from_name),
                 shape=(1, num_channels.pop(0), 1, 1),
-                init=mx.init.Constant(normalization[k]))
+                init=mx.init.Constant(normalization[k]),
+                attr={'__wd_mult__': '0.1'})
             from_layer = mx.symbol.broadcast_mul(lhs=scale, rhs=from_layer)
         if interm_layer > 0:
             from_layer = mx.symbol.Convolution(data=from_layer, kernel=(3,3), \

http://git-wip-us.apache.org/repos/asf/incubator-mxnet-test/blob/cc62aded/example/ssd/symbol/inceptionv3.py
----------------------------------------------------------------------
diff --git a/example/ssd/symbol/inceptionv3.py b/example/ssd/symbol/inceptionv3.py
new file mode 100644
index 0000000..1c38ae6
--- /dev/null
+++ b/example/ssd/symbol/inceptionv3.py
@@ -0,0 +1,168 @@
+"""
+Inception V3, suitable for images with around 299 x 299
+
+Reference:
+
+Szegedy, Christian, et al. "Rethinking the Inception Architecture for Computer Vision." arXiv preprint arXiv:1512.00567 (2015).
+"""
+import mxnet as mx
+
+def Conv(data, num_filter, kernel=(1, 1), stride=(1, 1), pad=(0, 0), name=None, suffix=''):
+    conv = mx.sym.Convolution(data=data, num_filter=num_filter, kernel=kernel, stride=stride, pad=pad, no_bias=True, name='%s%s_conv2d' %(name, suffix))
+    bn = mx.sym.BatchNorm(data=conv, name='%s%s_batchnorm' %(name, suffix), fix_gamma=True)
+    act = mx.sym.Activation(data=bn, act_type='relu', name='%s%s_relu' %(name, suffix))
+    return act
+
+
+def Inception7A(data,
+                num_1x1,
+                num_3x3_red, num_3x3_1, num_3x3_2,
+                num_5x5_red, num_5x5,
+                pool, proj,
+                name):
+    tower_1x1 = Conv(data, num_1x1, name=('%s_conv' % name))
+    tower_5x5 = Conv(data, num_5x5_red, name=('%s_tower' % name), suffix='_conv')
+    tower_5x5 = Conv(tower_5x5, num_5x5, kernel=(5, 5), pad=(2, 2), name=('%s_tower' % name), suffix='_conv_1')
+    tower_3x3 = Conv(data, num_3x3_red, name=('%s_tower_1' % name), suffix='_conv')
+    tower_3x3 = Conv(tower_3x3, num_3x3_1, kernel=(3, 3), pad=(1, 1), name=('%s_tower_1' % name), suffix='_conv_1')
+    tower_3x3 = Conv(tower_3x3, num_3x3_2, kernel=(3, 3), pad=(1, 1), name=('%s_tower_1' % name), suffix='_conv_2')
+    pooling = mx.sym.Pooling(data=data, kernel=(3, 3), stride=(1, 1), pad=(1, 1), pool_type=pool, name=('%s_pool_%s_pool' % (pool, name)))
+    cproj = Conv(pooling, proj, name=('%s_tower_2' %  name), suffix='_conv')
+    concat = mx.sym.Concat(*[tower_1x1, tower_5x5, tower_3x3, cproj], name='ch_concat_%s_chconcat' % name)
+    return concat
+
+# First Downsample
+def Inception7B(data,
+                num_3x3,
+                num_d3x3_red, num_d3x3_1, num_d3x3_2,
+                pool,
+                name):
+    tower_3x3 = Conv(data, num_3x3, kernel=(3, 3), pad=(0, 0), stride=(2, 2), name=('%s_conv' % name))
+    tower_d3x3 = Conv(data, num_d3x3_red, name=('%s_tower' % name), suffix='_conv')
+    tower_d3x3 = Conv(tower_d3x3, num_d3x3_1, kernel=(3, 3), pad=(1, 1), stride=(1, 1), name=('%s_tower' % name), suffix='_conv_1')
+    tower_d3x3 = Conv(tower_d3x3, num_d3x3_2, kernel=(3, 3), pad=(0, 0), stride=(2, 2), name=('%s_tower' % name), suffix='_conv_2')
+    pooling = mx.symbol.Pooling(data=data, kernel=(3, 3), stride=(2, 2), pad=(0,0), pool_type="max", name=('max_pool_%s_pool' % name))
+    concat = mx.sym.Concat(*[tower_3x3, tower_d3x3, pooling], name='ch_concat_%s_chconcat' % name)
+    return concat
+
+def Inception7C(data,
+                num_1x1,
+                num_d7_red, num_d7_1, num_d7_2,
+                num_q7_red, num_q7_1, num_q7_2, num_q7_3, num_q7_4,
+                pool, proj,
+                name):
+    tower_1x1 = Conv(data=data, num_filter=num_1x1, kernel=(1, 1), name=('%s_conv' % name))
+    tower_d7 = Conv(data=data, num_filter=num_d7_red, name=('%s_tower' % name), suffix='_conv')
+    tower_d7 = Conv(data=tower_d7, num_filter=num_d7_1, kernel=(1, 7), pad=(0, 3), name=('%s_tower' % name), suffix='_conv_1')
+    tower_d7 = Conv(data=tower_d7, num_filter=num_d7_2, kernel=(7, 1), pad=(3, 0), name=('%s_tower' % name), suffix='_conv_2')
+    tower_q7 = Conv(data=data, num_filter=num_q7_red, name=('%s_tower_1' % name), suffix='_conv')
+    tower_q7 = Conv(data=tower_q7, num_filter=num_q7_1, kernel=(7, 1), pad=(3, 0), name=('%s_tower_1' % name), suffix='_conv_1')
+    tower_q7 = Conv(data=tower_q7, num_filter=num_q7_2, kernel=(1, 7), pad=(0, 3), name=('%s_tower_1' % name), suffix='_conv_2')
+    tower_q7 = Conv(data=tower_q7, num_filter=num_q7_3, kernel=(7, 1), pad=(3, 0), name=('%s_tower_1' % name), suffix='_conv_3')
+    tower_q7 = Conv(data=tower_q7, num_filter=num_q7_4, kernel=(1, 7), pad=(0, 3), name=('%s_tower_1' % name), suffix='_conv_4')
+    pooling = mx.sym.Pooling(data=data, kernel=(3, 3), stride=(1, 1), pad=(1, 1), pool_type=pool, name=('%s_pool_%s_pool' % (pool, name)))
+    cproj = Conv(data=pooling, num_filter=proj, kernel=(1, 1), name=('%s_tower_2' %  name), suffix='_conv')
+    # concat
+    concat = mx.sym.Concat(*[tower_1x1, tower_d7, tower_q7, cproj], name='ch_concat_%s_chconcat' % name)
+    return concat
+
+def Inception7D(data,
+                num_3x3_red, num_3x3,
+                num_d7_3x3_red, num_d7_1, num_d7_2, num_d7_3x3,
+                pool,
+                name):
+    tower_3x3 = Conv(data=data, num_filter=num_3x3_red, name=('%s_tower' % name), suffix='_conv')
+    tower_3x3 = Conv(data=tower_3x3, num_filter=num_3x3, kernel=(3, 3), pad=(0,0), stride=(2, 2), name=('%s_tower' % name), suffix='_conv_1')
+    tower_d7_3x3 = Conv(data=data, num_filter=num_d7_3x3_red, name=('%s_tower_1' % name), suffix='_conv')
+    tower_d7_3x3 = Conv(data=tower_d7_3x3, num_filter=num_d7_1, kernel=(1, 7), pad=(0, 3), name=('%s_tower_1' % name), suffix='_conv_1')
+    tower_d7_3x3 = Conv(data=tower_d7_3x3, num_filter=num_d7_2, kernel=(7, 1), pad=(3, 0), name=('%s_tower_1' % name), suffix='_conv_2')
+    tower_d7_3x3 = Conv(data=tower_d7_3x3, num_filter=num_d7_3x3, kernel=(3, 3), stride=(2, 2), name=('%s_tower_1' % name), suffix='_conv_3')
+    pooling = mx.sym.Pooling(data=data, kernel=(3, 3), stride=(2, 2), pool_type=pool, name=('%s_pool_%s_pool' % (pool, name)))
+    # concat
+    concat = mx.sym.Concat(*[tower_3x3, tower_d7_3x3, pooling], name='ch_concat_%s_chconcat' % name)
+    return concat
+
+def Inception7E(data,
+                num_1x1,
+                num_d3_red, num_d3_1, num_d3_2,
+                num_3x3_d3_red, num_3x3, num_3x3_d3_1, num_3x3_d3_2,
+                pool, proj,
+                name):
+    tower_1x1 = Conv(data=data, num_filter=num_1x1, kernel=(1, 1), name=('%s_conv' % name))
+    tower_d3 = Conv(data=data, num_filter=num_d3_red, name=('%s_tower' % name), suffix='_conv')
+    tower_d3_a = Conv(data=tower_d3, num_filter=num_d3_1, kernel=(1, 3), pad=(0, 1), name=('%s_tower' % name), suffix='_mixed_conv')
+    tower_d3_b = Conv(data=tower_d3, num_filter=num_d3_2, kernel=(3, 1), pad=(1, 0), name=('%s_tower' % name), suffix='_mixed_conv_1')
+    tower_3x3_d3 = Conv(data=data, num_filter=num_3x3_d3_red, name=('%s_tower_1' % name), suffix='_conv')
+    tower_3x3_d3 = Conv(data=tower_3x3_d3, num_filter=num_3x3, kernel=(3, 3), pad=(1, 1), name=('%s_tower_1' % name), suffix='_conv_1')
+    tower_3x3_d3_a = Conv(data=tower_3x3_d3, num_filter=num_3x3_d3_1, kernel=(1, 3), pad=(0, 1), name=('%s_tower_1' % name), suffix='_mixed_conv')
+    tower_3x3_d3_b = Conv(data=tower_3x3_d3, num_filter=num_3x3_d3_2, kernel=(3, 1), pad=(1, 0), name=('%s_tower_1' % name), suffix='_mixed_conv_1')
+    pooling = mx.sym.Pooling(data=data, kernel=(3, 3), stride=(1, 1), pad=(1, 1), pool_type=pool, name=('%s_pool_%s_pool' % (pool, name)))
+    cproj = Conv(data=pooling, num_filter=proj, kernel=(1, 1), name=('%s_tower_2' %  name), suffix='_conv')
+    # concat
+    concat = mx.sym.Concat(*[tower_1x1, tower_d3_a, tower_d3_b, tower_3x3_d3_a, tower_3x3_d3_b, cproj], name='ch_concat_%s_chconcat' % name)
+    return concat
+
+# In[49]:
+
+def get_symbol(num_classes=1000, **kwargs):
+    data = mx.symbol.Variable(name="data")
+    # stage 1
+    conv = Conv(data, 32, kernel=(3, 3), stride=(2, 2), name="conv")
+    conv_1 = Conv(conv, 32, kernel=(3, 3), name="conv_1")
+    conv_2 = Conv(conv_1, 64, kernel=(3, 3), pad=(1, 1), name="conv_2")
+    pool = mx.sym.Pooling(data=conv_2, kernel=(3, 3), stride=(2, 2), pool_type="max", name="pool")
+    # stage 2
+    conv_3 = Conv(pool, 80, kernel=(1, 1), name="conv_3")
+    conv_4 = Conv(conv_3, 192, kernel=(3, 3), name="conv_4")
+    pool1 = mx.sym.Pooling(data=conv_4, kernel=(3, 3), stride=(2, 2), pool_type="max", name="pool1")
+    # stage 3
+    in3a = Inception7A(pool1, 64,
+                       64, 96, 96,
+                       48, 64,
+                       "avg", 32, "mixed")
+    in3b = Inception7A(in3a, 64,
+                       64, 96, 96,
+                       48, 64,
+                       "avg", 64, "mixed_1")
+    in3c = Inception7A(in3b, 64,
+                       64, 96, 96,
+                       48, 64,
+                       "avg", 64, "mixed_2")
+    in3d = Inception7B(in3c, 384,
+                       64, 96, 96,
+                       "max", "mixed_3")
+    # stage 4
+    in4a = Inception7C(in3d, 192,
+                       128, 128, 192,
+                       128, 128, 128, 128, 192,
+                       "avg", 192, "mixed_4")
+    in4b = Inception7C(in4a, 192,
+                       160, 160, 192,
+                       160, 160, 160, 160, 192,
+                       "avg", 192, "mixed_5")
+    in4c = Inception7C(in4b, 192,
+                       160, 160, 192,
+                       160, 160, 160, 160, 192,
+                       "avg", 192, "mixed_6")
+    in4d = Inception7C(in4c, 192,
+                       192, 192, 192,
+                       192, 192, 192, 192, 192,
+                       "avg", 192, "mixed_7")
+    in4e = Inception7D(in4d, 192, 320,
+                       192, 192, 192, 192,
+                       "max", "mixed_8")
+    # stage 5
+    in5a = Inception7E(in4e, 320,
+                       384, 384, 384,
+                       448, 384, 384, 384,
+                       "avg", 192, "mixed_9")
+    in5b = Inception7E(in5a, 320,
+                       384, 384, 384,
+                       448, 384, 384, 384,
+                       "max", 192, "mixed_10")
+    # pool
+    pool = mx.sym.Pooling(data=in5b, kernel=(8, 8), stride=(1, 1), pool_type="avg", name="global_pool")
+    flatten = mx.sym.Flatten(data=pool, name="flatten")
+    fc1 = mx.symbol.FullyConnected(data=flatten, num_hidden=num_classes, name='fc1')
+    softmax = mx.symbol.SoftmaxOutput(data=fc1, name='softmax')
+    return softmax

http://git-wip-us.apache.org/repos/asf/incubator-mxnet-test/blob/cc62aded/example/ssd/symbol/legacy_vgg16_ssd_300.py
----------------------------------------------------------------------
diff --git a/example/ssd/symbol/legacy_vgg16_ssd_300.py b/example/ssd/symbol/legacy_vgg16_ssd_300.py
new file mode 100644
index 0000000..257fdd6
--- /dev/null
+++ b/example/ssd/symbol/legacy_vgg16_ssd_300.py
@@ -0,0 +1,191 @@
+import mxnet as mx
+from common import legacy_conv_act_layer
+from common import multibox_layer
+
+def get_symbol_train(num_classes=20, nms_thresh=0.5, force_suppress=False,
+                     nms_topk=400, **kwargs):
+    """
+    Single-shot multi-box detection with VGG 16 layers ConvNet
+    This is a modified version, with fc6/fc7 layers replaced by conv layers
+    And the network is slightly smaller than original VGG 16 network
+    This is a training network with losses
+
+    Parameters:
+    ----------
+    num_classes: int
+        number of object classes not including background
+    nms_thresh : float
+        non-maximum suppression threshold
+    force_suppress : boolean
+        whether suppress different class objects
+    nms_topk : int
+        apply NMS to top K detections
+
+    Returns:
+    ----------
+    mx.Symbol
+    """
+    data = mx.symbol.Variable(name="data")
+    label = mx.symbol.Variable(name="label")
+
+    # group 1
+    conv1_1 = mx.symbol.Convolution(
+        data=data, kernel=(3, 3), pad=(1, 1), num_filter=64, name="conv1_1")
+    relu1_1 = mx.symbol.Activation(data=conv1_1, act_type="relu", name="relu1_1")
+    conv1_2 = mx.symbol.Convolution(
+        data=relu1_1, kernel=(3, 3), pad=(1, 1), num_filter=64, name="conv1_2")
+    relu1_2 = mx.symbol.Activation(data=conv1_2, act_type="relu", name="relu1_2")
+    pool1 = mx.symbol.Pooling(
+        data=relu1_2, pool_type="max", kernel=(2, 2), stride=(2, 2), name="pool1")
+    # group 2
+    conv2_1 = mx.symbol.Convolution(
+        data=pool1, kernel=(3, 3), pad=(1, 1), num_filter=128, name="conv2_1")
+    relu2_1 = mx.symbol.Activation(data=conv2_1, act_type="relu", name="relu2_1")
+    conv2_2 = mx.symbol.Convolution(
+        data=relu2_1, kernel=(3, 3), pad=(1, 1), num_filter=128, name="conv2_2")
+    relu2_2 = mx.symbol.Activation(data=conv2_2, act_type="relu", name="relu2_2")
+    pool2 = mx.symbol.Pooling(
+        data=relu2_2, pool_type="max", kernel=(2, 2), stride=(2, 2), name="pool2")
+    # group 3
+    conv3_1 = mx.symbol.Convolution(
+        data=pool2, kernel=(3, 3), pad=(1, 1), num_filter=256, name="conv3_1")
+    relu3_1 = mx.symbol.Activation(data=conv3_1, act_type="relu", name="relu3_1")
+    conv3_2 = mx.symbol.Convolution(
+        data=relu3_1, kernel=(3, 3), pad=(1, 1), num_filter=256, name="conv3_2")
+    relu3_2 = mx.symbol.Activation(data=conv3_2, act_type="relu", name="relu3_2")
+    conv3_3 = mx.symbol.Convolution(
+        data=relu3_2, kernel=(3, 3), pad=(1, 1), num_filter=256, name="conv3_3")
+    relu3_3 = mx.symbol.Activation(data=conv3_3, act_type="relu", name="relu3_3")
+    pool3 = mx.symbol.Pooling(
+        data=relu3_3, pool_type="max", kernel=(2, 2), stride=(2, 2), \
+        pooling_convention="full", name="pool3")
+    # group 4
+    conv4_1 = mx.symbol.Convolution(
+        data=pool3, kernel=(3, 3), pad=(1, 1), num_filter=512, name="conv4_1")
+    relu4_1 = mx.symbol.Activation(data=conv4_1, act_type="relu", name="relu4_1")
+    conv4_2 = mx.symbol.Convolution(
+        data=relu4_1, kernel=(3, 3), pad=(1, 1), num_filter=512, name="conv4_2")
+    relu4_2 = mx.symbol.Activation(data=conv4_2, act_type="relu", name="relu4_2")
+    conv4_3 = mx.symbol.Convolution(
+        data=relu4_2, kernel=(3, 3), pad=(1, 1), num_filter=512, name="conv4_3")
+    relu4_3 = mx.symbol.Activation(data=conv4_3, act_type="relu", name="relu4_3")
+    pool4 = mx.symbol.Pooling(
+        data=relu4_3, pool_type="max", kernel=(2, 2), stride=(2, 2), name="pool4")
+    # group 5
+    conv5_1 = mx.symbol.Convolution(
+        data=pool4, kernel=(3, 3), pad=(1, 1), num_filter=512, name="conv5_1")
+    relu5_1 = mx.symbol.Activation(data=conv5_1, act_type="relu", name="relu5_1")
+    conv5_2 = mx.symbol.Convolution(
+        data=relu5_1, kernel=(3, 3), pad=(1, 1), num_filter=512, name="conv5_2")
+    relu5_2 = mx.symbol.Activation(data=conv5_2, act_type="relu", name="relu5_2")
+    conv5_3 = mx.symbol.Convolution(
+        data=relu5_2, kernel=(3, 3), pad=(1, 1), num_filter=512, name="conv5_3")
+    relu5_3 = mx.symbol.Activation(data=conv5_3, act_type="relu", name="relu5_3")
+    pool5 = mx.symbol.Pooling(
+        data=relu5_3, pool_type="max", kernel=(3, 3), stride=(1, 1),
+        pad=(1,1), name="pool5")
+    # group 6
+    conv6 = mx.symbol.Convolution(
+        data=pool5, kernel=(3, 3), pad=(6, 6), dilate=(6, 6),
+        num_filter=1024, name="conv6")
+    relu6 = mx.symbol.Activation(data=conv6, act_type="relu", name="relu6")
+    # drop6 = mx.symbol.Dropout(data=relu6, p=0.5, name="drop6")
+    # group 7
+    conv7 = mx.symbol.Convolution(
+        data=relu6, kernel=(1, 1), pad=(0, 0), num_filter=1024, name="conv7")
+    relu7 = mx.symbol.Activation(data=conv7, act_type="relu", name="relu7")
+    # drop7 = mx.symbol.Dropout(data=relu7, p=0.5, name="drop7")
+
+    ### ssd extra layers ###
+    conv8_1, relu8_1 = legacy_conv_act_layer(relu7, "8_1", 256, kernel=(1,1), pad=(0,0), \
+        stride=(1,1), act_type="relu", use_batchnorm=False)
+    conv8_2, relu8_2 = legacy_conv_act_layer(relu8_1, "8_2", 512, kernel=(3,3), pad=(1,1), \
+        stride=(2,2), act_type="relu", use_batchnorm=False)
+    conv9_1, relu9_1 = legacy_conv_act_layer(relu8_2, "9_1", 128, kernel=(1,1), pad=(0,0), \
+        stride=(1,1), act_type="relu", use_batchnorm=False)
+    conv9_2, relu9_2 = legacy_conv_act_layer(relu9_1, "9_2", 256, kernel=(3,3), pad=(1,1), \
+        stride=(2,2), act_type="relu", use_batchnorm=False)
+    conv10_1, relu10_1 = legacy_conv_act_layer(relu9_2, "10_1", 128, kernel=(1,1), pad=(0,0), \
+        stride=(1,1), act_type="relu", use_batchnorm=False)
+    conv10_2, relu10_2 = legacy_conv_act_layer(relu10_1, "10_2", 256, kernel=(3,3), pad=(0,0), \
+        stride=(1,1), act_type="relu", use_batchnorm=False)
+    conv11_1, relu11_1 = legacy_conv_act_layer(relu10_2, "11_1", 128, kernel=(1,1), pad=(0,0), \
+        stride=(1,1), act_type="relu", use_batchnorm=False)
+    conv11_2, relu11_2 = legacy_conv_act_layer(relu11_1, "11_2", 256, kernel=(3,3), pad=(0,0), \
+        stride=(1,1), act_type="relu", use_batchnorm=False)
+
+    # specific parameters for VGG16 network
+    from_layers = [relu4_3, relu7, relu8_2, relu9_2, relu10_2, relu11_2]
+    sizes = [[.1, .141], [.2,.272], [.37, .447], [.54, .619], [.71, .79], [.88, .961]]
+    ratios = [[1,2,.5], [1,2,.5,3,1./3], [1,2,.5,3,1./3], [1,2,.5,3,1./3], \
+        [1,2,.5], [1,2,.5]]
+    normalizations = [20, -1, -1, -1, -1, -1]
+    steps = [ x / 300.0 for x in [8, 16, 32, 64, 100, 300]]
+    num_channels = [512]
+
+    loc_preds, cls_preds, anchor_boxes = multibox_layer(from_layers, \
+        num_classes, sizes=sizes, ratios=ratios, normalization=normalizations, \
+        num_channels=num_channels, clip=False, interm_layer=0, steps=steps)
+
+    tmp = mx.contrib.symbol.MultiBoxTarget(
+        *[anchor_boxes, label, cls_preds], overlap_threshold=.5, \
+        ignore_label=-1, negative_mining_ratio=3, minimum_negative_samples=0, \
+        negative_mining_thresh=.5, variances=(0.1, 0.1, 0.2, 0.2),
+        name="multibox_target")
+    loc_target = tmp[0]
+    loc_target_mask = tmp[1]
+    cls_target = tmp[2]
+
+    cls_prob = mx.symbol.SoftmaxOutput(data=cls_preds, label=cls_target, \
+        ignore_label=-1, use_ignore=True, grad_scale=1., multi_output=True, \
+        normalization='valid', name="cls_prob")
+    loc_loss_ = mx.symbol.smooth_l1(name="loc_loss_", \
+        data=loc_target_mask * (loc_preds - loc_target), scalar=1.0)
+    loc_loss = mx.symbol.MakeLoss(loc_loss_, grad_scale=1., \
+        normalization='valid', name="loc_loss")
+
+    # monitoring training status
+    cls_label = mx.symbol.MakeLoss(data=cls_target, grad_scale=0, name="cls_label")
+    det = mx.contrib.symbol.MultiBoxDetection(*[cls_prob, loc_preds, anchor_boxes], \
+        name="detection", nms_threshold=nms_thresh, force_suppress=force_suppress,
+        variances=(0.1, 0.1, 0.2, 0.2), nms_topk=nms_topk)
+    det = mx.symbol.MakeLoss(data=det, grad_scale=0, name="det_out")
+
+    # group output
+    out = mx.symbol.Group([cls_prob, loc_loss, cls_label, det])
+    return out
+
+def get_symbol(num_classes=20, nms_thresh=0.5, force_suppress=False,
+               nms_topk=400, **kwargs):
+    """
+    Single-shot multi-box detection with VGG 16 layers ConvNet
+    This is a modified version, with fc6/fc7 layers replaced by conv layers
+    And the network is slightly smaller than original VGG 16 network
+    This is the detection network
+
+    Parameters:
+    ----------
+    num_classes: int
+        number of object classes not including background
+    nms_thresh : float
+        threshold of overlap for non-maximum suppression
+    force_suppress : boolean
+        whether suppress different class objects
+    nms_topk : int
+        apply NMS to top K detections
+
+    Returns:
+    ----------
+    mx.Symbol
+    """
+    net = get_symbol_train(num_classes)
+    cls_preds = net.get_internals()["multibox_cls_pred_output"]
+    loc_preds = net.get_internals()["multibox_loc_pred_output"]
+    anchor_boxes = net.get_internals()["multibox_anchors_output"]
+
+    cls_prob = mx.symbol.SoftmaxActivation(data=cls_preds, mode='channel', \
+        name='cls_prob')
+    out = mx.contrib.symbol.MultiBoxDetection(*[cls_prob, loc_preds, anchor_boxes], \
+        name="detection", nms_threshold=nms_thresh, force_suppress=force_suppress,
+        variances=(0.1, 0.1, 0.2, 0.2), nms_topk=nms_topk)
+    return out